126 changes: 126 additions & 0 deletions text/preprocessing-api-rework.md
@@ -0,0 +1,126 @@
- Start Date: 2021-07-10
- RFC PR: (leave this empty)
- Svelte Issue: (leave this empty)

# Preprocessing API rework

## Summary

Introduce a new preprocessing API that is simpler yet more flexible.

## Motivation

The current preprocessing API is both a little hard to grasp at first and not flexible enough to satisfy more advanced use cases. Its problems:

- Ordering is somewhat arbitrary: markup preprocessors run first, then script/style. Preprocessors that want to run at a different point are forced into dirty workarounds. This also led to a PR implementing something of an escape hatch (https://github.com/sveltejs/svelte/pull/6031)
- Script/style preprocessors may want to remove attributes, which is currently not possible unless they become markup preprocessors and do it themselves somehow (https://github.com/sveltejs/svelte-preprocess/issues/260, https://github.com/sveltejs/svelte/issues/5900)
- In general, the distinction between markup/script/style forces a decision on preprocessor authors that may lock them into a suboptimal solution

A better preprocessing API should therefore

- be easier to grasp and reason about
- execute preprocessors predictably
- provide more flexibility

## Detailed design

The preprocessor API is no longer split into three parts. Instead of an object with `{script, style, markup}` functions, it expects a single function that is handed the complete source code, and that's it:

```typescript
result: {
  code: string,
  dependencies: Array<string>
} = await svelte.preprocess(
  (input: { code: string, filename: string }) => Promise<{
    code: string,
    dependencies?: Array<string>,
    map?: any
  }>
)
```
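As a minimal sketch of what a preprocessor could look like under the proposed shape, the function below receives the whole component source and returns the transformed code. The type names (`PreprocessorInput`, `PreprocessorOutput`, `Preprocessor`) and the `__VERSION__` placeholder are assumptions made for illustration; the RFC only specifies the input and output shapes.

```typescript
// Hypothetical type names; only the shapes come from the RFC.
type PreprocessorInput = { code: string; filename: string };
type PreprocessorOutput = {
  code: string;
  dependencies?: Array<string>;
  map?: any;
};
type Preprocessor = (input: PreprocessorInput) => Promise<PreprocessorOutput>;

// Replaces every `__VERSION__` token anywhere in the component source,
// regardless of whether it sits in markup, script, or style.
const injectVersion: Preprocessor = async ({ code }) => {
  return { code: code.split("__VERSION__").join("1.2.3") };
};
```

Because the function sees the complete source, it does not have to decide up front whether it is a markup, script, or style preprocessor.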

Additionally, `svelte/preprocess` exports new utility functions that make it possible to reproduce the current behavior. They are described below.
Review comment (Member):

Having the preprocess function live in svelte/compiler but having the utility functions live in svelte/preprocess seems a bit confusing to me. I'm not sure what I'd suggest though. Having preprocess in svelte/compiler be the "old" preprocessor and preprocess in svelte/preprocessor be the "new" preprocessor would eliminate the need for different behavior based on what types of arguments were passed (and then we'd eventually get rid of the old one in Svelte 4), but that might be even more confusing.

Having the compiler and the preprocessor be separate concerns (exported from separate modules) seems appealing to me, but I'm not sure how valuable of a distinction that would be to force users to make. Almost everyone is going to be interacting with this with svelte.config.js, not by directly calling preprocess and compile.

Review comment:

I think that even though most users will interact with the whole system inside the config file, there is still merit to the idea of moving everything new into svelte/preprocess for the sake of internal organization. I like the idea.


### extractStyles

```typescript
function extractStyles(code: string): Array<{
  start: number;
  end: number;
  content: { text: string; start: number; end: number };
  attributes: Array<{
    name: string;
    value: string;
    start: number;
    end: number;
  }>;
}>;
```

Extracts the style tags from the source code, each with its start/end position, content, and attributes.

### extractScripts

Same as `extractStyles`, but for script tags.

### replaceInCode

```typescript
function replaceInCode(
  code: string,
  replacements: Array<{ code: string; start: number; end: number; map?: any }>
): { code: string; map: any };
```

Performs replacements at the specified positions. If a replacement comes with a map, that map is adjusted so that it maps the whole content, not just the part that was processed. The result is the replaced code along with a merged map.
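To make the intended behavior of these utilities more concrete, here is a rough sketch under simplifying assumptions: tags are located with a naive regular expression rather than the Svelte parser, attribute values are assumed to be double-quoted, and source maps are left out entirely. The type names `Attribute`, `Tag`, and `Replacement` are invented for this example.

```typescript
type Attribute = { name: string; value: string; start: number; end: number };
type Tag = {
  start: number;
  end: number;
  content: { text: string; start: number; end: number };
  attributes: Attribute[];
};
type Replacement = { code: string; start: number; end: number };

// Naive regex-based extraction; a real implementation would use the Svelte parser.
function extractStyles(code: string): Tag[] {
  const tags: Tag[] = [];
  const tagRe = /<style([^>]*)>([\s\S]*?)<\/style>/g;
  let m: RegExpExecArray | null;
  while ((m = tagRe.exec(code)) !== null) {
    const start = m.index;
    const attrsStart = start + "<style".length;
    const contentStart = attrsStart + m[1].length + 1; // skip the closing ">"
    const attributes: Attribute[] = [];
    const attrRe = /([\w:-]+)(?:="([^"]*)")?/g;
    let a: RegExpExecArray | null;
    while ((a = attrRe.exec(m[1])) !== null) {
      attributes.push({
        name: a[1],
        value: a[2] ?? "",
        start: attrsStart + a.index,
        end: attrsStart + a.index + a[0].length,
      });
    }
    tags.push({
      start,
      end: start + m[0].length,
      content: { text: m[2], start: contentStart, end: contentStart + m[2].length },
      attributes,
    });
  }
  return tags;
}

// Applies replacements from back to front so that earlier offsets stay valid.
// Map handling is omitted in this sketch.
function replaceInCode(code: string, replacements: Replacement[]): { code: string } {
  const sorted = [...replacements].sort((x, y) => y.start - x.start);
  let result = code;
  for (const r of sorted) {
    result = result.slice(0, r.start) + r.code + result.slice(r.end);
  }
  return { code: result };
}
```

Because every position refers to the original source, the caller can collect replacements for several tags and apply them in one pass.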

These three functions would make it possible to reimplement a script preprocessor like this:

```javascript
function transformStuff(/* ... */) { /* user-provided function */ }
function getDependencies(/* ... */) { /* user-provided function */ }

function script({ code }) {
  // Locate all script tags, transform each one, then splice the results back in.
  const scripts = extractScripts(code);
  const replacements = scripts.map(transformStuff);
  return {
    ...replaceInCode(code, replacements),
    dependencies: getDependencies(replacements)
  };
}
```

Using these three functions, we could also construct convenience functions like `replaceInScript`, which would let preprocessor authors write `return replaceInScript(code, transformStuff)`. Exactly which functions to provide is up for discussion; the point is that we should provide primitives that ensure more flexibility for advanced use cases.

Since preprocessors are now "just" functions, there's no ordering headache anymore: preprocessors are invoked in the order they are given, which gives full control over composition.
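The in-order invocation described above could be sketched as follows. The helper name `sequence` is hypothetical and not part of the proposal, and merging source maps between steps is left out.

```typescript
type Preprocessor = (input: { code: string; filename: string }) => Promise<{
  code: string;
  dependencies?: string[];
  map?: any;
}>;

// Hypothetical helper: runs preprocessors one after another,
// feeding each one the output of the previous one.
function sequence(preprocessors: Preprocessor[]): Preprocessor {
  return async ({ code, filename }) => {
    const dependencies: string[] = [];
    for (const preprocessor of preprocessors) {
      const result = await preprocessor({ code, filename });
      code = result.code;
      dependencies.push(...(result.dependencies ?? []));
    }
    // Source map merging is omitted in this sketch.
    return { code, dependencies };
  };
}
```

With this shape, a preprocessor that depends on another one's output only needs to be listed after it.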

### Roadmap, backwards-compatibility

This new functionality could be implemented in Svelte 3, where the `preprocess` function checks whether the passed-in preprocessor is an object (current API) or a function (proposed API). In Svelte 4, the proposed API would become the default, and we could provide a function for preprocessors that don't support the new API yet.
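The object-versus-function check could be as small as the sketch below. The names `ProposedPreprocessor`, `CurrentPreprocessor`, `Hook`, and `detectApi` are assumptions for illustration, and the hook types are simplified compared to the real API.

```typescript
type ProposedPreprocessor = (input: { code: string; filename: string }) => Promise<{ code: string }>;
// Simplified stand-in for the current markup/script/style hooks.
type Hook = (input: { content: string; filename: string }) => Promise<{ code: string }>;
type CurrentPreprocessor = { markup?: Hook; script?: Hook; style?: Hook };

// Hypothetical helper: a function means the proposed API, an object the current one.
function detectApi(
  preprocessor: ProposedPreprocessor | CurrentPreprocessor
): "proposed" | "current" {
  return typeof preprocessor === "function" ? "proposed" : "current";
}
```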

Review comment:

Perhaps a clarification is in order that we'd provide a function which developers could use in their config files to wrap preprocessors that have not yet taken the necessary steps to adopt the new API.

As written, you might interpret it as "we will provide a function to preprocessor authors".

Review comment:

Additionally, there needs to be more work done here to explain what we do in Svelte 3, I think.

If you pass an array of preprocessors, some of which use the new API, and some of which use the old API, then what do you do about ordering? If preprocessor A uses the old API, but B uses the new API, you've got a condition where B is expecting that A will run first. But if A uses script or style, it won't have done anything when B runs. In this case, my ugly workaround is still necessary to get PostCSS turned into proper CSS before Svelte Image does its thing.

Perhaps we still need to have developers (not preprocessor authors) signal that they want to opt in to the new API via some config setting. Then we can just force them to use the provided function to wrap old-style preprocessors (as they would have to do in Svelte 4 anyway).

So with no "experimental" flag set, we respect either style of API, but preserve the old ordering mechanism: any new-style API usage will be ordered in the same fashion as markup is today. Additionally, when we encounter the new-style API, we log a warning message to the terminal with a brief message about ordering and a link to visit for more info.

When the experimental flag is set to true, we remove the old-style API entirely, which fixes ordering automatically. Additionally, when we encounter the old-style API, we throw an error with an explanation that "Preprocessor {name} attempted to use the deprecated API, but you've opted in to the new style. Either wrap your preprocessor in the legacyPreprocessor function or opt out of the new behavior." Also with a link to read more.


```javascript
export function legacyPreprocessor(preprocessor) {
  return async ({ code, filename }) => {
    // Old-style hooks expect the source under `content`; `markup` itself may be undefined.
    const processedBody =
      (await preprocessor.markup?.({ content: code, filename })) ?? { code };

    // .. etc
    return mergeParts(processedBody, processedScript, processedStyle);
  };
}
```

## How we teach this

Adjust docs

## Drawbacks

None that I can think of right now

Review comment:

The main drawback is that this is planned to be a breaking change in Svelte 4. But that is mitigated by providing the legacyPreprocessor function.

There are other potential drawbacks... but I cannot guess at them because I do not know the original thinking that led to doing three separate passes on preprocessors to begin with. That decision led to this current ordering problem which is addressed in this RFC, but it ostensibly was solving some other problem at the time. In theory, this change would re-introduce whatever that problem was.

Review comment (Member Author):

I asked the other maintainers about the decision for the API back then and there was no conscious "this is good because X" reason for that. The main argument was to provide some ease of use, but that would be achieved just as well with the helper functions.

Review comment:

I am hoping that this change is a slam dunk, then.


## Alternatives

None that I can think of right now

Review comment:

Ordering

  1. just a config flag: opt in to order preprocessors top to bottom with no special second or third pass on script or style.
  2. Some nesting mechanism as I created in my PR that you linked at the top. (For those who have not been following, I personally prefer this RFC solution to the PR that I originally opened)

Attr adjustments within script/style

  1. You could fix the inability to edit tag information in scripts and styles by adding another optional property to the return value of either script or style, which we'd then act on internally. The snippet below is adapted from the current Svelte docs; the commented lines indicate the change I mean.

     ```typescript
     type Preprocessor = {
       markup?: (input: { content: string, filename: string }) => Promise<{
         code: string,
         dependencies?: Array<string>
       }>,
       script?: (input: { content: string, markup: string, attributes: Record<string, string>, filename: string }) => Promise<{
         code: string,
         dependencies?: Array<string>,
         // Add the following property
         attributes?: Record<string, string>
       }>,
       style?: (input: { content: string, markup: string, attributes: Record<string, string>, filename: string }) => Promise<{
         code: string,
         dependencies?: Array<string>,
         // Add the following property
         attributes?: Record<string, string>
       }>
     }
     ```

  2. We could still provide a helper function for grabbing script and style directly out of markup as described in this RFC, but without changing the underlying API at all. Preprocessor authors could more easily transition from style or script to markup using the function we provide.

As for these alternatives, this RFC feels like the better, more thorough approach to me.


## Unresolved questions

- We could expand the functionality of `extractScripts`/`extractStyles`. Right now, every script/style tag is processed, not just the top-level ones. Should the Svelte parser be enhanced with a mode that only locates the top-level script/style tags without parsing their contents?

Review comment (Member):

I'd set this as a priority (albeit a low one). This came up a few times in the svelte-preprocess issues. However, the question persists: should we make it possible for someone to write, let's say, typescript code inside a <svelte:head> tag? Or is it outside of the scope of the preprocessing step? One code is for the compiler, the other for the browser.

Review comment (Member Author):

Since after preprocessing all script tags should be JavaScript it doesn't make much of a difference I'd say. But it certainly feels weird to write the non-top-level script/style tags in another language.

Review comment (Member):

> But it certainly feels weird to write the non-top-level script/style tags in another language.

100% with you. I'd focus on top-level only for the time being.

- What about preprocessing inside moustache tags? Should the Svelte parser get an opt-in parsing mode that assumes a JavaScript-like syntax inside moustache tags, so that their contents can be extracted and exposed through another utility function for preprocessing? (https://github.com/sveltejs/svelte/issues/4701)