diff --git a/README.md b/README.md index ba189da..11e489d 100644 --- a/README.md +++ b/README.md @@ -417,29 +417,11 @@ For the time being, the Chrome built-in AI team is moving forward more aggresive ## Privacy considerations -### General concerns about language-model based APIs +Please see [the specification](https://webmachinelearning.github.io/writing-assistance-apis/#privacy). -If cloud-based language models are exposed through this API, then there are potential privacy issues with exposing user or website data to the relevant cloud and model providers. This is not a concern specific to this API, as websites can already choose to expose user or website data to other origins using APIs such as `fetch()`. However, it's worth keeping in mind, and in particular as discussed in our [Goals](#shared-goals), perhaps we should make it easier for web developers to know whether a cloud-based model is in use, or which one. +## Security considerations -If on-device language models are updated separately from browser and operating system versions, this API could enhance the web's fingerprinting service by providing extra identifying bits. Mandating that older browser versions not receive updates or be able to download models from too far into the future might be a possible remediation for this. - -Finally, we intend to prohibit (in the specification) any use of user-specific information that is not directly supplied through the API. For example, it would not be permissible to fine-tune the language model based on information the user has entered into the browser in the past. - -### Detecting available options - -The [`availability()` API](#testing-available-options-before-creation) specified here provide some bits of fingerprinting information, since the availability status of each option and language can be one of four values, and those values are expected to be shared across a user's browser or browsing profile. In theory, this could be up to ~6.6 bits for the current set of summarizer options, plus an unknown number more based on the number of supported languages, and then this would be roughly tripled by including writer and rewriter. - -In practice, we expect the number of bits to be much smaller, as implementations will likely not have separate, independently-downloadable pieces of collateral for each option value. (For example, in Chrome's case, we anticipate having a single download for all three APIs.) But we need the API design to be robust to a variety of implementation choices, and have purposefully designed it to allow such independent-download architectures so as not to lock implementers into a single strategy. - -There are a variety of solutions here, with varying tradeoffs, such as: - -* Grouping downloads to reduce the number of bits, e.g. by ensuring that downloading the "formal" tone also downloads the "neutral" and "casual" tones. This costs the user slightly more bytes, but hopefully not many. -* Partitioning downloads by top-level site, i.e. repeatedly downloading extra fine-tunings or similar and not sharing them across all sites. This could be feasible if the collateral necessary to support a given option is small; it would not generally make sense for the base language model. -* Adding friction to the download with permission prompts or other user notifications, so that sites which are attempting to use these APIs for tracking end up looking suspicious to users. - -We'll continue to investigate the best solutions here. 
And the specification will at a minimum allow user agents to add prompts and UI, or reject downloads entirely, as they see fit to preserve privacy. - -It's also worth noting that a download cannot be evicted by web developers. Thus the availability states can only be toggled in one direction, from `"downloadable"` to `"downloading"` to `"available"`. And it doesn't provide an identifier that is very stable over time, as by browsing other sites, users will gradually toggle more and more of the availability states to `"availale"`. +Please see [the specification](https://webmachinelearning.github.io/writing-assistance-apis/#security). ## Stakeholder feedback diff --git a/index.bs b/index.bs index eeb8b9d..b5adcd2 100644 --- a/index.bs +++ b/index.bs @@ -244,13 +244,13 @@ enum SummarizerLength { "short", "medium", "long" }; 1. [=Assert=]: this algorithm is running [=in parallel=]. - 1. If there is some error attempting to determine whether the user agent supports summarizing text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. + 1. If there is some error attempting to determine whether the user agent [=model availability/can support=] summarizing text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. - 1. If the user agent supports summarizing text into the type of summary described by |type|, in the format described by |format|, and with the length guidance given by |length| without performing any downloading operations, then return "{{Availability/available}}". + 1. If the user agent [=model availability/currently supports=] summarizing text into the type of summary described by |type|, in the format described by |format|, and with the length guidance given by |length|, then return "{{Availability/available}}". - 1. If the user agent believes it can summarize text according to |type|, |format|, and |length|, but only after finishing a download (e.g., of an AI model or fine-tuning) that is already ongoing, then return "{{Availability/downloading}}". + 1. If the user agent believes it will be able to [=model availability/support=] summarizing text according to |type|, |format|, and |length|, but only after finishing a download that is already ongoing, then return "{{Availability/downloading}}". - 1. If the user agent believes it can summarize text according to |type|, |format|, and |length|, but only after performing a download (e.g., of an AI model or fine-tuning), then return "{{Availability/downloadable}}". + 1. If the user agent believes it will be able to [=model availability/support=] summarizing text according to |type|, |format|, and |length|, but only after performing a not-currently-ongoing download, then return "{{Availability/downloadable}}". 1. Otherwise, return "{{Availability/unavailable}}". @@ -260,7 +260,7 @@ enum SummarizerLength { "short", "medium", "long" }; 1. [=Assert=]: this algorithm is running [=in parallel=]. - 1. If there is some error attempting to determine whether the user agent supports summarizing text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. + 1. If there is some error attempting to determine whether the user agent [=model availability/can support=] summarizing text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. 1. 
Return a [=language availabilities triple=] with: @@ -425,6 +425,8 @@ The inputQuota getter steps are to return The summarization should conform to the guidance given by |type|, |format|, and |length|, in the definitions of each of their enumeration values. + The summarization process must conform to the guidance given in [[#privacy]] and [[#security]], notably including (but not limited to) [[#privacy-user-input]] and [[#security-runtime]]. + If |outputLanguage| is non-null, the summarization should be in that language. Otherwise, it should be in the language of |input| (which might not match that of |context| or |sharedContext|). If |input| contains multiple languages, or the language of |input| cannot be detected, then either the output language is [=implementation-defined=], or the implementation may treat this as an error, per the guidance in [[#summarizer-errors]]. 1. While true: @@ -621,7 +623,7 @@ When summarization fails, the following possible reasons may be surfaced to the "{{UnknownError}}" -

All other scenarios, or if the user agent would prefer not to disclose the failure reason. +

All other scenarios, including if the user agent believes it cannot summarize and also meet the requirements given in [[#privacy]] or [[#security]]. Or, if the user agent would prefer not to disclose the failure reason.

This table does not give the complete list of exceptions that can be surfaced by the summarizer API. It only contains those which can come from certain [=implementation-defined=] steps. @@ -820,13 +822,13 @@ enum WriterLength { "short", "medium", "long" }; 1. [=Assert=]: this algorithm is running [=in parallel=]. - 1. If there is some error attempting to determine whether the user agent supports writing text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. + 1. If there is some error attempting to determine whether the user agent [=model availability/can support=] writing text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. - 1. If the user agent supports writing text with the tone described by |tone|, in the format described by |format|, and with the length guidance given by |length| without performing any downloading operations, then return "{{Availability/available}}". + 1. If the user agent [=model availability/currently supports=] writing text with the tone described by |tone|, in the format described by |format|, and with the length guidance given by |length|, then return "{{Availability/available}}". - 1. If the user agent believes it can write text according to |tone|, |format|, and |length|, but only after finishing a download (e.g., of an AI model or fine-tuning) that is already ongoing, then return "{{Availability/downloading}}". + 1. If the user agent believes it will be able to [=model availability/support=] writing text according to |tone|, |format|, and |length|, but only after finishing a download that is already ongoing, then return "{{Availability/downloading}}". - 1. If the user agent believes it can write text according to |tone|, |format|, and |length|, but only after performing a download (e.g., of an AI model or fine-tuning), then return "{{Availability/downloadable}}". + 1. If the user agent believes it will be able to [=model availability/support=] writing text according to |tone|, |format|, and |length|, but only after performing a not-currently-ongoing download, then return "{{Availability/downloadable}}". 1. Otherwise, return "{{Availability/unavailable}}". @@ -836,7 +838,7 @@ enum WriterLength { "short", "medium", "long" }; 1. [=Assert=]: this algorithm is running [=in parallel=]. - 1. If there is some error attempting to determine whether the user agent supports writing text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. + 1. If there is some error attempting to determine whether the user agent [=model availability/can support=] writing text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. 1. Return a [=language availabilities triple=] with: @@ -972,6 +974,8 @@ The inputQuota getter steps are to return [=th The written output should conform to the guidance given by |tone|, |format|, and |length|, in the definitions of each of their enumeration values. + The writing process must conform to the guidance given in [[#privacy]] and [[#security]], notably including (but not limited to) [[#privacy-user-input]] and [[#security-runtime]]. + If |outputLanguage| is non-null, the writing should be in that language. Otherwise, it should be in the language of |input| (which might not match that of |context| or |sharedContext|).
If |input| contains multiple languages, or the language of |input| cannot be detected, then either the output language is [=implementation-defined=], or the implementation may treat this as an error, per the guidance in [[#writer-errors]]. 1. While true: @@ -1130,7 +1134,7 @@ When writing fails, the following possible reasons may be surfaced to the web de "{{UnknownError}}" -

All other scenarios, or if the user agent would prefer not to disclose the failure reason. +

All other scenarios, including if the user agent believes it cannot write and also meet the requirements given in [[#privacy]] or [[#security]]. Or, if the user agent would prefer not to disclose the failure reason.

This table does not give the complete list of exceptions that can be surfaced by the writer API. It only contains those which can come from certain [=implementation-defined=] steps. @@ -1329,13 +1333,13 @@ enum RewriterLength { "as-is", "shorter", "longer" }; 1. [=Assert=]: this algorithm is running [=in parallel=]. - 1. If there is some error attempting to determine whether the user agent supports rewriting text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. + 1. If there is some error attempting to determine whether the user agent [=model availability/can support=] rewriting text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. - 1. If the user agent supports rewriting text with the tone modification described by |tone|, in the format described by |format|, and with the length modification given by |length| without performing any downloading operations, then return "{{Availability/available}}". + 1. If the user agent [=model availability/currently supports=] rewriting text with the tone modification described by |tone|, in the format described by |format|, and with the length modification given by |length|, then return "{{Availability/available}}". - 1. If the user agent believes it can rewrite text according to |tone|, |format|, and |length|, but only after finishing a download (e.g., of an AI model or fine-tuning) that is already ongoing, then return "{{Availability/downloading}}". + 1. If the user agent believes it will be able to [=model availability/support=] rewriting text according to |tone|, |format|, and |length|, but only after finishing a download that is already ongoing, then return "{{Availability/downloading}}". - 1. If the user agent believes it can rewrite text according to |tone|, |format|, and |length|, but only after performing a download (e.g., of an AI model or fine-tuning), then return "{{Availability/downloadable}}". + 1. If the user agent believes it will be able to [=model availability/support=] rewriting text according to |tone|, |format|, and |length|, but only after performing a not-currently-ongoing download, then return "{{Availability/downloadable}}". 1. Otherwise, return "{{Availability/unavailable}}". @@ -1345,7 +1349,7 @@ enum RewriterLength { "as-is", "shorter", "longer" }; 1. [=Assert=]: this algorithm is running [=in parallel=]. - 1. If there is some error attempting to determine whether the user agent supports rewriting text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. + 1. If there is some error attempting to determine whether the user agent [=model availability/can support=] rewriting text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null. 1. Return a [=language availabilities triple=] with: @@ -1481,6 +1485,8 @@ The inputQuota getter steps are to return [= The rewritten output should conform to the guidance given by |tone|, |format|, and |length|, in the definitions of each of their enumeration values. + The rewriting process must conform to the guidance given in [[#privacy]] and [[#security]], notably including (but not limited to) [[#privacy-user-input]] and [[#security-runtime]]. + If |outputLanguage| is non-null, the rewritten output text should be in that language.
Otherwise, it should be in the language of |input| (which might not match that of |context| or |sharedContext|). If |input| contains multiple languages, or the language of |input| cannot be detected, then either the output language is [=implementation-defined=], or the implementation may treat this as an error, per the guidance in [[#rewriter-errors]]. 1. While true: @@ -1643,7 +1649,7 @@ When rewriting fails, the following possible reasons may be surfaced to the web "{{UnknownError}}" -

All other scenarios, or if the user agent would prefer not to disclose the failure reason. +

All other scenarios, including if the user agent believes it cannot rewrite and also meet the requirements given in [[#privacy]] or [[#security]]. Or, if the user agent would prefer not to disclose the failure reason.

This table does not give the complete list of exceptions that can be surfaced by the rewriter API. It only contains those which can come from certain [=implementation-defined=] steps. @@ -1812,6 +1818,22 @@ Every [=interface=] [=interface/including=] the {{DestroyableModel}} interface m :: 1. If |availability| is "{{Availability/downloadable}}", then: + 1. If |realm|'s [=realm/global object=] does not have [=transient activation=], then: + + 1. [=Queue a global task=] on the [=AI task source=] given |realm|'s [=realm/global object=] to [=reject=] |promise| with a "{{NotAllowedError}}" {{DOMException}}. + + 1. Abort these steps. + + 1. [=Consume user activation=] given |realm|'s [=realm/global object=]. + + 1. The user agent may display a user interface to the user to confirm that they want to perform the download operation given by |startDownload|, or to show the progress of the download. Alternatively, the user agent may decide to deny the ability to perform |startDownload| based on implicit signals of the user's intent, including the considerations in [[#privacy-availability-eviction]] and [[#security-disk-space]]. If the user explicitly or implicitly signals that they do not want to start the download, then: + + 1. [=Queue a global task=] on the [=AI task source=] given |realm|'s [=realm/global object=] to [=reject=] |promise| with a "{{NotAllowedError}}" {{DOMException}}. + + 1. Abort these steps. + +

The case where the user cancels the download after it starts is handled later, as part of the download loop. + 1. Let |startDownloadResult| be the result of performing |startDownload| given |options|. 1. If |startDownloadResult| is false, then: @@ -1832,6 +1854,8 @@ Every [=interface=] [=interface/including=] the {{DestroyableModel}} interface m

This prevents the web developer-perceived progress from suddenly jumping from 0% to 90%, and then taking a long time to go from 90% to 100%. It also provides some protection against the (admittedly not very powerful) fingerprinting vector of measuring the current download progress across multiple sites. +

If the actual number of bytes necessary to download is 0, but the user agent is faking a download for the reasons described in [[#privacy]] (notably [[#privacy-language-availability]]), then set this number to an [=implementation-defined=] value that helps with the download faking. + 1. Let |lastProgressFraction| be 0. 1. Let |lastProgressTime| be the [=monotonic clock=]'s [=monotonic clock/unsafe current time=]. @@ -1840,13 +1864,13 @@ Every [=interface=] [=interface/including=] the {{DestroyableModel}} interface m 1. While true: - 1. If downloading has failed, then: + 1. If downloading has failed, or the user has canceled the download, then: 1. [=Queue a global task=] on the [=AI task source=] given |realm|'s [=realm/global object=] to [=reject=] |promise| with a "{{NetworkError}}" {{DOMException}}. 1. Abort these steps. - 1. Let |bytesSoFar| be the number of bytes downloaded so far. + 1. Let |bytesSoFar| be the number of bytes downloaded so far. (Or the number of bytes fake-downloaded so far, if the user agent is faking the download.) 1. [=Assert=]: |bytesSoFar| is greater than or equal to 0, and less than or equal to |totalBytes|. @@ -1856,7 +1880,7 @@ Every [=interface=] [=interface/including=] the {{DestroyableModel}} interface m 1. Let |progressFraction| be [$floor$](|rawProgressFraction| × 65,536) ÷ 65,536. -

+

We use a fraction, instead of firing a progress event with the number of bytes downloaded, to avoid giving precise information about the size of the model or other material being downloaded.

|progressFraction| is calculated from |rawProgressFraction| to give a precision of one part in 2<sup>16</sup>. This ensures that over most internet speeds and with most model sizes, the {{ProgressEvent/loaded}} value will be different from the previous one that was fired ~50 milliseconds ago.
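As a non-normative illustration, here is a minimal JavaScript sketch of the quantization described above, together with how a page observes the resulting values (the creation options shown are just examples):

```js
// Quantize the raw progress fraction to one part in 2^16, matching
// |progressFraction| in the algorithm above.
function quantizeProgress(bytesSoFar, totalBytes) {
  const rawProgressFraction = bytesSoFar / totalBytes;
  return Math.floor(rawProgressFraction * 65536) / 65536;
}

// Pages see only the quantized fraction via downloadprogress events:
const summarizer = await Summarizer.create({
  monitor(m) {
    m.addEventListener("downloadprogress", (e) => {
      console.log(`Downloaded ${(e.loaded * 100).toFixed(2)}%`); // e.total is 1
    });
  }
});
```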

@@ -1955,8 +1979,14 @@ Every [=interface=] [=interface/including=] the {{DestroyableModel}} interface m 1. Set |lastProgressTime| to the [=monotonic clock=]'s [=monotonic clock/unsafe current time=]. +

If |document| stops being [=Document/fully active=], this loop does not terminate, and the user agent should not cancel the download, for the reasons explained in [[#privacy-availability-cancelation]]. It could pause the download, effectively meaning that the loop will never again have observable effects such as firing {{CreateMonitor/downloadprogress}} events. But even in such a case, future calls to |getAvailability| given |options| need to return "{{Availability/downloading}}" instead of "{{Availability/downloadable}}", and the material downloaded so far needs to persist even across user agent restarts. + +

If the user agent does continue downloading while |document| is not [=Document/fully active=], then the loop will periodically queue tasks to fire {{CreateMonitor/downloadprogress}} events anyway. If the document becomes [=Document/fully active=] again, by coming out of the back/forward cache, these tasks will be run at that time, and the download progress will be reported to the web developer. + 1. [=If aborted=], then abort these steps. +

The user agent should not actually cancel the underlying download, as explained in [[#privacy-availability-cancelation]]. As above, it could fulfill this requirement by pausing the download, but it cannot discard the progress made so far. + 1. [=Initialize and return an AI model object=] given |promise|, |options|, a no-op algorithm, |initialize|, and |create|. @@ -2246,6 +2276,8 @@ Every [=interface=] [=interface/including=] the {{DestroyableModel}} interface m 1. Let |availability| be the result of |compute| given |options|. + 1. If |availability| is "{{Availability/available}}" or "{{Availability/downloading}}", and if [[#privacy-availability-masking|download masking]] is needed to protect the user's privacy, the user agent should set |availability| to "{{Availability/downloadable}}". + 1. [=Queue a global task=] on the [=AI task source=] given |global| to perform the following steps: 1. If |availability| is null, then [=reject=] |promise| with an "{{UnknownError}}" {{DOMException}}. @@ -2267,6 +2299,8 @@ Every [=interface=] [=interface/including=] the {{DestroyableModel}} interface m 1. Return "{{Availability/available}}".

+For the purposes of our algorithms related to model availability, a user agent currently supports an operation if it can perform that operation without first downloading the necessary capabilities. (For example, without first downloading an AI model or fine-tuning.) Such determination of support should incorporate the privacy considerations described in [[#privacy-model-version]]. That is, even if a user agent has a suitable model available or could in theory download one, it may choose instead to report the operation as unsupported, in order to avoid using models whose versions skew too far from the user agent's version. +
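As a non-normative illustration of how this distinction surfaces to web developers, a minimal JavaScript sketch (the option values are just examples):

```js
const availability = await Summarizer.availability({
  type: "key-points",
  format: "markdown",
  length: "medium"
});
// "available"    → currently supported; usable without any download
// "downloading"  → supported once an already-ongoing download finishes
// "downloadable" → supported only after a not-currently-ongoing download
// "unavailable"  → not supported for these options
```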

Language availability

A language availabilities partition is a [=map=] whose [=map/keys=] are "{{Availability/downloading}}", "{{Availability/downloadable}}", or "{{Availability/available}}", and whose [=map/values=] are [=sets=] of strings representing [=Unicode canonicalized locale identifiers=]. [[!ECMA-402]] @@ -2282,15 +2316,15 @@ A language availabilities triple is a [=struct=] with the following [ 1. Let |partition| be «[ "{{Availability/available}}" → an empty [=set=], "{{Availability/downloading}}" → an empty [=set=], "{{Availability/downloadable}}" → an empty [=set=] ]». - 1. [=list/For each=] human language |languageTag|, represented as a [=Unicode canonicalized locale identifier=], for which the user agent supports |purpose|, without performing any downloading operations: + 1. [=list/For each=] human language |languageTag|, represented as a [=Unicode canonicalized locale identifier=], for which the user agent [=model availability/currently supports=] |purpose|: 1. [=set/Append=] |languageTag| to |partition|["{{Availability/available}}"]. - 1. [=list/For each=] human language |languageTag|, represented as a [=Unicode canonicalized locale identifier=], for which the user agent is currently downloading material (e.g., an AI model or fine-tuning) to support |purpose|: + 1. [=list/For each=] human language |languageTag|, represented as a [=Unicode canonicalized locale identifier=], for which the user agent believes it will be able to [=model availability/support=] |purpose|, but only after finishing a download that is already ongoing: 1. [=set/Append=] |languageTag| to |partition|["{{Availability/downloading}}"]. - 1. [=list/For each=] human language |languageTag|, represented as a [=Unicode canonicalized locale identifier=], for which the user agent believes it can support |purpose|, but only after performing a not-currently-ongoing download (e.g., of an AI model or fine-tuning): + 1. [=list/For each=] human language |languageTag|, represented as a [=Unicode canonicalized locale identifier=], for which the user agent believes it will be able to [=model availability/support=] |purpose|, but only after performing a not-currently-ongoing download: 1. [=set/Append=] |languageTag| to |partition|["{{Availability/downloadable}}"]. @@ -2372,3 +2406,132 @@ A quota exceeded error information is a [=struct=] with the fo

Task source

[=Tasks=] queued by this specification use the AI task source. + +

Privacy considerations

+ +Unlike many "privacy considerations" sections, which only summarize and restate privacy considerations that are already normatively specified elsewhere in the document, this section contains some normative requirements that are not present elsewhere, and adds more detail to the normative requirements present elsewhere. The novel normative requirements are called out using strong emphasis. + + + +

Model availability

+ +For any of the APIs that use the infrastructure described in [[#supporting]], the exact download status of the AI model or fine-tuning data can present a fingerprinting vector. How many bits this vector provides depends on the options provided at creation time, and how they influence the download. + +For example, if the user agent uses a single model, with no separately-downloadable fine-tunings, to support the summarizer, writer, and rewriter APIs, then the download status provides two bits (corresponding to the four {{Availability}} values) across all three APIs. In contrast, if the user agent downloads separate fine-tunings for each value of {{SummarizerType}}, {{SummarizerFormat}}, and {{SummarizerLength}} on top of a base model, then the download status for those summarizer fine-tunings alone provides ~6.6 bits of entropy. + +
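As a non-normative illustration of the vector, a page could silently probe every option combination (the enum values below are the real ones; how many independent bits this yields depends on the implementation's download granularity):

```js
// Silently enumerate download status across all summarizer option combinations.
const statuses = [];
for (const type of ["tldr", "teaser", "key-points", "headline"]) {
  for (const format of ["plain-text", "markdown"]) {
    for (const length of ["short", "medium", "long"]) {
      statuses.push(await Summarizer.availability({ type, format, length }));
    }
  }
}
// "statuses" now encodes per-combination download state; the download masking
// described next is designed to blunt exactly this kind of probing.
```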

Download masking

+ +One of the specification's mitigations is to suggest that the user agent mask the current download status by returning "{{Availability/downloadable}}" even if the actual download status is "{{Availability/available}}" or "{{Availability/downloading}}". This is done as part of this step in the [=compute AI model availability=] algorithm which backs the `availability()` APIs. + +Because implementation strategies differ (e.g. in how many bits they expose), and other mitigations such as permission prompts are available, a specific masking scheme is not mandated. For APIs where the user agent believes such masking is necessary, a suggested heuristic is to mask by default, subject to a masking state that is established for each (API, options, [=storage key=]) tuple. This state can be set to "unmasked" once a web page in a given [=storage key=] calls the relevant `create()` method with a given set of options, and successfully starts a download or creates a model object. Since [=create an AI model object=] has stronger requirements (see [[#privacy-availability-creation]]), this ensures that web pages only get access to the true download status after taking a more costly and less-repeatable action. + + +Implementations which use such a [=storage key=]-based masking scheme must ensure that the masking state is reset when other storage for that origin is reset. + +
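A hypothetical sketch of this heuristic from the user agent's point of view (all names below are invented for illustration; nothing here is normative):

```js
// Masked availability per (API, options, storage key) tuple. The tuple becomes
// "unmasked" only after a successful create() from that storage key.
function reportedAvailability(api, options, storageKey) {
  const actual = computeActualAvailability(api, options); // assumed helper
  const unmasked = maskingState.has(keyFor(api, options, storageKey)); // assumed store
  if (!unmasked && (actual === "available" || actual === "downloading")) {
    return "downloadable"; // hide the true status
  }
  return actual;
}
```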

Creation-time friction

+ +The mitigation described in [[#privacy-availability-masking]] works against attempts to silently fingerprint using the `availability()` methods. The specification also contains requirements to prevent `create()` from being used for fingerprinting, by introducing enough friction into the process to make it impractical: + +* [=Create an AI model object=] both requires and consumes [=user activation=], when it would initiate a download. +* [=Create an AI model object=] allows the user agent to prompt the user for permission, or to implicitly reject download attempts based on previous signals (such as an observed pattern of abuse). +* [=Create an AI model object=] is gated on a per-API [=policy-controlled feature=], which means that only top-level origins and their delegates can use the API. + +Additionally, initiating the download process is more or less a one-time operation, so the availability status will only ever transition from "{{Availability/downloadable}}" to "{{Availability/downloading}}" to "{{Availability/available}}" via these guarded creation operations. That is, while `create()` can be used to read some of these fingerprinting bits, at the cost of the above friction, doing so will destroy the bits as well. + +(For details on cases where downloading might happen more than once, and how privacy and security are preserved in those cases, see [[#privacy-availability-cancelation]], [[#privacy-availability-eviction]], and [[#security-disk-space]].)
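A non-normative sketch of what this friction means for web developers: because a download-initiating `create()` requires and consumes user activation, it is expected to be called from a user gesture (`button` here is an assumed element):

```js
button.addEventListener("click", async () => {
  try {
    const summarizer = await Summarizer.create({ type: "tldr" });
    // ... use the summarizer ...
  } catch (e) {
    // "NotAllowedError": no transient activation, a declined permission
    // prompt, or an implicit rejection by the user agent.
  }
});
```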

Download cancelation

+ +An important part of making the download status into a less-useful fingerprinting vector is to ensure that the website cannot toggle the availability state back and forth by starting and canceling downloads. Doing so would allow sites much more fine-grained control over the possible fingerprinting bits, allowing them to read the bits via the `create()` methods without destroying them. + +The part of these APIs which, on the surface, gives developers control over the download process is the {{AbortSignal}} passed to the `create()` methods. This allows developers to signal that they are no longer interested in creating a model object, and immediately causes the promise returned by `create()` to become rejected. The specification has a "should"-level requirement that the user agent not actually cancel the underlying download when the {{AbortSignal}} is aborted. The web developer will still receive a rejected promise, but the download progress so far will be preserved, and the availability status (as seen by future calls to the `availability()` method) will update accordingly. + + +User agents might be inclined to cancel the download in other situations not covered in the specification, such as when the page is unloaded. This needs to be handled with caution: if the page can initiate these operations using JavaScript (for example, by navigating away to another origin), that would re-open the privacy hole. So, user agents should not cancel the download in response to any page-controlled actions. The specific case of navigation is covered by another "should"-level requirement. + +Note that canceling downloads in response to user-controlled actions is not problematic.
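A non-normative sketch of these semantics from the web developer's perspective:

```js
const controller = new AbortController();
const created = Summarizer.create({ signal: controller.signal });
controller.abort();

try {
  await created;
} catch (e) {
  // The promise rejects (with an "AbortError" DOMException by default), but
  // the underlying download is preserved (or paused) rather than canceled,
  // so a later availability() call will not regress to "downloadable".
}
```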

Download eviction

+ +Another ingredient in ensuring that websites cannot toggle the availability state back and forth is to ensure that user agents don't use a quota-based eviction system for the downloaded material. For example, if a user agent implemented the translator API with one download per language arc, supported 100 language arcs, and evicted all but the 30 most-recently-used language arcs, then web pages could toggle the readable-via-`create()` availability state of language arcs from "{{Availability/available}}" back to "{{Availability/downloadable}}" by creating translators for 30 new language arcs. + + +To avoid this, user agents should not implement systems which allow web pages to control the eviction of downloaded material, including via indirect triggers such as subsequent downloads. One way to fulfill this requirement is to never evict downloaded material in response to web page-initiated storage pressure, instead refusing to download new material if doing so would cause storage pressure. + +Evicting downloads in response to user-controlled actions is not problematic, and providing such user affordances is discussed further in [[#security-disk-space]].
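A non-normative sketch of the toggle attack this requirement rules out, using the translator example above (the 30-arc LRU cache and the language list are hypothetical, and the sketch ignores the user-activation friction for clarity):

```js
// Under a (discouraged) 30-entry LRU eviction scheme, creating translators
// for 30 fresh language arcs would evict the existing ones...
for (const targetLanguage of thirtyFreshLanguages) { // hypothetical list
  await Translator.create({ sourceLanguage: "en", targetLanguage });
}
// ...flipping previously-"available" arcs back to "downloadable" and
// re-arming their fingerprinting bits.
```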

Alternate options

+ +While some of the above requirements, such as those on user activation or permissions policy, are specified using "must" language to ensure interoperability, most are specified using "should". The reason for this is that it's possible for implementations to use completely different strategies to preserve user privacy, especially for APIs that use small models. (For example, the language detector API.) + +The simplest of these is to treat model downloads like most other stored resources, partitioning them by the downloading page's [=storage key=]. This lets the web origin model's existing privacy protections operate, obviating the need for anything more complicated. The downside is that this spends more of the user's time, bandwidth, and disk space redundantly downloading the same model across multiple sites. + +A slight variant of this is to re-download the model every time it is requested by a new [=storage key=], while re-using the on-disk storage. This still uses the user's time and bandwidth, but at least saves on disk space. + +Going further, a user agent could attempt to fake the download for new [=storage keys=] by just waiting for a similar amount of time as the real download originally took. This then only spends the user's time, sparing their bandwidth and disk space. However, this is less private than the above alternatives, due to the presence of network side channels. For example, a web page could attempt to detect the fake downloads by issuing network requests concurrent to the `create()` call, and noting that there is no change to network throughput. The scheme of remembering the time the real download originally took can also be dangerous, as the first site to initiate the download could attempt to artificially inflate this time (using concurrent network requests) in order to communicate information to other sites that will initiate a fake download in the future, from which they can read the time taken. Nevertheless, something along these lines might be useful in some cases, implemented with caution and combined with other mitigations.

Sensitive language availability

+ +Even if the user agent mitigates most of the fingerprinting risks associated with the availability of AI models per [[#privacy-availability]], such that probing availability requires a destructive action per [[#privacy-availability-creation]], the information about download availabilities for different languages can still be a privacy risk beyond fingerprinting. This is most obvious in the case of the translator API, where, for example, knowing that the user has downloaded a translator from English to a minority language might be sensitive information. But it can apply just as well to other APIs, via options such as their expected input languages, which might be implemented using downloadable fine-tunings with variable availability. + + +For this reason, on top of the creation-time mitigations discussed in [[#privacy-availability-creation]], user agents may artificially fake a download if they believe it would be helpful for privacy reasons, instead of instantly creating the model. This is *not* a fingerprinting mitigation, but instead provides some degree of plausible deniability for the user, such that web pages cannot be certain of the user's demographic information. If the web page sees model object creation taking 2–3 seconds and emitting {{CreateMonitor/downloadprogress}} events, then perhaps this is a fake download due to the user previously downloading a translator for that minority language, or perhaps it is a real download that completed quickly. + +As discussed in [[#privacy-availability-alternatives]], such fake downloads are not foolproof, and a determined web page could attempt to detect them. However, they do provide some privacy benefit, and can be combined with other mitigations (such as prompts) to provide a more robust defense, and to make such demographic probing impractically unreliable for attackers. + +

Model version

+ +Separate from the availability of a model, the specific version or behavior of a model can also be a fingerprinting vector. + +For this reason, these APIs do not expose model versions directly. And they take some effort to avoid exposing the model version indirectly, for example by censoring the download size in the [=create an AI model object=] algorithm, so that {{CreateMonitor/downloadprogress}} events do not directly expose the size of the model. This also encourages interoperability, by making it harder for web pages to safelist specific models, and instead encouraging them to program against the general API surface. + +However, such mitigations are not foolproof. They only protect against simple attempts to passively discover the model version; behavioral probing can still reveal it. (For example, by sending a number of inputs, and checking the output against known patterns for different versions.) + + +The best way to prevent the model version from becoming a fingerprinting vector is to tie it to the user agent's version, such that the model's version (and thus behavior) only updates alongside already-exposed information such as {{NavigatorID/userAgent|navigator.userAgent}}. User agents should limit the number of possible model versions that a single user agent version can be paired with, when determining whether a model-backed operation is [=model availability/currently supported=]. Examples of possible techniques include not providing model updates to older user agent versions, or ignoring the presence of already-downloaded models below a minimum version threshold after a user agent update (instead downloading a newer version above that threshold). Note that such techniques might not always be available, for example if the user agent always uses a model bundled with the operating system, whose updates are not under the user agent's control. + +There is a tradeoff between reducing the fingerprinting bits that can be derived from the model version, and reducing the fingerprinting bits that can be derived from the model download status. (The latter is discussed in [[#privacy-availability]].) Aggressively locking new user agent versions to new model versions can result in more frequent transitions between "{{Availability/available}}" and "{{Availability/downloadable}}". This can be mitigated by allowing usage of older model versions with newer user agent versions while the new model version is downloading. This ensures the availability state stays at "{{Availability/available}}", at the cost of short periods where web pages can, with some effort, identify the user as belonging to the smaller cohort of older-model, newer-user-agent users.
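A non-normative sketch of the behavioral probing mentioned above (the probe text and version patterns are entirely hypothetical):

```js
// A page inferring the model version from outputs, despite the version never
// being exposed directly.
const probe = "Fixed probe text whose summary is known to differ across model versions.";
const summarizer = await Summarizer.create({ type: "headline" });
const output = await summarizer.summarize(probe);
const knownPatterns = { "model-v1": /pattern a/, "model-v2": /pattern b/ }; // hypothetical
const guessedVersion = Object.keys(knownPatterns)
    .find((version) => knownPatterns[version].test(output));
```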

User input

+ + +Implementations must not train or fine-tune models on user input, or otherwise store user input in a way that models can consult in the future. (For example, using retrieval-augmented generation technology.) + +Using user input in such a way would provide a vector for exposing the user's information to web pages, or for exposing information derived from the user's interactions with one site to another site, both of which are unacceptable privacy leaks. + +

Cloud-based implementations

+ +The implementation-defined parts of these APIs can be implemented by delegating to user-agent-provided cloud-based services. This is not, in itself, a significant privacy risk: web developers already have the ability to send arbitrary data (including user-provided data) to cloud services via APIs such as {{WindowOrWorkerGlobalScope/fetch()}}. Indeed, it's likely that web developers will fall back to such cloud services when these APIs are not present. Additionally, in some cases entire user agents are already implemented as cloud services, with their user interfaces streamed to the user's device. + +However, this is something for web developers to be aware of when they use this API, in case their web page has requirements on not sending certain information to third parties. We're contemplating giving control over this possibility to web developers in <#38>. + +
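A non-normative sketch of the fallback pattern mentioned above (`/api/summarize` is a hypothetical cloud endpoint):

```js
async function summarize(text) {
  if ("Summarizer" in self && (await Summarizer.availability()) !== "unavailable") {
    const summarizer = await Summarizer.create();
    return summarizer.summarize(text);
  }
  // Without the built-in API, the same data goes to a cloud service anyway.
  const response = await fetch("/api/summarize", { method: "POST", body: text });
  return response.text();
}
```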

Security considerations

+ +Unlike many "security considerations" sections, which only summarize and restate security considerations that are already normatively specified elsewhere in the document, this section contains some normative requirements that are not present elsewhere. The novel normative requirements are called out using strong emphasis. + +

Disk space

+ +Downloading models for these APIs could use significant amounts of the user's disk space. Depending on the implementation strategy, web pages might be able to trigger more such usage, by repeatedly calling the `create()` methods with different options. + + +In the event of storage pressure, user agents should balance the utility of these APIs with the disk space they take up, possibly failing a new download (as discussed in this step) or freeing up disk space in some other way. However, user agents need to be mindful of the privacy impacts discussed in [[#privacy-availability-eviction]] when considering freeing up disk space by evicting model downloads. User agents may involve the user in these decisions, e.g., via download-time prompts (mentioned in the downloading algorithm) or some sort of model management UI. + + +If model eviction happens while the model is being actively used by a web page, in such a way that the API can no longer operate, then the user agent should fail these APIs with an "{{UnknownError}}" {{DOMException}}. + +
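A non-normative sketch of how a web page might handle this failure mode (`summarizer` and `articleText` are assumed to exist):

```js
try {
  const summary = await summarizer.summarize(articleText);
} catch (e) {
  if (e.name === "UnknownError") {
    // The model may have been evicted (e.g., by user-initiated cleanup under
    // storage pressure); re-check availability() and re-create if appropriate.
  }
}
```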

Runtime shared resources

+ +Current implementation strategies for these APIs can involve significant usage of resources such as GPU memory and processing power. This leads to a common implementation strategy of loading the appropriate model once, and sharing its capabilities between multiple web pages that interface with it via these APIs. + + +User agents should ensure that one web page's use of these APIs does not overly interfere with another web page's use of these APIs, or another web page's general operation. For example, it should not be possible for a background tab to prevent a foreground tab from using these APIs by calling them in a tight loop, or for one web page to lock up shared GPU resources indefinitely by repeatedly submitting large inputs. + + +This specification does not mandate any particular mitigation strategy for these issues, but possible useful strategies include queuing, rate limiting, abuse detection, and treating web pages the user is actively interacting with differently from those in the background. If necessary, the user agent may fail these APIs with an "{{UnknownError}}" {{DOMException}} to prevent such problems.

OS-provided models

+ +One implementation strategy for these APIs is to delegate to models provided by the operating system. This can provide a number of benefits, such as a more uniform experience for the user across multiple applications, or less disk space usage. + +However, doing so comes with the usual dangers of exposing operating system capabilities to the web platform. User agents still need to ensure that the various privacy and security requirements in this specification are followed when using OS-provided models, even if the user agent has less control over the model's behavior. Particularly notable requirements to watch out for are those in [[#privacy-user-input]] and [[#security-runtime]]. diff --git a/security-privacy-questionnaire.md b/security-privacy-questionnaire.md index 99bc110..13167c3 100644 --- a/security-privacy-questionnaire.md +++ b/security-privacy-questionnaire.md @@ -9,12 +9,12 @@ This feature exposes two large categories of information: - The availability information for various capabilities of the API, so that web developers know what capabilities are available in the current browser, and whether using them will require a download or the capability can be used readily. -The privacy implications of both of these are discussed [in the explainer](./README.md#privacy-considerations). +The privacy implications of both of these are discussed [in the specification](https://webmachinelearning.github.io/writing-assistance-apis/#privacy). > 02. Do features in your specification expose the minimum amount of information > necessary to implement the intended functionality? -We believe so. It's possible that we could remove the exposure of the after-download vs. readily information. However, it would almost certainly be inferrable via timing side-channels. (I.e., if downloading a language model or fine-tuning is required, then the web developer can observe the creation of the summarizer/writer/rewriter object taking longer.) +We believe so. It's possible that we could remove the exposure of the download status information. However, it would almost certainly be inferrable via timing side-channels. (I.e., if downloading a language model or fine-tuning is required, then the web developer can observe the creation of the summarizer/writer/rewriter object taking longer.) > 03. Do the features in your specification expose personal information, > personally-identifiable information (PII), or information derived from @@ -69,7 +69,7 @@ None. We use permissions policy to disallow the usage of these features by default in third-party (cross-origin) contexts. However, the top-level site can delegate to cross-origin iframes. -Otherwise, it's possible that some of the [anti-fingerprinting mitigations](./README.md#privacy-considerations) might involve partitioning download status, which is kind of like distinguishing between first- and third-party contexts. +Otherwise, some of the possible [anti-fingerprinting mitigations](https://webmachinelearning.github.io/writing-assistance-apis/#privacy-availability) involve partitioning information across sites, which is kind of like distinguishing between first- and third-party contexts. > 14. How do the features in this specification work in the context of a browser’s > Private Browsing or Incognito mode? @@ -81,9 +81,10 @@ Otherwise, we do not anticipate any differences. > 15. Does this specification have both "Security Considerations" and "Privacy > Considerations" sections? 
-There is no specification yet, but there is a [privacy considerations](./README.md#privacy-considerations) section in the explainer. +Yes: -We do not anticipate significant security risks for these APIs at this time. +* [Privacy considerations](https://webmachinelearning.github.io/writing-assistance-apis/#privacy) +* [Security considerations](https://webmachinelearning.github.io/writing-assistance-apis/#security) > 16. Do features in your specification enable origins to downgrade default > security protections?