Try rewriting the explainer #98
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,14 +1,26 @@ | ||
# Bidirectional WebDriver Protocol | ||
# Bidirectional WebDriver Protocol Explainer | ||
|
||
## Overview | ||
WebDriver BiDi (Bi-Directional) is being developed to allow web developers to migrate from | ||
Chromium-only [CDP](https://chromedevtools.github.io/devtools-protocol/)-based (Chrome DevTools | ||
Review comment: Shall we remove the
Review comment: Probably makes sense to say something about it being non-standard and de-facto defined by Chromium? |
||
Protocol) developer tooling to cross-browser tooling. | ||
|
||
This document presents a possible design for a bidirectional WebDriver protocol, incorporating scenarios and resolutions discussed at the TPAC 2019 working group meeting. The protocol uses JSON-RPC messaging over WebSockets as the transport mechanism. WebDriver's current model of the browser is extended to include service workers and other non-page targets and make it possible for clients to target these additional contexts. We also discuss how the new protocol can interoperate with the existing protocol. Sample protocol messages illustrating how the protocol would work are included, and a JSON API specification is included alongside the document. | ||
## Motivation | ||
|
||
## Goals | ||
Many developer tools are exclusively targeting Chrome, relying on CDP, resulting in websites being | ||
Review comment: Among test tooling not using WebDriver, I think Cypress is the most popular, and it's Chrome+Firefox. Puppeteer isn't hugely popular for testing (mostly automation) but it too is Chrome+Firefox. I'd describe the problem as "Many developer tools are targeting mainly Chrome, and sometimes Firefox, relying on different and non-standard protocols for each supported browser. This results in websites being better tested in those browsers, leading to site compatibility bugs for other browsers." WDYT? There are also lessons in Web Testing Report (should we link that?) about support for a browser in a test framework not leading directly to that browser being tested. It also needs to be really simple (multi-browser by default I think) and above all the developer needs to want to test the browser to begin with, probably already doing some manual testing in it.
Review comment: @whimboo you might be interested in this wording as well, given your comment at #98 (comment) |
||
better tested against Chrome and leading to site compatibility bugs for other browsers. | ||
|
||
WebDriver BiDi introduces a new protocol, designed to be used in conjunction with the existing | ||
WebDriver protocol, allowing new functionality to be introduced to reduce the gap in functionality | ||
to CDP and allowing developer tooling to target a wider variety of browsers. | ||
|
||
The protocol is designed with the following goals in mind: | ||
WebDriver BiDi should help ensure a better end-user experience across all browsers by allowing | ||
developers to use the same tooling across all browsers. | ||
Review comment: There's an additional motivation, which is to decouple automation tools from an ever-changing debugging protocol. Primarily, this reduces the maintenance burden of tool vendors whilst allowing them to (as you point out) target multiple different browsers at the same time. Because the CDP protocol changes regularly, and webdriver bidi would be stable, this would also allow cross-version testing, which is also important. Secondarily, this allows the Chromium team to make any changes they need to the protocol without fear of breaking those tools.
Review comment: @shs96c agreed that allowing the debugging protocols to evolve to keep up with devtools product needs is an important reason why "just standardize CDP" didn't fly. Do you have a wording suggestion for this? Note that since BiDi will be implemented on top of CDP, changes to CDP will actually very much come with the fear of breaking tools using BiDi, and only rigorous internal testing will mitigate that.
Review comment: As such we should really do better in creating detailed WebDriver-bidi tests that will cover nearly all (important) cases for individual commands and events. For WebDriver HTTP I still see a lot of untested areas, especially around navigation. :( |
||
|
||
## Goals | ||
|
||
- **Support for the top customer scenarios identified at TPAC 2019:** | ||
- **Support for the top customer scenarios | ||
[identified](https://www.w3.org/2019/09/19-webdriver-minutes.html#item03) at | ||
[TPAC 2019](https://www.w3.org/2019/09/TPAC/):** | ||
- Listen for DOM events | ||
- Log what's going on in the browser including console and JS errors | ||
- Fail fast on any JS error | ||
|
@@ -31,19 +43,50 @@ The protocol is designed with the following goals in mind: | |
- Simple for browser vendors to implement and maintain. | ||
- Possible for clients to enhance their WebDriver automation with browser-specific devtools protocol features. | ||
Review comment: Although not discussed at the time, being able to support something like Lighthouse would be a useful thing for this spec to be able to do. |
||
|
||
This document doesn't attempt to dive into any of the new feature scenarios identified above, but rather tries to provide a solid foundation and the necessary primitives to build these features on. The document does walk through an example of an existing WebDriver feature (unhandled prompts) being updated for a bidirectional world. | ||
## Non-goals | ||
|
||
Feature parity with CDP is a non-goal at this time; many features of CDP are rarely used by | ||
existing developer tooling, and being able to entirely supplant CDP is not a goal at this time. | ||
Review comment: Indeed! |
||
|
||
## Prior Art | ||
|
||
\[FIXME: [CDP](https://chromedevtools.github.io/devtools-protocol/), | ||
Review comment: Just turn this into a list, or is there additional fixing needed? |
||
[Firefox Remote Debug Protocol](https://firefox-source-docs.mozilla.org/devtools/backend/protocol.html), | ||
[Firefox Remote Protocol](https://wiki.mozilla.org/WebDriver/RemoteProtocol), | ||
Review comment: This should say WebKit Inspector Protocol right?
Review comment: This isn't really prior art in the same sense, given it's never been publicly exposed. |
||
WebKit Inspector Protocol\] | ||
|
||
## Design | ||
|
||
WebDriver BiDi defines a transport layer (built on top of WebSockets) and a protocol on top of that | ||
(using JSON, where the messages are described in the standard using | ||
[CDDL](https://tools.ietf.org/html/rfc8610)). | ||
|
||
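A minimal sketch of the layering described above, assuming a JSON-RPC-like envelope in which commands carry an `id`, responses echo that `id`, and events carry none. The field names `id`, `method`, `params`, and `result`, and the `example.*` method names, are assumptions made for illustration; they are not taken from this diff or from the CDDL in the spec. No WebSocket connection is opened here; the JSON text is only what would travel in a text frame.

```python
import itertools
import json

# Monotonic command ids so replies can be matched to the command that caused them.
_next_id = itertools.count(1)


def make_command(method, params=None):
    """Build a command envelope (hypothetical shape: id / method / params)."""
    return {"id": next(_next_id), "method": method, "params": params or {}}


def classify(message, pending):
    """Return ("response", original_command) for a reply to a pending command,
    otherwise ("event", None) for an unsolicited message from the browser."""
    if "id" in message and message["id"] in pending:
        return "response", pending.pop(message["id"])
    return "event", None


# Client side: build a command and remember it until the reply arrives.
cmd = make_command("example.navigate", {"url": "https://example.com"})
pending = {cmd["id"]: cmd}
wire_text = json.dumps(cmd)  # would be sent as a WebSocket text frame

# Simulated incoming frames: one reply and one browser-initiated event.
reply = json.loads(json.dumps({"id": cmd["id"], "result": {}}))
event = json.loads(json.dumps({"method": "example.logEntryAdded",
                               "params": {"text": "hi"}}))

kind_reply, original = classify(reply, pending)
kind_event, _ = classify(event, pending)
```

Matching on `id` is what lets a single connection carry both request/response traffic and browser-initiated events, which is the "bidirectional" part the existing HTTP-based WebDriver protocol lacks.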
### Choice of Transport Layer | ||
|
||
\[FIXME\] | ||
Review comment: Dunno what you want here, but JSON-over-WebSockets is close to the prior art and the most standard thing for a bidi communication protocol on the web stack.
Review comment: Ignoring the prior art for a second, it's not immediately obvious why JSON-over-WebSockets is a better solution than CBOR-over-TCP, especially when it comes to non-local usage.
Review comment: I think "this is not where we want to spend our complexity budget; tooling in this space uniformly has access to JSON parsing, and WebSockets is the common denominator in existing implementations", but also "this is something we could change later if we end up in a place where the transport is having a noticeable effect on performance" (but I'd be surprised if we end up in that place; if the deserialization cost is that high the protocol got too chatty anyway).
Review comment: Unusually for a W3C spec, we expect implementations to be provided in many different client languages. WebSockets and JSON give us a widely-supported base for local end implementations to be built from, and opens the door to quick shell scripts to do useful things.
Review comment: I think this could just note that existing debugging protocols use WebSockets, so we will too. |
||
|
||
### Choice of Protocol Layer | ||
|
||
\[FIXME. Why JSON?\] | ||
Review comment: I think the answer is it matches the prior art. We should also say why we can't use JSON-RPC verbatim, and perhaps say that using CBOR is a future possibility, and the design allows for it by avoiding sending binary WebSockets messages entirely, and the first line of https://w3c.github.io/webdriver-bidi/#handle-an-incoming-message (Although maybe we should change that first line to silently drop binary messages or respond with an error instead of closing the connection...) |
||
|
||
CDDL is used to describe the protocol layer because it provides formal semantics, accomplishing the | ||
"machine-readable API specification" goal while being similar to JSON-RPC used in the prior art. | ||
Review comment: The original WebDriver spec also uses JSON, so we're following in an already established precedent.
Review comment: This could do with some tweaking I think. CDDL describes the shapes of messages that can be encoded as JSON, and we're using CDDL to describe something very similar to JSON-RPC, but CDDL itself is not similar to JSON-RPC, they're on different layers. Perhaps "while allowing us to define a protocol similar to JSON-RPC used in the prior art"? |
||
|
||
### Considered alternatives | ||
|
||
\[FIXME: Adopting CDP wholesale\] | ||
Review comment: Maybe mention: reluctant to standardise, exposes implementation details, chatty protocol only designed for local usage.
Review comment: Yeah, some of the FIXME was definitely "it's Sunday and I'm meant to be tidying my flat, not trying to improve the explainer due to WebDriver BiDi getting mentioned on orange site" (there are currently no interesting comments).
Review comment: Maybe this is where we mention WebTransport and CBOR, two newer things we're not going to use right now. |
||
|
||
## Proposals | ||
## Privacy and Security Concerns | ||
|
||
- [Core Functionality](./proposals/core.md) | ||
- [Bootstrap Scripts](./proposals/bootstrap-scripts.md) | ||
Any protocol that can be used for web testing automation can also open browsers up to malicious | ||
actors. It is vital that any functionality cannot be accessed from web platform content; browsers | ||
with multi-process architectures may want to minimise the amount of functionality within the web | ||
Review comment: A very good point! Aside: This is actually relevant to the question of whether the |
||
content process to avoid the use of that functionality with any remote-code execution exploit | ||
within the process. | ||
|
||
[openrpc.json](./proposals/openrpc.json) contains an OpenRPC specification with an initial set of proposed commands and events. | ||
Other threats include: | ||
|
||
## References | ||
- Malware connecting to a user's browser and intercepting private data (through observing network | ||
requests) or maliciously controlling it (e.g. sending a payment when logged in to a bank). | ||
|
||
1. [WebDriver](https://w3c.github.io/webdriver/) | ||
2. [JSON-RPC 2.0 Specification](https://www.jsonrpc.org/specification) | ||
3. [OpenRPC Specification](https://spec.open-rpc.org/) | ||
4. [Browser Tools- and Testing WG, Day 1, TPAC 2019, Fukuoka](https://www.w3.org/2019/09/19-webdriver-minutes.html) | ||
5. [Browser Tools- and Testing WG, Day 2, TPAC 2019, Fukuoka](https://www.w3.org/2019/09/20-webdriver-minutes.html) | ||
- \[FIXME: ...\] |
Review comment: Yes! 👍