The WHATWG Blog

Progressing Streams

July 5th, 2017 by Adam Rice

Back in 2014 we announced the Streams Standard. It's about time for an update on where we are and what's coming up.

Streaming the response to fetch() via the response.body attribute was standardized last year and is now implemented in several major browsers. Recently streaming uploads have been added to the Fetch Standard. fetch(url, {method: 'POST', body: readable}) will start an upload. The expected properties of a streaming function apply:

Bytes will be written as they become available.
Chunks do not need to be kept in memory after they have been uploaded.
Backpressure is applied: the source can stop generating new data when the network or server is slow to accept it.

Service Workers are another area where Streams are indispensable. They are used internally to allow the page to start processing bytes as soon as any are available, and are being used by developers as a powerful tool to synthesize responses. A Response object can be constructed with a ReadableStream body and then passed to fetchEvent.respondWith() like any other response.

The real power of Streams is unlocked when sources, sinks and transforms from disparate authors are combined in novel ways. The most exciting action is not in platform built-ins but in streams created in the wider developer ecosystem. With this in mind, every aspect of the standard has been fine-tuned for productivity. Take the following example:

let appendChildWritableStream = new WritableStream({
  write(domNode) {
   parentNode.appendChild(domNode);
  }
});

Notice what isn't there:

No setup code is needed. Boilerplate is kept to an absolute minimum.
No type conversions. Streams handle whatever types you throw at them.
We aren't interested in backpressure here, so nothing needs to be done about it.
Our data sink is synchronous, so no async code needs to be written.

With a recent browser you can see this in action in a live demo (video). At time of writing, this demo and the others in this post work in Chrome stable version 59 and Safari stable version 10.1.

The WritableStream API is now stable. We've added a getWriter() method which is the analogue of ReadableStream's getReader(). It adds locking semantics so that multiple writers cannot interfere with each other. Recent work has focused on predictability, for example by preventing underlying sink methods from running concurrently, and robustness, like dealing with badly-behaving strategy size functions that call into other methods reentrantly.

The strength of the algorithmic style of specification is that even unintended behavior will be the same between implementations. On the other hand, when specifying the pipeTo() method of ReadableStream, providing latitude for browsers to optimize was a high priority. As well as bypassing JavaScript when copying data between built-in streams, user agents may need to change the timing or ordering of calls to underlying methods to get the best performance for their architecture. For this reason, we specified pipeTo() in a requirements style. This presents its own challenges, for example how to specify the "least work" that an implementation can do and still be compliant.

Streams also challenge our fundamental assumptions about how the web platform works. You may not want to have to modify the DOM directly if you already have a template engine producing HTML. Shouldn't you be able to pipe a stream of HTML to an element?

We don't yet know how this capability would fit in the web platform, but Jake Archibald has created a custom element providing a compelling vision of what we could do with it. The demo demonstrates inserting a stream of HTML directly from the server.

Depending on your environment, you may have seen some significant jank in that demo. The problem is that the server supplies data faster than the browser can layout and render it. This is where backpressure comes in. Any data sink can apply backpressure just by returning a promise from its write() method. In many cases this happens as a natural consequence of the implementation. In this case, we want to delay until the browser has had a chance to render the HTML. A slight modification to the custom element and the page becomes much smoother: demo (side-by-side video).

It's clear that we should prioritize interactivity when adding content to an existing page. Maybe browsers need a special low-jank path for streaming HTML. But what about initial page load? You've probably seen pages that didn't respond to input because they were still performing some expensive layout below the fold. Should we prioritize interactivity there, too? We're still working through all the implications.

Ensuring low friction for all participants has really helped drive the progress of the standard. From bug fixes to the BYOB readable byte stream design from implementers, to large scale contributions from external contributors, the benefits of the community process are clear.

Transform streams are the final key piece needed to make the stream ecosystem complete. We have a working, tested reference implementation that we are using as the basis for active design discussions. Full standardization and implementer adoption is expected to follow in the next few months.

In past two years streams have gone from being a promising idea to having multiple independent implementations and wide adoption. Implementation work is accelerating, and there is already a critical mass of shipping functionality.

We're looking to widen developer involvement with Streams. Check out the examples, contribute some web platform tests, or help improve the documentation.

Posted in What's Next | Comments Off on Progressing Streams

The Developer’s Edition of HTML makes a comeback

June 28th, 2017 by Domenic Denicola

Back in 2011, Ben Schwarz took on the ambitious project of curating an edition of the HTML Standard specifically for web developers. It omitted details aimed specifically at browser vendors, and had several additional features to make the experience more pleasant to read.

Ben did an amazing job maintaining this for many years, but some time ago it fell behind the changes to the HTML Standard. Since the move to make HTML more community-driven, we've been hoping to find a way to synchronize the developer's edition with the mainstream specification. That day has finally arrived!

We've deployed an initial version of the new developer's edition at a new URL, https://html.spec.whatwg.org/dev/. It's rough around the edges, missing several of the features of the old version. And it needs some curation to omit implementer-specific sections; many have crept in during the downtime. We're tracking these and other issues in the issue tracker. But now, the developer's edition is integrated into our build process and editing workflow, and will forever remain synchronized with the HTML Standard itself.

Hereby we issue a call to the community to help us with the revitalized developer's edition. Two of the biggest areas of potential improvement are helping us properly mark up the source according to the guidelines for what goes in the developer's edition, and contributing to the design of the developer's edition in order to make it more beautiful and usable.

Finally, I want to thank Michael™ Smith for getting this process started, via a series of pull requests to our build tools which did most of the foundational work. And of course Ben Schwarz, without whom none of this would have happened in the first place.

Posted in Tutorials, What's Next | Comments Off on The Developer’s Edition of HTML makes a comeback

HTML and shared memory

April 28th, 2017 by Anne van Kesteren

You’d think that the HTML Standard would be pretty far removed from shared memory considerations, but as it happens HTML defines a parser for HTML which is intertwined with script execution, defines a way to instantiate new global objects through the iframe element, defines a way to instantiate new threads (and even processes, depending on the implementation) with workers, and all the various infrastructure pieces that go along with that. Finally, it also defines a message-based communication channel to communicate between those threads and processes.

That still doesn’t give us shared memory. For that, JavaScript needed to evolve and gain a new SharedArrayBuffer class: a sibling to ArrayBuffer, with the ability to be accessed from several threads at once. And on top of that we needed to do some work to make it play nicely with all the various globals the web platform provides and make sure it worked with the message-passing system (which you probably know as postMessage()), all while trying to avoid violating constraints that would make programming with SharedArrayBuffer objects a nightmare.

We ended up making several changes (and to make sure they all end up being interoperable we wrote accompanying tests):

Changed the “structured cloning” algorithm into distinct serialization and deserialization algorithms, thereby introducing an intermediate, serialized, form. This was a long overdue refactoring needed to define MessageChannel objects properly, since when using those you don't always know at the start where you'll end up. For SharedArrayBuffer objects this was critical, since we needed to enforce that they can't be sent across process boundaries.
Redefined the way worker ownership works, so it’s effectively a chain of parent-based ownership rather than all workers being owned by documents. This was necessary as we needed to separate dedicated workers nested in shared workers (not widely supported) from those nested in documents, as memory sharing works differently in these two cases.
Defined the boundaries between which globals you can share memory. For the record, the web platform has many global objects: Window, DedicatedWorkerGlobalScope, SharedWorkerGlobalScope, ServiceWorkerGlobalScope, and soon various subclasses of WorkletGlobalScope. A simplified (and slightly inaccurate) description would be that a window can share with any of its same-origin windows in iframe elements, and any descendant dedicated workers (if there’s no shared/service worker in that chain). A worker (dedicated, shared, or service) can share with any descendant dedicated workers (again, as long as there’s no shared worker in that chain). As worklets aren’t finished yet you’ll have to read up on the actual pull request for the ongoing deliberations. We might post an update when they’re shipping if there’s interest.
Defined a new messageerror event that basically ensures that when message-passing goes wrong that error does not get lost. These errors happen when you cannot allocate enough memory in the destination, or try to pass a SharedArrayBuffer object across a (theoretical) process boundary. As this event is dispatched on the receiving end it’s not the best, but if we detect that libraries often end up passing this information back to the sender we might take care of that at the standards-level at some point. For now messaging errors back was deemed too complicated and not important enough given the conditions under which these occur.
Actually defined how these SharedArrayBuffer objects get serialized and then deserialized, how various platform objects integrate with that, and how all the existing APIs that deal with serialization and deserialization in some manner integrate with that. E.g., passing SharedArrayBuffer objects to pushState() ends up throwing, because we don’t want to store them to disk, but postMessage() should generally work (although initial implementations will have limitations here, especially with MessageChannel).

As always, nothing is perfect and there are some gotchas without a good solution:

Imagine you have a window with a descendant iframe element that has further descendant dedicated workers that all collaborate together with shared memory and then the iframe element gets navigated. This ends up stopping the workers without the ability to do cleanup. Some workarounds are available, but in general it’s a somewhat fragile setup that deserves a better solution.
Aborting scripts: browsers typically let users abort scripts that are detected to significantly slow down their computer through some heuristic. This can violate some of the invariants the shared memory design tries to provide.

Although the above covers the integration of shared memory into the foundations of the web platform, there is still ongoing work on allowing specific APIs to accept and operate on shared memory. This requires changes to IDL to introduce a mechanism for safelisting APIs that can operate on SharedArrayBuffer objects, as well as updating specifications to use that new safelisting mechanism, and of course writing tests for these spec changes. This work is still ongoing, but at least now it can build on top of a solid foundation.

Posted in Browser API | Comments Off on HTML and shared memory

Working mode

April 25th, 2017 by Anne van Kesteren

In a previous post we’ve already explained how interoperability is important to the WHATWG. Without it, we’re writing fiction, and in the world of standards that is no good.

From a similar perspective, we’ve now more clearly documented how the WHATWG creates standards. The Working Mode document describes what is expected of editors and contributors, what criteria any changes to standards must fulfill, and gives guidelines for conflicts and tests.

What has changed the most since 2004 is requiring tests and implementer support for any changes made. These should help ensure that decisions need not be revisited again. Documenting our processes is also new and is born out of necessity due to the wider range of standards the WHATWG maintains.

We appreciate any feedback on the Working Mode document as it can undoubtedly be refined further.

Posted in WHATWG | Comments Off on Working mode

Improving interoperability

January 30th, 2017 by Philip Jägenstedt

The goal of the WHATWG’s Living Standards is to achieve interoperable implementations. With an ever-evolving web platform, we want changes to our standards to reach all implementations quickly and reliably, but from time to time there have been mishaps:

Two table-related interfaces were ignored by implementers for a decade, but when finally removed from the standard it turned out that WebKit had just added them. Sorry!
In 2007, the serialization algorithm for <pre> was changed to add an extra newline, so that it would roundtrip correctly. It was implemented in Presto (Opera) in 2012 (and the standard was tweaked), then implemented in Gecko in 2016, but they found it was not web compatible, so it was reverted. Oh well.
We changed how non-empty URL attributes are reflected with support from two implementers, but nobody followed through with the change and it was later reverted. Ahem.

Three months ago, we changed the process for the HTML Standard to encourage writing tests and filing browser bugs for normative changes. (Normative means that implementations are affected.) This was the first step on a path towards improving interoperability and shortening the feedback cycle, and it has thus far exceeded our own expectations:

“Tests and bugs speed up the turnaround time a lot. Without this it could go years before a browser vendor picked up or objected to a change (even if someone from that browser had given an a-OK for the change). Now some changes have patches in browsers before the change has landed in the spec.”
“Writing tests also increases the quality of the spec, since problems become clear in the tests. It also seems reasonable to assume that the tests help getting interoperable implementations, which is the goal of the spec.”
“I feel much more sure that when I make a spec change, I am doing all I can to get it implemented everywhere, in precisely the manner I intended.”

As an example, see Remove "compatibility caseless" matching where 3 of the 4 browser bugs are now fixed, or Add <script nomodule> to prevent script evaluation where all vendors have indicated support, and WebKit has a patch to implement the proposed feature and are contributing their tests to web-platform-tests—even before the standard’s pull request has landed.

Note in particular that this has not amounted to WHATWG maintainers writing all new tests. Rather, we are a community of maintainers, implementers and other contributors, where tests can be written to investigate current behavior before even discussing a change to the standard, or where the most eager implementer writes tests alongside the implementation.

We have been using this process successfully for other WHATWG standards too, such as Fetch, URL, and Streams. And today, we are elevating this process to all WHATWG standards, as now documented in the WHATWG contributor guidelines.

Posted in WHATWG | Comments Off on Improving interoperability