
Thursday, December 17, 2015

HTML5 Video is now supported in Firefox

Today we’re excited to announce the availability of our HTML5 player in Firefox! Windows support is rolling out this week, and OS X support will roll out next year.

Firefox ships with the very latest versions of the HTML5 Premium Video Extensions. These include the Media Source Extensions (MSE), which enable our video streaming algorithms to adapt to your available bandwidth; the Encrypted Media Extensions (EME), which allow for the viewing of protected content; and the Web Cryptography API (WebCrypto), which implements the cryptographic functions used by our open source Message Security Layer client-server protocol.

We worked closely with Mozilla and Adobe throughout development. Adobe supplies a content decryption module (CDM) that powers the EME API and allows protected content to play. We were pleased to find through our joint field testing that Adobe Primetime's CDM, Mozilla’s <video> tag, and our player all work together seamlessly to provide a high-quality viewing experience in Firefox. With the new Premium Video Extensions, Firefox users will no longer need to take the extra step of installing a plug-in to watch Netflix.

We’re gratified that our HTML5 player support now extends to the latest versions of all major browsers, including Firefox, IE, Edge, Safari, and Chrome. Upgrade today to the latest version of your browser to get our best-in-class playback experience.

Friday, October 31, 2014

Message Security Layer: A Modern Take on Securing Communication

Netflix serves audio and video to millions of devices and subscribers across the globe. Each device has its own unique hardware and software, and differing security properties and capabilities. The communication between these devices and our servers must be secured to protect both our subscribers and our service.
When we first launched the Netflix streaming service we used a combination of HTTPS and a homegrown security mechanism called NTBA to provide that security. However, over time this combination started exhibiting growing pains. With the advent of HTML5 and the Media Source Extensions and Encrypted Media Extensions we needed something new that would be compatible with that platform. We took this as an opportunity to address many of the shortcomings of the earlier technology. The Message Security Layer (MSL) was born from these dual concerns.

Problems with HTTPS

One of the largest problems with HTTPS is its public key infrastructure (PKI). There were a number of short-lived incidents where a renewed server certificate caused outages. We had no good way of handling revocation: our attempts to leverage CRL and OCSP technologies resulted in a complex set of workarounds to deal with infrastructure downtimes and configuration mistakes, which ultimately led to a worse user experience and a brittle security mechanism with little insight into errors. Recent security breaches at certificate authorities and the issuance of intermediate certificate authorities mean that placing trust in one actor requires placing trust in a whole chain of actors not necessarily deserving of trust.
Another significant issue with HTTPS is the requirement for accurate time. The X.509 certificates used by HTTPS contain two timestamps, and if the validating software believes the current time falls outside that window, the connection is rejected. The vast majority of devices do not know the correct time and have no way of securely learning the correct time.
Being tied to SSL and TLS, HTTPS also suffers from fundamental security issues unknown at the time of their design. Examples include padding attacks and the use of MAC-then-Encrypt, which is less secure than Encrypt-then-MAC.
There are other less obvious issues with HTTPS. Establishing a connection requires extra network round trips and depending on the implementation may result in multiple requests to supporting infrastructure such as CRL distribution points and OCSP responders in order to validate a certificate chain. As we continually improved application responsiveness and playback startup time this overhead became significant, particularly in situations with less reliable network connectivity such as Wi-Fi or mobile networks.
Even ignoring these issues, integrating new features and behaviors into HTTPS would have been extremely difficult. The specification is fixed and mandates certain behaviors. Leveraging specific device security features would require hacking the SSL/TLS stack in unintended ways: imagine generating some form of client certificate that used a dynamically generated set of device credentials.

High-level Goals

Before starting to design MSL we had to identify its high-level goals. Beyond general best practices for protocol design, the following objectives are particularly important given the scale of deployment, the fact that it must run on multiple platforms, and the knowledge that it will be used for future, as-yet-unknown use cases.
  • Cross-language. MSL must work across multiple languages, and is particularly subject to JavaScript constraints such as its maximum integer value and the native functions available in web browsers.
  • Automatic error recovery. With millions of devices and subscribers we need devices that enter a bad state to be able to automatically recover without compromising security.
  • Performance. We do not want our application performance and responsiveness to be limited any more than they have to be. The network is by far the most expensive performance cost.
    Figure 1. HTTP vs. HTTPS Performance
  • Flexible and extensible. Whenever possible we want to take advantage of security features provided by devices and their software. Likewise if something no longer provides the security we need then there needs to be a migration path forward.
  • Standards compatible. Although related to being flexible and extensible, we paid particular attention to being standards compatible. Specifically we want to be able to leverage the Web Crypto API now available in the major web browsers.

Security Properties

MSL is a modern cryptographic protocol that takes into account the latest cryptography technologies and knowledge. It supports the following basic security properties.
  • Integrity protection. Messages in transit are protected from tampering.
  • Encryption. Message data is protected from inspection.
  • Authentication. Messages can be trusted to come from a specific device and user.
  • Non-replayability. Messages containing non-idempotent data can be marked non-replayable.
MSL supports two different deployment models, which we refer to as MSL network types. A single device may participate in multiple MSL networks simultaneously.
  • Trusted services network. This deployment consists of a single client device and multiple servers. The client authenticates against the servers. The servers have shared access to the same cryptographic secrets and therefore each server must trust all other servers.
  • Peer-to-peer. This is a typical p2p arrangement where each side of the communication is mutually authenticated.
Figure 2. MSL Networks


Protocol Overview

A typical MSL message consists of a header and one or more application payload chunks. Each chunk is individually protected which allows the sender and recipient to process application data as it is transmitted. A message stream may remain open indefinitely, allowing large time gaps between chunks if desired.
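As a rough illustration (the field names below are hypothetical, not the normative MSL wire format), a message can be modeled as one header plus an ordered sequence of individually protected payload chunks:

```javascript
// Illustrative shape of an MSL message: one header plus a sequence of
// payload chunks, each carrying its own protection so the recipient can
// process chunks as they arrive. All field names here are hypothetical.
function buildMessage(headerData, chunks) {
    return {
        header: headerData,
        payloads: chunks.map(function (data, i) {
            return {
                sequenceNumber: i,                     // chunk ordering
                endOfMessage: i === chunks.length - 1, // last chunk marker
                data: data                             // would be encrypted and MAC'd
            };
        })
    };
}

var msg = buildMessage({ sender: 'device-123' }, ['chunk A', 'chunk B']);
```

Because each chunk carries its own sequence number and end-of-message flag, a stream can stay open indefinitely without the recipient needing to know the total message size up front.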
MSL has pluggable authentication and may leverage any number of device and user authentication types for the initial message. The initial message will provide authentication, integrity protection, and encryption if the device authentication type supports it. Future messages will make use of session keys established as a result of the initial communication.
If the recipient encounters an error when receiving a message it will respond with an error message. Error messages consist of a header that indicates the type of error that occurred. Upon receipt of the error message the original sender can attempt to recover and retransmit the original application data. For example, if the message recipient believes one side or the other is using incorrect session keys the error will indicate that new session keys should be negotiated from scratch. Or if the message recipient believes the device or user credentials are incorrect the error will request the sender re-authenticate using new credentials.
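The sender-side recovery loop described above might be sketched as follows, where the error codes and the transport object are hypothetical stand-ins rather than actual MSL identifiers:

```javascript
// Hypothetical sketch of sender-side error recovery: on an error response,
// perform the recovery step the error indicates, then retransmit the
// original application data.
function sendWithRecovery(transport, data, maxAttempts) {
    for (var attempt = 0; attempt < maxAttempts; attempt++) {
        var response = transport.send(data);
        if (!response.error) return response;
        if (response.error === 'KEY_EXCHANGE_REQUIRED') {
            transport.negotiateSessionKeys();   // rebuild session keys from scratch
        } else if (response.error === 'REAUTH_REQUIRED') {
            transport.reauthenticate();         // obtain fresh device/user credentials
        } else {
            break;                              // unrecoverable error
        }
    }
    return { error: 'FAILED' };
}
```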
To minimize network round-trips MSL attempts to perform authentication, key negotiation, and renewal operations while it is also transmitting application data (Figure 3). As a result MSL does not impose any additional network round trips and adds only minimal data overhead.
Figure 3. MSL Communication w/Application Data
This may not always be possible, in which case an MSL handshake must first occur, after which sensitive data such as user credentials and application data may be transmitted (Figure 4).
Figure 4. MSL Handshake followed by Application Data
Once session keys have been established they may be reused for future communication. Session keys may also be persisted to allow reuse between application executions. In a trusted services network the session keys resulting from a key negotiation with one server can be used with all other servers.

Platform Integration

Whenever possible we would like to take advantage of the security features provided by a specific platform. Doing so often provides stronger security than is possible without leveraging those features.
Some devices may already contain cryptographic keys that can be used to authenticate and secure initial communication. Likewise some devices may have already authenticated the user and it is a better user experience if the user is not required to enter their email and password again.
MSL is a plug-in architecture which allows for the easy integration of different device and user authentication schemes, session key negotiation schemes, and cryptographic algorithms. This also means that the security of any MSL deployment heavily depends on the mechanisms and algorithms it is configured with.
The plug-in architecture also means new schemes and algorithms can be incorporated without requiring a protocol redesign.

Other Features

  • Time independence. MSL does not require time to be synchronized between communicating devices, although certain authentication or key negotiation schemes may impose their own time requirements.
  • Service tokens. Service tokens are very similar to HTTP cookies: they allow applications to attach arbitrary data to messages. However service tokens can be cryptographically bound to a specific device and/or user, which prevents data from being migrated without authorization.

The Release

To learn more about MSL and find out how you can use it for your own applications visit the Message Security Layer repository on GitHub.
The protocol is fully documented and guides are provided to help you use MSL securely for your own applications. Java and JavaScript implementations of an MSL stack are available as well as some example applications. Both languages fully support trusted services and peer-to-peer operation as both client and server.

MSL Today and Tomorrow

With MSL we have eliminated many of the problems we faced with HTTPS and platform integration. Its flexible and extensible design means it will be able to adapt as Netflix expands and as the cryptographic landscape changes.
We are already using MSL on many different platforms including our HTML5 player, game consoles, and upcoming CE devices. MSL can be used just as effectively to secure internal communications. In the future we envision using MSL over Web Sockets to create long-lived secure communication channels between our clients and servers.
We take security seriously at Netflix and are always looking for the best people to join our team. If you are also interested in attacking the challenges of the fastest-growing online streaming service in the world, check out our job listings.

Wesley Miaw & Mitch Zollinger
Security Engineering

Monday, October 21, 2013

HTML5 Video Playback UI

by Kristofer Baxter

In the past we’ve written about HTML5 Video (HTML5 Video in IE11 on Windows 8.1, and HTML5 Video at Netflix) but we haven't spoken much about how we built the player UI. The UI Engineering team here at Netflix has been supporting HTML5-based playback for a little over a year, and now seems like the right time to discuss some of the strategies and techniques we are using to support video playback without a plugin.

One of our main objectives is to keep Netflix familiar to our members. That means we’re keeping the design of the HTML5 player consistent with our Silverlight experience. Features should be rolled out simultaneously for the two platforms. However, HTML5 users will enter playback faster, can enjoy 1080p content when GPU accelerated, and keep all the functionality they know and love.

Silverlight UI / HTML5 UI

In order to achieve a similar look and feel, we needed to recreate a few key elements of the Silverlight UI:

  1. Scale the interface to the user's resolution
  2. Minimize Startup time via minimal dependency on data
  3. Ensure High Performance on low end hardware

Scaling the interface to the user's resolution

No matter what resolution the browser window used for playback is, our current playback UI ensures all of the controls maintain the same percentage size on screen. This lets users choose their own dimensions for playing content without the UI getting in the way.

Normally, a modern web application could implement this using CSS vw and vh units. However, we found this approach inadequate for our needs. Our player can be displayed in two fashions: taking over the viewport's entire initial containing block, or only a smaller portion of it. To solve this, we implemented a sizing scheme based entirely on font-relative lengths.

In this small example, you can see the scaling implementation in a direct form.

<style>
    .netflix-player-wrapper {
        font-size: 16px;
    }
    #netflix-player {
        position: absolute;
        width: 90%; height: 90%;
        left: 5%; top: 5%;
        overflow: hidden;
        background: #ccc;
        font-size: 1em;
    }
    #player-sizing {
        position: absolute;
        width: 1em; height: 1em;
        visibility: hidden;
        font-size: 1em;
    }
    #ten-percent-height {
        position: absolute;
        width: 80%; height: 10em;
        left: 10%; bottom: 8em;
        background: #000;
        display: flex;
    }    
    #ten-percent-height > p {
        display: block;
        margin: 1em;
        font-size: 2em;
        color: #fff;
    }
</style>
<div class="netflix-player-wrapper">
    <div id="netflix-player">
        <div id="player-sizing"></div>
        <div id="ten-percent-height"><p>Text</p></div>
    </div>
</div>
<script>
(function () {
    var sizingEl = document.getElementById("player-sizing"),
        controlWrapperEl = document.getElementById('netflix-player'),
        currentEmSize = 1.0;
                   
    // Compare the height of the 1em x 1em probe element against 1% of the
    // player's height, then scale the base font size until they match.
    function resize() {
        var wrapperHeight = controlWrapperEl.getBoundingClientRect().height,
            sizingHeight = sizingEl.getBoundingClientRect().height,
            wrapperOnePercentHeight = wrapperHeight / 100,
            offsetSize;

        if (sizingHeight > wrapperOnePercentHeight) {
            // 1em is too large: shrink proportionally.
            offsetSize = sizingHeight / wrapperOnePercentHeight;
            currentEmSize = currentEmSize / offsetSize;
        } else if (wrapperOnePercentHeight > sizingHeight) {
            // 1em is too small: grow proportionally.
            offsetSize = wrapperOnePercentHeight / sizingHeight;
            currentEmSize = currentEmSize * offsetSize;
        }
        controlWrapperEl.style.fontSize = currentEmSize + "em";
    }
                
    window.addEventListener("resize", resize, false);
    resize();
})();
</script>

We implement this resizing functionality on a debounced interval in the player UI. Triggering it on every window resize would be wasteful.
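One way to get that behavior is a small trailing-edge debounce wrapper along the following lines (the delay value is illustrative, not the one our player uses):

```javascript
// Collapse a burst of events into a single trailing call: each new call
// resets the timer, so fn only runs after the burst goes quiet.
function debounce(fn, delayMs) {
    var timer = null;
    return function () {
        var args = arguments, self = this;
        clearTimeout(timer);
        timer = setTimeout(function () { fn.apply(self, args); }, delayMs);
    };
}

// Usage (in a browser):
// window.addEventListener("resize", debounce(resize, 150), false);
```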

By making an em unit represent 1% height of the "netflix-player" container, we can size all of our onscreen elements in a scaling manner - no matter how or where the netflix-player container is placed in the document.

Minimize Startup time via minimal dependency on data

Browser plugins like Flash and Silverlight can take several seconds to initialize, especially on a freshly booted machine. Now that we no longer need to initialize a plugin to play content, we can begin playback faster. However, we learned a lot about quick video startup in Silverlight, and can borrow techniques we developed to make our HTML5 UI launch content even faster.

When possible, allow playback to begin without title metadata.

If we already know which title the customer has selected to play (like a specific episode or movie), we can start playback of that title immediately. Once content has begun to buffer, the UI can request display metadata. Metadata for the player can be a large payload, since it includes episode data (title, synopsis, predicted rating) and is personalized to the user. By delaying the retrieval of metadata, users begin streaming 500 to 1200ms sooner in real-world usage.
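In sketch form (the function names below are hypothetical stand-ins, not our actual player API), playback kicks off first and the metadata request resolves later to fill in the UI:

```javascript
// Start the stream immediately; fetch display metadata afterwards so the
// large personalized payload never blocks playback start.
function playTitle(videoId, startPlayback, fetchMetadata, onMetadata) {
    startPlayback(videoId);                         // buffering begins right away
    return fetchMetadata(videoId).then(onMetadata); // UI fills in when data arrives
}
```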

For other conditions, such as when a customer clicks play on a TV show and we want to start playback at the last episode they were watching, we retrieve the specific episode the user wants before starting the playback process.

Populate controls which depend on rich data as that data becomes available.

Since we can begin playback before the player UI knows anything except which title to play, the player UI needs to be resilient against missing metadata. We display a minimal number of controls while this data is being requested. These controls include play/pause, exit playback, and full-screen toggling.

We use an eventing framework to let individual components know when data state has changed, so each component can stay decoupled. Here’s an example showing how we handle an event telling us the metadata is now loaded for the title.

function populateStatus() {
    if (Metadata.videoIsKnown(ObjectPool.videoId())) {
        // Update Status to reflect current playing item.
    } else {
        // Hide or remove current status
    }
}

Metadata.addEventListener(Metadata.knownEvents.METADATA_LOADED, populateStatus);

Ensure High Performance on all hardware

Not everyone has the latest and greatest hardware at their disposal, but that shouldn't prevent any device from playing Netflix content. To this end, we develop using a wide variety of hardware and test using a wide range of representative devices.

We’ve found the issues preventing great performance on low end hardware can mostly be avoided by adhering to the following best practices:

Avoid repaints and reflows whenever possible.

Reflows and repaints while playing content are quite costly to overall performance and battery life. As a result, we batch reads and writes to the DOM wherever possible. This helps us avoid accidental reflows.
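A generic sketch of the batching pattern: queue all reads, then all writes, so layout is never invalidated between measurements. The simple array-based scheduler here is illustrative, not our actual framework.

```javascript
// Run every queued read before any queued write; interleaving the two is
// what forces the browser into repeated synchronous reflows.
var readQueue = [], writeQueue = [];

function scheduleRead(fn)  { readQueue.push(fn); }
function scheduleWrite(fn) { writeQueue.push(fn); }

function flush() {
    readQueue.splice(0).forEach(function (fn) { fn(); });  // measure phase
    writeQueue.splice(0).forEach(function (fn) { fn(); }); // mutate phase
}
```

In a browser, `flush` would typically be scheduled once per frame (for example via `requestAnimationFrame`) rather than called directly.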

Take advantage of getBoundingClientRect to determine the size of an object.

This is a very fast way to get the dimensions of an object. However, it isn’t a free operation and results should be cached whenever possible.

Caching the size of objects when dragging, instead of recalculating them every time they are needed, is one such way to reduce the number of calls in quick succession.

// handleEl, wrapperEl, dragging, currentValue, currentMax, pointerEventData,
// and isValidEventLocation are defined elsewhere in the surrounding player code.
function setupPointerData(e) {
    pointerEventData.dimensions = {
        handleEl:  handleEl.getBoundingClientRect(),
        wrapperEl: wrapperEl.getBoundingClientRect()
    };
    pointerEventData.drag = {
        start: { value: currentValue, max: currentMax },
        pointer: { x: e.pageX, y: e.pageY }
    };
}

function pointerDownHandler(e) {
    if (handleEl.contains(e.target)) {
        if (!dragging) {
            setupPointerData(e);
            dragging = true;
        }
    }
}

function pointerMoveHandler(e) {
    if (dragging && isValidEventLocation(e)) {
        if (!pointerEventData || !pointerEventData.dimensions) {
            setupPointerData(e);
        }
        // Use the handleEl dimensions, wrapperEl dimensions, 
        // and the event values to change the DOM.
    }
}

We have a lot of work planned

We’re working on exciting new features and constantly improving our HTML5 Video UI, and we’re looking for help. Our growing team is looking for experts to join us. If you’d like to apply, take a look here.

Monday, July 15, 2013

NfWebCrypto: a Web Cryptography API Native Polyfill

At Netflix we are excited to build an HTML5-based player for our service, as described in a previous blog post. One of the “Premium Video Extensions” mentioned in that post is the Web Cryptography API, which “describes a JavaScript API for performing basic cryptographic operations in web applications, such as hashing, signature generation and verification, and encryption and decryption.” Netflix uses this API to secure the communication between our JavaScript and the Netflix servers.

The Web Cryptography WG of the W3C (of which Netflix is a member) produces the Web Cryptography API specification. Currently the spec is in the Working Draft stage and some browser vendors are waiting until the spec is more finalized before proceeding with their implementations. A notable exception is Microsoft, who worked with us to implement a draft version of the spec in Internet Explorer 11 for Windows 8.1 Preview, which now allows plugin-free Netflix video streaming.

To continue integrating our HTML5 application with other browsers, we decided to implement a polyfill based on the April 22, 2013 Editor’s Draft of the Web Cryptography API specification plus some other proposals under discussion. While similar in principle to JavaScript-based Web Crypto polyfills such as PolyCrypt, ours is implemented in native C++ (using OpenSSL 1.0.1c) to avoid the security risks of doing crypto in pure JavaScript. And because crypto functionality does not require deep browser integration, we were able to implement the polyfill as a stand-alone browser plugin, with our first implementation targeting Google’s Chrome browser using the Pepper Plugin API (PPAPI) framework.

So that you can also experiment with cryptography on the web, and to support the ongoing development of the specification in the W3C, we’ve released this NfWebCrypto plugin implementation as open source under the Apache Version 2.0 license. While NfWebCrypto is not yet a complete implementation of the Web Cryptography API, and may differ from the most recent version of the rapidly changing spec, we believe it has the mainstream crypto features many web applications will require. This means that you can use this plugin to try a version of the Web Cryptography API now, before it comes to your favorite browser.

At the moment the plugin is only supported in Chrome on Linux amd64 (tested in Ubuntu 12.04). For the latest details of what works and what does not, please see the README file in the NfWebCrypto GitHub repository. Here is a summary of the algorithms that are currently supported:

  • SHA1, SHA224, SHA256, SHA384, SHA512: digest
  • HMAC SHA: sign, verify, importKey, exportKey, generateKey
  • AES-128 CBC w/ PKCS#5 padding: encrypt, decrypt, importKey, exportKey, generateKey
  • RSASSA-PKCS1-v1_5: sign, verify, importKey, generateKey
  • RSAES-PKCS1-v1_5: encrypt, decrypt, importKey, exportKey, generateKey
  • Diffie-Hellman: generateKey, deriveKey
  • RSA-OAEP: wrapKey*, unwrapKey*
  • AES-KW: wrapKey*, unwrapKey*
*Wrap/Unwrap operations follow the Netflix KeyWrap Proposal and support protection of the JWE payload with AES128-GCM.

NfWebCrypto will of course be obsolete once browser vendors complete their implementations. In the meantime, this plugin is a stop-gap measure to allow people to move forward with cryptography on the web. Since finalization of the spec may still be some time away, we hope the community will benefit from this early look. We also hope that a concrete implementation will provide a backdrop against which the evolving spec can be evaluated. Finally, the NfWebCrypto JavaScript unit tests and perhaps the actual C++ implementation may be useful references for browser vendors.

Moving forward, we plan to keep pace with the W3C spec the best we can as it evolves. We welcome contributions to NfWebCrypto from the open source community, particularly in the areas of security audits, expanding the unit tests, and porting to other browser plugin frameworks and platforms.

You can find NfWebCrypto at the Netflix Open Source Center on GitHub.

Thursday, December 20, 2012

Building the Netflix UI for Wii U

Hello, my name is Joubert Nel and I’m a UI engineer on the TV UI team here at Netflix. Our team builds the Netflix experiences for hundreds of TV devices, like the PlayStation 3, Wii, Apple TV, and Google TV.

We recently launched on Nintendo’s new Wii U game console. Like other Netflix UIs, we present TV shows and movies we think you’ll enjoy in a clear and fast user interface. While this UI introduces the first Netflix 1080p browse UI for game consoles, it also expands on ideas pioneered elsewhere like second screen control.


Virtual WebKit Frame

Like many of our other device UIs, our Wii U experience is built for WebKit in HTML5. Since the Wii U has two screens, we created a Virtual WebKit Frame, which partitions the UI into one area that is output to TV and one area that is output to the GamePad.

This gives us the flexibility to vary what is rendered on each screen as the design dictates, while sharing application state and logic in a single JavaScript VM. We also have a safe zone between the TV and GamePad areas so we can animate elements off the edge of the TV without appearing on the GamePad.

We started off with common Netflix TV UI engineering performance practices such as view pooling and accelerated compositing. View pooling reuses DOM elements to minimize DOM churn, and Accelerated Compositing (AC) allows us to designate certain DOM elements to be cached as a bitmap and rendered by the Wii U’s GPU.

In WebKit, each DOM node that produces visual output has a corresponding RenderObject, stored in the Render Tree. In turn, each RenderObject is associated with a RenderLayer. Some RenderLayers get backing surfaces when hardware acceleration is enabled. These layers are called compositing layers and they paint into their backing surfaces instead of the common bitmap that represents the entire page. Subsequently, the backing surfaces are composited onto the destination bitmap. The compositor applies the transformations specified by the layer’s CSS -webkit-transform to the layer’s surface before compositing it. When a layer is invalidated, only its own content needs to be repainted and re-composited. If you’re interested in learning more, I suggest reading GPU Accelerated Compositing in Chrome.
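In practice, opting an element into its own compositing layer on WebKit of that era could be as simple as giving it a 3D transform (the selector name below is illustrative):

```css
/* Hint WebKit to promote this element to a GPU-backed compositing layer,
   so moving it only re-composites rather than repaints the whole page. */
.poster-row {
    -webkit-transform: translateZ(0);
}
```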


Performance

After modifying the UI to take advantage of accelerated compositing, we found that the frame rate on device was still poor during vertical navigation, even though it rendered at 60fps in desktop browsers.

When the user browses up or down in the gallery, we animate 4 rows of poster art on TV and mirror those 4 rows on the GamePad. Preparing, positioning, and animating only 4 rows allows us to reduce (expensive) structural changes to the DOM while being able to display many logical rows and support wrapping. Each row maintains up to 14 posters, requiring us to move and scale a total of 112 images during each up or down navigation. Our UI’s posters are 284 x 405 pixels and eat up 460,080 bytes of texture memory each, regardless of file size. (You need 4 bytes to represent each pixel’s RGBA value when the image is decompressed in memory.)


Layout of poster art in the gallery



To improve performance, we tried a number of animation strategies, but none yielded sufficient gains. We knew that when we kicked off an animation, there was an expensive style recalculation. But the WebKit Layout & Rendering timeline didn’t help us figure out which DOM elements were responsible.

WebKit Layout & Rendering Timeline


We worked with our platform team to help us profile WebKit, and we were now able to see how DOM elements relate to the Recalculate Style operations.

Our instrumentation helps us visualize the Recalculate Style call stack over time:
Instrumented Call Stack over Time



Through experimentation, we discovered that for our UI, there is a material performance gain when setting inline styles instead of modifying classes on elements that participate in vertical navigation.

We also found that some CSS selector patterns cause deep, expensive Recalculate Style operations. It turns out that the mere presence of the following pattern in CSS triggers a deep Recalculate Style:

.list-showing #browse { … }

Moreover, a -webkit-transition with duration greater than 0 causes the Recalculate Style operations to be repeated several times during the lifetime of the animation.
After removing all CSS selectors of this pattern, the resulting Recalculate Style shape is shallower and consumes less time.
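A sketch of the rewrite (the class names and properties are illustrative): put the state class directly on the element whose style changes, so the recalculation stays shallow instead of cascading from high in the tree.

```css
/* Before: a state class high in the tree forces a deep Recalculate Style
   whenever .list-showing is toggled. */
.list-showing #browse { opacity: 0.5; }

/* After: toggle a class on #browse itself; only that subtree recalculates. */
#browse.dimmed { opacity: 0.5; }
```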


Delivering great experiences

Our team builds innovative UIs, experiments with new concepts using A/B testing, and continually delivers new features. We also have to make sure our UIs perform fast on a wide range of hardware, from inexpensive consumer electronics devices all the way up to more powerful devices like the Wii U and PS3.

If this kind of innovation excites you as much as it does me, join our team!