Vimeo Engineering Blog - Medium

Unity, Vimeo, Sundance: the tech that’s changing storytelling

Casey Pugh — Thu, 08 Feb 2018 19:36:28 GMT

2017 was a very big year for Vimeo. Amidst a flurry of product launches (hello 360, hello live), we also formed Vimeo Creator Labs, a stealthy team that is hard at work exploring the intersection of video, film, art, media, and technology. We’re focusing on anything new and emerging in the video space — think VR, AR, robot brains, blockchain, and beyond.

https://medium.com/media/ff7cb06f48618bcf8f4d280a9e9bf212/href

All of these new developments made our return to Sundance Film Festival that much more exciting. We had 35 of our beloved Staff Pick alums attending, and they’re working on some truly stunning projects. This year, we were also super proud to have an alliance with New Frontier, because we believe that new media is pushing the industry forward. It was there that we had the opportunity to speak with some of our favorite creators about the future of storytelling.

Unity Realtime Demo, Book of the Dead

In addition to some of the sweet events we hosted in Park City (meet-ups, filmmaker hangs, secret late-night underground nacho parties, and watching short films upon short films), we were thrilled to share some of Creator Labs’ work in a panel we organized with Unity. Joining us on the panel were Nico Casavecchia and Martin Allais, both Staff Picked filmmakers, and the creators behind the VR short “BattleScar.” These two put forward one of the most compelling VR experiences at New Frontier this year, so it was great to have them contributing to the conversation. And don’t worry, if you missed us in Park City, peep the quick recap of our discussion in the video above!

BattleScar features 16-year-old Lupe, a wannabe tough kid, voiced by Rosario Dawson.

Beyond what you’ll see in the video, another exciting highlight came from Unity. They showcased the power of their real-time graphics engine, demonstrating their movement away from the stigma of being just a gaming engine, and instead towards empowering creators to tell boundary-breaking stories in new and exciting mediums. They’re also completely rethinking the creative workflow. With a real-time graphics engine, VFX teams no longer need to create in a linear fashion, but can now collaborate simultaneously. This is huge, because it saves tons of time by reducing operational overhead. Even Neill Blomkamp’s Oats Studios have been leveraging Unity for their latest short films.

It was a very inspiring year at Sundance, as VR and immersive storytelling projects seem to be turning a corner. We’re eager to explore further within this space, so we’ve prototyped some tools for Unity and WebVR to augment the tool belt of new media creators. For example, creators could leverage Vimeo’s hosting and high-quality player to stream video into their 2D or 3D experiences. We’re also making it easy to capture real-time video from Unity and share it with your team for review.

The future is looking pretty cool. See you on Mars.

Are you a new media creator? Comment below on the types of tools you use, and let us know what you’d love to see from Vimeo. If you’re still having Sundance FOMO, you can watch our curation team’s favorite short films from this year’s festival.

Originally published at https://vimeo.com/blog/post/unity-vimeo-sundance-tech-changing-storytelling.

Unity, Vimeo, Sundance: the tech that’s changing storytelling was originally published in Vimeo Engineering Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Introducing the new Vimeo upload API

Vimeo API — Wed, 07 Feb 2018 15:52:24 GMT

Vimeo’s got some pretty big news to share today. We’ve just rolled out an all-new, all-awesome version of our API — and we want you to be the first to use it. (Well, among the first. We did send out an email blast or two to some of you.)

We’re calling this version 3.4, and it’s packed with new features, enhancements, and ways to make your video life better. But if we had to pick just one word to describe it, that word would be uploads. That’s because you get more of everything: reliable file transfers, responsive upload endpoints, streamlined upload workflows, and just-plain-better upload experiences than in any previous Vimeo API.

The best place to get started is our developer site, but before you jump in, here’s a quick look at the topline points.

The tus standard comes standard

Resuming paused or interrupted uploads is easier than ever thanks to our brand new implementation of tus, an open-source protocol for transferring large files over the internet.

A tus upload begins much like any other, with an authorized POST request to /me/videos:

POST /me/videos HTTP/1.1
Host: api.vimeo.com
Authorization: bearer {access_token}
Content-Type: application/json

{
    "upload" : {
        "approach" : "tus",
        "size" : "{size}"
    }
}

The response comes back with a complete video representation — including the newly consolidated upload object, your one-stop shop for all upload fields:

{
…
    "upload" : {
        "status" : "in_progress",
        "upload_link" : "{upload_link}",
        "approach" : "tus",
        "size" : {size}
    },
…
}

Grab the upload link, and PATCH the binary video data to this location, along with some necessary headers:

PATCH {path_from_upload_link} HTTP/1.1
Host: {host_from_upload_link}
Content-Type: application/offset+octet-stream
Upload-Offset: 0
Tus-Resumable: 1.0.0

{binary_video_data}

If the upload stops, whether because of a lost internet connection or because your end user paused the transfer, the API returns a set of headers, including Upload-Offset. The value of this header tells you exactly where to restart the upload, which you do with another PATCH to upload.upload_link:

PATCH {path_from_upload_link} HTTP/1.1
Host: {host_from_upload_link}
Content-Type: application/offset+octet-stream
Upload-Offset: {upload_offset}
Tus-Resumable: 1.0.0

{remaining_binary_video_data}

Of course, the upload might have stopped because we received the entire file. If we did, the value of Upload-Offset is equal to {size} from the original POST.

The old way of resuming uploads through the API called for real-time manual calculations of bytes received and which byte number to pick up from. It was, and we quote, “a royal pain.” But with tus, you don’t need a calculator, because tus is the calculator.
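To make the resume step concrete, here is a minimal Python sketch of what a client might do with the Upload-Offset value. It only prepares the headers and the remaining bytes for the next PATCH; it makes no network calls. The function names next_patch and is_complete are made up for this illustration and are not part of any Vimeo library.

```python
from typing import Dict, Tuple

TUS_VERSION = "1.0.0"  # the Tus-Resumable value used in the requests above


def next_patch(data: bytes, upload_offset: int) -> Tuple[Dict[str, str], bytes]:
    """Return the headers and body for the next PATCH to upload.upload_link.

    upload_offset is the Upload-Offset value the server last reported.
    (Hypothetical helper, for illustration only.)
    """
    if not 0 <= upload_offset <= len(data):
        raise ValueError("offset lies outside the file")
    headers = {
        "Content-Type": "application/offset+octet-stream",
        "Upload-Offset": str(upload_offset),
        "Tus-Resumable": TUS_VERSION,
    }
    # Everything before the offset has already been received, so send the rest.
    return headers, data[upload_offset:]


def is_complete(upload_offset: int, size: int) -> bool:
    """The upload is done once the reported offset equals the declared size."""
    return upload_offset == size
```

If the server reports an Upload-Offset of 3 for an 8-byte file, next_patch hands back the last 5 bytes and a matching Upload-Offset header, with no byte arithmetic needed on the caller's side.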

Metadata first

Another improvement in 3.4 is that it gives you the ability to set video metadata during upload. Simply add the metadata fields and their values to the body of the initial POST request:

POST /me/videos HTTP/1.1
Host: api.vimeo.com
Authorization: bearer {access_token}
Content-Type: application/json

{
    "upload" : {
        "approach" : "tus",
        "size" : "{size}"
    },
    "name" : "{name}",
    "description" : "{description}",
    "privacy" : {
        "view" : "{privacy_view_setting}"
    },
    "embed" : {
        "playbar" : "{embed_playbar_setting}"
    }
}

Prior to 3.4, you’d PATCH the metadata to /videos/{video_id}, but only after you uploaded the file. And since the display name of a video is a function of metadata, many uploads started life as Untitled on Vimeo. You can keep that tradition going if you want. But why?
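For anyone assembling that request body in code, a small Python sketch might look like the following. The helper name build_create_body and its parameters are assumptions for this example; only the field names (upload, name, description, privacy.view) come from the request shown above.

```python
import json
from typing import Any, Dict, Optional


def build_create_body(
    size: int,
    name: Optional[str] = None,
    description: Optional[str] = None,
    privacy_view: Optional[str] = None,
) -> str:
    """Serialize a JSON body for POST /me/videos with optional metadata.

    (Hypothetical helper, for illustration only.)
    """
    body: Dict[str, Any] = {
        # The request example above sends size as a quoted string, so mirror that.
        "upload": {"approach": "tus", "size": str(size)},
    }
    # Only include the metadata fields the caller actually supplied.
    if name is not None:
        body["name"] = name
    if description is not None:
        body["description"] = description
    if privacy_view is not None:
        body["privacy"] = {"view": privacy_view}
    return json.dumps(body)
```

Leaving a field out simply omits it from the request, which is how a video would still end up Untitled.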

It’s all in the details (and the curly braces)

We’ve also tightened up the high-level logic of the API and improved consistency across the board.

Case in point: when you send a POST request to /me/videos, you get back a complete representation of the video, no matter the value of the approach parameter. And that representation includes the upload object, which you’ve already seen in action in our description of tus.

The upload object gives you everything you need to continue with the upload according to the value of approach. So when approach is tus, upload.upload_link contains the URL for the upload. But when approach is post, our form-based upload approach, upload.upload_link is null, while upload.form contains the markup for the HTML upload form.
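A client can branch on the approach field to decide what to do next. Here is a hedged Python sketch of that decision; the function name upload_target and the returned tuple shape are my own invention, while the fields it reads (upload.approach, upload.upload_link, upload.form) are the ones described above.

```python
from typing import Any, Dict, Tuple


def upload_target(video: Dict[str, Any]) -> Tuple[str, str]:
    """Pick the next step from the upload object of a video representation.

    Returns ("patch", url) for tus uploads and ("form", markup) for
    form-based uploads. (Hypothetical helper, for illustration only.)
    """
    upload = video["upload"]
    approach = upload["approach"]
    if approach == "tus":
        # For tus, upload_link holds the URL to PATCH the binary data to.
        return "patch", upload["upload_link"]
    if approach == "post":
        # For post, upload_link is null and form carries the HTML markup.
        return "form", upload["form"]
    raise ValueError(f"unhandled upload approach: {approach!r}")
```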

And so much more in 3.4

That covers the highlights, but there are many more improvements to discover. See our Changelog for complete details. And when you’re ready to start, our dev site awaits.

Introducing the new Vimeo upload API was originally published in Vimeo Engineering Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Graphing systems metrics with netdata, Prometheus, and Grafana

Louis DeLosSantos — Wed, 13 Sep 2017 17:55:46 GMT

In this article I will walk you through the basics of getting Netdata, Prometheus and Grafana all working together and monitoring your application servers. This article will be using Docker on your local workstation. We will be working with Docker in an ad-hoc way, launching containers that run ‘/bin/bash’ and attaching a TTY to them. I use Docker here in a purely academic fashion and do not condone running netdata in a container. I pick this method so individuals without cloud accounts or access to VMs can try this out and for its speed of deployment.

Why netdata, Prometheus, and Grafana

Some time ago I was introduced to netdata by a colleague. We were attempting to troubleshoot Python code which seemed to be bottlenecked. I was instantly impressed by the number of metrics netdata exposes. I quickly added netdata to my set of go-to tools when troubleshooting systems performance.

Later, I was introduced to Prometheus. Prometheus is a monitoring application which flips the normal architecture around and polls REST endpoints for its metrics. This architectural change greatly simplifies monitoring and decreases the time necessary to begin monitoring your applications. Compared to current monitoring solutions, the time spent on designing the infrastructure is greatly reduced. Running a single Prometheus server per application becomes feasible with the help of Grafana.

Grafana has been the go-to graphing tool for some time now. It’s awesome. Anyone that has used it knows it’s awesome. We can point Grafana at Prometheus and use Prometheus as a data source.

All this together allows a pretty simple overall monitoring architecture: Install netdata on your application servers, point Prometheus at netdata, and then point Grafana at Prometheus.

I’m omitting an important ingredient in this stack in order to keep this tutorial simple, and that is service discovery. My personal preference is to use Consul. Prometheus can plug into Consul and automatically begin to scrape new hosts that register a netdata client with Consul.

At the end of this tutorial you will understand how each technology fits together to create a modern monitoring stack. This stack will offer you visibility into your application and systems performance.

Getting started — netdata and containers

To begin let’s create our container which we will install netdata on. We need to run a container, forward the necessary port that netdata listens on, and attach a TTY so we can interact with the bash shell on the container. But, before we do this we want name resolution between the two containers to work. In order to accomplish this we will create a user-defined network and attach both containers to this network. I have included the first command to run below.

docker network create --driver bridge netdata-tutorial

With this user-defined network created we can now launch our container we will install netdata on and point it to this network.

docker run -it --name netdata --hostname netdata --network=netdata-tutorial -p 19999:19999 centos:latest '/bin/bash'

This command creates an interactive TTY session (-it), gives the container both a name in relation to the Docker daemon and a hostname (this is so you know which container is which when working in the shells, and so Docker maps hostname resolution to this container), attaches it to our user-defined network (--network=netdata-tutorial), forwards the local port 19999 to the container’s port 19999 (-p 19999:19999), chooses the base container image (centos:latest), and sets the command to run (/bin/bash). After running this you should be sitting inside the shell of the container.

After we enter the shell we can install netdata. This process could not be easier. If you take a look at this link, the netdata devs give us several one-liners to install netdata. I have not had any issues with these one liners and their bootstrapping scripts so far (if you run into anything do comment below!). Run the following command in your container.

bash <(curl -Ss https://my-netdata.io/kickstart.sh) --dont-wait

After the install completes you should be able to hit the netdata dashboard at http://localhost:19999/ (replace localhost if you’re doing this on a VM or have the Docker container hosted on a machine not on your local system). If this is your first time using netdata I suggest you take a look around. The amount of time I’ve spent digging through /proc and calculating my own metrics has been greatly reduced by this tool. Take it all in.

Next, I want to draw your attention to a particular endpoint. Navigate to http://localhost:19999/api/v1/allmetrics?format=prometheus&help=yes in your browser. This is the endpoint which publishes all the metrics in a format which Prometheus understands. Let’s take a look at one of these metrics.

netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="system"} 0.0831255 1501271696000

This metric represents several things, which I will cover in more detail in the section on Prometheus. For now, understand that this metric, `netdata_system_cpu_percentage_average`, has several labels: [chart, family, dimension]. This corresponds with the first CPU chart you see on the netdata dashboard.

This chart is called ‘system.cpu’, the family is cpu, and the dimension we are observing is “system”. You can begin to draw links between the charts in netdata and the Prometheus metrics format in this manner.
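If it helps to see the pieces pulled apart, here is a small Python sketch that splits one such line into its metric name, labels, value, and timestamp. It is a simplified parser written for this tutorial, not the official Prometheus parser: it assumes plain ASCII double quotes (as netdata emits them) and label values without escaped quotes. The name parse_sample is made up for the example.

```python
import re
from typing import Dict, Optional, Tuple

# name{labels} value [timestamp]
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+'
    r'(?P<value>\S+)(?:\s+(?P<ts>\d+))?$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_sample(line: str) -> Tuple[str, Dict[str, str], float, Optional[int]]:
    """Split one labelled sample line into (name, labels, value, timestamp_ms)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a labelled sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels")))
    ts = match.group("ts")
    return (
        match.group("name"),
        labels,
        float(match.group("value")),
        int(ts) if ts is not None else None,
    )


line = ('netdata_system_cpu_percentage_average'
        '{chart="system.cpu",family="cpu",dimension="system"} '
        '0.0831255 1501271696000')
name, labels, value, timestamp_ms = parse_sample(line)
```

Here labels comes back as a dictionary keyed by chart, family, and dimension, which is the same link between netdata charts and Prometheus labels described above.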

Installing Prometheus

We will be installing Prometheus in a container for the purpose of demonstration. While Prometheus does have an official container, I would like to walk through the install process and setup on a fresh container. This will allow anyone reading to migrate this tutorial to a VM or server of any sort.

Let’s start another container in the same fashion as we did the netdata container in the previous section.

docker run -it --name prometheus --hostname prometheus --network=netdata-tutorial -p 9090:9090 centos:latest '/bin/bash'

This should drop you into a shell once again. Once there, quickly install your favorite editor as we will be editing files later in this tutorial.

yum install vim -y

Prometheus provides a tarball of their latest stable versions, so let’s download the latest and install into your container.

curl -L 'https://github.com/prometheus/prometheus/releases/download/v1.7.1/prometheus-1.7.1.linux-amd64.tar.gz' -o /tmp/prometheus.tar.gz

mkdir /opt/prometheus

tar -xf /tmp/prometheus.tar.gz -C /opt/prometheus/ --strip-components 1

This should get Prometheus installed into the container. Let’s test that we can run Prometheus and connect to its web interface. It will look similar to what follows:

[root@prometheus prometheus]# /opt/prometheus/prometheus
INFO[0000] Starting prometheus (version=1.7.1, branch=master, revision=3afb3fffa3a29c3de865e1172fb740442e9d0133) source="main.go:88"
INFO[0000] Build context (go=go1.8.3, user=root@0aa1b7fc430d, date=20170612-11:44:05) source="main.go:89"
INFO[0000] Host details (Linux 4.9.36-moby #1 SMP Wed Jul 12 15:29:07 UTC 2017 x86_64 prometheus (none)) source="main.go:90"
INFO[0000] Loading configuration file prometheus.yml source="main.go:252"
INFO[0000] Loading series map and head chunks... source="storage.go:428"
INFO[0000] 0 series loaded. source="storage.go:439"
INFO[0000] Starting target manager... source="targetmanager.go:63"
INFO[0000] Listening on :9090 source="web.go:259"

Now, attempt to go to http://localhost:9090/. You should be presented with the Prometheus homepage. This is a good point to talk about Prometheus’s data model. As explained, we have two key elements in Prometheus metrics. We have the metric and its labels. Labels allow for granularity between metrics. Let’s use our previous example to further explain.

netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="system"} 0.0831255 1501271696000

Above, our metric is “netdata_system_cpu_percentage_average” and our labels are chart, family, and dimension. The last two fields are the sample value itself (how to interpret it depends on the metric type: gauge, counter, and so on) and the timestamp of that sample in milliseconds since the Unix epoch. We can begin graphing system metrics with this information, but first we need to hook up Prometheus to poll netdata stats.

Let’s move our attention to Prometheus’s configuration. Prometheus gets its config from the file located, in our example, at `/opt/prometheus/prometheus.yml`. I won’t spend an extensive amount of time going over the configuration values. We will be adding a new “job” under the “scrape_configs”. Let’s make the “scrape_configs” section look like this (we can use the DNS name netdata due to the custom user-defined network we created in Docker beforehand).

scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'netdata'

    metrics_path: /api/v1/allmetrics
    params:
      format: [ prometheus ]

    static_configs:
      - targets: ['netdata:19999']

Let’s start Prometheus once again by running `/opt/prometheus/prometheus`. If we now navigate to Prometheus at “http://localhost:9090/targets” we should see our target being successfully scraped. If we now go back to Prometheus’s homepage and begin to type “netdata_”, Prometheus should autocomplete the metrics it is now scraping.
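To see how a job definition turns into the URL that actually gets polled, here is a small Python sketch that combines scheme, target, metrics_path, and params the way the configuration above implies. This is my own illustration of the idea, not Prometheus code, and the function name scrape_url is hypothetical.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode


def scrape_url(
    target: str,
    metrics_path: str = "/metrics",   # the default noted in the config comments
    scheme: str = "http",             # likewise the documented default
    params: Optional[Dict[str, List[str]]] = None,
) -> str:
    """Build the URL a scrape of one target would request."""
    url = f"{scheme}://{target}{metrics_path}"
    # doseq=True expands list values the way the YAML params block lists them.
    query = urlencode(params or {}, doseq=True)
    return f"{url}?{query}" if query else url


netdata_url = scrape_url(
    "netdata:19999",
    metrics_path="/api/v1/allmetrics",
    params={"format": ["prometheus"]},
)
```

The netdata job resolves to the same allmetrics endpoint we opened in the browser earlier, just addressed by the container's DNS name instead of localhost.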

Begin graphing

Let’s now start exploring how we can graph some metrics. Back in our netdata container let’s get the CPU spinning with a pointless busy loop. On the shell do the following:

[root@netdata /]# while true; do echo "HOT HOT HOT CPU"; done

Our netdata CPU graph should be showing some activity. Let’s represent this in Prometheus. In order to do this, let’s keep our metrics page open for reference: http://localhost:19999/api/v1/allmetrics?format=prometheus&help=yes

We are setting out to graph the data in the CPU chart so let’s search for “system.cpu” in the metrics page above. We come across a section of metrics with the first comments,

# COMMENT homogeneus chart "system.cpu", context "system.cpu", family "cpu", units "percentage"

followed by the metrics.

This is a good start. Now, let’s drill down to the specific metric we would like to graph.

# COMMENT netdata_system_cpu_percentage_average: dimension "system", value is percentage, gauge, dt 1501275951 to 1501275951 inclusive
netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="system"} 0.0000000 1501275951000

Here we learn that the metric name we care about is “netdata_system_cpu_percentage_average”. Throw this into Prometheus, and let’s see what we get. We should see something similar to this (I shut off my busy loop):

This is a good step toward what we want. Also note that Prometheus tags on an “instance” label for us, which corresponds to the target we statically defined in the configuration file. This allows us to tailor our queries to specific instances. Now we need to isolate the dimension we want, so let’s refine the query slightly by filtering on the dimension label. Place the following into the query text box.

netdata_system_cpu_percentage_average{dimension="system"}

You should now wind up with the following graph.

Awesome, this is exactly what we wanted. If you haven’t caught on yet, we can emulate entire charts from netdata by using the `chart` label. If you’d like, you can combine the “chart” and “instance” labels to create per-instance charts. Let’s give this a try:

netdata_system_cpu_percentage_average{chart="system.cpu", instance="netdata:19999"}
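If you end up generating many of these queries (one per chart, per instance), it can be handy to build the selector strings programmatically. Below is a tiny Python sketch of that idea. The helper name selector is made up, and it assumes label values contain no double quotes or backslashes, since it does no escaping.

```python
def selector(metric: str, **labels: str) -> str:
    """Build a query selector such as metric{chart="system.cpu"}.

    Labels are emitted in sorted order so the output is stable.
    (Hypothetical helper; performs no escaping of label values.)
    """
    if not labels:
        return metric
    matchers = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{metric}{{{matchers}}}"


per_instance = selector(
    "netdata_system_cpu_percentage_average",
    chart="system.cpu",
    instance="netdata:19999",
)
```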

These are the basics of using Prometheus to query netdata. I’d advise everyone at this point to read this page. The key point here is that netdata can export metrics from its internal DB, or it can send metrics “as-collected” by specifying the “source=as-collected” url parameter like so: http://localhost:19999/api/v1/allmetrics?format=prometheus&help=yes&types=yes&source=as-collected

If you choose to use the latter method, you will need to use Prometheus’s set of functions to obtain useful metrics, as you are now dealing with raw counters from the system.

For example, you will have to use the `irate()` function over a counter to get that metric’s rate per second. If your graphing needs are met by using the metrics returned by netdata’s internal database (not specifying any source= url parameter) then use that. If you find limitations then consider re-writing your queries using the raw data and using Prometheus functions to get the desired chart.
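To build some intuition for what a per-second rate over a raw counter means, here is a rough Python sketch of the arithmetic: take the last two samples and divide the increase by the time between them. This is my own approximation for illustration, not Prometheus’s implementation; in particular, treating a drop in value as a counter reset that restarted from zero is my reading of how resets are handled. The name instant_rate is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def instant_rate(samples: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Per-second rate from the last two (timestamp_seconds, counter_value) samples.

    Returns None when there are fewer than two samples or no time has elapsed.
    """
    if len(samples) < 2:
        return None
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    interval = t_last - t_prev
    if interval <= 0:
        return None
    # A counter should only go up; a lower value is assumed to be a reset to zero.
    delta = v_last - v_prev if v_last >= v_prev else v_last
    return delta / interval


rate = instant_rate([(0.0, 100.0), (10.0, 150.0), (20.0, 250.0)])
```

Only the final two samples matter here: the counter grew by 100 over 10 seconds, so the rate works out to 10 per second.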

Install Grafana and graph to your heart’s content

Finally we make it to Grafana! This is the easiest part, in my opinion. This time we will actually run the official Grafana Docker container as all configuration we need to do is done via the GUI. Let’s run the following command:

docker run -i -p 3000:3000 --network=netdata-tutorial grafana/grafana

This will get Grafana running at “http://localhost:3000/”. Let’s go there and log in using the credentials admin:admin.

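The first thing we want to do is click “Add data source”. Let’s make it look like the following screenshot (the type should be Prometheus, and the URL should point at the Prometheus server we started earlier).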

Graph away

With this completed, let’s graph! Create a new dashboard by clicking on the top left Grafana icon, and create a new graph in that dashboard. Fill in the query like we did above and save.

Bringing it all together

There you have it, a complete systems monitoring stack which is very easy to deploy. From here I would suggest you begin to investigate how Prometheus and a service discovery mechanism — such as Consul — can play together nicely. My current production deployments automatically register netdata services into Consul, and Prometheus automatically begins to scrape them. It’s so functional and awesome. Once achieved you do not have to think about the monitoring system until Prometheus cannot keep up with your scale. Once this happens, there are options presented in the Prometheus documentation for solving this.

Hope this was helpful. If you have more questions, or come across your own cool findings feel free to comment below. Happy monitoring!

Graphing systems metrics with netdata, Prometheus, and Grafana was originally published in Vimeo Engineering Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Android Instant Apps, step-by-step: how Vimeo went about it

Kyle Venn — Wed, 14 Jun 2017 18:34:21 GMT

What are Android Instant Apps?

As an Android user, I’m ecstatic to say that Google has finally made Android Instant Apps (AIA) public to developers. AIA is a new feature built into the Android operating system (available on everything back to Lollipop), allowing users to open individual features of your app without even having it installed. At Vimeo, we currently have an AIA with just one feature: our video player. This means that if someone clicks a Vimeo link, it’ll bring them to a playback experience nearly identical to what they’d enjoy on our full native mobile app, even if they don’t have the Vimeo app installed.

Vimeo was one of a handful of companies with early access to AIA, and we worked with Google to get our video player Instant App released by Google I/O. We learned a lot in the process of getting a 15MB full app down to a 4MB AIA feature. In this post I’ll try to cover some of the lessons we picked up around refactoring a large production app, shrinking APK size, and some additional UX/UI improvements we were able to make on our Instant App.

The Android developer documentation does a great job of covering all the basics of AIA, so start there or with this great post if you’re not familiar with setting up a project or roughly how the whole system works.

How do I refactor my massive app to support AIA features?

Say your current app is two years old, >15MB, has eight large features (and a load of dependencies), *and* you want to support at least one feature as an Instant App — how do you do it?

We ran into this situation back in February. We worked it out by developing a two-phase plan to get our player Instant App out as soon as possible (to meet the I/O deadline), while simultaneously paving the road for future AIA features beyond video playback. Hear about our first pass at AIA in Phase One — or feel free to jump to Phase Two where I talk about the right way to approach AIA implementation (which I also cover in the AIA panel at Google I/O).

Phase one: the quick and dirty approach

Now, as a warning, this first step is just about the opposite of what Google recommends for their Instant Apps — and I also highly advise against it for reasons I’ll discuss. But with only enough resources to put one engineer on the project at a time, we had to get creative with how we’d get our AIA feature finished in time. So right after joining the early access program with Google, we outlined the steps we’d take in our initial phase:

1. Delete everything that isn’t player code
2. Trim down the AIA to 4MB
3. Fix the bugs we created in steps 1 and 2
4. Modify UI to adhere to AIA UX best practices

Delete everything. We branched off of the “Vimeo Android” codebase and started deleting every Java class that wasn’t used by our player code, directly or indirectly. This amounted to a great deal of trial and error, but left us with a simple app that could launch our Vimeo player at a fraction of the full app’s size. Once we deleted all the Java classes, we used Android Studio’s nifty “Refactor -> Remove Unused Resources”, which cleared out all the layouts and drawables that were no longer referenced. This entire process only took about a day and got us from 15MB down to 8MB.

Trim down. Now, we only had about 4MB to go, but I assure you the last few MBs are the hardest. By using the APK Analyzer built into Android Studio, we found that a large portion of those 8MB was coming from our larger third-party dependencies.

APK Analyzer

Within the lib folder (left), you’ll see the different .so files which target individual ABIs. Currently every .so file (for every ABI) will be compiled into your feature APK — though eventually, Instant Apps may support splitting APKs based on different Android architectures. We found that libpano_video_renderer.so was coming from our dependency on the Cardboard VR SDK because it’s a library that uses the NDK, and libimagepipeline.so was coming from Fresco, our image caching library. Between the classes and .so files, Fresco was around 2MB and Cardboard was around 2.5MB.

Since we were working in a branch off of our actual app, we just deleted every reference to the Cardboard SDK. Since it was decoupled from our code, it was very easy to remove. Fresco, on the other hand, was referenced all over (and even in XML), which made it much more difficult. We ended up swapping out all references to Fresco code with Picasso, a much smaller image caching library (<200KB).

This step turned out to be the most enlightening: it helped pinpoint exactly which large dependencies needed to be fixed and it influenced our strategy going forward, which I’ll outline in our Phase Two.

Fix the bugs. We saw some issues arise from the previous steps — mostly around buttons not working anymore due to functionality being removed, such as the cardboard button. But we also saw some nuanced changes, such as the loss of support for rounding the user’s avatar image since that was something we got for free from Fresco.

Instant App-ify the UI. We followed the Android UX guidelines pretty closely, but the short version is this: make it look like your full app, no splash screens, 2–3 implicit install prompts, and at least one explicit install prompt. For the implicit install prompts, we show explanatory dialogs if a person tries to Chromecast or comment on a video (if they’re not logged in). Explicit install prompts, on the other hand, must have the “install” icon, but don’t need to show an explanatory dialog. We only have one explicit install prompt button which mentions that the user can install the app to watch videos offline. We chose this text because it refers to a feature that isn’t available in an instant app, and is a popular feature in our full app. However, other than those few additional dialogs and the install button, the UI is identical to our full application.

Phase two: the right approach

As mentioned above, I don’t recommend the Phase One approach since you’ll end up with two code bases you have to maintain. But, if you absolutely need to get an AIA feature out quickly, then it will likely be your fastest option. If you have the time, however, I recommend untangling your dependencies one feature at a time. This post from a developer at Jet outlines many of the same strategies we’ve used for modularizing our app. I’ll try to briefly cover the pieces we found to be most important that allow you to incrementally get closer to your first AIA feature.

Get an idea of your larger dependencies
Remove, replace, or abstract the dependencies you can
Rely heavily on composition and dependency injection (DI)

Use tools to get an idea of your large dependencies. The APK Analyzer is a great place to start. In addition to seeing .so files, you can see which libraries have the most methods by clicking on classes.dex. Below on the left you can see com.google.android accounting for a large number of methods. We then were able to use a tool like the Dexcount Gradle Plugin, which gave a nice graphical view of our APK. From these two, we realized we could save a good chunk of space (and remove any need for play-services) by removing Chromecast functionality, which wasn’t even supported by AIA at the time.

Left: APK Analyzer, Right: Dexcount Gradle Plugin

To see a visualization of the tree of dependencies (maybe a library is defined as a dependency of several of your other dependencies), you can use the below command with your own app module name.

./gradlew -q dependencies {your_app_module}:dependencies --configuration compile

Note: If you’re using v3.0.0 of the Gradle plugin, you can use implementation instead of compile — or leave the configuration parameter off entirely (-q just hides log messages).

Another way to get a rough idea of the size of some of these dependencies is by entering their package name directly into Methods Count, which will include the size (in KB) and the libraries they depend on.

The last tools I’ll mention are strictly about reducing size with little effort: ProGuard and ReDex, both of which are bytecode (dex) optimizers. ProGuard you’ll get with Android, and ReDex is an open-source library by Facebook.

Removing, replacing, and abstracting dependencies.

So now that you’ve isolated all of your large dependencies, how do you “take care of them”? That answer definitely varies case by case.

Removing. If you see that you’re depending on a library for a small subset of the functionality it offers, it may be better to remove that dependency altogether and write your own implementation. A classic example of this is importing the entire library of Guava for just a few string utilities. ProGuard will help remove unused code, but it won’t stop people from using other parts of the library. If at all possible, rely on very targeted libraries or turn to your own implementation for simple functionality.

Replacing or abstracting. In the case of image caching, we need a solution that will work for just about every conceivable AIA feature of our app, which means we can’t just remove caching. Additionally, we want to use Fresco for the full app (because it has lower level optimizations), but a smaller library for all of our Instant App features. So the solution we’ve chosen is to swap all calls to Fresco out with an interface that mirrors the API that we need. That way the full app can use Fresco, and the Instant App can use a smaller library, like Picasso or our own image caching implementation.
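The shape of that abstraction is easy to show in a few lines. The sketch below uses Python rather than the app’s Java purely to keep it short; every name in it (ImageLoader, HeavyLoader, LightLoader, render_avatar) is invented for illustration and none of it is Vimeo’s actual code. The loaders return strings instead of touching any real image library.

```python
from typing import Protocol


class ImageLoader(Protocol):
    """The narrow interface that feature code depends on."""

    def load(self, url: str) -> str: ...


class HeavyLoader:
    """Stand-in for a Fresco-backed implementation used by the full app."""

    def load(self, url: str) -> str:
        return f"heavy:{url}"


class LightLoader:
    """Stand-in for a smaller library used by Instant App features."""

    def load(self, url: str) -> str:
        return f"light:{url}"


def render_avatar(loader: ImageLoader, url: str) -> str:
    # Feature code only knows about the interface, never the concrete library.
    return loader.load(url)
```

Because render_avatar depends only on the interface, swapping the library becomes a decision made once, where the loader is constructed.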

Rely heavily on composition and dependency injection (DI). We learned DI was crucial during our phase of removing all non-player code. There were multiple sets of functionality related to our player that weren’t required for the instant app, such as the playback of encrypted/downloaded files (since there’s no storage in AIA), Chromecasting, and 360 video and VR playback. So we took a page out of ExoPlayer’s book and relied on composition to inject functionality into our core player. We made it so the core player “presentation” and UI layers of our player architecture didn’t care about how videos were played. Instead, you could inject different “PlayerEngines” into the core player that would describe the functionality the player should use. With this architecture, we could easily omit the engines not necessary for AIA which also meant we could omit the required dependencies (Chromecast SDK, Cardboard SDK, encryption code).

Left: Monolithic player architecture, Right: DI reliant player architecture

Another benefit of using DI is that if you wrap all core functionality in interfaces, you can make the interfaces optional so that your system can live without it. If you were to wrap your crash reporting implementation in a nullable interface, you could choose to omit it for your AIA and rely on the developer console’s crash reporting, but include a different implementation in your full application. The same could be said for any singletons you may use in your code: we have an AuthSingleton, which is referenced heavily to see if a user is logged in. However, there may be AIA features which don’t need to know about auth. If it were wrapped in a nullable interface and injected everywhere it was needed, then it could be easily removed.
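The composition described above might be sketched as follows. This is a Python illustration rather than the Android code from our app, and every name in it (PlayerEngine, CorePlayer, CrashReporter, StreamingEngine, DownloadedFileEngine) is hypothetical, chosen only to show the shape of the idea: the core player receives its engines and an optional crash reporter from the outside, so a build can simply leave some of them out.

```python
from typing import List, Optional, Protocol


class PlayerEngine(Protocol):
    """Hypothetical interface: one way of playing a video."""

    def can_play(self, source: str) -> bool: ...
    def play(self, source: str) -> str: ...


class CrashReporter(Protocol):
    """Hypothetical interface wrapping a crash-reporting SDK."""

    def report(self, message: str) -> None: ...


class StreamingEngine:
    def can_play(self, source: str) -> bool:
        return source.startswith("https://")

    def play(self, source: str) -> str:
        return f"streaming {source}"


class DownloadedFileEngine:
    """Only the full app would inject this; it needs local storage."""

    def can_play(self, source: str) -> bool:
        return source.startswith("file://")

    def play(self, source: str) -> str:
        return f"playing local file {source}"


class CorePlayer:
    """Knows nothing about how videos are played; engines are injected."""

    def __init__(self, engines: List[PlayerEngine],
                 crash_reporter: Optional[CrashReporter] = None) -> None:
        self._engines = engines
        self._crash_reporter = crash_reporter  # optional dependency

    def play(self, source: str) -> Optional[str]:
        for engine in self._engines:
            if engine.can_play(source):
                return engine.play(source)
        # No injected engine handles this source; report only if a reporter exists.
        if self._crash_reporter is not None:
            self._crash_reporter.report(f"no engine for {source}")
        return None


# Full app: every engine. Instant app: streaming only, no crash reporter.
full_app_player = CorePlayer([StreamingEngine(), DownloadedFileEngine()])
instant_app_player = CorePlayer([StreamingEngine()])
```

Because the instant-app player never references DownloadedFileEngine, the module that defines it (and whatever SDKs it pulls in) would not need to ship with that build.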

Putting it all together

Rome wasn’t built in a day, and the Romans certainly would have taken time to build Instant Apps with the TLC they deserve if they had the right tech. If there’s no rush to get an Instant App feature out, take your time to do it the right way. Using some techniques outlined above, you can incrementally get your APK size down and untangle your dependencies. Start with one large dependency at a time, remove your reliance on those libraries directly, and replace them with interfaces. Once all your code relies more on composition/DI, you can start breaking it into the different library modules and untangle your dependency tree even further. Lastly, for all new features going forward you should build them in their own module. This makes it harder to add unnecessary dependencies and forces you to keep your features lean. If that new feature will rely on some functionality in the app, pull that functionality into its own module. Oh, and keep a close eye on all your build.gradle files. Good luck!

Interested in flexing your engineering chops at Vimeo? Join our team!

Android Instant Apps, step-by-step: how Vimeo went about it was originally published in Vimeo Engineering Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Vimeo is adopting tus!

Peixian Wang — Wed, 10 May 2017 19:00:35 GMT

Big news, friends: today we’re thrilled to announce that Vimeo will be providing a new way for API developers to upload videos, making it speedier and easier than ever before! This new service will use tus, an open and resumable upload protocol.

What is tus?

tus is an open protocol for uploading files over HTTP. Using tus enables resumable uploads, meaning a user does not need to complete their file upload all in one session. And tus is an open source project with an active community: many client and server implementations exist in multiple languages, and there is a Slack channel dedicated to providing user support. Stay up to date with the latest tus changes by visiting their GitHub project!
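To make the resumable part concrete, here is a minimal sketch of the resume logic in Python. It models the server in memory instead of making HTTP calls, so FakeTusServer and upload are hypothetical names, not part of any tus library. In the tus 1.0 protocol, the offset lookup corresponds to a HEAD request that returns an Upload-Offset header, and each chunk corresponds to a PATCH request sent with that offset.

```python
class FakeTusServer:
    """In-memory stand-in for a tus server holding a single upload."""

    def __init__(self, upload_length: int) -> None:
        self.upload_length = upload_length
        self.data = bytearray()

    def head(self) -> int:
        # Models HEAD: report how many bytes the server already has.
        return len(self.data)

    def patch(self, offset: int, chunk: bytes) -> int:
        # Models PATCH: the client's offset must match the server's.
        if offset != len(self.data):
            raise ValueError("offset mismatch (a real server answers 409 Conflict)")
        self.data.extend(chunk)
        return len(self.data)


def upload(server, payload, chunk_size, fail_after=None):
    """Send payload in chunks, starting from whatever offset the server reports."""
    offset = server.head()
    sent = 0
    while offset < len(payload):
        if fail_after is not None and sent >= fail_after:
            raise ConnectionError("simulated network drop")
        chunk = payload[offset:offset + chunk_size]
        offset = server.patch(offset, chunk)
        sent += 1
    return offset


payload = b"0123456789" * 10            # 100 bytes
server = FakeTusServer(len(payload))

try:
    upload(server, payload, chunk_size=30, fail_after=2)  # drops after 2 chunks
except ConnectionError:
    pass

bytes_before_resume = server.head()      # bytes that survived the drop
upload(server, payload, chunk_size=30)   # a new session resumes from that offset
```

The second call to upload never resends the bytes the server already holds; it asks for the current offset first and continues from there.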

Why is it awesome?

We decided to use tus in our upload stack because the tus protocol standardizes the process of uploading files in a concise and open manner. This standardization will allow API developers to focus more on their application-specific code, and less on the upload process itself.

What are some of the amazing ways this plays out? Developers will be able to use any of tus’s many client implementations in whatever language they develop their application in. Using tus will also enable API developers to test their applications locally. Since Vimeo will be utilizing the same open source implementation of the tus server, API developers can run the server themselves and test their applications against it. This will make troubleshooting potential bugs in application code possible for developers without them having to hit Vimeo’s API.

This is a natural move for us, as Vimeo has always aimed to be an advocate and contributor to the open source community. As such, several brilliant Vimeans jumped in to help bring this update into reality for us — AND we’ve contributed two new features to the tus server implementation as well. We added support for Google Cloud Storage as a datastore and also enabled HTTP hooks for upload events. We believe that using tus will improve our members’ upload experience and provide a standard for uploading files across any platform.

When is it all happening?

We plan on making our tus upload stack available through our API in late 2017. Our current API upload process will be deprecated but not turned off. We highly recommend switching over to tus when it is available! We’ll let you know as soon as it’s ready, so you can start diving in and enjoying even more seamless uploading.


How a hackathon idea turned into an official Vimeo app

Kevin Zetterstrom — Mon, 24 Apr 2017 14:49:33 GMT

One of the great perks of working at Vimeo is that we’re given time to work on something not yet associated with a product roadmap. We call this the Vimeo Jam, and it can include research, working on a feature we wish our apps had, or even creating a new application. We have a three-day long jam session in the Spring and Fall, as well as a jam day every other week.

During the Spring 2016 Jam session, I began work on an Android app called Day Tripper. The purpose of this app is to automatically download videos while a user is connected to wifi, so that they can play videos even if they have a weak internet connection. The user sets a trip duration, chooses their interests, and magically has videos (none of which they have previously seen) waiting for them to watch each time they open the app.

At the conclusion of the Spring 2016 Jam session, I had a working prototype. At the time, the only other Android application at Vimeo was the Vimeo flagship mobile app. To get the prototype up and running, I had to leverage code from the flagship app for authentication, the download system, and the video player, which unfortunately meant a lot of copy-paste. While this was OK for an initial prototype, it would have to change if the app was to ever make it into the hands of consumers.

Being a commuter, I wanted to see this app become a real thing, if for no other reason than to enjoy it each day. Being an engineer, I wanted to make sure the application had a clean architecture and was free of bugs. So, I set off on a mission to separate the components needed by both Day Tripper and Vimeo into reusable, standalone modules. During each Jam day, I spent time removing dependencies and re-architecting components so that they could exist in multiple apps. Due to the way code was added as the Vimeo app grew, some of this proved challenging.

Ultimately, though, it helped the Android team rethink features and how we approach building core components. Throughout the course of the year, we created a separate layer called vimeo-kit-android that contains code that’s generic enough to be shared across various Vimeo-branded applications. Within that layer, we created many different modules, each with a specific purpose. For instance, there’s a download module — applications that need download functionality can include this module, while future applications that don’t need this could just omit it.

With the modules, we have several benefits. First, we know that code being shipped with Day Tripper has been battle tested by the millions of people who use the Vimeo flagship app. If we find a bug in code shared by both apps, we know that we can fix it in one spot. This drastically reduces maintenance costs and improves code reliability. Next, we now have the ability to prototype quickly by leveraging pre-built components. Whether it’s a Jam project or an officially sanctioned app by another Vimeo team, we can quickly spin up apps that include code everyone is familiar with, while also bringing design unity to core components, like authentication screens. Finally, it helps us think differently about how we architect features, which leads to better code design and testability.

One year later, not only am I happy that our codebase is in a better place because of a Jam project, but I’m excited to announce that Day Tripper is available for download! Join the open Beta today and give us your thoughts!


Automated type inference for dynamically typed programs

Matt Brown — Wed, 12 Apr 2017 18:54:17 GMT

Recently, we released a tool called Psalm that’s designed to check PHP code for potential errors. We use it at Vimeo as part of our build process, and it helps us stop bugs from making it into production code.

A number of similar tools exist for dynamically typed languages:

Hack (created by Facebook) has an OCaml-based type-checker that does an excellent job of analysing that language’s PHP-like syntax.
TypeScript (Microsoft) and Flow (also Facebook) are great for checking a Javascript-like syntax.
For PHP code: Phan (from Etsy) and PhpStan.

Psalm is designed to have a better understanding of the codebase than existing PHP static analysis tools, and most of the work on Psalm has involved figuring out how to safely infer types in a large and varied codebase. I hope this brief introduction might shed some light on how that type inference works, and thus how Psalm interprets a given chunk of code.

Basics

To get a full picture of the program state in a block of code, we need to keep track of two separate but related concepts:

type assignments, e.g. $a = 1
type assertions, e.g. if ($a) {}

Type Assignments

$meaning_of_life = 42;

In the above assignment we know, regardless of any other factors, that $meaning_of_life is assigned the type int.

We can make this a little more abstract, and say that a set variable 𝔸 contains all the type assignments that we know about. In the above example 𝔸 = { ($meaning_of_life: int) }.

Each new expression E (as opposed to a statement, which can contain multiple expressions) potentially gives us new information about type assignments. More generally, we can create a new set of assignments 𝔸ⁿ from the previously known assignments 𝔸ⁿ⁻¹, with the new information winning whenever there’s a conflict:

𝔸ⁿ = merge(getAssignments(E(n), 𝔸ⁿ⁻¹), 𝔸ⁿ⁻¹)

Here we also must define two functions: getAssignments extracts assignments from an expression, and merge creates a new set of assignments, substituting the new information for the old.

In the context of the block below:

$a = 5;
$a = "five";
$b = $a;

we can calculate how the set of assignments changes line by line:

𝔸⁰ = {}

𝔸¹ = merge(getAssignments($a = 5, {}), {}) 
   = merge({ ($a: int) }, {}) 
   = { ($a: int) }

𝔸² = merge(
         getAssignments(
             $a = "five",
             { ($a: int) }
         ),
         { ($a: int) }
     ) 
   = merge({ ($a: string) }, { ($a: int) }) 
   = { ($a: string) }

𝔸³ = merge(
         getAssignments(
             $b = $a,
             { ($a: string) }
         ),
         { ($a: string) }
     )
   = merge({ ($b: string) }, { ($a: string) })
   = { ($a: string), ($b: string) }
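The line-by-line calculation above can be replayed with a small toy model. This is a Python sketch, not Psalm’s PHP implementation: the tuple encoding of expressions, the PHP_NAMES table, and the function bodies are assumptions made for illustration, and only the names getAssignments and merge come from the text.

```python
# Map Python literal types to the PHP type names used in the text.
PHP_NAMES = {int: "int", str: "string"}


def get_assignments(expr, known):
    """Return the assignments introduced by one (target, value) expression.

    value is either a literal, or a ("var", name) pair that reads another variable.
    """
    target, value = expr
    if isinstance(value, tuple) and value[0] == "var":
        return {target: known[value[1]]}   # copy the type of the source variable
    return {target: PHP_NAMES[type(value)]}


def merge(new, old):
    """New information replaces old information for the same variable."""
    return {**old, **new}


program = [
    ("$a", 5),               # $a = 5;
    ("$a", "five"),          # $a = "five";
    ("$b", ("var", "$a")),   # $b = $a;
]

assignments = {}             # the empty starting set
history = [assignments]
for expr in program:
    assignments = merge(get_assignments(expr, assignments), assignments)
    history.append(assignments)
```

After the loop, history holds one snapshot per step, so the last entry corresponds to the final set in the derivation.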

Type Assertions

Conditional expressions often contain information about types that we can use. For example, given the function:

function foo($f) {
   if (is_int($f)) {
       // some code
   } else {
       // some other code
   }
}

we can easily see that $f is an integer inside the if statement. But this is not just a type assignment (of ($f: int)), because we know something extra — that $f can never be an int in the else block.

Instead, we need to keep track of all currently valid type assertions, 𝕋 (stored in Conjunctive Normal Form for easy manipulation).

To see how assignments and assertions interact, let’s analyse an if block in more depth.

If blocks

Let’s start with the simplest incarnation of an if statement — one with no elseif/else branches. How can we determine that the string return type of the following function is correct?

function foo(?string $a, ?string $b) : string {
   if (!$a || $b === "hello") {
       $a = "goodbye";
   }
   return $a;
}

Inside the if statement we need to calculate a new set of assignments based on the type assertions contained in the conditional:

𝔸ⁿ = merge(reconcile(𝕋, 𝔸ⁿ⁻¹), 𝔸ⁿ⁻¹)

Here we use a new function, reconcile. reconcile takes a set of assertions and creates a new set of type assignments given those truths.

Given the parameter type for $a, we have:

𝔸⁰ = { ($a: ?string), ($b: ?string) }

The if statement conditional contains the group of possible truths:

𝕋 = ((!$a) ∨ ($b === "hello"))

and inside the if we have a new set of assignments 𝔸¹:

𝔸¹ = merge( 
        reconcile(
           ((!$a) ∨ ($b === "hello")),
           { ($a: ?string), ($b: ?string) }
        ), 
        { ($a: ?string), ($b: ?string) }
     )
   = merge(
        { ($a: ?string), ($b: string) },
        { ($a: ?string), ($b: ?string) },
     )
   = { ($a: ?string), ($b: string) }

And then we encounter the expression $a = "goodbye";. When we see this expression, we can trivially calculate a new set of assignments as above:

𝔸¹ = { ($a: string), ($b: string) }

But we also need to update the group of truths within the scope of the if block because, by asserting that $a is now a string, the truth ((!$a) ∨ ($b === "hello")) is no longer applicable.

𝕋ⁿ = removeTruths(𝕋ⁿ⁻¹, getChangedVariables(𝔸ⁿ, 𝔸ⁿ⁻¹))

𝕋² = removeTruths(
         ((!$a) ∨ ($b === "hello")),
         getChangedVariables(
             { ($a: string), ($b: string) },
             { ($a: ?string), ($b: string) }
         )
     )
   = removeTruths(
         ((!$a) ∨ ($b === "hello")),
         { $a }
     )
   = {}

After the if block

After this specific if block we must update our set of assignments again. Specifically, we must update 𝔸 with the knowledge we gained from the if block. Given:

𝕋⁰, the set of assertions in the if block’s conditional
𝔸⁰ being the set of assignments before the if block
𝔸¹ being the set of assignments inside the if block

We can calculate 𝔸² as

𝔸² = merge(findContradictions(𝕋⁰, 𝔸¹), 𝔸⁰)

where the function findContradictions finds all assignments that contradict the given truths.

𝔸² = merge( 
        findContradictions(
           ((!$a) ∨ ($b === "hello")),
           { ($a: string), ($b: string) }
        ), 
        { ($a: ?string), ($b: ?string) }
     )
   = merge(
        { ($a: string) },
        { ($a: ?string), ($b: ?string) },
     )
   = { ($a: string), ($b: ?string) }

Now we can look at the statement return $a; and ascertain that it does, indeed, return a string.

Handling exit statements

Instead of assigning a non-null value, what if we return a value from inside the if block:

function foo(?string $a, ?string $b) : string {
   if (!$a || $b === "hello") {
       return "goodbye";
   }
 
   return $a;
}

That first return statement means that, after the if, our assignments now depend on the negation of 𝕋, ¬𝕋, and the initial assignments 𝔸⁰ before the if:

𝔸² = merge(reconcile(¬𝕋⁰, 𝔸⁰), 𝔸⁰)

And substituting in those values:

𝔸² = merge(  
        reconcile(
             (($a) ∧ ($b !== "hello")),
             { ($a: ?string), ($b: ?string) }
        ),  
        { ($a: ?string), ($b: ?string) }
     )
   = merge(
         { ($a: string) },
         { ($a: ?string), ($b: ?string) }
   )
   = { ($a: string), ($b: ?string) }

Which allows us, in a different context, to again assert that return $a; returns a string.
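As a rough illustration of the early-return case, the sketch below models a nullable type as a set of atomic types and narrows it using the negated condition. It is written in Python rather than PHP and is deliberately simplified: this reconcile only understands a conjunction of single-variable assertions, and the "truthy" and "not-identical" labels are made up for the example. Psalm’s actual rules are richer.

```python
NULLABLE_STRING = frozenset({"string", "null"})


def reconcile(clauses, assignments):
    """Derive narrowed types from a conjunction of single-variable assertions."""
    narrowed = {}
    for var, assertion in clauses:
        if assertion == "truthy":
            # A truthy value cannot be null, so drop "null" from the type.
            narrowed[var] = assignments[var] - {"null"}
        # Other assertions (such as !== "hello") do not narrow the type here.
    return narrowed


def merge(new, old):
    """New information replaces old information for the same variable."""
    return {**old, **new}


# Assignments before the if block: both parameters are ?string.
a0 = {"$a": NULLABLE_STRING, "$b": NULLABLE_STRING}

# Negation of (!$a || $b === "hello") by De Morgan: $a && $b !== "hello"
negated = [("$a", "truthy"), ("$b", 'not-identical "hello"')]

a2 = merge(reconcile(negated, a0), a0)
```

Here a2 maps $a to the set containing only "string", while $b keeps its nullable type, matching the result of the substitution above.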

Wrapping up

We saw how you can analyse simple if statements if you keep track of variable type assignments and assertions.

More complicated rules are needed to analyse complex if statements (for example where variables are first assigned to within if blocks), and loops (for, foreach & while) present many other challenges.

If you’re interested, feel free to browse Psalm’s source code, and let us know if you have questions or further ideas: no ifs, ands, or strings are off limits!


Orchestrating GCE Instances with Ansible

Louis DeLosSantos — Tue, 17 Jan 2017 15:57:17 GMT

Here at Vimeo, I’ve been trying to find the sweet spot between designing immutable infrastructure and retaining the ability to identify single resources within a homogenous group. I’ve recently adopted Ansible into my toolkit for just that. This post is going to focus on using Ansible as an orchestration tool, leaving the configuration-management aspects for another day. You can quickly install Ansible via ‘pip’.

pip install ansible

Why Ansible?

I’m a big fan of immutable infrastructure and infrastructure as code. These relatively new practices have the potential to ease administration, increase autonomy, and decrease large configuration management code bases. These are all nice features of this paradigm, but I’ve found that dealing with systems under load or new systems development requires us to occasionally fall back on more traditional models of operation.

A good example of falling back to treating our servers as “pets” is when we need to quickly enumerate process metrics from a host within a scaling group. (While this system is in development, we might not have detailed monitoring, and may have to use the command line.) Or when we’re quickly pushing changes to the number of worker processes a master process spawns, and viewing the system load before pushing out to all hosts — canary testing, if you will.

So, let’s get started

I’ll focus on GCE, but the concepts will be the same with AWS. You’ll have to fill in the gaps between the clouds. In order for us to start using Ansible to orchestrate instances in GCE, we’ll need to use Ansible’s dynamic inventory feature. This allows us to use a specified script in order to query GCE. When Ansible queries GCE, it creates an in-memory database of instances within our GCE Project (or VPC if you’re in AWS). We can find this feature in this platform guide.

Now, those instructions might be a little confusing. They tell you to use several files that provide the same information multiple times. I’m gonna give you simplified instructions. You should have an “ansible” service account created in your GCE project, and have this account’s json credentials downloaded to your workstation. If you don’t know how to do that, check out this page.

Once you’ve downloaded the “ansible” service account’s json file to your workstation, we need to do the following:

Install apache-libcloud

The Python package apache-libcloud is a prerequisite. I’d install it into your system’s global Python distribution.

pip install apache-libcloud

pip list #Confirm apache-libcloud is present

Create directory structure

Create a directory structure for Ansible. You can pick any root directory you like (but for my examples, I’ll use ~/git/ansible). Within the Ansible directory, create a folder called “inventory”.

mkdir -p ~/git/ansible/inventory

Obtain necessary files from ansible repo

Now we need to clone the Ansible repository only to obtain two files: ansible/contrib/inventory/{gce.py,gce.ini}. We will copy the gce.py file into ~/git/ansible/inventory and copy the gce.ini file into ~/git/ansible.

git clone https://github.com/ansible/ansible

cp ansible/contrib/inventory/gce.py ~/git/ansible/inventory/

cp ansible/contrib/inventory/gce.ini ~/git/ansible/

Configure gce.ini

After copying the files into the appropriate place, we need to populate gce.ini with the correct information. Fill out the following information:

gce_service_account_email_address = # Service account email found in ansible json file

gce_service_account_pem_file_path = # Path to ansible service account json file

gce_project_id = # Your GCE project name

Export gce.ini environment variable

Now we need to set an environment variable that informs the gce.py script where its ini file is located. I put the following in my .bashrc/.zshrc:

export GCE_INI_PATH=~/git/ansible/gce.ini

Confirm #! points to the right python distribution

One last thing to check is that the #! (hashbang) directive in gce.py is using the correct Python interpreter. The interpreter must belong to the distribution into which your pip command installed apache-libcloud. (If not, it will fail.) The easiest way to check is to run which:

❯ which python

  /usr/bin/python

With this information, update the first line of ~/git/ansible/inventory/gce.py to ‘#!/usr/bin/python’ (for my example).

Make sure everything works

So, after all this is taken care of, we can start to use Ansible to look at our GCE inventory. In order for us to test that the correct pieces are in the correct places, run the following command:

~/git/ansible/inventory/gce.py --list

You should see a large list of machines and information get dumped to your terminal. If you have any python package dependency issues, make sure you pip installed the complaining package, and make sure the interpreter used in the #! portion of the gce.py script is correct.

Hopefully at this point, everything’s working for you. Now Ansible’s able to query our GCE inventory. For those not familiar with Ansible, let’s explain what’s actually happening here.

What’s actually happening here

Ansible works with inventory files. An inventory file tells Ansible where to find the target machine that you’d like to perform some action on. In a typical use case, you’d be editing this inventory file yourself, adding hosts, and grouping them in intelligent ways. However, Ansible can also evaluate a script in order to form its inventory. That’s what we’re doing here. We’ll instruct Ansible to look at our inventory folder, in which will be gce.py. Ansible will execute gce.py and create an in-memory database of instances running within GCE.
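To give a feel for what such a script produces, here is a minimal sketch of a dynamic inventory script in Python. It is not gce.py: the INSTANCES data is hard-coded and hypothetical, where the real script queries the GCE API. The output shape (groups with a "hosts" list, plus a "_meta" section with "hostvars") follows the JSON format Ansible expects from an inventory script called with --list.

```python
import json
import sys

# Hypothetical instance data; gce.py would fetch this from the GCE API instead.
INSTANCES = [
    {"name": "instance-1", "tags": ["example", "one"], "ip": "10.0.0.1"},
    {"name": "instance-2", "tags": ["example", "two"], "ip": "10.0.0.2"},
]


def build_inventory(instances):
    """Group hosts by tag and attach per-host variables under _meta."""
    inventory = {"_meta": {"hostvars": {}}}
    for instance in instances:
        inventory["_meta"]["hostvars"][instance["name"]] = {
            "ansible_host": instance["ip"]
        }
        for tag in instance["tags"]:
            group = inventory.setdefault("tag_" + tag, {"hosts": []})
            group["hosts"].append(instance["name"])
    return inventory


if __name__ == "__main__":
    # Ansible calls an inventory script with --list and reads JSON from stdout.
    if "--list" in sys.argv:
        print(json.dumps(build_inventory(INSTANCES), indent=2))
```

Each top-level key other than "_meta" becomes a group that can be targeted by name, which is why tags show up as groups such as tag_example.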

Target a host, groups of hosts, and an instance group

The workflow that works best for me is to associate groups of servers with tags, then have Ansible look at the inventory, pick the machines with our specified tag, and treat those machines as targets. I’m going to create the following GCE components in order to demonstrate.

instance-1 with tags [ example, one ]
instance-2 with tags [ example, two ]

PS — For now the tag names are arbitrary, and here simply for instruction.

Let’s run the following command and notice the output:

❯ ansible -i ~/git/ansible/inventory tag_one -m ping

  instance-1 | SUCCESS => {
      "changed": false,
      "ping": "pong"
  }

Great! We were able to point Ansible to our inventory and specify a GCE tag as the host pattern. You can probably see where this is going. We can create arbitrary groups of machine targets by placing them under the same tags. Let’s see what happens when we use the ping module against the tag ‘example’:

❯ ansible -i ~/git/ansible/inventory tag_example -m ping

instance-2 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
instance-1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

We just performed an action on two machines within our GCE cluster, simply by referencing their tag. This really shines when we have instance groups (autoscaling groups in AWS). In my GCE project, I currently have an instance group named player-sentry-prod-worker-processor-1-0. I can target every machine within this group with the following command:

❯ ansible -i ~/git/ansible/inventory player-sentry-prod-worker-processor-1-0 -m ping

player-sentry-prod-worker-processor-1-0-7364 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
player-sentry-prod-worker-processor-1-0-iku7 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
player-sentry-prod-worker-processor-1-0-w2l6 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
player-sentry-prod-worker-processor-1-0-8fty | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
player-sentry-prod-worker-processor-1-0-j483 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Run commands across multiple instances

Let’s run an arbitrary command across all these nodes at once.

❯ ansible -i ~/git/ansible/inventory player-sentry-prod-worker-processor-1-0 -a 'date'

player-sentry-prod-worker-processor-1-0-7364 | SUCCESS | rc=0 >>
  Tue Dec 20 06:53:29 UTC 2016

player-sentry-prod-worker-processor-1-0-w2l6 | SUCCESS | rc=0 >>
  Tue Dec 20 06:53:29 UTC 2016

player-sentry-prod-worker-processor-1-0-iku7 | SUCCESS | rc=0 >>
  Tue Dec 20 06:53:29 UTC 2016

player-sentry-prod-worker-processor-1-0-j483 | SUCCESS | rc=0 >>
  Tue Dec 20 06:53:29 UTC 2016

player-sentry-prod-worker-processor-1-0-8fty | SUCCESS | rc=0 >>
  Tue Dec 20 06:53:29 UTC 2016

As you can see, this can be a pretty powerful orchestration tool with very little necessary infrastructure. No servers, no agents, just tooling around SSH and GCE to make our lives easier.

Create runbooks for groups of instances

Now, since Ansible is a full-fledged configuration management solution, we can start to create playbooks that act as runbooks for our servers. An example use case is quickly pushing a new systemd configuration file to a set of machines based on tag (or instance group, if you’d like). It’d look something like this:

❯ ls ~/git/ansible/plays/sentry/sentry-web/

push_config.yml sentry-web.service

❯ cat ~/git/ansible/plays/sentry/sentry-web/push_config.yml

---
- hosts: tag_sentry-web
  tasks:
    - name: Upload systemd service to host
      copy:
        src: sentry-web.service
        dest: /etc/systemd/system/
        owner: root
        group: root
        mode: 0644
      become: true

    - name: Restart systemd service
      systemd:
        state: restarted
        daemon_reload: yes
        name: sentry-web
      become: true

❯ ansible-playbook -i ~/git/ansible/inventory ~/git/ansible/plays/sentry/sentry-web/push_config.yml

I won’t go into too much detail about playbooks, because the documentation is sufficient. But I’ll summarize: our playbook targets our tag sentry-web and defines two tasks to run. The first uploads our new systemd service configuration; the second reloads the systemd daemon and restarts the sentry-web service.

A couple other things as I wrap this up.

1. Take some time and view the output of gce.py --list. Any top-level JSON key you see can be used as a host pattern to target machines. There are some really helpful items there, such as regions and zones. You can always parse this output with a JSON parser (such as jq) to obtain a base list of targetable items. Also, the hostname of every instance can itself be used as a target.

2. Ansible has a nice command-line flag for just listing hosts. I often use this when I need to actually SSH into a machine within an instance group.

❯ ansible -i ~/git/ansible/inventory tag_example --list-hosts

hosts (2):

instance-1

instance-2

3. The --check flag is very helpful. It’s a dry-run operation that shows you what Ansible would do, without making changes.

4. Experiment with some of these patterns:

Complex matching like below is possible:

# Return all instances with tag example but not tag one

❯ ansible -i ~/git/ansible/inventory 'tag_example:!tag_one' --list-hosts

hosts (1):

instance-2

# Return all instances with tag_example but not hostname instance-1

❯ ansible -i ~/git/ansible/inventory 'tag_example:!instance-1' --list-hosts

hosts (1):

instance-2
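One way to think about these patterns is as set operations over group membership. The Python sketch below is a toy model of that idea, not Ansible’s own pattern parser: resolve and the groups dictionary are hypothetical, and only union (terms joined with ":") and exclusion (terms prefixed with "!") are handled. Ansible itself supports more, such as intersections and wildcards.

```python
def resolve(pattern, groups):
    """Resolve a pattern made of ':'-separated terms against group membership.

    Supports plain terms (union) and '!'-prefixed terms (exclusion). A term that
    is not a known group is treated as a single hostname.
    """
    selected, excluded = set(), set()
    for term in pattern.split(":"):
        negate = term.startswith("!")
        name = term[1:] if negate else term
        hosts = set(groups.get(name, [name]))
        if negate:
            excluded |= hosts
        else:
            selected |= hosts
    return sorted(selected - excluded)


# Group membership matching the two demo instances and their tags.
groups = {
    "tag_example": ["instance-1", "instance-2"],
    "tag_one": ["instance-1"],
    "tag_two": ["instance-2"],
}

only_two = resolve("tag_example:!tag_one", groups)
also_only_two = resolve("tag_example:!instance-1", groups)
both = resolve("tag_one:tag_two", groups)
```

Both exclusion patterns leave instance-2 as the only target, whether the excluded term is a tag or a hostname.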

I hope this gets the gears turning on how you can quickly create runbooks, list information, and orchestrate your cloud instances with Ansible. To me, this is the sweet spot between immutable infrastructure principles and traditional methods of caring for systems. The workflow really reminds me of the quick and easy usage of Fabric, but with better tooling around a dynamic cloud inventory. Let me know what you think, and enjoy!


Improving load balancing with a new consistent-hashing algorithm

arodland — Mon, 19 Dec 2016 20:14:52 GMT

We run Vimeo’s dynamic video packager, Skyfire, in the cloud, serving almost a billion DASH and HLS requests per day. That’s a lot! We’re very happy with the way that it performs, but scaling it up to today’s traffic and beyond has been an interesting challenge. Today I’d like to talk about a new algorithmic development, bounded-load consistent hashing, and how it eliminates a bottleneck in our video delivery.

Dynamic packaging

Vimeo’s video files are stored as MP4 files, the same format used for download or “progressive” playback in the browser. DASH and HLS, however, don’t use a single file — they use short segments of video, delivered separately. When a player requests a segment, Skyfire handles the request on the fly. It fetches only the necessary part of the MP4 file, makes a few adjustments for the DASH or HLS format, and sends the result back to the user.

But how does Skyfire know which bytes it needs to fetch when a player requests, say, the 37th segment of a file? It needs to look at an index that knows the location of all of the keyframes and all of the packets in the file. And before it can look at it, it needs to generate it. That takes at least one HTTP request, and a bit of CPU time — or, for very long videos, a lot of CPU time. Since we get many requests for the same video file, it makes sense to cache the index and re-use it later.

When we first started testing Skyfire in the real world, we took a simple approach to caching: we cached the indexes in memory on the cloud server where they were generated, and used consistent hashing in HAProxy to send requests for the same video file to the same cloud server. That way, the cached data could be used again.

Understanding consistent hashing

Before moving forward, let’s dig into consistent hashing, a technique for distributing load among multiple servers. If you’re already familiar with consistent hashing, feel free to go ahead and skip to the next section.

To distribute requests among servers using consistent hashing, HAProxy takes a hash of part of the request (in our case, the part of the URL that contains the video ID), and uses that hash to choose an available backend server. With traditional “modulo hashing”, you simply consider the request hash as a very large number. If you take that number modulo the number of available servers, you get the index of the server to use. It’s simple, and it works well as long as the list of servers is stable. But when servers are added or removed, a problem arises: the majority of requests will hash to a different server than they did before. If you have nine servers and you add a tenth, only one-tenth of requests will (by luck) hash to the same server as they did before.

Then there’s consistent hashing. Consistent hashing uses a more elaborate scheme, where each server is assigned multiple hash values based on its name or ID, and each request is assigned to the server with the “nearest” hash value. The benefit of this added complexity is that when a server is added or removed, most requests will map to the same server that they did before. So if you have nine servers and add a tenth, about 1/10 of requests will have hashes that fall near the newly-added server’s hashes, and the other 9/10 will have the same nearest server that they did before. Much better! So consistent hashing lets us add and remove servers without completely disturbing the set of cached items that each server holds. That’s a very important property when those servers are running in the cloud.
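The difference between the two schemes is easy to measure in a small simulation. The Python sketch below is an illustration only, not HAProxy’s implementation: the Ring class, the choice of MD5, 100 points per server, and the "first point clockwise" rule (a common way to define "nearest") are all assumptions made for the example. It counts how many of 10,000 keys change servers when a tenth server joins nine.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is randomized per process)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def modulo_pick(key: str, servers: list) -> str:
    """Modulo hashing: the hash, taken modulo the server count, is the index."""
    return servers[h(key) % len(servers)]


class Ring:
    """Consistent-hash ring with several points ("virtual nodes") per server."""

    def __init__(self, servers, points_per_server=100):
        self._points = sorted(
            (h(f"{server}#{i}"), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._hashes = [point for point, _ in self._points]

    def pick(self, key: str) -> str:
        # First point clockwise from the key's hash, wrapping around the ring.
        index = bisect.bisect_left(self._hashes, h(key)) % len(self._points)
        return self._points[index][1]


keys = [f"video-{n}" for n in range(10_000)]
nine = [f"server-{n}" for n in range(9)]
ten = nine + ["server-9"]

# Fraction of keys that land on a different server after adding the tenth.
moved_modulo = sum(modulo_pick(k, nine) != modulo_pick(k, ten) for k in keys) / len(keys)

ring9, ring10 = Ring(nine), Ring(ten)
moved = [k for k in keys if ring9.pick(k) != ring10.pick(k)]
moved_ring = len(moved) / len(keys)
```

With modulo hashing the large majority of keys move, while on the ring only a small fraction do, and every key that moves goes to the newly added server. The other nine servers keep the rest of what they had cached.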

Consistent hashing — less-than-ideal for load balancing

However, consistent hashing comes with its own problem: uneven distribution of requests. Because of its mathematical properties, consistent hashing only balances loads about as well as choosing a random server for each request, when the distribution of requests is equal. But if some content is much more popular than others (as usual for the internet), it can be worse than that. Consistent hashing will send all of the requests for that popular content to the same subset of servers, which will have the bad luck of receiving a lot more traffic than the others. This can result in overloaded servers, bad video playback, and unhappy users.

By November 2015, as Vimeo was getting ready to launch Skyfire to more than a hand-picked set of members, we decided that this overloading issue was too serious to be ignored, and changed our approach to caching. Instead of consistent-hashing based balancing, we used a “least connections” load-balancing policy in HAProxy, so that the load would be distributed evenly among servers. And we added a second-level cache using memcached, shared among the servers, so that an index generated by one server could be retrieved by a different one. The shared cache required some additional bandwidth, but the load was balanced much more evenly between servers. This is the way we ran, happily, for the next year.

But wouldn’t it be nice to have both?

Why wasn’t there a way to say “use consistent hashing, but please don’t overload any servers”? As early as August 2015, I had tried to come up with an algorithm based on the power of two random choices that would do just that, but a bit of simulation said that it didn’t work. Too many requests were sent to non-ideal servers to be worthwhile. I was disappointed, but rather than wasting time trying to rescue it, we went ahead with the least-connections and shared cache approach above.

Fast forward to August 2016. I noticed a URL that the inestimable Damian Gryski had tweeted, of an arXiv paper titled Consistent Hashing with Bounded Loads. I read the abstract, and it seemed to be exactly what I wanted: an algorithm that combined consistent hashing with an upper limit on any one server’s load, relative to the average load of the whole pool. I read the paper, and the algorithm was remarkably simple. Indeed, the paper says

“while the idea of consistent hashing with forwarding to meet capacity constraints seems pretty obvious, it appears not to have been considered before.”

The bounded-load algorithm

Here is a simplified sketch of the algorithm. Some details are left out, and if you intend to implement it yourself, you should definitely go to the original paper for information.

First, define a balancing factor, c, which is greater than 1. c controls how much imbalance is allowed between the servers. For example, if c = 1.25, no server should get more than 125% of the average load. In the limit as c increases to ∞, the algorithm becomes equivalent to plain consistent hashing, without balancing; as c decreases to near 1 it becomes more like a least-connection policy and the hash becomes less important. In my experience, values between 1.25 and 2 are good for practical use.

When a request arrives, compute the average load (the number of outstanding requests, m, including the one that just arrived, divided by the number of available servers, n). Multiply the average load by c to get a “target load”, t. In the original paper, capacities are assigned to servers so that each server gets a capacity of either ⌊t⌋ or ⌈t⌉, and the total capacity is ⌈cm⌉. Therefore the maximum capacity of a server is ⌈cm/n⌉, which is greater than c times the average load by less than 1 request. To support giving servers different “weights”, as HAProxy does, the algorithm has to change slightly, but the spirit is the same — no server can exceed its fair share of the load by more than 1 request.
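
As a worked example of the capacity arithmetic, the sketch below splits a total capacity of ⌈cm⌉ across n servers so that each gets either ⌊t⌋ or ⌈t⌉. The function name and the rule for which servers receive the larger value (simply the first ones in the list) are assumptions for illustration; the paper and HAProxy decide that differently.

```python
import math


def server_capacities(m: int, n: int, c: float) -> list:
    """Split a total capacity of ceil(c*m) across n servers as evenly as possible."""
    total = math.ceil(c * m)
    base, extra = divmod(total, n)
    # `extra` servers get one more slot than the rest.
    return [base + 1] * extra + [base] * (n - extra)


# 37 outstanding requests, 10 servers, c = 1.25:
# average load 3.7, target load t = 4.625, total capacity ceil(46.25) = 47.
caps = server_capacities(m=37, n=10, c=1.25)
print(caps, sum(caps), max(caps))
```

Here seven servers get a capacity of 5 and three get 4, for a total of 47, and the largest capacity equals ⌈cm/n⌉ = ⌈4.625⌉ = 5.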

To dispatch a request, compute its hash and the nearest server, as usual. If that server is below its capacity, then assign the request to that server. Otherwise, go to the next server in the hash ring and check its capacity, continuing until you find a server that has capacity remaining. There has to be one, since the highest capacity is above the average load, and it’s impossible for every server’s load to be above average. This guarantees some nice things:

No server is allowed to get overloaded by more than a factor of c plus 1 request.
The distribution of requests is the same as consistent hashing as long as servers aren’t overloaded.
If a server is overloaded, the list of fallback servers chosen will be the same for the same request hash — i.e. the same server will consistently be the “second choice” for a popular piece of content. This is good for caching.
If a server is overloaded, the list of fallback servers will usually be different for different request hashes — i.e. the overloaded server’s spillover load will be distributed among the available servers, instead of all landing on a single server. This depends on each server being assigned multiple points in the consistent hash ring.
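
The dispatch step and its guarantees can be sketched as follows. This is a simplified toy under stated assumptions, not the HAProxy patch: every server is given the same capacity ⌈cm/n⌉ (rather than a mix of ⌊t⌋ and ⌈t⌉), weights are ignored, “nearest” means “next point clockwise”, and the names `BoundedLoadRing`, `assign`, and `release` are hypothetical.

```python
import bisect
import hashlib
import math


def point_hash(label: str) -> int:
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class BoundedLoadRing:
    def __init__(self, servers, c: float = 1.25, points_per_server: int = 100):
        self.c = c
        self.load = {server: 0 for server in servers}  # outstanding requests
        points = sorted(
            (point_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._hashes = [h for h, _ in points]
        self._owners = [s for _, s in points]

    def assign(self, key: str) -> str:
        # m includes the request that just arrived.
        m = sum(self.load.values()) + 1
        capacity = math.ceil(self.c * m / len(self.load))
        start = bisect.bisect_left(self._hashes, point_hash(key))
        # Walk the ring from the key's position until a server has room.
        for step in range(len(self._owners)):
            server = self._owners[(start + step) % len(self._owners)]
            if self.load[server] < capacity:
                self.load[server] += 1
                return server
        raise RuntimeError("unreachable: some server is always below capacity")

    def release(self, server: str) -> None:
        # Call when a request finishes.
        self.load[server] -= 1


# Worst case for plain consistent hashing: every request is for the same video.
ring = BoundedLoadRing([f"server-{i}" for i in range(10)], c=1.25)
for _ in range(1000):
    ring.assign("video-hot")  # never released, to model long-lived requests
print(sorted(ring.load.values(), reverse=True))
```

With plain consistent hashing, all 1000 requests would land on one server. Here no server ends up above ⌈1.25 × 1000 / 10⌉ = 125, and the order in which fallback servers fill up is fixed by the key's position on the ring.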

Real-world results

After testing the algorithm in the simulator and getting more positive results than my simpler algorithm, I started figuring out how to hack it into HAProxy. Adding code to HAProxy wasn’t too bad. The code is pretty clean and well-organized, and after a few days of work I had something that worked well enough that I could replay some traffic through it and see the algorithm in action. And it worked! Mathematical proofs and simulations are nice, but it’s hard to truly believe until you see real traffic hit real servers.

Armed with that success, in September I sent a proof-of-concept patch to HAProxy. The HAProxy maintainer, Willy Tarreau, was a real pleasure to work with. He recognized the value of the algorithm, and didn’t tell me how terrible my patch was. He did a thorough review and provided some very valuable feedback. It took a little while to work in those suggestions and get things up to snuff, but after a few weeks I had a polished version ready to send to the list. A few more minor tweaks and it was accepted in time for HAProxy 1.7.0-dev5, released on October 26. On November 25, HAProxy 1.7.0 was designated as a stable release, so bounded-load consistent hashing is now generally available.

But what I’m sure you want to know is, what did we actually gain from all of this?

Here’s a graph of the cache behavior before and after changing our HAProxy configuration.

The daily variation is caused by autoscaling: during the day, there’s more traffic, so we start more servers to handle it, and fewer requests can be served from local cache. At night, there’s less traffic, so we shut servers down, and local cache performance goes up somewhat. After switching to the bounded-load algorithm, a much bigger fraction of requests hit local cache, regardless of how many servers were running.

Here’s a graph of the shared cache bandwidth over the same time:

Before the change, each memcached server reached as high as 400 or 500 Mbit/s in outgoing bandwidth during peak hours (about 8 Gbit/s in total). Afterwards, there’s less variation, and the servers stay comfortably below 100 Mbit/s each.

What’s not graphed is performance, in terms of response times. Why? Because they stayed exactly the same. The least-connection policy was doing a good job of keeping servers from getting overloaded, and fetching things from memcached is fast enough that it doesn’t have a measurable effect on the response times. But now that a much smaller fraction of the requests rely on the shared cache, and because that fraction doesn’t depend on the number of servers we run, we can look forward to handling a lot more traffic without saturating the memcached servers. In addition, if a memcached server ever goes down, the overall effect it has on Skyfire will be much less.

All in all, I’m very happy to see how a little bit of algorithm work turned a single point of failure into something a whole lot better.

Improving load balancing with a new consistent-hashing algorithm was originally published in Vimeo Engineering Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Boosting app performance with reflectionless (de)serialization

Kyle Venn — Tue, 29 Nov 2016 18:50:43 GMT

By Kevin Zetterstrom and Anthony Restaino

As any native app developer will tell you, API responsiveness and application performance are directly correlated with a positive user experience: when those things are running smoothly, your fans will be happy, and when they’re running less well, your fans notice. Because Vimeo’s app is so dependent on network requests, we investigated ways in which we could improve load time. While there were many areas in the request lifecycle that we examined, we focused on the parsing of JSON responses.

Gson is ON

The Vimeo Android app uses Retrofit for its networking layer and Gson for deserialization. One downside to this approach is that it can be quite slow, as Gson uses reflection to turn JSON into model objects. So to improve that deserialization time, we wanted to try removing that reflection. And we’re not the only ones who have realized how important this is: across industries, people are starting to recognize the cost of slow reflection.

In order to avoid reflection, we created custom Gson TypeAdapters. These allow us to control how data is parsed and provide us with a faster alternative. We have many models in our networking layer, and we chose a few to quantify the effect of reflection-less (de)serialization. The table below shows these models in terms of their data size.

All times are averages over 3 runs on the main thread

We used a high-end tablet for testing, knowing that if we saw gains there, lower-end devices would also benefit. Looking at the table above, we can see that in many cases, primarily when the data is not as large, using custom TypeAdapters was faster than using reflection. The outlier was one of our heaviest models: on a high-end device such as the Nexus 9, reflection was faster than a custom TypeAdapter. But because profiling on a lower-end device showed us that we were still able to cut down on parsing time, and since not everyone has a top-shelf device, we decided it was still in our members’ best interests to use custom TypeAdapters.

STAG, yo

The models we used in the table above contained nested objects, and while the adapter code for them was mostly boilerplate, it amounted to 3K lines of additional code! We weren’t thrilled with the idea of writing all that code by hand, and one of our engineers (Anthony Restaino) had a great idea: why not generate it at compile time? Enter STAG. STAG stands for Speedy Type Adapter Generation, and it does just that.

STAG is an annotation processor. It works by looking for a specific annotation (GsonAdapterKey) on class member variables that you want to (de)serialize using Gson. If it finds this annotation on a member variable, it will create a TypeAdapter for that class, generating code for that member variable and any other annotated fields.

In our networking layer, we have abstract classes that use generics, so we made sure to accommodate them. When you add the annotation to a concrete subclass, STAG will create a TypeAdapter for that class, thereby incorporating any annotated members of its parent, even those that are generic.

If you already use the Gson SerializedName annotation, taking advantage of this library is as simple as replacing that annotation with the GsonAdapterKey one. We were using SerializedName on many of our models, so incorporating STAG was fairly easy. If you’re not using the SerializedName annotation, simply add the annotation to the appropriate models.

Ready, set, deserialize

Want to get started? And improve overall app performance and user sentiment? STAG is open-sourced (https://github.com/vimeo/stag-java) and available now. No one should need to go through the headache of writing custom TypeAdapters, so for anyone using Gson, you can drop in this library and start raking in the benefits from performant, reflection-less (de)serialization.

Boosting app performance with reflectionless (de)serialization was originally published in Vimeo Engineering Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.