Showing posts with label NDK.

15 October 2015

Android Support Library 23.1

Posted by Ian Lake, Developer Advocate

The Android Support Library is a collection of libraries available on a wide array of API levels that help you focus on the unique parts of your app, providing pre-built components, new functionality, and compatibility shims.

With the latest release of the Android Support Library (23.1), you will see improvements across the Support V4, Media Router, RecyclerView, AppCompat, Design, Percent, Custom Tabs, Leanback, and Palette libraries. Let’s take a closer look.

Support V4

The Support V4 library focuses on making it straightforward to support a wide variety of API levels through compatibility shims and backports of specific functionality.

NestedScrollView is a ScrollView that supports nested scrolling back to API 4. You’ll now be able to set an OnScrollChangeListener to receive callbacks when the scroll X or Y positions change.
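
For instance, a minimal sketch of listening for scroll changes might look like this (the view ID and the listener body are illustrative):

NestedScrollView scrollView = (NestedScrollView) findViewById(R.id.scroll_view);
scrollView.setOnScrollChangeListener(new NestedScrollView.OnScrollChangeListener() {
    @Override
    public void onScrollChange(NestedScrollView v, int scrollX, int scrollY,
            int oldScrollX, int oldScrollY) {
        // React to the new scroll position, e.g. elevate the toolbar once scrollY > 0
    }
});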

There are a lot of pieces that make up a fully functioning media playback app, with much of it centered around MediaSessionCompat. A media button receiver, a key part of handling playback controls from hardware or Bluetooth controls, is now formalized in the new MediaButtonReceiver class. This class makes it possible to forward received playback controls to a Service that manages your MediaSessionCompat, reusing the Callback methods already required for API 21+ and centralizing support for all API levels and all media control events in one place. A simplified constructor for MediaSessionCompat is also available, which automatically finds a media button receiver in your manifest for use with MediaSessionCompat.
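
As a rough sketch, creating a session with the simplified constructor and wiring up a Callback might look like this (the tag string and the callback bodies are placeholders, and the snippet assumes it runs inside the Service that owns playback):

MediaSessionCompat mediaSession = new MediaSessionCompat(this, "MyMediaSession");
mediaSession.setFlags(MediaSessionCompat.FLAG_HANDLES_MEDIA_BUTTONS
        | MediaSessionCompat.FLAG_HANDLES_TRANSPORT_CONTROLS);
mediaSession.setCallback(new MediaSessionCompat.Callback() {
    @Override
    public void onPlay() {
        // Start or resume playback
    }

    @Override
    public void onPause() {
        // Pause playback
    }
});
mediaSession.setActive(true);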

Media Router

The Media Router Support Library is the key component for connecting and sending your media playback to remote devices, such as video and audio devices with Google Cast support. It also provides the mechanism, via MediaRouteProvider, to enable any application to create and manage a remote media playback device connection.

In this release, MediaRouteChooserDialog (the dialog for selecting a valid remote device) and MediaRouteControllerDialog (the dialog for controlling ongoing remote playback) have both received a brand new design and additional functionality. You’ll find that the chooser dialog sorts devices by frequency of use and includes a device type icon for easy identification of different devices, while the controller dialog now shows current playback information (including album art).

To feel like a natural part of your app, the content color for both dialogs is now based on the colorPrimary of your alert dialog theme:

<!-- Your app theme set on your Activity -->
<style name="AppTheme" parent="Theme.AppCompat.Light.DarkActionBar">
  <item name="colorPrimary">@color/primary</item>
  <item name="colorPrimaryDark">@color/primaryDark</item>
  <item name="alertDialogTheme">@style/AppTheme.Dialog</item>
</style>
<!-- Theme for the dialog itself -->
<style name="AppTheme.Dialog" parent="Theme.AppCompat.Light.Dialog.Alert">
  <item name="colorPrimary">@color/primary</item>
  <item name="colorPrimaryDark">@color/primaryDark</item>
</style>

RecyclerView

RecyclerView is an extremely powerful and flexible way to show a list, grid, or any view of a large set of data. One advantage over ListView or GridView is the built in support for animations as items are added, removed, or repositioned.

This release significantly changes the animation system for the better. By using the new ItemAnimator’s canReuseUpdatedViewHolder() method, you’ll be able to choose to reuse the existing ViewHolder, enabling item content animation support. The new ItemHolderInfo and associated APIs give the ItemAnimator the flexibility to collect any data it wants at the correct point in the layout lifecycle, passing that information into the animate callbacks.
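
As a hedged illustration, opting in to ViewHolder reuse can be as small as overriding that method in a DefaultItemAnimator subclass (the class name here is made up):

public class ContentChangeItemAnimator extends DefaultItemAnimator {
    @Override
    public boolean canReuseUpdatedViewHolder(RecyclerView.ViewHolder viewHolder) {
        // Returning true asks RecyclerView to hand the same ViewHolder to
        // animateChange() instead of cross-fading between two ViewHolders.
        return true;
    }
}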

Note that this new API is not backward compatible. If you previously implemented an ItemAnimator, you can instead extend SimpleItemAnimator, which provides the old API by wrapping the new API. You’ll also notice that some methods have been entirely removed from ItemAnimator. For example, if you were calling recyclerView.getItemAnimator().setSupportsChangeAnimations(false), this code won’t compile anymore. You can replace it with:

ItemAnimator animator = recyclerView.getItemAnimator();
if (animator instanceof SimpleItemAnimator) {
  ((SimpleItemAnimator) animator).setSupportsChangeAnimations(false);
}

AppCompat

One component of the AppCompat Support Library has been providing a consistent set of widgets across all API levels, including the ability to tint those widgets to match your branding and accent colors.

This release adds tint-aware versions of SeekBar (for tinting the thumb) as well as ImageButton and ImageView (providing backgroundTint support), which will automatically be used when you use the platform versions in your layouts. You’ll also find that SwitchCompat has been updated to match the styling found in Android 6.0 Marshmallow.

Design

The Design Support Library includes a number of components to help implement the latest in the Google design specifications.

TextInputLayout expands its existing functionality of floating hint text and error indicators with new support for character counting.
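
A minimal sketch of enabling the counter (the view ID and the 140 character limit are illustrative):

TextInputLayout inputLayout = (TextInputLayout) findViewById(R.id.text_input_layout);
inputLayout.setCounterEnabled(true);
inputLayout.setCounterMaxLength(140); // shows "n / 140" below the wrapped EditText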

AppBarLayout supports a number of scroll flags which affect how child views react to scrolling (e.g. scrolling off the screen). New to this release is SCROLL_FLAG_SNAP, which ensures that when scrolling ends, the view is not left partially visible; instead, it is scrolled to its nearest edge, making it fully visible or scrolling it completely off the screen. You’ll also find that AppBarLayout now allows users to start scrolling from within the AppBarLayout rather than only from within your scrollable view - this behavior can be controlled by adding a DragCallback.
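
For example, the snap flag can be combined with the scroll flag programmatically (it can just as easily be set in XML via app:layout_scrollFlags); the toolbar variable here is assumed to be a direct child of your AppBarLayout:

AppBarLayout.LayoutParams params = (AppBarLayout.LayoutParams) toolbar.getLayoutParams();
params.setScrollFlags(AppBarLayout.LayoutParams.SCROLL_FLAG_SCROLL
        | AppBarLayout.LayoutParams.SCROLL_FLAG_SNAP);
toolbar.setLayoutParams(params);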

NavigationView provides a convenient way to build a navigation drawer, including the ability to create menu items using a menu XML file. We’ve expanded what’s possible with the ability to set custom views for items via app:actionLayout or using MenuItemCompat.setActionView().
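
As a rough sketch, attaching a custom view such as a SwitchCompat to a drawer item might look like this (the menu item ID is illustrative and the code assumes it runs in an Activity):

MenuItem item = navigationView.getMenu().findItem(R.id.nav_notifications);
MenuItemCompat.setActionView(item, new SwitchCompat(this));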

Percent

The Percent Support Library provides percentage-based dimensions and margins and, new to this release, the ability to set a custom aspect ratio via app:aspectRatio. By setting only a single width or height and using aspectRatio, PercentFrameLayout or PercentRelativeLayout will automatically adjust the other dimension so that the layout uses the set aspect ratio.

Custom Tabs

The Custom Tabs Support Library allows your app to utilize the full features of compatible browsers including using pre-existing cookies while still maintaining a fast load time (via prefetching) and a custom look and actions.

In this release, we’re adding a few additional customizations, such as hiding the URL bar when the page is scrolled down with the new enableUrlBarHiding() method. You’ll also be able to update the action button in an already launched custom tab via your CustomTabsSession with setActionButton() - perfect for providing a visual indication of a state change.
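
A hedged sketch of both calls, assuming the session, icon, updatedIcon and pendingIntent objects already exist:

// Build and launch the custom tab with URL bar hiding and an initial action button
CustomTabsIntent customTabsIntent = new CustomTabsIntent.Builder(session)
        .enableUrlBarHiding()
        .setActionButton(icon, "Save", pendingIntent)
        .build();
customTabsIntent.launchUrl(activity, Uri.parse("https://example.com"));

// Later, swap the action button in the already launched tab to reflect a state change
session.setActionButton(updatedIcon, "Saved");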

Navigation events via CustomTabsCallback#onNavigationEvent() have also been expanded to include the new TAB_SHOWN and TAB_HIDDEN events, giving your app more information on how the user interacts with your web content.

Leanback

The Leanback library makes it easy to build user interfaces on TV devices. This release adds GuidedStepSupportFragment for a support version of GuidedStepFragment as well as improving animations and transitions and allowing GuidedStepFragment to be placed on top of existing content.

You’ll also be able to annotate different types of search completions in SearchFragment, and VerticalGridFragment gains staggered slide transition support.

Palette

Palette, used to extract colors from images, now supports extracting from a specific region of a Bitmap with the new setRegion() method.
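
For example, a minimal sketch that extracts colors from only the top quarter of a bitmap might look like this:

Palette.from(bitmap)
        .setRegion(0, 0, bitmap.getWidth(), bitmap.getHeight() / 4)
        .generate(new Palette.PaletteAsyncListener() {
            @Override
            public void onGenerated(Palette palette) {
                int accent = palette.getVibrantColor(Color.BLACK);
                // Apply the extracted color to your UI
            }
        });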

SDK available now!

There’s no better time to get started with the Android Support Library. You can get started developing today by updating the Android Support Repository from the Android SDK Manager.

For an in depth look at every API change in this release, check out the full API diff.

To learn more about the Android Support Library and the APIs available to you through it, visit the Support Library section on the Android Developer site.

09 September 2015

Play Games Loot Drop for Developers

Posted by Ben Frenkel, Product Manager Google Play Games

Launched last March, Player Analytics is already becoming an important tool for many game developers, helping them to manage their games businesses and optimize in-game player behavior. Today we’re expanding Player Analytics with two new analytics reports that give you better visibility into time-based player activity and custom game events. We’re also introducing a new Player Stats API to let you tune your game experience for specific segments of players across the game lifecycle. Along with those, we’re rolling out a new version of our C++/iOS SDKs and Unity plug-in and giving you better tools to manage repeating Quests.

New useful reports for developers

We are launching two new reports later this week in the Play Games developer console: the Player Time Series Explorer and the Events Viewer. We’ve also made improvements to our player retention report.

Player Time Series Explorer

Ever wondered what your players are doing in the first few minutes of gameplay? What happens just before players spend or churn? The time-series explorer lets you understand what happens in these critical moments for your players.

For example, you carefully built out the first set of experiences in your game, but are surprised by how many players never get through even the first set of challenges. With the Player Time Series Explorer, you can now see which challenges are impeding player progress most, and make targeted improvements to decrease the rate of churn. Learn more.

Customize settings to explore player time series

Select from a list of preset questions

Find out what happens before your players spend for the first time

Select “What happens before first spend” to see what happens just before your players spend for the first time. Time series are aligned by first spend event so you can easily explore what happened just before and after first purchase.

Find out what happens before your players churn

Select “What happens before churn” to see what happens before your players stop playing. In the example below, all the churn events are right aligned to make it easier to compare player time series.

Hovering over events shows you additional details

You can see more details for all event types by holding your cursor over the event’s shape. In this example, you can see that “Player 03” spent $4.99 after earning six achievements. Hovering over the achievement shapes will show you which specific achievements were earned.

Events Viewer

Now you can create your own reports based on your custom Play Games events. You can select multiple events to display and bookmark the report for easy access. Learn more.

Here’s an example showing how a developer can compare the rates at which players are entering contests, winning, and almost winning. This report would identify opportunities to improve the balance of the game’s contest modes. You can then bookmark the settings so you can easily track improvements.

28x28 day retention grid

We added a 28-day-by-28-day retention grid to help you compare retention rates across a larger number of new user cohorts.

Tailor player experiences with the Player Stats API

Stats and reports give you insights into what your players are doing, but wouldn’t it be nice to take action on those insights in your game? That’s what the Player Stats API is all about. The Player Stats API lets you tailor player experiences to specific segments of players across the game lifecycle. Player segments are based on player progression, spend, and engagement.

Here are some examples of what you can do with Player Stats API:

  • For highly engaged players that just aren’t spending, you can show them special bonuses that are aimed at recruiting others to play instead of spending
  • For your most prolific spenders, you can provide occasional free gifts and upgrades
  • For users that haven’t found their stride in your game, you can show them a video that directs them to community features, like clan attacks or alliances, that drive deeper engagement
  • For players that have been away from the game for a while, you can give them a welcome back message that acknowledges impressive accomplishments, and award a badge designed to encourage return play

The Player Stats API is launching in the next few weeks.

C++/iOS SDK and Unity Plug-in updates

iOS support for Play game services just got a lot better. This update includes improved CocoaPods support, which will make it easier to configure Play game services in Xcode. This also means you’ll have a much easier time building for iOS using the Unity plug-in as well.

The latest build of the C++/iOS SDKs is now built on the new Google Sign-In framework, which adds support for authentication via multiple Google apps, including Gmail and YouTube. More importantly, if a player does not have any applicable Google apps installed, the Sign-In framework will bring up a webview within the app for authentication. Opening up a webview inside the app, instead of switching to a separate browser instance, makes for a much better user experience, and addresses a top developer request. For more on the new Google Sign-in library on iOS, check out this video. Learn more.

Improved Quests

Quests are a great way of engaging your players with new goals, and with this update we have made managing Quests easier with the introduction of repeating Quests. You can create Quests that run weekly or monthly by checking the repeating quest box. This will make it easier for you to engage your players with regularly occurring challenges. Repeating Quests will be launching in the next few weeks.

If you have previously integrated Quests, you can easily convert them into repeating quests by following two easy steps.

1. Go to the Quests section of the developer console and open an existing Quest. Click the copy Quest button at the top of the page

2. Scroll down to the Schedule section of the Quest form, check the “Repeating quest” box, select between monthly and weekly quests under “Repeats”, and leave the “Ends:” field set to “Never”. After hitting save, you are done! From then on, the quest will run weekly or monthly until you decide to end it.

Google Play game services (GPGS) docs and SDK downloads

30 July 2015

Get your hands on Android Studio 1.3

Posted by Jamal Eason, Product Manager, Android

Previewed earlier this summer at Google I/O, Android Studio 1.3 is now available on the stable release channel. We appreciated the early feedback from developers on our canary and beta channels, which helped us ship a great product.

Android Studio 1.3 is our biggest feature release for the year so far, which includes a new memory profiler, improved testing support, and full editing and debugging support for C++. Let’s take a closer look.

New Features in Android Studio 1.3

Performance & Testing Tools

  • Android Memory (HPROF) Viewer

    Android Studio now allows you to capture and analyze memory snapshots in the native Android HPROF format.

  • Allocation Tracker

In addition to displaying a table of memory allocations that your app uses, the updated allocation tracker now includes a visual way to view your app’s allocations.

  • APK Tests in Modules

    For more flexibility in app testing, you now have the option to place your code tests in a separate module and use the new test plugin (‘com.android.test’) instead of keeping your tests right next to your app code. This feature does require your app project to use the Gradle Plugin 1.3.

Code and SDK Management

  • App permission annotations

    Android Studio now has inline code annotation support to help you manage the new app permissions model in the M release of Android. Learn more about code annotations.

  • Data Binding Support

New data binding features allow you to create declarative layouts in order to minimize boilerplate code by binding your application logic into your layouts. Learn more about data binding.

  • SDK Auto Update & SDK Manager

Managing Android SDK updates is now a part of Android Studio. By default, Android Studio will now prompt you about new SDK & tool updates. You can still adjust your preferences with the new, integrated Android SDK Manager.

  • C++ Support

As a part of the Android Studio 1.3 stable release, we included an Early Access Preview of the C++ editor & debugger support paired with an experimental build plugin. See the Android C++ Preview page for information on how to get started. Support for more complex projects and build configurations is in development, but let us know your feedback.

Time to Update

An important thing to remember is that an update to Android Studio does not require you to change your Android app projects. With updating, you get the latest features but still have control of which build tools and app dependency versions you want to use for your Android app.

For current developers on Android Studio, you can check for updates from the navigation menu. For new users, you can learn more about Android Studio on the product overview page or download the stable version from the Android Studio download site.

We are excited to launch this set of features in Android Studio and we are hard at work developing the next set of tools to make Android development on Android Studio easier. As always, we welcome feedback on how we can help you. Connect with the Android developer tools team on Google+.

02 July 2015

Game Performance: Data-Oriented Programming

Posted by Shanee Nishry, Game Developer Advocate

To improve game performance, we’d like to highlight a programming paradigm that will help you maximize your CPU potential, make your game more efficient, and code smarter.

Before we get into detail of data-oriented programming, let’s explain the problems it solves and common pitfalls for programmers.

Memory

The first thing a programmer must understand is that memory is slow and that the way you code affects how efficiently it is utilized. Inefficient memory layout and ordering of operations force the CPU to sit idle, waiting for memory before it can proceed with its work.

The easiest way to demonstrate is by using an example. Take this simple code for instance:

char data[1000000]; // One Million bytes
unsigned int sum = 0;

for ( int i = 0; i < 1000000; ++i )
{
  sum += data[ i ];
}

An array of one million bytes is declared and iterated on one byte at a time. Now let's change things a little to illustrate the underlying hardware. The changes are the array size and the loop stride:

char data[16000000]; // Sixteen Million bytes
unsigned int sum = 0;

for ( int i = 0; i < 16000000; i += 16 )
{
  sum += data[ i ];
}

The array is changed to contain sixteen million bytes and we iterate over one million of them, skipping 16 at a time.

A quick look suggests there shouldn't be any effect on performance, as the code is translated to the same number of instructions and runs the same number of times; however, that is not the case. Here is the difference graph. Note that it is on a logarithmic scale; if the scale were linear, the performance difference would be too large to display on any reasonably sized graph!


Graph in logarithmic scale

The simple change of making the loop skip 16 bytes at a time makes the program run 5 times slower!

The average difference in performance is 5x and is consistent when iterating 1,000 bytes up to a million bytes, sometimes increasing up to 7x. This is a serious change in performance.

Note: The benchmark was run on multiple hardware configurations including a desktop with Intel 5930K 3.50GHz CPU, a Macbook Pro Retina laptop with 2.6 GHz Intel i7 CPU and Android Nexus 5 and Nexus 6 devices. The results were pretty consistent.

If you wish to replicate the test, you might have to ensure the memory is out of the cache before running the loop because some compilers will cache the array on declaration. Read below to understand more on how it works.

Explanation

What happens in the example is quite simply explained when you understand how the CPU accesses data. The CPU can’t operate directly on data in RAM; the data must first be copied to the cache, a smaller but extremely fast memory which resides near the CPU chip.

When the program starts, the CPU is set to run an instruction on part of the array but that data is still not in the cache, therefore causing a cache miss and forcing the CPU to wait for the data to be copied into the cache.

For simplicity’s sake, assume a cache line size of 16 bytes for the L1 cache; this means 16 bytes will be copied starting from the requested address for the instruction.

In the first code example, the program next tries to operate on the following byte, which was already copied into the cache following the initial cache miss, so it continues smoothly. This is also true for the next 14 bytes. 16 bytes after the first cache miss, the loop will encounter another cache miss and the CPU will again wait for data to operate on, copying the next 16 bytes into the cache.

In the second code sample, the loop skips 16 bytes at a time but the hardware continues to operate the same way. The cache copies the 16 subsequent bytes each time it encounters a cache miss, which means the loop will trigger a cache miss with each iteration and cause the CPU to wait idle for data each time!

Note: Modern hardware implements cache prefetch algorithms to reduce the number of cache misses incurred, but even with prefetching, more bandwidth is used and performance is lower in our example test.

In reality, cache lines tend to be larger than 16 bytes, and the program would run much slower if it had to wait for data at every iteration. A Krait-400 (found in the Nexus 5) has an L0 data cache of 4 KB with 64 bytes per line.

If you are wondering why cache lines are so small, the main reason is that making fast memory is expensive.

Data-Oriented Design

The way to solve such performance issues is to design your data to fit into the cache and have the program operate on the entire data set contiguously.

This can be done by organizing your game objects inside Structures of Arrays (SoA) instead of Arrays of Structures (AoS) and pre-allocating enough memory to contain the expected data.

For example, a simple physics object in an AoS layout might look like this:

struct PhysicsObject
{
  Vec3 mPosition;
  Vec3 mVelocity;

  float mMass;
  float mDrag;
  Vec3 mCenterOfMass;

  Vec3 mRotation;
  Vec3 mAngularVelocity;

  float mAngularDrag;
};

This is a common way to present an object in C++.

On the other hand, using SoA layout looks more like this:

class PhysicsSystem
{
private:
  size_t mNumObjects;
  std::vector< Vec3 > mPositions;
  std::vector< Vec3 > mVelocities;
  std::vector< float > mMasses;
  std::vector< float > mDrags;

  // ...
};

Let’s compare how a simple function to update object positions by their velocity would operate.

For the AoS layout, a function would look like this:

void UpdatePositions( PhysicsObject* objects, const size_t num_objects, const float delta_time )
{
  for ( int i = 0; i < num_objects; ++i )
  {
    objects[i].mPosition += objects[i].mVelocity * delta_time;
  }
}

The PhysicsObject is loaded into the cache but only the first two variables are used. Since they are 12 bytes each, only 24 bytes of the cache line are utilized per iteration, and every object causes a cache miss on the 64-byte cache line of a Nexus 5.

Now let’s look at the SoA way. This is our iteration code:

void PhysicsSystem::SimulateObjects( const float delta_time )
{
  for ( int i = 0; i < mNumObjects; ++i )
  {
    mPositions[ i ] += mVelocities[i] * delta_time;
  }
}

With this code, we immediately cause 2 cache misses, but we are then able to run smoothly for about 5.3 iterations before causing the next 2 cache misses, resulting in a significant performance increase!

The way data is sent to the hardware matters. Be aware of data-oriented design and look for places it will perform better than object-oriented code.

We have barely scratched the surface. There is still more to data-oriented programming than structuring your objects. For example, the cache is also used for storing instructions and function memory, so optimizing your functions and local variables affects cache misses and hits. We also did not mention the L2 cache and how data-oriented design makes your application easier to multithread.

Make sure to profile your code to find out where you might want to implement data-oriented design. You can use different profilers for different architectures, including the NVIDIA Tegra System Profiler, ARM Streamline Performance Analyzer, Intel and PowerVR PVRMonitor.

If you want to learn more about how to optimize for your cache, read about cache prefetching for various CPU architectures.

25 May 2015

Game Performance: Geometry Instancing

Posted by Shanee Nishry, Games Developer Advocate

Imagine a beautiful virtual forest with countless trees, plants and vegetation, or a stadium with countless people in the crowd cheering. If you are heroic you might like the idea of an epic battle between armies.

Rendering a lot of meshes is desirable for creating a beautiful scene like a forest, a cheering crowd or an army, but doing so is quite costly and reduces the frame rate. Fortunately, this can be done efficiently using a simple technique called Geometry Instancing.

Geometry instancing can be used in 2D games for rendering a large number of sprites, or in 3D for things like particles, characters and environment.

The NDK code sample More Teapots, which demonstrates the content of this article, can be found with the NDK inside the samples folder and in the git repository.

Support and Extensions

Geometry instancing is available from OpenGL ES 3.0 onward and on OpenGL ES 2.0 devices which support the GL_NV_draw_instanced or GL_EXT_draw_instanced extensions. More information on how to use the extensions is shown in the More Teapots demo.

Overview

Submitting draw calls causes OpenGL to queue commands to be sent to the GPU, which carries an expensive overhead that may affect performance. This overhead grows when changing states such as the alpha blending function, the active shader, textures and buffers.

Geometry Instancing is a technique that combines multiple draws of the same mesh into a single draw call, resulting in reduced overhead and potentially increased performance. This works even when different transformations are required.

The algorithm

To explain how Geometry Instancing works let’s quickly overview traditional drawing.

Traditional Drawing

To draw a mesh you’d usually prepare a vertex buffer and an index buffer, bind your shader and buffers, set your uniforms such as a World View Projection matrix and make a draw call.

To draw multiple instances using the same mesh you set new uniform values for the transformations and other data and call draw again. This is repeated for every instance.

Drawing with Geometry Instancing

Geometry Instancing reduces CPU overhead by reducing the sequence described above into a single buffer and draw call.

It works by using an additional buffer which contains custom per-instance data needed by your shader, such as transformations, color, and light data.

The first change to your workflow is to create the additional buffer during the initialization stage.

To put it into code let’s define an example per-instance data that includes a world view projection matrix and a color:

C++

struct PerInstanceData
{
 Mat4x4 WorldViewProj;
 Vector4 Color;
};

You also need to add the structure to your shader. The easiest way is by creating a Uniform Block with an array:

GLSL

#define MAX_INSTANCES 512

layout(std140) uniform PerInstanceData {
    struct
    {
        mat4      uMVP;
        vec4      uColor;
    } Data[ MAX_INSTANCES ];
};

Note that uniform blocks have limited sizes. You can find the maximum number of bytes you can use by querying for GL_MAX_UNIFORM_BLOCK_SIZE using glGetIntegerv.

Example:

GLint max_block_size = 0;
glGetIntegerv( GL_MAX_UNIFORM_BLOCK_SIZE, &max_block_size );

Bind the uniform block on the CPU in your program’s initialization stage:

C++

#define MAX_INSTANCES 512
#define BINDING_POINT 1
GLuint shaderProgram; // Compiled shader program

// Bind Uniform Block
GLuint blockIndex = glGetUniformBlockIndex( shaderProgram, "PerInstanceData" );
glUniformBlockBinding( shaderProgram, blockIndex, BINDING_POINT );

And create a corresponding uniform buffer object:

C++

// Create Instance Buffer
GLuint instanceBuffer;

glGenBuffers( 1, &instanceBuffer );
glBindBuffer( GL_UNIFORM_BUFFER, instanceBuffer );
glBindBufferBase( GL_UNIFORM_BUFFER, BINDING_POINT, instanceBuffer );

// Initialize buffer size
glBufferData( GL_UNIFORM_BUFFER, MAX_INSTANCES * sizeof( PerInstanceData ), NULL, GL_DYNAMIC_DRAW );

The next step is to update the instance data every frame to reflect changes to the visible objects you are going to draw. Once you have your new instance buffer you can draw everything with a single call to glDrawElementsInstanced.

You update the instance buffer using glMapBufferRange. This function locks the buffer and retrieves a pointer to the byte data allowing you to copy your per-instance data. Unlock your buffer using glUnmapBuffer when you are done.

Here is a simple example for updating the instance data:

const int NUM_SCENE_OBJECTS = …; // number of objects visible in your scene which share the same mesh

// Bind the buffer
glBindBuffer( GL_UNIFORM_BUFFER, instanceBuffer );

// Retrieve pointer to map the data
PerInstanceData* pBuffer = (PerInstanceData*) glMapBufferRange( GL_UNIFORM_BUFFER, 0,
                NUM_SCENE_OBJECTS * sizeof( PerInstanceData ),
                GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT );

// Iterate the scene objects
for ( int i = 0; i < NUM_SCENE_OBJECTS; ++i )
{
    pBuffer[ i ].WorldViewProj = ... // Copy World View Projection matrix
    pBuffer[ i ].Color = …               // Copy color
}

glUnmapBuffer( GL_UNIFORM_BUFFER ); // Unmap the buffer

And finally you can draw everything with a single call to glDrawElementsInstanced or glDrawArraysInstanced (depending if you are using an index buffer):

glDrawElementsInstanced( GL_TRIANGLES, NUM_INDICES, GL_UNSIGNED_SHORT, 0,
                NUM_SCENE_OBJECTS );

You are almost done! There is just one more step to do. In your shader you need to make use of the new uniform buffer object for your transformations and colors. In your shader main program:

void main()
{
    …
    gl_Position = Data[ gl_InstanceID ].uMVP * inPosition;
    outColor = Data[ gl_InstanceID ].uColor;
}

You might have noticed the use of gl_InstanceID. This is a predefined OpenGL vertex shader variable that tells your program which instance it is currently drawing. Using this variable, your shader can properly index into the instance data and match the correct transformation and color for every vertex.

That’s it! You are now ready to use Geometry Instancing. If you are drawing the same mesh multiple times in a frame make sure to implement Geometry Instancing in your pipeline! This can greatly reduce overhead and improve performance.

22 April 2015

Game Performance: Explicit Uniform Locations

Posted by Shanee Nishry, Games Developer Advocate

Uniform variables in GLSL are crucial for passing data between the game code on the CPU and the shader program on the graphics card. Unfortunately, up until the availability of OpenGL ES 3.1, using uniforms required some preparation which made the workflow slightly more complicated and wasted time during loading.

Let us examine a simple vertex shader and see how OpenGL ES 3.1 allows us to improve it:

#version 300 es

layout(location = 0) in vec4 vertexPosition;
layout(location = 1) in vec2 vertexUV;

uniform mat4 matWorldViewProjection;

out vec2 outTexCoord;

void main()
{
    outTexCoord = vertexUV;
    gl_Position = matWorldViewProjection * vertexPosition;
}

Note: You might be familiar with this shader from a previous Game Performance article on Layout Qualifiers. Find it here.

We have a single uniform for our world view projection matrix:

uniform mat4 matWorldViewProjection;

The inefficiency appears when you want to assign the uniform value.

You need to use glUniformMatrix4fv or glUniform4f to set the uniform’s value but you also need the handle for the uniform’s location in the program. To get the handle you must call glGetUniformLocation.

GLuint program; // the shader program
float matWorldViewProject[16]; // 4x4 matrix as float array

GLint handle = glGetUniformLocation( program, "matWorldViewProjection" );
glUniformMatrix4fv( handle, 1, false, matWorldViewProject );

That pattern leads to having to call glGetUniformLocation for each uniform in every shader and keeping the handles, or worse, calling glGetUniformLocation every frame.

Warning! Never call glGetUniformLocation every frame! Not only is it bad practice but it is slow and bad for your game’s performance. Always call it during initialization and save it somewhere in your code for use in the render loop.

This process is inefficient; it requires you to do more work and costs precious time and performance.

Also take into consideration that you might have multiple shaders with the same uniforms. It would be much better if your code was deterministic and the shader language allowed you to explicitly set the locations of your uniforms so you don’t need to query and manage access handles. This is now possible with Explicit Uniform Locations.

You can set the location for uniforms directly in the shader’s code. They are declared like this:

layout(location = index) uniform type name;

For our example shader it would be:

layout(location = 0) uniform mat4 matWorldViewProjection;

This means you never need to use glGetUniformLocation again, resulting in simpler code, initialization process and saved CPU cycles.

This is how the example shader looks after the change. The changes are the version declaration and the uniform declaration:

#version 310 es

layout(location = 0) in vec4 vertexPosition;
layout(location = 1) in vec2 vertexUV;

layout(location = 0) uniform mat4 matWorldViewProjection;

out vec2 outTexCoord;

void main()
{
    outTexCoord = vertexUV;
    gl_Position = matWorldViewProjection * vertexPosition;
}

As Explicit Uniform Locations are only supported from OpenGL ES 3.1, we also changed the version declaration to 310 es.

Now all you need to do to set your matWorldViewProjection uniform value is call glUniformMatrix4fv for the handle 0:

const GLint UNIFORM_MAT_WVP = 0; // Uniform location for WorldViewProjection
float matWorldViewProject[16]; // 4x4 matrix as float array

glUniformMatrix4fv( UNIFORM_MAT_WVP, 1, false, matWorldViewProject );

This change is extremely simple and the improvements can be substantial, producing cleaner code, a simpler asset pipeline and improved performance. Be sure to make these changes if you are targeting OpenGL ES 3.1 or creating multiple APKs to support a wide range of devices.

To learn more about Explicit Uniform Locations check out the OpenGL wiki page for it which contains valuable information on different layouts and how arrays are represented.

26 March 2015

Game Performance: Layout Qualifiers

Today, we want to share some best practices on using the OpenGL Shading Language (GLSL) that can optimize the performance of your game and simplify your workflow. Specifically, Layout qualifiers make your code more deterministic and increase performance by reducing your work.


Let’s start with a simple vertex shader and change it as we go along.

This basic vertex shader takes position and texture coordinates, transforms the position and outputs the data to the fragment shader:
attribute vec4 vertexPosition;
attribute vec2 vertexUV;

uniform mat4 matWorldViewProjection;

varying vec2 outTexCoord;

void main()
{
  outTexCoord = vertexUV;
  gl_Position = matWorldViewProjection * vertexPosition;
}

Vertex Attribute Index

To draw a mesh on to the screen, you need to create a vertex buffer and fill it with vertex data, including positions and texture coordinates for this example.

In our sample shader, the vertex data may be laid out like this:
struct Vertex
{
  Vector4 Position;
  Vector2 TexCoords;
};
Therefore, we defined our vertex shader attributes like this:
attribute vec4 vertexPosition;
attribute vec2  vertexUV;
To associate the vertex data with the shader attributes, a call to glGetAttribLocation will get the handle of the named attribute. The attribute format is then detailed with a call to glVertexAttribPointer.
GLint handleVertexPos = glGetAttribLocation( myShaderProgram, "vertexPosition" );
glVertexAttribPointer( handleVertexPos, 4, GL_FLOAT, GL_FALSE, 0, 0 );

GLint handleVertexUV = glGetAttribLocation( myShaderProgram, "vertexUV" );
glVertexAttribPointer( handleVertexUV, 2, GL_FLOAT, GL_FALSE, 0, 0 );
But you may have multiple shaders with the vertexPosition attribute, and calling glGetAttribLocation for every shader is a waste of performance which increases the loading time of your game.

Using layout qualifiers you can change your vertex shader attributes declaration like this:
layout(location = 0) in vec4 vertexPosition;
layout(location = 1) in vec2 vertexUV;
To do so you also need to tell the shader compiler that your shader is aimed at GL ES version 3.0. This is done by adding a version declaration:
#version 300 es
Let’s see how this affects our shader, changes are marked in bold:
#version 300 es

layout(location = 0) in vec4 vertexPosition;
layout(location = 1) in vec2 vertexUV;

uniform mat4 matWorldViewProjection;

out vec2 outTexCoord;

void main()
{
  outTexCoord = vertexUV;
  gl_Position = matWorldViewProjection * vertexPosition;
}
Note that we also changed outTexCoord from varying to out. The varying keyword is deprecated from version 300 es and requires changing for the shader to work.

Note that Vertex Attribute qualifiers and #version 300 es are supported from OpenGL ES 3.0. The desktop equivalent is supported on OpenGL 3.3 and using #version 330.

Now you know your position attribute is always at 0 and your texture coordinates will be at 1, so you can bind your shader format without using glGetAttribLocation:
const int ATTRIB_POS = 0;
const int ATTRIB_UV   = 1;

glVertexAttribPointer( ATTRIB_POS, 4, GL_FLOAT, GL_FALSE, 0, 0 );
glVertexAttribPointer( ATTRIB_UV, 2, GL_FLOAT, GL_FALSE, 0, 0 );
This simple change leads to a cleaner pipeline, simpler code and saved performance during loading time.

To learn more about performance on Android, check out the Android Performance Patterns series.

Posted by Shanee Nishry, Games Developer Advocate

13 January 2015

Efficient Game Textures with Hardware Compression

Posted by Shanee Nishry, Developer Advocate

As you may know, high resolution textures contribute to better graphics and a more impressive game experience. Adaptive Scalable Texture Compression (ASTC) helps solve many of the challenges involved, including reducing memory footprint and loading time, and can even increase performance and battery life.

If you have a lot of textures, you are probably already compressing them. Unfortunately, not all compression algorithms are made equal. PNG, JPG and other common formats are not GPU friendly. Some of the highest-quality algorithms today are proprietary and limited to certain GPUs. Until recently, the only broadly supported GPU accelerated formats were relatively primitive and produced poor results.

With the introduction of ASTC, a new compression technique invented by ARM and standardized by the Khronos group, we expect to see dramatic changes for the better. ASTC promises to be both high quality and broadly supported by future Android devices. But until devices with ASTC support become widely available, it’s important to understand the variety of legacy formats that exist today.

We will examine preferable compression formats which are supported on the GPU to help you reduce .apk size and loading times of your game.

Texture Compression

Popular compressed formats include PNG and JPG, which can’t be decoded directly by the GPU. As a consequence, they need to be decompressed before copying them to the GPU memory. Decompressing the textures takes time and leads to increased loading times.

A better option is to use hardware accelerated formats. These formats are lossy but have the advantage of being designed for the GPU.

This means they do not need to be decompressed before being copied, resulting in decreased loading times for the player, and they may even lead to increased performance due to hardware optimizations.

Hardware Accelerated Formats

Hardware accelerated formats have many benefits. As mentioned before, they help improve loading times and the runtime memory footprint.

Additionally, these formats help improve performance, battery life and reduce heating of the device, requiring less bandwidth while also consuming less energy.

There are two categories of hardware accelerated formats, standard and proprietary. This table shows the standard formats:

ETC1: Supported on all Android devices with OpenGL ES 2.0 and above. Does not support alpha channel.
ETC2: Requires OpenGL ES 3.0 and above.
ASTC: Higher quality than ETC1 and ETC2. Supported with the Android Extension Pack.

As you can see, with higher OpenGL support you gain access to better formats. There are proprietary formats to replace ETC1, delivering higher quality and alpha channel support. These are shown in the following table:

ATC: Available with an Adreno GPU.
PVRTC: Available with a PowerVR GPU.
DXT1: S3 DXT1 texture compression. Supported on devices running the Nvidia Tegra platform.
S3TC: S3 texture compression, nonspecific to DXT variant. Supported on devices running the Nvidia Tegra platform.

That’s a lot of formats, revealing a different problem. How do you choose which format to use?

To best support all devices you need to create multiple APKs using different texture formats. The Google Play Developer Console allows you to add multiple APKs and will deliver the right one to the user based on their device. For more information check this page.

When a device only supports OpenGL ES 2.0, it is recommended to use a proprietary format to get the best results possible; this means making an APK for each GPU family.

On devices with access to OpenGL ES 3.0 you can use ETC2. The GL_COMPRESSED_RGBA8_ETC2_EAC format is an improved version of ETC1 with added alpha support.

The best case is when the device supports the Android Extension Pack. Then you should use the ASTC format which has better quality and is more efficient than the other formats.
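
If you pick formats at run time rather than (or in addition to) shipping per-format APKs, a hedged sketch of the check might look like this; it assumes a current OpenGL ES context and leaves the actual texture loading as placeholders:

String extensions = GLES20.glGetString(GLES20.GL_EXTENSIONS);
boolean supportsAstc = extensions != null
        && extensions.contains("GL_KHR_texture_compression_astc_ldr");
if (supportsAstc) {
    // Load the ASTC versions of your textures
} else {
    // Fall back to ETC2 on OpenGL ES 3.0, or ETC1 / a proprietary format on ES 2.0
}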

Adaptive Scalable Texture Compression (ASTC)

The Android Extension Pack has ASTC as a standard format, removing the need to have different formats for different devices.

In addition to being supported on modern hardware, ASTC also offers improved quality over other GPU formats by having full alpha support and better quality preservation.

ASTC is a block-based texture compression algorithm developed by ARM. It offers multiple block footprints and bitrate options to lower the size of the final texture. The larger the block footprint, the smaller the final file, but possibly with more quality loss.

Note that some images compress better than others. Images with similar neighboring pixels tend to have better quality compared to images with vastly different neighboring pixels.

Let’s examine a texture to better understand ASTC:

This bitmap is 1.1MB uncompressed and 299KB when compressed as PNG.

Compressing the Android jellybean jar texture into ASTC through the Mali GPU Texture Compression Tool yields the following results.

Block Footprint: 4x4 / 6x6 / 8x8
Memory: 262KB / 119KB / 70KB
(The original post also shows, for each block footprint, the image output, a difference map, and a 5x enhanced difference map.)

As you can see, the highest quality (4x4) bitrate for ASTC already gains over PNG in memory size. Unlike PNG, this gain stays even after copying the image to the GPU.

The tradeoff comes in the detail, so it is important to carefully examine textures when compressing them to see how much compression is acceptable.

Conclusion

Using hardware accelerated textures in your games will help you reduce the size of your .apk, runtime memory use, and loading times.

Improve performance on a wider range of devices by uploading multiple apks with different GPU texture formats and declaring the texture type in the AndroidManifest.xml.

If you are aiming for high end devices, make sure to use ASTC which is included in the Android Extension Pack.

19 December 2014

Google Play game services ends year with a bang!

Posted by Benjamin Frenkel, Product Manager, Play Games

In an effort to supercharge our Google Play games services (GPGS) developer tools, we’re introducing the Game services Publishing API, a revamped Unity Plugin, additional enhancements to the C++ SDK, and improved Leaderboard Tamper Protection.

Let’s dig into what’s new for developers:

Publishing API to automate game services configuration

At Google I/O this past June, the pubsite team launched the Google Play Developer Publishing APIs to automate the configuration and publishing of applications to the Play store. Game developers can now also use the Google Play game services Publishing API to automate the configuration and publishing of game services resources, starting with achievements and leaderboards.

For example, if you plan on publishing your game in multiple languages, the game services Publishing API will enable you to pull translation data from spreadsheets, CSVs, or a Content Management System (CMS) and automatically apply those translations to your achievements.

Early adopter Square Enix believes the game services Publishing API will be an indispensable tool to manage global game rollouts:


Achievements are the most used feature in Google Play game services for us. As our games support more languages, achievement management has become increasingly difficult. With the game services Publishing API, we can automate this process, which is really helpful. The game services Publishing API also comes with great samples that we were able to easily customize for our needs.

Keisuke Hata, Manager / Technical Director, SQUARE ENIX Co., Ltd.





To get started today, take a look at the developer documentation here.

Updated Unity plugin and Cross-platform C++ SDK

  • Unity plugin Saved Games support: You can now take advantage of the Saved Games feature directly from the Unity plugin, with more storage and greater discoverability through the Play Games app
  • New Unity plugin architecture: We’ve rewritten the plugin on top of our cross-platform C++ SDK to speed up feature development across SDKs and increase our responsiveness to your feedback
  • Improved Unity generated Xcode project setup: You now have a much more robust way to generate Xcode projects integrated with Google Play Game Services in Unity
  • Updated and improved Unity samples: We’ve updated our sample code to make it easier for first-time developers to integrate Google Play games services
  • C++ SDK support for iPhone 6 Plus: You can now take advantage of the out-of-box games services UI (e.g., for leaderboards and achievements) for larger form factor devices, such as the iPhone 6 Plus

This release also includes some important bug fixes and stability improvements. Check out the release notes for the Unity Plugin and the getting started page for the C++ SDK for more details.

Leaderboard Tamper Protection

Turn on Leaderboard Tamper Protection to automatically hide suspected tampered scores from your leaderboards. To enable tamper protection on an existing leaderboard, go to your leaderboard in the Play developer console and flip the “Leaderboard tamper protection” toggle to on. Tamper protection will be on by default for new leaderboards. Learn more.

To learn more about cleaning up previously submitted suspicious scores refer to the Google Play game services Management APIs documentation or get the web demo console for the Management API directly from github here.

In addition, if you prefer command-line tools, you can use the python-based option here.

19 November 2014

Coding Android TV games is easy as pie

Posted by Alex Ames, Fun Propulsion Labs at Google*

We’re pleased to announce Pie Noon, a simple game created to demonstrate multi-player support on the Nexus Player, an Android TV device. Pie Noon is an open source, cross-platform game written in C++ which supports:

  • Up to 4 players using Bluetooth controllers.
  • Touch controls.
  • Google Play Games Services sign-in and leaderboards.
  • Other Android devices (you can play on your phone or tablet in single-player mode, or against human adversaries using Bluetooth controllers).

Pie Noon serves as a demonstration of how to use the SDL library in Android games as well as Google technologies like Flatbuffers, Mathfu, fplutil, and WebP.

  • Flatbuffers provides efficient serialization of the data loaded at run time for quick loading times. (Examples: schema files and loading compiled Flatbuffers)
  • Mathfu drives the rendering code, particle effects, scene layout, and more, allowing for efficient mathematical operations optimized with SIMD. (Example: particle system)
  • fplutil streamlines the build process for Android, making iteration faster and easier. Our Android build script makes use of it to easily compile and run on Android devices.
  • WebP compresses image assets more efficiently than jpg or png file formats, allowing for smaller APK sizes.

You can download the game in the Play Store and the latest open source release from our GitHub page. We invite you to learn from the code to see how you can implement these libraries and utilities in your own Android games. Take advantage of our discussion list if you have any questions, and don’t forget to throw a few pies while you’re at it!

* Fun Propulsion Labs is a team within Google that's dedicated to advancing gaming on Android and other platforms.