An Overview of App Engine

Services: The building blocks of App Engine

Note: Services were previously called "modules", and services are still declared in app.yaml files as modules, for example: module: service_name.

At the highest level, an App Engine application is made up of one or more services, which can be configured to use different runtimes and to operate with different performance settings. Services let developers factor large applications into logical components that can share App Engine features, such as Memcache, and communicate in a secure fashion.

A deployed service behaves like a microservice. By using multiple services you can deploy your app as a set of microservices.

An app that handles customer requests might include separate services to handle other tasks, such as:

API requests from mobile devices
Internal, admin-like requests
Backend processing such as billing pipelines and data analysis

Versions and instances

Each service consists of source code and a configuration file. The files used by a service represent a version of the service. When you deploy a service, you always deploy a specific version of the service. Having versions for each of your services allows you to roll back with a single click in the Cloud Platform Console, or to use traffic splitting to gradually increase traffic to the newly deployed version of a service.
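As an illustration of how traffic splitting can gradually shift requests to a new version, here is a minimal sketch. It assumes a hypothetical `pick_version` helper and a hypothetical `weights` mapping from version name to traffic fraction; it is not App Engine's actual splitting mechanism, only one simple way to assign each client to a version deterministically.

```python
import hashlib


def pick_version(client_id, weights):
    """Deterministically map a client to a version according to weights.

    `weights` is a hypothetical mapping of version name -> fraction of
    traffic; the fractions are expected to sum to 1.
    """
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a float in [0, 1).
    point = int.from_bytes(digest[:4], "big") / 2 ** 32
    cumulative = 0.0
    for version in sorted(weights):
        cumulative += weights[version]
        if point < cumulative:
            return version
    # Fallback if rounding leaves the weights summing to slightly less than 1.
    return max(weights, key=weights.get)
```

Raising the fraction for the new version step by step (for example from 0.1 to 0.5 to 1.0) moves more clients over, while any given client keeps landing on the same version as long as the weights are unchanged.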

Each service and each version must have a name. Choose a unique name for each service and each version. Don't reuse names between services and versions.
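The naming rule above can be checked before deployment. The following is a minimal sketch, assuming the rule means that no name should serve as both a service name and a version name. The `names_used_as_both` helper and the shape of its input are hypothetical and not part of any App Engine API.

```python
def names_used_as_both(services):
    """Return names that appear both as a service name and as a version name.

    `services` is a hypothetical mapping: service name -> list of version names.
    """
    service_names = set(services)
    version_names = {v for versions in services.values() for v in versions}
    return sorted(service_names & version_names)


# Hypothetical layout: the version "api" collides with the service "api".
layout = {"default": ["v1", "api"], "api": ["v1"]}
print(names_used_as_both(layout))
```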

While running, a particular version has one or more instances. By default, App Engine scales the number of running instances up and down to match the load, providing consistent performance for your app while minimizing idle instances and thereby reducing cost.

The diagram below illustrates the hierarchy of a running App Engine application:

Hierarchy graph of services/versions/instances

Scaling types and instance classes

When you upload a version of a service, the configuration file specifies a scaling type and instance class that apply to every instance of that version. The scaling type controls how instances are created. The instance class determines compute resources (memory size and CPU speed) and pricing. There are three scaling types: manual, basic, and automatic. The available instance classes depend on the scaling type.

Manual Scaling: A service with manual scaling runs continuously, allowing you to perform complex initialization and rely on the state of its memory over time.
Basic Scaling: A service with basic scaling will create an instance when the application receives a request. The instance will be turned down when the app becomes idle. Basic scaling is ideal for work that is intermittent or driven by user activity.
Automatic Scaling: Automatic scaling is based on request rate, response latencies, and other application metrics.

This table compares the performance features of the three scaling types:

Feature	Automatic scaling	Manual scaling	Basic scaling
Deadlines	60-second deadline for HTTP requests, 10-minute deadline for task queue tasks.	Requests can run indefinitely. A manually-scaled instance can choose to handle `/_ah/start` and execute a program or script for many hours without returning an HTTP response code. Task queue tasks can run up to 24 hours.	Same as manual scaling.
Background threads	Not allowed	Allowed	Allowed
Residence	Instances are evicted from memory based on usage patterns.	Instances remain in memory, and state is preserved across requests. When instances are restarted, an `/_ah/stop` request appears in the logs. If there is a registered stop callback method, it has 30 seconds to complete before shutdown occurs.	Instances are evicted based on the `idle_timeout` parameter. If an instance has been idle (that is, it has not received a request) for more than `idle_timeout`, the instance is evicted.
Startup and shutdown	Instances are created on demand to handle requests and automatically turned down when idle.	Instances are sent a start request automatically by App Engine in the form of an empty GET request to `/_ah/start`. An instance that is stopped (with `appcfg stop` or from the Cloud Platform Console) has 30 seconds to finish handling requests before it is forcibly terminated.	Instances are created on demand to handle requests and automatically turned down when idle, based on the `idle_timeout` configuration parameter. As with manual scaling, an instance that is stopped (with `appcfg stop` or from the Cloud Platform Console) has 30 seconds to finish handling requests before it is forcibly terminated.
Instance addressability	Instances are anonymous.	Instance "i" of version "v" of service "s" is addressable at the URL: `http://i.v.s.app_id.appspot.com`. If you have set up a wildcard subdomain mapping for a custom domain, you can also address a service or any of its instances via a URL of the form `http://s.domain.com` or `http://i.s.domain.com`. You can reliably cache state in each instance and retrieve it in subsequent requests.	Same as manual scaling.
Scaling	App Engine scales the number of instances automatically in response to processing volume. This scaling factors in the `automatic_scaling` settings that are provided on a per-version basis in the configuration file.	You configure the number of instances of each version in that service's configuration file. The number of instances usually corresponds to the size of a dataset being held in memory or the desired throughput for offline work. You can adjust the number of instances of a manually-scaled version very quickly, without stopping instances that are currently running, using the Modules API `set_num_instances` function.	A service with basic scaling is configured by setting the maximum number of instances in the `max_instances` parameter of the `basic_scaling` setting. The number of live instances scales with the processing volume.
Free daily usage quota	28 instance-hours	8 instance-hours	8 instance-hours
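The instance addressing scheme described in the table for manual and basic scaling can be sketched as plain string formatting. The helper names `instance_url` and `custom_domain_url` are hypothetical, and the example identifiers are placeholders rather than values from a real app.

```python
def instance_url(app_id, service, version, instance):
    """Build http://i.v.s.app_id.appspot.com for one addressable instance."""
    return "http://{}.{}.{}.{}.appspot.com".format(
        instance, version, service, app_id)


def custom_domain_url(domain, service, instance=None):
    """Build http://s.domain.com, or http://i.s.domain.com for an instance.

    Assumes a wildcard subdomain mapping is already set up for `domain`.
    """
    labels = [service, domain]
    if instance is not None:
        labels.insert(0, str(instance))
    return "http://" + ".".join(labels)


# Placeholder identifiers, for illustration only.
print(instance_url("my-app", "my-service", "v1", 0))
print(custom_domain_url("domain.com", "my-service", 0))
```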

Communication between services

Every service, version, and instance has its own unique URI, for example, v1.my-service.my-app.appspot.com. Incoming user requests are routed to an instance of a particular service/version according to URL addressing conventions and an optional customized dispatch file.
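To make the addressing convention concrete, the following sketch splits a host name of the form shown above into its parts. It handles only the three-label `version.service.app_id` form on `appspot.com`; the `parse_version_host` helper is hypothetical, and other forms (such as instance-level hosts or custom domains) are deliberately rejected rather than guessed at.

```python
def parse_version_host(host):
    """Split 'version.service.app_id.appspot.com' into its components."""
    suffix = ".appspot.com"
    if not host.endswith(suffix):
        raise ValueError("not an appspot.com host: " + host)
    labels = host[:-len(suffix)].split(".")
    if len(labels) != 3:
        raise ValueError("expected version.service.app_id, got: " + host)
    version, service, app_id = labels
    return {"version": version, "service": service, "app_id": app_id}


print(parse_version_host("v1.my-service.my-app.appspot.com"))
```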

You can also pass requests between services and from services to external endpoints using the URL Fetch API.

All the services in an application share the state of the Datastore and Memcache services. They can also collaborate by assigning work to one another through Task Queues. To access these shared services, use the corresponding App Engine APIs. Calls to these APIs are automatically mapped to the application’s namespace.

Limits

The maximum number of services and versions that you can deploy depends on your app's pricing:

Limit	Free app	Paid app
Maximum services per app	5	20
Maximum versions per app	15	120

There is also a limit to the number of instances for each service with basic or manual scaling:

Limit	Free app	Paid app US	Paid app EU
Maximum instances per manual/basic scaling version	20	200	25
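The limits listed above can be encoded as data and checked against a planned layout. This is a minimal sketch: the numbers are copied from the tables on this page, while the tier keys (`"free"`, `"paid"`, `"paid_us"`, `"paid_eu"`) and the helpers `check_app_limits` and `instances_allowed` are hypothetical names chosen for illustration.

```python
# Values taken from the limits tables on this page.
APP_LIMITS = {
    "free": {"services": 5, "versions": 15},
    "paid": {"services": 20, "versions": 120},
}
# Maximum instances per manual/basic scaling version.
INSTANCE_LIMITS = {"free": 20, "paid_us": 200, "paid_eu": 25}


def check_app_limits(tier, num_services, num_versions):
    """Return a list of human-readable problems; empty if within limits."""
    limits = APP_LIMITS[tier]
    problems = []
    if num_services > limits["services"]:
        problems.append("too many services: {} > {}".format(
            num_services, limits["services"]))
    if num_versions > limits["versions"]:
        problems.append("too many versions: {} > {}".format(
            num_versions, limits["versions"]))
    return problems


def instances_allowed(tier, num_instances):
    """Return True if a manual/basic scaling version may run this many instances."""
    return num_instances <= INSTANCE_LIMITS[tier]


print(check_app_limits("free", 6, 10))
print(instances_allowed("paid_eu", 30))
```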

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 3.0 License, and code samples are licensed under the Apache 2.0 License. For details, see our Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated November 29, 2016.
