AWS Data Pipeline
Developer Guide (API Version 2012-10-29)

AWS Data Pipeline Limits

To ensure there is capacity for all users, AWS Data Pipeline imposes limits on the resources that you can allocate and the rate at which you can allocate resources.

Account Limits

The following limits apply to a single AWS account. If you require additional capacity, you can use the Amazon Web Services Support Center request form to request a limit increase.

Attribute | Limit | Adjustable
Number of pipelines | 100 | Yes
Number of objects per pipeline | 100 | Yes
Number of active instances per object | 5 | Yes
Number of fields per object | 50 | No
Number of UTF8 bytes per field name or identifier | 256 | No
Number of UTF8 bytes per field | 10,240 | No
Number of UTF8 bytes per object | 15,360 (including field names) | No
Rate of creation of an instance from an object | 1 per 5 minutes | No
Retries of a pipeline activity | 5 per task | No
Minimum delay between retry attempts | 2 minutes | No
Minimum scheduling interval | 15 minutes | No
Maximum number of roll-ups into a single object | 32 | No
Maximum number of EC2 instances per Ec2Resource object | 1 | No

Web Service Call Limits

AWS Data Pipeline limits the rate at which you can call the web service API. These limits also apply to AWS Data Pipeline agents that call the web service API on your behalf, such as the console, CLI, and Task Runner.

The following limits apply to a single AWS account. This means the total usage on the account, including that by IAM users, cannot exceed these limits.

The burst rate lets you save up web service calls during periods of inactivity and expend them all in a short amount of time. For example, CreatePipeline has a regular rate of 1 call per second. If you don't call the service for 30 seconds, you will have 30 calls saved up. You could then call the web service 30 times in a second. Because this is below the burst limit and keeps your average calls at the regular rate limit, your calls are not throttled.
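
This accounting behaves like a token bucket. The following Python sketch is an illustrative model only; the class and parameter names are ours, not part of any AWS SDK, and it is not how the service itself implements throttling.

# Illustrative token-bucket model of the regular-rate/burst accounting above.
# For intuition only; not AWS Data Pipeline's actual throttling implementation.
import time

class TokenBucket:
    def __init__(self, regular_rate_per_sec, burst_limit):
        self.rate = regular_rate_per_sec      # tokens accrued per second of idle time
        self.capacity = burst_limit           # maximum number of saved-up calls
        self.tokens = burst_limit
        self.last = time.monotonic()

    def try_call(self):
        now = time.monotonic()
        # Idle time accrues tokens at the regular rate, capped at the burst limit.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True    # call allowed
        return False       # call would be throttled

# CreatePipeline: 1 call per second regular rate, 100-call burst limit.
bucket = TokenBucket(regular_rate_per_sec=1, burst_limit=100)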

If you exceed the rate limit and the burst limit, your web service call fails and returns a throttling exception. The default implementation of a worker, Task Runner, automatically retries API calls that fail with a throttling exception, using a backoff so that subsequent attempts occur at increasingly longer intervals. If you write your own worker, we recommend that you implement similar retry logic.
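
For a custom worker, that retry logic can look roughly like the following boto3 sketch. The helper name, retry count, and delay values are illustrative assumptions; only the datapipeline client and the ListPipelines call are actual API elements, and ThrottlingException is the error code throttled AWS calls typically surface as.

# Sketch of retrying a throttled call with exponential backoff (illustrative only).
import time
import boto3
from botocore.exceptions import ClientError

client = boto3.client("datapipeline")

def call_with_backoff(fn, max_attempts=5, base_delay=1.0, **kwargs):
    for attempt in range(max_attempts):
        try:
            return fn(**kwargs)
        except ClientError as err:
            code = err.response["Error"]["Code"]
            if code != "ThrottlingException" or attempt == max_attempts - 1:
                raise
            # Wait longer after each throttled attempt: 1s, 2s, 4s, 8s, ...
            time.sleep(base_delay * (2 ** attempt))

# Example: list pipelines, backing off if the account's rate limit is exceeded.
response = call_with_backoff(client.list_pipelines)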

These limits are applied against an individual AWS account.

API | Regular rate limit | Burst limit
ActivatePipeline | 1 call per second | 100 calls
CreatePipeline | 1 call per second | 100 calls
DeletePipeline | 1 call per second | 100 calls
DescribeObjects | 2 calls per second | 100 calls
DescribePipelines | 1 call per second | 100 calls
GetPipelineDefinition | 1 call per second | 100 calls
PollForTask | 2 calls per second | 100 calls
ListPipelines | 1 call per second | 100 calls
PutPipelineDefinition | 1 call per second | 100 calls
QueryObjects | 2 calls per second | 100 calls
ReportTaskProgress | 10 calls per second | 100 calls
SetTaskStatus | 10 calls per second | 100 calls
SetStatus | 1 call per second | 100 calls
ReportTaskRunnerHeartbeat | 1 call per second | 100 calls
ValidatePipelineDefinition | 1 call per second | 100 calls

Scaling Considerations

AWS Data Pipeline scales to accommodate a large number of concurrent tasks, and you can configure it to automatically create the resources necessary to handle large workloads. These automatically created resources are under your control and count against your AWS account resource limits. For example, if you configure AWS Data Pipeline to automatically create a 20-node Amazon EMR cluster to process data and your AWS account has an EC2 instance limit of 20, you may inadvertently exhaust your available backfill resources. As a result, consider these resource restrictions in your design, or increase your account limits accordingly. A sketch of defining such a cluster follows.
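
As a rough illustration of the example above, the following boto3 sketch defines an EmrCluster pipeline object sized at 20 nodes (1 master plus 19 core instances). The pipeline ID, object names, instance types, and termination setting are placeholder values, and a complete definition would also need activities, a schedule, and a Default object.

# Sketch of the 20-node EMR cluster from the example above (placeholder values).
import boto3

client = boto3.client("datapipeline")

emr_cluster = {
    "id": "MyEmrCluster",
    "name": "MyEmrCluster",
    "fields": [
        {"key": "type", "stringValue": "EmrCluster"},
        # 1 master + 19 core nodes = 20 EC2 instances; make sure your
        # account's EC2 instance limit leaves room for this cluster.
        {"key": "coreInstanceCount", "stringValue": "19"},
        {"key": "coreInstanceType", "stringValue": "m5.xlarge"},
        {"key": "masterInstanceType", "stringValue": "m5.xlarge"},
        {"key": "terminateAfter", "stringValue": "2 Hours"},
    ],
}

client.put_pipeline_definition(
    pipelineId="df-EXAMPLE1234567890",    # placeholder pipeline ID
    pipelineObjects=[emr_cluster],        # plus the rest of your pipeline objects
)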

If you require additional capacity, you can use the Amazon Web Services Support Center request form to request a limit increase.