BigQuery provides diagnostic information about a completed
query's execution plan (or query plan for short). This feature is similar
to the EXPLAIN statement available in some other query engines in that it
provides metadata about each step of the query plan. You can often
use this information to improve query performance.
With BigQuery, you don't have to use an EXPLAIN statement
to see the query plan explanation. In fact, BigQuery doesn't
have an EXPLAIN statement. Instead, the query plan is available through the
BigQuery web UI or the API after a query completes.
If you use the web UI, you can see the query plan for a completed query by clicking the Explanation button, which is adjacent to the Results button.
If you use the BigQuery API, you can see the query plan for
a completed query by examining the jobs.get response body. Query plan
information is automatically attached to query job resources and retained
for 7 days.
The query plan describes a query as a series of stages, with each stage comprising a number of steps that read from data sources, perform a series of transformations on the input, and emit an output to a future stage (or the final result).
Available metadata
The following information is available for each stage in the query plan.
| API JSON Name | Web UI Name | Description |
|---|---|---|
| id | Stage x | Unique (within the plan) integer ID for the stage. |
| recordsRead | Input | Number of rows (top-level records) read by the stage. |
| recordsWritten | Output | Number of rows (top-level records) written by the stage. |
Relative timing
The following ratios are also available for each stage in the query plan.
| API JSON Name | Ratio Numerator * |
|---|---|
| waitRatioAvg | Time the average worker spent waiting to be scheduled. |
| waitRatioMax | Time the slowest worker spent waiting to be scheduled. |
| readRatioAvg | Time the average worker spent reading input data. |
| readRatioMax | Time the slowest worker spent reading input data. |
| computeRatioAvg | Time the average worker spent CPU-bound. |
| computeRatioMax | Time the slowest worker spent CPU-bound. |
| writeRatioAvg | Time the average worker spent writing output data. |
| writeRatioMax | Time the slowest worker spent writing output data. |

In the web UI, each ratio is shown graphically rather than as a number, and the average/max distinction is not labeled.

* All of the ratios share a common denominator that represents the longest time spent by any worker in any segment.
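To make the normalization concrete, here is a small Python sketch that divides every average and max segment time by the single longest segment time. The worker timings are hypothetical; BigQuery reports only the final ratios, not the underlying timings.

```python
def timing_ratios(stages):
    """Normalize per-stage avg/max segment times by the single longest
    time spent by any worker in any segment (the common denominator)."""
    denominator = max(v for stage in stages for v in stage.values())
    return [
        # "waitAvg" becomes "waitRatioAvg", "readMax" becomes "readRatioMax", etc.
        {k[:-3] + "Ratio" + k[-3:]: v / denominator for k, v in stage.items()}
        for stage in stages
    ]

# Hypothetical timings (milliseconds) for a two-stage query.
stages = [
    {"waitAvg": 10, "waitMax": 200, "readAvg": 50, "readMax": 60,
     "computeAvg": 20, "computeMax": 25, "writeAvg": 5, "writeMax": 5},
    {"waitAvg": 5, "waitMax": 5, "readAvg": 10, "readMax": 10,
     "computeAvg": 8, "computeMax": 8, "writeAvg": 2, "writeMax": 2},
]

ratios = timing_ratios(stages)
# The longest segment (stage 1's waitMax, 200 ms) normalizes to 1.0;
# every other ratio is that segment's time divided by 200.
```

Note that a single slow segment in one stage scales down every ratio in every other stage, which is why a stage with modest ratios can still dominate a plan.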
For example, if you run a simple query that counts the number of rows in the Shakespeare public data set that are associated with the play Hamlet:
```sql
SELECT COUNT(*)
FROM [publicdata:samples.shakespeare]
WHERE corpus = 'hamlet'
```
You can click the Explanation button to see the following information about the query plan.
In this example, the average ratios are all equal to the maximum ratios. Also, because Stage 1's waitRatioMax value was the longest of all segments, it became the common denominator for all the ratios shown.
Steps metadata
Each stage comprises a number of steps. The query plan includes information about each step performed during a stage. The following table describes the steps that a stage might include.
| Step | Description |
|---|---|
| READ | A read of one or more columns from an input table or intermediate result. |
| WRITE | A write of one or more columns to an output table or intermediate result. For HASH partitioned outputs from a stage, this also includes the columns used as the partition key. |
| COMPUTE | Catch-all for most expression evaluation and calls to built-in functions. |
| FILTER | Operator implementing the WHERE, OMIT IF and HAVING clauses. |
| SORT | Sort operation, includes the column keys and the sort direction. |
| AGGREGATE | An aggregation, such as GROUP BY. Generally split across multiple stages (once per-worker and a final pass to roll up all of the intermediate results). |
| LIMIT | Operator implementing the LIMIT clause. |
| JOIN | JOIN operation, includes the join type and the columns used. |
| ANALYTIC_FUNCTION | An invocation of an analytic function (also known as “window functions”). |
| USER_DEFINED_FUNCTION | A call to a user-defined function. |
To see the steps metadata in the web UI, click on the triangle for that stage. The following example shows the expanded results for Stage 1, which for this query included steps for READ, AGGREGATE, and WRITE.
Note that the WRITE step wrote the output for stage 1 to an identifier named
__stage1_output. These identifiers make it easier for you to track what data
a stage produced and when that data was consumed by a later stage.
The BigQuery API and query plan explanation
Query plan information is automatically attached to job resources
and retained for 7 days. When you call jobs.get() within that timeframe,
the JSON response object includes the query plan information under
queryPlan. For example, the following is an excerpt of a JSON response object
that includes a query plan explanation.
```json
{
  "kind": "bigquery#job",
  "id": "<ID>",
  "jobReference": {
    ...
  },
  "configuration": {
    ...
  },
  "status": {
    "state": "DONE"
  },
  "statistics": {
    ...
    "query": {
      "totalBytesProcessed": "1125284696816",
      "cacheHit": false,
      "queryPlan": [
        {
          "steps": [
            {
              "description": "READ corpus, word_count FROM bigquerytestdefault:samples.shakespeare"
            },
            {
              "description": "AGGREGATE QUANTILES(word_count) AS f0_ GROUP BY corpus\n"
            },
            {
              "description": "WRITE corpus, f0_ TO TABLE __R0"
            }
          ],
          "recordsRead": "164656",
          "recordsWritten": "42"
        },
        {
          "steps": [
            {
              "description": "READ corpus, f0_ FROM __R0 AS samples.shakespeare\n"
            },
            {
              "description": "AGGREGATE QUANTILES(f0_) AS f0_ GROUP BY corpus\n"
            },
            {
              "description": "WRITE corpus, f0_ TO __root_union0"
            }
          ],
          "recordsRead": "42",
          "recordsWritten": "42"
        }
      ]
    }
  }
}
```
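When you consume this response programmatically, the queryPlan array can be walked like any other JSON structure. The following Python sketch parses an abridged, hand-written stand-in for a jobs.get response body and summarizes the rows read and written by each stage:

```python
import json

# Abridged stand-in for a jobs.get response body; a real response
# contains many more fields.
response = json.loads("""
{
  "statistics": {
    "query": {
      "queryPlan": [
        {"steps": [{"description": "READ corpus FROM samples.shakespeare"}],
         "recordsRead": "164656", "recordsWritten": "42"},
        {"steps": [{"description": "WRITE corpus TO __root_union0"}],
         "recordsRead": "42", "recordsWritten": "42"}
      ]
    }
  }
}
""")

summaries = []
for i, stage in enumerate(response["statistics"]["query"]["queryPlan"], start=1):
    # Record counts are returned as JSON strings, so convert them to integers.
    summaries.append((i, int(stage["recordsRead"]), int(stage["recordsWritten"])))

for stage_id, rows_in, rows_out in summaries:
    print("Stage %d: read %d rows, wrote %d rows" % (stage_id, rows_in, rows_out))
```

A sharp drop in rows between stages, as here, usually marks where a filter or aggregation did its work.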
Interpreting results
Significant difference between average and max time
Your data may have a skewed distribution. Ideally, data is evenly distributed among all of the individual workers, allowing them to finish simultaneously. Unevenly distributed data can cause a query to run much longer than necessary while a few overloaded workers catch up with the rest.
For example, in the web UI, this kind of skew shows up as a large gap between a stage's average and max read times.
BigQuery attempts to handle this sort of skew automatically, but in cases where there is significant skew and a significant amount of data ends up in a single bucket, this may not be possible.
- Common causes
  - JOIN, GROUP BY, or PARTITION keys that include NULL, empty, or default values.
  - Natural skew in your data distribution. For example, GROUP BY over Country (if the majority of your customers are from a single country) or HTTP referer (likely heavily skewed toward a few values).
- Troubleshooting
  - Consider feeding your intermediate results into a TOP COUNT query to see how the most common values are distributed for your keys.
- Solutions
  - If you don't need the skewed values, filter them out as early as possible.
  - If possible, run two different queries: one over just the skewed keys, and another that excludes them and processes everything else.
  - Add additional keys to subdivide the skewed data into smaller portions. Be cautious when recombining the subdivided groups, because not all aggregate operators compose trivially (SUM does, for example, but AVG does not). For example, you might transform `SELECT ... GROUP BY a` into `SELECT ... FROM (SELECT ... GROUP BY a, b) GROUP BY a`.
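To see why recombining needs care, the following Python sketch (with hypothetical partial aggregates) shows that partial SUMs roll up to the true SUM, while averaging per-worker averages gives the wrong AVG unless the row counts are carried through the rollup:

```python
# Hypothetical per-worker partial aggregates for one skewed key.
partials = [
    {"count": 3, "sum": 30.0},   # worker 1 saw values 5, 10, 15
    {"count": 1, "sum": 100.0},  # worker 2 saw the single value 100
]

# SUM composes trivially: the sum of partial sums is the true sum.
total_sum = sum(p["sum"] for p in partials)

# AVG does not: averaging the two partial averages (10 and 100) yields 55.
naive_avg = sum(p["sum"] / p["count"] for p in partials) / len(partials)

# The correct AVG carries the counts through: 130 / 4 = 32.5.
true_avg = total_sum / sum(p["count"] for p in partials)
```

The same caution applies to any two-query or subdivided-key strategy: combine partial sums and counts, never partial averages.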
Majority of time was spent reading from intermediate stages
A prior stage may be producing more data than is expected.
Consider filtering out some records earlier in the query or using a more restrictive JOIN operation.
Majority of time was spent reading from input tables
Ideally, queries will spend the majority of time reading from input tables because it indicates that all other operations were less expensive than the base cost of reading your input data.
However, you may still be able to improve query performance by reading only the data you need. If a table is queried frequently, consider partitioning it into smaller tables. For example, it is common to keep data in per-date tables and to query only a particular date range (the last quarter, the trailing seven days, and so on).
For more information, see table wildcard functions.
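For example, if daily data were kept in date-suffixed tables, a table wildcard function such as TABLE_DATE_RANGE lets the query read only the dates it needs. The dataset and table prefix below are hypothetical:

```sql
-- Reads only the tables covering the trailing seven days, assuming a
-- hypothetical date-sharded family mydataset.events_YYYYMMDD.
SELECT COUNT(*)
FROM TABLE_DATE_RANGE([mydataset.events_],
                      DATE_ADD(CURRENT_TIMESTAMP(), -7, 'DAY'),
                      CURRENT_TIMESTAMP())
```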
Majority of time was spent waiting to be scheduled
You might have more work than can be immediately scheduled.
If your job isn't time-critical, you can simply wait for it to run as resources permit.
Majority of time was spent writing to output
This may be expected, if you are emitting more data than was originally read from inputs.
Consider filtering out some of the data before it is output.
Majority of time was spent writing to intermediate stages
See Majority of time was spent reading from intermediate stages.
Majority of time was spent on CPU-bound tasks
This indicates that your query is spending more time on transformation and processing data than on I/O. This is not unusual for complex queries, but may indicate an area for improvement and cost savings if the CPU usage is high enough to reach a high-compute tier.
- Common causes
- Does the query have many complex expressions being evaluated? User-defined functions and JSON/Regex evaluation can be particularly costly to evaluate.
- Solutions
-
Reduce the amount of data that a complex expression is applied to by filtering as early as possible.
If this query is frequently run, consider precalculating some of the expressions so they don’t have to be reevaluated each time.
Examples
Data skew in stage 1
The GDELT database of events is stored as files of different sizes. This data skew can be seen in the difference between the average and maximum time to read the table in the first stage of a query.
For example, this simple query that counts events by country shows that the maximum read time for stage 1 was several times longer than the average and the maximum compute time was about twice as long as the average.
```sql
SELECT
  ActionGeo_CountryCode,
  COUNT(1) AS count
FROM [gdelt-bq:full.events]
GROUP BY 1
ORDER BY 2 DESC;
```
There isn't much you can do to mitigate data skew in stage 1, because BigQuery table partitioning is not under user control. If data skew occurs in later stages of a query, however, you can often reduce it by joining on or grouping by more keys to partition the data more finely.
Pushing down filters to improve performance
The following query applies a filter after a join operation, which in this case is inefficient because the filter pertains to only one of the join inputs.
```sql
SELECT
  ToHuman.HumanName,
  COUNT(1) AS count
FROM [gdelt-bq:full.events] AS Events
JOIN [gdelt-bq:full.crosswalk_geocountrycodetohuman] AS ToHuman
ON Events.ActionGeo_CountryCode = ToHuman.FIPSCC
WHERE Events.Year > 2010
GROUP BY 1
ORDER BY 2 DESC;
```
In this case, you can improve performance by pushing the filter down the
execution stack, so that the filter is applied before the join operation.
BigQuery doesn't automatically push filters down across joins, as the execution plan for the preceding query reveals: the filter condition Events.Year > 2010 depends on only one of the inputs to the join, but the filtering is done after the inner join.
To improve performance, you can modify the query so that filtering happens before the join so that the filter is applied more selectively. The following example modifies the query by pushing the filter into a subquery before the join operation.
```sql
SELECT
  ToHuman.HumanName,
  COUNT(1) AS count
FROM (SELECT *
      FROM [gdelt-bq:full.events]
      WHERE Year > 2010) AS Events
JOIN [gdelt-bq:full.crosswalk_geocountrycodetohuman] AS ToHuman
ON Events.ActionGeo_CountryCode = ToHuman.FIPSCC
GROUP BY 1
ORDER BY 2 DESC;
```







