This document describes several ways to query data using BigQuery's SQL dialect, including running synchronous queries, asynchronous queries, interactive queries, and batch queries. This document also discusses how to opt out of query caching, return large query results, create a view, and query meta-tables.
Running synchronous queries
A synchronous query waits for the query to complete before returning a response. The response includes the query results in the form of a table, which can be a temporary or permanent table.
If you prefer to submit a query request that returns before the query completes, see Running asynchronous queries.
By default, synchronous queries write results to temporary tables. To run a synchronous query that writes to a temporary table:
Web UI
- Go to the BigQuery web UI.
- Click the Compose query button.
- Enter a valid BigQuery SQL query in the New Query text area.
- Click the Run query button.
This generates a synchronous query that writes the output to a temporary table.
Command-line
The bq command-line tool runs your query synchronously by default. For information on how to run a query, see Run a query.
C#
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Go
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Java
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Node.js
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
PHP
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Python
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Ruby
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Storing synchronous results in a permanent table
To save your query results in a permanent table:
Web UI
- Go to the BigQuery web UI.
- Click the Compose query button.
- Enter a valid BigQuery SQL query in the New Query text area.
- Click the Show Options button.
- Click the Select Table button in the Destination Table section.
- Enter a table ID and click OK.
- Click the Run query button.
This generates a synchronous query that writes the output to a permanent table.
Alternately, if you forget to specify a destination table before running your query, you can copy the temporary table to a permanent table by clicking the Save as Table button in the results window.
Command-line
Use the --destination_table flag to create a permanent table based on the
query results. For example, the following query creates a permanent table
named happyhalloween in the mydataset dataset:
bq query --destination_table=mydataset.happyhalloween "SELECT name,count FROM mydataset.babynames WHERE gender = 'M' ORDER BY count DESC LIMIT 6"
API
To save the query results to a permanent table, run the query asynchronously and specify a permanent table.
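As a sketch of that request, the helper below builds a jobs.insert body with a destinationTable; the project, dataset, and table names are hypothetical placeholders, and only the fields relevant to the destination table are shown:

```python
def make_query_job_body(query, project_id, dataset_id, table_id):
    """Build a jobs.insert request body for an asynchronous query whose
    results are saved to a permanent destination table."""
    return {
        "configuration": {
            "query": {
                "query": query,
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                # Append to the destination table if it already exists.
                "writeDisposition": "WRITE_APPEND",
            }
        }
    }

# Hypothetical names, mirroring the command-line example above.
body = make_query_job_body(
    "SELECT name, count FROM mydataset.babynames",
    "my-project", "mydataset", "happyhalloween")
```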
Storing synchronous results in a view
For information on how to create views, see Creating views.
Running asynchronous queries
An asynchronous query returns a response immediately, generally before the query completes. You then periodically check whether the query completed using a separate API call.
If you prefer that the query wait until the query completes before returning a response, use a synchronous query.
Asynchronous queries are always saved to a table: either a new table, an existing table, or a temporary table. You can choose whether to append or overwrite data in an existing table, and whether to create a new table if none exists by that name.
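The poll-until-done pattern behind asynchronous queries can be sketched as follows. Here `get_job` stands in for a jobs.get call (for example, a thin wrapper around the REST API); it is an assumption of this sketch, not a real client method:

```python
import time

def wait_for_job(get_job, job_id, poll_interval=1.0, timeout=60.0):
    """Poll a job until status.state is "DONE", then return it.

    get_job(job_id) is assumed to return a job resource dict with the
    usual "status" field from the BigQuery REST API.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        job = get_job(job_id)
        if job["status"]["state"] == "DONE":
            # A DONE job may still have failed; surface any error.
            if "errorResult" in job["status"]:
                raise RuntimeError(job["status"]["errorResult"]["message"])
            return job
        time.sleep(poll_interval)
    raise TimeoutError("job %s did not finish in time" % job_id)
```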
By default, asynchronous queries write results to temporary tables. To run an asynchronous query that writes to a temporary table, use the command-line tool or the BigQuery API:
Command-line
Use the --nosync flag to run a query asynchronously. For more information on
how to run an asynchronous query using the bq tool, see Running asynchronous operations.
C#
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Go
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Java
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Node.js
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
PHP
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Python
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Ruby
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Storing asynchronous results in a permanent table
To store the query results from an asynchronous query in a permanent table, include the name of a destination table.
Command-line
Use the --destination_table flag to create a permanent table based on the
query results. For example, the following query creates a permanent table
named happyhalloween in the mydataset dataset:
bq query --nosync --destination_table=mydataset.happyhalloween "SELECT name,count FROM mydataset.names_2013 WHERE gender = 'M' ORDER BY count DESC LIMIT 6"
API
To save the query results to a permanent table, include a value for the jobs#configuration.query.destinationTable property.
Storing asynchronous results in a view
For information on how to create views, see Creating views.
Running parameterized queries
BigQuery supports query parameters to help prevent SQL injection when queries are constructed using user input. This feature is only available with standard SQL syntax.
To specify a named parameter, use the @ character followed by an identifier,
such as @param_name. For example, this query finds all the words in a
specific Shakespeare corpus with counts that are at least the specified value.
SELECT word, word_count
FROM `bigquery-public-data.samples.shakespeare`
WHERE corpus = @corpus
AND word_count >= @min_word_count
ORDER BY word_count DESC;
Alternatively, use the placeholder value ? to specify a positional parameter.
Note that a query can use positional or named parameters but not both.
Command-line
Use --parameter to provide values for parameters in the form
"name:type:value". An empty name produces a positional parameter. The type may
be omitted to assume STRING.
The --parameter flag must be used in conjunction with the flag
--use_legacy_sql=False to specify standard SQL syntax.
bq query --use_legacy_sql=False \
--parameter=corpus::romeoandjuliet \
--parameter=min_word_count:INT64:250 \
'SELECT word, word_count
FROM `bigquery-public-data.samples.shakespeare`
WHERE corpus = @corpus
AND word_count >= @min_word_count
ORDER BY word_count DESC;'
API
To use named parameters, set the jobs#configuration.query.parameterMode to NAMED
(query#parameterMode for synchronous queries). Populate
jobs#configuration.query.queryParameters[] with the list of parameters
(query#queryParameters[] for synchronous queries). Set the name of each
parameter to correspond with the @param_name used in the query. Enable
standard SQL syntax by setting useLegacySql to false.
{
"query": "SELECT word, word_count FROM `bigquery-public-data.samples.shakespeare` WHERE corpus = @corpus AND word_count >= @min_word_count ORDER BY word_count DESC;",
"queryParameters": [
{
"parameterType": {
"type": "STRING"
},
"parameterValue": {
"value": "romeoandjuliet"
},
"name": "corpus"
},
{
"parameterType": {
"type": "INT64"
},
"parameterValue": {
"value": "250"
},
"name": "min_word_count"
}
],
"useLegacySql": false,
"parameterMode": "NAMED"
}
Try it in the Google APIs Explorer.
To use positional parameters, set the jobs#configuration.query.parameterMode to POSITIONAL.
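For example, the earlier Shakespeare query could be rewritten with positional parameters roughly as follows; the body is shown as a Python dict mirroring the JSON payload, and each queryParameters entry omits the name field:

```python
# Sketch of a positional-parameter query body: the SQL uses ? placeholders
# and parameters are matched by position, not name.
positional_body = {
    "query": (
        "SELECT word, word_count "
        "FROM `bigquery-public-data.samples.shakespeare` "
        "WHERE corpus = ? AND word_count >= ? "
        "ORDER BY word_count DESC;"
    ),
    "queryParameters": [
        {"parameterType": {"type": "STRING"},
         "parameterValue": {"value": "romeoandjuliet"}},
        {"parameterType": {"type": "INT64"},
         "parameterValue": {"value": "250"}},
    ],
    "useLegacySql": False,
    "parameterMode": "POSITIONAL",
}
```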
Java
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Python
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Using arrays in parameterized queries
To use an array type in a query parameter, set the type to ARRAY<T> where T
is the type of the elements in the array. Construct the value as a
comma-separated list of elements enclosed in square brackets, such as
[1, 2, 3].
See the data types reference for more information about the array type.
Command-line
This query selects the most popular names for baby boys born in US states starting with the letter W.
bq query --use_legacy_sql=False \
--parameter='gender::M' \
--parameter='states:ARRAY<STRING>:["WA", "WI", "WV", "WY"]' \
'SELECT name, sum(number) as count
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE gender = @gender
AND state IN UNNEST(@states)
GROUP BY name
ORDER BY count DESC
LIMIT 10;'
Be careful to enclose the array type declaration in single quotes so that the
command output is not accidentally redirected to a file by the > character.
API
To use an array-valued parameter, set the
jobs#configuration.query.queryParameters[].parameterType.type to ARRAY.
If the array values are scalars, set the
jobs#configuration.query.queryParameters[].parameterType.arrayType.type
to the type of the values, such as STRING. If the array values are
structures, set this to STRUCT and add the needed field definitions to
structTypes.
For example, this query selects the most popular names for baby boys born in US states starting with the letter W.
{
"query": "SELECT name, sum(number) as count\nFROM `bigquery-public-data.usa_names.usa_1910_2013`\nWHERE gender = @gender\nAND state IN UNNEST(@states)\nGROUP BY name\nORDER BY count DESC\nLIMIT 10;",
"queryParameters": [
{
"parameterType": {
"type": "STRING"
},
"parameterValue": {
"value": "M"
},
"name": "gender"
},
{
"parameterType": {
"type": "ARRAY",
"arrayType": {
"type": "STRING"
}
},
"parameterValue": {
"arrayValues": [
{
"value": "WA"
},
{
"value": "WI"
},
{
"value": "WV"
},
{
"value": "WY"
}
]
},
"name": "states"
}
],
"useLegacySql": false,
"parameterMode": "NAMED"
}
Using timestamps in parameterized queries
To use a timestamp in a query parameter, set the type to TIMESTAMP. The value
should be in the format YYYY-MM-DD HH:MM:SS.DDDDDD time_zone.
See the data types reference for more information about the timestamp type.
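If you build the parameter value programmatically, one way to produce a string in that format (assuming a UTC timestamp, so no explicit time_zone suffix is needed) is:

```python
from datetime import datetime, timezone

def to_timestamp_param(dt):
    """Format an aware datetime as the YYYY-MM-DD HH:MM:SS.DDDDDD value
    expected for a TIMESTAMP query parameter (converted to UTC here)."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S.%f")

value = to_timestamp_param(
    datetime(2016, 12, 7, 8, 0, 0, tzinfo=timezone.utc))
# value == "2016-12-07 08:00:00.000000"
```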
Command-line
This query adds an hour to the timestamp parameter value.
bq query --use_legacy_sql=False \
--parameter='ts_value:TIMESTAMP:2016-12-07 08:00:00' \
'SELECT TIMESTAMP_ADD(@ts_value, INTERVAL 1 HOUR);'
API
To use a timestamp parameter, set the jobs#configuration.query.queryParameters[].parameterType.type to TIMESTAMP.
This query adds an hour to the timestamp parameter value.
{
"query": "SELECT TIMESTAMP_ADD(@ts_value, INTERVAL 1 HOUR);",
"queryParameters": [
{
"name": "ts_value",
"parameterType": {
"type": "TIMESTAMP"
},
"parameterValue": {
"value": "2016-12-07 08:00:00"
}
}
],
"useLegacySql": false,
"parameterMode": "NAMED"
}
Using structs in parameterized queries
To use a struct in a query parameter, set the type to STRUCT<T> where T
defines the fields and types within the struct. Field definitions are
separated by commas and are of the form field_name TF where TF is the type
of the field. For example, STRUCT<x INT64, y STRING> defines a struct with a
field named x of type INT64 and a second field named y of type STRING.
See the data types reference for more information about the struct type.
Command-line
This trivial query demonstrates the use of structured types by returning the parameter value.
bq query --use_legacy_sql=False \
--parameter='struct_value:STRUCT<x INT64, y STRING>:{"x": 1, "y": "foo"}' \
'SELECT @struct_value AS s;'
API
To use a struct parameter, set the
jobs#configuration.query.queryParameters.parameterType.type to STRUCT.
Add an object for each field of the struct to
jobs#configuration.query.queryParameters.parameterType.structTypes.
If the struct values are scalars, set the type to the type of the values,
such as STRING. If the struct values are arrays, set this to ARRAY and set
the nested arrayType field to the appropriate type. If the struct values
are structures, set type to STRUCT and add the needed structTypes.
This trivial query demonstrates the use of structured types by returning the parameter value.
{
"query": "SELECT @struct_value AS s;",
"queryParameters": [
{
"name": "struct_value",
"parameterType": {
"type": "STRUCT",
"structTypes": [
{
"name": "x",
"type": {
"type": "INT64"
}
},
{
"name": "y",
"type": {
"type": "STRING"
}
}
]
},
"parameterValue": {
"structValues": {
"x": {
"value": "1"
},
"y": {
"value": "foo"
}
}
}
}
],
"useLegacySql": false,
"parameterMode": "NAMED"
}
Running interactive or batch queries
By default, BigQuery runs interactive queries, which means that the query is executed as soon as possible. Interactive queries count towards your concurrent rate limit and your daily limit. All of the examples in Running synchronous queries are interactive queries.
BigQuery also offers batch queries. BigQuery queues each batch query on your behalf, and starts the query as soon as idle resources are available, usually within a few minutes. If BigQuery hasn't started the query within 24 hours, BigQuery changes the job priority to interactive. Batch queries don't count towards your concurrent rate limit, which can make it easier to start many queries at once.
To run a batch query:
Web UI
- Go to the BigQuery web UI.
- Click the Compose query button.
- Enter a valid BigQuery SQL query in the New Query text area.
- Click the Show Options button.
- Select the Batch option in the Query Priority section.
- Click the Run query button.
Command-line
Use the --batch flag to run a batch query. For example, the following
query shows how to start an asynchronous batch query:
bq --nosync query --batch "SELECT name,count FROM mydataset.names_2013 WHERE gender = 'M' ORDER BY count DESC LIMIT 6"
API
Include the configuration.query.priority
property, with the value set to BATCH. For example, the language-specific
examples in running asynchronous queries
all use batch queries.
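For example, a minimal jobs.insert configuration with batch priority might look like the following sketch (the query text is a placeholder, and the body is shown as a Python dict mirroring the JSON payload):

```python
# Sketch of a batch-priority query job configuration. The default
# priority is INTERACTIVE; BATCH queues the query until idle resources
# are available.
batch_body = {
    "configuration": {
        "query": {
            "query": ("SELECT name, count FROM mydataset.names_2013 "
                      "WHERE gender = 'M' ORDER BY count DESC LIMIT 6"),
            "priority": "BATCH",
        }
    }
}
```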
Using cached query results
BigQuery writes all query results to a table. The table is either explicitly identified by the user (a destination table), or it is a temporary, cached results table. Temporary, cached results tables are maintained per-user, per-project.
When you run a duplicate query, BigQuery attempts to reuse
cached results. When query results are retrieved from a cached results table,
the job statistics property statistics.query.cacheHit returns as true, and you
are not charged for the query. Though you are not charged for queries that use
cached results, the queries are subject to the BigQuery quota policies.
In addition to reducing costs, queries that
use cached results are significantly faster because BigQuery does not need to
compute the result set.
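As a sketch, a helper like the following could report whether a finished job was served from cache, based on the statistics fields described above (the job dict shape follows the REST API's jobs resource; the helper itself is illustrative):

```python
def query_cost_note(job):
    """Summarize a finished query job: a cache hit means no bytes were
    processed and the query is not charged."""
    stats = job["statistics"]["query"]
    if stats.get("cacheHit"):
        return "cached result, 0 bytes billed"
    return "%s bytes processed" % stats["totalBytesProcessed"]
```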
All query results, including both interactive and batch queries, are cached in temporary tables for approximately 24 hours, with some exceptions. Query results are not cached:
- When a destination table is specified in the job configuration, the web UI, the command line, or the API
- If any of the referenced tables or logical views have changed since the results were previously cached
- When any of the tables referenced by the query have recently received streaming inserts (a streaming buffer is attached to the table), even if no new rows have arrived
- If the query uses non-deterministic functions; for example, date and time functions such as CURRENT_TIMESTAMP() and NOW(), and other functions such as CURRENT_USER() return different values depending on when a query is executed
- If the cached results have expired; typical cache lifetime is 24 hours, but the cached results are best-effort and may be invalidated sooner
For query results to persist in a cached results table, the result set must be smaller than the maximum response size. For more information about managing large result sets, see Returning large query results.
Cached result tables have additional constraints on their use, given their
special status. You cannot target cached result tables with DML
statements. Although current semantics allow it, the use of cached results as
input for dependent jobs is discouraged. For example, you should not submit
query jobs that retrieve results from the cache table. Instead, write your
results to a named destination table. To enable easy cleanup, features such as
the dataset level defaultTableExpirationMs property can expire the
data automatically after a given duration.
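For example, a datasets.patch request body like the following sketch would give newly created tables in a dataset a default 24-hour expiration (the value is a string of milliseconds, per the REST API's dataset resource):

```python
# Sketch of a datasets.patch body setting a 24-hour default table
# expiration, so cached-result workarounds written to named tables are
# cleaned up automatically.
dataset_patch = {
    "defaultTableExpirationMs": str(24 * 60 * 60 * 1000),
}
```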
Disabling retrieval of cached results
The Use cached results option reuses results from a previous run of the same query unless the tables being queried have changed. Using cached results is only beneficial for repeated queries. For new queries, the Use cached results option has no effect, though it is enabled by default.
When you repeat a query with the Use cached results option disabled, the existing cached result is overwritten. This requires BigQuery to compute the query result, and you are charged for the query. This is particularly useful in benchmarking scenarios.
If you want to disable retrieving cached results and force live evaluation of a
query job, you can set the configuration.query.useQueryCache
property of your query job to false.
To disable the Use cached results option:
Web UI
- Go to the BigQuery web UI.
- Click the Compose query button.
- Enter a valid BigQuery SQL query in the New Query text area.
- Click Show Options.
- Uncheck Use Cached Results.

Command-line
Use the --nouse_cache flag to overwrite the query cache. The following
example forces BigQuery to process the query without using the existing
cached results:
bq query --nouse_cache --batch "SELECT name,count FROM mydataset.names_2013 WHERE gender = 'M' ORDER BY count DESC LIMIT 6"
API
To process a query without using the existing cached results, set the
useQueryCache
property to false.
Ensuring use of the cache
If you use the jobs.insert()
function to run a query, you can force a query job to fail unless cached results
can be used by setting the
createDisposition
property of the job configuration to CREATE_NEVER.
If the query result does not exist in the cache, a NOT_FOUND error is returned.
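A minimal sketch of such a job configuration (the query text is a placeholder, shown as a Python dict mirroring the JSON payload):

```python
# Sketch of a job configuration that fails with NOT_FOUND unless a
# cached result already exists for this query.
cache_only_body = {
    "configuration": {
        "query": {
            "query": ("SELECT word FROM "
                      "`bigquery-public-data.samples.shakespeare` LIMIT 10"),
            "createDisposition": "CREATE_NEVER",
        }
    }
}
```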
Verifying use of the cache
There are two ways to determine if BigQuery returned a result using the cache:
- If you are using the BigQuery web UI, the result string does not contain information about the number of processed bytes, and displays the word "cached".
- If you are using the BigQuery API, the cacheHit property in the query result is set to true.
Returning large query results
Normally, queries have a maximum response size.
If you plan to run a query that might return larger results,
you can set allowLargeResults to true in your job configuration. Configuring
large results requires you to specify a destination table. You incur storage
charges for the destination table.
Queries with large results are subject to these limitations:
- You must specify a destination table.
- You cannot specify a top-level ORDER BY, TOP, or LIMIT clause. Doing so negates the benefit of using allowLargeResults, because the query output can no longer be computed in parallel.
- Window functions can return large query results only if used in conjunction with a PARTITION BY clause.
To allow large query results:
Web UI
- Go to the BigQuery web UI.
- Click the Compose query button.
- Enter a valid BigQuery SQL query in the New Query text area.
- Click Show Options.
- For Destination Table, click Select Table and type a name in the Table ID field.
- For Results Size, check Allow Large Results.

Command-line
Use the --allow_large_results flag with the --destination_table flag to
create a destination table to hold the large results set:
bq query --destination_table '[DATASET].[TABLE_NAME]' --allow_large_results "[QUERY]"
API
To enable large results, set the
configuration.query.allowLargeResults
property to true and specify a destination table using configuration.query.destinationTable.
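Combined, a sketch of a large-results job configuration might look like this (project, dataset, table names, and the query are placeholders):

```python
# Sketch of a job configuration allowing large results. Both fields are
# required together: allowLargeResults plus an explicit destination table.
large_results_body = {
    "configuration": {
        "query": {
            "query": "SELECT * FROM mydataset.big_table",
            "allowLargeResults": True,
            "destinationTable": {
                "projectId": "my-project",
                "datasetId": "mydataset",
                "tableId": "big_results",
            },
        }
    }
}
```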
Querying table subsets by using table decorators
Normally, BigQuery performs a full column scan when running a query. You can use table decorators to perform a more cost-effective query of a subset of your data. For more information, see table decorators.
Creating views
A view is a virtual table defined by a SQL query. You can query views in the browser tool, or by using a query job.
BigQuery's views are logical views, not materialized views, which means that the query that defines the view is re-executed every time the view is queried. Queries are billed according to the total amount of data in all table fields referenced directly or indirectly by the top-level query.
BigQuery supports up to eight levels of nested views; if there are more than
eight levels, an INVALID_INPUT error is returned. You are also limited to 1,000
authorized views per dataset. In addition, views can only reference other tables
and views in the same dataset location.
Querying a view requires the READER role for all datasets that contain the
tables in the view chain. For more information about roles, see access
control.
Web UI
After running a query, click the Save View button in the query
results window to save the results as a view.
Command-line
Use the mk command with the --view flag.
bq mk --view "[SQL_QUERY]" [DATASET].[VIEW]
Where:
- SQL_QUERY is a valid BigQuery SQL query
- DATASET is a dataset in your project
- VIEW is the name of the view you want to create
If your query references external user-defined function resources
stored in Google Cloud Storage or local files, use the --view_udf_resource
flag to specify those resources. For example:
bq mk --view="SELECT foo FROM myUdf(table1)" --view_udf_resource="gs://my-bucket/some_library.js" --view_udf_resource="path/to/local/file.js"
API
Call tables.insert()
with a table resource that
contains a view property.
You can edit a view by calling tables.patch() in the BigQuery API.
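For illustration, a minimal table resource containing a view property might look like the following sketch (project, dataset, and view names are placeholders; the dict mirrors the JSON body passed to tables.insert):

```python
# Sketch of a tables.insert body that creates a logical view: a table
# resource whose "view" property holds the defining SQL query.
view_body = {
    "tableReference": {
        "projectId": "my-project",
        "datasetId": "mydataset",
        "tableId": "my_view",
    },
    "view": {
        "query": "SELECT name, count FROM mydataset.babynames",
    },
}
```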
Using meta-tables
BigQuery offers some special tables whose contents represent metadata, such as
the names of your tables. The "meta-tables" are read-only. Normally, you use
one by referencing it in a SELECT statement.
Meta-tables can be used in other API operations besides a query job, such as
tables.get or tabledata.list. They do not support tables.insert and
cannot be used as a destination table; they also do not support table decorators.
Meta-tables do not appear in a tables.list of the dataset.
Metadata about tables in a dataset
You can access metadata about the tables in a dataset by using the __TABLES__
or __TABLES_SUMMARY__ meta-table.
Use the following syntax to query a meta-table:
SELECT [FIELD] FROM [DATASET].__TABLES__;
Where DATASET is the name of your dataset, and FIELD is one of the following:
| Field | Description |
|---|---|
| project_id | Name of the project. |
| dataset_id | Name of the dataset. |
| table_id | Name of the table. |
| creation_time | The time at which the table was created, in milliseconds since January 1, 1970 UTC. |
| last_modified_time | The time at which the table was most recently changed, in milliseconds since January 1, 1970 UTC. |
| row_count | Number of rows in the table. |
| size_bytes | Total size of the table, measured in bytes. |
| type | An integer representing the table type: a regular table (1) or a view (2). |
Example
The following query retrieves metadata about the tables in the publicdata:samples dataset.
SELECT * FROM publicdata:samples.__TABLES__;
Returns:
+------------+------------+-----------------+---------------+--------------------+-----------+--------------+------+
| project_id | dataset_id | table_id        | creation_time | last_modified_time | row_count | size_bytes   | type |
+------------+------------+-----------------+---------------+--------------------+-----------+--------------+------+
| publicdata | samples    | github_nested   | 1348782587310 | 1348782587310      |   2541639 |   1694950811 |    1 |
| publicdata | samples    | github_timeline | 1335915950690 | 1335915950690      |   6219749 |   3801936185 |    1 |
| publicdata | samples    | gsod            | 1335916040125 | 1440625349328      | 114420316 |  17290009238 |    1 |
| publicdata | samples    | natality        | 1335916045005 | 1440625330604      | 137826763 |  23562717384 |    1 |
| publicdata | samples    | shakespeare     | 1335916045099 | 1440625429551      |    164656 |      6432064 |    1 |
| publicdata | samples    | trigrams        | 1335916127449 | 1445684180324      |  68051509 | 277168458677 |    1 |
| publicdata | samples    | wikipedia       | 1335916132870 | 1445689914564      | 313797035 |  38324173849 |    1 |
+------------+------------+-----------------+---------------+--------------------+-----------+--------------+------+
Scaling to a large number of tables
__TABLES__ provides more information than tables.list, but tables.list
is faster on a dataset with many tables.
__TABLES_SUMMARY__ is a meta-table that is faster than __TABLES__ because
it does not contain the data-dependent fields: last_modified_time, row_count,
and size_bytes.
Generally, __TABLES__ and __TABLES_SUMMARY__ are reasonably fast for
datasets with up to a few thousand tables. For larger datasets they will become
increasingly slow, and may exceed available resources.
Temporary and permanent tables
BigQuery saves all query results to a table, which can be either permanent or temporary:
-
A temporary table is a randomly named table saved in a special dataset; the table has a lifetime of approximately 24 hours. Temporary tables are not available for sharing, and are not visible using any of the standard list or other table manipulation methods.
-
A permanent table can be a new or existing table in any dataset in which you have WRITE privileges.
Additional limits
In addition to the standard query quotas, the following limits also apply for querying data.
- Maximum tables per query: 1,000
- Maximum query length: 256 KB