This page describes how to export data out of BigQuery.
After you've loaded your data into BigQuery, you can export the data in several formats. BigQuery can export up to 1 GB of data per file, and supports exporting to multiple files.
You can also use Google Cloud Dataflow to read data from BigQuery. For more information about using Cloud Dataflow to read from, and write to, BigQuery, see BigQuery I/O in the Cloud Dataflow documentation.
Before you begin
- Ensure that you have read access to the data you want to export. You need dataset-level `READER` access. Alternatively, you can use any BigQuery IAM role that provides read access, such as `dataViewer`, `dataEditor`, or `user`.
- Ensure that you have write access to a Google Cloud Storage bucket. You can export data only to a Cloud Storage bucket.
Exporting data stored in BigQuery
The following examples export data from a public dataset to a Cloud Storage bucket. The data is exported in CSV format.
Web UI
1. Go to the BigQuery web UI.
2. In the navigation, find and click the expansion icon next to Public Datasets, and then expand `bigquery-public-data:samples` to display its contents.
3. Find and click the down arrow icon next to `shakespeare`.
4. Select Export table to display the Export to Google Cloud Storage dialog.
5. Leave the default settings in place for Export format and Compression (CSV and None, respectively).
6. In the Google Cloud Storage URI text box, enter a valid URI in the format `gs://[BUCKET_NAME]/[FILENAME.CSV]`, where `[BUCKET_NAME]` is your Cloud Storage bucket name, and `[FILENAME.CSV]` is the name of your destination file.
7. Click OK to export the table. While the job is running, (extracting) appears next to the name of the table in the navigation.
8. To check on the progress of the job, look near the top of the navigation for Job History for an Extract job.
Command-line
Use the bq extract command.
bq extract [DATASET].[TABLE_NAME] gs://[BUCKET_NAME]/[FILENAME.CSV]
For example, the following command exports the shakespeare table from the
bigquery-public-data:samples dataset into a file named shakespeare.csv
in a Cloud Storage bucket named example-bucket:
bq extract 'bigquery-public-data:samples.shakespeare' gs://example-bucket/shakespeare.csv
The default destination format is CSV. To export into JSON or Avro, use the
destination_format flag:
bq extract --destination_format=NEWLINE_DELIMITED_JSON 'bigquery-public-data:samples.shakespeare' gs://example-bucket/shakespeare.json
API
To export data, create a job and populate the `configuration.extract` object.

1. Create an extract job that points to the BigQuery source data and the Cloud Storage destination. For information about creating jobs, see Managing Jobs, Datasets, and Projects.
2. Specify the source table by using the `sourceTable` configuration object, which comprises the project ID, dataset ID, and table ID.
3. Specify the destination URI(s). Each URI must be fully qualified, in the format `gs://[BUCKET_NAME]/[FILENAME.CSV]`. Each URI can contain one '*' wildcard character, and it must come after the bucket name.
4. Specify the data format by setting the `configuration.extract.destinationFormat` property. For example, to export a JSON file, set this property to the value `NEWLINE_DELIMITED_JSON`.
5. To check the job status, call `jobs.get([JOB_ID])` with the ID of the job returned by the initial request.
   - If `status.state = DONE`, the job completed successfully.
   - If the `status.errorResult` property is present, the request failed, and that object will include information describing what went wrong.
   - If `status.errorResult` is absent, the job finished successfully, although there might have been some non-fatal errors. Non-fatal errors are listed in the returned job object's `status.errors` property.
API notes:

- As a best practice, generate a unique ID and pass it as `jobReference.jobId` when calling `jobs.insert()` to create a job. This approach is more robust to network failure because the client can poll or retry on the known job ID.
- Calling `jobs.insert()` on a given job ID is idempotent; in other words, you can retry as many times as you like on the same job ID, and at most one of those operations will succeed.
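As an illustrative sketch (not taken from the official samples), the following Python snippet uses the Google API client library (`googleapiclient`) to populate `configuration.extract`, insert the job with a unique job ID, and poll its status. The project and bucket names are placeholders, and application default credentials are assumed.

```python
# A minimal sketch, assuming the google-api-python-client library and
# application default credentials. Project and bucket names are placeholders.
import time
import uuid

from googleapiclient.discovery import build

project_id = 'my-project'           # placeholder project ID
bigquery = build('bigquery', 'v2')  # BigQuery REST API, v2

# Generate a unique job ID so jobs.insert() can be retried safely.
job_id = 'extract-job-{}'.format(uuid.uuid4())

job_body = {
    'jobReference': {'projectId': project_id, 'jobId': job_id},
    'configuration': {
        'extract': {
            'sourceTable': {
                'projectId': 'bigquery-public-data',
                'datasetId': 'samples',
                'tableId': 'shakespeare',
            },
            'destinationUris': ['gs://example-bucket/shakespeare.csv'],
            'destinationFormat': 'CSV',
            'compression': 'NONE',
        }
    },
}

bigquery.jobs().insert(projectId=project_id, body=job_body).execute()

# Poll jobs.get() until the job reaches the DONE state.
while True:
    job = bigquery.jobs().get(projectId=project_id, jobId=job_id).execute()
    if job['status']['state'] == 'DONE':
        if 'errorResult' in job['status']:
            raise RuntimeError(job['status']['errorResult'])
        break
    time.sleep(1)
```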
Configuring export options
You can configure two aspects of the exported data: the format and the compression type.
Destination format
BigQuery supports CSV, JSON, and Avro formats. Nested or repeated data cannot be exported to CSV, but it can be exported to JSON or Avro format.
Web UI
Set the destination format in the Export to Google Cloud Storage dialog.
Follow steps 1 through 4 in the Exporting data stored in BigQuery section to display the Export to Google Cloud Storage dialog.
Use the drop-down list next to Export format to select CSV, JSON, or Avro format.
Command-line
Use the bq extract command with the destination_format flag to set
the format:
bq extract --destination_format=[CSV | NEWLINE_DELIMITED_JSON | AVRO] [DATASET].[TABLE_NAME] gs://[BUCKET_NAME]/[FILENAME]
For example, the following command exports the shakespeare table from the
bigquery-public-data:samples dataset into a file named shakespeare.json
in a Cloud Storage bucket named example-bucket in JSON format:
bq extract --destination_format=NEWLINE_DELIMITED_JSON 'bigquery-public-data:samples.shakespeare' gs://example-bucket/shakespeare.json
API
Specify the data format by setting the
configuration.extract.destinationFormat
property. For example, to export a JSON file, set this property to the
value NEWLINE_DELIMITED_JSON.
Compression
BigQuery supports GZIP compression, but the default setting is no compression (NONE).
Web UI
Set the compression type in the Export to Google Cloud Storage dialog.
Follow steps 1 through 4 in the Exporting data stored in BigQuery section to display the Export to Google Cloud Storage dialog.
Select a compression type, either `NONE` or `GZIP`, using the selection buttons for Compression.
Command-line
Use the bq extract command with the compression flag to set the compression type:
bq extract --compression=[GZIP | NONE] [DATASET].[TABLE_NAME] gs://[BUCKET_NAME]/[FILENAME]
For example, the following command exports the shakespeare table from the
bigquery-public-data:samples dataset into a file named shakespeare.zip
in a Cloud Storage bucket named example-bucket in GZIP format:
bq extract --compression=GZIP 'bigquery-public-data:samples.shakespeare' gs://example-bucket/shakespeare.zip
API
Specify the compression type by setting the
configuration.extract.compression
property. For example, to use GZIP compression, set this property to the
value GZIP.
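As a sketch combining the destination format and compression settings described above (the bucket and file names are placeholders, not from the original docs), the extract portion of a job configuration requesting gzipped JSON output could look like this:

```python
# Hypothetical configuration.extract fragment: newline-delimited JSON, gzip-compressed.
extract_config = {
    'sourceTable': {
        'projectId': 'bigquery-public-data',
        'datasetId': 'samples',
        'tableId': 'shakespeare',
    },
    'destinationUris': ['gs://example-bucket/shakespeare.json.gz'],  # placeholder bucket
    'destinationFormat': 'NEWLINE_DELIMITED_JSON',
    'compression': 'GZIP',
}
```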
Avro format
BigQuery expresses Avro formatted data in the following ways:
- The resulting export files are Avro container files.
- Each BigQuery row is represented as an Avro Record. Nested data is represented by nested Record objects.
- `REQUIRED` fields are represented as the corresponding Avro types. For example, a BigQuery `INTEGER` type maps to an Avro `LONG` type.
- `NULLABLE` fields are represented as an Avro Union of the corresponding type and "null".
- `REPEATED` fields are represented as Avro arrays.
- `TIMESTAMP` data types are represented as Avro `LONG` types.
The Avro format can't be used in combination with GZIP compression.
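As a usage sketch (assuming the exported file has been copied locally, for example with gsutil, and that the third-party `fastavro` package is installed; neither assumption comes from the original docs), you could inspect an exported Avro container file like this:

```python
# Read records back from an exported Avro container file.
# Assumes the file was downloaded from Cloud Storage beforehand.
from fastavro import reader

with open('shakespeare.avro', 'rb') as avro_file:
    for record in reader(avro_file):
        # Each record is a dict; nested fields appear as nested dicts,
        # REPEATED fields as lists, and NULLABLE fields may be None.
        print(record)
        break  # print just the first record
```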
Extract configuration example
The following code example shows the configuration of a job that exports data to a CSV file.
C#
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Go
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Java
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Node.js
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
PHP
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Python
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
Ruby
For more on installing and creating a BigQuery client, refer to BigQuery Client Libraries.
The destinationUris
property indicates the location(s) and file name(s) where BigQuery should export
your files to. Usually, you'll pass a single value so that BigQuery exports to a
single file, but you can alternately pass one or more wildcard URIs. See the
following section for information on what to specify for the destinationUris
property.
Exporting data into one or more files
The destinationUris
property indicates the location(s) and file name(s) where BigQuery should export
your files.
BigQuery supports a single wildcard operator (*) in each URI. The wildcard can
appear anywhere in the URI except as part of the bucket name. Using the wildcard
operator instructs BigQuery to create multiple sharded files based on the
supplied pattern. The wildcard operator is replaced with a number (starting at
0), left-padded to 12 digits. For example, a URI with a wildcard at the end of
the file name would create files with 000000000000 appended to the first file,
000000000001 appended to the second file, and so on.
The following options are available for the destinationUris
property:

**Single URI**

Use a single URI if you want BigQuery to export your data to a single file. This option is the most common use case, as exported data is generally less than BigQuery's 1 GB per file maximum value.

Property definition:

['gs://my-bucket/file-name.json']

Creates:

gs://my-bucket/file-name.json

**Single wildcard URI**

Use a single wildcard URI if you think your exported data will be larger than BigQuery's 1 GB per file maximum value. BigQuery shards your data into multiple files based on the provided pattern. If you use a wildcard in a URI component other than the file name, be sure the path component does not exist before exporting your data.

Property definition:

['gs://my-bucket/file-name-*.json']

Creates:

gs://my-bucket/file-name-000000000000.json
gs://my-bucket/file-name-000000000001.json
gs://my-bucket/file-name-000000000002.json
...

Property definition:

['gs://my-bucket/path-component-*/file-name.json']

Creates:

gs://my-bucket/path-component-000000000000/file-name.json
gs://my-bucket/path-component-000000000001/file-name.json
gs://my-bucket/path-component-000000000002/file-name.json
...

**Multiple wildcard URIs**

Use multiple wildcard URIs if you want to partition the export output. You would use this option if you're running a parallel processing job with a service like Hadoop on Google Cloud Platform. Determine how many workers are available to process the job, and create one URI per worker. BigQuery treats each URI location as a partition, and uses parallel processing to shard your data into multiple files in each location. You can use whatever pattern you'd like in your file name, provided there is a single wildcard operator in each URI, each URI is unique, and the number of URIs does not exceed the quota policy.

When you pass more than one wildcard URI, BigQuery creates a special zero record file at the end of each partition that indicates the "final" file in the set. Its file name indicates how many shards BigQuery created. For example, if your wildcard URI is gs://my-bucket/file-name-1-*.json and BigQuery creates 80 sharded files, the zero record file is named gs://my-bucket/file-name-1-000000000080.json. Note that a zero record file might contain more than 0 bytes depending on the data format, such as when exporting data in CSV format with a column header.

Property definition:

['gs://my-bucket/file-name-1-*.json',
 'gs://my-bucket/file-name-2-*.json',
 'gs://my-bucket/file-name-3-*.json']

Creates:

This example assumes that BigQuery creates 80 sharded files in each partition.

gs://my-bucket/file-name-1-000000000000.json
gs://my-bucket/file-name-1-000000000001.json
...
gs://my-bucket/file-name-1-000000000080.json
gs://my-bucket/file-name-2-000000000000.json
gs://my-bucket/file-name-2-000000000001.json
...
gs://my-bucket/file-name-2-000000000080.json
gs://my-bucket/file-name-3-000000000000.json
gs://my-bucket/file-name-3-000000000001.json
...
gs://my-bucket/file-name-3-000000000080.json
Quota policy
The following limits apply to exporting data from BigQuery.
- Daily Limit: 1,000 exports per day, up to 10 TB
- Multiple Wildcard URI Limit: 500 URIs per export
What's next
- To learn more about the BigQuery web UI, see BigQuery Web UI.
- To learn more about the `bq` command-line tool, see bq Command-Line Tool.
- To learn how to create an application using the Google BigQuery API, see Create A Simple Application With the API.