To help organize your projects, you can add arbitrary labels to your datasets. After labeling your datasets, you can search for datasets based on label values. For example, you can use labels to group datasets by purpose, environment, department, and so on.
Before you begin
- Read the Datasets documentation.
- If you plan to use the command-line tool to apply labels, read the Quickstart Using the bq Command-Line Tool and the bq Command-Line Tool reference.
- If you plan to use the BigQuery API to apply labels, read Create A Simple Application With the API.
Labeling a dataset
Labels are key:value pairs that you can attach to a dataset. BigQuery labels have the same characteristics as labels in Compute Engine and Cloud Resource Manager:
- Label keys and values can be no longer than 63 characters.
- Label keys and values can contain only lowercase letters, numbers, underscores, hyphens, and international characters.
- Label keys and values cannot exceed 128 bytes in size.
- Label keys must begin with a letter.
- Label keys must be unique within a dataset.
Each of your datasets can have up to 64 labels. Datasets do not have any default labels.
Labels must have a key, but can have an empty value. Labels with no value can be used as tags.
Creating a label
To create a label for a dataset:
Command-line
Use the bq update command with the set_label flag.
bq update --set_label [KEY1:VALUE1] --set_label [KEY2:VALUE2] --set_label [KEYN:VALUEN] [DATASET_ID]
Where [KEY1:VALUE1], [KEY2:VALUE2], and [KEYN:VALUEN] are label
key:value pairs that you want to create, and [DATASET_ID] is a valid
dataset ID.
For example, to add a label to track departments, you can
create a label with a department key:
bq update --set_label department:shipping [DATASET_ID]
API
To create a label using the API, add a labels field to the dataset's
configuration object when creating or updating the dataset.
{
"kind": "bigquery#dataset",
...
"friendlyName": string,
"description": string,
"labels": {
"[KEY1]": "[VALUE1]",
"[KEY2]": "[VALUE2]"
}
...
}
Where [KEY1] and [KEY2] are label keys, and [VALUE1] and [VALUE2]
are the corresponding label values that you want to create.
Updating a label
To update a label for a dataset:
Command-line
Use the bq update command with the set_label flag.
bq update --set_label [KEY1:VALUE1] --set_label [KEY2:VALUE2] --set_label [KEYN:VALUEN] [DATASET_ID]
Where [KEY1:VALUE1], [KEY2:VALUE2], and [KEYN:VALUEN] are label
key:value pairs that you want to update, and [DATASET_ID] is a valid
dataset ID.
For example, to update a label that tracks departments, specify
a new value for the department key:
bq update --set_label department:receiving [DATASET_ID]
API
Modify the
labelsfield in the dataset's configuration object.{ "kind": "bigquery#dataset", ... "friendlyName": string, "description": string, "labels": { "[KEY1]": "[VALUE1]", "[KEY2]": "[VALUE2]" } ... }Where
[KEY1]and[KEY2]are label keys, and[VALUE1]and[VALUE2]are the corresponding label values that you want to create.To replace all labels in the dataset, call Datasets.update().
Alternately, to replace only the labels specified and otherwise preserve existing ones, call Datasets.patch().
If there are concurrent calls to update() and delete(),
BigQuery executes the last one received.
Deleting a label
To delete a label from a dataset:
Command-line
Use the bq update command with the clear_label flag.
bq update --clear_label [KEY1] --clear_label [KEY2] --clear_label [KEYN] [DATASET_ID]
Where [KEY1], [KEY2], and [KEYN] are label keys that you want to
delete, and [DATASET_ID] is a valid dataset ID.
For example, to delete a label that tracks departments, specify the
department key:
bq update --clear_label department [DATASET_ID]
API
To remove all labels from a dataset:
Remove the
labelsfield from the dataset's configuration object.{ "kind": "bigquery#dataset", ... "friendlyName": string, "description": string, ... }Call Datasets.update().
To delete specific labels from a dataset and otherwise preserve the other existing labels:
Change the value of the labels you want to delete to
null:{ "kind": "bigquery#dataset", ... "friendlyName": string, "description": string, "labels": { "[KEY1]": null, "[KEY2]": null } ... }Where
[KEY1]and[KEY2]are label keys that you want to delete.Call Datasets.patch().
If there are concurrent calls to update() and delete(),
BigQuery executes the last one received.
Deleting a label's value
A label that has a key but an empty value can be used as a tag. To delete a label's value, specify a label with an empty value:
Command-line
Use the bq update command with the set_label flag. Specify the key,
followed by a colon, but leave the value unspecified.
bq update --set_label [KEY1]: --set_label [KEY2]: --set_label [KEYN]: [DATASET_ID]
Where [KEY1], [KEY2], and [KEYN] are label keys that you want to keep
without values, and [DATASET_ID] is a valid dataset ID.
For example, to delete the value associated with the department key,
specify the department key with an empty value:
bq update --set_label department: [DATASET_ID]
API
To delete a label's value, but retain the label's key:
Change the value of the labels you want to delete to the empty string (
""):{ "kind": "bigquery#dataset", ... "friendlyName": string, "description": string, "labels": { "[KEY1]": "", "[KEY2]": "" } ... }Where
[KEY1]and[KEY2]are label keys that you want to keep, but with no values attached to them.Call Datasets.patch().
If there are concurrent calls to update() and delete(),
BigQuery executes the last one received.
Viewing a dataset's labels
To view a dataset's labels:
Command-line
Use the bq show command with the dataset ID.
bq show [DATASET_ID]
The command-line tool returns the labels associated with that dataset.
API
Call the Datasets.get() command. The response includes all labels associated with that dataset.
Alternately, you can use Datasets.list() to view the labels for multiple datasets. You can also use filters to return a group of datasets that share common labels.
Filtering datasets using labels
To filter for a set of datasets based on labels, create a filter specification
for use in the bq command-line tool or the BigQuery API that
uses the following syntax:
field[:value][ field[:value]...
Where:
fieldis expressed aslabels.[KEY]where[KEY]is a label key.valueis an optional label value.
The filter specification has the following characteristics:
- Only the AND logical operator is supported. Space-separated comparisons are treated as having implicit AND operators.
- The only field currently eligible for filtering is "labels.key" where "key" is the name of a label.
- The filter can include up to ten expressions.
- Filtering is case-sensitive.
Filter specification examples
To search for all datasets that have a department key with a
value of shipping, use the following specification:
labels.department:shipping
To search for multiple key:value pairs, separate them with a space. For example,
to search for all datasets where the value of the department key is shipping
and the value of the location key is usa, use the following specification:
labels.department:shipping labels.location:usa
You can filter on the presence of a key alone, rather than matching against a
key:value pair. The following filter specification returns all datasets
with labels named department, without regard to the values that might be
associated with the keys:
labels.department
An equivalent specification uses an asterisk to represent all possible values
associated with the department key:
labels.department:*
Generating filtered lists of datasets
To generate a filtered list of datasets:
Command-line
Use the bq ls command with the filter flag where [FILTER_SPECIFICATION]
is a valid filter specification and [DATASET_ID]
is a valid dataset ID:
bq ls --filter "[FILTER_SPECIFICATION]" [DATASET_ID]
The command-line tool returns a list of datasets that meet the filter requirements.
For example, the following command returns a list of datasets that have a
department key with a value of shipping:
bq ls --filter "labels.department:shipping" [DATASET_ID]
API
Add a
filterproperty that specifies the filter you want to use:{ "kind": "bigquery#dataset", ... "friendlyName": string, "description": string, "filter": "[FILTER_SPECIFICATION]", ... }Where
[FILTER_SPECIFICATION]is a valid filter specification.Call the Datasets.list() command. The response includes all labels associated with that dataset.
What's next
- Read about Labeling or Tagging Resources in the Compute Engine documentation.
- Read about Using Labels in the Resource Manager documentation.