User-Defined Functions in Standard SQL
BigQuery supports user-defined functions (UDFs). A UDF enables you to create a function using another SQL expression or another programming language, such as JavaScript. These functions accept columns of input and perform actions, returning the result of those actions as a value. For information on user-defined functions in legacy SQL, see User-Defined Functions in Legacy SQL.
UDFs are temporary. This means you can only use them for the current query or command-line session.
UDF examples
CREATE TEMPORARY FUNCTION multiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
LANGUAGE js AS """
return x*y;
""";
WITH numbers AS
(SELECT 1 AS x, 5 as y
UNION ALL
SELECT 2 AS x, 10 as y
UNION ALL
SELECT 3 as x, 15 as y)
SELECT x, y, multiplyInputs(x, y) as product
FROM numbers;
+-----+-----+--------------+
| x | y | product |
+-----+-----+--------------+
| 1 | 5 | 5 |
| 2 | 10 | 20 |
| 3 | 15 | 45 |
+-----+-----+--------------+
You can create multiple UDFs before a query. For example:
CREATE TEMPORARY FUNCTION multiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
LANGUAGE js AS """
return x*y;
""";
CREATE TEMPORARY FUNCTION divideByTwo(x FLOAT64)
RETURNS FLOAT64
LANGUAGE js AS """
return x / 2;
""";
WITH numbers AS
(SELECT 1 AS x, 5 as y
UNION ALL
SELECT 2 AS x, 10 as y
UNION ALL
SELECT 3 as x, 15 as y)
SELECT x,
y,
multiplyInputs(x, y) as product,
divideByTwo(x) as half_x,
divideByTwo(y) as half_y
FROM numbers;
+-----+-----+--------------+--------+--------+
| x | y | product | half_x | half_y |
+-----+-----+--------------+--------+--------+
| 1 | 5 | 5 | 0.5 | 2.5 |
| 2 | 10 | 20 | 1 | 5 |
| 3 | 15 | 45 | 1.5 | 7.5 |
+-----+-----+--------------+--------+--------+
You can pass the result of a UDF as input to another UDF. For example:
CREATE TEMPORARY FUNCTION multiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
LANGUAGE js AS """
return x*y;
""";
CREATE TEMPORARY FUNCTION divideByTwo(x FLOAT64)
RETURNS FLOAT64
LANGUAGE js AS """
return x/2;
""";
WITH numbers AS
(SELECT 1 AS x, 5 as y
UNION ALL
SELECT 2 AS x, 10 as y
UNION ALL
SELECT 3 as x, 15 as y)
SELECT x,
y,
multiplyInputs(divideByTwo(x), divideByTwo(y)) as half_product
FROM numbers;
+-----+-----+--------------+
| x | y | half_product |
+-----+-----+--------------+
| 1 | 5 | 1.25 |
| 2 | 10 | 5 |
| 3 | 15 | 11.25 |
+-----+-----+--------------+
The following example shows a UDF that employs a SQL function.
CREATE TEMPORARY FUNCTION addFourAndDivide(x INT64, y INT64) AS ((x + 4) / y);
WITH numbers AS
(SELECT 1 as val
UNION ALL
SELECT 3 as val
UNION ALL
SELECT 4 as val
UNION ALL
SELECT 5 as val)
SELECT val, addFourAndDivide(val, 2) AS result
FROM numbers;
+-----+--------+
| val | result |
+-----+--------+
| 1 | 2.5 |
| 3 | 3.5 |
| 4 | 4 |
| 5 | 4.5 |
+-----+--------+
External UDF structure
You create external UDFs using the following structure.
CREATE [TEMPORARY | TEMP] FUNCTION function_name ([named_parameter[, ...]])
RETURNS [data_type]
LANGUAGE [language]
AS [external_code]
Each UDF consists of the following components:
- CREATE [TEMPORARY | TEMP ] FUNCTION. Creates a new function. A
function can contain zero or more
named_parameters each consisting of comma-separatedparam_nameandparam_typepairs. You must include eitherTEMPORARYorTEMPwhen you create a UDF. - RETURNS [data_type]. Specifies the data type that the function returns. See Supported UDF data types for more information.
- Language [language]. Specifies the language for the function. See Supported external UDF languages for more information.
- AS [external_code]. Specifies the code that the function runs. See Quoting rules for more information about how to add code to a UDF.
Supported external UDF languages
External UDFs support code written in JavaScript, which you specify using js
as the LANGUAGE. For example:
CREATE TEMPORARY FUNCTION greeting(a STRING)
RETURNS STRING
LANGUAGE js AS """
return "Hello, " + a + "!";
""";
SELECT greeting(name) as everyone
FROM UNNEST(["Hannah", "Max", "Jakob"]) AS name;
+----------------+
| everyone |
+----------------+
| Hello, Hannah! |
| Hello, Max! |
| Hello, Jakob! |
+----------------+
See SQL type encodings in JavaScript for information on how BigQuery data types map to JavaScript types.
Supported external UDF data types
For external UDFs, BigQuery supports the following data types:
- ARRAY
- BOOL
- BYTES
- FLOAT64
- STRING
- STRUCT
- TIMESTAMP
SQL type encodings in JavaScript
Some SQL types have a direct mapping to JavaScript types, but others do not.
Because JavaScript does not support a 64-bit integer type, INT64 is
unsupported in input or output types for JavaScript UDFs. Instead, use
FLOAT64 to represent integer values as a number, or
STRING to represent integer values as a string.
BigQuery represents types in the following manner:
| BigQuery Data Type | JavaScript Data Type |
|---|---|
| ARRAY | ARRAY |
| BOOL | BOOLEAN |
| BYTES | base64-encoded STRING |
| FLOAT64 | NUMBER |
| STRING | STRING |
| STRUCT | OBJECT where each STRUCT field is a named field |
| TIMESTAMP | DATE with a microsecond field containing the microsecond
fraction of the timestamp |
Quoting rules
You must enclose external code in quotes. For simple, one line code snippets, you can use a standard quoted string:
CREATE TEMPORARY FUNCTION plusOne(x FLOAT64)
RETURNS FLOAT64
LANGUAGE js
AS "return x+1;";
SELECT val, plusOne(val) AS result
FROM UNNEST([1, 2, 3, 4, 5]) AS val;
+-----------+-----------+
| val | result |
+-----------+-----------+
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
| 4 | 5 |
| 5 | 6 |
+-----------+-----------+
In cases where the snippet contains quotes, or consists of multiple lines, use triple-quoted blocks:
CREATE TEMPORARY FUNCTION customGreeting(a STRING)
RETURNS STRING
LANGUAGE js AS """
var d = new Date();
if (d.getHours() < 12) {
return 'Good Morning, ' + a + '!';
} else {
return 'Good Evening, ' + a + '!';
}
""";
SELECT customGreeting(names) as everyone
FROM UNNEST(["Hannah", "Max", "Jakob"]) AS names;
+-----------------------+
| everyone |
+-----------------------+
| Good Morning, Hannah! |
| Good Morning, Max! |
| Good Morning, Jakob! |
+-----------------------+
SQL UDF structure
You create SQL UDFs using the following structure.
CREATE [TEMPORARY | TEMP] FUNCTION function_name ([named_parameter[, ...]])
[RETURNS data_type]
AS (sql_expression)
Each UDF consists of the following components:
- CREATE [TEMPORARY | TEMP ] FUNCTION. Creates a new function. A
function can contain zero or more
named_parameters each consisting of comma-separatedparam_nameandparam_typepairs. You must include eitherTEMPORARYorTEMPwhen you create a UDF. - [RETURNS data_type]. Optional. Specifies the data type that the function returns.
- AS [sql_expression]. Specifies the SQL expression that the function evaluates and returns.
Including external libraries
You can extend your external UDFs using the OPTIONS section. This section
allows you to specify external code libraries for the UDF.
CREATE TEMP FUNCTION myFunc(a FLOAT64, b STRING)
RETURNS STRING
LANGUAGE js AS
"""
// Assumes 'doInterestingStuff' is defined in one of the library files.
return doInterestingStuff(a, b);
"""
OPTIONS (
library="gs://my-bucket/path/to/lib1.js",
library=["gs://my-bucket/path/to/lib2.js", "gs://my-bucket/path/to/lib3.js"]
)
In the preceding example, code in lib1.js, lib2.js, and lib3.js is
available to any code in the [external_code] section of the UDF. Notice that
you can specify library files using single-element or array syntax.
UDFs and the Web UI
You can use the BigQuery web UI to run queries using one or more UDFs.
Running a query with a UDF
- Click the COMPOSE QUERY button.
- Click the Query Editor tab.
- Click the Show Options button.
- Uncheck the Use Legacy SQL checkbox.
-
Type the UDF statement into the Query Editor text area. For example:
CREATE TEMPORARY FUNCTION timesTwo(x FLOAT64) RETURNS FLOAT64 LANGUAGE js AS """ return x*2; """; -
Below the UDF statement, type your query. For example:
SELECT timesTwo(numbers) as doubles FROM UNNEST([1, 2, 3, 4, 5]) AS numbers; -
Click the RUN QUERY button. The query results display underneath the buttons.
UDFs and the bq command-line tool
You can use the bq Command-Line Tool from the Google Cloud SDK to run a query containing one or more UDFs.
Use the following syntax to run a query with a UDF:
bq query <statement_with_udf_and_query>
Best practices for JavaScript UDFs
Pre-filter your input
If your input can be easily filtered down before being passed to a JavaScript UDF, your query will likely be faster and cheaper.
Avoid persistent mutable state
Do not store or access mutable state across JavaScript UDF calls.
Use memory efficiently
The JavaScript processing environment has limited memory available per query. JavaScript UDF queries that accumulate too much local state may fail due to memory exhaustion.
Limits
- The amount of data that your JavaScript UDF outputs when processing a single row should be approxiately 5 MB or less.
- Each user is limited to running approximately 6 JavaScript UDF queries in a specific project at the same time. If you receive an error that you're over the concurrent query limit, wait a few minutes and try again.
- A JavaScript UDF can timeout and prevent your query from completing. Timeouts can be as short as 5 minutes, but can vary depending on several factors, including how much user CPU time your function consumes and how large your inputs and outputs to the JS function are.
- A query job can have a maximum of 50 JavaScript UDF resources (inline code blobs or external files).
- Each inline code blob is limited to a maximum size of 32 KB.
- Each external code resource is limited to a maximum size of 1 MB.
Limitations
- The DOM objects
Window,DocumentandNode, and functions that require them, are unsupported. - JavaScript functions that rely on native code are unsupported.
- Bitwise operations in JavaScript handle only the most significant 32 bits.