BigQuery UDFs let you encapsulate reusable SQL or JavaScript logic and invoke it like a built-in function in any query.
A user-defined function (UDF) is custom logic written in SQL or JavaScript that you register in a dataset and then call from any query. UDFs help eliminate duplicated code, centralize business rules, and simplify complex calculations.
UDFs improve readability, maintainability, and testing.You define the logic once, reference it everywhere, and update it in a single place when requirements change—no more hunting through dozens of queries.
Use CREATE FUNCTION
with a return type and an expression that operates on input parameters. You can include OR REPLACE
to update an existing UDF without dropping it first.
Add LANGUAGE js
and provide a JS function body in triple quotes.JavaScript UDFs are useful for string manipulation, complex loops, or third-party libraries not easily expressed in SQL.
UDFs can reference other tables only if you embed them in subqueries inside the function body. Pure SQL scalar UDFs should avoid external table references for best performance and reusability.
You need bigquery.routines.create
to create UDFs and bigquery.routines.update
to replace them.Consumers require bigquery.routines.get
on the dataset that stores the UDF.
Version using semantic names (customer_ltv_v1
), add comments with inputs/outputs, place in a shared routines
dataset, enforce naming conventions, and monitor usage with INFORMATION_SCHEMA.ROUTINE_USAGE
.
Avoid UDFs for one-off analytics or when built-in functions already satisfy the requirement. UDFs add maintenance overhead, so favor them only when logic is reusable.
.
Yes. Use RETURNS TABLE<schema>
and write a query that yields the desired columns.
The function definition is cached, but result caching depends on the calling query. If the query qualifies for result caching, BigQuery stores results as usual.
UDF definition is free. Query cost depends on bytes scanned by the calling query; UDF logic itself adds negligible overhead.