Data masking in BigQuery hides or obfuscates sensitive column values at query time so unauthorized users see anonymized or null results.
Masking prevents accidental exposure of PII while letting analysts work with useful datasets. It also supports compliance with GDPR, HIPAA, and SOC 2 without duplicating tables.
You can ➊ apply dynamic data masking policies using policy tags, ➋ build authorized views that transform columns, or ➌ use masking functions (e.g., SHA256, REGEXP_REPLACE) directly in queries.
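Create a policy tag, attach it to a column, and define a data policy on that tag with a masking rule such as always-null, SHA-256 hash, or a partial mask (for example, last four characters). Users who hold the Masked Reader role on the data policy see the masked value automatically. Users who hold the Data Catalog Fine-Grained Reader role on the policy tag (which carries the datacatalog.categories.fineGrainedGet permission) see clear text. Users with neither role get an access-denied error when they query the column.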
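1. Enable the Data Catalog API and the BigQuery Data Policy API.
2. Create a taxonomy and a policy tag inside it (console or Data Catalog API).
3. Attach the policy tag to the column by updating the table schema (console, bq update with a schema file, or the tables.patch API).
4. Create a data policy on the tag with the masking rule, and grant Masked Reader to principals who should see masked values.
5. Grant Data Catalog Fine-Grained Reader to principals who should see clear text.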
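One way to attach a policy tag to a column is to update the table schema from a JSON file in which the field carries a policyTags entry. The sketch below only builds that file locally. The policy tag resource name, column names, and the tag_column helper are hypothetical placeholders, not values from a real project.

```python
import json

# Hypothetical resource name; a real one comes from the taxonomy you created.
POLICY_TAG = (
    "projects/my-project/locations/us/taxonomies/1234567890/policyTags/9876543210"
)


def tag_column(schema, column, policy_tag):
    """Return a copy of a BigQuery JSON schema with policy_tag attached to column."""
    tagged = []
    found = False
    for field in schema:
        field = dict(field)  # shallow copy so the input schema is not mutated
        if field["name"] == column:
            # Field layout used by the BigQuery table schema: policyTags.names
            field["policyTags"] = {"names": [policy_tag]}
            found = True
        tagged.append(field)
    if not found:
        raise KeyError(f"column {column!r} not in schema")
    return tagged


# Assumed example schema for a users table.
schema = [
    {"name": "user_id", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "email", "type": "STRING", "mode": "NULLABLE"},
]

tagged_schema = tag_column(schema, "email", POLICY_TAG)
print(json.dumps(tagged_schema, indent=2))
```

Saving the printed JSON as schema.json and running bq update my-project:my_dataset.users schema.json (names assumed) would apply the tag. Keeping the schema file in version control also leaves a record of which columns are tagged.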
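Use REGEXP_REPLACE to reveal the first letter and domain while hiding the rest. This keeps the value recognizable as an email and useful for domain-level grouping while protecting identities. Different addresses can share the same masked form, so for exact deduplication a hash is a better fit.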
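The pattern described above can be checked locally before it goes into a query. The sketch below keeps one possible BigQuery expression as a string and mirrors it with Python's re module. The column name email, the *** filler, and the mask_email helper are assumptions made for illustration.

```python
import re

# BigQuery expression this mirrors (column name `email` is an assumption).
MASK_EMAIL_SQL = r"REGEXP_REPLACE(email, r'^(.)[^@]*(@.*)$', r'\1***\2')"

# Group 1: first character. Group 2: the "@" and everything after it.
_EMAIL_PATTERN = re.compile(r"^(.)[^@]*(@.*)$")


def mask_email(email):
    """Keep the first character and the domain; replace the rest with ***."""
    return _EMAIL_PATTERN.sub(r"\1***\2", email)


print(mask_email("alice@example.com"))  # a***@example.com
```

A value that does not match the pattern (no @ sign, for example) is returned unchanged, so malformed entries would pass through unmasked. If that matters, wrap the expression in a check that returns NULL for non-matching values.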
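SHA256 or FARM_FINGERPRINT turns IDs into one-way hashes. Downstream teams can still join on the hash but cannot read the original ID directly. FARM_FINGERPRINT is a non-cryptographic 64-bit fingerprint, so prefer SHA256 for sensitive identifiers. Hashes of low-entropy IDs can be recovered by brute force unless a secret salt is mixed in.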
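To illustrate why joins still work, the sketch below hashes IDs with SHA-256 and joins two small in-memory datasets on the hash. The hash_id helper mirrors the BigQuery expression held in HASH_ID_SQL. The column name user_id and the sample rows are made up for the example.

```python
import hashlib

# BigQuery expression this mirrors (column name `user_id` is an assumption).
HASH_ID_SQL = "TO_HEX(SHA256(CAST(user_id AS STRING)))"


def hash_id(user_id):
    """Lowercase hex SHA-256 of the ID's string form, encoded as UTF-8."""
    return hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()


# Two datasets that only ever see the hashed ID.
orders = [
    {"user": hash_id(42), "total": 30},
    {"user": hash_id(7), "total": 12},
]
profiles = {hash_id(42): "gold", hash_id(7): "basic"}

# Equal IDs produce equal hashes, so the join key still lines up.
joined = [(o["total"], profiles[o["user"]]) for o in orders]
print(joined)  # [(30, 'gold'), (12, 'basic')]
```

Small integer IDs like these are easy to enumerate, so anyone who knows the scheme can rebuild the mapping. Prepending a secret salt before hashing is one common mitigation, provided every table that must join uses the same salt.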
Authorized views fit when you need complex masking logic or cross-table joins but want to expose a single secure interface. Grant users access to the view, not the base tables.
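A masking view is ordinary SQL, so it can be generated and kept in version control. The sketch below assembles a CREATE OR REPLACE VIEW statement from a list of pass-through columns and a mapping of masked expressions. The build_masked_view_sql helper and every project, dataset, table, and column name are hypothetical. Authorizing the view on the source dataset is a separate step in the dataset's sharing settings and is not shown.

```python
def build_masked_view_sql(view, base_table, plain_columns, masked_columns):
    """Build DDL for a view exposing plain columns plus masked expressions.

    masked_columns maps an output alias to the SQL expression that produces it.
    """
    select_items = list(plain_columns)
    select_items += [f"{expr} AS {alias}" for alias, expr in masked_columns.items()]
    select_list = ",\n  ".join(select_items)
    return (
        f"CREATE OR REPLACE VIEW `{view}` AS\n"
        f"SELECT\n  {select_list}\n"
        f"FROM `{base_table}`"
    )


# Assumed names, for illustration only.
view_sql = build_masked_view_sql(
    view="my-project.secure_views.users_masked",
    base_table="my-project.raw.users",
    plain_columns=["country", "signup_date"],
    masked_columns={
        "user_hash": "TO_HEX(SHA256(CAST(user_id AS STRING)))",
        "email_masked": r"REGEXP_REPLACE(email, r'^(.)[^@]*(@.*)$', r'\1***\2')",
    },
)
print(view_sql)
```

Because the raw user_id and email columns never appear in the select list, users who can query only the view have no path to the clear-text values.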
Document policy tags in your data catalog, version-control masking SQL, and test with least-privilege accounts. Combine masking with row-level security for maximum protection.
Yes. Apply a row-level security policy to filter records and a policy tag to mask columns. Both rules execute before results are returned.
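Policy-tag masking is applied by BigQuery at query time and typically has little noticeable impact. SQL functions inside views add some per-row compute, which is usually small relative to the cost of scanning the data; measure on your own workload if latency matters.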
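Only if they hold the Data Catalog Fine-Grained Reader role on the specific policy tag or its taxonomy. BigQuery Admin alone does not grant clear-text access, but principals who can change the taxonomy’s IAM policy can grant themselves the role. Restrict both to trusted principals.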