Auditing BigQuery access logs means systematically collecting, querying, and interpreting Cloud Logging entries that record every read, write, and metadata operation on BigQuery resources so you can answer who did what, when, from where, and with which data.
Google Cloud’s BigQuery automatically writes Admin Activity and Data Access logs to Cloud Logging. Learning to audit these logs is critical for security, compliance, cost control, and troubleshooting. This article walks you through how BigQuery access logging works, how to design effective audit queries, and how to operationalize the results.
BigQuery writes two relevant Data Access log types: DATA_READ and DATA_WRITE. Entries are stored in projects/<PROJECT_ID>/logs/cloudaudit.googleapis.com%2Fdata_access (and similar) within Cloud Logging, where they can be viewed in the Logs Explorer or routed to other destinations with sinks.
Every log entry is a JSON payload. Key fields for auditing include:
protoPayload.authenticationInfo.principalEmail – the user or service account that made the call.
protoPayload.methodName – e.g., jobservice.jobcompleted.
protoPayload.serviceData.jobCompletedEvent.job.jobStatistics.totalProcessedBytes – bytes scanned by the job.
resource.labels.project_id, dataset_id, table_id – the resource that was touched.
severity, timestamp, status.code – outcome and timing metadata.
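To get a feel for these fields, a quick inspection query against an exported audit-log table can help. The sketch below assumes logs have already been routed to the audit_logs dataset described in the next section; the table name is illustrative:
SELECT
  timestamp,
  severity,
  protoPayload.authenticationInfo.principalEmail AS principal,
  protoPayload.methodName AS method,
  resource.labels.project_id AS project_id
FROM `my-project.audit_logs.cloudaudit_googleapis_com_data_access`
ORDER BY timestamp DESC
LIMIT 20;  -- peek at the 20 most recent Data Access entries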
In the Google Cloud console, navigate to IAM & Admin > Audit Logs, select BigQuery, and ensure Data Read and Data Write are turned on. To route the resulting entries into a BigQuery dataset for analysis, create a sink with the gcloud CLI:
gcloud logging sinks create bq_audit_sink \
  bigquery.googleapis.com/projects/my-project/datasets/logs_dataset \
  --log-filter='resource.type="bigquery_resource"'
Most teams export logs to a dedicated dataset (e.g., audit_logs) and partition on _PARTITIONTIME for cost-efficient querying.
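Because these exported tables are ingestion-time partitioned, include a _PARTITIONTIME filter so BigQuery prunes partitions instead of scanning the full table. A minimal sketch (table name and date range are illustrative):
SELECT COUNT(*) AS entries
FROM `my-project.audit_logs.cloudaudit_googleapis_com_data_access`
WHERE _PARTITIONTIME >= TIMESTAMP('2023-01-01')  -- scan only the partitions you need
  AND _PARTITIONTIME < TIMESTAMP('2023-02-01');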
Create views like vw_user_query_activity to normalize fields and make stakeholder-friendly dashboards.
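One possible shape for such a view, assuming the exported data_access table and the legacy AuditData field paths used elsewhere in this article (adjust the paths if your logs use the newer BigQueryAuditMetadata format):
CREATE OR REPLACE VIEW `my-project.audit_logs.vw_user_query_activity` AS
SELECT
  timestamp,
  protoPayload.authenticationInfo.principalEmail AS principal,
  protoPayload.methodName AS method,
  protoPayload.serviceData.jobCompletedEvent.job.jobStatistics.totalProcessedBytes AS bytes_processed
FROM `my-project.audit_logs.cloudaudit_googleapis_com_data_access`
WHERE protoPayload.methodName = 'jobservice.jobcompleted';  -- completed query jobs only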
Set up log-based alerts for high-risk events, such as updates to sensitive tables or excessive cost (more than 1 TB processed), as sketched below.
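As a starting point for such an alert, the sketch below surfaces the last 24 hours of activity against a sensitive dataset; the dataset name sensitive_data is hypothetical, and you can tighten the filter with specific methodName values before feeding the results into a scheduled query or your alerting tool:
SELECT
  timestamp,
  protoPayload.authenticationInfo.principalEmail AS principal,
  protoPayload.methodName AS method,
  protoPayload.resourceName AS resource
FROM `my-project.audit_logs.cloudaudit_googleapis_com_data_access`
WHERE _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND protoPayload.resourceName LIKE '%/datasets/sensitive_data/%'  -- hypothetical sensitive dataset
ORDER BY timestamp DESC;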
-- Which principals read a specific table in January 2023, and how much data did each job process?
SELECT
  TIMESTAMP_TRUNC(timestamp, DAY) AS event_day,
  protoPayload.authenticationInfo.principalEmail AS user,
  -- legacy AuditData field; newer entries carry the same figure in protoPayload.metadataJson
  protoPayload.serviceData.jobCompletedEvent.job.jobStatistics.totalProcessedBytes / 1e9 AS gb_processed
FROM `my-project.audit_logs.cloudaudit_googleapis_com_data_access`
WHERE
  resource.type = 'bigquery_resource'
  AND protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.tableDefinitions IS NULL
  AND protoPayload.resourceName LIKE '%projects/my-project/datasets/my_dataset/tables/my_table%'
  AND _PARTITIONTIME BETWEEN TIMESTAMP('2023-01-01') AND TIMESTAMP('2023-01-31');
-- Find expensive queries: jobs that scanned more than 1 TB
SELECT
  timestamp,
  protoPayload.authenticationInfo.principalEmail AS user,
  job_stats.totalProcessedBytes / 1e12 AS tb_scanned
FROM `my-project.audit_logs.cloudaudit_googleapis_com_data_access`,
  UNNEST([protoPayload.serviceData.jobCompletedEvent.job.jobStatistics]) AS job_stats
-- SELECT aliases cannot be referenced in WHERE, so repeat the expression
WHERE job_stats.totalProcessedBytes / 1e12 > 1  -- more than 1 TB processed
ORDER BY tb_scanned DESC;
-- Who deleted tables? Admin Activity logs record metadata operations such as deletes
SELECT
  timestamp,
  protoPayload.authenticationInfo.principalEmail AS actor,
  protoPayload.resourceName AS table_path
FROM `my-project.audit_logs.cloudaudit_googleapis_com_activity`
WHERE protoPayload.methodName = 'tableservice.delete';
The key fields for attributing these events are principalEmail and methodName.
While Galaxy is primarily a SQL editor, its AI-assisted autocomplete and parameterization make writing the complex audit queries above dramatically faster. Store your standard audit queries in a Galaxy Collection so security and data teams can endorse and reuse them without copy-pasting into Slack.
Metadata-only logging is on by default, but query text and bytes processed are missing until you enable DATA_READ/WRITE. Fix: turn them on at the desired resource hierarchy level.
Cloud Logging only guarantees 30 days retention for no-cost logs. Create a sink to BigQuery or GCS, or you’ll lose history.
Parsing protoPayload on the fly is slow and expensive. Create schema-on-read views or flatten fields into materialized tables.
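One way to do the flattening, assuming the same exported table and legacy field paths used above (table and column names are illustrative):
CREATE TABLE `my-project.audit_logs.flat_query_events`
PARTITION BY DATE(event_time)
CLUSTER BY principal AS
SELECT
  timestamp AS event_time,
  protoPayload.authenticationInfo.principalEmail AS principal,
  protoPayload.methodName AS method,
  protoPayload.serviceData.jobCompletedEvent.job.jobStatistics.totalProcessedBytes AS bytes_processed
FROM `my-project.audit_logs.cloudaudit_googleapis_com_data_access`;
Refresh it on a schedule (for example with a scheduled query) so downstream dashboards never touch the raw protoPayload.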
BigQuery can store terabytes of sensitive data. Without robust auditing you can’t prove compliance, attribute costs, or detect malicious behavior. Capturing and analyzing access logs provides an immutable trail showing what data was read or modified, by whom, and at what cost—critical for SOC 2, HIPAA, GDPR, and internal governance policies.
Cloud Logging stores log entries for 30 days at no charge. After that, entries are pruned unless you route them to BigQuery, Cloud Storage, or another destination.
Yes: when Data Access logging is enabled, completed query jobs include the full SQL text in protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.query. Remember to control access to log data itself because it may contain sensitive literals.
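A minimal sketch that pulls recent query text for a single user, using the exported table from earlier sections (the principal email is illustrative):
SELECT
  timestamp,
  protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.query AS sql_text
FROM `my-project.audit_logs.cloudaudit_googleapis_com_data_access`
WHERE protoPayload.authenticationInfo.principalEmail = 'analyst@example.com'  -- illustrative principal
  AND protoPayload.methodName = 'jobservice.jobcompleted'
ORDER BY timestamp DESC
LIMIT 50;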
Partition on ingestion time and cluster by high-cardinality fields such as principalEmail. You can also materialize daily or hourly summary tables and grant analysts access only to those derived tables.
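For example, a daily summary table along these lines (names are illustrative) keeps analyst queries cheap while limiting exposure to the raw entries:
CREATE TABLE `my-project.audit_logs.daily_user_activity`
PARTITION BY activity_date
CLUSTER BY principal AS
SELECT
  DATE(timestamp) AS activity_date,
  protoPayload.authenticationInfo.principalEmail AS principal,
  COUNT(*) AS events,
  SUM(protoPayload.serviceData.jobCompletedEvent.job.jobStatistics.totalProcessedBytes) AS bytes_processed
FROM `my-project.audit_logs.cloudaudit_googleapis_com_data_access`
GROUP BY activity_date, principal;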
Galaxy’s context-aware SQL editor speeds up writing and sharing complex audit queries. You can save reusable log-analysis snippets in Galaxy Collections and let your team endorse them for future use. However, Galaxy itself does not collect the logs; it simply provides a superior query interface.