Cross-Region Replication for Firestore Exports

Galaxy Glossary

How do I enable cross-region replication for Firestore exports?

Cross-region replication for Firestore exports is the practice of automatically copying your Cloud Firestore export files from a primary Cloud Storage bucket in one region to a bucket in another region to achieve disaster recovery, data residency, or latency goals.


Description

Learn why and how to enable cross-region replication for your Cloud Firestore exports using dual-region buckets, Cloud Storage bucket replication, or manual copy pipelines, along with best practices, common pitfalls, and automation tips.

Why You Should Care About Replicating Firestore Exports

If your organization relies on Cloud Firestore as a critical transactional or analytical data store, you probably run gcloud firestore export jobs to back up your collections to Cloud Storage. Those export files are essential for point-in-time restores, analytics in BigQuery, and compliance audits. Keeping a single copy in the same region as the production database leaves you vulnerable to regional outages, accidental bucket deletion, or location-specific compliance failures. Cross-region replication mitigates these risks by maintaining an up-to-date replica of your export data in a geographically distinct location.

How Firestore Exports Work

When you run an export, Firestore serializes documents into sharded output files written under an export metadata prefix, then stores them in a Cloud Storage bucket of your choice. Firestore itself has no concept of replication for exports—once written, those objects are treated like any other Cloud Storage objects.

Understanding Cross-Region Replication

Cloud Storage offers two primary mechanisms to get your objects into multiple regions:

  • Dual-region or multi-region buckets. Built-in replication managed by Google. A dual-region bucket (for example, the predefined pair asia1, or a configurable pair such as us-central1+us-east1) automatically keeps two copies with strong consistency.
  • Bucket Replication Configuration (V1 or V2). Custom, flexible replication from one bucket (source) to another (destination). You can set replication filters, encryption overrides, and conflict handling policies.

Both approaches satisfy “cross-region” requirements. The choice depends on latency, cost, control, and encryption needs.

Approaches to Cross-Region Replication for Firestore Exports

1. Use a Dual-Region or Multi-Region Bucket

This is the simplest path. Instead of exporting to gs://my-exports-single-region, create a dual-region bucket (for example, us-central1 + us-east1) and point your export at it. Google handles replication transparently, and you see only one bucket in your project.

Pros: No extra configuration, strongly consistent, lowest operational overhead.
Cons: Region pairings are fixed; you can’t fine-tune policies like delete mirroring or encryption per destination.

2. Use Cloud Storage Bucket Replication

Bucket replication gives you fine-grained control. You set a source bucket (in, say, us-central1) and a destination bucket (in europe-west4) and enable replication. New objects—including Firestore export shards—are automatically copied. You can also replicate deletes or preserve them.

High-level steps:

  1. Create a destination bucket in the target region with the same or greater storage class.
  2. Grant the Cloud Storage service agent (service-PROJECT_NUMBER@gs-project-accounts.iam.gserviceaccount.com) storage.admin on that bucket.
  3. Configure replication via the Cloud Console, gcloud storage buckets update, Terraform, or JSON API.

After the initial configuration, any export placed in the source bucket is mirrored to the destination within seconds to minutes.
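As a quick sanity check after enabling replication, you can compare object listings between the two buckets. A minimal sketch of the comparison logic, assuming the name lists have already been fetched (for example with `list_blobs` from the `google-cloud-storage` client):

```python
def unreplicated(src_objects: set, dst_objects: set) -> set:
    """Return export shard names present in the source bucket but still
    missing from the destination, i.e. objects awaiting replication."""
    return src_objects - dst_objects

# Example: one shard has not arrived in the destination yet
src = {"2024-05-01/output-0", "2024-05-01/output-1"}
dst = {"2024-05-01/output-0"}
print(sorted(unreplicated(src, dst)))  # ['2024-05-01/output-1']
```

An empty result means every source object has a counterpart in the destination; a non-empty result within your lag window is normal, but persistent entries warrant investigation.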

3. Manual Copy Pipeline (Fallback)

If regulatory or tooling constraints block official replication, you can build a Cloud Scheduler + Cloud Functions job that triggers after each export and uses gsutil rsync or Storage Transfer Service to copy the objects. While functional, this approach is harder to maintain, slower to pick up newly written shards, and introduces custom code.

Step-by-Step Tutorial: Creating a Replicated Export Pipeline

The following example demonstrates bucket replication (Approach #2) using the gcloud CLI. Adjust region names to match your RTO/RPO requirements.

  1. Create buckets:

    # Source bucket in us-central1
    gcloud storage buckets create gs://my-firestore-backups-src \
      --location=us-central1 --uniform-bucket-level-access

    # Destination bucket in europe-west4
    gcloud storage buckets create gs://my-firestore-backups-dst \
      --location=europe-west4 --uniform-bucket-level-access

  2. Grant permission:

    # Replace PROJECT_NUMBER with your GCP project number
    gcloud storage buckets add-iam-policy-binding gs://my-firestore-backups-dst \
      --member=serviceAccount:service-$PROJECT_NUMBER@gs-project-accounts.iam.gserviceaccount.com \
      --role=roles/storage.admin

  3. Enable replication:

    # Replicate every object, keep deletes in sync
    gcloud storage buckets update gs://my-firestore-backups-src \
      --add-bucket-replication=destination=gs://my-firestore-backups-dst,deleteOption=DELETE_MARKER

  4. Run an export:

    gcloud firestore export gs://my-firestore-backups-src/$(date +%F)

  5. Verify: check object metadata in the destination bucket; x-goog-replication-status should read COMPLETE.
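If you orchestrate the export step from code rather than a shell, the dated prefix produced by `$(date +%F)` maps to a small helper like this (the bucket name follows the tutorial above; a sketch):

```python
from datetime import date, datetime, timezone
from typing import Optional

def export_prefix(bucket: str, day: Optional[date] = None) -> str:
    """Build a dated gs:// prefix matching the $(date +%F) convention."""
    day = day or datetime.now(timezone.utc).date()
    return f"gs://{bucket}/{day.isoformat()}"

print(export_prefix("my-firestore-backups-src", date(2024, 5, 1)))
# gs://my-firestore-backups-src/2024-05-01
```

Keeping each run under its own dated prefix makes point-in-time restores and lifecycle rules straightforward, since every export is an isolated object tree.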

Automating with Cloud Scheduler & Cloud Functions

If you need scheduled exports, combine the replication with a Cloud Scheduler job:

# Cloud Scheduler, daily at 02:00 UTC
gcloud scheduler jobs create pubsub daily-firestore-export \
--schedule="0 2 * * *" --topic=firestore-export-trigger \
--message-body="{\"bucket\":\"my-firestore-backups-src\"}"

Your Cloud Function subscribes, runs gcloud firestore export or the Admin SDK, and writes to the source bucket. Replication remains transparent.
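The function's trigger logic can stay tiny: decode the Pub/Sub message, pull out the bucket name, and hand the resulting URI to the export call. A hedged sketch of the decoding step (the export itself would go through the Firestore Admin SDK, omitted here):

```python
import base64
import json

def export_uri_from_event(event: dict) -> str:
    """Decode a Pub/Sub event whose data payload is {"bucket": "..."}
    and return the gs:// URI the export should write to."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    return f"gs://{payload['bucket']}"

# Simulate the message body published by the scheduler job above
event = {"data": base64.b64encode(b'{"bucket":"my-firestore-backups-src"}')}
print(export_uri_from_event(event))  # gs://my-firestore-backups-src
```

Because the bucket name travels in the message body, the same function can serve multiple databases or environments just by publishing different payloads.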

Monitoring and Validation

  • Cloud Monitoring Metrics: storage.googleapis.com/storage/replication/object_replication_lag reveals lag in seconds.
  • Audit Logs: Both buckets emit Write and Object Finalize events—filter where protoPayload.metadata.replicationStatus equals FAILED.
  • Lifecycle Rules: Apply separate retention policies in each region; replication does not override lifecycle deletions.
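An alerting policy on the lag metric reduces to a simple threshold comparison; a sketch, assuming the lag value has already been read from Cloud Monitoring:

```python
def lag_breaches_rpo(lag_seconds: float, rpo_minutes: float = 15.0) -> bool:
    """True if replication lag exceeds the recovery-point objective."""
    return lag_seconds > rpo_minutes * 60

print(lag_breaches_rpo(120))   # False: 2 minutes of lag within a 15-minute RPO
print(lag_breaches_rpo(1200))  # True: 20 minutes of lag breaches it
```

Pick the RPO to match what your business can afford to lose between the last replicated export and a regional failure.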

Best Practices

  • Place source and destination buckets in different seismic zones to maximize resiliency.
  • Create an org-policy that denies accidental deletion of destination buckets.
  • Use Bucket Lock if legal hold or immutability is required.
  • Encrypt destination data with a customer-managed key in its region to control key residency.
  • Test restores quarterly by importing a replica into a non-prod Firestore instance.

Common Mistakes and How to Fix Them

  • Forgetting to grant service-account permissions. Replication fails if the Storage service account lacks storage.admin on the destination bucket. Grant the role or create a custom role with storage.objects.create.
  • Assuming dual-region equals ANY two regions. Dual-region buckets support only specific pairs. If you need us-central1 to europe-west4, use bucket replication instead.
  • Overriding lifecycle deletes unexpectedly. If you enable deleteOption=DELETE_MARKER, the source’s lifecycle rules can propagate deletions. If you need longer retention in the DR region, set deleteOption=NONE.

Conclusion

Cross-region replication for Firestore exports is straightforward once you understand Cloud Storage’s replication models. Whether you choose dual-region buckets for simplicity or bucket replication for flexibility, the key is to integrate the configuration into your infrastructure-as-code, monitor replication health, and test restores regularly. Doing so turns your daily Firestore exports from nice-to-have backups into a robust disaster-recovery asset.

Why Cross-Region Replication for Firestore Exports is important

Without cross-region replication, a single regional outage or bucket misconfiguration can render your Firestore backups useless. Enabling replication ensures your export files survive disasters, satisfy data residency and compliance requirements, and remain quickly restorable no matter which Google Cloud region experiences issues.

Cross-Region Replication for Firestore Exports Example Usage


gcloud firestore export gs://my-firestore-backups-src/2024-05-01


Frequently Asked Questions (FAQs)

Is dual-region storage enough, or do I still need custom replication?

A dual-region bucket replicates objects into two pre-defined regions and offers strong consistency. For most disaster-recovery scenarios, it is sufficient. Choose custom bucket replication only when you need specific region pairs, different encryption keys per copy, or more granular delete mirroring controls.

Does replication impact Firestore export performance or cost?

The export job writes only to the source bucket, so export speed is unaffected. You are charged for egress traffic from the source region to the destination region and for duplicated storage, but replication itself does not add API operation costs.

How can I monitor replication lag?

Use Cloud Monitoring metrics storage.googleapis.com/storage/replication/object_replication_lag and set an alert if the lag exceeds your RPO threshold, e.g., 15 minutes.

Can I restore directly from the destination bucket?

Yes. Firestore’s gcloud firestore import command can reference any bucket your service account can access, including the replicated one. Always test the restore path as part of your DR drills.
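If your DR runbook scripts the restore, the command line can be assembled programmatically; a sketch (the bucket and dated-prefix layout follow the tutorial's naming, which is an assumption about your setup):

```python
def import_command(bucket: str, prefix: str) -> list:
    """Build the gcloud invocation that restores from a replicated export."""
    return ["gcloud", "firestore", "import", f"gs://{bucket}/{prefix}"]

print(" ".join(import_command("my-firestore-backups-dst", "2024-05-01")))
# gcloud firestore import gs://my-firestore-backups-dst/2024-05-01
```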
