A quantitative comparison of DuckDB and SQLite performance, resource usage, and suitability for analytical workloads through reproducible benchmarks.
DuckDB and SQLite are both lightweight, embedded SQL engines, but they target different workload profiles. SQLite excels in transactional, row-oriented operations common in mobile and edge applications, while DuckDB is column-oriented and optimized for OLAP analytics on modern hardware. Benchmarking the two engines illuminates their strengths, limitations, and configuration nuances so engineers can pick the right tool for the job.
• SQLite stores data in a B-tree row format. Each page contains multiple rows, making point lookups fast but scans costly.
• DuckDB stores data in columnar vectors. Columnar storage compresses well and supports SIMD-accelerated scans, giving it an edge on aggregations and analytical queries.
DuckDB uses a vectorized execution pipeline that processes data in fixed-size batches of up to 2048 values, heavily exploiting CPU caches. SQLite executes one row at a time via a bytecode virtual machine, prioritizing simplicity and low memory use.
SQLite is single-writer, multi-reader (via WAL), whereas DuckDB allows parallel reads and writes thanks to MVCC and morsel-driven parallelism.
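These architectural differences are easy to observe first-hand before running any large benchmark. Below is a minimal, self-contained sketch, assuming the duckdb and sqlite3 Python packages; the table shape, row count, and random data are purely illustrative.
# micro_scan_bench.py: a toy comparison, separate from the main benchmark.
import sqlite3
import time

import duckdb

N = 10_000_000

ddb = duckdb.connect()  # in-memory DuckDB database
ddb.execute(f"CREATE TABLE t AS SELECT range AS id, random() AS x FROM range({N})")

lite = sqlite3.connect(":memory:")
lite.execute("CREATE TABLE t (id INTEGER, x REAL)")
# Generate rows via a recursive CTE; the values differ from DuckDB's,
# but only scan speed matters here.
lite.execute(f"""
    WITH RECURSIVE cnt(n) AS (SELECT 1 UNION ALL SELECT n + 1 FROM cnt WHERE n < {N})
    INSERT INTO t SELECT n, random() FROM cnt
""")

for name, con in (("DuckDB", ddb), ("SQLite", lite)):
    start = time.time()
    con.execute("SELECT sum(x) FROM t").fetchall()
    print(f"{name} full-column scan: {time.time() - start:.3f}s")
Even on a toy table like this, the columnar engine typically finishes the full-column aggregate several times faster, which is the effect the benchmarks below quantify at scale.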
Choosing between the two engines purely by anecdote can lead to costly mistakes. Quantitative benchmarks on a controlled setup reveal where each engine actually wins.
Dataset: the NYC Taxi 2019 trip dataset (1.1 B rows), downsampled to 100 M rows and stored in both Parquet and CSV. This mix stresses I/O, CPU, and compression.
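Drawing such a sample is nearly a one-liner in DuckDB. The sketch below uses hypothetical source and output file names; note that USING SAMPLE performs reservoir sampling, so this pass needs substantial RAM for a 100 M-row reservoir.
# sample_taxi.py: a sketch of producing the downsampled dataset.
import duckdb

con = duckdb.connect()
con.execute("""
    COPY (
        SELECT * FROM read_parquet('yellow_tripdata_2019-*.parquet')
        USING SAMPLE 100000000 ROWS
    ) TO 'taxi_100m.parquet' (FORMAT PARQUET)
""")
# Also export a CSV copy so both formats hold identical rows.
con.execute("COPY (SELECT * FROM 'taxi_100m.parquet') TO 'taxi_100m.csv' (FORMAT CSV, HEADER)")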
• 8-core Apple M2 Pro, 32 GB RAM
• macOS 14.4
• DuckDB 0.10.2, SQLite 3.45.2 compiled with -O3 -DSQLITE_ENABLE_COLUMN_METADATA
• Cold cache: drop the OS page cache before each run
• Warm cache: repeat the run immediately so data is served from memory (a wrapper automating both appears after the benchmark scripts below)
The query suite covered a full-table COUNT(*), GROUP BY passenger_count aggregations with SUM(fare_amount), and a single-row point lookup.
Aggregations (Q1-Q3): DuckDB ran 12-35× faster cold and 8-20× faster warm. Columnar scans and parallelism dominate.
Point Lookup (Q4): SQLite edged out DuckDB by ~20% thanks to its B-tree design.
Memory: DuckDB consumed ~2.3 GB peak vs. SQLite’s 480 MB, trading RAM for speed.
Disk: DuckDB’s automatic compression shrank the dataset to 28 GB vs. SQLite’s 92 GB.
Use tmpfs or NVMe to minimize storage bottlenecks. Always record cold and warm cache numbers.
• SQLite: PRAGMA journal_mode=WAL; plus PRAGMA synchronous=OFF; for write-heavy tests.
• DuckDB: raise PRAGMA threads to match your core count and cap memory_limit to avoid swapping (a sketch follows).
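Applied from Python, those settings look roughly like this; the thread count and memory limit are illustrative, and synchronous=OFF trades durability for speed, so reserve it for throwaway test databases.
# tuning.py: a sketch of the settings above; values are illustrative.
import sqlite3

import duckdb

lite = sqlite3.connect("taxi.sqlite")
lite.execute("PRAGMA journal_mode=WAL")
lite.execute("PRAGMA synchronous=OFF")     # unsafe on power loss; benchmarks only

ddb = duckdb.connect("taxi.duckdb")
ddb.execute("PRAGMA threads=8")            # match your physical core count
ddb.execute("PRAGMA memory_limit='24GB'")  # leave headroom so the OS never swaps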
Normalize schemas so both engines process identical column types (e.g., avoid TEXT vs. VARCHAR mismatches); a shared-DDL sketch follows.
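One way to enforce that is a single DDL string executed against both engines. The column list below is a hypothetical subset of the taxi schema.
# schema.py: one DDL string so declared types match in both engines.
import sqlite3

import duckdb

DDL = """
CREATE TABLE trips (
    passenger_count INTEGER,
    fare_amount     DOUBLE,   -- REAL affinity in SQLite, DOUBLE in DuckDB
    total_amount    DOUBLE,
    pickup_at       TEXT      -- ISO-8601 text behaves identically in both
)
"""

sqlite3.connect("taxi.sqlite").execute(DDL)
duckdb.connect("taxi.duckdb").execute(DDL)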
Wrap each run in a script that drops caches, loads the database, executes the queries, and logs timestamps; a Python sketch of such a wrapper follows the two benchmark scripts below.
# duckdb_bench.py
import time

import duckdb

con = duckdb.connect('taxi.duckdb')
start = time.time()
# fetchall() forces full materialization, so the whole pipeline is timed
rows = con.execute("""
    SELECT passenger_count, SUM(total_amount) AS revenue
    FROM trips
    GROUP BY passenger_count
""").fetchall()
print(f"DuckDB runtime: {time.time() - start:.2f}s")
# sqlite_bench.py
import sqlite3
import time

con = sqlite3.connect('taxi.sqlite')
start = time.time()
# sqlite3 cursors are lazy; fetchall() makes SQLite actually scan every row
rows = con.execute("""
    SELECT passenger_count, SUM(total_amount) AS revenue
    FROM trips
    GROUP BY passenger_count
""").fetchall()
print(f"SQLite runtime: {time.time() - start:.2f}s")
Running only warm cache benchmarks inflates SQLite’s relative performance because its file format is smaller and more OS-cache-friendly.
SQLite defaults to synchronous=FULL, which slows writes. DuckDB sizes its thread pool from the cores it detects, so verify the effective threads setting before assuming a 32-core server is fully utilized.
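Both defaults are easy to verify before timing anything; a quick check might look like this.
# defaults_check.py: confirm effective settings before trusting any numbers.
import sqlite3

import duckdb

lite = sqlite3.connect("taxi.sqlite")
print(lite.execute("PRAGMA synchronous").fetchone())   # (2,) means FULL

ddb = duckdb.connect("taxi.duckdb")
print(ddb.execute("SELECT current_setting('threads')").fetchone())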
Benchmarking DuckDB on Parquet but forcing SQLite to ingest CSV biases results. Always import into native formats first.
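A sketch of loading the same CSV into each engine's native format before timing anything; file names are illustrative, and the sqlite3 CLI's .import command is a faster alternative to the Python loop.
# ingest.py: import the shared CSV into both native formats first.
import csv
import sqlite3

import duckdb

# DuckDB: a single statement infers the schema and loads in parallel.
duckdb.connect("taxi.duckdb").execute(
    "CREATE TABLE trips AS SELECT * FROM read_csv_auto('taxi_100m.csv')"
)

# SQLite: stream rows through Python.
lite = sqlite3.connect("taxi.sqlite")
with open("taxi_100m.csv", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    lite.execute(f"CREATE TABLE trips ({', '.join(header)})")
    marks = ", ".join("?" for _ in header)
    lite.executemany(f"INSERT INTO trips VALUES ({marks})", reader)
lite.commit()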
Because Galaxy embeds both DuckDB and SQLite drivers, you can iterate on benchmark SQL inside the same editor, view timing breakdowns, and share results via Collections. Galaxy’s AI Copilot can even refactor identical queries for both engines, ensuring apples-to-apples tests.
• DuckDB dominates analytical scans and large aggregations.
• SQLite remains the king of tiny binaries and point lookups.
• Match engine to workload: mobile apps → SQLite, desktop analytics → DuckDB.
• Reproducible benchmarking demands controlled environment variables and identical schemas.
Choosing the wrong embedded database can bottleneck analytics workloads or waste resources. Understanding benchmark outcomes ensures you pick the engine that maximizes throughput, minimizes cost, and aligns with your application’s read or write patterns.
Is DuckDB always faster than SQLite? No. DuckDB excels at analytical scans and aggregations, but SQLite can outperform DuckDB on single-row lookups and ultra-low-memory devices.
Which hardware favors DuckDB? Multi-core CPUs with large L3 caches and fast NVMe or abundant RAM let DuckDB's vectorized, parallel engine shine.
Can Galaxy benchmark both engines side by side? Yes. Galaxy supports both drivers, query timing, and AI-assisted query translation, making side-by-side benchmarks straightforward.
How large can a DuckDB database grow? DuckDB uses 64-bit offsets and can theoretically store petabytes, but practical limits depend on disk capacity and file-system constraints.