Understanding the Bitcoin Indexer pipeline: how blocks flow from Bitcoin Core RPC into PostgreSQL.
Bitcoin Indexer is built around a single principle: avoid row-by-row processing at every layer. Blocks are fetched concurrently, parsed in memory, and written to PostgreSQL in batch COPY operations. Nothing touches the database one row at a time.
The system has three main stages: Fetch, Parse, and Write.
┌────────────────────────────────────────────────────────────────────┐
│ Bitcoin Indexer │
│ │
│ ┌─────────────────┐ ┌──────────────────────────────────┐ │
│ │ Bitcoin Core │ │ Pipeline Workers │ │
│ │ │ │ │ │
│ │ getblockhash │──RPC──▶│ ┌─────────────┐ ┌───────────┐ │ │
│ │ getblock (v2) │ │ │ Worker 1 │ │ Worker 2 │ │ │
│ │ getblockchain │ │ │ fetch+parse│ │fetch+parse│ │ │
│ │ info │ │ └──────┬──────┘ └─────┬─────┘ │ │
│ └─────────────────┘ │ └───────┬───────┘ │ │
│ │ ▼ │ │
│ │ ┌─────────────────┐ │ │
│ │ │ Batch Writer │ │ │
│ │ │ (pgx COPY ops) │ │ │
│ └──────┴────────┬────────┴────────┘ │
│ │ │
│ ┌───────────────────────────────▼──────────────────┐ │
│ │ PostgreSQL 16 │ │
│ │ │ │
│ │ blocks transactions │ │
│ │ tx_inputs tx_outputs │ │
│ │ address_transactions │ │
│ │ utxo_set address_balances │ │
│ └──────────────────────┬───────────────────────────┘ │
│ │ │
│ ┌──────────────────────┤ │
│ │ │ │
│ ┌──────────▼──────────┐ ┌────────▼───────────────┐ │
│ │ HTTP API │ │ Apache AGE │ │
│ │ Address queries │ │ Graph traversal │ │
│ │ TX lookups │ │ Transaction tracing │ │
│ │ UTXO queries │ │ Cluster analysis │ │
│ └─────────────────────┘ └────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
graph TD
A[getblockchaininfo] --> B[Compute safe_tip]
B --> C[Build batch range start→end]
C --> D[Dispatch to Workers]
D --> E[Worker: getblockhash]
E --> F[Worker: getblock verbosity=2]
F --> G[Parse block into models]
G --> H[Collect batch results]
H --> I[PostgreSQL COPY batch write]
I --> J{More blocks?}
J -- Yes --> C
J -- No --> K[Wait for new blocks]
Each worker independently fetches a block using two sequential RPC calls:
getblockhash(height) → returns the block hash for a given heightgetblock(hash, verbosity=2) → returns full block with decoded transactionsWorkers run concurrently. The batch does not advance until all workers in the batch finish — the wall time equals the slowest worker.
Batch [141035, 141036]
Worker 1: getblockhash(141035) → getblock(...) → done in 45ms
Worker 2: getblockhash(141036) → getblock(...) → done in 671ms
Batch wall time = 671ms ← slowest worker
During Bitcoin Core IBD, getblockhash is the most frequent bottleneck. It can spike to 3–10 seconds while Bitcoin Core is busy with LevelDB compaction or block validation. getblock itself is almost always fast once the hash is known.
The parser converts raw Bitcoin Core JSON into Go structs and then into database rows:
Raw JSON block
│
├── Block header → blocks row
├── Transactions → transactions rows
│ ├── Inputs → tx_inputs rows
│ └── Outputs → tx_outputs rows
│ └── Addresses → address_transactions rows
│ → utxo_set rows
└── Balance deltas → address_balances rows
Parse time is typically microseconds to low milliseconds even for blocks with thousands of transactions. If you see parse=2s in logs, investigate Go CPU usage and JSON decode overhead.
The batch writer collects all parsed rows from every worker in the batch and writes them to PostgreSQL in a single COPY operation per table. This avoids:
Batch of 2 blocks → single COPY into:
blocks (2 rows)
transactions (100–5000 rows)
tx_inputs (100–15000 rows)
tx_outputs (100–15000 rows)
address_transactions (200–30000 rows)
utxo_set (delta rows)
address_balances (delta rows)
DB write time is typically 15–100ms for early blocks. For recent high-transaction blocks it can reach 500ms+, which is expected.
The indexer never indexes right up to the current chain tip. It maintains a confirmation window — a fixed number of blocks behind the node tip — to avoid indexing blocks that may be reorganized.
Node tip: 886,457
Safe tip: 886,447 ← indexer stops here (10-block window)
Indexer at: 141,054
The safe tip is recomputed on every poll of getblockchaininfo.
Main goroutine
│
├── Spawns N workers (N = workers config)
│ ├── Worker goroutine 1
│ ├── Worker goroutine 2
│ └── ...
│
├── Waits for all workers in batch (WaitGroup)
│
└── Calls batch writer (single goroutine, sequential)
Worker count should equal batch size during IBD. Extra workers sit idle and add RPC pressure to a node that is already busy.
Apache AGE runs inside PostgreSQL 16 and models Bitcoin transaction relationships as a property graph:
This enables traversal queries like “find all addresses connected to this transaction within 3 hops” using Cypher syntax directly inside PostgreSQL.
| Package | Responsibility |
|---|---|
cmd/indexer |
Entry point, starts pipeline loop |
cmd/api |
Starts HTTP API server |
cmd/backfill |
Repair and historical reprocessing |
internal/pipeline |
Fetch, parse, and ingest coordination |
internal/db |
Batch writer, partition management |
internal/api |
HTTP handlers and repository queries |
internal/config |
YAML + environment config loading |
internal/models |
Shared domain structs |
pkg/rpc |
Bitcoin Core JSON-RPC client |