Detailed Solana architecture

This lecture zooms out to the protocol layer underneath. How does a transaction actually get from your wallet into a block, and how do thousands of validators agree on the result, in 400 milliseconds? The pieces have names you've heard in passing: Proof of History, Tower BFT, Turbine, Gulf Stream. None of them are magic. Each is an engineering answer to a specific bottleneck that other chains hit and didn't solve. Putting them together shows why Solana looks the way it does and what trade-offs the design accepted along the way.

Why the standard playbook doesn't work at Solana's target throughput

Most chains follow a similar recipe. Users broadcast transactions to a public mempool. A miner or proposer picks transactions, builds a block, and gossips it to the network. Validators receive the block, replay the transactions to check the work, vote on the result, and accumulate enough votes for finality.

Each of those steps assumes a generous time budget. Bitcoin gives itself 10 minutes per block. Ethereum gives 12 seconds. At those speeds, gossiping the block to every node takes a few seconds at most, voting can be sequential, and the mempool can sit around for as long as the network is congested.

Solana aimed for 400 milliseconds per slot. At that speed, every step has to be rethought:

  • Gossiping a block of thousands of transactions through a peer-to-peer network in 400ms requires a different propagation algorithm than naive flooding.
  • Validators can't wait around for a mempool to drain. The transaction needs to be at the leader's door the moment its slot opens.
  • Voting cannot block production. By the time votes finish for slot N, the leader for slot N+1 has to already be producing.
  • The very notion of "what time is it" has to be agreed on cheaply, because every other coordination step depends on it.

Solana's architecture is what falls out of taking each of these constraints seriously. The components are PoH, Tower BFT, Gulf Stream, and Turbine, plus the parallel execution engine that processes transactions inside a slot. Everything else is built on top.

Proof of History: the clock the network can verify

The first problem is time. In a distributed system, "what time is it" is a non-trivial question. Validator A might say 12:00:00.100 and validator B might say 12:00:00.150, with no way to tell which one is right. Any coordination that depends on agreed-on time, like "this transaction came before that one," has to gossip timestamps around and reconcile them.

Solana's answer is to not use clock time at all. Instead, the network agrees on a synthetic clock based on computational work that anyone can verify.

The mechanic is straightforward. Take a SHA-256 hash. Hash it. Hash the result. Hash that. Keep going. Each hash output becomes the next hash's input, forming a long chain.

Figure: PoH, a chain of hashes that proves time passed. Each hash is a SHA-256 of the previous hash. The chain cannot be parallelized.

  hash_0 (starting seed) → hash_1 = SHA(hash_0) → hash_2 + tx_A → hash_3 → hash_4 + tx_B → hash_5

  tx_A came in at hash_2. tx_B came in later, at hash_4.

Three properties that make this a verifiable clock:

  1. Sequential. Computing hash_N requires hash_(N-1). You cannot parallelize the chain. N hashes takes at least N × (one hash time) of real wall clock.
  2. Verifiable in parallel. Anyone can check the chain by hashing in parallel: split the chain into ranges and verify each independently. Verification is fast even though production is not.
  3. Tamper-evident timestamps. A tx mixed in at hash_N proves it existed by hash_N's slot. It cannot be backdated.

The three properties in the diagram are what turn a chain of hashes into a clock. Each hash takes some minimum amount of real time to compute, so the chain itself proves that time passed. If someone hands you a PoH chain with 800,000 hashes and the fastest available hardware can do 2 million sequential SHA-256 hashes per second, you know at least 0.4 seconds of wall clock elapsed during production, regardless of when you receive the chain or how fast you verify it.
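The first two properties and the 0.4-second arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not Solana's implementation: the function names, the checkpoint interval, and the 2 million hashes per second figure are assumptions for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def produce_chain(seed: bytes, num_hashes: int, checkpoint_every: int) -> list[bytes]:
    """Hash sequentially; record the state every `checkpoint_every` hashes.

    Assumes num_hashes is a multiple of checkpoint_every (hypothetical
    simplification so every segment has the same length).
    """
    checkpoints = [seed]
    state = seed
    for i in range(1, num_hashes + 1):
        state = sha256(state)  # each step needs the previous output
        if i % checkpoint_every == 0:
            checkpoints.append(state)
    return checkpoints


def verify_segment(start: bytes, end: bytes, steps: int) -> bool:
    """Re-hash one segment; it depends only on its own start checkpoint."""
    state = start
    for _ in range(steps):
        state = sha256(state)
    return state == end


def verify_chain(checkpoints: list[bytes], checkpoint_every: int) -> bool:
    """Check every segment independently.

    Threads are used only to show that segments do not depend on each
    other; a real verifier would spread them over cores or a GPU.
    """
    pairs = list(zip(checkpoints, checkpoints[1:]))
    with ThreadPoolExecutor() as pool:
        results = pool.map(
            lambda p: verify_segment(p[0], p[1], checkpoint_every), pairs
        )
        return all(results)


def min_elapsed_seconds(num_hashes: int, max_hashes_per_second: int) -> float:
    """Lower bound on production time, given an assumed hardware ceiling."""
    return num_hashes / max_hashes_per_second


if __name__ == "__main__":
    cps = produce_chain(b"seed", 1000, 100)
    print(verify_chain(cps, 100))
    print(min_elapsed_seconds(800_000, 2_000_000))
```

Production has to walk the loop one hash at a time, while verification only needs the checkpoints, so the checker can split the work however it likes.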

Transactions get woven into the chain by being included in a hash input. When the leader is producing a slot and a transaction arrives, the leader hashes it into the next PoH step. From that point on, any node verifying the chain sees that transaction appearing at a specific position in the sequence. The transaction has a place in time that's cryptographically pinned, without anyone needing to agree on what wall-clock time it actually was.
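Mixing a transaction into the chain can be sketched as follows. This is a simplified model under stated assumptions: the `Entry` and `PohRecorder` names are hypothetical, and hashing the transaction bytes before appending them to the current state is one plausible mix-in rule chosen for illustration, not necessarily the exact one the validator uses.

```python
import hashlib
from dataclasses import dataclass


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


@dataclass
class Entry:
    num_hashes: int  # plain hashes performed since the previous entry
    tx: bytes        # transaction data mixed in at this point
    hash: bytes      # PoH state right after the mix-in


class PohRecorder:
    """Hypothetical leader-side recorder for a PoH stream."""

    def __init__(self, seed: bytes):
        self.state = seed
        self.pending = 0
        self.entries: list[Entry] = []

    def tick(self, n: int = 1) -> None:
        """Advance the clock by n plain hashes (no transaction)."""
        for _ in range(n):
            self.state = sha256(self.state)
            self.pending += 1

    def record(self, tx: bytes) -> None:
        """Mix a transaction into the next hash input."""
        self.state = sha256(self.state + sha256(tx))
        self.entries.append(Entry(self.pending, tx, self.state))
        self.pending = 0


def verify_entries(seed: bytes, entries: list[Entry]) -> bool:
    """Replay the stream and confirm each tx sits where the entry claims."""
    state = seed
    for e in entries:
        for _ in range(e.num_hashes):
            state = sha256(state)
        state = sha256(state + sha256(e.tx))
        if state != e.hash:
            return False
    return True
```

Because every later hash depends on the mixed-in bytes, swapping two transactions or altering one changes all subsequent hashes, and replaying the stream detects it.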

The standard misunderstanding to clear up: PoH is not consensus. PoH gives an ordering within a single chain. It does not say which chain is the canonical one. If two leaders were running PoH on different inputs at the same wall-clock time, both chains would be perfectly verifiable as PoH outputs, and PoH alone could not tell you which one the network should accept. Consensus is the layer that picks the winning chain. That's the next piece.

Tower BFT: consensus that piggybacks on PoH

Tower BFT is Solana's consensus mechanism, in the family of practical Byzantine fault tolerance protocols. The general PBFT pattern is: validators receive a proposed block, run it, and vote on whether to accept. Once enough votes accumulate, the block is final.

The challenge classical PBFT runs into is that votes themselves need to be timestamped. To decide whether a vote was on time, validators have to agree on a clock, which they can't, which leads to elaborate timeout and view-change protocols. Tower BFT skirts the entire problem by using PoH as the timestamp.

Every vote a validator casts references the PoH hash at the moment of the vote. The PoH stream is the same stream the leader produced, so it's globally agreed on. There's no ambiguity about whether the vote came before or after some other event. The vote either happened at PoH position N or it didn't.

The second piece is the lockout mechanic. When a validator votes for a block, they commit to not voting against that block for a period that doubles each time they re-vote for it. The first vote locks them out for 2 slots, the second for 4, the third for 8, and so on. The longer the chain of confirming votes, the longer the lockout. By the time a block has many vote-rounds behind it, the validators who voted for it would have to wait exponentially long to vote against it, making it effectively final.
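The doubling rule can be modelled with a small vote stack. The sketch below is a simplified model, not the validator's code: it tracks slot numbers only, ignores forks and root advancement, and the `Vote` and `Tower` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    slot: int
    confirmation_count: int = 1

    @property
    def lockout(self) -> int:
        # 2 slots after the first vote, then 4, 8, 16, ...
        return 2 ** self.confirmation_count

    @property
    def last_locked_out_slot(self) -> int:
        return self.slot + self.lockout


class Tower:
    """Simplified vote stack with exponentially doubling lockouts."""

    def __init__(self) -> None:
        self.votes: list[Vote] = []

    def vote(self, slot: int) -> None:
        if self.votes and slot <= self.votes[-1].slot:
            raise ValueError("votes must be for increasing slots")
        # Drop votes whose lockout has already expired by this slot.
        while self.votes and self.votes[-1].last_locked_out_slot < slot:
            self.votes.pop()
        self.votes.append(Vote(slot))
        # Each older vote that now has enough newer votes stacked on
        # top of it gets one more confirmation, doubling its lockout.
        depth = len(self.votes)
        for i, v in enumerate(self.votes):
            if depth > i + v.confirmation_count:
                v.confirmation_count += 1


if __name__ == "__main__":
    tower = Tower()
    for s in (1, 2, 3, 4):
        tower.vote(s)
    print([(v.slot, v.lockout) for v in tower.votes])
```

After four consecutive votes, the oldest vote in this model carries a 16-slot lockout and the newest a 2-slot one, which is the "deeper means harder to reverse" property the text describes.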

Finality on Solana is not a single moment. It's the accumulation of locked-in votes to the point where reversing the block would require a coordinated mass slashing. Practical finality is roughly 12.8 seconds on mainnet today, which is 32 slots of 400 ms each (not epochs, which last far longer), though the network often considers blocks "confirmed enough" much sooner for most purposes.

Gulf Stream: no mempool

Most chains have a mempool, a pool of pending transactions waiting to be included. Users broadcast to the mempool, and miners or proposers fish out the transactions they want.

Solana doesn't have one. There's no shared queue of pending transactions. Instead, RPC nodes forward each transaction directly to the leaders who are about to produce blocks. This is Gulf Stream.

The leader schedule on Solana is public and known in advance. At any moment, every node knows who the leader will be for the next 100 or so slots. When an RPC node receives a transaction, it doesn't put it in a pool. It looks up the upcoming leaders for the next few slots and sends the transaction directly to them. By the time those leaders' slots open, the transaction is already in their queues.
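The forwarding step can be sketched like this. Everything here is illustrative: the schedule is a plain dictionary from slot number to a leader name, the in-memory queues stand in for network sends, and the lookahead of 8 slots and the four-slots-per-leader layout in the usage example are assumptions.

```python
from collections import defaultdict


def upcoming_leaders(schedule: dict[int, str], current_slot: int, lookahead: int) -> list[str]:
    """Distinct leaders for the next `lookahead` slots, in schedule order."""
    leaders: list[str] = []
    for slot in range(current_slot, current_slot + lookahead):
        leader = schedule.get(slot)
        if leader is not None and leader not in leaders:
            leaders.append(leader)
    return leaders


def forward_transaction(
    tx: bytes,
    schedule: dict[int, str],
    current_slot: int,
    leader_queues: dict[str, list[bytes]],
    lookahead: int = 8,
) -> list[str]:
    """Send the tx straight to each upcoming leader's queue (no shared pool)."""
    targets = upcoming_leaders(schedule, current_slot, lookahead)
    for leader in targets:
        leader_queues[leader].append(tx)
    return targets


if __name__ == "__main__":
    # Hypothetical schedule: leaders A, B, C each hold four consecutive slots.
    schedule = {slot: "ABC"[slot // 4] for slot in range(12)}
    queues: dict[str, list[bytes]] = defaultdict(list)
    print(forward_transaction(b"signed-tx", schedule, 2, queues))
```

The point of the design is visible in the data flow: the transaction lands in specific leaders' queues before their slots open, and no other node ever holds it in a public pool.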

This has three consequences.

First, the leader doesn't waste time during their slot fetching transactions. They have a queue. They process. Tiny startup latency, important when slots are 400ms long.

Second, "front-running the mempool" is impossible because there is no mempool. You can't watch a public pool for transactions to sandwich because the transactions aren't gossiped publicly. They're sent point-to-point to specific leaders. This eliminates one major class of MEV that Ethereum has to contend with.

Third, it shifts MEV pressure elsewhere. The leader sees all transactions arriving at their queue and can decide ordering inside the slot. If a leader wants to extract value, they can reorder transactions within their own slot, or buy private order flow from RPC providers. The MEV problem doesn't disappear; it moves from "everyone watches the mempool" to "leaders have local ordering power."

Turbine: getting blocks to thousands of validators fast

Once a leader has produced a block, the block has to reach every validator on the network. The naive approach is gossip: each node, on receiving a block, forwards it to all its peers. This works fine for small networks, but at scale, the total bandwidth used by gossip grows quadratically with the number of validators. A network with 2,000 validators would see roughly 4 million block transmissions per slot, almost all of them redundant.

Turbine solves this with a tree structure inspired by BitTorrent. Each block is shredded into many small fixed-size pieces, typically about a thousand bytes each. The leader sends each shred only to a small fanout of neighbors instead of to everyone. Those neighbors each forward their shreds to the next layer of validators, who forward to the next, and so on. Every validator receives the full block in roughly log(N) hops rather than the leader sending it to all N validators directly.
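A back-of-the-envelope comparison of the two propagation strategies, as a sketch. The model is deliberately crude: the flooding count assumes every node forwards to every other node, the tree assumes each node forwards to exactly `fanout` children, and the fanout of 200 in the usage example is an assumed value for illustration.

```python
def flood_transmissions(num_validators: int) -> int:
    """Naive flooding: every node sends the block to every other node."""
    return num_validators * (num_validators - 1)


def tree_hops(num_validators: int, fanout: int) -> int:
    """Hops until a fanout tree rooted at the leader covers all validators.

    Layer 1 holds `fanout` nodes, layer 2 holds fanout**2, and so on.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    covered = 0
    layer_size = 1
    hops = 0
    while covered < num_validators:
        layer_size *= fanout
        covered += layer_size
        hops += 1
    return hops


if __name__ == "__main__":
    n = 2000
    print(flood_transmissions(n))    # quadratic in n
    print(tree_hops(n, fanout=200))  # grows like log(n)
    print(tree_hops(n, fanout=2))
```

In the tree model each validator receives a given shred once, so total transmissions grow linearly with the validator count while the hop count grows logarithmically.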

The shreds are also erasure-coded. Instead of sending exactly the bytes of the block, Turbine sends a slightly inflated version where any sufficient subset of shreds can reconstruct the full block. If some shreds are dropped or delayed, validators can still recover the block from what they have. This makes the propagation resilient to packet loss without needing retransmissions to ask "did you get shred 482?"
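To make the idea of erasure coding concrete, here is a sketch using the simplest possible code: a single XOR parity shred that can recover any one missing data shred. This is a swapped-in technique for illustration only. Production systems, Solana included, use Reed-Solomon codes that tolerate many missing shreds. The function names and the 1,000-byte shred size are assumptions.

```python
from typing import Optional


def shred(block: bytes, size: int) -> list[bytes]:
    """Split a block into fixed-size pieces, zero-padding the last one."""
    padded_len = -(-len(block) // size) * size  # round up to a multiple of size
    padded = block.ljust(padded_len, b"\x00")
    return [padded[i:i + size] for i in range(0, padded_len, size)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def parity_shred(shreds: list[bytes]) -> bytes:
    """XOR of all data shreds; the one extra shred that is transmitted."""
    parity = bytes(len(shreds[0]))
    for s in shreds:
        parity = xor_bytes(parity, s)
    return parity


def recover(received: list[Optional[bytes]], parity: bytes) -> list[bytes]:
    """Rebuild the data shreds when at most one of them is missing (None)."""
    missing = [i for i, s in enumerate(received) if s is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can repair only one lost shred")
    result = list(received)
    if missing:
        rebuilt = parity
        for s in received:
            if s is not None:
                rebuilt = xor_bytes(rebuilt, s)
        result[missing[0]] = rebuilt
    return result


if __name__ == "__main__":
    block = bytes(range(256)) * 10
    data = shred(block, 1000)
    parity = parity_shred(data)
    damaged = [data[0], None, data[2]]  # the middle shred was dropped
    restored = b"".join(recover(damaged, parity))[:len(block)]
    print(restored == block)
```

The receiver never asks for the lost shred again; it reconstructs it from what did arrive, which is the property that keeps propagation inside the slot budget.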

Turbine is the answer to "how do you propagate a block to 2,000 nodes in 400ms?" It's also part of why Solana's hardware requirements are higher than those of other chains. Every validator has to receive, forward, and reconstruct thousands of shreds per second, which requires real network bandwidth.

Parallel execution

The last piece is worth naming because it's why throughput inside a slot is high in the first place. Solana transactions execute in parallel based on the accounts they declare in the transaction's account list. The runtime can run two transactions simultaneously on different cores if neither one writes to an account the other reads or writes.

This is the design choice that drove the entire programming model you spent six modules learning. Every transaction declares its accounts up front. The runtime sorts transactions into non-conflicting groups and runs each group on a different CPU core. Two SOL transfers between unrelated wallets run at the same time. A token swap and an NFT mint that share no accounts run at the same time. A program upgrade and any unrelated transaction run at the same time.
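The grouping step can be sketched as follows. This is an illustrative scheduler, not the runtime's actual algorithm: the `Tx` type, the account names, and the batch-placement rule are assumptions. Two transactions are treated as conflicting when one writes an account that the other reads or writes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    name: str
    writable: frozenset
    readonly: frozenset = frozenset()


def conflicts(a: Tx, b: Tx) -> bool:
    """True if either tx writes an account the other reads or writes."""
    return bool(a.writable & (b.writable | b.readonly)) or bool(b.writable & a.readonly)


def schedule_batches(txs: list[Tx]) -> list[list[Tx]]:
    """Place each tx in the first batch after its last conflicting batch.

    Transactions in one batch touch disjoint state and could run on
    separate cores; conflicting transactions keep their original order.
    """
    batches: list[list[Tx]] = []
    for tx in txs:
        target = 0
        for i, batch in enumerate(batches):
            if any(conflicts(tx, other) for other in batch):
                target = i + 1
        if target == len(batches):
            batches.append([])
        batches[target].append(tx)
    return batches


if __name__ == "__main__":
    txs = [
        Tx("t1", frozenset({"alice", "bob"})),
        Tx("t2", frozenset({"carol", "dave"})),
        Tx("t3", frozenset({"alice", "erin"})),           # shares alice with t1
        Tx("t4", frozenset({"nft"}), frozenset({"config"})),
        Tx("t5", frozenset({"vault"}), frozenset({"config"})),
    ]
    for batch in schedule_batches(txs):
        print([t.name for t in batch])
```

Note that `t4` and `t5` both read `config` yet land in the same batch: shared read-only access is not a conflict, which is why marking accounts writable only when needed matters for throughput.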

The trade-off, as you've internalized by now, is that the access pattern has to be known statically. Walking a linked list whose shape you don't know in advance, or branching to a different program based on data you haven't read yet, are awkward patterns on Solana. In exchange, you get the parallel-execution thesis: throughput scales with cores rather than being capped at single-threaded execution.

The transaction lifecycle, end to end

Now, putting it together. Here's what actually happens when a user signs a transaction.

Figure: The life of a Solana transaction

  1. User signs and submits via RPC. The user's wallet sends the signed tx to an RPC node, which forwards it to the network. No mempool: the tx is not waiting in any shared pool.
  2. Gulf Stream forwards directly to upcoming leaders. RPC nodes know the leader schedule in advance. They send the tx to the next few leaders, so by the time a leader's slot starts, the tx is already in their queue.
  3. Leader produces the slot using PoH. The current leader runs PoH, executes txs in parallel based on the accounts they touch, and weaves them into the PoH stream with timestamps. Output: a sequence of "entries" that, combined, form a block. Slot duration: 400 ms.
  4. Turbine fans the block out to all validators. The block is shredded into many small pieces, erasure-coded, and propagated in a tree structure: leader → small fanout → each receiver forwards to the next layer. Thousands of validators receive the block in roughly log(N) hops, rather than the leader sending it to all N directly.
  5. Validators verify and vote via Tower BFT. Each validator replays the PoH stream, verifies tx signatures and execution, and votes. Vote lockouts double each time a validator votes for a block, making it economically expensive to vote against a deep block. Finality emerges from accumulated lockouts.
  6. Block is finalized once supermajority votes accumulate.

Each step exists because of one of the bottlenecks named at the start. No mempool, because mempool gossip eats time and creates MEV exposure. PoH, because every other step needs an agreed-on clock. Parallel execution, because slots are short and you can't process thousands of transactions sequentially in 400ms. Turbine, because gossip doesn't scale to thousands of validators. Tower BFT, because consensus needs to be fast enough to keep up with block production.

What this means for you as a programmer

You spent six modules learning to write code that runs inside step 3. Every Anchor constraint you wrote, every PDA you derived, every CPI you composed, runs as part of "leader executes transactions in parallel based on their account lists." That part of the architecture is the only part your program directly interacts with.

But the other parts shape the world your program lives in. Slots are 400 ms because of the propagation budget Turbine provides. Compute units are tightly capped because the leader must finish executing within one slot. Versioned transactions and Address Lookup Tables exist because of the 1,232-byte transaction size limit, which comes from fitting each transaction into a single network packet (the 1,280-byte IPv6 minimum MTU minus headers), the same packet budget that shapes Turbine's shreds. Priority fees matter because Gulf Stream delivers transactions straight to the leader, and the leader, not the RPC node, decides ordering within its slot. That scheduling power is what makes priority-fee tipping meaningful.
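The 1,232-byte figure is usually explained as the IPv6 minimum MTU minus header overhead. A minimal sketch of that arithmetic, with the header sizes as the stated assumptions (a 40-byte IPv6 header plus an 8-byte fragment header) and constant names invented for the example:

```python
IPV6_MIN_MTU = 1280        # smallest MTU every IPv6 link must support
IPV6_HEADER = 40           # fixed IPv6 header size in bytes
FRAGMENT_HEADER = 8        # assumed reserve for the IPv6 fragment header

# Largest payload that is safe to send as a single unfragmented packet.
MAX_TRANSACTION_BYTES = IPV6_MIN_MTU - IPV6_HEADER - FRAGMENT_HEADER


def fits_in_one_packet(serialized_tx: bytes) -> bool:
    """True if a serialized transaction stays within the packet budget."""
    return len(serialized_tx) <= MAX_TRANSACTION_BYTES


if __name__ == "__main__":
    print(MAX_TRANSACTION_BYTES)
    print(fits_in_one_packet(bytes(1232)), fits_in_one_packet(bytes(1233)))
```

Each account address costs 32 bytes of that budget, which is why replacing addresses with short indexes into an Address Lookup Table frees so much room.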

You don't have to remember every component. You do need the high-level picture: a Solana transaction is signed off-chain, forwarded directly to known upcoming leaders, executed in parallel under a verifiable clock, propagated via a tree to all validators, voted on with exponentially-doubling lockouts, and finalized over the next several seconds. Everything else, including all the constraints you internalized as a programmer, falls out of that pipeline.