Why Hubs?

In Farcaster v2, off-chain Farcaster protocol data will be stored on Farcaster Hubs. I’ve had a few questions recently about why we built hubs and who is going to run them.

Farcaster v1 off-chain shortcomings

During the first year of running Farcaster v1, we learned that several of our original assumptions about how the off-chain part of the protocol should be architected were suboptimal.

Nested JSON blobs weren’t scaling: they were simple to start with, but once a user had a lot of casts, the files got large and required pagination. These additional requirements eroded the simplicity that originally made the files appealing.
Validity: since all off-chain Farcaster data is cryptographically signed, developers were responsible for validating that a given cast or like was actually from the user (and had to go look that up on-chain). If a user wanted to roll their Ethereum address, it would require re-signing and re-validation.
Indexing: developers were responsible for watching on-chain changes to the identity contract and then very regularly indexing all of the JSON files. Said another way: in order to offer users a reasonably real-time feed, you were required to run a mini-version of Google.
Consistency: outside the scope of this essay, but as in most distributed systems, there were a lot of edge cases in which each developer had to figure out for themselves what was canonical.

In retrospect, more rigorous first-principles thinking might have steered us away from this architecture at the start. Some observant readers may also point out that validity, indexing and consistency are solved with blockchains! This is true, but blockchains come with another set of trade-offs, which we can revisit in a future essay. (cf. Varun’s Sufficient Decentralization in Social Networks for a directional gist.)

What do Farcaster Hubs do?

Farcaster Hubs are servers that do three things: validation, propagation and storage. This is similar to an Ethereum node, but for Farcaster data. Another way to think about it is as the “data layer” for Farcaster.

Validation: when a hub receives a cast, it verifies that the cryptographic signature matches the user’s signer key → custody Ethereum address → on-chain FID. (This requires hubs to interface with Ethereum nodes.) If data is incorrectly signed or inconsistent, hubs will ignore it. Because hubs ensure data is correctly signed and consistent, developers can assume all data from a hub’s APIs is valid and don’t have to spend the time and effort validating it themselves.
Propagation: all data stored on a hub is then rebroadcast trustlessly to peer hubs on the Farcaster network, similar to Ethereum nodes. This lets developers quickly start building on Farcaster without having to worry about indexing.
Storage: finally, hubs are where all the off-chain primitives (casts, likes, follows, profiles, signing keys, connected addresses) for all users are stored. Hubs, like an Ethereum node, provide a standard set of APIs that developers can use to query protocol data.
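
To make the three responsibilities concrete, here is a minimal toy sketch in Python. It is not the actual hub implementation or its API: the names Hub, Message, toy_sign, FID_TO_CUSTODY and CUSTODY_TO_SIGNERS are all hypothetical, the two dictionaries stand in for state a real hub would read from an Ethereum node, and the hash-based toy_sign is only a placeholder for a real public-key signature scheme.

```python
import hashlib
from dataclasses import dataclass

# ASSUMPTION: in-memory stand-ins for on-chain state that a real hub would
# look up through an Ethereum node. All values are made up for illustration.
FID_TO_CUSTODY = {1: "0xCustodyAddressA"}                       # FID -> custody address
CUSTODY_TO_SIGNERS = {"0xCustodyAddressA": {b"signer-key-a"}}   # custody address -> signer keys


def toy_sign(signer_key: bytes, payload: bytes) -> bytes:
    """Placeholder 'signature'. NOT real cryptography; a real system would
    use a public-key signature scheme instead of a shared-key hash."""
    return hashlib.sha256(signer_key + payload).digest()


@dataclass(frozen=True)
class Message:
    fid: int            # the user's on-chain identifier
    signer_key: bytes   # key the user claims signed this message
    payload: bytes      # e.g. the body of a cast
    signature: bytes


class Hub:
    def __init__(self) -> None:
        self.peers: list["Hub"] = []
        self.store: dict[bytes, Message] = {}

    def is_valid(self, msg: Message) -> bool:
        # Validation: walk the chain FID -> custody address -> signer key,
        # then check the signature over the payload.
        custody = FID_TO_CUSTODY.get(msg.fid)
        if custody is None:
            return False
        if msg.signer_key not in CUSTODY_TO_SIGNERS.get(custody, set()):
            return False
        return toy_sign(msg.signer_key, msg.payload) == msg.signature

    def receive(self, msg: Message) -> bool:
        if not self.is_valid(msg):
            return False  # incorrectly signed or inconsistent data is ignored
        key = hashlib.sha256(msg.payload + msg.signature).digest()
        if key in self.store:
            return True   # already stored; stop here so peers do not loop forever
        self.store[key] = msg          # Storage
        for peer in self.peers:        # Propagation: rebroadcast to peer hubs
            peer.receive(msg)
        return True
```

In this sketch, two hubs that list each other as peers both end up storing a correctly signed message after a single receive call on either one, while a message with a forged signature or an unregistered signer key is dropped at the first hub and never propagates. Each peer re-validates what it receives rather than trusting the sender, which is one way to read "trustlessly" above.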

Why build Farcaster Hubs vs. use another protocol / platform?

We think hubs offer developers a better Farcaster-specific developer experience. With hubs, developers get out-of-the-box data validation, consistency, indexing, storage and APIs in a single binary. It’s certainly possible that other general-purpose protocols and platforms could achieve similar outcomes, but only at the expense of developer speed and efficiency, especially at scale.

In line with our philosophy of product-led protocol development, we believe an opinionated, vertically-integrated developer experience is the best way to encourage developers to start building on Farcaster.