GossipSub is a protocol that extends the existing PubSub implementation to make it more efficient. Since GossipSub is an extension of PubSub, it makes sense to understand PubSub before jumping into GossipSub.
PubSub
PubSub (publish/subscribe) is a form of communication among services through a common topic of interest.
Let's take the example of a WhatsApp group to explain PubSub.
PubSub consists of
1. Publisher (provider): Provides the data that subscribers expect and receive. E.g. a participant sending a message in a group
2. Topic: Groups related data. E.g. the group name
3. Subscriber: Receives data related to the topics it is subscribed to. E.g. all participants of the group
In the standard PubSub implementation, whenever a publisher publishes a message on a topic, it is flooded to all clients which have subscribed to that particular topic.
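The flooding behaviour described above can be sketched in a few lines of Python. This is only a toy, in-memory illustration: `FloodBroker`, its methods, and the group names are made up for this example, and a real peer-to-peer PubSub has no central object like this.

```python
from collections import defaultdict


class FloodBroker:
    """Toy PubSub: every subscriber of a topic receives every message."""

    def __init__(self):
        # topic -> list of subscriber inboxes (plain lists here)
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, inbox):
        self.subscribers[topic].append(inbox)

    def publish(self, topic, message):
        # Flood: deliver the message to every subscriber of the topic
        for inbox in self.subscribers[topic]:
            inbox.append(message)


broker = FloodBroker()
alice, bob, carol = [], [], []
broker.subscribe("family-group", alice)
broker.subscribe("family-group", bob)
broker.subscribe("work-group", carol)
broker.publish("family-group", "hello")
```

Here `alice` and `bob` both receive the message, while `carol`, who is subscribed to a different topic, receives nothing.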
From this point onward I'll use a couple of terms, so let me define them first
1. Peer - A client that wants to send and subscribe to messages on a topic
2. Peering - The process of connecting one peer to another. It is bidirectional and must be acknowledged by both peers to count as a successful peering
Design Goals in PubSub
1. Reliability: Successful delivery of every message
2. Speed: Quick delivery of every message
3. Efficiency: Remove redundancy of messages & reduce network bandwidth
4. Resilience: Subscription/Unsubscription to a topic should not affect the whole system
5. Scale: Handling multiple subscribers to the topics
6. KISS: Keep it simple, stupid. A basic design principle
Now that the basics are clear, we can move on to GossipSub.
The first step is discovering peers before subscribing to a topic. Peer discovery is not part of PubSub; it relies on the application to do that.
There are several methods for that
1. DHT (Distributed hash tables)
2. Local network broadcasts
3. Exchanging peer lists with existing peers
4. Centralized trackers or rendezvous points
5. Lists of bootstrap peers
Explaining each of them is out of the scope of this article, but rest assured they are standard techniques that form part of the ambient peer discovery process.
Types of Peering
In GossipSub there are 2 types of peering
1. Full Message Peering (FMP)
In this peering, the full message content is sent to peers throughout the network. Full Message Peering creates a sparse network: not every pair of peers is connected with Full Message Peering.
Purpose: Reduce network bandwidth, since the full message content is not forwarded to all peers
2. MetaData Only Peering (MOP)
In this peering, only the metadata of the message is sent to peers throughout the network. Any connection that is left out of Full Message Peering becomes a MetaData Only Peering.
Purpose: Gossip about which messages are available so that a peer can find out what it has not yet received
GossipSub combines both peering strategies into a new protocol that is efficient and scalable. Each peer now has to maintain a set of Full Message connections (generally few, and configurable) and a set of MetaData Only connections.
The number of Full Message connections is called the Peering Degree. For simplicity's sake, let's say 5 is the ideal value for this.
Grafting And Pruning
Grafting: If the Peering Degree drops below 5, the peer converts random MOP connections to FMP
Pruning: If the Peering Degree rises above 5, the peer converts random FMP connections to MOP
In libp2p's implementation, a heartbeat runs every 1 second, and grafting and pruning are part of that process.
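The grafting and pruning rules above can be sketched as a single heartbeat step. This is a minimal Python illustration, not libp2p's code: `heartbeat`, `TARGET_DEGREE`, and the set-based representation are assumptions made for this example. The single target of 5 follows the article's simplification; the libp2p spec describes lower and upper bounds around the target degree instead of one exact value.

```python
import random

TARGET_DEGREE = 5  # the article's example Peering Degree (assumed exact target)


def heartbeat(full_peers, metadata_peers, rng=random):
    """Rebalance one topic's connections. Both sets are mutated in place."""
    # Graft: promote random MOP connections while below the target
    while len(full_peers) < TARGET_DEGREE and metadata_peers:
        peer = rng.choice(sorted(metadata_peers))
        metadata_peers.remove(peer)
        full_peers.add(peer)
    # Prune: demote random FMP connections while above the target
    while len(full_peers) > TARGET_DEGREE:
        peer = rng.choice(sorted(full_peers))
        full_peers.remove(peer)
        metadata_peers.add(peer)


# Too few full-message peers: some metadata-only peers get grafted
low_full, low_meta = {"a", "b"}, {"c", "d", "e", "f"}
heartbeat(low_full, low_meta)

# Too many full-message peers: the excess gets pruned
high_full, high_meta = set("abcdefg"), set()
heartbeat(high_full, high_meta)
```

After each call the full-message set holds exactly 5 peers, as long as enough metadata-only peers were available to graft.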
Subscriptions
Each peer maintains a list of the topics to which its directly connected peers are subscribed.
Every time a peer subscribes to or unsubscribes from a topic, it sends a message about that to all its connected peers (even to peers that are not subscribed to that topic themselves).
Subscriptions & their relation to Grafting & Pruning
On Subscription: Along with sending the message about the subscription, the peer grafts some of its connections for that topic to FMP
On Un-subscription: Along with informing its peers about it, the peer prunes its connections for that topic to MOP
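As a rough sketch of this bookkeeping, the Python below lets each peer announce subscription changes to every connected peer, and each receiver update its per-neighbor topic list. `SubscribingPeer` and its method names are hypothetical, and direct method calls stand in for network messages.

```python
from collections import defaultdict


class SubscribingPeer:
    def __init__(self, name):
        self.name = name
        self.neighbors = []                  # directly connected peers
        self.my_topics = set()               # topics this peer subscribes to
        self.peer_topics = defaultdict(set)  # neighbor name -> its topics

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

    def subscribe(self, topic):
        self.my_topics.add(topic)
        # Announce to every neighbor, whether or not it cares about the topic
        for neighbor in self.neighbors:
            neighbor.on_subscription(self.name, topic, subscribed=True)

    def unsubscribe(self, topic):
        self.my_topics.discard(topic)
        for neighbor in self.neighbors:
            neighbor.on_subscription(self.name, topic, subscribed=False)

    def on_subscription(self, sender, topic, subscribed):
        if subscribed:
            self.peer_topics[sender].add(topic)
        else:
            self.peer_topics[sender].discard(topic)


a, b, c = SubscribingPeer("a"), SubscribingPeer("b"), SubscribingPeer("c")
a.connect(b)
a.connect(c)
a.subscribe("news")
after_subscribe = (set(b.peer_topics["a"]), set(c.peer_topics["a"]))
a.unsubscribe("news")
after_unsubscribe = (set(b.peer_topics["a"]), set(c.peer_topics["a"]))
```

Both `b` and `c` learn about the subscription of `a` and later forget it, even though neither of them subscribes to "news".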
Send Message & Gossip About Message
Send Message
When a peer sends a message:
1. It stores the message and forwards it to all its FMP connections
2. Those peers then store the message and forward it to their own FMP connections
Every peer maintains a list of the messages it has already seen and does not retransmit a message that is on that list.
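The store-and-forward behaviour with duplicate suppression might look like the sketch below. `MeshPeer`, `link`, `receive`, and `publish` are names invented for this illustration, and recursion stands in for network delivery.

```python
class MeshPeer:
    def __init__(self, name):
        self.name = name
        self.full_peers = []  # FMP connections
        self.seen = set()     # IDs of messages already handled
        self.store = {}       # message ID -> payload
        self.handled = 0      # how many times a message was stored/forwarded

    def link(self, other):
        self.full_peers.append(other)
        other.full_peers.append(self)

    def receive(self, msg_id, payload, sender=None):
        if msg_id in self.seen:
            return  # already seen: do not store or forward again
        self.seen.add(msg_id)
        self.store[msg_id] = payload
        self.handled += 1
        for peer in self.full_peers:
            if peer is not sender:
                peer.receive(msg_id, payload, sender=self)

    def publish(self, msg_id, payload):
        self.receive(msg_id, payload)


# Three peers connected in a triangle, so duplicates are guaranteed
a, b, c = MeshPeer("a"), MeshPeer("b"), MeshPeer("c")
a.link(b)
b.link(c)
c.link(a)
a.publish("m1", "hi")
```

Even though the triangle delivers the message to some peers more than once, the seen list makes each peer store and forward it only once, which is also what stops the forwarding from looping forever.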
Gossip
Each peer gossips about the messages it has received to n randomly selected MOP connections (let's say n = 6).
This allows the receiving peers to cross-check whether they already have each message. If they haven't received it, they request it from the peer that gossiped about it.
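The gossip exchange can be sketched as an "I have these message IDs" announcement followed by a request for whatever is missing. The Python below is a simplified illustration: `GossipPeer`, `on_gossip`, and `on_request` are hypothetical names, and 6 is just the article's example value.

```python
import random

GOSSIP_TARGETS = 6  # the article's example value for n


class GossipPeer:
    def __init__(self, name):
        self.name = name
        self.metadata_peers = []  # MOP connections
        self.messages = {}        # message ID -> payload

    def gossip(self, rng=random):
        # Pick up to n random MOP connections and advertise message IDs only
        count = min(GOSSIP_TARGETS, len(self.metadata_peers))
        for peer in rng.sample(self.metadata_peers, count):
            peer.on_gossip(self, list(self.messages))

    def on_gossip(self, sender, advertised_ids):
        # Cross-check the advertised IDs and request only the missing ones
        missing = [m for m in advertised_ids if m not in self.messages]
        if missing:
            self.messages.update(sender.on_request(missing))

    def on_request(self, wanted_ids):
        # Reply with the full content of the requested messages
        return {m: self.messages[m] for m in wanted_ids if m in self.messages}


a, b = GossipPeer("a"), GossipPeer("b")
a.metadata_peers.append(b)
a.messages["m1"] = "hi"
b.messages["m0"] = "old"
a.gossip()
```

After the gossip round, `b` has fetched "m1" from `a` while keeping the message it already had. Only the IDs travel in the announcement; the full content is sent on request.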
Fan-Out
Peers can send messages to topics they are not subscribed to.
To do so, a peer randomly picks n peers subscribed to that topic (let's say 6 again) and fans out its messages to them. This type of connection is unidirectional: the 6 selected peers don't know that they have been selected.
The sending peer caches the list of selected peers for a certain amount of time.
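A fan-out cache along these lines could be sketched as follows. `FanoutCache`, `peers_for`, and the 60-second lifetime are assumptions for illustration; the article only says the list is kept "for a certain amount of time". The clock is injectable so that the expiry can be exercised without waiting.

```python
import random
import time

FANOUT_SIZE = 6     # the article's example value for n
FANOUT_TTL = 60.0   # seconds; assumed value for this sketch


class FanoutCache:
    def __init__(self, now=time.monotonic):
        self.now = now
        self.entries = {}  # topic -> (selected peers, time of last send)

    def peers_for(self, topic, topic_subscribers, rng=random):
        """Return the fan-out peers to use for one send to `topic`."""
        entry = self.entries.get(topic)
        if entry is None or self.now() - entry[1] > FANOUT_TTL:
            # No cached selection, or it expired: pick a fresh random one
            count = min(FANOUT_SIZE, len(topic_subscribers))
            peers = rng.sample(sorted(topic_subscribers), count)
        else:
            peers = entry[0]
        # Remember the selection and the time of this send
        self.entries[topic] = (peers, self.now())
        return peers


clock = [0.0]
cache = FanoutCache(now=lambda: clock[0])
subscribers = {f"p{i}" for i in range(10)}

first = cache.peers_for("news", subscribers)
clock[0] = 10.0
second = cache.peers_for("news", subscribers)   # within the TTL: reused
clock[0] = 100.0
third = cache.peers_for("news", subscribers)    # expired: re-selected
```

The second send reuses the cached selection, while the third one, made after the assumed lifetime has passed, draws a new random selection.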
State
Each peer maintains its own state, which consists of
1. Subscriptions: List of topics subscribed to.
2. Fan-out topics: These are the topics messaged recently but not subscribed to. For each topic, the time of the last sent message to that topic is remembered.
3. List of peers currently connected: For each peer connected, the state includes all the topics they are subscribed to and whether the peering for each topic is full-content, metadata-only or fan-out.
4. Recently seen messages: Cache of recently seen messages. It is used to ignore retransmitted messages. For each message the state includes who sent it and the sequence number, which is enough to uniquely identify any message. For very recent messages, the full message contents are kept so that they can be sent to any peers that request the message.
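The four parts of the state listed above map naturally onto a small data structure. The sketch below is one possible Python layout; every class and field name (`PeerState`, `SeenMessage`, `PeeringKind`, and so on) is chosen for this example and does not come from libp2p.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class PeeringKind(Enum):
    FULL_MESSAGE = "full-content"
    METADATA_ONLY = "metadata-only"
    FAN_OUT = "fan-out"


@dataclass
class SeenMessage:
    sender: str
    seqno: int
    payload: Optional[bytes] = None  # kept only for very recent messages


@dataclass
class PeerState:
    subscriptions: set = field(default_factory=set)        # 1. subscribed topics
    fanout_last_sent: dict = field(default_factory=dict)   # 2. topic -> last send time
    connected: dict = field(default_factory=dict)          # 3. peer -> {topic: PeeringKind}
    seen: dict = field(default_factory=dict)               # 4. (sender, seqno) -> SeenMessage

    def remember(self, sender, seqno, payload=None):
        """Record a message; return False if it was already seen."""
        key = (sender, seqno)  # sender + sequence number identify a message
        if key in self.seen:
            return False
        self.seen[key] = SeenMessage(sender, seqno, payload)
        return True


state = PeerState()
state.subscriptions.add("news")
state.connected["peer-b"] = {"news": PeeringKind.FULL_MESSAGE}
first_time = state.remember("peer-b", 1, b"hello")
second_time = state.remember("peer-b", 1, b"hello")
```

The second `remember` call returns `False`, which is how a peer would recognise and ignore a retransmitted message.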
That's it. This is GossipSub.
I'll post about the security context of GossipSub in the next article. Stay tuned!
Reference: https://github.com/libp2p/specs/tree/master/pubsub/gossipsub