Peer to Peer

DSE Version: 6.0

Video

As opposed to a reational set up, Apache Cassandra uses a peer to peer system. Let's explore how this system is designed to keep you online when bad things happen.

Transcript: 

Let’s spend a little time talking about peer-to-peer.  What we are looking at here is a standard set up in a relational world.  This client/server model has been around for years and years.

 

Let’s talk about how it works with replication strategy in a relational set up.  You have a leader node and that node is responsible for everything, such as inserts, updates, reads.  But then you have read replicates behind the leader node. These replica nodes are set up with a little bit of a lag, but they are not set up the same was as the leader node because they are just copies.

 

And what happens with that set up?  We shard it. Sharding? Really? What kind of problems happen with Sharding?  Let’s take a closer look. We have our data spread out over multiple shards. Each shard could store a certain segment of your customers, etc.  

What happens then?   Some routing has to happen, which means you probably have to write some code to figure out where to put data or where to find data when you need it.  That means you don’t have the benefit of joins, aggregations or group bys because of the way your data is spread.

 

So what if something happens to that leader?  If he suddenly goes down we are now in a situation where we have to wait for the dreaded failover.  Time is money people! We can’t afford failovers because it will take seconds--sometimes many seconds for a new leader to take over.  This is time that users are sitting there staring at their app...waiting. Just waiting.

 

Another situation can occur when you have nodes that can’t see each other anymore.  This is bad because leaders can’t see followers and we are in a bad state. This leads to chaos. If those nodes can’t see each other anymore, this completely messes up stuff in a highly consistent system.  The followers have to be consistent with the leader and if they can’t see the leader (or the leader can’t see them) they can’t be consistent. When this happens new leaders have to be elected, but now you have two leaders!  Oh no! The sky is falling!

 

How does this work in a peer-to-peer system, which is what you probably wanted to start with?  In Apache Cassandra, it works like this. We have replicas of data in our cluster and those replicas are all equals in this peer to peer set up.  No one is a leader or a follower. They are all the same.

 

So there is a coordinator, which you may have heard about in another module.  That coordinator takes the data from the client and writes asyncronously to the each replica node responsible for that data.

 

Let’s say that the data is sent to a coordinator on the other side of that cluster? No problem!  That coordinator will also send the data to the correct replica set. Yay! Everything is golden so far!

 

What if we split this cluster right down the middle?  Oh no! Split brain problem! Or is it? No! Because Cassandra handles this problem automatically.  This is not a failover event. Each client that can seen by the client is actually still online. As long as you can write to a replica you are still up and running.  This is configurable as well, you can dial this up so that you are your intolerant of this situation or dial it back down so that you are intolerant of this situation.  But you have the power to control that to fit the needs of environment.

 

So what’s going to happen in this situation?  Well when you write to the coordinator on one side of the partition, it will write that data to the replica or replicas  that it can see.

 

From the other side, in our example, it can see two replicas so it will write to both of them.  This is all managed by consistency level, which is talked about in a different module but is critical to get your head around so make sure you make your way over there at some point.

 

The way that Apache Cassandra is designed is to keep you online when bad things happen, like when someone spills coffee in the server room and brings down some nodes, someone cuts through a cable, or some other minor or even major catastrophe happens. Apache Cassandra is built to manage these eventualities and keep you plugging along.  

No write up.
No Exercises.
No FAQs.
No resources.
Comments are closed.