Wednesday, October 26, 2011

The “initial_token” in Cassandra Means the “Very First Time”

Cassandra uses tokens to split key ranges across nodes. When a Cassandra node is started for the very first time, it checks whether an “initial_token” is specified in cassandra.yaml; if none is, the node picks a token based on the cluster it is joining. But how does a node know that it is being started the “very first time”? It is simple. The token is stored on the local disk and persists across process starts and stops. Therefore, once a token is stored, changing the “initial_token” parameter in cassandra.yaml has no effect.

This matters when two or more nodes end up with the same token. Cassandra will elect a new owner of the token, print a warning, and then continue on. nodetool, however, will under-report the number of nodes in the ring because it only probes nodes that have unique tokens. This is such a common problem when making Cassandra VM images that it even gets its own FAQ entry on the Cassandra wiki. The only safe way to give a node a new token cleanly is to wipe out its data and commit logs and then restart the node.
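The startup decision described above can be sketched as a toy model in Python. This is not Cassandra's actual code: the function name resolve_token and its three parameters are hypothetical, chosen only to show the order in which the saved token, the “initial_token” setting, and the cluster-derived token are considered.

```python
from typing import Callable, Optional


def resolve_token(
    saved_token: Optional[str],
    initial_token: Optional[str],
    pick_from_cluster: Callable[[], str],
) -> str:
    """Toy model of the token choice at node startup (hypothetical, not Cassandra's code)."""
    if saved_token is not None:
        # Not the first start: the token persisted on local disk wins,
        # so whatever initial_token says in cassandra.yaml is ignored.
        return saved_token
    if initial_token:
        # Very first start, and initial_token is set in cassandra.yaml.
        return initial_token
    # Very first start without initial_token: derive a token from the cluster.
    return pick_from_cluster()


# First start: nothing on disk yet, so the configured value is used.
first_start = resolve_token(None, "100", lambda: "42")

# Later restart after editing cassandra.yaml: the stored token still wins.
after_edit = resolve_token("100", "200", lambda: "42")

print(first_start, after_edit)
```

In this model the second call still returns the stored token, which mirrors why editing “initial_token” on an already-started node, or on a VM image cloned from one, changes nothing.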
