For supporting split-site clusters, we envision the need for a tiebreaker server. This page provides some of the background thinking which led to our quorum server. As such, it is mainly background information, and the design proposed below isn't exactly the one that was implemented.
This server would give out a token to each subcluster which contacted it and asked to have quorum. A subcluster granted the exclusive privilege of being the primary subcluster would receive a HAVEQUORUM token; a subcluster not granted quorum would be given a NOQUORUM token.
If a node did not renew its HAVEQUORUM token, then the server would begin sending the node at the other end of the cluster a NOQUORUM token until the beginning of a new bidding cycle for selecting a new primary subcluster.
It is important that the tiebreaker server effectively have its own heartbeat mechanism (through the continual token renewal), so that if a node dies while it holds the quorum token, the tiebreaker service knows that it has died. Otherwise the server would never be able to pass the quorum token along to another node. Since UDP has no connection mechanism, and TCP connections can take around 10 minutes to notify you that they have been disconnected, one unfortunately can't rely on network disconnect notifications for this function.
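As a rough illustration of renewal acting as a heartbeat, here is a minimal sketch in Python. The class name `QuorumLease`, its methods, and the `deadtime` parameter (seconds of silence before the holder is presumed dead) are assumptions made for this example, not part of any implemented server. Time is passed in explicitly so the logic can be exercised without waiting.

```python
from dataclasses import dataclass
from typing import Optional

HAVEQUORUM = "HAVEQUORUM"
NOQUORUM = "NOQUORUM"


@dataclass
class QuorumLease:
    """Hypothetical record of who holds the quorum token and when it last renewed."""
    deadtime: float                 # assumed: allowed silence, in seconds
    holder: Optional[str] = None
    last_renewal: float = 0.0

    def grant(self, subcluster: str, now: float) -> None:
        """Give the token to a subcluster (called by whatever selects the primary)."""
        self.holder = subcluster
        self.last_renewal = now

    def expired(self, now: float) -> bool:
        """True if there is a holder and it has been silent longer than deadtime."""
        return self.holder is not None and now - self.last_renewal > self.deadtime

    def renew(self, subcluster: str, now: float) -> str:
        """Handle one renewal request and return the token to send back."""
        if self.expired(now):
            self.holder = None      # the holder is presumed dead; nobody holds quorum
        if self.holder is not None and subcluster == self.holder:
            self.last_renewal = now
            return HAVEQUORUM
        return NOQUORUM
```

Note that the server learns of the holder's death purely from the missing renewals; it never depends on the transport reporting a disconnect.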
Once a token holder fails to renew, no subcluster would be able to receive the HAVEQUORUM token for a period of time.
After that period, all connections in good standing would be eligible to be selected as the primary subcluster.
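The sequence of holding the token, the period in which nobody can hold it, and the reopening of eligibility can be sketched as a function of the time since the holder's last renewal. The names `Phase`, `phase_at`, `token_for`, and the `quiet_time` parameter are assumptions for illustration; the text above only says "a period of time".

```python
from enum import Enum


class Phase(Enum):
    HELD = "held"        # a primary subcluster holds HAVEQUORUM
    QUIET = "quiet"      # nobody may receive HAVEQUORUM
    BIDDING = "bidding"  # connections in good standing are eligible again


def phase_at(now: float, last_renewal: float, deadtime: float, quiet_time: float) -> Phase:
    """Return the phase implied by how long the holder has been silent."""
    silence = now - last_renewal
    if silence <= deadtime:
        return Phase.HELD
    if silence <= deadtime + quiet_time:
        return Phase.QUIET
    return Phase.BIDDING


def token_for(phase: Phase, is_holder: bool) -> str:
    """Only the current holder, and only while the token is held, gets HAVEQUORUM."""
    return "HAVEQUORUM" if phase is Phase.HELD and is_holder else "NOQUORUM"
```

With an assumed `deadtime` of 10 seconds and `quiet_time` of 30 seconds, a holder silent for 20 seconds puts the server in the quiet phase, and bidding reopens once the silence exceeds 40 seconds.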
The process for obtaining quorum during a bidding interval would happen like this:
the tiebreaker server would announce an open bidding cycle
Once a primary subcluster was selected, the process the winner would follow would be:
Once a primary subcluster was selected, the process all the losers would follow would be:
When a client subcluster first connects to the server, it provides its identification and bid information to the tiebreaker server, and waits for a new bid announcement. If the server wishes to, it can terminate the current holder of the quorum token and begin a new bid process as though the current holder had failed to renew in time.
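A bidding cycle along these lines might look like the sketch below. This page does not say what the bid information contains, so the integer `weight` field (for example, a count of live nodes) and the tie-break on the subcluster name are purely assumptions, as are the names `Bid` and `BiddingCycle`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Bid:
    subcluster: str   # identification supplied when the client connects
    weight: int       # hypothetical bid information, e.g. number of live nodes


class BiddingCycle:
    """Collects bids from connected subclusters and picks one primary."""

    def __init__(self) -> None:
        self.bids: Dict[str, Bid] = {}

    def register(self, bid: Bid) -> None:
        # A later bid from the same subcluster replaces its earlier one.
        self.bids[bid.subcluster] = bid

    def withdraw(self, subcluster: str) -> None:
        # Called when a connection is no longer in good standing.
        self.bids.pop(subcluster, None)

    def select_primary(self) -> Optional[str]:
        """Highest weight wins; ties go to the lexicographically largest name."""
        if not self.bids:
            return None
        best = max(self.bids.values(), key=lambda b: (b.weight, b.subcluster))
        return best.subcluster

    def tokens(self) -> Dict[str, str]:
        """Token each bidder would be sent once the cycle closes."""
        primary = self.select_primary()
        return {name: ("HAVEQUORUM" if name == primary else "NOQUORUM")
                for name in self.bids}
```

Forcing a rebid, as described above, would then amount to discarding the current holder and running a fresh cycle over the bids still registered.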
Should they contain a 'deadtime' value? Or should they be slaved to the server? Obviously some kind of common expectation (negotiation) has to happen here...
When a node which had a connection open goes down gracefully, can its connection be transferred gracefully, or will it simply cause a rebid for quorum?
zhenh: Could we avoid having the client keep the connection open? For example, every minute the client connects to the tiebreaker, sends its own data, gets the quorum status from the tiebreaker, then disconnects.
AlanR: I thought I addressed this above, but I see that I didn't. You could theoretically design it to allow disconnect and reconnect. However, TCP/IP connections are both slow and fairly expensive to start, much more efficient once they've been established for a while (like 4 seconds or so), and cheap to keep up. In UDP, there is no connection, so you can't connect or disconnect. My first guess is that it is both noticeably more efficient and more reliable to stay connected if you choose TCP, and irrelevant if you choose UDP. On a related note, one of the most common criticisms of HTTP 1.0 is that it has very short connection times, which both increase server load and slow down data transfer. Subsequent versions of HTTP therefore make it possible to reuse server connections to avoid this problem. We can avoid it completely right from the beginning if we just choose to stay connected.
I would conclude that UDP is the better protocol when no firewalls or encryption are involved. I would conclude that TCP may be necessary when dealing with firewalls, because corporate security directives are often inflexible. If one allowed both TCP and UDP, and allowed people to choose port numbers, then this would give the maximum flexibility to conform to corporate guidelines.
[The more I think about this protocol the more sense UDP seems to make, since there is no bulk data transfer, and latency is more important than throughput. But, I'm not really an expert in this]
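To make the UDP option concrete, here is a minimal sketch of one renewal round trip over the loopback interface, using Python's standard `socket` module. The one-line message format (the subcluster name as ASCII), the function names, and the choice to treat a lost reply as NOQUORUM are all assumptions for this example.

```python
import socket
import threading


def serve_one(server: socket.socket, holder: str) -> None:
    """Answer a single renewal datagram with HAVEQUORUM or NOQUORUM."""
    data, addr = server.recvfrom(512)
    subcluster = data.decode("ascii", errors="replace").strip()
    reply = "HAVEQUORUM" if subcluster == holder else "NOQUORUM"
    server.sendto(reply.encode("ascii"), addr)


def renew(client: socket.socket, server_addr, subcluster: str,
          timeout: float = 1.0) -> str:
    """Send one renewal and wait for the reply."""
    client.settimeout(timeout)
    client.sendto(subcluster.encode("ascii"), server_addr)
    try:
        data, _ = client.recvfrom(512)
    except socket.timeout:
        # Assumption: a client that hears nothing must not believe it has quorum.
        return "NOQUORUM"
    return data.decode("ascii")


def demo(subcluster: str, holder: str = "siteA") -> str:
    """Run one request/response exchange on 127.0.0.1 with an OS-chosen port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("127.0.0.1", 0))
        server.settimeout(1.0)
        worker = threading.Thread(target=serve_one, args=(server, holder))
        worker.start()
        reply = renew(client, server.getsockname(), subcluster)
        worker.join()
        return reply
```

Each renewal is a single small datagram in each direction, with no connection setup or teardown, which fits the observation that there is no bulk data transfer here.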
It is necessary for the tiebreaker server to be accessible from both sites, and highly desirable that it be located on a third site, and that both the server (or HA server pair) and the networks to it from the other sites be as reliable as possible. Low latency and high bandwidth ... and the ... should be highly reliable.
Since the quorum server may serve many clusters, and many versions of cluster software, it is necessary for it to be very tolerant of different versions of the software running on either end of the connection. It might also prove desirable to have different versions of the server be running on different ports at the same time.
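One common way to get this kind of tolerance is to parse messages as named fields, ignore any field the receiver doesn't recognize, and supply defaults for fields an older sender omits. The sketch below assumes a hypothetical `key=value` line format with `version`, `cluster`, and `type` fields; none of this is taken from the implemented server.

```python
def parse_message(text: str) -> dict:
    """Split 'key=value' lines into a dict, skipping lines that don't fit the form."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            fields[key.strip().lower()] = value.strip()
    return fields


def interpret(text: str) -> dict:
    """Keep only the fields this receiver understands; reject only if essentials are missing."""
    fields = parse_message(text)
    missing = {"cluster", "type"} - fields.keys()
    if missing:
        raise ValueError("missing required fields: " + ", ".join(sorted(missing)))
    return {
        # Assumption: senders that predate the version field speak version 1.
        "version": int(fields.get("version", "1")),
        "cluster": fields["cluster"],
        "type": fields["type"],
    }
```

A message carrying an extra `future_option=x` line is accepted and the unknown field is simply dropped, so a newer client does not break an older server.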