raft: Implement gRPC interceptor for request proxying

Common Raft libraries, including etcd/raft, our library of choice, are bundled with Proposal Forwarding. When a node receives a proposal and does not hold the primary role, it forwards the entry to the current leader. This feature is handy for applications because it eliminates the need for request routing. Unfortunately, it doesn't work for Gitaly.

In Gitaly WAL, a transaction is verified, committed, and then applied, in that order. Transactions are serialized: the next transaction is verified against the latest state after the previous one has been applied. Raft is involved at the committing phase. The log entry is broadcast after it has been verified and admitted. In addition, a transaction depends on a snapshot. This snapshot is essentially the point-in-time state of a partition that the transaction reads its data from.

If Proposal Forwarding is used, two nodes are allowed to accept mutator requests simultaneously and then commit transactions independently. Although the resulting log entries are handled by the primary, a lot of things might go wrong:

  • The transaction on the replica uses an outdated snapshot. There are some techniques to reduce this risk, but there is no guarantee.
Replica: A -> B -> C -> Start E based on C
Primary: A -> B -> C -> Commit D after the replica fetches the latest index but before E starts.
=> E should depend on D instead of C. This case does not happen if E is initiated on the primary.
  • Log entries from Replica and Primary are not compatible.
Replica: A -> B -> C -> Start D -> Send to Primary
Primary: A -> B -> C -> Start E -> Commit E -> Receive D -> Commit D
=> D is not verified against E, although E commits before D.

As a result, Proposal Forwarding doesn't work for Gitaly. It must be disabled by default for the sake of data correctness. Gitaly needs proper request routing. Two approaches have been discussed: client-side routing and proxying. The latter is deemed simpler yet effective enough for initial iterations.

The idea is simple. We implement a gRPC middleware. The middleware accepts mutator requests if the current node holds the primary role for the target partition. Otherwise, it proxies the whole gRPC request to the destination node. Finally, it collects and returns the result. In Gitaly, Praefect applies this method extensively. We could reuse a significant amount of code from there.

Edited by Quang-Minh Nguyen