One of the major roadblocks to increasing TPS inside the Factom Protocol is the architecture of factomd, particularly the single-threaded approach to message processing.
To provide a short recap, the Factom Protocol consensus happens inside the “Process List” (PL) — one per block. Each PL is split into a number of Virtual Machines (VMs), one for each federated server.
Messages that arrive are distributed over the VMs. The official order is determined by the federated server responsible for that VM by sending out “Acks” containing the cryptographic signature and the position of the message inside the VM.
This order is very important, allowing each server to build blocks the exact same way and come to an agreement. So far, so good.
The Process
The official order in which messages are processed is: VM₀ items 0..N₀, followed by VM₁ items 0..N₁, all the way to VMₘ items 0..Nₙ.
For the sake of readability, let’s just assume that all messages are just in one long list. This is also how factomd processes the messages, by going through the list one by one and seeing if they can be applied.
An example why this is important is the following: A owes both B and C 5 FCT each, but A only has 6 FCT. Thinking they can trick the system, A submits two transaction simultaneously, one to pay B and one to pay C. The first transaction receives the position #123, the second #125. At the time they arrive, both messages are valid and A has enough balance to cover it.
When it comes time to process the messages, the node will update A’s balance to 1 FCT at position #123 and consider transaction #125 to be invalid due to low balance.
The Problem
The order of message does make a difference. We can’t simply spawn workers to work off messages as quickly as possible, since race conditions have the potential to create different states from the same set of messages.
However, not all transactions are the same. For example, if A wants to send B 5 FCT and C wants to send D 2 FCT, it makes no difference which one is applied first or second.
It is that property we can exploit to achieve multi-threading.
Rather than building a list of messages, we can build a dependency tree:
Messages are only valid if the blocks / minutes are signed properly, making that the base dependency for all messages. Beyond that, each message has both conflicts and resolutions. A conflict is something that needs to exist or be resolved. A resolution will resolve specific dependencies.
The following messages are part of the PL:
- Commit Entry/Chain
Conflict: The balance of the EC Address
Resolution: Revealing the entry with a specific hash - Reveal Entry
Conflict: The existence of a paying commit
Resolution: None - FCT Transaction
Conflict: The balances of the input addresses
Resolution: The balances of the output FCT/EC addresses - AddServer/RemoveServer
Conflict: None
Resolution: None - ChangeServerKey
Conflict: None
Resolution: None
Most of them are fairly straightforward. Reveals need to have a commit paying for them before they can be applied. Transactions need enough balance to cover them. In most cases, there is no reason not to multi-thread them.
Parallel Processing
A message can be processed when it has no conflicts or all its conflicts can be resolved. For this purpose, we can introduce the concept of buckets. Each FCT or EC Address is a bucket, as well as a bucket of commits.
Transactions are placed into buckets according to their inputs — outputs are always positive and thus will never be a problem. If two or more transactions are in the same bucket, that is they both have the same address as input, the one with a lower order is picked first. If a transaction is the only one in a bucket, it can be processed immediately.
Commits are similar to Transactions but different, in the sense that only commits themselves can decrease EC. Transactions can only increase EC balance. For this reason, commits don’t technically need to wait for conflicts to be resolved until the EC balance is too low to pay for the commit. Multiple commits done by the same EC address are resolved by their order.
Reveals are the simplest of all. If a commit exists that paid for it with enough EC, the message is processed. Otherwise it is blocked until all commits have been processed.
If a message has an unresolvable conflict, it can be either discarded or kept around to see if it can be resolved in the future.
Edge Cases
Let’s assume there are three addresses, each with a balance of zero or less than 1 FCT: A, B, and C. In the same minute, A sends 1 FCT to B, B sends 1 FCT to C, and C sends 1 FCT to A. This creates a cyclical dependency that can be hard to detect.
If any of the addresses have more than 1 FCT, that transaction would have no conflict and the entire cycle would be resolved. If the transactions never receive enough FCT, the transactions will fail.
Conclusion
By being able to process messages that have no conflicts individually, it is possible to process them simultaneously. This has potentially dramatic effects on the ability for nodes to turn the processlist into the blocks and apply that block to the state.