Jump to content

Solving the Byzantine Fault Problem in Rated Games


Recommended Posts

The multiplayer lobby is currently plagued by rated games being quit rather than finished resulting in artificial rating manipulation. No explanation is needed as to why this is terrible.

I looked through some posts on the forums regarding the topic and some solutions and why they wouldn't work. All of them ended somewhere close to "Who actually disconnected cannot be reliably unless WFG hosts dedicated servers". Which isn't quite true. This is basically a text book Byzantine fault problem. https://en.wikipedia.org/wiki/Byzantine_fault.

A somewhat good enough solution can be implemented into the lobby without much changes, and without dedicated servers for rated games. However, any solution to the problem would need at least 3 entities. And also, it is assumed that no more than 1/3 would be up to any shenanigans by manipulating game states and lying to the "arbitration" service. Long story short, 2 players and 1 arbiter.

A proof of consensus mechanism needs to be implemented to ensure three nodes are in sync and knows what is going on. I think a decent start would be to just seed a pseudo-RNG that is deterministic with a value agreed upon at game start. Note that turn hashes cannot be used since the current value need to be computed by a bot that does not run the simulation, if turn hashes are used and mismatching values are sent, there is no way to determine which hash is correct. This isn't actually Byzantine Fault Tolerant, but that doesn't really matter since if the client kept on lying, once the game ends, the worst thing that could happen is two mismatching reports being sent to the ratings bot. However, to successfully do this would be a not so easy task if someone cared enough to do, they can just as easily implement an auto clicker to play the game.

Technically, it would look like this:

arbiter: Another WFG hosted lobby bot.

Both players send the next RNG output to the bot, the bot queries its own RNG, and if all three values match, life is peachy.

If one value doesn't match, the arbiter, and the player who sent the matching value are correct, and the other player is incorrect.

How is the quitting problem solved?

Once the game ends, and both players send the game reports, this information can be used to solve the trust problem.

Currently, when one person quits, upon game end, only one game report is send to the ratings bot which ignores the game if both reports are not identical.

With this implemented, the reports from the player who was in sync with the arbiter would be deemed correct.

Can't the proof of consensus be "I am here arbiter" every Nth turn?

Not really. This would allow for players to fool the arbiter even if they quit. To be fair, even this allows for it to someone who is both smart enough, and determined enough. The solution is to implement a true byzantine fault tolerant consensus method, but I genuinely think this is overkill.

How hard is this to circumvent?

To circumvent this. Nefarious actor needs to patch the engine to send the expected value regardless of whether they are in a game currently or not, and upon game end, send a falsified game report. To make it more tolerant as said above, a more secure consensus method has to be implemented. On the top of my head, a solution is to reseed the RNG with the simulation hash at certain intervals. Once the values get reseeded, unless the client has been computing the turn hashes, they wouldn't get the next value right. While this solves the fooling problem, it also creates a new problem.

This highlights the flaw in this whole thing. If 2/3 of the system cannot be trusted, the faulty entity cannot be determined. If a game goes out of sync, there is no way to find the flaw since none of the turn hashes sent can be verified. The reseeding part relies on two players sending matching hashes. This isn't a regression per se, but its still annoying if OOSes are weaponized.

Can it be more resilient?

Yes, in 1999, the "Practical Byzantine Fault Tolerance" algorithm came out. It is extremely fast. With this state machine, the three entities can process player commands and generate simulation tracking alternative hashes that do not depend on actually computing the simulation state. However, since there is only 3 entities, and the problem can only be solved if 1/3 or less entities are nefarious, so OOSes cannot be reliably solved. However, notice that this cannot be solved even with dedicated rated game hosting.

</useless crap no one reads>

  • Thanks 1
Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

 Share

×
×
  • Create New...