Umpire 0.5 progress: I've been hard at work. Networking support is very nearly done - I have successfully initiated a multiplayer game over the Internet, though it remains too slow. But I think all notable refactors are behind me; only a few methods remain to be re-implemented to take advantage of local caching of observations and minimize round-trips.
In addition, I've revived and updated the AI training infrastructure, trained a preliminary AlphaGo Zero-style player, and built up a good library of self-play data to build on.
What that means (in case you need a reminder) is that we have simple AIs (random baselines, in fact) play games against each other and track who wins. Then a supervised neural network is trained to predict the probability of victory given the game state and the action taken. It's similar to a Q-learning state-action value model, but instead of the "soft" (and somewhat arbitrary) supervision of a reward function, the "hard" supervision of wins and losses is used exclusively.
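To make that concrete, here's a minimal sketch (in Python/PyTorch, not the actual Umpire training code) of what that kind of supervision looks like: an encoded state and action go into a small network, and the only training signal is whether the acting player eventually won. The dimensions, `WinProbModel` class, and `train_step` helper are hypothetical placeholders.

```python
import torch
import torch.nn as nn

STATE_DIM = 128   # hypothetical size of the encoded game state
ACTION_DIM = 16   # hypothetical size of the encoded action

class WinProbModel(nn.Module):
    """Estimates P(victory | state, action), trained on hard win/loss labels."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, 1),  # single logit; sigmoid gives win probability
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1)).squeeze(-1)

def train_step(model, optimizer, states, actions, won):
    """One supervised step. `won` is 1.0 if the acting player eventually won
    the game, 0.0 otherwise -- no shaped reward, only the final outcome."""
    logits = model(states, actions)
    loss = nn.functional.binary_cross_entropy_with_logits(logits, won)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice you'd label every (state, action) pair from a self-play game with 1.0 for the eventual winner's moves and 0.0 for the loser's, then loop something like `train_step` over the whole library of self-play data.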
Of course, I don't have Google's budget, so I'm not sure how far I'll get with this. But it's something I've wanted to try, and now that I have 24GB of VRAM to throw at it, I thought I'd see what I can do.