I have been interested in making it possible to explore applications of machine learning in 0 AD (as some of you may have gathered from https://trac.wildfiregames.com/ticket/5548 ). I realized that I haven't really explained very thoroughly my interest and motivation so I figured I would do so here and see what everyone thinks!
tl;dr - At a high level, I think that adding an OpenAI gym-like interface* could be a cool addition to 0 AD that would benefit both 0 AD (technically and in terms of publicity) as well as the research community in machine learning and AI. I go into the specifics below as well as discuss other potential avenues for integrating/leveraging machine learning:
Potential Machine Learning Problems/Applications
Intelligent unit control (micromanagement)
I have an example where an AI learns to kite with cavalry archers when fighting infantry at https://github.com/brollb/simple-0ad-example. This is probably one of the easiest problems to explore as it can be done progressively starting with small, clearly defined scenarios using the functionality added in the beforementioned ticket. That said, there are still some of the standard challenges present with machine learning around ensuring that the AI has been trained on sufficiently diverse scenarios so that it doesn't ever encounter something new and behave incorrectly.
As far as potential impact on the game, automatic micromanagement could be interesting for either a component in an otherwise scripted AI such as Petra or as a way to make the units more intelligent as they gain experience. That is, I could imagine that as the units gain more experience, they could also start having improved tactical behavior, such as kiting, automatically.
Enemy AI Trained Entirely with Reinforcement Learning
This is actually very difficult although it has been recently done in StarCraft 2 (https://deepmind.com/blog/article/alphastar-mastering-real-time-strategy-game-starcraft-ii). Although I think this could be fun for people to try to do, I wouldn't have high expectations on this front for awhile because it is a very hard problem for ML to solve - especially given the large number of different civilizations, maps, resource types, etc.
Enemy AI with Scripting and Learned Components
This is referring to a generic version of what I mentioned under "intelligent unit control". Essentially, there are a lot of opportunities to incorporate learned components into an otherwise scripted AI. From a technical perspective, this makes the machine learning problem much easier/tractable while still enabling more intelligent behavior from the built in AI.
There are many different examples of intelligent components that could be incorporated. For example, it could try to predict the outcome of a battle (to determine if we should retreat) or try to imitate various high-level human strategies (such as predicting what a human might target for an attack).
Quantitative Game Balancing
This is a very interesting problem and I find 0 AD to be a particularly unique opportunity for exploring it. Essentially, the idea is that there are many different parameters in a game (such as attack damage for each unit, etc) which are quite difficult to tune without making the game imbalanced and one of the civilizations/strategies OP. (I don't think I need an example for this community but I enjoyed watching https://www.gdcvault.com/play/1012211/Design-in-Detail-Changing-the.) This problem is nontrivial since detecting overpowered strategies really requires an understanding of the way various aspects of the game can be exploited.
Although this is a nontrivial problem, I find it to be an exciting opportunity for 0 AD to gain publicity and for researchers to have a sandbox in which they can explore this research question in an actual game (rather than a trivial, toy environment). That is, many of the other environments used in reinforcement learning research are either open source toy environments (eg, CartPole) or proprietary games which cannot be modified (eg, StarCraft 2). There has been a bit of related research in detecting imbalance in complex games like StarCraft 2 as well as balancing simpler games but as proprietary games will not be exposing the parameters used for the units (and other aspects of the game), automatic game balancing approaches are limited. Being an open source game that people actually play, 0 AD provides a really exciting opportunity for research in this direction as the parameters of the game are not proprietary and could be modified programmatically enabling researchers to explore this rather complex problem. For the 0 AD community, enabling researchers to conduct this type of research in the game itself should make it much easier to be able to incorporate any results of such research into the game making 0 AD more fun and an even better game!
Training the AI to imitate humans is worth mentioning although the impact on the game is likely to be in one of the beforementioned ways. Imitation learning, unlike reinforcement learning, is training the AI using expert demonstrations of gameplay. It is often used as a method for essentially initializing the AI to something reasonable before training it further with reinforcement learning (ie, training the AI using a reward rather than example). Imitation learning can arguably be more valuable for game development given that it can more directly instill various human-like behaviors (hopefully making the gameplay more engaging and interesting) rather than simply trying to maximize some reward or score in the game.
Techniques to Train and Understand AI Agents
This is more of a general research direction that I find interesting (and is similar to research that I have done in the past). Essentially, this is exploring the means by which the game developer can use the various methods of instilling behavior into an AI (programming, reinforcement learning, imitation learning) to create the desired behavior (and game experience). This is a bit of both a human-computer interaction (HCI) and machine learning question (also related to machine teaching). To give a more concrete example, this would include exploring the behavior of a trained RL agent in the game, correcting these behaviors, and perhaps trying to detect potentially incorrect behaviors to raise to the user automatically. 0 AD is well suited for this type of research for the same reasons that it is well suited for exploring game balance - most games used in research are either proprietary or not something people would actually play.
Optimizing Existing Game Parameters (Relatively Easy)
There are some existing machine learning tricks that could be used to make other sorts of improvements to the game rather than explore research questions. A while back, I was playing around with CMAES (a machine learning technique to optimize a set of parameters given a "fitness function") to improve some of the sort of magic numbers used within Petra such as "popPhase2" and "armyMergeSize". Essentially, this made it possible to find values for these parameters which would improve the AI's ability to win when playing against the standard Petra agent (on the hardest difficulty). Although I don't find this as interesting as the other areas, it is a useful tool that could be nice to apply to other aspects of the game.
Overall, I think it would be really exciting to be able to explore some of the research questions in 0 AD as I think it could be beneficial both to researchers but also would make it easier to incorporate the results of this research into 0 AD (making it an even better game!). Of course, this is only true if the functionality required to be added to 0 AD is easy to maintain and doesn't add overhead taking away from the development of the core game features and functionality. I am also hopeful that incorporating some of these machine learning capabilities could also be beneficial to the community and raise awareness of 0 AD!
As far as technical requirements, I made an RPC interface for controlling the AI from Python (because the majority of machine learning tools are in Python). This makes it possible to explore 1, 2, and 3 as well as provides necessary functionality for 4, 5, and 6. As mentioned above, I have an example of #1 on GitHub and I think this could make for really interesting undergraduate projects (as well as potentially interesting integrations into the game). However, I think 0 AD is a particularly unique opportunity for exploration of 4 and 6. Game balancing (#4) still requires the ability to programmatically edit the unit parameters which I have explored a little bit but haven't added to the game. If this is something that others find interesting (and wouldn't mind me asking a few questions ), I would be open to adding this as well.
Anyway, I find these machine learning problems and applications quite exciting both for 0 AD and for AI/ML research but I want to know what the rest of the community thinks! Let me know what you think or if you have any questions/comments!
* I say *OpenAI gym-like* because a gym environment requires an observation space (numerical representation of the world for the AI), action space (numerical representation of the actions the AI can perform), and reward function to be defined. It isn't clear what the most appropriate choices for these would be (and they could vary based on the specific scenario) so I would prefer making more of a "meta-gym" where it is basically an OpenAI gym that needs the user to specify these values.