It is another milestone in the development of artificial intelligence (AI): after AI systems beat the best human players at chess and Go, they have now also achieved a higher success rate than humans in a 3D video game with several participants.
A group at the London-based company DeepMind led by Max Jaderberg reports on the development of its AI system in the journal "Science".
The researchers used the "Capture the Flag" mode of the multiplayer game "Quake III Arena". In this computer game, two teams must capture the opponent's flag and carry it back to their own base. Players can put opponents out of action with laser shots; those players return to the game at their own base after a short period of time. The symmetrical layout of the rooms and corridors, which are generated at random, is intended to give both teams the same opportunities.
The players move through the virtual terrain in the first-person perspective of a team member. They must cooperate with their teammates and keep their opponents in check. The programmed goal of the AI agents is to achieve a high score.
Following the multiplayer principle, Jaderberg and colleagues trained a number of AI agents in parallel by having them play against each other. At first, most of the AI agents developed a similar style of play and similar strategies. This changed over the course of 450,000 training games: after a few thousand games, the AI agents often waited in the opponent's base for the flag to reappear there. Their strategy changed, however, as training progressed. Some time later, many of the AI agents instead followed a teammate. "When training in a rich multi-agent world, complex and surprisingly high-level intelligent artificial behaviour emerged," the researchers write.
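The core idea of this training scheme, a population of agents that improve by repeatedly playing against random opponents drawn from the same population, can be illustrated with a minimal, heavily simplified sketch. Everything here is an assumption for illustration: the `Agent` class, the stub `play_match` game, and the single per-agent hyperparameter stand in for the actual neural-network policies and internal rewards described in the paper.

```python
import random

random.seed(0)

class Agent:
    """Toy stand-in for a learned policy (hypothetical, not DeepMind's code)."""
    def __init__(self, idx):
        self.idx = idx
        self.skill = 0.0                       # proxy for policy quality
        self.lr = random.uniform(0.01, 0.1)    # per-agent hyperparameter

def play_match(a, b):
    """Stub game: higher skill plus noise wins. Returns the winner."""
    score_a = a.skill + random.gauss(0, 1)
    score_b = b.skill + random.gauss(0, 1)
    return a if score_a >= score_b else b

def train(population, n_games):
    for _ in range(n_games):
        # Sample two agents from the population and let them compete.
        a, b = random.sample(population, 2)
        winner = play_match(a, b)
        winner.skill += winner.lr              # "learning" from the outcome
    # Population-based exploit/explore step: the weakest agent copies the
    # strongest agent's hyperparameter and perturbs it slightly.
    best = max(population, key=lambda ag: ag.skill)
    worst = min(population, key=lambda ag: ag.skill)
    worst.lr = best.lr * random.choice([0.8, 1.2])
    return population

pop = train([Agent(i) for i in range(8)], 1000)
```

Because opponents are drawn from the same improving population, each agent always faces adversaries of roughly comparable strength, which is what drives the gradual shift in strategies described above.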
After about 200,000 games, the AI agents were on average better than the best human players. In the end, they won clearly: when two human players competed against two AI agents, the latter captured on average 16 more flags.
The AI agents responded to the appearance of an opponent in 258 milliseconds on average, humans in 559 milliseconds. However, even when the researchers slowed the AI agents' reaction time, the artificial players remained superior to the human ones.
"The presented framework for training populations of agents, each with its own reward, contains only minimal assumptions about the structure of the game," Jaderberg and colleagues write. It could therefore be used for learning in a variety of multi-agent systems with multiple team members.