In this article, we present MADRaS: Multi-Agent DRiving Simulator. It is a multi-agent version of TORCS, a racing simulator widely used for autonomous driving research by the reinforcement learning and imitation learning communities. You can read more about TORCS in the following resources:
MADRaS is a multi-agent extension of Gym-TORCS. It is open source, lightweight, and easy to install, and it exposes the OpenAI Gym API, which makes it ideal for beginners in autonomous driving research. It enables independent control of tens of agents within the same environment, opening up a fertile direction for multi-agent reinforcement learning and imitation learning research aimed at acquiring human-like negotiation skills in complicated traffic situations, a major challenge in autonomous driving that all major players are racing to solve.
Most open-source autonomous driving simulators (like CARLA*, DeepDrive, AirSim, and Udacity* SDC) natively support only egocentric control, that is, single-agent behavior, and rely on preprogrammed behaviors for the other agents. The difficulty of introducing agents with custom behaviors into these simulators restricts the diversity of real-world scenarios that can be simulated.
To address this issue, we developed MADRaS, in which each car on the racing track can be controlled independently. This enables the creation of rich, custom-made traffic scenarios and the simultaneous learning of control policies for multiple agents.
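To make the idea of independent control concrete, here is a minimal sketch of a Gym-style loop in which every car receives its own action. Everything in it is a hypothetical stand-in: `ToyMultiCarEnv`, the per-agent dictionaries, the `"__all__"` done key, and the two toy policies are assumptions made for illustration and are not the MADRaS API. Only the classic `reset()`/`step()` shape with a four-value return is borrowed from the OpenAI Gym convention.

```python
class ToyMultiCarEnv:
    """Hypothetical stand-in for a multi-agent driving environment.

    NOT the MADRaS API. It only mimics the Gym-style reset()/step() shape,
    with observations, actions, rewards, and done flags keyed by agent ID.
    Each car drives along a straight one-dimensional track.
    """

    def __init__(self, n_agents=3, track_length=100.0, max_steps=200):
        self.agent_ids = [f"car_{i}" for i in range(n_agents)]
        self.track_length = track_length
        self.max_steps = max_steps

    def reset(self):
        self.t = 0
        self.pos = {a: 0.0 for a in self.agent_ids}
        self.speed = {a: 0.0 for a in self.agent_ids}
        return self._observe()

    def _observe(self):
        # Each agent sees only its own (position, speed) pair.
        return {a: (self.pos[a], self.speed[a]) for a in self.agent_ids}

    def step(self, actions):
        """actions: dict mapping agent ID -> acceleration, clipped to [-1, 1]."""
        self.t += 1
        rewards, dones = {}, {}
        for a in self.agent_ids:
            accel = max(-1.0, min(1.0, actions[a]))
            self.speed[a] = max(0.0, self.speed[a] + accel)
            self.pos[a] += self.speed[a]
            rewards[a] = self.speed[a]  # reward equals progress made this step
            dones[a] = self.pos[a] >= self.track_length
        # The episode ends when every car has finished or time runs out.
        dones["__all__"] = all(dones.values()) or self.t >= self.max_steps
        return self._observe(), rewards, dones, {}


def cautious(obs):
    """Accelerate only while slower than 2 units per step."""
    _, speed = obs
    return 1.0 if speed < 2.0 else 0.0


def aggressive(obs):
    """Always accelerate."""
    return 1.0


# Each car is driven by its own, independently chosen policy.
policies = {"car_0": cautious, "car_1": aggressive, "car_2": cautious}


def run_episode(env, policies):
    obs = env.reset()
    totals = {a: 0.0 for a in env.agent_ids}
    done = False
    while not done:
        actions = {a: policies[a](obs[a]) for a in env.agent_ids}
        obs, rewards, dones, _ = env.step(actions)
        for a in env.agent_ids:
            totals[a] += rewards[a]
        done = dones["__all__"]
    return obs, totals


env = ToyMultiCarEnv(n_agents=3)
final_obs, totals = run_episode(env, policies)
print(totals)
```

Keying everything by agent ID is one common design choice for multi-agent environments; it lets any entry in `policies` be swapped for a learned controller without touching the loop.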
The task of negotiation in traffic can be posed as that of finding the winning strategy in a multi-agent game, wherein multiple entities (cars, buses, two-wheelers, and pedestrians) are trying to achieve their objectives of getting from one place to another fast, yet safely and reliably. Imitation learning algorithms like Behavioral Cloning, Active Learning, and Apprenticeship Learning (Inverse Reinforcement Learning followed by Reinforcement Learning) have proved to be effective for learning such sophisticated behaviors, under a multitude of simplifying assumptions and constraining conditions.
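As a toy illustration of posing negotiation as a multi-agent game, consider two drivers arriving at an unsignalled intersection, each choosing to go or to yield. The payoff numbers below are hypothetical and chosen only for illustration; the sketch simply enumerates the action pairs from which neither driver gains by deviating alone (the pure-strategy Nash equilibria).

```python
from itertools import product

ACTIONS = ("GO", "YIELD")

# Hypothetical payoffs (driver_a, driver_b), chosen only for illustration.
PAYOFFS = {
    ("GO", "GO"): (-10, -10),       # both enter the intersection: collision
    ("GO", "YIELD"): (2, 0),        # a passes first, b waits
    ("YIELD", "GO"): (0, 2),        # b passes first, a waits
    ("YIELD", "YIELD"): (-1, -1),   # both wait and both lose time
}


def pure_nash_equilibria(payoffs, actions):
    """Return action pairs where neither driver can do better by deviating alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        pay_a, pay_b = payoffs[(a, b)]
        a_is_best = all(payoffs[(alt, b)][0] <= pay_a for alt in actions)
        b_is_best = all(payoffs[(a, alt)][1] <= pay_b for alt in actions)
        if a_is_best and b_is_best:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(PAYOFFS, ACTIONS))
```

With these payoffs the stable outcomes are the two in which exactly one driver yields. The game itself does not say which driver that should be, and that is the coordination problem negotiation has to resolve.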
Much of the contemporary literature makes the single-agent assumption: the agent acts in an environment with many other agents, similar or different, but does not interact with any of them. This robs it of data and information that could be extremely useful in decision making at both the egocentric and collaborative levels.
Driving, however, is inherently multi-agent, and the following is a partial list of things that become possible once we get rid of the single-agent assumption.
Source: eDriving
One of the earliest instances of multi-agent systems being deployed in vehicles (starting way back in 1993!) was platooning, wherein vehicles travel at highway speeds with small inter-vehicle spacing to reduce congestion and still achieve high throughput without compromising safety. It now seems obvious that autonomous cars in the near future will communicate, cooperate, and form platoons over the overlapping stretches of their commutes.
Source: phys.org
Apart from relaying information about pile-ups and possible diversions ahead to all the vehicles in the geographical vicinity, reliable communication can be used to pool the knowledge of multiple learning agents. For an intuitive motivation, consider a large gridworld. A single learning agent might solve the gridworld in n hours of training. With multiple learning agents pooling their experiences, we could cut down the training time significantly, possibly even in proportion to the number of agents!
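The gridworld intuition can be tried out with a small experiment. The sketch below is an assumption-laden illustration, not part of MADRaS. It uses tabular Q-learning on a 5x5 grid, pools experience in the simplest way possible (all agents write to one shared Q-table), and runs the agents one after another within each round instead of truly in parallel. The grid size, hyperparameters, and the `rounds_until_solved` helper are all hypothetical choices.

```python
import random

SIZE = 5                                     # the grid is SIZE x SIZE
START, GOAL = (0, 0), (SIZE - 1, SIZE - 1)
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # right, left, down, up
ALPHA, GAMMA, EPSILON = 0.5, 0.95, 0.2       # illustrative hyperparameters
MAX_STEPS = 100                              # cap on episode length


def step(state, action):
    """Move within the grid; every step costs -1, and reaching GOAL ends the episode."""
    r = min(max(state[0] + MOVES[action][0], 0), SIZE - 1)
    c = min(max(state[1] + MOVES[action][1], 0), SIZE - 1)
    nxt = (r, c)
    return nxt, -1.0, nxt == GOAL


def greedy_action(q, state):
    values = q[state]
    return values.index(max(values))         # ties go to the first action


def run_episode(q, rng):
    """One epsilon-greedy Q-learning episode that updates the table in place."""
    state = START
    for _ in range(MAX_STEPS):
        if rng.random() < EPSILON:
            action = rng.randrange(len(MOVES))
        else:
            action = greedy_action(q, state)
        nxt, reward, done = step(state, action)
        target = reward if done else reward + GAMMA * max(q[nxt])
        q[state][action] += ALPHA * (target - q[state][action])
        state = nxt
        if done:
            break


def greedy_path_length(q):
    """Steps the greedy policy needs from START to GOAL, or None if it never arrives."""
    state = START
    for n in range(1, MAX_STEPS + 1):
        state, _, done = step(state, greedy_action(q, state))
        if done:
            return n
    return None


def rounds_until_solved(n_agents, seed=0, max_rounds=2000):
    """Rounds of pooled experience until the greedy path is a shortest path.

    In each round every agent runs one episode against the SAME Q-table.
    Returns max_rounds if the shortest path is not found in time.
    """
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(MOVES) for r in range(SIZE) for c in range(SIZE)}
    optimal = 2 * (SIZE - 1)
    for round_no in range(1, max_rounds + 1):
        for _ in range(n_agents):
            run_episode(q, rng)
        if greedy_path_length(q) == optimal:
            return round_no
    return max_rounds
```

Calling `rounds_until_solved(k)` for a few values of `k` (and several seeds) gives a rough feel for how the number of rounds changes with the number of agents. How close the speed-up comes to being proportional depends on the environment, the exploration scheme, and how experience is shared, so this toy setup should not be read as evidence either way.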
There’s a host of untapped literature on communication among multiple agents in various environments (not autonomous driving… yet). See:
Now this raises important questions about the reliability of the communication between vehicles. With the imminent advent of 5G, fast and reliable communication between vehicles can help lead to the training and deployment of completely hands-free autonomous cars.
Drivers on the road constantly anticipate the potential actions of fellow drivers. For example, during close maneuvering in car parks and at intersections, drivers make eye contact to ensure a shared understanding. The Defense Advanced Research Projects Agency (DARPA) reported that drivers of traffic vehicles, unnerved by being unable to make eye contact with the robots, resorted to watching the front wheels of the robots for an indication of their intent.
Source: The Star
Multi-agent learning comes with its own share of complications:
But remember why we started solving fully autonomous driving (FAD) in the first place. Writing for Technology Review, Will Knight outlines the possibilities of our driverless car future:
The list goes on...
So, today we’re excited to release MADRaS for the community to kick-start research into making FAD a reality. Because it can introduce multiple learning agents into the environment at the same time, this simulator, built on top of TORCS, can be used to benchmark and try out existing and new multi-agent learning algorithms for self-driving cars, such as Multi-Agent Deep Deterministic Policy Gradient (MADDPG), PSMADDPG, and others. And since it extends TORCS, it also supports the deployment of all the single-agent learning algorithms. Scripts for training a DDPG agent are provided as a sample.
Check out the following video for an overview of the features and the general interface.
This project was developed by Abhishek Naik and Anirban Santara (an Intel® Student Ambassador for AI) during their internship at the Parallel Computing Lab, Intel Labs, Bangalore, India. This project was driven by Intel’s urge to address the absence of an open-source multi-agent autonomous driving simulator that machine learning (particularly, reinforcement learning) scientists can use to rapidly prototype and evaluate their ideas. Although the system was developed and optimized entirely on the Intel® Core™ i7 processor and Intel® Xeon® processors, we believe that it would run smoothly on other x86 platforms, too. Currently, we are working on integrating MADRaS with the Intel® Nervana™ platform Reinforcement Learning Coach, and we invite the community to participate in its development.
Please feel free to report any incompatibility or bug by creating an issue in the GitHub repository. We hope MADRaS enables new and veteran researchers in academia and the industry to make this FAD a reality!