MADRaS: A Multi-Agent DRiving Simulator for Autonomous Driving Research

Guest Blog Last Updated : 07 May, 2020

6 min read

Introduction

simulator screen

In this article, we present MADRaS: Multi-Agent DRiving Simulator. It is a multi-agent version of TORCS, a racing simulator popularly used for autonomous driving research by the reinforcement learning and imitation learning communities. You can read more about TORCS in the below resources:

MADRaS is a multi-agent extension of Gym-TORCS and is open source, lightweight, easy to install, and has the OpenAI Gym API, which makes it ideal for beginners in autonomous driving research. It enables independent control of tens of agents within the same environment, opening up a prolific direction of research in multi-agent reinforcement learning and imitation learning research aimed at acquiring human-like negotiation skills in complicated traffic situations—a major challenge in autonomous driving that all major players are racing to solve.

Most open-source autonomous driving simulators (like CARLA*, DeepDrive, AirSim, and Udacity* SDC) innately support only egocentric control; that is, single agent behavior, and have preprogrammed behaviors for the other agents. The difficulty in introducing agents with custom behaviors in these simulators restricts the diversity of real-world scenarios that can be simulated.

To address this issue, we developed MADRaS, wherein each car on the racing track can be independently controlled, enabling the creation of rich, custom-made traffic scenarios, and learning the policy of control of multiple agents simultaneously.

More Details about this System

The task of negotiation in traffic can be posed as that of finding the winning strategy in a multi-agent game, wherein multiple entities (cars, buses, two-wheelers, and pedestrians) are trying to achieve their objectives of getting from one place to another fast, yet safely and reliably. Imitation learning algorithms like Behavioral Cloning, Active Learning, and Apprenticeship Learning (Inverse Reinforcement Learning followed by Reinforcement Learning) have proved to be effective for learning such sophisticated behaviors, under a multitude of simplifying assumptions and constraining conditions.

A major portion of the contemporary literature makes the single-agent assumption; that is, the agent acts in an environment with a plethora of other agents—similar or different—but does not interact with any of them, robbing it of data and information that could potentially be extremely useful in decision making, at both the egocentric and collaborative levels.

Multi-Agent Systems

Driving, however, is inherently multi-agent, and the following is a partial list of things that become possible once we get rid of the single-agent assumption.

Platooning

virtual vehicles

Source: eDriving

One of the earliest instances of multi-agent systems being deployed in vehicles (starting way back in 1993!) was in the use of platooning, wherein vehicles travel at highway speeds with small inter-vehicle spacing to reduce congestion and still achieve high throughput without compromising safety. Now it seems obvious that autonomous cars in the near future will communicate, cooperate, and form platoons over intersecting lengths of their commutes.

Pooling knowledge

traffic simulator

Source: phys.org

Apart from transferring information about pile-ups and possible diversions ahead to all the vehicles in the geographical vicinity, this power of reliable communication can be used to pool together the knowledge of multiple learning agents. An intuitive motivation could be to consider a large gridworld. With a single learning agent, one could solve the gridworld in n hours of training. With multiple learning agents pooling their experiences, we could cut down the training time significantly, possibly even linearly!

There’s a host of untapped literature on communication among multiple agents in various environments (not autonomous driving… yet.) See:

Now this raises important questions about the reliability of the communication between vehicles. With the imminent advent of 5G,¹ fast and reliable communication between vehicles can help lead to the training and deployment of completely hands-free autonomous cars.

Leveraging intent

Drivers on the road constantly anticipate the potential actions of fellow drivers. As an example, for close maneuvering in car parks and intersections, eye contact is made to ensure a shared understanding. Defense Advanced Research Projects Agency (DARPA) stated that traffic vehicle drivers, unnerved by being unable to make eye contact with the robots, had resorted to watching the front wheels of the robots for an indication of their intent.

car, biker, and pedestrian

Source: The Star

As inter vehicle communication becomes ubiquitous and reliable (shout out to 5G!), autonomous vehicles will be able to transmit their intent to neighboring vehicles to implement the level of coordination beyond what human drivers currently achieve using eye contact.

But there are a few things to consider first..

Multi-agent learning comes with its own share of complications:

Nonstationarity A moving-target problem, since the best policy changes as the other agents’ policies change.
Curse of dimensionality The exponential growth of state and action variables with the number of agents.
Specifying a good goal Difficult, since the agents’ returns are correlated and cannot be maximized independently.
Exploration Apart from having to explore the environment, the agents also have to obtain information about other agents.
Coordination The effect of an agent’s action on the environment also depends on the actions taken by other agents, hence the need of mutually consistent actions in order to achieve the intended effect.

Recalling the Why of our Problem

But remember why we started solving fully autonomous driving (FAD) in the first place. Writing for Technology Review, Will Knight outlines the possibilities of our driverless car future:

Safer transportation
The National Highway Traffic Safety Administration estimates that more than 90 percent of road crashes involve human error, a figure that has led some experts to predict that autonomous driving will reduce the number of accidents on the road by a similar percentage. Assuming the technology becomes ubiquitous and does have such an effect, the benefits to society will be huge. Almost 33,000 people die on the roads in the United States each year, at a cost of USD 300 billion, according to the American Automobile Association. The World Health Organization estimates that, worldwide, over 1.2 million people die on roads every year.
Improved fuel efficiency
Apart from lesser traffic congestion due to fewer accidents, it is also expected that the rise of self-driving taxis will help decrease the total number of cars on the road, alleviating the overall traffic. And because driverless vehicles will be designed to optimize efficiency in acceleration and braking, the adoption of autonomous cars could reduce CO₂ emissions produced by cars by as much as 300 million tons per year. Autonomous vehicles traveling in high-speed platoons that reduce aerodynamic drag could also reduce fuel consumption by 20 percent.⁴
Less traffic?
Driverless cars communicating with each other and their surroundings can find and exploit the optimal routes more effectively, which will help spread the demand for scarce open road spaces.
Improved human productivity
With cars doing most or all of the driving, we’ll be free to make the most of our time spent in the vehicle—spending that time reading books, watching a game, interacting with family members, and even getting some work done!

The list goes on..

End Notes

So, today we’re excited to release MADRaS for the community to kickstart research into making FAD a reality. With the ability of introducing multiple learning agents in the environment at the same time, this simulator, built on top of TORCS, can be used to benchmark and try out existing and new multi-agent learning algorithms for self-driving cars such as: Multi-Agent Deep Deterministic Policy Gradient (MADDPG), PSMADDPG, and the lot. And since this extends TORCS, it supports the deployment of all the single-agent learning algorithms as well. Scripts for training a DDPG agent are provided as a sample.

Check out the following video for an overview of the features and the general interface.

Check out the GitHub* repository and the wiki for the project.
Check out videos of a sample Deep Deterministic Policy Gradients (DDPG) agent that has learned to drive in traffic.

This project was developed by Abhishek Naik and Anirban Santara (an Intel® Student Ambassador for AI) during their internship at the Parallel Computing Lab, Intel Labs, Bangalore, India. This project was driven by Intel’s urge to address the absence of an open source multi-agent autonomous driving simulator that can be utilized by machine learning (particularly, reinforcement learning) scientists to rapidly prototype and evaluate their ideas. Although the system was developed and optimized entirely on the Intel^® Core™ i7 processor and Intel^® Xeon^® processors, we believe that it would run smoothly on other x86 platforms, too. Currently, we are working on integrating MADRaS with the Intel^® Nervana^™platform Reinforcement Learning Coach and we invite the community to participate in its development.

Please feel free to report any incompatibility or bug by creating an issue in the GitHub repository. We hope MADRaS enables new and veteran researchers in academia and the industry to make this FAD a reality!

Authors

Abhishek Naik: IIT Madras
Anirban Santara: Intel Student Ambassador for AI, IIT Kharagpur
Balaraman Ravindran: Head, RBC-DSAI, IIT Madras
Bharat Kaul: Parallel Computing Lab, Intel Labs

Guest Blog

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

MUID

Used by Microsoft Clarity, to store and track visits across websites.

Expiry: 1 Year

Type: HTTP

_clck

Used by Microsoft Clarity, Persists the Clarity User ID and preferences, unique to that site, on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID.

Expiry: 1 Year

Type: HTTP

_clsk

Used by Microsoft Clarity, Connects multiple page views by a user into a single Clarity session recording.

Expiry: 1 Day

Type: HTTP

SRM_I

Collects user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Years

Type: HTTP

SM

Use to measure the use of the website for internal analytics

Expiry: 1 Years

Type: HTTP

CLID

The cookie is set by embedded Microsoft Clarity scripts. The purpose of this cookie is for heatmap and session recording.

Expiry: 1 Year

Type: HTTP

SRM_B

Collected user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Months

Type: HTTP

_gid

This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected includes the number of visitors, the source where they have come from, and the pages visited in an anonymous form.

Expiry: 399 Days

Type: HTTP

_ga_#

Used by Google Analytics, to store and count pageviews.

Expiry: 399 Days

Type: HTTP

_gat_#

Used by Google Analytics to collect data on the number of times a user has visited the website as well as dates for the first and most recent visit.

Expiry: 1 Day

Type: HTTP

collect

Used to send data to Google Analytics about the visitor's device and behavior. Tracks the visitor across devices and marketing channels.

Expiry: Session

Type: PIXEL

AEC

cookies ensure that requests within a browsing session are made by the user, and not by other sites.

Expiry: 6 Months

Type: HTTP

G_ENABLED_IDPS

use the cookie when customers want to make a referral from their gmail contacts; it helps auth the gmail account.

Expiry: 2 Years

Type: HTTP

test_cookie

This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor's browser supports cookies.

Expiry: 1 Year

Type: HTTP

_we_us

this is used to send push notification using webengage.

Expiry: 1 Year

Type: HTTP

WebKlipperAuth

used by webenage to track auth of webenagage.

Expiry: Session

Type: HTTP

ln_or

Linkedin sets this cookie to registers statistical data on users' behavior on the website for internal analytics.

Expiry: 1 Day

Type: HTTP

JSESSIONID

Use to maintain an anonymous user session by the server.

Expiry: 1 Year

Type: HTTP

li_rm

Used as part of the LinkedIn Remember Me feature and is set when a user clicks Remember Me on the device to make it easier for him or her to sign in to that device.

Expiry: 1 Year

Type: HTTP

AnalyticsSyncHistory

Used to store information about the time a sync with the lms_analytics cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

lms_analytics

Used to store information about the time a sync with the AnalyticsSyncHistory cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

liap

Cookie used for Sign-in with Linkedin and/or to allow for the Linkedin follow feature.

Expiry: 6 Months

Type: HTTP

visit

allow for the Linkedin follow feature.

Expiry: 1 Year

Type: HTTP

li_at

often used to identify you, including your name, interests, and previous activity.

Expiry: 2 Months

Type: HTTP

s_plt

Tracks the time that the previous page took to load

Expiry: Session

Type: HTTP

lang

Used to remember a user's language setting to ensure LinkedIn.com displays in the language selected by the user in their settings

Expiry: Session

Type: HTTP

s_tp

Tracks percent of page viewed

Expiry: Session

Type: HTTP

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

Indicates the start of a session for Adobe Experience Cloud

Expiry: Session

Type: HTTP

s_pltp

Provides page name value (URL) for use by Adobe Analytics

Expiry: Session

Type: HTTP

s_tslv

Used to retain and fetch time since last visit in Adobe Analytics

Expiry: 6 Months

Type: HTTP

li_theme

Remembers a user's display preference/theme setting

Expiry: 6 Months

Type: HTTP

li_theme_set

Remembers which users have updated their display / theme preferences

Expiry: 6 Months

Type: HTTP

Reading list

Basics of Machine Learning

Machine Learning Lifecycle

Importance of Stats and EDA

Understanding Data

Probability

Exploring Continuous Variable

Exploring Categorical Variables

Missing Values and Outliers

Central Limit theorem

Bivariate Analysis Introduction

Continuous - Continuous Variables

Continuous Categorical

Categorical Categorical

Multivariate Analysis

Different tasks in Machine Learning

Build Your First Predictive Model

Evaluation Metrics

Preprocessing Data

Linear Models

KNN

Selecting the Right Model

Feature Selection Techniques

Decision Tree

Feature Engineering

Naive Bayes

Multiclass and Multilabel

Basics of Ensemble Techniques

Advance Ensemble Techniques

Hyperparameter Tuning

Support Vector Machine

Advance Dimensionality Reduction

Unsupervised Machine Learning Methods

Recommendation Engines

Improving ML models

Working with Large Datasets

Interpretability of Machine Learning Models

Automated Machine Learning

Model Deployment

Deploying ML Models

Embedded Devices

MADRaS: A Multi-Agent DRiving Simulator for Autonomous Driving Research

Introduction

More Details about this System

Multi-Agent Systems

Platooning

Pooling knowledge

Leveraging intent

But there are a few things to consider first..

Recalling the Why of our Problem

End Notes

Authors

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Analytics Vidhya (4)

brahmaid

csrftoken

Identityid

sessionid

Google (1)

g_state

Microsoft (7)

MUID

_clck

_clsk

SRM_I

SM

CLID

SRM_B

Google (7)

_gid

_ga_#