26th ITS World Congress, Singapore, 21-25 October 2019
Using Artificial Intelligence to improve Traffic Flow at Intersections
Markus Mauder (markus.mauder@siemens.com)1*, David Borst (david.borst@siemens.com) 1,
Florian Fanderl (florian.fanderl@siemens.com) 1, Evren Pamir (evren.pamir@siemens.com) 1,
Konrad Vowinckel (konrad.vowinckel@siemens.com) 1, Claus Beringer
(claus.beringer@siemens.com) 1
1. Siemens Mobility GmbH, Germany
Abstract
Congestion plagues modern cities every day, and with the number of vehicles projected to nearly
double by 2040, the problem will only grow. The necessity to share road resources between various
transportation modes (pedestrians, cyclists, public transit, cars) at intersections is a particularly strong
contributing factor for congestion. Existing solutions adapt signalized intersections to changing traffic
patterns based on pre-defined rule sets. These approaches suffer from an inability to anticipate and
handle the large space of traffic patterns in an exhaustive way. To solve this problem, we introduce an
approach employing an artificial intelligence capable of learning to adapt autonomously to changing
traffic conditions from short-term to long-term tendencies in real-world applications. We use
reinforcement learning on simulated and real intersections to explore the solution space and allow the
artificial intelligence to generalize and react adequately to new and previously unseen scenarios. This
flexibility allows our technique to reduce waiting times by up to 47% in simulations. The presented
system is currently being tested on four intersections in a German city of approximately 200,000
inhabitants.
Keywords:
Artificial Intelligence, Traffic Management, Intersection Optimization
Introduction
Cities are central to modern society, and more and more people move to cities every day. In 2018, the
global population of cities increased by two inhabitants per second. This trend takes a toll on the
mobility infrastructure in cities. The average speed is lower than it used to be and is projected to drop
even further. With the number of vehicles projected to almost double by 2040 [1], road capacity will
meet an impasse. Adding new capacity is often impossible or politically unacceptable in cities with
existing structures, making it ever more important to manage the existing capacity carefully.
Furthermore, the demand on existing road capacity is not static, so the available capacity must be
adapted constantly to meet it. Mobility strategies of individual cities impose further restrictions on the
acceptable signal programming. The combination of these factors confronts traffic management with
new and complex challenges. Addressing these challenges is crucial to maintaining cities as economic
differentiators and areas with a high quality of life.
Problem setting
Increasing traffic has been a problem for a long time. Traffic signal systems have been designed at
least in part to deal with increased traffic from the beginning. However, the problem is continuously
growing and current solutions cannot rise to the challenge. In this section we give an overview of the
technical setting of traffic management and give a first impression of why current systems fail to
address the new scale of traffic.
Signalized intersections are a means to improve road-safety and optimize the utilization of the road
capacity. To improve road-safety, authorities established a catalog of rules and regulations to be met by
the traffic control signal plan. These include, among others, minimum green times, inter-green times and
conflict matrices. All rules and regulations are hard-coded and evaluated within the local signal plan.
Once road-safety is established, it is the goal of the traffic control to optimize the utilization of the
road capacity given the restrictions imposed by the authorities.
Modern traffic signal systems (TSS) are made up of traffic signals, traffic sensors and controllers. The
traffic controller contains the control logic of the intersection (signal plan), which controls the traffic
signals based on time and readouts of the traffic sensors. Signals control one or more traffic streams
across the intersection. Optionally, the controller can be connected to a central Traffic Management
Centre.
The state of a TSS is given by detection readouts and the state of its signals. The TSS’s state
information can be used to understand and dynamically adapt the traffic light system to the
intersection’s current demand pattern. The controller’s programming can be modified by changing the
duration and the timing of a green phase. Ad hoc modifications of the signal plan (e.g. extension of
green time) based on the traffic control logic can be defined as local, intersection specific
optimization.
A further important aspect of traffic management on signalized intersections is to account for
non-local optimization potential. The TSS is part of a network of systems that influence each other.
Thus, there is a large potential for improvement in coordination of intersections to reach a global
optimum, rather than a local optimum at each intersection separately. This is called network adaptive
traffic optimization. A commonly known example of a simple network adaptation is a green wave, in
which intersections’ timings are coordinated in such a way that cars traveling in one direction between
intersections always reach an intersection during one direction’s green time.
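As a minimal sketch of the green-wave idea, the green start of each downstream intersection can be delayed by the travel time from the first intersection, taken modulo a common cycle time. The function name, the distances, the progression speed and the cycle time are illustrative assumptions and are not taken from the test area.

```python
def green_wave_offsets(distances_m, speed_mps, cycle_s):
    """Return the green-start offset (seconds) for each intersection.

    distances_m[i] is the distance from intersection i to intersection i + 1.
    The first intersection is the reference and gets offset 0.
    """
    if speed_mps <= 0 or cycle_s <= 0:
        raise ValueError("speed and cycle time must be positive")
    offsets = [0.0]
    travelled = 0.0
    for distance in distances_m:
        travelled += distance
        # A platoon leaving the reference at t = 0 arrives after travelled / speed.
        offsets.append((travelled / speed_mps) % cycle_s)
    return offsets


# Hypothetical corridor: three links, 12.5 m/s (45 km/h), 90 s cycle.
offsets = green_wave_offsets([300.0, 450.0, 250.0], speed_mps=12.5, cycle_s=90.0)
print(offsets)  # [0.0, 24.0, 60.0, 80.0]
```

Such offsets only favour one travel direction, which is why the text calls the green wave a simple form of network adaptation.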
Addressing the challenges posed by ever-growing traffic within this setting requires TSSs to be
re-programmed in a highly dynamic way. Systems attempting to reach this high level of reactivity
either require very high configuration effort or use insufficiently detailed abstraction of the real world.
In the following section we give an overview of existing solutions and outline how the proposed
system will go beyond the state of the art.
Related work
The most straight-forward way of planning an intersection is to draft static signal plans, which assign
green time to different intersections at pre-defined start times and for pre-defined durations. Some
shortcomings of this technique are that signal plans need to be updated to account for long-term
changes in the environment, each intersection (though connected to others by roads) operates in
isolation and the signal plan stays constant even if traffic fluctuates.
Within a given signal plan, short term fluctuations in traffic can be handled by local rule-based traffic
actuation [2]. This technique allows changing green durations and phases based on local detection
data.
Rules are defined by a human engineer and react to anticipated traffic patterns by temporarily
modifying signal timings for the current cycle. While it affords some reaction to fluctuating traffic,
this technique still has some limitations: (1) deterministic programming is limited to the human
capacity to anticipate and solve problems and (2) the planning effort of an intersection with local
traffic actuation is higher.
Even with local traffic actuation a signal plan is limited by its relatively static nature. There is no
reaction to changing traffic patterns beyond the very short term. A straight-forward extension is to
have a set of pre-defined signal plans which get selected based on pre-defined criteria. Based on the
observation that traffic fluctuates periodically, a simple way to choose a signal plan is based on the
current time. This allows the intersection to react to changes in traffic over the course of multiple
hours. However, this technique still does not address the intersection’s role in the traffic network.
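Time-based plan selection as described above can be sketched in a few lines. The schedule, the plan names and the function name are hypothetical; a real controller would load them from its configuration.

```python
from datetime import time

# Hypothetical schedule: (start time, plan name), sorted by start time.
SCHEDULE = [
    (time(0, 0), "night"),
    (time(6, 30), "morning_peak"),
    (time(9, 30), "daytime"),
    (time(15, 30), "evening_peak"),
    (time(19, 0), "night"),
]


def select_plan(now, schedule=SCHEDULE):
    """Return the plan whose start time is the latest one not after `now`."""
    active = schedule[0][1]
    for start, plan in schedule:
        if start <= now:
            active = plan
        else:
            break  # schedule is sorted, later entries cannot match
    return active


print(select_plan(time(7, 45)))  # morning_peak
```

The selection depends only on the clock, so an unusual demand pattern at 7:45 still gets the morning peak plan. This is the limitation the following paragraphs address.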
Any change to an intersection’s signal timing has consequences for intersections downstream from it.
To improve the network’s overall capacity, intersections’ signal timings must be coordinated. In its
simplest form, signal plans are designed together to make sure their timings agree.
By choosing this approach over locally adaptive signal plans or changing signal plans, the system
loses the ability to react to changing traffic patterns. Systems designed to coordinate between
intersections while being adaptive at the same time (network adaptive) exist (e.g., SCOOT [4], SCATS
[5]). Their ability to handle both worlds comes at the price of much greater complexity. The downsides
of this approach are (1) high configuration effort, (2) a need to abstract out some details about the
intersection state from the model to keep complexity manageable, (3) no ability to adapt to changes
that lie outside its programming and (4) long iteration cycles for updates due to the need to incorporate
human planning.
To address the downsides inherent in human-made abstract representations of the real world, artificial
intelligence techniques can be used [3]. The promise of this technology is its ability to learn to form its
own representations and complex rules to optimally react to the real-world environment.
Unfortunately (to the authors’ knowledge) no artificial intelligence system has been incorporated into
large scale deployment yet. One reason is that these techniques often rely on the availability of large
amounts of information about the controlled system [7,8,9,10,11,12]. For example, a comprehensive
study validated MARLIN-ATSC [6] in a large simulation, but it relied on detailed knowledge about
the environment that is not commonly available in existing real-world infrastructure.
In this paper we propose a network adaptive artificial intelligence traffic control system
designed to be used in real-world applications and currently under test in a German city of
approximately 200,000 inhabitants.
Proposed solution
In this paper we describe an AI-driven traffic management system for networks of signalized
intersections. Our goal is to create an intelligent, self-learning traffic actuation, which continuously
adjusts according to changing traffic patterns by utilizing more data and artificial intelligence. The
primary focus is to propose a system that is ready for use in existing infrastructure. To achieve this
goal, the system must be able to operate using infrastructure which is already installed at intersections
and it must do so while providing a continuously improving signal plan.
The system can be deployed using existing infrastructure requiring only a minimum of installed
hardware. If at least one detector per lane is available, the system can base its decisions on actual
measurements, otherwise assumptions must be made to estimate required values. As a result, the
system can be deployed on most inner-city intersections without hardware changes, saving installation
and hardware costs and allowing the system to be applied to a wide range of intersections. In addition
to detection data, information about the currently active signal plan, as well as invariant information
about intersection topology, like the distance between a sensor and its corresponding stop line, are
used. The system can be deployed as a cloud-based service, which uses network interfaces provided by
the existing infrastructure. To address the shortcomings of existing solutions, the system has to
reduce engineering effort and costs significantly, improve performance by reducing abstraction of the
real world, be self-learning and thereby able to adapt on all relevant time horizons, and always be
up to date by running the self-learning as an automated process.
However, before we address each of these aspects as attributes of our proposed solution, we must
establish that the system complies with all rules and regulations in order to guarantee road-safety. To
be able to guarantee the required level of safety our system is divided into two layers. One layer is the
traffic controller, which is already deployed as part of every traffic signal system, the other is a
cloud-based system. On the controller level all rules and regulations are hard-coded and isolated from
the other components of the proposed solution. This guarantees the compliance with road-safety, while
maintaining the AI’s ability to improve traffic. Improvements that the cloud layer submits are
additionally checked regarding road-safety compliance by the first layer. This makes sure the
environment is always in a safe state.
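The two-layer split can be illustrated with a small controller-side gate that accepts a proposed change only if it passes fixed checks. This is a sketch under simplifying assumptions: only minimum green times and an unchanged cycle time are checked, a single inter-green value applies after every phase, and all names (`SafetyLimits`, `is_safe`, `apply_if_safe`) are hypothetical. A real controller would also evaluate conflict matrices and further regulations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyLimits:
    min_green_s: dict   # minimum green per phase, e.g. {"A": 10}
    intergreen_s: int   # clearance time assumed after every phase
    cycle_s: int        # fixed cycle time


def is_safe(splits_s, limits):
    """Accept a set of green durations only if every phase keeps its
    minimum green and the total cycle time is unchanged."""
    if set(splits_s) != set(limits.min_green_s):
        return False
    if any(splits_s[p] < limits.min_green_s[p] for p in splits_s):
        return False
    total = sum(splits_s.values()) + limits.intergreen_s * len(splits_s)
    return total == limits.cycle_s


def apply_if_safe(current, proposed, limits):
    """Return the proposal when it passes the check, else keep the current plan."""
    return proposed if is_safe(proposed, limits) else current


limits = SafetyLimits({"A": 10, "B": 8, "C": 6}, intergreen_s=5, cycle_s=90)
current = {"A": 35, "B": 25, "C": 15}
print(apply_if_safe(current, {"A": 40, "B": 22, "C": 13}, limits))  # accepted
print(apply_if_safe(current, {"A": 62, "B": 8, "C": 5}, limits))    # rejected, C below minimum
```

Because the gate lives on the controller, a faulty or unreachable cloud layer can at worst leave the current plan in place.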
With these safety properties established, the system reduces engineering effort by optimizing a
reward function, which is correlated with traffic engineering measures and includes constraints that
force it to construct a signal program with expected properties. The function is built such that
optimizing it improves established traffic performance indicators like average waiting time,
intersection capacity or traffic flow. In addition, it guarantees properties like making sure that all
travellers will get the right of way in a timely manner. This process is a purely data driven one,
allowing the system to optimize traffic control with only very limited initialization costs. By using a
responsive artificial intelligence approach, the system learns to dynamically adapt to arising traffic
situations and demand patterns, reducing the configuration efforts of a traffic engineer to the creation
of an initial frame signal plan. Given data availability the system will be able to seamlessly switch
between multiple mathematical optimization functions (reward functions, e.g., reduction of waiting
time or public transport prioritization), always using the previously generated data set as pre-training.
In existing solutions this process includes extensive manual reconfiguration efforts from traffic
engineers.
To improve performance of the overall system, the strengths of artificial intelligence are utilized to
recognize more granular traffic patterns and predict the future state of the system. By learning to
anticipate detection patterns based on data from local and neighbouring intersections’ detection, the
system can adapt the signal timing of the current phase or trigger a different phase in response to
short-term traffic variations. By considering historic detection data the system can anticipate the
overall pattern of traffic and can modify the current signal program to match the projected traffic
patterns.
The system caters to the need and environment of each intersection individually. By learning to adapt
to arising traffic situations, the system implicitly learns the necessary information about the
intersection’s environment and continuously adapts its model to keep up with the environment’s state.
This allows the system to use the capabilities of each intersection fully.
In addition, the system uses the state of neighbouring intersections to find a system optimum, which
achieves both local and global optimization criteria. This allows it to achieve a much higher global
optimization than would be possible given only local state information. This makes it possible for
most vehicles to pass a network unencumbered by red lights.
Its responsive artificial intelligence approach allows the system to adjust the traffic control policy to
account for more dramatic traffic changes. Thus, the system becomes self-learning. Through the
application of pattern recognition, the system learns to generalize from observed situations to new
ones, so the system will always learn to adapt after new situations arise. This mitigates the need for
expensive re-planning of traffic light system software and allows the system to respond much quicker
to new long-term trends than would otherwise be possible. This allows the system to be always
up-to-date with current traffic.
By continuously adapting to new traffic patterns based on recent trends, the system also stays relevant
in the face of long-term variation in traffic patterns. Automatic adjustments to the traffic control
strategy result in an always up-to-date system. During operation, the system builds up knowledge of a
wide range of traffic conditions and learns strategies to optimally respond to them. This knowledge
can be shared among intersections to more quickly and more robustly learn to adapt to new
environments and traffic patterns. This allows the system to use optimization potential on all scales.
These properties, coupled with a technical ability to integrate with existing infrastructure through
cloud-based services, allow the proposed system to be used in the real world and move beyond
simulation-based proof-of-concept.
Training
Figure 1 – Reinforcement Learning
The proposed system uses a reinforcement learning approach. In reinforcement learning (RL) a
software agent is confronted with an environment with which it can interact by receiving observations
and sending actions. The agent’s goal is to maximize the cumulative value of its built-in reward
function, so it chooses the actions for which its internal model projects the highest cumulative reward.
See Figure 1 for an illustration.
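The observe, act and reward loop can be made concrete with a toy example. The paper does not state which RL algorithm the system uses; tabular Q-learning is used here only because it is compact. The environment (`ToyIntersection`, two queues of which one is served per step), the reward (negative number of waiting vehicles) and all parameter values are assumptions made for illustration.

```python
import random


class ToyIntersection:
    """Hypothetical environment: two queues, one of which is served per step."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.queues = [0, 0]

    def observe(self):
        # Cap observed queue lengths so the state space stays small.
        return (min(self.queues[0], 5), min(self.queues[1], 5))

    def step(self, action):
        # Serve up to 3 vehicles on the chosen approach, then add arrivals.
        self.queues[action] = max(0, self.queues[action] - 3)
        self.queues[0] += self.rng.randint(0, 2)
        self.queues[1] += self.rng.randint(0, 1)
        reward = -sum(self.queues)  # fewer waiting vehicles is better
        return self.observe(), reward


def train(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    env = ToyIntersection(seed)
    q = {}  # maps (state, action) to the estimated cumulative reward
    state = env.observe()
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice([0, 1])  # explore
        else:
            action = max([0, 1], key=lambda a: q.get((state, a), 0.0))  # exploit
        next_state, reward = env.step(action)
        best_next = max(q.get((next_state, a), 0.0) for a in [0, 1])
        old = q.get((state, action), 0.0)
        # Q-learning update towards reward plus discounted future value.
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
        state = next_state
    return q


q_table = train()
```

The table `q_table` plays the role of the agent's internal model: for each observed state it stores the projected cumulative reward of each action.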
In the presented system, the agent is a cloud-based artificial intelligence system, which interfaces with
the intersections’ controllers (environment). Its observations consist of the state of all intersections
accumulated over a signal cycle. This includes second by second detector readouts and signal group
states for each intersection, as well as state information like the currently active signal timing
modifications. The available actions correspond to changes to the signal plan timings that the agent
chooses to make, which are modifications of the signal start (offset) and duration (split), which
maintain the current cycle time. The reward is a function measuring the quality of the current situation
to allow the agent to gauge the success of its choice of actions. The reward correlates detection data
with the detector’s distance to the corresponding stop line to calculate a value that corresponds with
traffic engineering measures like waiting time (see Section Proposed solution). Figure 2 shows the negative correlation
between the reward and cumulated waiting times in a simulation-based evaluation. In this example,
maximizing the reward function minimizes waiting time.
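The action space described above, changes to split and offset that leave the cycle time untouched, can be sketched as two small helpers. The function names and the example plan are hypothetical. The sketch assumes that green durations are stored per phase in seconds and that the offset is measured from the start of the cycle.

```python
def shift_split(splits_s, donor, receiver, delta_s):
    """Move delta_s seconds of green from `donor` to `receiver`.

    The sum of all splits, and therefore the cycle time, is unchanged.
    """
    if delta_s < 0:
        raise ValueError("delta_s must be non-negative")
    if splits_s[donor] < delta_s:
        raise ValueError("donor phase has too little green time")
    updated = dict(splits_s)  # leave the caller's plan untouched
    updated[donor] -= delta_s
    updated[receiver] += delta_s
    return updated


def shift_offset(offset_s, delta_s, cycle_s):
    """Shift the plan start within the cycle; the result wraps modulo cycle_s."""
    return (offset_s + delta_s) % cycle_s


plan = {"A": 40, "B": 30, "C": 20}
print(shift_split(plan, donor="C", receiver="A", delta_s=5))  # {'A': 45, 'B': 30, 'C': 15}
print(shift_offset(85, 10, cycle_s=90))                        # 5
```

Restricting actions to transfers between phases keeps every candidate plan compatible with a coordinated network that relies on a common cycle time.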
Figure 2 - Correlation of reward and waiting time
After deployment the system initially starts by observing the current traffic on the intersection. This
allows it to establish a baseline of traffic variation in its environment. By injecting variation into the
signal plan timings, the system generates training data that allows it to tune its model of the effect of
an action given an environment state. The data being collected during this phase allows the system to
start projecting useful actions in a new environment state. By carefully building up a set of
action-observation-mappings the system can re-train its model periodically to stay up to date with
medium-term and long-term traffic variation.
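The data collection phase can be sketched as a loop that applies a small random variation to the signal plan in each cycle and records what happened. Everything here is an assumption for illustration: the callbacks `observe` and `measure_reward` stand in for the real controller interface, the variation is a bounded transfer of green time between two phases, and the actual procedure is not disclosed in this paper.

```python
import random


def vary_splits(base_splits, max_delta_s, rng):
    """Move a small random amount of green between two random phases."""
    donor, receiver = rng.sample(sorted(base_splits), 2)
    # Never take more green than the donor phase has.
    delta = min(rng.randint(1, max_delta_s), base_splits[donor])
    varied = dict(base_splits)
    varied[donor] -= delta
    varied[receiver] += delta
    return varied


def collect(base_splits, observe, measure_reward, cycles, max_delta_s=3, seed=0):
    """Return one (state, action, reward) record per signal cycle."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(cycles):
        state = observe()                     # environment state before acting
        action = vary_splits(base_splits, max_delta_s, rng)
        reward = measure_reward(action)       # outcome of the varied cycle
        dataset.append((state, action, reward))
    return dataset
```

A model re-trained periodically on such records can estimate the effect of an action in a given state, which is the mapping the paragraph above describes.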
The process of training the system can be expedited when a simulation of the environment is available.
By allowing the agent to explore its environment much more aggressively, the artificial intelligence
sees the reaction of the system to extreme changes in the signal plan (bootstrapping). This allows
much faster training, but such aggressive exploration is infeasible on real-world intersections. However,
where we have access to a simulation of a real-world test bed, this technique can be used to give the
artificial intelligence a head-start over an untrained, passive agent. A patent about the detailed process
is currently pending.
Evaluation
The system’s test area is located in a German city of approximately 200,000 inhabitants and consists
of four adjacent intersections. Our system was first tested in a simulation based on the test area and is
currently deployed to a cloud system controlling the actual real-world test area.
Figure 3 - Example of real vs generated traffic detector patterns. The x-axis corresponds to the time of day,
while the y-axis shows mean car count per cycle.
Simulation
Our approach shows promise for increasing intersections’ performance. We demonstrated this using a
simulation based on a real test area.
The simulation was specifically designed to ensure the system’s ability to generalize over different
real-world traffic scenarios. To create traffic scenarios that resemble the real world, the test area was
modelled in the SUMO traffic simulator. Traffic patterns were based on a statistical analysis of multiple
months of historical real-world traffic patterns. The simulated probabilistic traffic patterns vary within
realistic boundaries established by this analysis and are used to ensure varying inputs for the artificial
intelligence (see Figure 3 for an example). By exposing the artificial intelligence to unique traffic
scenarios, the trained artificial intelligence can learn to generalise.
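One way to generate varying but realistic demand is to sample each time interval around its historical mean and clip the result to bounds derived from the historical spread. The normal distribution, the clipping at two standard deviations and the function name are assumptions; the paper only states that patterns vary within boundaries established by the statistical analysis.

```python
import random


def sample_day(mean_counts, std_counts, seed=None, max_sigma=2.0):
    """Draw one synthetic day of per-interval car counts.

    Each value is sampled from a normal distribution around the historical
    mean and clipped to mean +/- max_sigma * std, with a lower bound of 0.
    """
    rng = random.Random(seed)
    day = []
    for mean, std in zip(mean_counts, std_counts):
        low = max(0.0, mean - max_sigma * std)
        high = mean + max_sigma * std
        value = min(high, max(low, rng.gauss(mean, std)))
        day.append(round(value))
    return day


# Hypothetical historical statistics for five intervals of one detector.
means = [2.0, 8.0, 15.0, 9.0, 3.0]
stds = [1.0, 2.0, 3.0, 2.0, 1.0]
print(sample_day(means, stds, seed=1))
```

Calling `sample_day` with different seeds yields distinct days that share the historical shape, which gives the agent varied inputs to generalize from.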
As discussed in Section Training, the system starts by observing (simulated) traffic situations. After observing traffic situations with
sufficient variance to build a first model, the artificial intelligence is frozen and allowed to modify the
signal plans according to its expectations of which action improves traffic. The resulting traffic data is
subject to a greater degree of variance and is used as additional training material for a further batch of
training.
Figure 4 - Waiting times of baseline and agent in simulation. The x-axis corresponds to seconds of waiting
time per vehicle.
We ran eight simulation runs, each covering 4.5 hours of unique traffic with approximately 9000 cars
and each using the same pre-trained agent (after initial observation and a first round of training based on
data from actively influenced traffic using bootstrapping). Across these runs, the agent achieved an average
reduction of waiting time of approximately 47% (see Figure 4).
Real world
In contrast to other traffic control scenarios described in the literature, our approach has been applied to
real-world intersections. The technical realization of our system is in place and works as described
above. Each intersection’s controller is connected to our cloud-based infrastructure and forwards all
detection events and signal changes to the cloud-based agent. The agent aggregates this information
and calculates its current reward. It uses this information to generate a small modification of the signal
plan and sends the change back down to the controller on the intersection, which applies the change to
its internal timings, if (and only if) the change does not conflict with its internal safeguards.
To increase the variability in the training data, the agent is allowed to make small changes to the signal
plan in accordance with local safety regulations and limited to a small effect size. This, too, has been
demonstrated to work and the agent was successfully trained on the resulting batch of observation data.
A comprehensive study of the achieved improvements is currently being conducted.
Conclusion
Traffic light systems are the prevalent means to increase throughput at intersections in cities. However,
there remains a large optimization potential by continuously adapting the resources assigned to each
road user based on the current demand. To better utilize signalized intersections’ available capacity, we
have created a self-learning dynamic artificial intelligence system for use in real world applications.
In contrast to current traffic systems, our proposed system reduces engineering effort, improves
performance, is self-learning and always stays up to date. It responds to current traffic patterns,
considers information from neighbouring intersections, is trained to take a specific intersection’s
environment into account, complies with local regulations and safety regulations, considers short-term,
medium-term and long-term changes in traffic and uses pattern recognition to react to common traffic
patterns.
The presented system is currently being tested in a real-world environment to conclusively
demonstrate its ability to improve traffic. At the same time, it is under active development to increase
its ability to reason over traffic patterns and identify advanced responses to the detected traffic
situation.
By basing the system on data and making it fully automatic, we have laid the groundwork to add
further sources of data. Emerging trends such as floating car data, car-to-infrastructure communication
and communication with autonomous vehicles will not only be trivial to include in our proposed
system but will strengthen it by increasing its ability to reason and to react to changing traffic. Existing
systems will struggle to incorporate the new data while responding to increases in traffic complexity,
requiring at least large efforts of engineering, or turning out to be too complex to handle within the
confines of existing techniques. This inherent flexibility and scalability allow the system to evolve and
become more powerful as the world is struggling with steady increases in vehicles.
References
1. Smith, M.N. (2016), The number of cars will double worldwide by 2040, Bernstein,
https://www.businessinsider.de/global-transport-use-will-double-by-2040-as-china-and-india-gdp-balloon
2. Salter R.J. (1976) Vehicle-actuated signal facilities, Highway Traffic Analysis and Design.
Palgrave, London
3. Mannion, P. and Duggan, J. and Howley, E. (2016), An Experimental Review of Reinforcement
Learning Algorithms for Adaptive Traffic Signal Control, Springer.
4. Hunt, P. B. and Robertson, D. I. and Bretherton, R. D. and Winton, R. I. (1981), SCOOT—A
traffic responsive method of coordinating signals, Transp. Road Res. Lab., Crowthorne, U.K.,
Technical Report.
5. Sims, A. G. and Dobinson, K. W. (1979), SCAT—The Sydney co-ordinated adaptive traffic system:
Philosophy and benefits, Int. Symp. Traffic Control Systems, Berkeley, CA, USA.
6. El-Tantawy, S. and Abdulhai, B. and Abdelgawad, H. (2013), Multiagent Reinforcement Learning
for Integrated Network of Adaptive Traffic Signal Controllers (MARLIN-ATSC): Methodology and
Large-Scale Application on Downtown Toronto, IEEE Transactions on Intelligent Transportation
Systems, Vol. 14, No. 3.
7. Genders, W., & Razavi, S. (2016). Using a deep reinforcement learning agent for traffic signal
control. arXiv preprint arXiv:1611.01142.
8. van der Pol, E. (2016). Deep reinforcement learning for coordination in traffic light control. Master's
thesis, University of Amsterdam.
9. Taale, H., Bäck, T., Preuss, M., Eiben, A. E., De Graaf, J. M., & Schippers, C. A. (1998, September).
Optimizing traffic light controllers by means of evolutionary algorithms. In Proceedings of the 6th
European Congress on Intelligent Techniques and Soft Computing (Vol. 3, pp. 1730-1734).
10. Mousavi, S. S., Schukat, M., & Howley, E. (2017). Traffic light control using deep policy-gradient
and value-function-based reinforcement learning. IET Intelligent Transport Systems, 11(7),
417-423.
11. Abdulhai, B., Pringle, R., & Karakoulas, G. J. (2003). Reinforcement learning for true adaptive
traffic signal control. Journal of Transportation Engineering, 129(3), 278-285.
12. Jin, J., & Ma, X. (2015). Adaptive group-based signal control by reinforcement
learning. Transportation Research Procedia, 10, 207-216.