VissimRL is a multi-agent reinforcement learning framework for traffic signal control built on the PTV Vissim traffic simulator. By encapsulating the complex Vissim COM interface, the framework provides researchers with an efficient and standardized development environment. VissimRL simplifies the construction of reinforcement learning–based traffic signal control (RL-TSC) simulations and supports a variety of commonly used signal control strategies and evaluation metrics.
The method design and experimental details of this project are described in the following paper:
[1] H.-C. Chang, S.-Y. Huang, Y.-C. Chen, and I.-C. Wu, “VissimRL: A Multi-Agent Reinforcement Learning Framework for Traffic Signal Control Based on Vissim,” arXiv preprint arXiv:2601.18284, 2026. doi: 10.48550/arXiv.2601.18284.
The basic installation procedure for VissimRL is described below.
It is recommended to create an isolated Python environment using Conda (tested with Python 3.9):

    conda create -n vissim_rl python=3.9 -y
    conda activate vissim_rl

All required Python dependencies are specified in pyproject.toml. Install the project using:

    pip install -e .

The com_lib directory contains statically generated Python wrappers for the Vissim COM interface. Its primary purpose is to improve development efficiency and runtime performance.
The makepy tool generates static Python wrappers for a specified COM type library, including detailed information about all available properties, methods, and events. This allows Python to interact with COM objects in a manner similar to early binding, providing better performance and improved IDE support.
Note: If a Vissim version update changes the COM GUIDs, re-run the following command:

    python -m win32com.client.makepy "Vissim.Vissim.25"

Then replace vissim_api.py in the com_lib directory with the newly generated wrapper file.
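
As a rough illustration of early versus late binding in pywin32, the sketch below builds the makepy command for an arbitrary Vissim version and connects to Vissim in either binding mode. The helper names makepy_command and connect are hypothetical and not part of VissimRL. The connect helper assumes Windows with pywin32 and a licensed Vissim installation, and it does not reflect how VissimRL itself loads vissim_api.py.

```python
import sys


def makepy_command(vissim_version: int) -> list:
    """Build the makepy invocation for a given Vissim major version."""
    prog_id = f"Vissim.Vissim.{vissim_version}"
    return [sys.executable, "-m", "win32com.client.makepy", prog_id]


def connect(vissim_version: int, early_binding: bool = True):
    """Return a Vissim COM object (hypothetical helper, Windows only)."""
    # Imported lazily so this module can still be imported without pywin32.
    import win32com.client

    prog_id = f"Vissim.Vissim.{vissim_version}"
    if early_binding:
        # Uses makepy-generated wrappers, generating them on demand if missing.
        return win32com.client.gencache.EnsureDispatch(prog_id)
    # Late binding: attributes are resolved dynamically at call time.
    return win32com.client.dynamic.Dispatch(prog_id)


if __name__ == "__main__":
    # Print the command that would regenerate the wrappers for version 25.
    print(" ".join(makepy_command(25)))
```

With early binding, property and method names are known ahead of time, which is what enables IDE completion and avoids per-call name lookups.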
The Vissim wrapper layer provides an abstraction for efficient interaction with the Vissim simulator. This module encapsulates the Vissim COM interface and simplifies simulation operations, making the development of RL-TSC environments more intuitive.
- base: Provides core wrapper classes for simulation initialization and network loading, enabling efficient interaction with the Vissim simulator and supporting the fundamental simulation functions of VissimRL.
- signal_control: Provides utilities for traffic signal control, including phase construction and phase modification. Flexible phase configuration enables effective management of intersection signal operations.
- evaluation: Provides a set of evaluation tools for computing and analyzing simulation metrics commonly used in RL-TSC, such as waiting time, travel time, and queue length.
This module implements a multi-agent (multi-intersection) environment that serves as the interaction bridge between the Vissim simulator and RL agents. The environment follows mainstream multi-agent reinforcement learning (MARL) design paradigms, such as the Gymnasium and PettingZoo interfaces, and supports scalable multi-intersection scenarios.
- env: Defines the core environment configuration and interfaces, including simulation parameter setup and initialization. This module implements standard RL environment functions such as reset and step, supporting both single-intersection and multi-intersection training.
- traffic_signal: Implements traffic signal operation logic for each intersection. This module manages signal states and maintains relevant signal information (e.g., phase timing and performance metrics), working closely with the observations, actions, and rewards modules to enable flexible and optimized signal control.
- observations: Defines the observation representations received by RL agents. Multiple observation designs are supported, including local (single-intersection) and global (network-level) observations. Custom observation functions can be implemented by inheriting from the base class.
- actions: Defines traffic signal control actions and applies them through the TrafficSignal interface to dynamically adjust signal parameters. Custom action designs can be implemented by inheriting from the base class. Currently supported control methods include:
  - FixedTimeControl: Executes the predefined signal timing plan in Vissim.
  - ChooseNextPhase: Selects the next signal phase.
  - SwitchNextOrNot: Determines whether to switch to the next phase.
  - SetPhaseDuration: Predicts the duration of the next phase.
  - AdjustNextPhase: Adjusts the duration of the next phase.
  - IncrementalAdjustNextPhase: Incrementally adjusts the next phase duration over time.
- rewards: Implements flexible reward functions for evaluating agent performance based on various traffic efficiency metrics. Reward weights can be configured, and custom reward functions can be implemented by inheriting from the base class.
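
To make the idea of configurable reward weights and inheritance-based customization concrete, here is a minimal sketch. RewardFunctionBase is a hypothetical stand-in, since the actual base class name and call signature in vissim_rl.environment.rewards are not shown here. The metric names queue_length and waiting_time are likewise illustrative assumptions.

```python
from abc import ABC, abstractmethod
from typing import Dict


class RewardFunctionBase(ABC):
    """Hypothetical stand-in for the framework's reward base class."""

    @abstractmethod
    def __call__(self, metrics: Dict[str, float]) -> float:
        ...


class WeightedPenaltyReward(RewardFunctionBase):
    """Negative weighted sum of congestion metrics (less congestion, higher reward)."""

    def __init__(self, weights: Dict[str, float]):
        self.weights = dict(weights)

    def __call__(self, metrics: Dict[str, float]) -> float:
        # Metrics missing from the input contribute zero to the penalty.
        return -sum(w * metrics.get(name, 0.0) for name, w in self.weights.items())


# Assumed metric names; the weights trade off queue length against waiting time.
reward_fn = WeightedPenaltyReward({"queue_length": 1.0, "waiting_time": 0.5})
reward = reward_fn({"queue_length": 4.0, "waiting_time": 10.0})
print(reward)  # -(1.0 * 4.0 + 0.5 * 10.0) = -9.0
```

Keeping the weights in a dictionary lets the same class express different objectives through configuration alone, without adding a new subclass for each one.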
This directory contains helper modules that improve operational convenience and support feature extension.
- gui_controlor: Extends GUI management functionality for Vissim, allowing users to conveniently control the visibility of the Vissim main window.
- detector_builder: Automatically creates vehicle detectors in Vissim based on the provided network and detector configuration. The detectors enable traffic-state observation and subsequent performance evaluation in simulation, and can be used to emulate real-world roadside sensing setups (e.g., traffic cameras). Detector IDs are assigned to vehicles via User Defined Attributes (UDA) for downstream vehicle-level statistics and performance analysis. The module outputs: (1) an updated network file (.inpx) containing the detectors, (2) a detector configuration file (.json), and (3) a visualization layout (.layx) for verification and debugging.
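
The sketch below illustrates how a detector ID attached to each vehicle could support downstream vehicle-level statistics. It is an assumption-laden example, not the detector_builder implementation: the VehicleRecord type, its field names, and the 5 km/h queue threshold are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class VehicleRecord:
    """Hypothetical snapshot of one vehicle at a simulation step."""
    vehicle_id: int
    detector_id: Optional[int]  # detector UDA value; None if outside all detectors
    speed_kmh: float


def per_detector_stats(vehicles: Iterable[VehicleRecord],
                       stop_speed_kmh: float = 5.0) -> Dict[int, Dict[str, int]]:
    """Count all vehicles and slow (queued) vehicles inside each detector."""
    stats: Dict[int, Dict[str, int]] = defaultdict(lambda: {"count": 0, "queued": 0})
    for v in vehicles:
        if v.detector_id is None:
            continue  # vehicle is not covered by any detector
        stats[v.detector_id]["count"] += 1
        if v.speed_kmh < stop_speed_kmh:  # assumed threshold for a queued vehicle
            stats[v.detector_id]["queued"] += 1
    return dict(stats)


snapshot = [
    VehicleRecord(1, detector_id=10, speed_kmh=0.0),
    VehicleRecord(2, detector_id=10, speed_kmh=32.0),
    VehicleRecord(3, detector_id=11, speed_kmh=3.5),
    VehicleRecord(4, detector_id=None, speed_kmh=50.0),
]
print(per_detector_stats(snapshot))
# {10: {'count': 2, 'queued': 1}, 11: {'count': 1, 'queued': 1}}
```

Vehicles outside every detector are ignored, which mirrors the limited field of view of a real roadside sensor.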
This directory provides synthetic traffic networks used in the experiments, including a simple single-intersection network (Single-Intersection) and a three-intersection arterial network (Arterial-3), along with their corresponding network structures, signal settings, and vehicle detector configurations.
For single-intersection scenarios, the Gymnasium environment can be used (see the Gymnasium API).
Refer to examples/Gymnasium_test.py for a detailed example.

    import gymnasium as gym
    from vissim_rl.environment.observations import DefaultObservationFunction
    from vissim_rl.environment.actions import ChooseNextPhase
    from vissim_rl.environment.rewards import DefaultRewardFunction

    # Create the single-intersection environment.
    env = gym.make('vissim-rl-v0',
                   net_path='path_to_your_network.inpx',
                   sig_path='path_to_your_signal_folder',
                   detector_path='path_to_your_detector_info.json',
                   use_gui=True,
                   start_time=600,
                   sim_period=4200,
                   observation_config={'class': DefaultObservationFunction},
                   action_config={'class': ChooseNextPhase, 'args': {'delta_time': 5}},
                   reward_config={'class': DefaultRewardFunction},
                   )

    obs, infos = env.reset()
    while True:
        # Sample a random action; replace this with a trained policy.
        actions = env.action_space.sample()
        next_obs, reward, termination, truncation, info = env.step(actions)
        if termination or truncation:
            break

For multi-intersection traffic signal control, the PettingZoo environment is recommended (see the PettingZoo API).
Refer to examples/PettingZoo_test.py for a detailed example.

    from vissim_rl import parallel_env
    from vissim_rl.environment.observations import DefaultObservationFunction
    from vissim_rl.environment.actions import ChooseNextPhase
    from vissim_rl.environment.rewards import DefaultRewardFunction

    # Create the multi-intersection (parallel) environment.
    env = parallel_env(net_path='path_to_your_network.inpx',
                       sig_path='path_to_your_signal_folder',
                       detector_path='path_to_your_detector_info.json',
                       use_gui=True,
                       start_time=600,
                       sim_period=4200,
                       observation_config={'class': DefaultObservationFunction},
                       action_config={'class': ChooseNextPhase, 'args': {'delta_time': 5}},
                       reward_config={'class': DefaultRewardFunction},
                       )

    obs, infos = env.reset()
    # env.agents becomes empty once every agent has terminated or been truncated.
    while env.agents:
        # One random action per active agent; replace this with trained policies.
        actions = {agent: env.action_space(agent).sample() for agent in env.agents}
        next_obs, rewards, terminations, truncations, infos = env.step(actions)

If you use this project in your research or related work, please cite the following paper:

    @article{chang2026vissimrl,
      title   = {VissimRL: A Multi-Agent Reinforcement Learning Framework for Traffic Signal Control Based on Vissim},
      author  = {Chang, Hsiao-Chuan and Huang, Sheng-You and Chen, Yen-Chi and Wu, I-Chen},
      journal = {arXiv preprint arXiv:2601.18284},
      year    = {2026}
    }

This project contains source code only and does not include the PTV Vissim software or its associated licenses.
Users are responsible for ensuring that they have legally obtained and comply with the licensing terms of PTV Vissim when conducting simulations or experiments.
The source code of this project is released under the MIT License. See the LICENSE file for details.