POMDP cat-and-mouse PettingZoo grid world with recurrent MARL baselines, diagnostics, and demo assets.
Updated Mar 21, 2026 - Python
Reinforcement learning (RL) methods and techniques.
Safety challenges that test the ability of RL and LLM agents to learn and use biologically and economically aligned utility functions. The benchmarks are implemented in a gridworld-based environment. The environments are kept relatively simple: only as much complexity is added as is necessary to illustrate the relevant safety and performance aspects.
Simple Grid Environment for Gymnasium
An extended, multi-agent, and multi-objective (MaMoRL / MoMaRL) framework for building gridworld environments, based on DeepMind's AI Safety Gridworlds. It is a suite of reinforcement learning environments illustrating various safety properties of intelligent agents, and it is compatible with OpenAI's Gym/Gymnasium and Farama Foundation PettingZoo.
Enables you to convert a PettingZoo environment to a Gym environment while supporting multiple agents (MARL). Gym's default setup doesn't easily support multi-agent environments, but this wrapper resolves that by running each agent in its own process and sharing the environment across those processes.
A demo that implements Q-learning (reinforcement learning) in Python and visualizes the learned behavior on a 4×4 grid as a heatmap.
A modular, extensible, entity-component-system (ECS) gridworld environment
Python package for fast shortest-path computation on 2D polygon or grid maps
Causal-AIRL: MSc research code and interactive demo. Reports a 23-percentage-point gain in cross-style policy agreement via latent Z deconfounding. MSc Data Science @ Edinburgh 2024-25.
A reinforcement learning project implementing a Deep Q-Network agent that learns goal-oriented navigation in a custom grid environment, with policy evaluation, visualization, and analytics.
Research-grade reinforcement learning framework for single-agent and multi-agent warehouse navigation using Deep Q-Networks (DQN), PyTorch, a replay buffer, target networks, logging, and a full test suite. Built for PhD-level RL and autonomous systems research.
Simulation of autonomous ship navigation in a gridworld.
Experimental AlphaZero-style RL agent for optimizing strategies in the StarCraft II Arcade map 'New Random Tower Defense'.
An implementation of Value Iteration and Policy Iteration to solve a stochastic, grid-based Markov Decision Process (MDP), using the Gridworld environment.
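For readers unfamiliar with the technique named in this entry, here is a minimal sketch of value iteration on a stochastic grid MDP. It is not taken from the repository: the 4×4 grid size, the goal cell, the slip probability, the step reward, and the discount factor are all assumptions chosen for illustration.

```python
import numpy as np

# All constants below are illustrative assumptions, not values from any listed repository.
N = 4                                          # grid is N x N
GOAL = (N - 1, N - 1)                          # terminal cell (bottom-right)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
SLIP = 0.1                                     # chance of slipping to each perpendicular direction
GAMMA = 0.9                                    # discount factor
STEP_REWARD = -1.0                             # cost of every move


def move(state, delta):
    """Apply a displacement; moves that would leave the grid keep the agent in place."""
    r, c = state
    nr, nc = r + delta[0], c + delta[1]
    return (nr, nc) if 0 <= nr < N and 0 <= nc < N else state


def transitions(state, a_idx):
    """Return [(probability, next_state), ...] for taking action a_idx in state."""
    dr, dc = ACTIONS[a_idx]
    perpendicular = [(dc, dr), (-dc, -dr)]
    outcomes = [(1.0 - 2.0 * SLIP, move(state, (dr, dc)))]
    outcomes += [(SLIP, move(state, p)) for p in perpendicular]
    return outcomes


def q_values(V, state):
    """Expected return of each action under the current value estimate V."""
    return [
        sum(p * (STEP_REWARD + GAMMA * V[s2]) for p, s2 in transitions(state, a))
        for a in range(len(ACTIONS))
    ]


def value_iteration(tol=1e-8):
    """Sweep the Bellman optimality update until the largest change is below tol."""
    V = np.zeros((N, N))
    while True:
        delta = 0.0
        for r in range(N):
            for c in range(N):
                if (r, c) == GOAL:
                    continue  # terminal state keeps value 0
                best = max(q_values(V, (r, c)))
                delta = max(delta, abs(best - V[r, c]))
                V[r, c] = best
        if delta < tol:
            return V


V = value_iteration()
# Greedy policy: index into ACTIONS of the best action per cell (the goal entry is unused).
policy = np.array([[int(np.argmax(q_values(V, (r, c)))) for c in range(N)] for r in range(N)])
print(np.round(V, 2))
print(policy)
```

Policy iteration would reuse the same `transitions` helper but alternate between evaluating a fixed policy and improving it greedily, instead of taking the maximum in every sweep.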
Accelerated minigrid environments with JAX
RL Maze Project
Implementation of Q-learning to solve GridWorld
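As a rough illustration of what tabular Q-learning on a gridworld involves, the following sketch trains an epsilon-greedy agent on a deterministic 4×4 grid. It is not the code of any repository listed here: the grid layout, the reward of -1 per step, the hyperparameters, and the helper names `step`, `train`, and `greedy_path` are all hypothetical.

```python
import numpy as np

# Illustrative assumptions: grid size, goal, rewards, and hyperparameters are made up.
N = 4
START, GOAL = (0, 0), (N - 1, N - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1


def step(state, a):
    """Deterministic move; bumping into a wall leaves the agent where it is."""
    r, c = state
    nr, nc = r + ACTIONS[a][0], c + ACTIONS[a][1]
    if not (0 <= nr < N and 0 <= nc < N):
        nr, nc = r, c
    nxt = (nr, nc)
    return nxt, -1.0, nxt == GOAL   # (next state, reward, episode finished)


def train(episodes=2000, max_steps=100, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((N, N, len(ACTIONS)))
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            if rng.random() < EPSILON:
                a = int(rng.integers(len(ACTIONS)))   # explore
            else:
                a = int(np.argmax(Q[state]))          # exploit
            nxt, reward, done = step(state, a)
            # Bootstrapped target; no future value once the goal is reached.
            target = reward if done else reward + GAMMA * np.max(Q[nxt])
            r, c = state
            Q[r, c, a] += ALPHA * (target - Q[r, c, a])
            state = nxt
            if done:
                break
    return Q


def greedy_path(Q, max_steps=N * N):
    """Follow the learned greedy policy from START and record the visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        state, _, done = step(state, int(np.argmax(Q[state])))
        path.append(state)
        if done:
            break
    return path


Q = train()
path = greedy_path(Q)
print(path)
# Per-cell maximum Q-value; this is the kind of array one might plot as a heatmap.
print(np.round(Q.max(axis=2), 2))
```

The `Q.max(axis=2)` array gives one number per cell, so it can be passed directly to a plotting routine to visualize how the estimated value grows toward the goal.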