Shadow Heist OpenEnv Environment

ShadowHeistEnv is a grid-based stealth RL environment built around partial observability, autonomous guard agents, shaped rewards, and optional Hugging Face action selection.

Files

env.py: main environment with reset(), step(action), and state()
tasks.py: task definitions for collect_one, safe_heist, and perfect_escape
graders.py: scoring functions returning values in [0.0, 1.0]
agent.py: Hugging Face pipeline decision agent with rule-based fallback
gradio_ui.py: Gradio-based UI for playing the game manually or via the agent
run_shadow_heist.py: example loop that runs the environment and prints rewards/states
openenv.yaml: OpenEnv manifest pointing at env:ShadowHeistEnv

Game Design

The player controls a thief on an N x N grid.
Guards patrol independently and switch to chase behavior once they detect the player.
Treasures must be stolen with the steal action.
Reaching the exit ends the run successfully, provided at least one treasure has been stolen.
Observations are partial: the agent sees only the cells within a visibility radius, and the rest of the grid is masked.
The hide action enables stealth mode, which lowers the chance of being detected by a guard.
The Gradio UI offers three difficulty levels:
- easy: 6x6, 1 guard, 2 treasures, 80 max steps
- medium: 8x8, 2 guards, 3 treasures, 100 max steps
- hard: 10x10, 3 guards, 4 treasures, 120 max steps
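
The partial observability described above can be pictured with a small masking helper. This is only a sketch under stated assumptions, not the code in env.py: the HIDDEN marker, the mask_grid name, the (row, col) position convention, and the square (Chebyshev) visibility radius are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

HIDDEN = "?"  # hypothetical marker for cells outside the visibility radius


def mask_grid(grid: List[List[str]], player_pos: Tuple[int, int], radius: int) -> List[List[str]]:
    """Return a copy of grid in which cells farther than radius are hidden.

    Assumes positions are (row, col) and that visibility is a square
    window around the player (Chebyshev distance). The real environment
    may use a different shape or encoding.
    """
    player_row, player_col = player_pos
    masked = []
    for r, row in enumerate(grid):
        masked_row = []
        for c, cell in enumerate(row):
            # A cell is visible when both its row and column offsets fit in the radius.
            visible = max(abs(r - player_row), abs(c - player_col)) <= radius
            masked_row.append(cell if visible else HIDDEN)
        masked.append(masked_row)
    return masked


demo_grid = [["." for _ in range(5)] for _ in range(5)]
demo_masked = mask_grid(demo_grid, player_pos=(2, 2), radius=1)
```

With a radius of 1 on a 5x5 grid, only the 3x3 window around the player keeps its contents; every other cell is replaced by the marker.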

Action Space

move_up
move_down
move_left
move_right
hide
steal
wait
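
One way to read the four movement actions is as offsets on the grid. The sketch below is illustrative only: MOVE_DELTAS and apply_move are hypothetical names, and it assumes (row, col) positions with row 0 at the top and moves that are clamped at the grid edge. env.py may handle coordinates and walls differently.

```python
from typing import Dict, Tuple

# Assumed convention: (row, col), with row 0 at the top of the grid.
MOVE_DELTAS: Dict[str, Tuple[int, int]] = {
    "move_up": (-1, 0),
    "move_down": (1, 0),
    "move_left": (0, -1),
    "move_right": (0, 1),
}


def apply_move(pos: Tuple[int, int], action: str, grid_size: int) -> Tuple[int, int]:
    """Return the position after action, clamped to the grid.

    Non-movement actions (hide, steal, wait) leave the position unchanged.
    """
    d_row, d_col = MOVE_DELTAS.get(action, (0, 0))
    row = min(max(pos[0] + d_row, 0), grid_size - 1)
    col = min(max(pos[1] + d_col, 0), grid_size - 1)
    return (row, col)
```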

Rewards

+10 successful steal
+1 safe movement
-5 detected by a guard
-50 caught
+100 successful escape
+0.1 exploration bonus for reaching a new cell
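
To see how these values add up over a run, the table can be written as a lookup and summed over a sequence of events. The event labels and the total_reward helper are hypothetical, and the sketch assumes the bonuses simply stack (for example, a safe move onto a new cell earns both +1 and +0.1). The actual shaping logic lives in env.py and may combine them differently.

```python
from typing import Dict, Iterable

# Reward values from the table above; the keys are illustrative labels.
REWARDS: Dict[str, float] = {
    "steal": 10.0,
    "safe_move": 1.0,
    "detected": -5.0,
    "caught": -50.0,
    "escape": 100.0,
    "new_cell": 0.1,
}


def total_reward(events: Iterable[str]) -> float:
    """Sum the reward for each event label, assuming rewards are additive."""
    return sum(REWARDS[event] for event in events)


# Two safe moves onto new cells, one steal, one detection, then an escape.
example_events = [
    "safe_move", "new_cell",
    "safe_move", "new_cell",
    "steal",
    "detected",
    "escape",
]
example_total = total_reward(example_events)  # 1 + 0.1 + 1 + 0.1 + 10 - 5 + 100
```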

Install

python -m pip install -e .

Optional Hugging Face integration:

python -m pip install -e ".[huggingface]"


Optional Gradio UI:

python -m pip install -e ".[ui]"

Example

from agent import ShadowHeistDecisionAgent
from env import ShadowHeistEnv

# Build a medium-sized environment and an agent, both seeded.
env = ShadowHeistEnv(grid_size=8, num_guards=2, num_treasures=3, seed=7)
agent = ShadowHeistDecisionAgent(seed=7)

state = env.reset()
done = False

# Step until the episode ends, printing each action and its outcome.
while not done:
    action = agent.decide_action(state)
    state, reward, done, info = env.step(action)
    print(action, reward, state["player_pos"], state["collected_treasures"])

Run the included script:

python run_shadow_heist.py

Run the Gradio UI:

python gradio_ui.py

OpenEnv Manifest

name: shadow_heist_env
entry_point: env:ShadowHeistEnv
tasks:
  - name: collect_one
    grader: graders:grade_easy
  - name: safe_heist
    grader: graders:grade_medium
  - name: perfect_escape
    grader: graders:grade_hard
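
The entry_point and grader values use a module:attribute form. How OpenEnv itself resolves these strings is not described here; the sketch below only shows one common way such a string can be turned into a Python object with the standard library. The load_object name is hypothetical.

```python
import importlib
from typing import Any


def load_object(spec: str) -> Any:
    """Resolve a "module:attribute" string to the object it names."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)
```

From the repository root, load_object("env:ShadowHeistEnv") would be expected to return the environment class, and load_object("graders:grade_easy") the grader for collect_one.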

Tests

pytest
