# Research Implementation: Large Language Models are Near-Optimal Decision-Makers with a Non-Human Learning Behavior
This repository contains the implementation for comparing decision-making behaviors between Large Language Models (LLMs) and humans across three fundamental dimensions: uncertainty, risk, and set-shifting.
Human decision-making lies at the foundation of our society and civilization, yet we are on the verge of a future in which much of it will be delegated to artificial intelligence. The arrival of Large Language Models (LLMs) has transformed the nature and scope of AI-supported decision-making; however, how they learn to make decisions, compared to humans, remains poorly understood. In this study, we examined the decision-making behavior of five leading LLMs across three core dimensions of real-world decision-making: uncertainty, risk, and set-shifting. Using three well-established experimental psychology tasks designed to probe these dimensions, we benchmarked the LLMs against 360 newly recruited human participants. Across all tasks, the LLMs often outperformed humans, approaching near-optimal performance. At the same time, the processes underlying their decisions diverged fundamentally from those of humans. On the one hand, our findings demonstrate the ability of LLMs to manage uncertainty, calibrate risk, and adapt to change. On the other hand, this divergence highlights the risks of relying on them as substitutes for human judgment, calling for further inquiry.
Each dimension is probed with a classic experimental psychology task:

- **Decision-making under uncertainty**: Iowa Gambling Task
- **Decision-making under risk**: Cambridge Gambling Task
- **Set-shifting**: Wisconsin Card Sorting Test
## Features

- 🤖 Multi-LLM Support: Integration with OpenAI, Anthropic, Google, and Ollama
- 👥 Human & AI Participants: Seamless comparison between human and LLM decision-making
- 📊 Comprehensive Analytics: Detailed data collection and analysis tools
- 🎨 Modern UI: Clean, responsive web interface built with oTree
- ⚡ High Performance: Concurrent processing with configurable worker pools (see the sketch below)
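For illustration, a minimal sketch of how such a worker pool might look using Python's standard library is shown below; `query_model`, its signature, and the default pool size are assumptions for this example, not the repository's actual API.

```python
# Illustrative sketch only: fan LLM requests out to a bounded worker pool.
# query_model and the default pool size are hypothetical, not this repo's API.
from concurrent.futures import ThreadPoolExecutor
from typing import List

def query_model(model: str, prompt: str) -> str:
    """Stand-in for a provider call (OpenAI, Anthropic, Google, or Ollama)."""
    return f"[{model}] response to: {prompt}"

def run_concurrent(prompts: List[str], model: str, max_workers: int = 8) -> List[str]:
    """Submit every prompt to the pool; results return in prompt order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(query_model, model, p) for p in prompts]
        return [f.result() for f in futures]
```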
## Prerequisites

- Python 3.8 or higher
- API keys for LLM providers (OpenAI, Anthropic, etc.)
## Installation

1. **Clone the repository**

   ```bash
   git clone https://github.com/your-username/LLM_vs_Human_Decision_Making.git
   cd LLM_vs_Human_Decision_Making
   ```

2. **Install dependencies**

   ```bash
   pip install -r requirements_base.txt
   ```

3. **Configure environment**

   Create a `.env` file in the root directory:

   ```
   OPENAI_API_KEY=your_openai_key_here
   ANTHROPIC_API_KEY=your_anthropic_key_here
   GOOGLE_API_KEY=your_google_key_here
   ```
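As a quick sanity check that the keys are visible to Python, a snippet like the one below can help; it assumes the `python-dotenv` package, which may or may not be how this repository loads its environment.

```python
# Sanity check (assumes python-dotenv is installed; the repository itself
# may load environment variables differently).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

for key in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY"):
    print(key, "is set" if os.getenv(key) else "is MISSING")
```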
## Usage

### 🖥️ Standard Mode (Human Participants)

```bash
./run_otree.sh
```

### 🤖 Bot Mode (LLM Participants)

```bash
./run_otree_with_bot.sh task_name
```

### 🧹 Cleanup Processes

```bash
./run_clean_otree_process.sh
```

Access the interface at http://localhost:8000.
## Project Structure

```
LLM_vs_Human_Decision_Making/
├── 🎰 iowa_gambling_task/            # Iowa Gambling Task implementation
├── 🎲 cambridge_gambling_task/       # Cambridge Gambling Task implementation
├── 🃏 wisconsin_card_sort_test_task/ # Wisconsin Card Sort Test implementation
├── 📝 info_collector/                # Participant data collection
├── 🤖 LLM_utils/                     # LLM integration utilities
├── 🛠️ otree_utils/                   # oTree framework utilities
├── 🎨 _templates/                    # HTML templates
├── 📊 _static/                       # Static assets (CSS, JS, images)
├── ⚙️ settings.py                    # Main configuration
├── 📋 settings_sessions.py           # Session configurations
└── 📚 requirements_base.txt          # Python dependencies
```
## Data Availability

Analytical Code and Data: The complete analytical code and experimental data supporting this research are available on the Open Science Framework:

🔗 OSF Repository: Analytical Code and Data
## Configuration

Edit `settings_sessions.py` to customize experimental parameters:

```python
# Example configuration for Iowa Gambling Task
dict(
    name='iowa_gambling_task',
    num_demo_participants=30,
    init_money=2000.0,
    card_rewards=[100, 100, 50, 50],
    use_language_model=True,
    language_model='gpt-4o',
    temperature=1.0,
    total_interactions=80,
    # ... more parameters
)
```
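To benchmark several models in one run, a config could be generated per model. The helper below is a hypothetical sketch built on the example parameters above; `make_session` and `MODELS` are illustrative names, not code from `settings_sessions.py`.

```python
# Hypothetical sketch: derive one oTree session config per LLM under test.
# make_session and MODELS are illustrative names, not repository code.
MODELS = ['gpt-4o', 'claude-3-5-sonnet', 'gemini-1.5-pro']

def make_session(model: str) -> dict:
    safe = model.replace('-', '_').replace('.', '_')  # keep the session name identifier-safe
    return dict(
        name=f'iowa_gambling_task_{safe}',
        num_demo_participants=30,
        init_money=2000.0,
        card_rewards=[100, 100, 50, 50],
        use_language_model=True,
        language_model=model,
        temperature=1.0,
        total_interactions=80,
    )

SESSION_CONFIGS = [make_session(m) for m in MODELS]
```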
## Supported Models

The system supports a wide range of LLM providers and models:

- **OpenAI**: `gpt-4o`, `gpt-o4-mini`
- **Claude**: `claude-3-5-sonnet`
- **Gemini**: `gemini-1.5-pro`
- **DeepSeek**: `Deepseek-r1`
For the complete list of supported models, see `LLM_utils/llm/__init__.py`.
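As a rough illustration of what a multi-provider layer often looks like, the sketch below routes a model name to a provider label by prefix; the function and its mapping are assumptions for this example, not the contents of `LLM_utils/llm/__init__.py`.

```python
# Hypothetical dispatch sketch; prefix rules and provider labels are assumed.
def resolve_provider(model: str) -> str:
    """Map a model identifier to the provider expected to serve it."""
    prefixes = {
        'gpt': 'openai',
        'claude': 'anthropic',
        'gemini': 'google',
        'deepseek': 'ollama',  # assumption: open-weight models served locally via Ollama
    }
    for prefix, provider in prefixes.items():
        if model.lower().startswith(prefix):
            return provider
    raise ValueError(f'Unsupported model: {model}')

print(resolve_provider('gpt-4o'))  # openai
```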
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- Built with the oTree framework
- Inspired by classic experimental psychology paradigms
- Thanks to all human participants