When Large Language Model based Agent Meets User Behavior Analysis:
A Novel User Simulation Paradigm

User behavior analysis is crucial in human-centered AI applications. In this field, the collection of sufficient and high-quality user behavior data has always been a fundamental yet challenging problem. An intuitive idea to address this problem is automatically simulating the user behaviors. However, due to the subjective and complex nature of human cognitive processes, reliably simulating the user behavior is difficult. Recently, large language models (LLM) have obtained remarkable successes, showing great potential to achieve human-like intelligence. We argue that these models present significant opportunities for reliable user simulation, and have the potential to revolutionize traditional study paradigms in user behavior analysis. In this paper, we take recommender system as an example to explore the potential of using LLM for user simulation. Specifically, we regard each user as an LLM-based autonomous agent, and let different agents freely communicate, behave and evolve in a virtual simulator called RecAgent. For comprehensively simulation, we not only consider the behaviors within the recommender system (e.g., item browsing and clicking), but also accounts for external influential factors, such as, friend chatting and social advertisement. Our simulator contains at most 1000 agents, and each agent is composed of a profiling module, a memory module and an action module, enabling it to behave consistently, reasonably and reliably. In addition, to more flexibly operate our simulator, we also design two global functions including real-human playing and system intervention. To evaluate the effectiveness of our simulator, we conduct extensive experiments from both agent and system perspectives. You can get the introduction video of RecAgent through the Baidu Netdisk(Password: 8sjj) or Google Drive.

Figure 1: RecAgent Framework

🔥 News

[9/18/2023] RecAgent v2.0 is released on arXiv with following updates:
- 👥 More Agents: from 25 to at most 1000
- 🧠 Human-like Memory Mechanism
- 📖 Comprehensive Experiment
- 🕹 System-level and Agent-level Intervention
- 🙋‍♂️ Human Involved Simulation
[6/5/2023] RecAgent v1.0 is released on arXiv: When Large Language Model based Agent Meets User Behavior Analysis: A Novel User Simulation Paradigm

💡 Features

Novel User Simulation Paradigm: RecAgent is a novel user simulation paradigm that combines large language models (LLMs) with user behavior analysis. It is designed to simulate the behavior of real users in a recommender system. It also replicates users' social interactions, including chatting and posting. RecAgent can be used to evaluate the performance of recommender systems and to generate user behavior data for training and testing.
Human-like Memory Mechanism: RecAgent's memory mechanism mimics human cognitive processes, divided into sensory memory, short-term memory, and long-term memory. Through efficient compression and scoring of observations, it ensures the relevance and timeliness of information. By combining relevant memory with high-level insights, it achieves more realistic and consistent simulation of user behaviors.
Large Scale Multi-Agent Simulation: RecAgent is designed for robust, large-scale multi-agent parallel simulations. Seamlessly integrated with multiple OpenAI APIs, it support the capacity to simultaneously engage up to 1000 agents in its simulations.
Multi-dimensional Evaluation: We conducted an evaluation from two perspectives, that of the agent and that of the system. From the perspective of the agent, we assessed the effectiveness of various types of memory and evaluated whether the agent could access information-rich and relevant memories based on different memory structures. From the system's perspective, we focused on the reliability of the user behavior generated by our system and assessed the efficiency of the simulation.
High Extensibility: RecAgent is designed to support a variety of recommendation algorithms and is equally compatible with multiple LLMs. Both aspects come with interfaces that are easy to customize.
Human-in-the-loop Simulation: Individuals can create their own Agent to role-play and participate in the entire simulation process. Real people can take the same actions as other Agents, such as watching recommended movies, chatting with other Agents, or posting publicly.
Flexible Intervention and Control: RecAgent provides diverse means to intervene with Agents and monitor the results. An Agent's profile can be freely modified. Moreover, people can intervene with Agents through dialogue. RecAgent also supports saving checkpoints, facilitating retrospective analysis and counterfactual studies.

🛠 Installation

To install RecAgent, one can follow the following steps:

Clone the repository:

git clone https://github.com/RUC-GSAI/YuLan-Rec.git

Install the required dependencies:
```
pip install -r requirements.txt
```
Using pip to install faiss may report an error. One solution is to use conda install faiss-cpu -c pytorch.
Set your OpenAI API key in the config/config.yaml file.

📦 Usage

To start using RecAgent, follow these steps:

Configure the simulation parameters in the config/config.yaml file.

# path for item data
item_path: data/item.csv
# path for user data
user_path: data/user_1000.csv
# path for relationship data
relationship_path: data/relationship_1000.csv
# path for save interaction records
interaction_path: data/interaction.csv
# directory name for faiss index
index_name: data/faiss_index
# simulator directory name for saving and restoring
simulator_dir: data/simulator
# simulator restoring name
simulator_restore_file_name: 
# recommender system model
rec_model: Random
# number of epochs
epoch: 30
# number of agents, which cannot exceed the number of user in user.csv
agent_num: 10
# number of items to be recommended in one page
page_size: 5
# temperature for LLM
temperature: 0.3
# maximum number of tokens for LLM
max_token: 1500
# execution mode, serial or parallel
execution_mode: serial
# time interval for action of agents. The form is a number + a single letter, and the letter may be s, m, h, d, M, y
interval: 5h
# name of LLM model
llm: gpt-3.5
# number of max retries for LLM API
max_retries: 100
# verbose mode
verbose: True
# threshold for agent num
active_agent_threshold: 100
# method for choose agent to be active, random, sample, or marginal
active_method: random
# propability for agent to be active
active_prob: 1
# memory for recagent, recagent or none
recagent_memory: recagent
# whether to add role play
play_role: False
# list of api keys for LLM API
api_keys:
  - xxxxx
  - xxxxx

The first agent_num users in user.csv will be loaded. Note that the agent_num in config.yaml cannot exceed the total number of users in user.csv.

1. Example

Run the simulation script:

python -u simulator.py --config_file config/config.yaml --output_file messages.json --log_file simulation.log

config_file is the path of the configuration file. output_file is the path of the output file, which is a JSON file containing all messages generated during the simulation. log_file set the path of the log file.

We will obtain some output like:

INFO:1422158:os.getpid()=1422158
INFO:1422158:
active_agent_threshold: 100
active_method: random
active_prob: 1
agent_num: 10
api_keys: ['xxxxx']
epoch: 30
execution_mode: serial
index_name: data/faiss_index
interaction_path: data/interaction.csv
interval: 5h
item_path: data/item.csv
llm: gpt-3.5
log_file: test.log
log_name: 1422158
max_retries: 100
max_token: 1500
output_file: output/message/test.json
page_size: 5
play_role: False
rec_model: Random
recagent_memory: recagent
relationship_path: data/relationship_1000.csv
simulator_dir: data/simulator_debug
simulator_restore_file_name: None
temperature: 0.3
user_path: data/user_1000.csv
verbose: True
Load faiss db from local

  0%|          | 0/10 [00:00<?, ?it/s]
100%|██████████| 10/10 [00:00<00:00, 3275.52it/s]
INFO:1422158:Simulator loaded.
INFO:82754:Round 1
  0%|                  | 0/10 [00:00<?, ?it/s]
INFO:1422158:Sarah Miller enters the recommender system.
INFO:1422158:Sarah Miller is recommended ['<Casper>;;The movie <Casper > is about a friendly ghost named Casper who lives in a haunted mansion with his three mischievous uncles.', '<The Goonies>;;<The Goonies> is a 1985 adventure-comedy film directed by Richard Donner and produced by Steven Spielberg.', '<One False Move>;;<One False Move> is a crime thriller movie released in 1991.', '<What About Bob?>;;<What About Bob?> is a comedy film about a man named Bob Wiley (played by Bill Murray) who has multiple phobias and anxiety disorders.', '<Phantasm III: Lord of the Dead>;;<Phantasm III: Lord of the Dead> is a horror movie that follows the story of Mike, who is on a mission to stop the Tall Man, a supernatural entity who is responsible for the death of his family.'].
INFO:1422158:Sarah Miller watches ['Casper']

2. Website Demo

Figure 2: RecAgent Interface

Click the Start button to start the simulation. The simulation will run for epoch rounds. You can click the Pause button to pause and the Reset button to reset the simulation. The demo supports both serial and parallel execution modes. The agent's profile, activity status, system data, and log information are all displayed in real-time on the interface.

📚 Data

The item data used in the simulation is from MovieLens-1M. User profiles and relationships between users are fabricated for simulation purposes. We provide the FAISS index of the item data in directory faiss_index.

💪 Maintainers

@Lei Wang @Jingsen Zhang @Hao Yang @Zhi-Yuan Chen @Jiakai Tang @Zeyu Zhang

📖 License

RecAgent uses MIT License. All data and code in this project can only be used for academic purposes.

📄 Citation

Please cite the following paper as the reference if you use our code.

@misc{wang2023large,
      title={When Large Language Model based Agent Meets User Behavior Analysis: A Novel User Simulation Paradigm}, 
      author={Lei Wang and Jingsen Zhang and Hao Yang and Zhiyuan Chen and Jiakai Tang and Zeyu Zhang and Xu Chen and Yankai Lin and Ruihua Song and Wayne Xin Zhao and Jun Xu and Zhicheng Dou and Jun Wang and Ji-Rong Wen},
      year={2023},
      eprint={2306.02552},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
}

Name		Name	Last commit message	Last commit date
Latest commit History 163 Commits
agents		agents
asset		asset
config		config
data		data
llm		llm
recommender		recommender
utils		utils
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
backend.py		backend.py
backend.sh		backend.sh
demo.py		demo.py
nohup.sh		nohup.sh
requirements.txt		requirements.txt
run_demo.py		run_demo.py
simulator.py		simulator.py
simulator.sh		simulator.sh
test.js		test.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

When Large Language Model based Agent Meets User Behavior Analysis:
A Novel User Simulation Paradigm

🔥 News

Table of Contents

💡 Features

🛠 Installation

📦 Usage

1. Example

2. Website Demo

📚 Data

💪 Maintainers

📖 License

📄 Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

When Large Language Model based Agent Meets User Behavior Analysis: A Novel User Simulation Paradigm

🔥 News

Table of Contents

💡 Features

🛠 Installation

📦 Usage

1. Example

2. Website Demo

📚 Data

💪 Maintainers

📖 License

📄 Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

When Large Language Model based Agent Meets User Behavior Analysis:
A Novel User Simulation Paradigm

Packages