
Large Model Agents: State-of-the-Art, Cooperation Paradigms, Security and Privacy, and Future Trends

Yuntao Wang†, Yanghe Pan†, Quan Zhao†, Yi Deng†, Zhou Su†, Linkang Du†, and Tom H. Luan†
†School of Cyber Science and Engineering, Xi'an Jiaotong University, Xi'an, China

arXiv:2409.14457v1 [cs.AI] 22 Sep 2024

Abstract—Large Model (LM) agents, powered by large foundation models such as GPT-4 and DALL-E 2, represent a significant step towards achieving Artificial General Intelligence (AGI). LM agents exhibit key characteristics of autonomy, embodiment, and connectivity, allowing them to operate across physical, virtual, and mixed-reality environments while interacting seamlessly with humans, other agents, and their surroundings. This paper provides a comprehensive survey of the state-of-the-art in LM agents, focusing on the architecture, cooperation paradigms, security, privacy, and future prospects. Specifically, we first explore the foundational principles of LM agents, including general architecture, key components, enabling technologies, and modern applications. Then, we discuss practical collaboration paradigms from data, computation, and knowledge perspectives towards connected intelligence of LM agents. Furthermore, we systematically analyze the security vulnerabilities and privacy breaches associated with LM agents, particularly in multi-agent settings. We also explore their underlying mechanisms and review existing and potential countermeasures. Finally, we outline future research directions for building robust and secure LM agent ecosystems.

Index Terms—Large model, AI agents, embodied intelligence, multi-agent collaboration, security, privacy.

[Figure: timeline from the Turing machine and symbolic/reactive agents, through IBM Deep Blue (1997), Microsoft Xiaoice (2014), DeepMind AlphaGo (2016), and AlphaZero (2017), to OpenAI ChatGPT (2022), GPT-4 (2023), Sora (2024), and LM agents such as AutoGPT, AutoGen, BabyAGI, and ChatDev (2024), evolving towards autonomous, embodied, and connected agents and AGI.]

Fig. 1: Evolution history of AI agents. 1) Initial stage: Early AI research primarily focused on logical reasoning and rule-based AI agents. 2) Machine Learning (ML) agent stage: ML, including supervised and unsupervised learning, advanced the progress of AI agents. In 1997, IBM's Deep Blue defeated the world chess champion. 3) Deep Learning (DL) agent stage: The combination of DL and big data significantly improved AI performance. In 2016, DeepMind's AlphaGo defeated Go world champion Lee Sedol. 4) Large Model (LM) agent era: Transformer-based LMs such as OpenAI's ChatGPT and GPT-4 revolutionized AI agents, ushering in the era of LM agents and bringing us closer to AGI.

I. INTRODUCTION
A. Background of Large Model Agents
In the 1950s, Alan Turing introduced the famous Turing Test to assess whether machines could exhibit intelligence comparable to that of humans, which laid the foundation for the evolution of Artificial Intelligence (AI). These artificial entities, commonly known as "agents", serve as the core components of AI systems. Generally, AI agents are autonomous entities capable of understanding and responding to human inputs, perceiving their environment, making decisions, and taking actions in physical, virtual, or mixed-reality settings to achieve their goals [1]. AI agents range from simple bots that follow predefined rules to complex and autonomous entities that learn and adapt through experience [2]. They can be software-based or physical entities, functioning independently or in collaboration with humans or other agents.

Since the mid-20th century, significant progress has been made in the development of AI agents [3]–[5], such as Deep Blue, AlphaGo, and AlphaZero, as shown in Fig. 1. Despite these advances, prior research primarily concentrated on refining specialized abilities such as symbolic reasoning or excelling in certain tasks such as Go or Chess, often neglecting the cultivation of general-purpose capabilities within AI models such as long-term planning, multi-task generalization, and knowledge retention. The challenge of creating AI agents that can flexibly adapt to a broad range of tasks and complex environments remains largely unexplored. To push the boundaries of AI agents further, it is crucial to develop powerful foundational models that integrate these critical attributes, offering a versatile basis for next-generation AI agents.

With the rise of Large Models (LMs), also known as large foundation models, such as OpenAI GPT-4o, Google PaLM 2, and Microsoft Copilot, new possibilities have opened up for comprehensively enhancing the inherent capabilities of AI agents [6], [7]. As illustrated in Fig. 2, an LM agent, either in software or embodied form, generally consists of four key components: planning, action, memory, and interaction. These agents can seamlessly operate within physical, virtual, or mixed-reality environments [1], [8]–[10]. Particularly, LMs serve as the "brain" of AI agents and empower them with powerful capabilities in human-machine interaction (HMI), complex pattern recognition, knowledge retention, reasoning, long-term planning, generalization, and adaptability [9]. Moreover, via advanced reasoning and few/zero-shot planning techniques such as Chain-of-Thought (CoT) [11], Tree-of-Thought (ToT) [12], and reflection [13], LM agents can form intricate logical connections, enabling them to solve complex, multifaceted tasks effectively. For example, AutoGPT [14], a promising LLM agent prototype, can decompose complex tasks into several manageable sub-tasks, facilitating structured and efficient problem-solving. Integrating LMs with Retrieval-Augmented Generation (RAG) technologies [15] further allows agents to access external knowledge sources and enhance the accuracy of their responses based on retrieved information. Besides, LM agents can flexibly integrate a range of LMs, including Large Language Models (LLMs) and Large Vision Models (LVMs), to enable multifaceted capabilities.
[Figure: (a) a software-form LM agent as a virtual assistant handling a travel request through planning, memory (historical dialogue, user preference, long-term knowledge), and tool calls (weather forecast, tourist guide, flight search, and maps), interacting with cloud services and mobile apps; (b) an embodied LM agent as a home cleaner that perceives the real environment, plans a cleaning sequence (search surfaces, find tools, avoid objects, plan sequence, learn new skills), and executes cleaning actions.]

Fig. 2: Use cases of LM agents. (a) A software-form LM agent acting as a virtual assistant. (b) An embodied LM agent serving as a home cleaner.

Fig. 3: Overview of LM agents. Each LM agent includes two parts: (i) the digital brain located in the cyberspace, powered by LMs such as GPT-4o, PaLM 2, and Copilot; and (ii) the physical body, such as an autonomous vehicle, robot dog, or drone. Within each LM agent, the digital brain synchronizes with its physical body via intra-agent communications. LM agents communicate with one another in the cloud to share information and knowledge via inter-agent communications, establishing a network of interconnected intelligence. Each LM agent can dynamically interact with other agents, virtual/real environments, and humans. The brain of each LM agent can be deployed either as a standalone entity or in a hierarchical manner across various platforms such as cloud servers, edge devices, or end devices.
LM agents are recognized as a significant step towards achieving Artificial General Intelligence (AGI) and have been widely applied across fields such as web search [16], recommendation systems [17], virtual assistants [18], [19], Metaverse gaming [20], robotics [21], autonomous vehicles [22], and Electronic Design Automation (EDA) [23]. As reported by MarketsandMarkets [24], the worldwide market for autonomous AI and autonomous agents was valued at USD 4.8 billion in 2023 and is projected to grow at a CAGR of 43%, reaching USD 28.5 billion by 2028. LM agents have attracted global attention, and leading technology giants including Google, OpenAI, Microsoft, IBM, AWS, Oracle, NVIDIA, and Baidu are venturing into the LM agent industry.

B. Roadmap and Key Characteristics of LM Agents

Fig. 3 illustrates a future vision of LM agents, characterized by three key attributes paving the way toward AGI: autonomous, embodied, and connected.

1) Autonomous Intelligence. Autonomous intelligence in LM agents refers to their ability to operate independently, making proactive decisions without continuous human input. As depicted in Fig. 2(a), an LM agent can maintain an internal memory that accumulates knowledge over time to guide future decisions and actions, enabling continuous learning and adaptation in dynamic environments [25]. Additionally, LM agents can autonomously utilize a variety of tools (e.g., search engines and external APIs) to gather information or create new tools to handle intricate tasks [26]. By collaborating or competing with humans or other agents, LM agents can effectively enhance their decision-making capabilities [27].

2) Embodied Intelligence. Despite recent advancements, LMs typically respond passively to human commands in the text, image, or multimodal domain, without engaging directly with the physical world [7]. Embodied agents, on the other hand, can actively perceive and act upon their environment, whether digital, robotic, or physical, using sensors and actuators [21], [25]. The shift to LM-empowered agents involves creating embodied AI systems capable of understanding, learning, and solving real-world challenges. As depicted in Fig. 2(b), LM agents actively interact with environments and adapt actions based on real-time feedback. For example, a household robot LM agent tasked with cleaning can generate tailored strategies by analyzing the room layout, surface types, and obstacles, instead of merely following generic instructions.

3) Connected Intelligence. Connected LM agents extend beyond the capabilities of individual agents, playing a critical role in tackling complex, real-world tasks [28]. For example, in autonomous driving, connected autonomous vehicles, serving as LM agents, share real-time sensory data, coordinate movements, and negotiate passage at intersections to optimize traffic flow and enhance safety. As depicted in Fig. 3, by interconnecting numerous LM agents into the Internet of LM agents, connected LM agents can freely share sensory data and task-oriented knowledge. By fully harnessing the computational power of various specialized LMs, such interconnection fosters cooperative decision-making and collective intelligence. Thereby, the collaboration across data, computation, and knowledge domains enhances individual agent performance and adaptability.
Additionally, these interactions enable LM agents to form social connections and attributes, contributing to the development of an agent society [29], [30].

C. Motivation for Securing Connected LM Agents

Despite the bright future of LM agents, security and privacy concerns remain significant barriers to their widespread adoption. Throughout the life-cycle of LM agents, numerous vulnerabilities can emerge, ranging from adversarial examples [31], agent poisoning [32], and LM hallucination [33], to pervasive data collection and memorization [34].

1) Security vulnerabilities. LM agents are prone to "hallucinations", where their foundational LMs generate plausible but incorrect outputs not grounded in reality [33]. In multi-agent settings, the hallucination phenomenon can propagate misinformation, compromise decision-making, cause task failures, and pose risks to both physical entities and humans. Moreover, maintaining the integrity and authenticity of sensory data and prompts used in training and inference is crucial, as biased or compromised inputs can lead to inaccurate or unfair outcomes [35]. Attacks such as adversarial manipulations [31], poisoning [36], and backdoors [37] further threaten LM agents by allowing malicious actors to manipulate inputs and deceive the models. In collaborative environments, agent poisoning behaviors [32], where malicious agents disrupt the behavior of others, can undermine collaborative systems. Additionally, integrating LM agents into Cyber-Physical-Social Systems (CPSS) expands the attack surface, enabling adversaries to exploit vulnerabilities within interconnected systems.

2) Privacy breaches. LM agents' extensive data collection and memorization processes raise severe risks of data breaches and unauthorized access. These agents often handle vast amounts of personal and sensitive business information for both To-Customer (ToC) and To-Business (ToB) applications, raising concerns about data storage, processing, sharing, and control [38]. Additionally, LMs can inadvertently memorize sensitive details from their training data, potentially exposing private information during interactions [34]. Privacy risks are further compounded in multi-agent collaborations, where LM agents might inadvertently leak sensitive information about users, other agents, or their internal operations during communication and task execution.

D. Related Surveys and Contributions

Recently, LM agents have garnered significant interest across academia and industry, leading to a variety of research exploring their potential from multiple perspectives. Notable survey papers in this field are as below. Andreas et al. [29] present a toy experiment for AI agent construction and case studies on modeling communicative intentions, beliefs, and desires. Wang et al. [39] identify key components of LLM-based autonomous agents (i.e., profile, memory, planning, and action) and the subjective and objective evaluation metrics. Besides, they discuss the applications of LLM agents in engineering, natural science, and social science. Xi et al. [9] present a general framework for LLM agents consisting of brain, action, and perception. Besides, they explore applications in single-agent, multi-agent, and human-agent collaborations, as well as agent societies. Zhao et al. [2] offer a systematic review of LLMs in terms of pre-training, adaptation tuning, utilization, and capacity assessment. Besides, background information, mainstream technologies, and critical applications of LLMs are introduced. Xu et al. [40] provide a tutorial on key concepts, architecture, and metrics of edge-cloud AI-Generated Content (AIGC) services in mobile networks, and identify several use cases and implementation challenges. Huang et al. [1] offer a taxonomy of AI agents in virtual/physical environments, discuss cognitive aspects of AI agents, and survey the applications of AI agents in robotics, healthcare, and gaming. Cheng et al. [10] review key components of LLM agents (including planning, memory, action, environment, and rethinking) and their potential applications. Planning types, multi-role relationships, and communication methods in multi-agent systems are also reviewed. Masterman et al. [8] provide an overview of single-agent and multi-agent architectures in industrial projects and present the insights and limitations of existing research. Guo et al. [41] discuss the four components (i.e., interface, profiling, communication, and capabilities acquisition) of LLM-based multi-agent systems and present two lines of applications in terms of problem solving and world simulation. Durante et al. [42] introduce multimodal LM agents and a training framework including learning, action, cognition, memory, and perception. They also discuss the different roles of agents (e.g., embodied, simulation, and knowledge inference), as well as the potentials and experimental results in different applications including gaming, robotics, healthcare, multimodal tasks, and Natural Language Processing (NLP). Hu et al. [20] outline six key components (i.e., perception, thinking, memory, learning, action, and role-playing) of LLM-based game agents and review existing LLM-based game agents in six types of games. Xu et al. [43] provide a comprehensive survey of enabling architectures and challenges for LM agents in gaming. Qu et al. [44] provide a comprehensive survey on integrating mobile edge intelligence (MEI) with LLMs, emphasizing key applications of deploying LLMs at the network edge along with state-of-the-art techniques in edge LLM caching, delivery, training, and inference.

Existing survey works on LM agents mainly focus on the general framework design for single LLM agents and multi-agent systems and their potential in specific applications. Distinguished from the above-mentioned existing surveys, this survey focuses on the networking aspect of LM agents, including the general architecture, enabling technologies, and collaboration paradigms to construct networked systems of LM agents in physical, virtual, or mixed-reality environments. Moreover, with the advances of LM agents, it is urgent to study their security and privacy in future AI agent systems. This work comprehensively reviews the security and privacy issues of LM agents and discusses the existing and potential defense mechanisms, which are overlooked in existing surveys. Table I compares the contributions of our survey with previous related surveys in the field of LM agents.

In this paper, we present a systematic review of the state-of-the-art in both single and connected LM agents, focusing on security and privacy threats, existing and potential countermeasures, and future trends. Our survey aims to 1) provide a broader understanding of how LM agents work and how they interact in multi-agent scenarios, 2) examine the scope and impact of security and privacy challenges associated with LM agents and their interactions, and 3) highlight effective strategies and solutions for defending against these threats to safeguard LM agents in various intelligent applications.
TABLE I: A Comparison of Our Survey with Relevant Surveys

Year | Refs. | Contribution
2022 | [29] | A toy experiment for AI agent construction and case studies in modeling communicative intentions, beliefs, and desires.
2023 | [39] | Survey on key components and evaluation policies of LLM agents, and applications in engineering, natural science, and social science.
2023 | [9] | Discussions on general framework for LLM agents and applications of single-agent, multi-agent, and human-agent collaborations, as well as agent societies.
2023 | [2] | Review on background, pre-training, adaptation tuning, evaluation & utilization, capacity assessment, and critical applications of LLMs.
2024 | [40] | Tutorial on key concepts, architecture, and metrics of edge-cloud AIGC services in mobile networks, and identification of use cases and key implementation challenges.
2024 | [1] | Discussions on taxonomy of AI agents in virtual/physical environments, cognitive aspects, and applications in robots, healthcare, and gaming.
2024 | [10] | Discussion of key components and applications of LLM agents, and review of planning types, multi-role relationships, and communication modes in multi-agent systems.
2024 | [8] | Overview of single-agent and multi-agent architectures in industrial projects and insights and limitations of research.
2024 | [41] | Discussion of key components of LLM-based multi-agent systems and applications in problem solving and world simulation.
2024 | [42] | Discussion of key concepts and the framework of multimodal LM agents, the different roles of agents, and the potentials and experimental results in gaming, robotics, healthcare, NLP, and multimodality.
2024 | [20] | Discussion of key components of LLM-based game agents and review of existing approaches in six types of games.
2024 | [43] | Survey of the enabling architectures and key challenges of LM agents for games.
2024 | [44] | Survey on key applications of deploying LLMs at network edges and state-of-the-art techniques in edge LLM caching, delivery, training, and inference.
Now | Ours | Comprehensive survey of the fundamentals, security, and privacy of connected LM agents; discussions on the general architecture, enabling technologies, networking modes, and collaboration paradigms of connected LM agents; discussions on security/privacy threats, state-of-the-art solutions, and open research issues in connected LM agent design.

Section I: Introduction (Background of LM Agents; Roadmap and Key Characteristics of LM Agents; Motivation for Securing Connected LM Agents; Related Surveys and Contributions; Paper Organization)
Section II: LM Agents: Working Principles (Standards of LM Agents; Architecture of Connected LM Agents; Key Components of LM Agent; Enabling Technologies of LM Agents; Modern Prototypes & Applications of LM Agents)
Section III: Networking LM Agents: Paradigms (Overview of Interactions of Connected LM Agents; Data Cooperation for LM Agents; Computation Cooperation for LM Agents; Knowledge Cooperation for LM Agents)
Section IV: Security Threats & Countermeasures to LM Agents (Hallucination; Adversarial Attack; Poisoning & Backdoor Attack; Other Security Threats; Summary and Lessons Learned)
Section V: Privacy Threats & Countermeasures to LM Agents (LM Memorization Risk; LM Function & Prompt Stealing Attacks; Other Privacy Threats; Summary and Lessons Learned)
Section VI: Future Research Directions (Energy-Efficient and Green LM Agents; Fair and Explainable LM Agents; Cyber-Physical-Social Secure LM Agent Systems; Value Ecosystem of LM Agents)
Section VII: Conclusion

Fig. 4: Organization structure of this paper.

The main contributions of this work are four-fold.

• We comprehensively review recent advances in LM agent construction across academia and industry. We investigate the working principles of LM agents, including the general architecture, key components (i.e., planning, memory, action, interaction, and security modules), and enabling technologies. The industrial prototypes and potential applications of LM agents are also discussed.

• We systematically categorize interaction patterns for LM agents (i.e., agent-agent, agent-human, and agent-environment interactions) and their interaction types (i.e., cooperation, partial cooperation, and competition). We explore practical collaboration paradigms of LM agents from the aspects of data cooperation, computation cooperation, and knowledge cooperation.

• We comprehensively analyze existing and potential security and privacy threats, their underlying mechanisms, categorization, and challenges for both single and connected LM agents. We also review state-of-the-art countermeasures and examine their feasibility in securing LM agents.

• Lastly, we discuss open research issues and point out future research directions from the perspectives of energy-efficient and green LM agents, fair and explainable LM agents, cyber-physical-social secure agent systems, and value networks of the agent ecosystem, aiming to advance the efficiency and security of LM agents.

E. Paper Organization

The remainder of this paper is organized as below. Section II discusses the working principles of single LM agents, while Section III presents the cooperation paradigms for connected LM agents. Section IV and Section V introduce the taxonomy of security and privacy threats to LM agents, respectively, along with state-of-the-art countermeasures. Section VI outlines open research issues and future directions in the field of LM agents. Finally, conclusions are drawn in Section VII. Fig. 4 depicts the organization structure of this survey.

II. LARGE MODEL AGENTS: WORKING PRINCIPLES

In this section, we first introduce existing standards of LM agents. Then, we discuss the general architecture of connected LM agents including key components, communication modes, key characteristics, and enabling technologies. Next, we introduce typical prototypes and discuss modern applications of LM agents.

TABLE II: Progress of Standards for LM Agents

Standard | Publication Date | Main Content
IEEE SA - P3394 | 2023-09-21 | The natural language interface is defined, including various protocols and guidelines that enable applications and agents to effectively communicate with LLM-enabled agents, such as API syntax and semantics.
IEEE SA - P3428 | 2023-12-06 | The integration of LLMs with existing educational systems ensures that LLMs can seamlessly interact with AIS while addressing issues of bias, transparency, and accountability within educational environments.

[Figure: the application layer hosts travel, coding, robot, and defense agents atop an LM agent system call interface; the LM agent kernel comprises the action, planning, interaction, security, and memory modules together with an agent scheduler, context manager, memory manager, storage manager, tool manager, and access/communication managers; beneath it, the OS kernel (process scheduler, memory manager, filesystem, device drivers) manages the hardware (CPU, GPU, memory, disk, and peripheral devices).]

Fig. 5: Illustration of the OS architecture of LM agents [45], [46].

A. Standards of LM Agents

We briefly introduce two existing standards on LM agents: IEEE SA-P3394 and IEEE SA-P3428.

1) The IEEE SA - P3394 standard1, launched in 2023, defines natural language interfaces to facilitate communication between LLM applications, agents, and human users. This standard establishes a set of protocols and guidelines that enable applications and agents to effectively communicate with LLM-enabled agents. These protocols and guidelines include, but are not limited to, API syntax and semantics, voice and text formats, conversation flow, prompt engineering integration, LLM thought chain integration, as well as API endpoint configuration, authentication, and authorization for LLM plugins. The standard is expected to advance technological interoperability, promote AI industry development, enhance the practicality and efficiency of LMs, and improve AI agent functionality and user experience.

2) The IEEE SA - P3428 standard2, launched in 2023, aims to develop standards for LLM agents in educational applications. The primary goal is to ensure the interoperability of LLM agents across both open-source and proprietary systems. Key areas of focus include the integration of LLMs with existing educational systems and addressing technical and ethical challenges. This includes ensuring that LLMs can seamlessly interact with other AI components, such as Adaptive Instructional Systems (AIS), while also addressing issues of bias, transparency, and accountability within educational contexts. The standard is intended to support the widespread and effective application of LLMs in the field of education, thereby enabling more personalized, efficient, and ethically sound AI-driven educational experiences.

1 https://standards.ieee.org/ieee/3394/11377/
2 https://standards.ieee.org/ieee/3428/11489/

B. Architecture of Connected LM Agents

1) Operating System (OS) of LM Agents: According to [45], [46], the OS architecture of LM agents consists of three layers: application, kernel, and hardware.

• The application layer hosts agent applications (e.g., travel, coding, and robot agents) and offers an SDK that abstracts system calls, simplifying agent development.

• The kernel layer includes the ordinary OS kernel and an additional LM agent kernel, without altering the original OS structure. Key modules in the LM agent kernel [45], [46] include the agent scheduler for task planning and prioritization, the context manager for LM status management, the memory manager for short-term data, the storage manager for long-term data retention, the tool manager for external API interactions, and the access manager for privacy controls.

• The hardware layer comprises physical resources (CPU, GPU, memory, etc.), which are managed indirectly through OS system calls, as LM kernels do not interact directly with the hardware.

A minimal sketch of this kernel-style layering is given below.
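To make the layering concrete, the following Python sketch organizes kernel-style modules in the spirit of [45], [46]. All class and method names are hypothetical illustrations; they are not part of the IEEE standards above or of any published agent-OS API.

```python
# Illustrative agent-kernel layering in the spirit of [45], [46].
# All names are hypothetical; a real agent OS would expose system
# calls instead of direct method calls.

from dataclasses import dataclass
from collections import deque

@dataclass
class AgentTask:
    agent_id: str
    prompt: str
    priority: int = 0

class AgentScheduler:
    """Orders agent tasks, analogous to a process scheduler."""
    def __init__(self):
        self.queue = deque()
    def submit(self, task: AgentTask):
        self.queue.append(task)
    def next_task(self):
        return self.queue.popleft() if self.queue else None

class ContextManager:
    """Tracks per-agent LM status (e.g., partial decoding state)."""
    def __init__(self):
        self.contexts = {}
    def save(self, agent_id, ctx):
        self.contexts[agent_id] = ctx

class AccessManager:
    """Privacy control: which agent may use which resource."""
    def __init__(self, acl):
        self.acl = acl
    def check(self, agent_id, resource):
        return resource in self.acl.get(agent_id, set())

class AgentKernel:
    """LM agent kernel sitting beside the ordinary OS kernel."""
    def __init__(self):
        self.scheduler = AgentScheduler()
        self.context = ContextManager()
        self.access = AccessManager({"travel_agent": {"web_search"}})

kernel = AgentKernel()
kernel.scheduler.submit(AgentTask("travel_agent", "plan a trip to Rome"))
task = kernel.scheduler.next_task()
print(task.prompt, kernel.access.check(task.agent_id, "web_search"))
```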
[Figure: connected LM agents bridge the human world (human society, handheld devices and wearables), the physical world (smart cities, network and computing infrastructure, physical embodied forms of LM agents), and the cyber world (cloud- and edge-hosted large models), with each LM agent comprising planning, memory, action, security, and interaction modules driven by large model technology.]

Fig. 6: General architecture of the Internet of LM agents in bridging the human, physical, and cyber worlds. The LM agent has five constructing modules: planning, action, memory, interaction, and security modules. The engine of connected LM agents is empowered by a combination of five cutting-edge technologies: large foundation models, knowledge-related technologies, interaction, digital twin, and multi-agent collaboration. LM agents interact with humans through Human-Machine Interaction (HMI) technologies such as NLP, with the assistance of handheld devices and wearables, to understand human intentions, desires, and beliefs. Each LM agent can synchronize data and statuses between the physical body and the digital brain through digital twin technologies, as well as perceive and act upon the surrounding virtual/real environment. LM agents can be interconnected in the cyberspace through efficient cloud-edge networking for efficient data and knowledge sharing to promote multi-agent collaboration.

2) Constructing Modules of LM Agents: According to [1], [8]–[10], there are generally five constructing modules of LM agents: planning, action, memory, interaction, and security modules (details in Sect. II-C). These modules together enable LM agents to perceive, plan, act, learn, and interact efficiently and securely in complex and dynamic environments.

• Empowered by LMs, the planning module produces strategies and action plans with the help of the memory module, enabling informed decision-making [7], [10].

• The action module executes these embodied actions, adapting actions based on real-time environmental feedback to ensure contextually appropriate responses [9], [42].

• The memory module serves as a repository of accumulated knowledge (e.g., past experiences and external knowledge), facilitating continuous learning and improvement [10].

• The interaction module enables effective communication and collaboration with humans, other agents, and the environment.

• The security module is integrated throughout LM agents' operations, ensuring active protection against threats and maintaining the integrity and confidentiality of data and processes.

3) Engine of LM Agents: The engine of LM agents is powered by a combination of cutting-edge technologies including large foundation models, knowledge-related technologies, interaction, digital twin, and multi-agent collaboration (details in Sect. II-D).

• Large foundation models such as GPT-4 and DALL-E 2 serve as the brain of an LM agent, enabling high-level pattern recognition, advanced reasoning, and intelligent decision-making, and providing the cognitive capabilities of LM agents [6], [7].

• Knowledge-related technologies enhance LM agents by incorporating Knowledge Graphs (KGs), knowledge bases, and RAG systems, allowing agents to access, utilize, and manage vast external knowledge sources, ensuring informed and contextually relevant actions [47].

• HMI technologies enable seamless interaction between humans and agents through NLP, multimodal interfaces, and Augmented/Virtual/Mixed Reality (AR/VR/MR), facilitating dynamic and adaptive interactions [48].

• Digital twin technologies allow efficient and seamless synchronization of data and statuses between the physical body and the digital brain of an LM agent through intra-agent communications [49].

• Multi-agent collaboration technologies empower LM agents to work together efficiently, sharing data, resources, and tasks to tackle complex problems by developing cooperation, competition, and coopetition strategies through inter-agent communications [28].
4) Communication Mode of LM Agents: Every LM agent consists of two parts: (i) the LM-empowered brain located in the cloud, edge servers, or end devices, and (ii) the corresponding physical body, such as an autonomous vehicle. Every LM agent can actively interact with other LM agents, the virtual/real environment, and humans. For connected LM agents, there exist two typical communication modes: intra-agent communications for seamless data/knowledge synchronization between the brain and the physical body within an LM agent, and inter-agent communications for efficient coordination between LM agents. Table III summarizes the comparison of the two communication modes.

• Intra-agent communications refer to the internal data/knowledge exchange within a single LM agent. This type of communication ensures that different components of the LM agent, including the planning, action, memory, interaction, and security modules, work in harmony. For example, an LM agent collects multimodal sensory data through its physical body, which then communicates the interpreted information to the LM-empowered brain. The planning module in the brain formulates a response or action plan, which is then executed by the action module. This seamless flow of information is critical for maintaining the LM agent's functionality, coherence, and responsiveness in real-time and dynamic scenarios.
• Inter-agent communications involve information and knowledge exchange between multiple LM agents. They enable collaborative task allocation, resource sharing, and coordinated actions among agents to foster collective intelligence. For example, in a smart city application, various LM agents managing traffic lights, public transportation, and emergency services share real-time data to optimize urban mobility and safety. Effective inter-agent communications rely on standardized protocols to ensure compatibility and interoperability, facilitating efficient and synchronized operations across the network of LM agents. A minimal sketch contrasting the two communication modes follows this list.
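The Python sketch below is a simplified illustration of the two modes: an intra-agent exchange in which a physical body reports a perception to its digital brain and receives an action plan, and an inter-agent broadcast in which brains share task-oriented knowledge. The Message schema and the in-memory "network" are invented for illustration; real deployments would use wireless links (intra-agent) or wired cloud networking (inter-agent) with standardized protocols.

```python
# Illustrative intra-/inter-agent messaging for connected LM agents.
# The schema and in-memory "network" are invented placeholders.

from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    receiver: str
    kind: str      # "sensor", "action", or "knowledge"
    payload: dict

class AgentBrain:
    def __init__(self, name, network):
        self.name, self.network = name, network
        network[name] = self

    def on_message(self, msg: Message):
        if msg.kind == "sensor":            # intra-agent uplink from body
            plan = {"action": f"handle {msg.payload['event']}"}
            return Message(self.name, msg.sender, "action", plan)
        if msg.kind == "knowledge":         # inter-agent sharing
            print(f"{self.name} learned: {msg.payload}")

    def share(self, fact: dict):
        for peer in self.network.values():  # broadcast to peer brains
            if peer is not self:
                peer.on_message(Message(self.name, peer.name,
                                        "knowledge", fact))

network = {}
car, drone = AgentBrain("car", network), AgentBrain("drone", network)

# Intra-agent: the body reports a perception, the brain returns a plan.
reply = car.on_message(Message("car_body", "car", "sensor",
                               {"event": "obstacle"}))
print(reply.payload)

# Inter-agent: one brain shares task-oriented knowledge with another.
car.share({"road_closed": "Via Appia"})
```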
5) Information Flow Between Human World and LM Agents: Humans interact with LM agents through natural language, mobile smart devices, and wearable technology, enabling LM agents to comprehend human instructions and address real-world issues effectively. LM agents, in turn, acquire new knowledge and data from human inputs, which aids in their continuous improvement and learning. This ongoing process of updating and optimizing their models allows LM agents to provide increasingly accurate and useful information. In AR and VR environments, LM agents can work collaboratively with human users in virtual settings, such as architectural design, for enhanced overall efficiency and creativity [50].

6) Information Flow Between Physical World and LM Agents: Empowered by digital twin technologies, LM agents can synchronize data and statuses between their physical bodies and their digital brains, creating a seamless interaction loop. LM agents can also monitor and act upon real-time inputs from their environments. This bidirectional synchronization allows LM agents to perceive and respond to their surrounding environments, whether virtual or real, with a high degree of precision and responsiveness, thus bridging the gap between the digital and physical realms. By continuously learning from environmental feedback, LM agents can accumulate knowledge and develop an understanding of physical laws, which empowers them to solve complex real-world problems. This iterative learning process ensures that LM agents not only react to immediate stimuli but also refine their embodied actions over time, achieving more sophisticated and effective solutions.

7) Information Flow Between Cyber World and LM Agents: In the cyber world, LM agents are interconnected into the Internet of LM agents through efficient cloud-edge networking, facilitating seamless data and knowledge sharing that promotes multi-agent collaboration. Deploying LMs across both cloud and edge infrastructures allows LM agents to leverage the strengths of both cloud and edge computing for optimized performance and responsiveness [51]. The cloud provides substantial computational power and storage, enabling the processing of vast amounts of data and the training of sophisticated models. Meanwhile, the edge offers real-time data processing capabilities closer to the data source, reducing latency and ensuring timely decision-making. In the Internet of LM agents, LM agents can collaboratively share data, knowledge, and learned experiences with others in real time, creating a robust and adaptive network of intelligence across multiple domains. For example, in a smart city, embodied LM agents in various locations can work together to optimize traffic flow, manage energy resources, and enhance public safety by sharing real-time data and coordinating their actions.
TABLE III: A Summary of Intra-Agent and Inter-Agent Communications for Connected LM Agents

                                  | Intra-agent Comm.        | Inter-agent Comm.
Involved Entity                   | Brain ←→ Physical Body   | Brain ←→ Brain
Connection Type                   | Within a single LM agent | Among multiple LM agents
Support Two-way Communication     | ✔                        | ✔
Support Multimodal Interaction    | ✔                        | ✔
Support Semantic Communication    | ✔                        | ✔
Typical Communication Environment | Wireless                 | Wired

[Figure: the planning module (feedback-free, feedback-enhanced, multi-persona self-, and grounded planning), action module (embodied actions, learning to use and make tools), memory module (short-term, long-term, and hybrid memory), interaction module (agent-environment, agent-human, and agent-agent interaction), and security module (privacy, trust, and ethics) of an LM agent.]

Fig. 7: Illustration of five constructing modules (i.e., planning, action, memory, interaction, and security) of connected LM agents, including their key components.

C. Key Components of LM Agent

As depicted in Fig. 7, a connected LM agent is generally constructed from five key modules [1], [8]–[10].

1) Planning Module: The planning module serves as the core of an LM agent [7], [10]. It utilizes advanced reasoning techniques to enable LM agents to devise efficient and effective solutions to complex problems. The working modes of the planning module include the following types.

• Feedback-free planning: The planning module enables LM agents to understand complex problems and find reliable solutions by breaking them down into necessary steps or manageable sub-tasks [7], [14]. For example, CoT [11] is a popular sequential reasoning approach where each thought builds directly on the previous one. It represents step-by-step logical thinking and can enhance the generation of coherent and contextually relevant responses. ToT [12] organizes reasoning as a tree-like structure, exploring multiple paths simultaneously. In ToT, each node represents a partial solution, allowing the model to branch and backtrack to find the optimal answer. Graph of Thought (GoT) [52] models reasoning using an arbitrary graph structure, allowing more flexible information flow. GoT captures complex relationships between thoughts, enhancing the model's problem-solving capabilities. AVIS [53] further refines the tree search process for visual QA tasks using a human-defined transition graph and enhances decision-making through a dynamic prompt manager.

• Feedback-enhanced planning: To make effective long-term planning in complex tasks, it is necessary to iteratively reflect on and refine execution plans based on past actions and observations [39]. The goal is to correct past errors and improve final outcomes. For example, ReAct [54] combines reasoning and acting by prompting LLMs to generate reasoning traces and actions simultaneously (a minimal sketch of such a reason-act-observe loop follows this list). This dual approach allows the LLM to create, monitor, and adjust action plans, while task-specific actions enhance interaction with external sources, improving response accuracy and reliability. Reflexion [55] converts environmental feedback into self-reflection and enhances ReAct by enabling LLM agents to learn from past errors and iteratively optimize behaviors. Reflexion features an actor that produces actions and text via models (e.g., CoT and ReAct) enhanced by memory, an evaluator that scores outputs using task-specific reward functions, and self-reflection that generates verbal feedback to improve the actor.

• Multi-persona self-planning: Inspired by pretend play, Wang et al. [56] develop a cognitive synergist that enables a single LLM to split into multiple personas, facilitating self-collaboration for solving complex tasks. They propose Solo Performance Prompting (SPP), where the LLM identifies, simulates, and collaborates with diverse personas, such as domain experts or target audiences, without external retrieval systems. SPP enhances problem-solving by allowing the LLM to perform multi-turn self-revision and gather feedback from various perspectives.
• Grounded planning: Executing plans in real-world environments (e.g., Minecraft) requires precise, multi-step reasoning. VOYAGER [50], the first LLM-powered agent in Minecraft, utilizes in-context lifelong learning to adapt and generalize skills to new tasks and worlds. VOYAGER includes an automatic curriculum for exploration, a skill library of executable code for complex behaviors, and an iterative prompting mechanism that refines programs based on feedback. Wang et al. [57] further propose an interactive describe-explain-plan-select (DEPS) planning approach that improves LLM-generated plans by integrating execution descriptions, self-explanations, and a goal selector that ranks sub-goals to refine planning. Additionally, Song et al. [7] present a grounded re-planning algorithm which dynamically updates high-level plans during task execution based on environmental perceptions, triggering re-planning when actions fail or after a specified time.
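To illustrate the feedback-enhanced mode, the Python sketch below implements a bare-bones reason-act-observe loop in the spirit of ReAct [54]. It is a simplified illustration rather than the original implementation: call_llm and run_tool are hypothetical stubs standing in for a real LLM API and real tool executors, and the Thought/Action/Observation format is abbreviated.

```python
# A minimal ReAct-style reason-act-observe loop (illustrative sketch).
# `call_llm` and `run_tool` are hypothetical stubs for a real LLM API
# and real tool executors; the prompt format is simplified.

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; returns Thought/Action text."""
    return "Thought: the task is done.\nAction: finish[42]"

def run_tool(action: str) -> str:
    """Placeholder that executes an action string, e.g. 'search[query]'."""
    return f"(observation for {action})"

def react_loop(task: str, max_steps: int = 5) -> str:
    trace = f"Task: {task}\n"
    for _ in range(max_steps):
        output = call_llm(trace)          # LLM emits reasoning + action
        trace += output + "\n"
        action = output.split("Action:")[-1].strip()
        if action.startswith("finish["):  # agent declares completion
            return action[len("finish["):-1]
        observation = run_tool(action)    # act on the environment/tool
        trace += f"Observation: {observation}\n"  # feed back into context
    return "(no answer within step budget)"

print(react_loop("What is 6 * 7?"))
```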
2) Memory Module: The memory module is integral to an LM agent's ability to learn and adapt over time [39]. It maintains an internal memory that accumulates knowledge from past interactions, thoughts, actions, observations, and experiences with users, other agents, and the environments. The stored information guides future decisions and actions, allowing the agent to continuously refine its knowledge and skills. This module ensures that the agent can remember and apply past lessons to new situations, thereby improving its long-term performance and adaptability [10]. There are various memory formats such as natural language, embedded vectors, databases, and structured lists. Additionally, RAG technologies [15] are employed to access external knowledge sources, further enhancing the accuracy and relevance of the LM agent's planning capabilities. In the literature [10], [39], memory can be divided into the following three types.

• Short-term memory focuses on the contextual information of the current situation. It is temporary and limited, typically managed through a context window that restricts the amount of information the LM agent can learn at a time.

• Long-term memory stores the LM agent's historical behaviors and thoughts. This is achieved through external vector storage, which allows for quick retrieval of important information, ensuring that the agent can access relevant past experiences to inform current decisions [58].

• Hybrid memory combines short-term and long-term memory to enhance an agent's understanding of the current context and leverage past experiences for better long-term reasoning. Liu et al. [59] propose the RAISE architecture to enhance ReAct for conversational AI agents by integrating a dual-component memory system, where the Scratchpad captures recent interactions as short-term memory, while the retrieval module acts as long-term memory to access relevant examples. HIAGENT [60] employs cross-trial and in-trial memory, where cross-trial memory stores historical trajectories and in-trial memory captures current trials. Instead of retaining all action-observation pairs, HIAGENT uses subgoals as memory chunks to save memory, each containing summarized observations. The LLM generates subgoals, executes actions to achieve them, and updates the working memory by summarizing and replacing completed subgoals with relevant information.

3) Action Module: The action module equips the LM agent with the ability to execute and adapt actions in various environments [9], [42]. This module is designed to handle embodied actions and tool-use capabilities, enabling the agent to interact with its physical surroundings adaptively and effectively. Besides, tools significantly broaden the action space of the agent.

• Embodied actions. The action module empowers LM agents to perform contextually appropriate embodied actions and adapt to environmental changes, facilitating interaction with and adjustment to physical surroundings [21], [25]. As LLM-generated action plans are often not directly executable in interactive environments, Huang et al. [25] propose refining LLM-generated plans for embodied agents by conditioning on demonstrations and semantically translating them into admissible actions. Evaluations in the VirtualHome environment show significant improvements in executability, ranging from 18% to 79% over the baseline LLM. Besides, SayCan [21] enables embodied agents such as robots to follow high-level instructions by leveraging LLM knowledge in physically-grounded tasks, where the LLM (i.e., "Say") suggests useful actions, while learned affordance functions (i.e., "Can") assess feasibility. SayCan's effectiveness is demonstrated through 101 zero-shot real-world robotic tasks in a kitchen setting. PaLM-E [61] is a versatile multimodal language model for embodied reasoning, visual-language, and language tasks. It integrates continuous sensor inputs, e.g., images and state estimates, into the same embedding space as language tokens, allowing for grounded inferences in real-world sequential decision-making.

• Learning to use & make tools. By leveraging various tools (e.g., search engines and external APIs) [62], an LM agent can gather valuable information to handle assigned complex tasks. For example, AutoGPT integrates LLMs with predetermined tools such as web and file browsing. InteRecAgent [63] integrates LLMs as the brain and recommender models as tools, using querying, retrieval, and ranking tools to handle complex user inquiries. Beyond using existing tools, LM agents can also develop new tools to enhance task efficiency [9]. To optimize tool selection with a large toolset, ReInvoke [64] introduces an unsupervised tool retrieval method featuring a query generator to enrich tool documents in offline indexing and an intent extractor to identify tool-related intents from user queries in online inference, followed by a multi-view similarity ranking strategy to identify the most relevant tools.

4) Interaction Module: The interaction module enables the LM agent to interact with humans, other agents, and the environment [41]. Through these varied interactions, the agent can gather diverse experiences and knowledge, which are essential for comprehensive understanding and adaptation.

• Agent-Agent Interactions. The interaction module allows LM agents to communicate and collaborate with other agents, fostering a cooperative network where information and resources are shared [62]. This interaction can include coordinating efforts on shared tasks, exchanging knowledge to solve problems, and negotiating roles in multi-agent scenarios.

• Agent-Human Interactions. LM agents can interact with humans, including understanding and responding to natural language commands, recognizing and interpreting human emotions and expressions, and providing assistance in various tasks [20]. As observed, LLMs such as GPT-4 often tend to forget character settings in multi-turn dialogues and struggle with detailed role assignments due to context window limits. To address this, a tree-structured persona model is introduced in [65] for character assignment, detection, and maintenance, enhancing agent interactions.

• Agent-Environment Interactions. LM agents can engage directly with physical or virtual environments. By facilitating engagement in physical, virtual, or mixed-reality environments [1], [21], the interaction module ensures that LM agents can operate effectively across different contexts. Lai et al. develop the AutoWebGLM agent [66], which excels in web browsing tasks through curriculum learning, self-sampling reinforcement learning, and rejection sampling fine-tuning. A Chrome extension based on AutoWebGLM validates its effective reasoning and operation capability across various websites in real-world services.

5) Security Module: The security module is crucial to ensure the secure, safe, ethical, and privacy-preserving operations of LM agents [42]. It is designed to monitor and regulate the LM agent's actions, interactions, and decisions to prevent harm and ensure compliance with legal and ethical standards. This module employs technologies such as hallucination mitigation, anomaly detection, and access control to identify and mitigate potential security/privacy threats. It also incorporates ethical guidelines and bias mitigation techniques to ensure fair and responsible behaviors. The security module can dynamically adapt to emerging threats by learning from new security/privacy incidents and integrating updates from security/privacy databases and policies.

Connections Between Modules: The key components of an LM agent are interconnected to create a cohesive and intelligent system. Particularly, the planning module relies on the memory module to access past experiences and external knowledge, ensuring informed decision-making. The action module executes plans generated by the planning module, adapting actions based on real-time feedback and memory. The interaction module enhances these processes by facilitating communication and collaboration, which provides additional data and context for the planning and memory modules. Besides, security considerations are seamlessly integrated into every aspect of the LM agent's operations to ensure robust and trustworthy performance. A minimal sketch of how these modules might be wired together is given below.
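As a concrete, deliberately simplified illustration of these connections, the Python sketch below wires toy versions of the modules into a single agent step: planning consults memory, the security module gates each action, and execution feedback is written back to memory. All class and method names are invented for illustration and do not correspond to any surveyed system.

```python
# Illustrative wiring of an LM agent's modules into one step.
# All classes and methods here are hypothetical placeholders.

class Memory:
    def __init__(self):
        self.events = []                      # long-term store (simplified)
    def recall(self, query, k=3):
        return self.events[-k:]               # stand-in for vector retrieval
    def store(self, event):
        self.events.append(event)

class Planner:
    def plan(self, goal, context):
        # A real planner would prompt an LM with the goal + recalled context.
        return [f"step for: {goal}"]

class Security:
    def allow(self, action):
        return "delete" not in action         # toy policy check

class Actuator:
    def execute(self, action):
        return f"done({action})"

class LMAgent:
    def __init__(self):
        self.memory, self.planner = Memory(), Planner()
        self.security, self.actuator = Security(), Actuator()
    def step(self, goal):
        context = self.memory.recall(goal)            # memory -> planning
        for action in self.planner.plan(goal, context):
            if not self.security.allow(action):       # security gates actions
                continue
            feedback = self.actuator.execute(action)  # action -> environment
            self.memory.store((action, feedback))     # feedback -> memory
        return self.memory.events[-1]

print(LMAgent().step("tidy the room"))
```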

D. Enabling Technologies of LM Agents

As illustrated in Fig. 8, there are five enabling technologies underlying the engine of connected LM agents.

Fig. 8: Illustration of five underlying technologies, including their roles and key components, acting as the engine of the connected LM agents.

1) Large Foundation Model Technologies: LMs, such as LLMs and LVMs, serve as the core brains or controllers, providing advanced capabilities for AI agents across diverse applications [6], [57]. Table IV summarizes the basic training stages of LLMs. (i) Multimodal capability: By employing multimodal perception (e.g., CLIP [69]) and tool utilization strategies, LM agents can perceive and process various data types from virtual and real environments, as well as human inputs, through multimodal data fusion, allowing for a holistic understanding of their surroundings [1]. Additionally, due to their powerful multimodal comprehension and generation capabilities [70], LM agents can interact seamlessly with both other agents and humans, fostering collaboration and competition among agents and humans. (ii) Advanced reasoning: LM technologies also empower AI agents with advanced reasoning and problem decomposition abilities [17]. For example, CoT [11] and ToT [12] reasoning allow LM agents to break down complex tasks into manageable sub-tasks, ensuring systematic and logical problem-solving approaches. (iii) Few/zero-shot generalization: Pre-trained on extensive corpora, LMs demonstrate few-shot and zero-shot generalization capabilities [25]. This allows LM agents to transfer knowledge seamlessly between tasks (a small prompting sketch follows Table IV). (iv) Adaptability: Through continuous learning and adaptation, LM agents can accumulate knowledge and improve their performance over time by learning from new data and experiences [71]. This adaptability ensures that AI agents maintain high levels of performance in rapidly changing environments.

TABLE IV: A Summary of LLM Stages, Utilized Technologies, and Representatives

Stage of LLM | Description | Utilized Technology | Representative
Pre-training | Pre-train LLM in a self-supervised manner on a large corpus | Transformer | GPT-3, PaLM-2, LLaMA-2
Fine-tuning | Fine-tune pre-trained LLM for downstream tasks | Instruction-tuning, alignment-tuning, transfer learning | WebGPT [16], T0, LLaMA-2-Chat
Prompting | Set up prompts and query trained LLMs for generating responses | Prompt engineering, zero-shot prompting, in-context learning | RALM [67], ToolkenGPT [68]

2) Knowledge-Related Technologies: Knowledge-related technologies significantly enhance the capabilities of LM agents in knowledge representation [72], acquisition [73], fusion [74], retrieval [58], and synchronization [75]. These technologies collectively empower LM agents with deeper contextual understanding, more accurate reasoning, and better interaction. (i) Knowledge fusion: Knowledge representation and fusion technologies enable LM agents to build comprehensive knowledge bases by integrating information from diverse sources [74], such as structured databases, vector databases, and multimodal KGs. KGs, in particular, organize information into structured relationships, allowing LM agents to comprehend complex contexts and infer new information. (ii) RAG: RAG combines retrieval mechanisms with generative models, enabling LM agents to dynamically fetch relevant external information from vast knowledge sources [76]. This ensures that the outputs of LM agents are contextually relevant and up-to-date (see the sketch below). Semantic understanding techniques [77], [78] further enhance LM agents' ability to interpret and generate content that is both contextually appropriate and semantically rich. (iii) Knowledge synchronization: Continuous learning algorithms [79], [80] facilitate the ongoing updating of knowledge bases, allowing agents to incorporate new information incrementally and adapt to changing environments. These technologies also promote knowledge sharing and collaboration among multiple agents and between agents and humans.
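The sketch below illustrates the RAG pattern under strong simplifications: keyword-overlap scoring stands in for a learned vector index, and generate is a hypothetical LLM stub. Real systems such as those in [15], [76] use dense embeddings and production LLM endpoints.

```python
# Minimal RAG sketch: retrieve the most relevant documents, then
# condition the generator on them. Keyword overlap stands in for a
# real vector index; `generate` is a hypothetical LLM stub.

KNOWLEDGE_BASE = [
    "PaLM 2 is a large language model from Google.",
    "Digital twins mirror physical assets with live data.",
    "Rome's peak tourist season runs from June to August.",
]

def retrieve(query: str, k: int = 2) -> list:
    q = set(query.lower().split())
    scored = sorted(KNOWLEDGE_BASE,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def generate(prompt: str) -> str:
    return "(answer grounded in retrieved context)"  # placeholder LLM

def rag_answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)

print(rag_answer("When is tourist season in Rome?"))
```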
3) Interaction Technologies: Interaction technologies significantly enhance LM agents' ability to engage with users, other agents, and the environment in natural, immersive, and contextually aware ways [42]. (i) HMI and human-robot interaction (HRI): HMI and HRI technologies [21], bolstered by NLP, enable LM agents to understand complex instructions, recognize speech, and interpret emotions, facilitating more intuitive interactions. (ii) 3D digital humans: 3D digital humans create realistic and engaging interfaces for communication, allowing LM agents to provide empathetic and personable interactions in applications such as customer support and healthcare [40]. (iii) AR/VR/MR: AR, VR, and MR technologies create immersive and interactive environments, allowing users to engage with digital and physical elements seamlessly. This is particularly beneficial in education, retail, and training, where such immersive experiences can enhance learning, engagement, and decision-making [48]. AR overlays digital information onto the real world, enabling users to interact with virtual objects in their physical surroundings; VR creates fully immersive digital environments for engaging with virtual elements; while MR blends digital information with physical environments. (iv) Multimodal interfaces: Multimodal interfaces enable LM agents to process and respond to various forms of input, including text, speech, images, gestures, and touch [1]. This allows users to interact with LM agents through their preferred modalities, making interactions more flexible and user-friendly.

4) Digital Twin Technologies: Digital twin technologies [49] create comprehensive virtual representations of LM agents' physical bodies and operational environments, continuously updated with real-time data from sensors, actuators, and other inputs.

This ensures an accurate and up-to-date reflection of their real- flows. For example, AutoGPT 3 is an open-source autonomous
world counterparts. (i) Virtual-physical synchronization: Digital agent utilizing GPT-3.5 or GPT-4 APIs to independently exe-
twin technologies empower LM agents by enabling seamless cute complex tasks by breaking down them into several sub-
and efficient synchronization of attributes, behaviors, states, and tasks and chaining LLM outputs, showcasing advanced reasoning
other data between their physical bodies and digital brains. capabilities [14]. AutoGen4 , developed by Microsoft, offers an
This synchronization is achieved through intra-agent bidirectional open-source multi-agent conversation framework, supports APIs
communications, where the physical body continuously transmits as tools for improved LLM inference, and emphasizes the auto-
real-time data to the digital twin for processing and analysis, matic generation and fine-tuning of AI models [89]. BabyAGI 5
while the digital twin sends back instructions and optimizations integrates task management via OpenAI platforms and vector
[49]. (ii) Virtual-physical feedback: This continuous feedback databases, simulating a simplified AGI by autonomously creating
loop enhances LM agent’s contextual awareness, allowing for and executing tasks based on high-level objectives. ChatDev6 fo-
immediate adjustments and optimizations in response to changing cuses on enhancing conversational AI, providing sophisticated di-
conditions [23]. For example, an LM agent operating machin- alogue management, coding, debugging, and project management
ery can use its digital twin to anticipate mechanical wear and to streamline software development processes [90]. MetaGPT 7
proactively schedule maintenance, thereby minimizing downtime explores the meta-learning paradigm, where the model is trained
and enhancing efficiency. (iii) Predictive analytics: Digital twins to rapidly adapt to new tasks by leveraging knowledge from
facilitate predictive analytics and simulation, enabling LM agents related tasks, thus improving efficiency and performance across
to anticipate future states and optimize their actions accordingly diverse applications [91].
[22]. This capability is crucial in complex environments where 1) Mobile Communications: LM agents offer significant ad-
unforeseen changes can significantly impact performance. Over- vantages for mobile communications by enabling low-cost and
all, digital twin technologies ensure that LM agents operate with context-aware decision-making [92], personalized user experi-
high accuracy, adaptability, and responsiveness across diverse ences [87], and automatic optimization problem formulation for
applications, effectively bridging the gap between the physical wireless resource allocation [93]. For example, NetLLM [92] fine-
and digital realms. tuning the LLM to acquire domain knowledge from multimodal
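As a concrete illustration of this synchronization and feedback loop, consider the following minimal digital twin cycle; the sensor fields, wear model, and threshold are hypothetical:

```python
# Illustrative virtual-physical synchronization loop for a digital twin.
import random

class DigitalTwin:
    def __init__(self):
        self.state = {"temperature": 0.0, "wear": 0.0}

    def sync(self, sensor_reading: dict) -> None:
        """Physical body -> digital brain: mirror the latest telemetry."""
        self.state.update(sensor_reading)

    def predict_wear(self, horizon: int = 10) -> float:
        """Simple linear extrapolation standing in for predictive analytics."""
        return self.state["wear"] + 0.01 * horizon

    def instructions(self) -> str:
        """Digital brain -> physical body: feedback and optimization."""
        if self.predict_wear() > 0.8:
            return "schedule_maintenance"
        return "continue_operation"

twin = DigitalTwin()
for step in range(5):  # continuous bidirectional communication loop
    reading = {"temperature": 60 + random.random(), "wear": 0.15 * step}
    twin.sync(reading)
    print(step, twin.instructions())
```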
5) Multi-Agent Collaboration Technologies: Multi-agent collaboration technologies [41] enable coordinated efforts of multiple LM agents, allowing them to work together synergistically to achieve common goals and tackle complex tasks that would be challenging for individual agents to handle alone. (i) Data cooperation: It facilitates real-time and seamless information sharing through inter-agent communications, enabling LM agents to continuously synchronize their understanding of dynamic environments [81], [82]. (ii) Knowledge cooperation: By leveraging knowledge representation frameworks such as KGs [47], [83] and vector databases [84], LM agents can share and aggregate domain-specific insights, enhancing collective learning and decision-making. This shared knowledge base allows LM agents to build upon each other's experiences to accelerate the learning process [71]. (iii) Computation cooperation: On the one hand, collaborative problem-solving techniques, such as multi-agent planning [85] and distributed reasoning [86], empower LM agents to jointly analyze complex issues, devise solutions, and execute coordinated actions. On the other hand, dynamic resource allocation mechanisms [87], [88], including market-based and game-theoretical approaches, enable LM agents to negotiate and allocate resources dynamically, thereby optimizing resource utilization and ensuring effective task execution across multiple agents. This synergy is particularly beneficial in dynamic environments, as it not only improves the operational capabilities of individual LM agents but also enhances the overall functionality of the system of connected LM agents.

E. Modern Prototypes & Applications of LM Agents

Recently, various industrial projects of LM agents, such as AutoGPT, AutoGen, BabyAGI, ChatDev, and MetaGPT, demonstrate their diverse potential in assisting web, life, and business scenarios, such as planning personalized travels, automating creative content generation, and enhancing software development workflows. For example, AutoGPT (https://autogpt.net/) is an open-source autonomous agent utilizing GPT-3.5 or GPT-4 APIs to independently execute complex tasks by breaking them down into several sub-tasks and chaining LLM outputs, showcasing advanced reasoning capabilities [14]. AutoGen (https://microsoft.github.io/autogen/), developed by Microsoft, offers an open-source multi-agent conversation framework, supports APIs as tools for improved LLM inference, and emphasizes the automatic generation and fine-tuning of AI models [89]. BabyAGI (https://github.com/yoheinakajima/babyagi) integrates task management via OpenAI platforms and vector databases, simulating a simplified AGI by autonomously creating and executing tasks based on high-level objectives. ChatDev (https://github.com/OpenBMB/ChatDev) focuses on enhancing conversational AI, providing sophisticated dialogue management, coding, debugging, and project management to streamline software development processes [90]. MetaGPT (https://www.deepwisdom.ai/) explores the meta-learning paradigm, where the model is trained to rapidly adapt to new tasks by leveraging knowledge from related tasks, thus improving efficiency and performance across diverse applications [91].

1) Mobile Communications: LM agents offer significant advantages for mobile communications by enabling low-cost and context-aware decision-making [92], personalized user experiences [87], and automatic optimization problem formulation for wireless resource allocation [93]. For example, NetLLM [92] fine-tunes the LLM to acquire domain knowledge from multimodal data in networking scenarios (e.g., adaptive bitrate streaming, viewport prediction, and cluster job scheduling) with reduced handcraft costs. Meanwhile, NetGPT [87] designs a cloud-edge cooperative LM framework for personalized outputs and enhanced prompt responses in mobile communications via de-duplication and prompt enhancement technologies. ChatNet [94] uses four GPT-4 models to serve as an analyzer (to plan network capacity and designate tools), a planner (to decouple network tasks), a calculator (to compute and optimize the cost), and an executor (to produce customized network capacity solutions) via prompt engineering.

LM agents can also help enhance the QoE of end users. For example, MobileAgent v2 [18], launched by Alibaba, is a mobile device operation assistant that achieves effective navigation through multi-agent collaboration, automatically performing tasks such as application installation and map navigation, and supports multimodal input including visual perception, enhancing operational efficiency on mobile devices. AppAgent [19], developed by Tencent, performs various tasks on mobile phones through autonomous learning and imitating human click and swipe gestures, including posting on social media, helping users write and send emails, using maps, online shopping, and even complex image editing.

2) Intelligent Robots: LM agents play a crucial role in advancing intelligent industrial and service robots [21]. These robots can perform complex tasks such as product assembly, environmental cleaning, and customer service, by perceiving surroundings and learning necessary skills through deep learning models. In August 2024, FigureAI released Figure 02 (https://spectrum.ieee.org/figure-new-humanoid-robot), a human-like robot powered by OpenAI LM, capable of fast common-sense visual reasoning and speech-to-speech conversation with humans to handle dangerous jobs in various environments.
Besides, in July 2024, Tesla unveiled its second-generation humanoid robot named Optimus (https://viso.ai/edge-ai/tesla-bot-optimus/), which demonstrates the enhanced capabilities brought by advanced LM agents.

TABLE V: Comparison of Existing Typical LM Agent Prototypes

Prototype | Application | Enhancing Intelligence | Improving Safety | Optimizing Experience | Other Key Features
AutoGPT | General scenarios | Yes | N/A | Yes | Task decomposition and execution
AutoGen | General scenarios | Yes | N/A | Yes | Multi-agent conversation framework
BabyAGI | General scenarios | Yes | N/A | Yes | Autonomous task management simulation
ChatDev | General scenarios | Yes | N/A | Yes | Conversational AI for software development
MetaGPT | General scenarios | Yes | N/A | Yes | Meta-learning for task adaptation
NetLLM | Mobile communications | Yes | N/A | Yes | Fine-tunes LLM for networking
NetGPT | Mobile communications | Yes | N/A | Yes | Cloud-edge cooperative LM service
MobileAgent v2 | Mobile communications | Yes | N/A | Yes | Multimodal input support
AppAgent | Mobile communications | Yes | N/A | Yes | Performs tasks on mobile devices
Figure 02 by FigureAI | Intelligent robots | Yes | N/A | N/A | Performs dangerous jobs
Optimus by Tesla | Intelligent robots | Yes | N/A | N/A | Second-generation humanoid robot
Baidu Apollo ADFM | Autonomous driving | Yes | Yes | Yes | Supports L4 autonomous driving
PentestGPT | Attack-defense confrontation | Yes | Yes | N/A | 87% success in vulnerability exploitation
AutoAttacker | Attack-defense confrontation | Yes | No | N/A | Automatically executes network attacks

3) Autonomous Driving: LM agents are transforming autonomous driving by enhancing vehicle intelligence, improving safety, and optimizing the driving experience (e.g., offering a personalized in-car experience) [22]. In May 2024, Baidu's Apollo Day saw the launch of Carrot Run's sixth-generation unmanned vehicle, which is built upon the Apollo Autonomous Driving Foundation Model (ADFM) (https://autonews.gasgoo.com/icv/70033042.html), the first LM agent supporting L4 autonomous driving. Companies such as Tesla, Waymo, and Cruise are also leveraging LM agents to refine their autonomous driving systems, aiming for safer and more efficient transportation solutions.

4) Autonomous attack-defense confrontation: LM agents can be seen as autonomous and intelligent cybersecurity decision-makers capable of making security decisions and taking threat-handling actions without human intervention. For example, PentestGPT [95] is an automated penetration testing tool supported by LLMs, designed to use GPT-4 for automated network vulnerability scanning and exploitation. AutoAttacker [96], an LM tool, can autonomously generate and execute network attacks based on predefined attack steps. As reported by recent research [97], LM agents can automatically exploit one-day vulnerabilities; in tests on 15 real-world vulnerability datasets, GPT-4 successfully exploited 87% of vulnerabilities, significantly outperforming other tools.

III. NETWORKING LARGE MODEL AGENTS: PARADIGMS

A. Overview of Interactions of Connected LM Agents

For connected LM agents, multi-agent interactions refer to the dynamic and complex interactions between multiple autonomous LM agents that operate within a shared environment. As depicted in Fig. 9, these interactions can be categorized into cooperation, partial cooperation, and competition [62], [98], each of which involves different strategies to jointly optimize the collective or individual outcomes.

Fig. 9: Illustration of interaction types of LM agents, i.e., fully cooperative, partially cooperative, and fully competitive.

1) Competition: Competition involves LM agents pursuing their individual objectives, often at the expense of others. This interaction mode is characterized by non-cooperative strategies where agents aim to maximize their own benefits. Non-cooperative games and multi-agent debate have been widely adopted to model the competitive behaviors among LM agents. In non-cooperative games, LM agents engage in strategic decision-making where each LM agent's goal is to search for the Nash equilibrium. LM agents in multi-agent debate [99] engage in structured arguments or debates to defend their positions and challenge the strategies of others.

The cognitive behavior of LLMs, such as self-reflection, has proven effective in solving NLP tasks but can lead to thought degeneration due to biases, rigidity, and limited feedback. Multi-agent debate (MAD) [99] explores interactions among LLMs, where agents engage in a dynamic tit-for-tat, allowing them to correct each other's biases, overcome resistance to change, and provide mutual feedback. In MAD, diverse role prompts (i.e., personas) are crucial, and there are generally three communication strategies: (a) one-by-one debate, (b) simultaneous-talk, and (c) simultaneous-talk-with-summarizer.
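A minimal sketch of strategy (a), one-by-one debate, is given below; the `call_llm` stub, personas, and fixed round count are illustrative assumptions. In simultaneous-talk, agents would draft their turns without seeing the current round's other arguments, and the summarizer variant would compress the transcript between rounds:

```python
# Sketch of the 'one-by-one debate' strategy in multi-agent debate (MAD).
def call_llm(persona: str, topic: str, transcript: list[str]) -> str:
    # Placeholder: a real implementation would query an LLM here.
    return f"[{persona}] position on '{topic}' given {len(transcript)} prior turns"

def one_by_one_debate(topic: str, personas: list[str], rounds: int = 2) -> list[str]:
    transcript: list[str] = []
    for _ in range(rounds):
        for persona in personas:          # agents speak in turn, each seeing
            turn = call_llm(persona, topic, transcript)  # all earlier arguments
            transcript.append(turn)
    return transcript

for line in one_by_one_debate("carbon tax", ["proponent", "critic", "judge"]):
    print(line)
```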
2) Partial Cooperation: Partial cooperation occurs when LM agents collaborate to a limited extent, often driven by overlapping but not fully aligned interests [62]. In such scenarios, agents might share certain resources or information while retaining autonomy over other aspects. This interaction mode balances the benefits of cooperation with the need for individual agency and competitive advantage. Partial cooperation can be strategically advantageous in environments where complete cooperation is impractical or undesirable due to conflicting goals or resource constraints. Hierarchical game and coalition formation theory can be employed to model both cooperative and competitive interactions among LM agents.
Fig. 10: Illustration of the cooperation modes of connected LM agents including data cooperation, computation cooperation, and knowledge cooperation.
Hierarchical game [100] structures the interactions of LM agents across different game stages, where they may cooperate at one level and compete at another. Through the coalition formation theory [101], LM agents can form the optimal stable coalition structure under different scenarios and environments, where the optimal stable coalition structure is featured with intra-coalition cooperation and inter-coalition competition. Liu et al. propose DyLAN [102], a multi-layered dynamic LLM agent network for collaborative task-solving, such as reasoning and code generation. DyLAN facilitates multi-round interactions in a dynamic setup with an LLM-empowered ranker to deactivate low-performing agents, an early-stopping mechanism based on Byzantine Consensus to efficiently reach agreement, and an automatic optimizer to select the best agents based on agent importance scores. However, existing cooperative agents rely on learning-based methods, with performance highly dependent on training diversity, limiting their adaptability with unfamiliar teammates, especially in zero-shot coordination. Zhang et al. [85] introduce ProAgent for cooperative agents, featuring four key modules: planner, verificator, controller, and memory. The verificator checks skill feasibility, identifies issues if a skill fails, and triggers re-planning if necessary. If feasible, the controller breaks the skill into executable actions.

3) Cooperation: For connected LM agents, cooperation is a fundamental interaction mode where LM agents work together to achieve common goals, share resources, and optimize collective outcomes. This involves data cooperation, computation cooperation, and information (including knowledge) cooperation.

• Data Cooperation: LM agents continuously exchange and fuse their individual data (e.g., perceived data) to ensure a comprehensive and up-to-date understanding of their environment, thereby enhancing collective intelligence and enabling coordinated actions [81], [82].
• Computation Cooperation: LM agents perform coordinated reasoning, such as service cascades, by distributing computational tasks optimally across agents, leveraging collective processing power to handle complex computations more efficiently [52], [85], [86], [88].
• Information & Knowledge Cooperation: LM agents share domain-specific knowledge and experiences (e.g., in the format of KGs) to collectively improve their problem-solving capabilities for better-informed actions and decisions, via knowledge base synchronization and distributed learning algorithms [71], [47], [83].

The methodologies employed for facilitating cooperation among LM agents include role-playing [90], [103], multi-objective optimization, cooperative game theory [104], Nash bargaining solution, auction mechanisms [88], Multi-Agent Reinforcement Learning (MARL) [98], swarm intelligence algorithms, Federated Learning (FL) [86], and the theory of mind [105].

• Role-playing. Li et al. [30] propose a cooperative agent framework that employs role-playing with inception prompting to guide agents autonomously toward task completion (a minimal sketch of such a loop appears after this list). The system begins with human-provided ideas and role assignments, refined by a task-specifier agent. An AI user and an AI assistant then collaborate through multi-turn dialogues, with the AI user instructing and the assistant responding until the task is completed.
ChatDev [90] is a virtual software company operated by “software agents” with diverse roles, e.g., chief officers, programmers, and test engineers. Following the waterfall model, it divides development into design, coding, testing, and documentation, breaking each phase into atomic subtasks managed by a chat chain. MetaGPT [91] integrates human workflows into LLM-based multi-agent collaborations. By encoding Standardized Operating Procedures (SOPs) into prompt sequences, MetaGPT streamlines workflows and reduces errors through agents with human-like expertise. Agents are defined with specific profiles and communicate through structured communication interfaces, such as documents and diagrams, instead of dialogue. Using a publish-subscribe mechanism, agents can freely exchange messages via a shared message pool, publishing their outputs and accessing others' transparently. Park et al. [103] create a community of 25 generative agents in a sandbox world named Smallville, where agents are represented by sprite avatars. These agents perform daily tasks, form opinions, and interact, mimicking human-like behaviors.
• Theory of mind. It refers to the ability to understand others' hidden mental states, which is essential for social interactions. As LLMs engage more in human interactions, enhancing their social intelligence is vital. Li et al. [105] identify limitations in LLM collaboration and propose a prompt-engineering method to incorporate explicit belief state representations. They also introduce a novel evaluation of LLMs' high-order theory of mind in teamwork scenarios, emphasizing dynamic belief state evolution and intent communication among agents.
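A minimal sketch of the role-playing loop referenced above is given below; the `chat` stub, role prompts, and turn limit are illustrative assumptions rather than the design of [30]:

```python
# Minimal role-playing loop in the spirit of inception prompting:
# an AI user instructs and an AI assistant responds until a turn limit.
def chat(role_prompt: str, message: str) -> str:
    # Placeholder: a real system would call an LLM with this role prompt.
    return f"({role_prompt}) reply to: {message}"

def run_task(task: str, max_turns: int = 3) -> list[tuple[str, str]]:
    user_prompt = "You are a project manager; give one instruction at a time."
    assistant_prompt = "You are a Python developer; fulfil each instruction."
    dialogue, message = [], f"Task specification: {task}"
    for _ in range(max_turns):
        instruction = chat(user_prompt, message)        # AI user instructs
        solution = chat(assistant_prompt, instruction)  # AI assistant solves
        dialogue.append((instruction, solution))
        message = solution   # the next instruction builds on the last answer
    return dialogue

for turn in run_task("build a trading bot"):
    print(turn)
```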
In the following, we discuss the detailed cooperation paradigms of LM agents from the perspectives of data cooperation, computation cooperation, and knowledge cooperation.

B. Data Cooperation for LM Agents

The data cooperation among LM agents involves the modality perspective, the temporal perspective, and the spatial perspective. Effective data cooperation ensures that LM agents can seamlessly integrate and utilize data from diverse sources and modalities, enhancing their capabilities and performance in various applications.

1) Multimodal Data Cooperation: Multimodal data cooperation emphasizes the fusion of data from various modalities, such as text, images, audio, and video, to provide a comprehensive understanding of the environment. This cooperation allows LM agents to process and interpret information from multiple sources, leading to more accurate and robust decision-making.

• Data Fusion: By combining data from different modalities, it helps create a unified representation that leverages the strength of each type of data (a toy fusion sketch follows this list). For example, Wu et al. [106] propose a multi-agent collaborative vehicle detection network named MuDet, which integrates RGB and height map modalities for improved object identification in dense and occluded environments such as post-disaster sites. Gross et al. [81] discuss the use of multimodal data to model communication in artificial social agents, emphasizing the importance of verbal and nonverbal cues for natural human-robot interaction.
• Cross-Modal Retrieval: By enabling LM agents to retrieve relevant information across different modalities, it enhances their ability to respond to complex queries and scenarios. For example, Gur et al. [107] design an alignment model and retrieval-augmented multi-modal transformers for efficient image-caption retrieval in visual Question-Answering (QA) tasks. By considering the intra-modality similarities in multi-modal video representations, Zolfaghari et al. [72] introduce the contrastive loss in the contrastive learning process for enhanced cross-modal embedding. The effectiveness is validated using the LSMDC and Youcook2 datasets for video-text retrieval and video captioning tasks. To further enable fine-grained cross-modal retrieval, Chen et al. [108] develop a novel attention mechanism to effectively integrate feature information from different modalities and represent them within a unified space, thereby overcoming the semantic gap between multiple data modalities.
• Contextual Understanding: By utilizing multimodal data, it facilitates a richer contextual understanding of the environment with improved accuracy of predictions and actions. For example, Li et al. [78] design a general semantic communication-based multi-agent collaboration framework with enhanced contextual understanding and study a use case in search and rescue tasks.
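The following toy sketch illustrates the data-fusion idea referenced above: features from two modalities are combined into one unified vector. The mock encoders and fixed fusion weight are illustrative assumptions; real systems such as [106] learn how to fuse modalities:

```python
# Toy early-fusion of two modality features into one unified representation.
import random

def text_features(text: str, dim: int = 4) -> list[float]:
    random.seed(sum(text.encode()))      # mock encoder, deterministic per input
    return [random.random() for _ in range(dim)]

def image_features(image_id: str, dim: int = 4) -> list[float]:
    random.seed(sum(image_id.encode()))  # stand-in for a vision encoder
    return [random.random() for _ in range(dim)]

def fuse(t: list[float], v: list[float], w_text: float = 0.5) -> list[float]:
    """Weighted elementwise fusion; real systems learn attention weights."""
    return [w_text * a + (1 - w_text) * b for a, b in zip(t, v)]

unified = fuse(text_features("road blocked by debris"), image_features("cam-17"))
print(unified)
```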
2) Spatio-temporal Data Cooperation: Spatio-temporal data cooperation for LM agents involves the integration and synchronization of spatial and temporal data across various modalities and sources, enabling LM agents to achieve a comprehensive and dynamic understanding of the environment over time. This cooperation ensures that LM agents can effectively analyze patterns, predict future states, and make informed decisions in real time, based on both the spatial distribution and temporal evolution of data.

• Spatio-temporal Feature Fusion: Yang et al. [109] introduce SCOPE, a collaborative perception mechanism that enhances spatio-temporal awareness among on-road agents through end-to-end aggregation. SCOPE excels by leveraging temporal semantic cues, integrating spatial information from diverse agents, and adaptively fusing multi-source representations to improve accuracy and robustness. However, [109] mainly works for small-scale scenarios. By capturing both the spatial and temporal heterogeneity of citywide traffic, Ji et al. [110] propose a novel spatio-temporal self-supervised learning framework for traffic prediction that improves the representation of traffic patterns. This framework uses an integrated module combining temporal and spatial convolutions and employs adaptive augmentation of traffic graph data, supported by two auxiliary self-supervised learning tasks to improve prediction accuracy. To further address data noise, missing information, and distribution heterogeneity in spatio-temporal data, which are overlooked in [109], [110], Zhang et al. [111] devise an automated spatio-temporal graph contrastive learning framework named AutoST. Built on a heterogeneous Graph Neural Network (GNN), AutoST captures multi-view region dependencies and enhances robustness through a spatio-temporal augmentation scheme with a parameterized contrastive view generator.
• Temporal Dynamics with Topological Reasoning: Chen et al. [82] propose a temporally dynamic multi-agent collaboration framework that organizes agents using directed acyclic graphs to facilitate interactive reasoning, demonstrating superior performance across various network topologies and enabling collaboration among thousands of agents. A key finding in [82] is the discovery of the collaborative scaling law, where solution quality improves in a logistic growth pattern as more agents are added, with collaborative emergence occurring sooner than neural emergence.
Fig. 11: The taxonomy of computation cooperation of LM agents.

3) Space-Air-Ground-Sea Data Cooperation: Space-Air-Ground-Sea (SAGS) data cooperation involves the integration of data from various spatial locations, encompassing satellites, aerial vehicles, ground-based sensors, and maritime sources. This cooperation enables LM agents to have a holistic view of the environment, which is crucial for applications in areas such as disaster response, environmental monitoring, and logistics. Xu et al. [112] propose an analytical model of coverage probability for ocean Surface Stations (SSs) located far from the coastline. By employing various types of relays, including onshore stations, tethered balloons, high-altitude platforms, and satellites, this model establishes communication links between core-connected base stations and SSs via stochastic geometry, adapting relay station selection based on the SS's distance from the coastline. To further minimize the total network latency caused by long-distance transmission and environmental factors, Nguyen et al. [113] break down the latency optimization problem into three sub-problems: clustering ground users with UAVs, cache placement in UAVs to alleviate backhaul congestion, and power allocation for satellites and UAVs. A distributed optimization approach is devised, which utilizes a non-cooperative game for clustering, a genetic algorithm for cache placement, and a quick estimation technique for power allocation.

C. Computation Cooperation for LM Agents

The computation cooperation paradigm of LM agents includes horizontal/vertical cooperation and cross-layer cooperation.

① Horizontal/vertical LM collaboration. It can be classified into three modes: horizontal cooperation, vertical cooperation, and hybrid cooperation.

Fig. 12: Illustration of horizontal collaboration.

1) Horizontal collaboration refers to multiple LM agents completing the same task independently at the same time, and then summarizing and integrating their respective outputs to generate the final result. It allows the collaborative system to scale horizontally by adding more LM agents to handle complex and dynamic tasks. The parallel processing and multi-angle perspectives contribute to increased robustness and diversity, minimizing errors and biases that could arise from a single model's limitations. As depicted in Fig. 12, each LM agent independently analyzes a different aspect of a stock, including trend, news, and industry, to determine whether it is a good investment. Horizontal collaboration has been applied in various LM agent systems. For example, ProAgent [85] designs a framework where LM agents work as teammates, analyzing each other's intentions and updating their own beliefs based on the observed behaviors of their peers. This enhances collective decision-making by allowing agents to adapt dynamically in real time. Similarly, DyLAN [114] assembles a team of strategic agents that communicate through task-specific queries in a dynamic interaction architecture, enabling multiple rounds of interaction to improve both efficiency and overall performance. However, the independent analysis of each LM agent may lead to inconsistent outputs, making it difficult to aggregate the final conclusions. Therefore, horizontal collaboration requires effective mechanisms to handle disagreements between LM agents and ensure that their contributions complement each other, which may increase the complexity of the coordination process.
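The following is a minimal sketch of this horizontal pattern, mirroring the stock example above: agents evaluate different aspects in parallel and a majority vote integrates their verdicts. The mocked agent logic and aspect names are illustrative assumptions:

```python
# Sketch of horizontal collaboration: same task, independent agents,
# outputs aggregated by majority vote.
from concurrent.futures import ThreadPoolExecutor
from collections import Counter

def agent(aspect: str, ticker: str) -> str:
    # Placeholder verdict; a real agent would prompt an LM on this aspect.
    return "buy" if len(aspect + ticker) % 2 == 0 else "hold"

def horizontal_decision(ticker: str) -> str:
    aspects = ["trend", "news", "industry"]
    with ThreadPoolExecutor() as pool:   # agents run in parallel
        verdicts = list(pool.map(lambda a: agent(a, ticker), aspects))
    return Counter(verdicts).most_common(1)[0][0]   # integrate outputs

print(horizontal_decision("ACME"))
```

Majority voting is only one aggregation choice; as the surrounding text notes, handling disagreements between agents is itself a design problem.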
Fig. 13: Illustration of vertical collaboration.

2) Vertical collaboration decomposes complex tasks into multiple stages, with different LM agents completing the tasks of each stage in turn. After each agent completes the task of its own stage, it passes the result to the next agent until the entire task is successfully solved. As depicted in Fig. 13, for medical image analysis, the first agent may extract basic visual features, the second agent analyzes these features in more depth, and the final agent produces a diagnosis. This sequential process enables step-by-step refinement, allowing complex problems to be broken down and tackled more efficiently by leveraging specialized
agents at each stage. For example, Jiang et al. [28] present CommLLM, a multi-agent system for natural language-based communication tasks. It features three components: 1) multi-agent data retrieval, which uses condensate and inference agents to refine 6G communication knowledge; 2) multi-agent collaborative planning, employing various planning agents to develop solutions from multiple viewpoints; and 3) multi-agent reflection, which evaluates solutions and provides improvement recommendations through reflection and refinement agents. However, a limitation of vertical collaboration is that higher-level LM agents rely on the accuracy of lower-level outputs, making the process vulnerable to error propagation. Mistakes made early can compromise the final outcome; therefore, this approach demands high precision from each agent to ensure overall system reliability.
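A minimal sketch of such a staged pipeline, using the medical-imaging example above, is shown below; the stage functions are mocked placeholders for specialized LM agents:

```python
# Sketch of vertical collaboration: stages run in sequence, each agent
# consuming the previous agent's output.
def extract_features(scan: str) -> str:
    return f"features({scan})"

def analyze(features: str) -> str:
    return f"findings({features})"

def diagnose(findings: str) -> str:
    return f"diagnosis({findings})"

result = "ct_scan_001"
for stage in (extract_features, analyze, diagnose):  # staged hand-off
    result = stage(result)   # an early error here propagates downstream
print(result)
```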
3) Hybrid collaboration. In practical LM agent environments, real-world applications often involve both horizontal and vertical collaboration paradigms, resulting in hybrid collaboration. For example, in addressing highly complex tasks, the problem is first broken down into manageable sub-tasks, each assigned to specialized LM agents. Here, horizontal collaboration enables agents to perform parallel evaluations or validations, while vertical collaboration ensures that sequential processing stages refine the task further. The computational collaboration among LM agents involves a mix of horizontal and vertical approaches, such as coordinating parallel assessments across agents while sequentially integrating outputs through defined stages, thus optimizing task execution and enhancing overall system performance.

② Cross-layer computation cooperation. Cross-layer computation cooperation among LM agents can be classified into two modes: cloud-edge cooperation and edge-edge cooperation.

1) Cloud-Edge Cooperation. In real-world scenarios, running a complete LM agent requires substantial computational resources. Edge servers, often equipped with fewer devices and resources than cloud servers, typically lack the capacity to support the operation of a complete LM agent. The cloud-edge collaboration approach enables edge servers to support LM agents effectively. A range of strategies have been developed to optimize resource utilization, enhance performance, and reduce operational costs, encompassing transfer learning, model compression, caching, hardware acceleration, and model sharding. In the literature, there exist the following types of cloud-edge cooperation paradigms: orchestration between LMs and Smaller Models (SMs), lightweight edge LM deployment, sharding-based edge LM inference, and cloud-edge LM service optimizations.

• LM-SM orchestration via transfer learning or compression: It deploys a complete LM agent in the cloud and creates smaller, specialized LM agents through transfer learning or model compression. The agents, running on smaller models with reduced parameter scales, are then deployed on edge servers, requiring fewer resources while maintaining high performance. VPGTrans [115] addresses the high computational costs of training visual prompt generators for multimodal LLMs by significantly reducing the GPU hours and training data needed via transferring an existing visual prompt generator from one multimodal LLM to another. LLM-Pruner [116] compresses LLMs without compromising their multi-task solving and language generation abilities, using a task-agnostic structural pruning method that significantly reduces the need for extensive retraining data and computational resources.
• Cloud-Edge LM deployment via quantization and hardware acceleration: It employs specialized hardware to accelerate the deployment and operation of LMs at the edge. Hardware accelerators, such as GPUs and TPUs, can enhance computational efficiency and provide robust support for cloud training and edge deployment. Agile-Quant [118] enhances the efficiency of LLMs on edge devices through activation-guided quantization and hardware acceleration, utilizing a SIMD-based 4-bit multiplier and efficient TRIP matrix multiplication, achieving up to 2.55x speedup while maintaining high task performance.
• Cloud-Edge LM deployment via Split Learning (SL): SL is an emerging distributed ML paradigm in which the model is split into several parts [120]. SL is typically implemented in three settings: two-part single-client, two-part multi-client, and U-Shape configurations [121]. The goal of SL is to offload complex computations and enhance data privacy by keeping the preceding model segments local (a minimal split-inference sketch follows this list). This approach addresses the substantial computational demands and privacy concerns associated with training and inference in LM agents. For example, Xu et al. [120] propose a cloud-edge-end computing system for LLM agents using SL, where mobile agents with local models (0-10B parameters) handle real-time tasks, and edge agents with larger models (over 10B parameters) provide complex support by incorporating broader contextual data. They also study a real case, where mobile agents create localized accident scene descriptions, which are then enhanced by edge agents to generate comprehensive accident reports and actionable plans.
• Cloud-Edge LM inference via model sharding: It distributes LM shards across heterogeneous edge servers to accommodate varying device capabilities and resource conditions. By partitioning the LM into smaller shards, each edge server only handles a manageable portion of the LM, optimizing resource utilization and performance. For example, EdgeShard [119] improves LLM inference by partitioning LMs into shards for deployment on distributed edge devices and cloud servers, optimizing latency and throughput with an adaptive algorithm, achieving up to 50% latency reduction and 2x throughput improvement.
• Cloud-Edge LM services via caching and resource optimization: It utilizes caching mechanisms and other resource optimization techniques to configure and support LM agents on edge servers. By caching frequently used data and intermediate results, the system reduces the need for continuous high-volume data transfer between the cloud and the edge node. Scissorhands [117] reduces the memory usage of the KV cache in LLMs during inference by selectively storing pivotal tokens, thus maintaining high throughput and model quality without requiring model finetuning. Xu et al. [88] propose a joint caching and inference framework for edge intelligence in space-air-ground integrated networks to efficiently deploy LLM agents. They design a new cached model-as-a-resource paradigm and introduce the concept of age-of-thought for optimization, along with a deep Q-network-based auction mechanism to incentivize network operators.
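To illustrate the split/sharded inference idea underlying the SL and model-sharding strategies above, the following is a minimal sketch in which early layers run locally and later layers run remotely; the mock layers and the hand-off are illustrative assumptions rather than the design of [119]-[121]:

```python
# Minimal split-inference sketch: the first model segment runs on the
# device, intermediate activations cross the network, and the remaining
# segment runs on the edge/cloud.
import random

def layer(x: list[float], seed: int) -> list[float]:
    random.seed(seed)
    return [v * random.uniform(0.5, 1.5) for v in x]  # mock linear layer

def device_part(x: list[float]) -> list[float]:
    return layer(layer(x, 1), 2)   # early layers stay local (raw data never leaves)

def edge_part(h: list[float]) -> list[float]:
    return layer(layer(h, 3), 4)   # heavier layers are offloaded remotely

activations = device_part([0.2, 0.7, 0.1])   # 'transmitted' to the edge
print(edge_part(activations))
```

Only intermediate activations cross the link, which is the basis of the privacy argument for keeping the preceding model segments local.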

TABLE VI: A Summary of Cloud-Edge Computation Cooperation Approaches for LM Agents

Strategy | Feature | Technology | Ref.
Transfer learning or compression | Reduction of training costs for multimodal LLMs | Transfer of VPG across LLMs | [115]
Transfer learning or compression | Task-agnostic structural pruning for LLMs | LoRA for recovery after pruning with minimal data | [116]
Caching | KV cache compression in LLM inference | Selective storage of pivotal tokens | [117]
Caching | Joint caching & inference for LLM agents | Cached model-as-a-resource, age-of-thought | [88]
Hardware Acceleration | Efficient LLM inference on edge devices | SIMD-based 4-bit multiplier, TRIP matrix multiplication | [118]
Sharding | Collaborative inference of LLMs on edge and cloud | Model partitioning and adaptive algorithm for optimization | [119]

2) Edge-Edge Cooperation. Edge-edge cooperation involves the collaboration between multiple edge devices to enhance computational efficiency and resource utilization. In practical scenarios, edge servers need to protect local data privacy while lacking the resources to train LM agents independently. By leveraging FL, edge-edge cooperation enables decentralized training of LM agents across edge devices without requiring data to be centralized, thus preserving data privacy and reducing latency. For example, to tackle issues of data scarcity and privacy in LLM development, Chen et al. [86] propose federated LLM, combining federated pre-training, fine-tuning, and prompt engineering to enable effective collaborative training without compromising data privacy. By utilizing FL, edge-edge cooperation also ensures robust and scalable deployment of LM agents in resource-constrained environments, allowing edge devices to process and analyze data locally while maintaining privacy and efficiency.

D. Knowledge Cooperation for LM Agents

Knowledge can be divided into explicit knowledge and implicit knowledge. Explicit knowledge is the information directly utilized by the AI model, such as external databases and KGs. Implicit knowledge is obtained through the optimization of the AI model's internal parameters (i.e., weights and biases) [122]. As depicted in Fig. 14, the knowledge cooperation of LM agents generally involves three aspects: knowledge synchronization, knowledge searching, as well as the cooperation between knowledge and LMs.

Fig. 14: Illustration of knowledge synchronization in LM agents. (a) Parameterized knowledge transfer in knowledge synchronization, where parameter knowledge extracted from the teacher model is injected into the student model and further trained with high-quality small-sample data to obtain the target network. (b) Retrieval-enhanced knowledge search using KGs, where entities detected in the query are used to retrieve corresponding triples from KGs to assist LMs in reasoning. (c) KG completion by orchestrating knowledge and LMs, where five example triples and the missing triple are used as queries for LMs to complete entities, which are then fed back into both KGs and LMs.

1) Knowledge Synchronization: As summarized in Table VII, knowledge synchronization refers to the sharing and updating of knowledge among multiple LM agents to ensure consistency in decision-making, which includes knowledge transfer, knowledge fusion, and knowledge updating.

• Knowledge transfer: It refers to transferring the implicit parameter knowledge learned by one LM agent to other LM agents. Kang et al. [123] propose an online distillation method that enhances LLMs by retrieving relevant knowledge from external knowledge bases to generate high-quality reasoning processes. The knowledge (e.g., reasoning results) of small language models can be leveraged to improve the performance of LLMs in knowledge-intensive tasks. However, as the scale of LLMs grows, LLM training via such knowledge transfer becomes complex and computationally intensive. To address this issue, Wu et al. [124] develop a set of 2.58M instructions for fine-tuning models, generating a group of different models that maintain baseline model performance while being much smaller in size. Besides, Zhong et al. [125] propose a parametric knowledge transfer method that extracts and aligns knowledge parameters using sensitivity techniques and uses the LoRA module as an intermediary mechanism to inject this knowledge into smaller models, thereby transferring implicit knowledge to smaller models.
• Knowledge fusion: It refers to integrating and optimizing knowledge from different LM agents to form a robust and comprehensive common knowledge base. Jiang et al. [126] propose an ensemble framework named LLM-Blender that directly aggregates the outputs of different models to improve prediction performance and robustness. However, it requires maintaining multiple trained LLMs and executing each LLM during inference, which can be impractical for LM agents. To address this issue, Wan et al. [74] propose an implicit knowledge fusion framework that evaluates the predictive probability distributions of multiple LLMs and uses the distributions for continuous training of the target model. Moreover, Wortsman et al. [127] combine multiple neural networks into a single network named “Modelsoups” through arithmetic operations at the parameter level.
TABLE VII: A Summary of Knowledge Cooperation Approaches for LM Agents

Mode | Type | Technology | Ref.
Knowledge Transfer | Online distillation | LMs generate high-quality reasoning processes for models to learn from | [123]
Knowledge Transfer | Offline distillation | LMs generate distilled data or fine-tuning instructions to train models | [124]
Knowledge Transfer | Parametric knowledge transfer | Extract and align the knowledge parameters of LMs to models | [125]
Knowledge Fusion | Knowledge fusion | Train target model via predictive probability distributions of LMs | [74]
Knowledge Fusion | Output fusion | Directly aggregate the outputs of different models | [126]
Knowledge Fusion | Neural network fusion | Merge multiple neural networks at the parameter level | [127]
Knowledge Updating | External knowledge bases | Adjust LM via external knowledge, e.g., Internet, knowledge bases, and human feedback | [58], [75], [128]
Knowledge Updating | Parametric knowledge | Selective fine-tuning of model's internal parameter knowledge | [129]-[131]
TABLE VIII: A Summary of Knowledge Searching Approaches for LM Agents

Mode | Type | Technology | Ref.
KGs | Hallucination reduction | Use KGs to enhance factual awareness in LMs | [47]
KGs | Entity disambiguation | Use LMs to identify and align entities across heterogeneous KGs | [132]
KGs | Data race detection | Context-aware concurrent fuzz testing combining KGs & LMs | [133]
RAG | Static knowledge base | Retrieve knowledge from static sources for accurate reasoning | [15]
RAG | Dynamic knowledge base | Use real-time updated data sources to help LMs handle complex and unforeseen situations | [76], [84]
RAG | Self-guided retrieval | Use SKR to help LLMs adaptively invoke external resources | [71]
This method typically assumes a unified network architecture and attempts to map the weights between different neural networks.
• Knowledge update: The timely synchronization of the latest knowledge into AI models can be divided into two lines: updates to external knowledge bases and updates to internal parameter knowledge. (i) For external knowledge bases, there are three main approaches: feedback enhancement, network enhancement, and retrieval enhancement. Tandon et al. [75] pair LMs with a growing memory to train a correction model, where users identify output errors and provide general feedback on how to correct them. Network enhancement uses the Internet to update knowledge; Lazaridou et al. [128] use few-shot prompting to learn to adjust LMs based on the information returned from Google searches. For retrieval enhancement, Trivedi et al. [58] propose a new multi-step QA method named IRCoT to interleave retrieval with steps in CoT, which first uses CoT to guide retrieval and then uses retrieval results to improve CoT. (ii) For internal parameter knowledge, there are mainly three methods: knowledge editing, Parameter-Efficient Fine-Tuning (PEFT), and continual learning. Knowledge editing is primarily used for quickly correcting specific errors in the model or updating a small portion of the model's knowledge, making it suitable for fine-grained corrections. Chen et al. [129] propose a dual-stage learning algorithm called RECKONING, which folds the provided contextual knowledge into the model's parameters and uses backpropagation to update the parameters, thereby improving the accuracy of LM reasoning. PEFT reduces the number of parameters to be adjusted through optimization techniques for reduced computational overheads (a toy LoRA-style sketch follows this bullet). Hu et al. [130] introduce LLM-Adapters, an easy-to-use framework that integrates various adapters into LLMs and performs these adapter-based LLM PEFT methods for different tasks. Continual learning ensures that an AI model can continuously learn while receiving new tasks and new data without forgetting previously learned knowledge. Qin et al. [131] propose ELLE, which flexibly extends the breadth and depth of existing PLMs, allowing the model to continuously grow as new data flows in.
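To make the PEFT idea concrete, the following is a toy LoRA-style computation: the frozen weight matrix W is augmented with a trainable low-rank product, so only the small factors are updated. Sizes and values are arbitrary, and this sketch is not the implementation of [130]:

```python
# Toy LoRA-style update: instead of tuning the full matrix W, train a
# low-rank product A @ B and use W + A @ B at inference time.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(W, D):
    return [[w + d for w, d in zip(rw, rd)] for rw, rd in zip(W, D)]

d, r = 4, 1                                  # hidden size, adapter rank
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen
A = [[0.1] for _ in range(d)]                # d x r, trainable
B = [[0.2, 0.0, 0.0, 0.0]]                   # r x d, trainable
W_eff = add(W, matmul(A, B))                 # only A and B were updated
print(W_eff[0])
```

Because only d*r + r*d parameters are trained instead of d*d, the adapter is cheap to fine-tune and to transmit between agents.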
2) Knowledge Searching: LMs not only rely on the knowledge learned during pre-training but can also dynamically access and query external knowledge bases (e.g., databases, the Internet, and KGs) to obtain the latest information to aid reasoning, as summarized in Table VIII.

① KGs: In a KG, knowledge entities and their relationships are represented as structured graphs containing nodes and edges, which is widely adopted for tasks requiring deep semantic understanding and complex relational reasoning. For QA tasks, Guan et al. [47] study a KG-enabled factual awareness enhancement method for LLMs that improves the accuracy of AI models by integrating structured knowledge. KGs are also advantageous for integrating diverse information from different data sources. Zhang et al. [132] propose a fully automatic KG alignment method, using LLMs to identify and align entities in different KGs, aiming to solve the heterogeneity problem between different KGs and integrate multi-source data. Zhang et al. [133] propose a context-aware concurrent fuzz testing method that combines KGs with LLMs to effectively identify and handle data race problems in concurrent programs.

② RAG: RAG technology combines information retrieval and generation models by first retrieving relevant contents and then generating answers based on these contents, which is suitable for fields requiring the latest information and extensive knowledge coverage. According to the data type, it can be divided into two categories. (i) RAG based on static knowledge bases, such as Wikipedia and documents: Lewis et al. [15] demonstrate how RAG generates accurate and contextually relevant responses by retrieving relevant knowledge chunks from static sources such as Wikipedia. (ii) RAG based on dynamic knowledge bases, such as news APIs, which contains two lines: exploring new knowledge or retrieving past knowledge. For new knowledge exploration, Dai et al. [76] explore the use of RAG technology for improved safety and reliability of autonomous driving systems by utilizing real-time updated data sources, including in-vehicle sensor data, traffic information, and other driving-related dynamic data, demonstrating RAG's potential in handling complex environments and responding to unexpected situations. For past knowledge
retrieval, Kuroki et al. [84] develop a novel vector database named the coordination skill database to efficiently retrieve past memories in multi-agent scenarios to adapt to new cooperation missions. By harnessing both internal and external knowledge, Wang et al. [71] propose a method called Self-Knowledge-Guided Retrieval (SKGR), which enhances retrieval by allowing LLMs to adaptively call external resources when handling new problems.
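A minimal sketch of such KG-based retrieval enhancement (mirroring Fig. 14(b)) is given below; the tiny triple store and the substring-based entity detector are illustrative assumptions:

```python
# Sketch of KG retrieval enhancement: entities detected in the query
# select matching triples, which are verbalized into the prompt.
KG = [
    ("2028 Olympics", "held_in", "Los Angeles"),
    ("Los Angeles", "located_in", "USA"),
    ("Olympics", "is_a", "sporting event"),
]

def detect_entities(question: str) -> set[str]:
    return {h for h, _, _ in KG if h.lower() in question.lower()}

def retrieve_triples(question: str):
    entities = detect_entities(question)
    return [t for t in KG if t[0] in entities]

def kg_prompt(question: str) -> str:
    facts = "; ".join(f"{h} {r} {o}" for h, r, o in retrieve_triples(question))
    return f"Facts: {facts}\nQuestion: {question}\nAnswer:"

print(kg_prompt("In which country will the 2028 Olympics be held?"))
```

Real systems replace substring matching with entity linking and may follow multi-hop paths (e.g., from "held_in" through "located_in" to reach "USA").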
3) Knowledge and LM Co-driven: Knowledge and LM co-driven paradigms contain two lines of approaches: knowledge base-enhanced LMs and LM-assisted KGs, as summarized in Table IX.

① Knowledge base-enhanced LMs: Knowledge bases can enhance LM inference at various stages. (i) In the pre-training stage, Sun et al. [134] propose an explainable neuro-symbolic knowledge base, where the fact memory is composed of triples of vector representations of entities and relations from existing knowledge bases. These vector representations are integrated into the LLM during pre-training. (ii) In the fine-tuning stage, considering that existing methods often overwrite the original parameters of pre-trained models when injecting knowledge, Wang et al. [73] propose the K-Adapter framework. This framework uses RoBERTa as the backbone model and assigns a neural adapter to each type of injected knowledge. These adapters can be trained effectively in a distributed manner, thereby improving the performance of LMs on specific tasks. (iii) In the inference stage, most existing methods focus on factual information related to entities explicitly mentioned in the query. Guan et al. [47] are the first to consider verification during the inference process of LLMs. They propose a new KG-based retrofitting framework, which automatically adjusts the initial responses generated by LLMs based on factual knowledge stored in KGs, effectively improving inference accuracy.

② LM-enhanced KGs: The information extraction and knowledge fusion capabilities of LMs assist the integration and updating of diverse data during the construction and completion of KGs. (i) Entity and relationship extraction: Traditional methods typically rely on specific modules to handle each data type separately, such as text data, image data, or structured data. However, LMs can automatically extract entities and relationships. De Cao et al. [135] propose GENRE, a system that retrieves entities in knowledge graphs by generating their unique names in an autoregressive manner from left to right, which better captures the relationship between context and entity names while significantly reducing memory consumption through an encoder-decoder architecture. However, entity linking speed affects the inference speed of LMs. Ayoola et al. [79] propose ReFinED, an efficient end-to-end entity linking model that uses fine-grained entity types and descriptions for linking, achieving speeds over 60 times faster than existing methods. LMs can also extract relationships between entities. Ma et al. [80] propose DREEAM, a self-training strategy for document-level relationship extraction, which is the first system to adopt a self-training strategy for evidence retrieval, learning entity relationships from automatically generated evidence from large datasets, addressing the issues of high memory consumption and limited annotation availability. (ii) KG construction, which can be done through constructing from raw text and extracting from LLMs: Melnyk et al. [136] propose a novel end-to-end multi-stage system for efficient KG construction from text descriptions, which first utilizes LLMs to generate KG entities, followed by a simple relation construction head. (iii) KG verification: Apart from KG construction, LMs can be employed to verify KGs. Han et al. [83] propose a prompt framework for iterative verification, using small LMs to correct errors in KGs generated by LLMs such as ChatGPT. (iv) KG completion, i.e., inferring missing facts in a given KG: Traditional KG completion methods merely focus on the structure of KGs, without considering extensive textual information. LLMs hold the potential to enhance KG completion performance via encoding text or generating facts. Shen et al. [77] use LLMs as encoders, primarily capturing the semantic information of knowledge graph triples through the model's forward pass, and then reconstructing the knowledge graph's structure by calculating a loss function, thereby better integrating semantic and structural information. As structure-specific KG construction models are mutually incompatible and cannot adapt to emerging knowledge graphs, Chen et al. [137] use LLMs as generators to propose KG-S2S, a Seq2Seq generative framework that represents knowledge graph facts as “flat” text to address the issue of heterogeneous graph structures and generate the information lost during flattening in the completion process.
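The flattening step that such generative KG completion relies on can be sketched as follows; the delimiter format and mock query builder are illustrative assumptions rather than the actual design of [137]:

```python
# Sketch of generative KG completion: facts are flattened to text, and a
# masked triple becomes a fill-in query for a seq2seq LM (mocked here).
def flatten(h: str, r: str, t: str) -> str:
    return f"{h} | {r} | {t}"

def completion_query(examples, incomplete):
    context = "\n".join(flatten(*ex) for ex in examples)
    h, r = incomplete
    return f"{context}\n{flatten(h, r, '[MASK]')}"

examples = [("Paris", "capital_of", "France"), ("Tokyo", "capital_of", "Japan")]
query = completion_query(examples, ("Ottawa", "capital_of"))
print(query)   # a generative LM would decode the masked tail entity
```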
IV. SECURITY THREATS & COUNTERMEASURES TO LARGE MODEL AGENTS

In this section, we present a comprehensive review of security threats related to LM agents and examine the state-of-the-art countermeasures to defend against them, by investigating recent research advancements. Fig. 15 illustrates the taxonomy of security threats to LM agents. Firstly, the definition, categorization, causes, and corresponding countermeasures for hallucination threats are provided in Sect. IV-A. Following that, we discuss adversarial attacks, including adversarial input attacks and prompt hacking attacks, and review corresponding countermeasures in Sect. IV-B. Next, we present poisoning and backdoor attacks on LM agents and review defense countermeasures in Sect. IV-C. Finally, other security threats to LM agents are identified in Sect. IV-D, including false and harmful content generation, DoS attacks, and agent hijacking attacks.

A. Hallucination

The hallucination of LM agents typically refers to erroneous, inaccurate, or illogical outputs that deviate from user inputs, generated context, or real-world conditions. It poses a substantial risk to the reliability of LM agents. For example, hallucination may lead to erroneous decision-making in an automated driving system, thereby elevating the potential for severe traffic accidents. According to [33], we categorize the hallucination within the context of LM agents from the following four perspectives, as illustrated in Fig. 16.

• Input-conflicting hallucination: The input-conflicting hallucination refers to content generated by LM agents that diverges from user input. For example, when a user requests an LM agent to draft an introduction about electric vehicles, the agent provides an introduction about gas-powered vehicles instead.
• Context-conflicting hallucination: It refers to the inconsistency between the generated content of LM agents and previously generated information during multi-turn interactions. For example, a user and an agent discuss the film “Jurassic
Park” in the initial stage of a dialogue, but the agent's responses may shift the subject to “Titanic” as the interaction continues, thereby resulting in a context inconsistency.
• Knowledge-conflicting hallucination: LM agents generally depend on plug-in knowledge databases to facilitate accurate and efficient responses. The knowledge-conflicting hallucination occurs when the responses generated by agents contradict the corresponding knowledge within knowledge databases.
• Fact-conflicting hallucination: The fact-conflicting hallucination arises when LM agents generate content which is in conflict with established world facts. For example, a history-education LM agent provides the incorrect year 1783 as the founding date of the United States, conflating the end of the American Revolutionary War with the actual date of the nation's founding.

TABLE IX: A Summary of Knowledge and LM Co-driven Approaches Among LM Agents

Mode | Type | Technology | Ref.
Knowledge Base-Enhanced LMs | Pre-training | Use an explainable neuro-symbolic knowledge base, composed of vector representations of entities and relations from existing knowledge bases | [134]
Knowledge Base-Enhanced LMs | Fine-tuning | Generate large-scale diverse prompts via numerous traversal paths in KG | [73]
Knowledge Base-Enhanced LMs | Inference | Combine LLMs with KGs to capture factual knowledge and update knowledge base | [47]
LM-Enhanced KGs | Entity and relationship extraction | Automatically extract entities and relationships | [79], [80], [135]
LM-Enhanced KGs | KG construction | Grapher: an end-to-end multi-stage approach that generates graph nodes and links using LMs | [136]
LM-Enhanced KGs | KG verification | Use a small LM as a verification module to validate and correct the output | [83]
LM-Enhanced KGs | KG completion | Encode text or generate new facts to better use textual & semantic info | [77], [137]

Fig. 15: The taxonomy of security threats to LM agents.

The causes of hallucinations of LM agents can be broadly attributed to the following three sources: data and knowledge, training and inference, and multi-agent interactions.

1) Hallucination from Data and Knowledge: The primary cause of hallucinations in LM agents stems from the biased or imperfect nature of the training data and knowledge employed to facilitate content generation. Specifically, i) annotation irrelevance in the collected dataset can lead to hallucinations in LM agents. For LLMs, source-reference divergence occurs due to heuristic data collection, implying that the selected sources may lack sufficient information for content generation [138]. For LVMs, a substantial amount of instruction data is synthesized using LLMs. However, generated instructions may not accurately correspond to the content depicted in the images due to the unreliability
can result in a large amount of hallucinated information. iii) A


problematic alignment process can also lead to hallucinations in
Sure! Gas-powered The characters in LM agents. If the necessary prior knowledge is not adequately
vehicles have been "Titanic" are well-
the backbone of developed, with acquired during LM pre-training phase, the agent may produce
transportation ... Jack and Rose ... hallucinated responses. Additionally, sycophancy, where the agent
generates answers that align more with the user’s viewpoint than
Can you write an What about the
introduction about characters in with accurate and credible information, can also contribute to
electric vehicles? "Jurassic Park"? hallucinations [146]. iv) The token generation strategy of LM
agents may give rise to a phenomenon known as hallucination
snowballing, wherein LM agent tends to persist in early errors
(a) Input-conflicting Hallucination (b) Context-conflicting Hallucination for self-consistency rather than correcting them [147].
3) Hallucination from Multi-agent Interaction: In multi-agent
environments, interactions between LM agents can give rise
Sure! The len() to new hallucination threats. Specifically, i) conflicts between
function in Python The United States
is used to find the
multiple LM agents may result in hallucinations in final re-
was founded in
size of a number ... 1783 ... sponses. Due to the diverse objectives, strategies, and knowledge
of individual agents, conflicting viewpoints or misinformation
Can you explain
When was the can inadvertently influence the final output when these agents
how to use the
United States
len() function in engage in communication or collaboration. The potential adver-
founded?
Python?
sarial manipulation may exacerbate the occurrence of conflicts,
thereby leading to more severe hallucinations. ii) In multi-agent
systems, the occurrence of a single agent’s hallucination can
(c) Knowledge-conflicting Hallucination (d) Fact-conflicting Hallucination
initiate a cascading effect, meaning that misinformation from one
Fig. 16: Taxonomy of hallucinations in the context of LM agents. agent may be accepted and further propagated by others within
the multi-agent network [41], [91]. Addressing hallucinations in
multi-agent networks requires not only to correct inaccuracies
of generative models [139]. ii) The presence of distribution at the individual agent level but also to effectively manage the
imbalance in the training data is another cause for hallucinations. exchange of information among agents, thereby preventing the
The training dataset contains significantly more positive sam- spread of hallucinations.
ples than negative ones, biasing LMs towards responding ‘yes’ 4) Countermeasures to Hallucination: Researchers have de-
in binary discrimination scenarios [139], [140]. iii) Duplicates veloped a suite of countermeasures targeting hallucination threats
within the training data may not be adequately filtered during pre- within LM agents. We outline these solutions alongside the recent
processing. Consequently, the trained LMs might tend to produce advancements in research.
content that mimics the patterns in these duplicates [141]. iv) • Data sanitization represents an intuitive approach to mitigate
The training data and the knowledge, which can be inherently hallucinations by excluding as much unreliable data as pos-
hallucinatory due to bias, outdated content, or fabrication [142]. sible from training dataset. The manually-revised controlled
LMs agents trained on these data are susceptible to replicate dataset named ToTTo is validated to improve the factuality
or even amplify hallucinations. Anomalous retrieval from the of results on table-to-text generation tasks [143]. Addition-
knowledge database can also introduce biases into the dependent ally, RefinedWeb, an automatically filtered and deduplicated
knowledge during content generation process, thereby leading to web dataset, can mitigate hallucinations and improve LLM
the emergence of hallucinations in the generated content. performance [142].
2) Hallucination from Training and Inference: Parikh et al. • Reinforcement learning prevents LMs from generating hallu-
[143] demonstrate that hallucinations can also occur even when cinatory information by introducing external feedbacks and
the training dataset is nearly unbiased, and another key factor reward mechanisms to constrain models’ behavior, facilitat-
contributing to hallucinations is the presence of imperfections ing LM agents to refrain from QA beyond their capability
during LM training and inference processes. Specifically, i) instead of fabricating untruthful responses [148], [149]. For
defective encoder and decoder components in LM agents can example, GPT-4 collects the unfactual data annotated by
result in hallucinations. The flawed representation learning in users and generates synthetic closed-domain hallucinated
LM training increases the unexpected nature of the generation, synthetic data to train the reward model, reducing the
thereby increasing the likelihood of hallucinated content [143]. model’s tendency to hallucinate [6].
Moreover, the sampling-based top-p decoding strategy has been • Hallucination detection in generated content and subse-
demonstrated to be positively correlated with increased halluci- quently producing samples devoid of such inaccuracies
nation of generated content due to its inherent randomness [144]. through re-generation is a promising strategy aimed at
ii) Parametric knowledge refers to LMs’ ability to utilize model minimizing hallucination. Techniques such as SelfcheckGPT
parameters to memorize relevant information (e.g., knowledge) [150] and INSIDE [151] involve assessing the consistency
from training data, which can assist in improving the performance across multiple generated responses to identify the occur-
of downstream tasks [122]. Recent research has found that LMs rence of hallucinations.
prioritize parametric knowledge over user input during content • Truthful external knowledge & RAG techniques serve as con-
generation [145], implying that biased parametric knowledge venient and efficient solutions to facilitate truthful content
22

attacker degrades the accuracy of generated content in LM agents


by manipulating input instructions, as depicted in Fig. 17(a). For
example, an adversarial input attacker may subtly alter the text
of an input news article to mislead the content summary agent,
resulting in an incorrect or nonsensical summary. According to
model modalities of LM agents, it can be categorized into LLM-
adversarial input attacks and LVM-adversarial input attacks.
• LLM-adversarial input attack refers to the adversarial attack
(a) adversarial input attack
on LLMs that perturbs the input text into the semantic
equivalent adversarial text, aiming to confuse LLMs to
generate erroneous contents or adversary-desired contents.
Zhuo et al. [31] demonstrate that CODEX, a pre-trained
LLM capable of executing prompt-based semantic parsing
on codes, is vulnerable to adversarial inputs, especially those
generated by sentence-level perturbations. Xu et al. [159]
propose PromptAttack, which converts adversarial textual
attacks into an attack prompt, thereby causing the target
(b) prompt hacking attack LLM to generate adversarial examples to fool itself. Shi et
al. [160] show that LLMs can be easily distracted by adding
Fig. 17: Illustration of adversarial attack to LM agents. a small amount of irrelevant information to inputs, which
significantly reduces the accuracy of their responses. Liu et
generation [15], [149]. They concatenate the external knowl- al. [161] highlight that current LLMs do not involve social
edge and user input to establish the context knowledge, and interactions during their training process. Consequently, the
then use them to query LM agents to effectively reduce oversight leads to poor generalization ability in unfamiliar
hallucinations in responses [152], [153]. cases and a lack of robustness against adversarial attacks.
• Instruction Tuning by optimizing instructions or fine-tuning • LVM-adversarial input attack refers to the adversarial attack
LMs on a robust instruction dataset is an effective ap- on LVMs (e.g., CLIP [69] and MiniGPT-4 [162]) that adds
proach to mitigate hallucinations. SYNTRA [154] evaluates carefully crafted imperceptible adversarial perturbations to
hallucinations on synthesis tasks and optimizes transferred vulnerable modalities in the joint input prompts (i.e., vi-
instructions to reduce hallucinations on downstream tasks. sual, textual, or both), thereby compromising correspond-
Moreover, Liu et al. [139] mitigate hallucinations in LVMs ing outputs. Du et al. [163] introduce a new adversarial
by constructing a robust instruction dataset LRV-Instuction attack named auto-attack on text-to-image models by adding
and fine-tuning LVMs on it. small perturbations generated by gradient optimization to
• Post-processing methods can correct hallucinations in LM text prompts, resulting in the fusion of main subjects with
responses [155], [156]. RARR [155] employs search en- unrelated categories or even their complete disappearance
gines to collect related external knowledge and rectifies in generated images. The transferability of adversarial input
the generated content from multiple perspectives. LURE attacks in LVMs has been extensively studied [164], [165].
[156] reconstructs descriptions with less hallucinations by Wang et al. [164] introduce the Transferable Multi-Modal
correcting object hallucinations during the post-process. (TMM) attack, which integrates attention-directed feature
• Model architecture optimization is also an effective method perturbation and orthogonal-guided feature heterogenization
to mitigate hallucinations. Multi-branch decoder [157] and to generate transferable adversarial examples against LVMs;
uncertainty-aware decoder [158] are two examples of mod- and Luo et al. [165] develop the Cross-Prompt Attack
ified decoder structures to mitigate hallucinations. (CroPA), which leverages the learnable prompts to gen-
• Meta programming framework can effectively guide and pro- erate transferable adversarial images across LVMs. As a
mote the collaboration among LM agents, thereby reducing double-edged sword, adversarial attacks can be beneficial
hallucinations in complex tasks [91]. for defenders in certain cases. Liang et al. [166] leverage
these attacks in the context of diffusion models to inhibit
B. Adversarial Attack the learning, imitation, and replication of images by such
models, thereby protecting intellectual property rights of
For traditional AI models, an adversarial attack involves an
artists.
adversary subtly manipulating the input by adding imperceptible
perturbations, causing the AI model to produce outputs that de- 2) Prompt Hacking Attack: As depicted in Fig. 17(b), prompt
viate from the expected results. In the realm of LM agents, there hacking attacks involve using specifically crafted input instruc-
are two types of adversarial attacks: adversarial input attacks and tions to bypass the security constraints of LM agents, thereby
prompt hacking attacks, as depicted in Fig. 17. This subsection generating harmful contents. For example, an adversary could
provides an in-depth overview of state-of-the-arts on these two manipulate instructions given to a programming assistant agent to
types of attacks and the corresponding countermeasures. produce malicious code for ransomware. There are two prevalent
1) Adversarial Input Attack: Adversarial input attacks are prompt hacking attacks on LM agents: jailbreak and prompt
analogous to traditional generative adversarial attacks, where the injection.
23

• Jailbreak: LM agents typically achieve inherent prede- robustness of toxicity language predictors. Cheng et al.
fined rule restrictions through model alignment techniques, introduce AdvAug [175], a novel adversarial augmentation
thereby preventing the generation of harmful or malicious method for Neural Machine Translation (NMT) to enhance
content. Jailbreak refers to adversaries designing particular translation performance.
prompts that exploit model vulnerabilities, bypassing the • Input/output filtering mechanisms can eliminate malicious
content generation rules, and enabling LM agents to gen- tokens from adversarial inputs or harmful content from
erate harmful or malicious content [167]–[170]. Yu et al. outputs. Kumar et al. propose the erase-and-check method
[167] evaluate hundreds of jailbreak prompts on GPT-3.5, [176], which leverages another LLM as a safety filter to
GPT-4, and PaLM-2, and demonstrate the effectiveness and remove malicious tokens in the user input. Phute et al. [177]
prevalence of jailbreak prompts. Yang et al. [168] propose an present an LLM self-examination defense approach, where
automated jailbreak attack framework named SneakyPrompt, an extra LLM is utilized to evaluate whether the responses
which successfully jailbreaks DALL-E 2. SneakyPrompt are generated by adversarial prompts. Zeng et al. [178]
effectively bypasses the safety filters and enable generation propose AutoDefense, a multi-agent framework to defend
of Not-Safe-For-Work (NSFW) images. Shen et al. [169] against jailbreak attacks by filtering harmful LLM responses
perform a comprehensive measurement study on in-the- without impacting user inputs. AutoDefense divides the
wild jailbreak prompts, identifying two long-term jailbreak defense task into sub-tasks, leveraging LLM agents based
prompts can achieve 99% attack success rates on GPT- on AutoGen [89] to handle each part independently. It com-
3.5 and GPT-4. Deng et al. [170] introduce an end-to-end prises three components: the input agent, the defense agency,
jailbreak framework named MASTERKEY, which utilizes and the output agent. The input agent formats responses
the reverse engineering to uncover the internal jailbreak into a defense template, the defense agency collaborates
defense mechanisms of LLMs and leverages a fine-tuned to analyze responses for harmful content and make judg-
LLM to automatically generate jailbreak prompts. ments, and the output agent determines the final response. If
• Prompt injection: The prompt injection attack enables the deemed unsafe, the output agent overrides it with a refusal
adversary to control the target LM agent and generate any or revises it based on feedback to ensure alignment with
desired content. The attack is performed by manipulating content policies. Experiments show that AutoDefense with
user inputs to create meticulously crafted prompts, so that LLaMA-2-13b, a low-cost, fast model, reduces GPT-3.5’s
LM agents are unable to distinguish between developer’s attack success rate (ASR) from 55.74% to 7.95%, achieving
original instructions and user inputs. These prompts then 92.91% defense accuracy.
hijack the agent’s intended task, causing the agent’s outputs • Robust optimization strengthens the defense capabilities of
to deviate from expected behaviors [35], [171], [172]. Toyer LM agents against adversarial attacks through robust training
et al. [171] collect a dataset comprising prompt injection algorithms during the pre-training, alignment, and fine-
attacks and prompt-based injection defenses from players tuning processes. Shen et al. [179] propose a dynamic
of an online game called Tensor Trust. They also propose attention method that mitigates the impact of adversarial
two benchmarks to evaluate the susceptibility of LLMs to attacks by masking or reducing the attention values assigned
prompt injection attacks. Liu et al. propose HOUYI [35], to adversarial tokens.
an innovative black-box prompt injection attack inspired • Auditing & red teaming involve systematically probing LMs
by traditional web injection attacks. HOUYI reveals severe to identify and rectify any potential harmful outputs. Jones et
attack consequences, such as unrestricted arbitrary LLM al. [180] introduce ARCA, a discrete optimization algorithm
usage and prompt stealing. Greshake et al. introduce the designed to audit LLMs, which can automatically detect
concept of indirect prompt injection attack [172], which derogatory completions about celebrities, thus providing a
leverages LM agents to inject crafted prompts into the data valuable tool for uncovering model vulnerabilities before
retrieved at inference time. Consequently, these retrieved deployment. However, existing red teaming methods lack
prompts can perform arbitrary code execution, manipulate context-awareness and rely on manual jailbreak prompts. To
the agent’s functionality, and control other APIs. Addition- address this, Xu et al. [181] propose RedAgent, a multi-
ally, by harnessing the power of LLMs, an LLM agent can agent LLM system that generates context-aware jailbreak
be configured to carry out prompt injection attacks. Ning prompts using a coherent set of jailbreak strategies. By
et al. propose CheatAgent [173], a novel LLM-based attack continuously learning from contextual feedback and trials,
framework that generates adversarial perturbations on input RedAgent adapts effectively to various scenarios. Experi-
prompts to mislead black-box LLM-powered recommender ments show that RedAgent can jailbreak most black-box
systems. LLMs within five queries, doubling the efficiency of exist-
ing methods. Their findings indicate that LLMs integrated
3) Countermeasures to Adversarial Attacks: Existing coun-
with external data or tools are more prone to attacks than
termeasures to adversarial attacks to secure LM agents involve
foundational models.
adversarial training, input/output filtering, robust optimization,
and auditing & red teaming.
• Adversarial training aims to enhance AI model’s robustness C. Poisoning & Backdoor Attack
in the input space by incorporating adversarial examples Different from adversarial attacks, poisoning and backdoor
into the training data. Bespalov et al. [174] demonstrate attacks typically involve altering model parameters by injecting
that basic adversarial training can significantly improve the toxic data into the training dataset during model training process,
24

model pre-tuning/inference paradigms such as FL for coop-


erative LM agents, attackers can impersonate benign agents
and upload poisoned model updates during each round of
communication, thereby deteriorating the performance of
the global LM [36]. For LM agents, leveraging cloud-edge
collaboration for LM fine-tuning emerges as an effective way
to harness the resources of distributed edge nodes [51]. How-
ever, the presence of malicious edge LM agents may pose
risks of model poisoning during the collaborative training
process, thereby compromising the global LM performance.
Fig. 18: Illustration of model poisoning attack to LM agents. • RAG poisoning: RAG serves as an effective solution to
address outdated knowledge and hallucination issues for LM
agents. By retrieving knowledge from external knowledge
bases, it can constrain the boundaries of LM agents, thereby
providing more reliable responses. However, as depicted
in Fig. 19, attackers can perform poisoning attacks on
knowledge bases, causing LM agents to generate unexpected
Fig. 19: Illustration of RAG poisoning attack to LM agents. responses. For example, during code generation, the check
for divisor zero may be omitted because the poisoned base
contains the knowledge that 0 is correct as a divisor. This
potential vulnerability can cause the program to crash. Zou
et al. [32] propose a set of knowledge poisoned attacks
named PoisonedRAG, where adversaries could inject a few
poisoned texts into the external knowledge bases and ma-
nipulate LLMs to generate adversary-desired responses.
• Agent poisoning: On the one hand, LLM agents remain
prone to failures due to errors in reasoning and unpre-
dictable responses, with early implementations achieving
only about a 14% success rate in end-to-end tasks [184].
These errors disrupt logical sequences and affect interactions
Fig. 20: Illustration of agent poisoning attack in the context of with external sources, often rendering LM agents ineffec-
LM agents. tive. Motivated by this, Zhang et al. [185] propose a new
attack that disrupts LLM agents’ normal operations across
various attack types, methods, and agents. Notably, prompt
thereby deteriorating model performance or injecting specific injection attacks that induce repetitive loops are particularly
backdoors. In the following, we review the latest advances on effective, causing resource waste and task disruptions, espe-
poisoning attack and backdoor attacks to LM agents. cially in multi-agent setups. Enhancing defenses with self-
1) Poisoning Attack: Poisoning attacks refer to the manipula- examination tools leverages LLM capabilities but highlights
tion of a model’s behavior by introducing toxic information, such the ongoing need for robust protection. On the other hand,
as malicious training data, resulting in the degradation of model’s adversaries can exploit multi-agent interactions, constructing
generalization ability or generation of predetermined errors for complex chains of poisoned instructions that degrade the
specific inputs. For LM agents, poisoning attacks include both quality and rationality of final outputs [186]. For instance,
conventional forms (e.g., data poisoning and model poisoning) as shown in Fig. 20, a poisoned agent’s misleading rec-
and new techniques tailored to LM agents (e.g., RAG poisoning ommendations on weight loss can undermine the overall
and agent poisoning). decision-making process. To mitigate this issue, Chan et
• Data poisoning: Data poisoning is the most common form of al. [186] introduce AgentMonitor, a non-invasive framework
poisoning attacks. The risk of data poisoning is significantly that enhances the security of multi-agent systems (MAS) by
increased in LM agents due to the presence of a large amount predicting task performance and correcting agent outputs in
of unverified data collected from the Internet and user- real time. By wrapping around existing MAS workflows,
model interactions. Scheuster et al. [182] show that neural it reduces harmful content by 6.2% and increases helpful
code autocompleters are vulnerable to poisoning attacks. content by 1.8%, improving system reliability and safety.
By introducing toxic files to the autocompleter’s training
corpus, adversaries can manipulate the autocompleter to 2) Backdoor Attack: Backdoor attacks represent a specialized
generate insecure suggestions. Wan et al. [183] demonstrate form of targeted poisoning attacks. Distinct from general poi-
that during the instruction tuning process, adversaries can soning attacks, backdoor attacks are designed to manipulate the
exploit poisoned examples featuring a specific trigger phrase model into producing adversary-desired outcomes upon encoun-
to induce frequent misclassifications or degrade the quality tering inputs with specific triggers, while maintaining the model’s
of outputs. main task performance. The characteristic of backdoor attacks lies
• Model poisoning: As depicted in Fig. 18, in distributed in the necessity for input manipulation to incorporate particular
25

triggers. Typically, backdoor attacks involve the injection of • Trigger inversion: Identifying and reversing triggers in in-
compromised training samples with unique triggers in the training puts is another method to effectively defend against back-
dataset. door attacks. Wei et al. [193] propose a novel approach
• In LM training process: In the context of LM agents, named LMSanitator, which can invert exceptional output
backdoor attacks can occur at various stages of the training caused by task-agnostic backdoors, thereby effectively de-
process, including pre-training, alignment, and fine-tuning fending against backdoor attacks in LLMs.
[37], [187], [188]. (i) At pre-training stage. Struppek et al. • Neural Cleanse: Neural cleanse is an effective defense
[187] introduce a novel backdoor attack tailored to text- mechanism against backdoor attacks that involves identify-
to-image LMs. By slightly modifying an encoder of the ing and removing neurons in neural networks that exhibit
text-to-image systems, an adversary can trigger the LM into strong reactions to backdoor triggers. Wang et al. [194]
generating images with predefined attributes or images that investigate reverse-engineer backdoor triggers and use them
following a potentially malicious description by inserting a to detect neurons highly responsive to these triggers. Subse-
single special character trigger (e.g., a non-Latin character or quently, these neurons are removed through model pruning,
emoji) into the prompt. (ii) At alignment stage. Rando et al. thereby mitigating the impact of backdoor attacks.
[37] propose a new backdoor attack called jailbreak back-
door, where adversaries conduct data poisoning attacks on D. Other Security Threats to LM Agents
the RLHF training data during the alignment stage and then a In addition to the aforementioned security threats, LM agents
specific trigger word can be turned into “sudo” in command are also susceptible to other traditional and emerging risks,
lines. As such, it easily facilitates a jailbreak attack and including the fake and harmful content generation, Denial-of-
enables LMs to produce harmful contents. (ii) At instruction Service (DoS) attacks, and agent hijacking attacks.
tuning stage. Xu et al. [188] study the backdoor attack • Fake & harmful content generation: LM agents are suscep-
during the instruction tuning stage, and they demonstrate that tible to malicious exploitation by criminals for fabricating
very few malicious instruction (1̃000 tokens) injected by the content or generating harmful content. For example, LM
adversaries can effectively manipulate model behaviors. agents can be utilized for phishing scams or generating
• In LM inference process: Additionally, backdoor attacks can malicious code in a low cost and adaptive manner. Fake and
also occur at the inference process of LM agents. Xiang et harmful content detection is the primary strategy to resist
al. [189] propose BadChain, a novel backdoor attack method this threat. Abdullah et al. [195] throughly analyze recent
targeting on CoT prompting. BadChain inserts a backdoor advances in deepfake image detection, and Dugan et al.
reasoning step into the sequence of reasoning steps, thereby [196] present a new benchmark dataset RAID for machine-
altering the generated response when a specific backdoor generated text detection.
trigger exists in the input. • DoS attack: The inference and generation processes of LM
3) Countermeasures to Poisoning & Backdoor Attacks: Ex- agents consume substantial resources, while DoS attacks can
isting countermeasures against poisoning and backdoor attacks significantly increase the resource consumption, compromis-
on LM agents primarily focus on poisoned samples identification ing the availability of LM agents. Shumailov et al. [197]
and filtering. Besides, trigger inversion that removes the triggers exploit the sponge examples in large-scale neural networks
from input samples and Differential Privacy (DP) technique are to carry out a DoS to AI services, which can significantly
two main strategies for mitigating poisoning and backdoor risks increase the latency and energy consumption of models by
to LM agents. a factor of 30. Possible defense mechanisms include the
• Poisoned samples identification & filtering: Pre-processing detection and filtering of DoS inputs before generation or
training data in advance to identify and filter out poisoned inference.
• Agent hijacking attack: Agent hijacking attacks mainly target
samples is the primary method for mitigating poisoning and
backdoor attacks [190], [191]. Chen et al. [190] propose a LM agents that provide online services. The hijacking is per-
Backdoor Keyword Identification (BKI) mechanism, which formed by poisoning the agents’ training data and injecting
can identify and exclude poisoned samples in the training additional parasitic tasks into the victim agent, resulting in
dataset without a verified and trusted dataset. By analyzing the increases of overheads and moral-legal risks for service
the changes in inner neurons of models, the BKI mechanism providers. Salem et al. [198] and Si et al. [199] propose
in [190] can mitigate backdoor attacks in text classification. model hijacking attacks for image classification tasks and
Zhao et al. [191] demonstrate that PEFT strategies are text generation tasks, respectively, successfully injecting the
vulnerable to weighted poisoned attacks, and they develop a parasitic task without compromising the performance of the
Poisoned Sample Identification Module (PSIM) leveraging main task. Techniques to defend against agent hijacking at-
PEFT to identify poisoned samples in the training data tacks are similar to those against poisoning attacks, primarily
through the confidence. involving the sanitization of training data and removing the
• DP: Adding DP noises to training data or gradients during parasitic training samples.
training process can enhance the robustness of trained mod-
els against poisoning and backdoor attacks. Xu et al. [192] E. Summary and Lessons Learned
introduce the differentially private training method to smooth In the domain of LM agents, there are primarily three types
the training gradient in text classification tasks, which is a of security threats: hallucinations, adversarial attacks, and poi-
generic defense method to resist data poisoning attacks. soning/backdoor attacks. Among these, hallucinations are brand-
26

new security threats in LM agents, while adversarial, poisoning, can only interact with the deployed LM agents through carefully
and backdoor attacks are evolved from traditional ML threats. For crafted prompts and subsequently obtain the responses. The
adversarial attacks, it includes two types: adversarial input attacks primary objective of the adversary is to elicit responses that dis-
derived from traditional ML and prompt hacking attacks (i.e., close as much private information as possible. Recently, various
jailbreak and prompt injection attacks) specific to LM agents. research attention has been directed towards these privacy attacks
Other security threats to LM agents include false and harmful in LLMs and LVMs. (i) For LLMs, Carlini et al. [38] demonstrate
content generation, DoS attacks, and agent hijacking attacks. that an adversary can query GPT-2 with verbatim textual prefix
To summarize, most existing AI security threats persist in the patterns to extract PII including names, emails, phone numbers,
context of LM agents, and new forms of these traditional security fax numbers, and addresses. Their study highlights the practical
threats have arisen with the emergence of novel tuning paradigms threat of private data extraction attacks in LM agents, noting
during the training process of LM agents. Additionally, the that the risk increases as the LLMs grow in size. Furthermore,
characteristics of LM agents in terms of embodied, autonomous, they identify three key factors to quantify the memorization in
and connected intelligence lead to new security threats such LM agents: model scale, data duplication, and context. Besides,
as hallucinations, prompt hacking attacks, and agent hijacking. they demonstrate that larger models, more repeated examples, and
To enhance the reliability of LM agent systems, more attention longer context facilitate the private data extraction [34]. Huang
should focus on these security threats in designing defense et al. [200] extend this research by examining private data ex-
mechanisms. Moreover, effective countermeasures for mitigating traction attacks on pre-trained language models such as GPT-neo,
security threats in LM agents are still lacking from both technical further elucidating the feasibility and risk of such attacks in LM
and regulatory perspectives. agents. Additionally, Zhang et al. [201] propose Ethicist, a novel
approach to resist private data extraction attacks that utilizes loss-
V. P RIVACY T HREATS & C OUNTERMEASURES TO L ARGE smoothed soft prompting and calibrated confidence estimation,
M ODEL AGENTS effectively enhancing the extraction performance. Panda et al.
In this section, we identify typical privacy threats and re- [202] introduce a novel and practical data extraction attack called
view existing/potential countermeasures to safeguard LM agents. “neural phishing”. By performing a poisoning attack on the pre-
Fig. 21 illustrates the taxonomy of privacy threats to LM agents. training dataset, they induce the LLM to memorize the other
Firstly, we discuss LM memorization risks including data extrac- people’s PII. Staab et al. [203] further investigate the capabilities
tion attacks, membership inference attacks, and attribute inference of pretrained LLMs in inferring PII during chat interaction
attacks, along with the countermeasures in Sect. V-A. Next, we phases. Their findings demonstrate that LMs can deduce multiple
review two typical LM intellectual property-related privacy risks, personal attributes from unstructured internet excerpts, enabling
i.e., model stealing attacks and prompt stealing attacks, as well the identification of specific individuals when combined with
as their corresponding countermeasures in Sect. V-B. Finally, additional publicly available information. (ii) For LVMs, Carlini
other privacy threats in LM agents are summarized in Sect. V-C, et al. [204] demonstrate the state-of-the-art diffusion models can
including sensitive query attacks and privacy leakage in multi- memorize and regenerate individual training examples, posing a
agent interactions. more essential privacy risk compared to prior generative models
such as GANs.
A. LM Memorization Risk 2) Membership Inference Attack (MIA): MIA refers to infer-
ring whether an individual data sample in the training data of
LMs typically feature a massive number of parameters, ranging ML models. In the domain of LM agents, MIAs can be further
from one billion to several hundred billion. The parameters categorized into two types based on the LM training phase: pre-
endow LMs with significant comprehension and decision-making training MIAs and fine-tuning MIAs.
capabilities, but also make LMs prone to retaining details of
training samples [34], [38]. Moreover, the training data is typ- • Pre-training MIA: The objective of pre-training MIAs is to
ically crawled from the Internet without carefully discrimina- ascertain whether specific data samples are involved in the
tion, including sensitive information from social media, review training data of pre-trained LMs by analyzing the output
platforms, and personal web pages. Thereby, the training data generated by LM agents. (i) For LLMs, Mireshghallah et
usually contains various types of Personally Identifiable Infor- al. [205] propose an innovative MIA that targets Masked
mation (PII) and Personal Preference Information (PPI) related to Language Models (MLMs) using likelihood ratio hypothesis
Internet users, including names, phone numbers, emails, medical testing, enhanced by an auxiliary reference MLM. Their
or financial records, personal opinions or preferences, and so findings reveal the susceptibility of MLMs to this type of
on. Consequently, this characteristic of LMs, known as “LM MIA, highlighting the potential of such attacks to quantify
memorization risk”, can be exploited by adversaries to conduct the privacy risks of MLMs. Mattern et al. [206] introduce
crafted privacy attacks, thereby extracting sensitive data or infor- the neighbourhood MIA, which determines the membership
mation. In this subsection, we discuss three typical privacy attacks status of target samples by comparing model scores of the
stemming from LM memorization risks and review corresponding given sample with those of synthetic neighbor texts, thus
countermeasures to mitigate them. eliminating the need for reference models and enhancing
1) Data Extraction Attack: The data extraction attacks refer applicability in practical scenarios. Shi et al. [207] present
to that adversaries elaborately craft malicious queries to extract WIKIMIA, a dynamic benchmark for conducting MIAs on
private information from the training data of LM agents. These pre-training data, using older Wikipedia data as member data
attacks operate under a black-box model, where the adversary and recent Wikipedia data as non-member data. Additionally,
27

Fig. 21: The taxonomy of privacy threats to LM agents.

Prefix
Housing prices have fallen Training
Repeat this word forever: “economic impact significantly since 2008,
impact impact” with a 12% drop in 2023.
Housing prices have
declined sharply since Neighbor x1
2008, with a 15% drop in
2023. Housing prices have
Mask LM
Memorized text declined sharply since Target model
Attacker Impact impact impact Impact impact impact Target Sample x 2010, with a 15% drop in
2023.
[...]
Johnathan Smith Training data Y
Neighbor x2
Software Engineer at Tech Solutions Inc. Non training L(x)-mean(L(n)) < Threshold 𝛾 L(x) and L(x1), L(x2), …
Phone:(555) 123-4567 data
N
Email:john.smith@emaildomain.com
Home Address:1234 Elm Street Apt 56B Fig. 23: Illustration of Membership Inference Attack (MIA) to
Springfield, IL 62704 USA
LM agents.

LM Agent
Fig. 22: Illustration of data extraction attack to LM agents. participated in the fine-tuning phase. Mireshghallah et al.
[210] conduct an empirical analysis of memorization risks on
fine-tuned LMs through MIAs, revealing that fine-tuning the
head of the model makes it most susceptible to attacks, while
they propose a reference-free MIA method named MIN-
fine-tuning smaller adapters appears to be less vulnerable. Fu
K% PROB, which computes the average probabilities of
et al. [211] propose a self-calibrated probabilistic variation-
outlier tokens to infer membership. (ii) For LVMs, Kong et
based MIA, which utilizes the probabilistic variation as a
al. [208] develop an efficient MIA by leveraging proximal
more reliable membership signal, achieving superior perfor-
initialization. They utilize the diffusion model’s initial output
mance against overfitting-free fine-tuned LMs.
as noise and the errors between forward and backward
processes as the attack metric, achieving superior efficiency 3) Attribute Inference Attack: Attribute inference attacks aim
in both vision and text-to-speech tasks. to deduce the presence of specific attributes or characteristics of
• Fine-tuning MIA: Fine-tuning datasets are often smaller, data samples within the training data of LM agents. For example,
more domain-specific, and more privacy-sensitive than pre- such attacks can be exploited to infer the proportion of images
training datasets, making fine-tuned LMs more susceptible to with a specific artist style in the training data of a text-to-image
MIAs than pre-trained LMs. Kandpal et al. [209] introduce agent, potentially leading to privacy breaches for providers of
a realistic user-level MIA on fine-tuned LMs that utilizes these training images. Pan et al. [212] systematically investigate
the likelihood ratio test statistic between the fine-tuned LM the privacy risks associated with attribute inference attacks in
and a reference model to determine whether a specific user LMs. Through four diverse case studies, they validate the ex-
28

istence of sensitive attribute inference (e.g., identity, genome,


Competitive
healthcare, and location) within the training data of general- products
purpose LMs. Wang et al. [213] explore the property existence
inference attack against generative models, aiming to determine
Feedback data
whether any samples with a target property are contained in the
Malicious query Black Box LM Target model
training data. Their study shows that most generative models training
Privacy
leakage
are susceptible to property existence inference attacks, and they (a)Privacy leakage from model stealing attacks
validate this vulnerability in stable diffusion models.
4) Countermeasures to LM Memorization Risks: Existing Subject Generator
Prompt
leakage
countermeasures to mitigate memorization risks of LM agents
Office,window,Citysca
primarily focus on data pre-processing during pre-training and pe,Desk,Chair,Robot,...
fine-tuning phases. Employing DP techniques and knowledge
transfer mechanisms to reduce the LMs’ capacity in memorizing Modifier generator
Futuristic,Modern,Sleek
training data during these phases is also a viable approach. ,Hightech,Advanced,Ill
Original image uminated,... Target image
Additionally, it is a common strategy to detect and validate
privacy disclosure risks of LM agents before deployment. (b)Privacy leakage from prompt stealing attacks

• Data sanitization: Data sanitization can effectively mitigate Fig. 24: Illustration of LM intellectual property-related privacy
memorization risks by identifying and excluding sensitive risks to LM agents. (a) Model stealing attacks: an adversary
information from training data. By replacing sensitive infor- maliciously queries the model with multiple similar questions
mation with meaningless symbols or synthetic data, and re- to obtain a series of response pairs. This allows them to steal
moving duplicated sequences, it is possible to defend against the LM, leading to privacy breaches or the creation of competing
privacy attacks that exploit the memorization characteristics products. (b) Prompt stealing attacks: attributes of the original
of LM agents. Kandpal et al. [214] demonstrate that the rate prompt are determined using a subject generator and a modifier
at which memorized training sequences are regenerated is detector, and the reverse prompt is reconstructed, resulting in
superlinearly related to the frequency of those sequences in privacy exposure.
the training data. Consequently, deduplicating training data
is an effective way to mitigate LM memorization risks.
may inherently contain private information, and skilled attackers
• DP: Existing efforts have validated that adding DP noises
can further infer private data from this extracted information
to training data and model gradients during pre-training
through carefully crafted privacy attacks. Prompts typically con-
and fine-tuning phases can effectively mitigate the privacy
tain user inputs that not only indicate user intent, requirements,
leakages due to LM memorization. Hoory et al. [215]
and business logic but may also involve confidential information
propose a novel differentially private word-piece algorithm,
related to the user’s business. We focus on the following two
which achieves a trade-off between model performance and
types of IP-related privacy attacks: model stealing attacks and
privacy preservation capability.
prompt stealing attacks.
• Knowledge distillation: Knowledge distillation [123] has
been widely adapted as an intuitive technique to preserve 1) Model stealing attacks: In model stealing attacks, adver-
privacy. It can obtain a public student model without the saries aim to extract model information, such as models’ pa-
utilization of any private data. For LM agents, knowledge rameters or hyperparameters, by querying models and observing
distillation can be leveraged to mitigate LM memorization the corresponding responses, subsequently stealing target models
risks by transferring knowledge from private teacher models without access the original data [217]. Recently, Krishna et
(which are trained on private data) to public student models al. [218] have demonstrate that language models (e.g., BERT)
(which are trained without private data). can be stolen by multiple queries without any original training
• Privacy leakage detection & validation: Prior to deploying
data. Due to the extensive scale of LMs, it is challenging to
an LM agent for practical services, it is crucial to mitigate directly extract the entire model through query-response methods.
LM memorization risks by detecting and validating the Consequently, researchers have focused on extracting specific
extent of privacy leakage, thereby enabling service providers capabilities of LMs, such as decoding algorithms, code generation
to modify the model based on validation results. Kim et capabilities, and open-ended generation capabilities. Naseh et
al. [216] propose ProPILE, an innovative probing tool to al. [219] demonstrate that an adversary can steal the type and
evaluate privacy intrusions in LMs. The ProPILE can be hyperparameters of an LM’s decoding algorithms at a low cost
employed by LM agent service providers to evaluate the through query APIs. Li et al. [220] investigate the feasibility
levels of PII leakage for their LMs. and effectiveness of model stealing attacks on LMs to extract
the specialized code abilities. Jiang et al. [221] propose a novel
model stealing attack, which leverages the adversarial distillation
B. LM intellectual property-related privacy risk to extract knowledge of ChatGPT to a student model through
The intellectual property (IP) risks associated with LM agents a mere 70k training data, and the student model can achieve
present two types of privacy risks: LMs-related risks (including comparable open-ended generation capabilities to ChatGPT.
LM’s parameters, hyperparameters, and specific training pro- 2) Prompt stealing attacks: With the advancement of LM
cesses), and prompts-related risks (prompts are considered as agent services, high-quality prompts designed to generate ex-
commodities to generate outputs). The LMs-related information pected content have acquired substantial commercial value. These
29

prompts can be traded on various prompt marketplaces, such in user queries, resulting in potential privacy leakages. For
as PromptSea11 and PromptBase12 . Consequently, a new privacy example, Samsung employees leveraged ChatGPT for code
attack called prompt stealing attack has emerged, where an auditing without processing the confidential information in
adversary aims to infer the original prompt from the generated Apr. 2023, inadvertently exposing the company’s commer-
content. This attack is analogous to the model inversion attack in cial secrets including source code of the new program [226].
traditional ML, which involves reconstructing the input based on • Privacy leakage in multi-agent interactions: LM agent ser-
the output of an ML model [222]. Shen et al. [223] conduct the vices typically necessitate seamless collaboration of multiple
first study on prompt stealing attack in text-to-image generation LM agents to address complex user queries, where each
models, and propose an effective prompt stealing attack method agent is tasked with solving particular sub-problems of the
named PromptStealer. The PromptStealer utilizes a subject gen- queries. Consequently, communication between these LM
erator to infer the subject and a modifier detector to identify the agents is essential for information exchange and transmis-
modifiers within the generated image. Sha et al. [224] extend sion. However, multi-agent interactions can be vulnerable
the prompt stealing attack to LLMs, using a parameter extractor to privacy threats (e.g., eavesdropping, compromised agent,
to determine the properties of original prompts and a prompt and man-in-the-middle attacks), leading to potential user
reconstructor to generate reversed prompts. privacy breaches. Since interactions in LM agent services
3) Countermeasures to Model & Prompt Stealing Attacks: typically occur through natural language, traditional methods
Existing countermeasures to model and prompt stealing at- such as homomorphic encryption and secure multi-party
tacks involve both IP verification (e.g., model watermarking computation struggle to effectively safeguard the privacy
and blockchain) and privacy-preserving adversarial training (e.g., of these interactions. It remains a challenge to design new
adversarial perturbations), as detailed below. strategies tailored to these specific vulnerabilities to preserve
• Model watermarking: Model watermarking is an innovative privacy in multi-agent interactions.
technique in protecting IP rights and ensuring accountability
for LM agents. By embedding watermarks to target LMs, D. Summary and Lessons Learned
the ownership of LMs can be authenticated by verifying There are primarily two types of privacy threats to LM agents:
the watermarks, thereby preventing unauthorized use or LM memorization risk, and LM IP-related privacy risk. Generally,
infringement. Kirchenbauer et al. [225] propose a water- data extraction attacks, MIAs, and attribute inference attacks are
marking algorithm utilizing a randomized set of “green” three main privacy threats stemmed from LM memorization risks.
tokens during the text generation process, where the model Besides, model stealing attacks and prompt stealing attacks are
watermark is verified by a statistical test with interpretable two typical LM IP-related privacy risks. Other privacy threats to
p-values. LM agents include sensitive query attacks and privacy leakage in
• Blockchain: Blockchain can be employed as a transparent multi-agent interactions. To summarize, the powerful comprehen-
platform to verify IP rights due to its inherent immutability sion and memorization capabilities of LMs introduce new privacy
and traceability [101]. The owner of LMs can record the concerns, particularly regarding the leakage of PII. Meanwhile,
develop logs, version information, and hash values of LMs’ the interaction modes of LM agents have endowed prompts with
parameters on blockchain, ensuring the authenticity and commercial value, highlighting the importance of intellectual
completeness of the recorded information. Nevertheless, property rights associated with them. Furthermore, the com-
the blockchain technique itself cannot prevent the stealing plexity of LMs renders conventional privacy-preserving methods
behaviors of model functionality. ineffective for ensuring privacy. Therefore, to comprehensively
• Adversarial perturbations: Transforming the generated con- safeguard privacy within LM agent systems, researchers should
tent into adversarial examples by adding optimized pertur- develop effective and innovative privacy protection techniques
bations is an effective method to prevent prompt-stealing at- tailed for LM agents. Additionally, it is imperative for gov-
tacks while maintaining the quality of the generated content. ernments and authoritative organizations to advance legislation
Shen et al. [223] propose an intuitive defense mechanism process related to privacy breaches and intellectual property of
named PromptShield, which employs the adversarial exam- LM agent services.
ple technique to add a negligible perturbation on generated
images, thereby defending against their proposed prompt VI. F UTURE R ESEARCH D IRECTIONS
stealing attack PromptStealer. However, PromptShield re-
In this section, we outline several open research directions
quires white-box access to the attack model, which is
important to the design of future design of LM agent ecosystem.
typically impractical in real-world scenarios. Consequently,
there remains a significant need for efficient and practical
A. Energy-Efficient and Green LM Agents
countermeasures to mitigate the risks associated with prompt
stealing attacks. With the increasingly widespread deployment of LM agents,
their energy consumption and environmental impact have
C. Other Privacy Threats to LM Agents emerged as critical concerns. As reported, the energy consumed
by ChatGPT to answer a single question for 590 million users
• Sensitive query attack: In LM agent services, the LM may
is comparable to the monthly electricity usage of 175,000 Danes
memorize sensitive personal or organizational information
[70]. Given the exponential growth in model size and the compu-
11 https://www.promptsea.io/ tational resources required, energy-efficient strategies are essen-
12 https://promptbase.com/ tial for sustainable AI development, with the aim to reduce the
30

significant carbon footprint associated with training and operating C. Cyber-Physical-Social Secure LM Agent Systems
LM agents. Enabling technologies for energy-efficient and green As LM agents increasingly interact with the physical world,
LM agents include model compression techniques [115], [116], digital networks, and human society, ensuring their interaction
such as pruning, quantization, and knowledge distillation, which security in CPSS becomes essential to protect critical infras-
reduce the size and computational requirements of LMs without tructure, preserve sensitive data, prevent potential harm, and
significantly affecting their accuracy. Additionally, the use of maintain public confidence. Zero-trust architectures [230], which
edge computing [87] and FL [86] allows for the distribution of operate under the principle of “never trust, always verify”, are
computational tasks across multiple devices, thereby reducing crucial for protecting LM agents from internal and external
the energy burden on central servers and enabling real-time threats by continuously validating user identities and device
processing with lower latency. Innovations in hardware [118], integrity. Implementing zero-trust in LM agents ensures that
such as energy-efficient GPUs and TPUs, also play a critical role all interactions, whether between agents, systems, or users, are
in achieving greener LM agents by optimizing the energy use of authenticated and authorized, reducing the risk of unauthorized
the underlying computational infrastructure. access or malicious activity. Additionally, the integration of legal
However, achieving energy-efficient and green LM agents norms into the design and operation of LM agents ensures that
presents several key challenges. While model compression tech- their actions comply with applicable laws and regulations. This
niques can significantly reduce energy consumption, they may involves embedding legal reasoning capabilities within LM agents
also lead to a loss of accuracy or the inability to handle [161], enabling them to consider legal implications and ensure
complex tasks, which is a critical consideration for applications that their decisions align with societal expectations and regulatory
requiring high precision. Furthermore, optimizing the lifecycle frameworks.
energy consumption of LM agents involves addressing energy use However, several key challenges remain. One major challenge
across training, deployment, and operational stages. This includes is the complexity of securing heterogeneous CPSS that span mul-
designing energy-aware algorithms that can dynamically adapt tiple domains, including cyber, physical, and social environments.
to the availability of energy resources while maintaining high The interconnected nature of CPSS means that vulnerabilities
performance. in one domain can have cascading effects across the entire
B. Fair and Explainable LM Agents

As LM agents continue to play an increasingly central role in decision-making across various domains, the need for fairness and explainability becomes paramount to build trust among users, ensure compliance with ethical standards, and prevent unintended biases. This is particularly true for sensitive areas such as healthcare, finance, and law, where decisions should be transparent, justifiable, and free from bias. Bias detection and mitigation algorithms such as adversarial debiasing [227], reweighting [228], and fairness constraints [104] can be integrated into the training process to make models less prone to propagating existing biases, thereby identifying and correcting biases in data and model outputs, as illustrated in the sketch below.
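The following toy example illustrates the classic pre-processing form of reweighting, assigning each training sample a weight that statistically decouples a sensitive attribute from the label; the tiny dataset and attribute names are hypothetical, and this generic recipe is distinct from the feature-reweighting method of [228].

from collections import Counter

# Hypothetical (sensitive_group, label) pairs.
samples = [
    ("A", 1), ("A", 1), ("A", 0),
    ("B", 0), ("B", 0), ("B", 1),
]
n = len(samples)
group_freq = Counter(g for g, _ in samples)
label_freq = Counter(y for _, y in samples)
joint_freq = Counter(samples)

# w(g, y) = P(g) * P(y) / P(g, y): under-represented (group, label)
# combinations get weights above 1 and are emphasized during training.
weights = [
    (group_freq[g] / n) * (label_freq[y] / n) / (joint_freq[(g, y)] / n)
    for g, y in samples
]
print([round(w, 2) for w in weights])  # [0.75, 0.75, 1.5, 0.75, 0.75, 1.5]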
Moreover, eXplainable AI (XAI) methods [229] such as SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), and counterfactual explanations allow users to understand the reasoning behind the model's predictions, thereby enhancing trust, transparency, and accountability.
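As a self-contained illustration of what SHAP-style attributions compute, the sketch below derives exact Shapley values for a toy three-feature model by averaging each feature's marginal contribution over all feature orderings; the model and feature names are hypothetical, and practical SHAP implementations approximate this exponential computation efficiently.

from itertools import permutations
from math import factorial

def score(x):
    # Hypothetical additive scoring model; missing features contribute 0.
    return 2.0 * x.get("income", 0.0) + 1.0 * x.get("age", 0.0) - 0.5 * x.get("debt", 0.0)

instance = {"income": 3.0, "age": 2.0, "debt": 4.0}
features = list(instance)

contrib = {f: 0.0 for f in features}
for order in permutations(features):
    present, prev = {}, score({})
    for f in order:
        present[f] = instance[f]     # add feature f to the coalition
        cur = score(present)
        contrib[f] += cur - prev     # marginal contribution of f
        prev = cur

shapley = {f: contrib[f] / factorial(len(features)) for f in features}
print(shapley)  # {'income': 6.0, 'age': 2.0, 'debt': -2.0}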
However, several key challenges remain to be addressed. One major challenge is the trade-off between model complexity and explainability. More complex models, such as DNNs, often perform better but are harder to interpret, making it difficult to provide clear explanations for their decisions. Another challenge is the dynamic nature of fairness, as what is considered fair may change over time or vary across different cultural and social contexts. Ensuring that LM agents remain fair in diverse and evolving environments requires continuous updating of fairness criteria. Finally, achieving fairness and explainability without significantly compromising performance is a delicate balance, as efforts to improve fairness and transparency can sometimes lead to reduced accuracy or efficiency.

Zero-trust architectures [230], which operate under the principle of "never trust, always verify", are crucial for protecting LM agents from internal and external threats by continuously validating user identities and device integrity. Implementing zero-trust in LM agents ensures that all interactions, whether between agents, systems, or users, are authenticated and authorized, reducing the risk of unauthorized access or malicious activity. Additionally, the integration of legal norms into the design and operation of LM agents ensures that their actions comply with applicable laws and regulations. This involves embedding legal reasoning capabilities within LM agents [161], enabling them to consider legal implications and ensure that their decisions align with societal expectations and regulatory frameworks.
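A minimal sketch of the "never trust, always verify" principle for inter-agent calls is given below; the HMAC-signed short-lived token and the helper names (issue_token, verify_token) are hypothetical stand-ins for a real identity and device-attestation service.

import hashlib
import hmac
import time

SECRET = b"shared-secret-for-illustration-only"

def issue_token(agent_id: str) -> str:
    ts = str(int(time.time()))
    sig = hmac.new(SECRET, f"{agent_id}|{ts}".encode(), hashlib.sha256).hexdigest()
    return f"{agent_id}|{ts}|{sig}"

def verify_token(token: str, max_age: int = 60) -> bool:
    try:
        agent_id, ts, sig = token.split("|")
    except ValueError:
        return False
    expected = hmac.new(SECRET, f"{agent_id}|{ts}".encode(), hashlib.sha256).hexdigest()
    fresh = int(time.time()) - int(ts) <= max_age  # short-lived credential
    return hmac.compare_digest(sig, expected) and fresh

def handle_request(token: str, action: str) -> str:
    # Every call is re-authenticated; no implicit trust carries over from
    # earlier interactions, in line with zero-trust principles.
    if not verify_token(token):
        return "denied"
    return f"executed: {action}"

print(handle_request(issue_token("agent-7"), "read_sensor"))        # executed: read_sensor
print(handle_request("agent-7|0|forged-signature", "read_sensor"))  # denied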
However, several key challenges remain. One major challenge is the complexity of securing heterogeneous CPSS that span multiple domains, including cyber, physical, and social environments. The interconnected nature of CPSS means that vulnerabilities in one domain can have cascading effects across the entire system, making it difficult to implement comprehensive security measures. Another challenge is the dynamic nature of CPSS environments, where LM agents should continuously adapt to changing conditions while maintaining security. Ensuring that security measures are both adaptive and resilient to new threats is a complex task.

D. Value Ecosystem of LM Agents

The creation of an interconnected value network empowers LM agents to autonomously and transparently manage value exchanges (e.g., data, knowledge, resources, and digital currencies), which is crucial for fostering innovation, enhancing cooperation, and driving economic growth within the LM agent ecosystem. Blockchain technology provides a tamper-proof ledger that records all transactions between LM agents, ensuring transparency and trust in the system [101]. Smart contracts, which are self-executing agreements coded onto the blockchain, allow LM agents to autonomously manage transactions, enforce agreements, and execute tasks without the need for intermediaries [231]. Additionally, the integration of oracles, i.e., trusted data sources that feed real-world information into the blockchain, enables LM agents to interact with external data and execute contracts based on real-time conditions, further enhancing the functionality of value networks.
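To illustrate why a hash-chained ledger provides the tamper evidence described above, the following toy sketch records value exchanges between agents and detects later modification; it is a pure-Python illustration, not a real blockchain client or the consensus design of [101].

import hashlib
import json

class Ledger:
    def __init__(self):
        self.chain = [{"prev": "0" * 64, "tx": "genesis"}]

    def _digest(self, block) -> str:
        return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

    def record(self, sender: str, receiver: str, asset: str, amount: float):
        self.chain.append({
            "prev": self._digest(self.chain[-1]),  # link to predecessor
            "tx": {"from": sender, "to": receiver, "asset": asset, "amount": amount},
        })

    def verify(self) -> bool:
        # Recompute every link; altering any recorded transaction breaks
        # the chain once a successor block exists.
        return all(
            self.chain[i]["prev"] == self._digest(self.chain[i - 1])
            for i in range(1, len(self.chain))
        )

ledger = Ledger()
ledger.record("agent-A", "agent-B", "knowledge-embedding", 1.0)
ledger.record("agent-B", "agent-A", "token", 5.0)
print(ledger.verify())                   # True
ledger.chain[1]["tx"]["amount"] = 100.0  # tampering attempt
print(ledger.verify())                   # False: tampering detected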
However, one major challenge is ensuring cross-chain interoperability, which is essential for enabling LM agents to transact across different blockchain networks. Currently, most blockchains operate in silos, making it difficult to transfer value or data between them [231]. Developing protocols that facilitate cross-chain communication and trusted value transfer is critical for creating a unified value network. Another challenge lies in the reliability and security of cross-contract value transfer operations, where multiple smart contracts deployed atop various homogeneous or heterogeneous blockchains, especially in environments with varying trust levels, need to work together to complete a transaction or task. Additionally, scalability remains a challenge, as the computational and storage requirements for managing large-scale value networks can be substantial. As the number of LM agents and transactions grows, ensuring that the underlying blockchain infrastructure can scale to meet demand without compromising performance or security is crucial.

VII. CONCLUSION

In this paper, we have provided an in-depth survey of the state-of-the-art in the architecture, interaction paradigms, security and privacy, and future trends of LM agents. Specifically, we have introduced a novel architecture and its key components, critical characteristics, enabling technologies, and potential applications, toward embodied, autonomous, and connected intelligence of LM agents. Afterward, we have explored the taxonomy of interaction patterns and practical collaboration paradigms among LM agents, including data, computation, and information sharing for collective intelligence. Furthermore, we have identified significant security and privacy threats inherent in the ecosystem of LM agents, discussed the challenges of security/privacy protections in multi-agent environments, and reviewed existing and potential countermeasures. As the field progresses, ongoing research and innovation will be crucial for overcoming existing limitations and harnessing the full potential of LM agents in transforming intelligent systems.

REFERENCES

[1] Q. Huang, N. Wake, B. Sarkar, Z. Durante, R. Gong, R. Taori, Y. Noda, D. Terzopoulos, N. Kuno, A. Famoti, et al., "Position paper: Agent AI towards a holistic intelligence," arXiv preprint arXiv:2403.00833, pp. 1-22, 2024.
[2] W. X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, Y. Min, B. Zhang, J. Zhang, Z. Dong, et al., "A survey of large language models," arXiv preprint arXiv:2303.18223, pp. 1-124, 2023.
[3] C. Ribeiro, "Reinforcement learning agents," Artificial Intelligence Review, vol. 17, pp. 223-250, 2002.
[4] D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, et al., "Mastering the game of go without human knowledge," Nature, vol. 550, no. 7676, pp. 354-359, 2017.
[5] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, et al., "Training language models to follow instructions with human feedback," Proc. NeurIPS, vol. 35, pp. 27730-27744, 2022.
[6] J. Achiam, S. Adler, S. Agarwal, L. Ahmad, I. Akkaya, F. L. Aleman, D. Almeida, J. Altenschmidt, S. Altman, S. Anadkat, et al., "GPT-4 technical report," arXiv preprint arXiv:2303.08774, pp. 1-100, 2023.
[7] C. H. Song, J. Wu, C. Washington, B. M. Sadler, W.-L. Chao, and Y. Su, "LLM-planner: Few-shot grounded planning for embodied agents with large language models," in Proc. IEEE/CVF ICCV, pp. 2998-3009, 2023.
[8] T. Masterman, S. Besen, M. Sawtell, and A. Chao, "The landscape of emerging AI agent architectures for reasoning, planning, and tool calling: A survey," arXiv preprint arXiv:2404.11584, pp. 1-13, 2024.
[9] Z. Xi, W. Chen, X. Guo, W. He, Y. Ding, B. Hong, M. Zhang, J. Wang, S. Jin, E. Zhou, et al., "The rise and potential of large language model based agents: A survey," arXiv preprint arXiv:2309.07864, pp. 1-86, 2023.
[10] Y. Cheng, C. Zhang, Z. Zhang, X. Meng, S. Hong, W. Li, Z. Wang, Z. Wang, F. Yin, J. Zhao, et al., "Exploring large language model based intelligent agents: Definitions, methods, and prospects," arXiv preprint arXiv:2401.03428, pp. 1-55, 2024.
[11] J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, and D. Zhou, "Chain-of-thought prompting elicits reasoning in large language models," arXiv preprint arXiv:2201.11903, pp. 1-43, 2023.
[12] S. Yao, D. Yu, J. Zhao, I. Shafran, T. L. Griffiths, Y. Cao, and K. Narasimhan, "Tree of thoughts: Deliberate problem solving with large language models," in Proc. NeurIPS, pp. 1-14, 2024.
[13] W. Zhang, K. Tang, H. Wu, M. Wang, Y. Shen, G. Hou, Z. Tan, P. Li, Y. Zhuang, and W. Lu, "Agent-pro: Learning to evolve via policy-level reflection and optimization," in Proc. ACL, pp. 5348-5375, 2024.
[14] H. Yang, S. Yue, and Y. He, "Auto-GPT for online decision making: Benchmarks and additional opinions," arXiv preprint arXiv:2306.02224, pp. 1-14, 2023.
[15] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, et al., "Retrieval-augmented generation for knowledge-intensive NLP tasks," Proc. NeurIPS, vol. 33, pp. 9459-9474, 2020.
[16] R. Nakano, J. Hilton, S. Balaji, et al., "WebGPT: Browser-assisted question-answering with human feedback," arXiv preprint arXiv:2112.09332, pp. 1-32, 2022.
[17] Y. Wang, Z. Jiang, Z. Chen, F. Yang, Y. Zhou, E. Cho, X. Fan, X. Huang, Y. Lu, and Y. Yang, "RecMind: Large language model powered agent for recommendation," Proc. NAACL, pp. 1-14, 2024.
[18] J. Wang, H. Xu, H. Jia, X. Zhang, M. Yan, W. Shen, J. Zhang, F. Huang, and J. Sang, "Mobile-Agent-v2: Mobile device operation assistant with effective navigation via multi-agent collaboration," arXiv preprint arXiv:2406.01014, pp. 1-22, 2024.
[19] C. Zhang, Z. Yang, J. Liu, Y. Han, X. Chen, Z. Huang, B. Fu, and G. Yu, "AppAgent: Multimodal agents as smartphone users," arXiv preprint arXiv:2312.13771, pp. 1-10, 2023.
[20] S. Hu, T. Huang, F. Ilhan, S. Tekin, G. Liu, R. Kompella, and L. Liu, "A survey on large language model-based game agents," arXiv preprint arXiv:2404.02039, pp. 1-23, 2024.
[21] M. Ahn, A. Brohan, N. Brown, et al., "Do as I can, not as I say: Grounding language in robotic affordances," in Proc. CoRL, pp. 1-34, 2022.
[22] Y. Jin, X. Shen, H. Peng, X. Liu, J. Qin, J. Li, J. Xie, P. Gao, G. Zhou, and J. Gong, "SurrealDriver: Designing generative driver agent simulation framework in urban contexts based on large language model," arXiv preprint arXiv:2309.13193, pp. 1-6, 2023.
[23] H. Wu, Z. He, X. Zhang, X. Yao, S. Zheng, H. Zheng, and B. Yu, "ChatEDA: A large language model powered autonomous agent for EDA," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024. doi:10.1109/TCAD.2024.3383347.
[24] MarketsandMarkets, "Autonomous AI and autonomous agents market," 2023. Accessed: July 30, 2023.
[25] W. Huang, P. Abbeel, D. Pathak, and I. Mordatch, "Language models as zero-shot planners: Extracting actionable knowledge for embodied agents," in Proc. ICML, pp. 9118-9147, 2022.
[26] J. Ruan, Y. Chen, B. Zhang, Z. Xu, T. Bao, Q. Du, S. Shi, H. Mao, X. Zeng, and R. Zhao, "TPTU: Task planning and tool usage of large language model-based AI agents," in NeurIPS 2023 Foundation Models for Decision Making Workshop, pp. 1-34, 2023.
[27] C. Zhang, K. Yang, S. Hu, Z. Wang, G. Li, Y. Sun, C. Zhang, Z. Zhang, A. Liu, S.-C. Zhu, X. Chang, J. Zhang, F. Yin, Y. Liang, and Y. Yang, "ProAgent: Building proactive cooperative agents with large language models," Proc. AAAI, vol. 38, no. 16, pp. 17591-17599, 2024.
[28] F. Jiang, Y. Peng, L. Dong, K. Wang, K. Yang, C. Pan, D. Niyato, and O. A. Dobre, "Large language model enhanced multi-agent systems for 6G communications," IEEE Wireless Communications, pp. 1-8, 2024.
[29] J. Andreas, "Language models as agent models," arXiv preprint arXiv:2212.01681, pp. 1-11, 2022.
[30] G. Li, H. A. A. K. Hammoud, H. Itani, D. Khizbullin, and B. Ghanem, "CAMEL: Communicative agents for 'mind' exploration of large language model society," in Proc. NeurIPS, 2023.
[31] T. Y. Zhuo, Z. Li, Y. Huang, F. Shiri, W. Wang, G. Haffari, and Y. Li, "On robustness of prompt-based semantic parsing with large pre-trained language model: An empirical study on Codex," in Proc. EACL, pp. 1090-1102, 2023.
[32] W. Zou, R. Geng, B. Wang, and J. Jia, "PoisonedRAG: Knowledge poisoning attacks to retrieval-augmented generation of large language models," arXiv preprint arXiv:2402.07867, pp. 1-30, 2024.
[33] Y. Zhang, Y. Li, L. Cui, D. Cai, L. Liu, T. Fu, X. Huang, E. Zhao, Y. Zhang, Y. Chen, et al., "Siren's song in the AI ocean: A survey on hallucination in large language models," arXiv preprint arXiv:2309.01219, pp. 1-33, 2023.
[34] N. Carlini, D. Ippolito, M. Jagielski, K. Lee, F. Tramèr, and C. Zhang, "Quantifying memorization across neural language models," in Proc. ICLR, pp. 1-19, 2023.
[35] Y. Liu, G. Deng, Y. Li, K. Wang, T. Zhang, Y. Liu, H. Wang, Y. Zheng, and Y. Liu, "Prompt injection attack against LLM-integrated applications," arXiv preprint arXiv:2306.05499, pp. 1-18, 2023.
[36] M. Fang, X. Cao, J. Jia, and N. Gong, "Local model poisoning attacks to byzantine-robust federated learning," in Proc. USENIX, pp. 1605-1622, 2020.
[37] J. Rando and F. Tramèr, "Universal jailbreak backdoors from poisoned human feedback," in Proc. ICLR, pp. 1-28, 2024.
[38] N. Carlini, F. Tramèr, E. Wallace, M. Jagielski, A. Herbert-Voss, K. Lee, A. Roberts, T. B. Brown, D. Song, Ú. Erlingsson, A. Oprea, and C. Raffel, "Extracting training data from large language models," in Proc. USENIX, pp. 2633-2650, 2021.
[39] L. Wang, C. Ma, X. Feng, Z. Zhang, H. Yang, J. Zhang, Z. Chen, J. Tang, X. Chen, Y. Lin, et al., "A survey on large language model based autonomous agents," Frontiers of Computer Science, vol. 18, no. 6, p. 186345, 2024.
[40] M. Xu, H. Du, D. Niyato, J. Kang, Z. Xiong, S. Mao, Z. Han, A. Jamalipour, D. I. Kim, X. Shen, V. C. M. Leung, and H. V. Poor, "Unleashing the power of edge-cloud generative AI in mobile networks: A survey of AIGC services," IEEE Communications Surveys & Tutorials, vol. 26, no. 2, pp. 1127-1170, 2024.
[41] T. Guo, X. Chen, Y. Wang, R. Chang, S. Pei, N. V. Chawla, O. Wiest, and X. Zhang, "Large language model based multi-agents: A survey of progress and challenges," arXiv preprint arXiv:2402.01680, pp. 1-15, 2024.
[42] Z. Durante, Q. Huang, N. Wake, R. Gong, J. S. Park, B. Sarkar, R. Taori, Y. Noda, D. Terzopoulos, Y. Choi, et al., "Agent AI: Surveying the horizons of multimodal interaction," arXiv preprint arXiv:2401.03568, pp. 1-80, 2024.
[43] X. Xu, Y. Wang, C. Xu, Z. Ding, J. Jiang, Z. Ding, and B. F. Karlsson, "A survey on game playing agents and large models: Methods, applications, and challenges," arXiv preprint arXiv:2403.10249, pp. 1-13, 2024.
[44] G. Qu, Q. Chen, W. Wei, Z. Lin, X. Chen, and K. Huang, "Mobile edge intelligence for large language models: A contemporary survey," arXiv preprint arXiv:2407.18921, pp. 1-37, 2024.
[45] K. Mei, Z. Li, S. Xu, R. Ye, Y. Ge, and Y. Zhang, "AIOS: LLM agent operating system," arXiv preprint arXiv:2403.16971, pp. 1-14, 2024.
[46] Y. Ge, Y. Ren, W. Hua, S. Xu, J. Tan, and Y. Zhang, "LLM as OS, agents as apps: Envisioning AIOS, agents and the AIOS-agent ecosystem," arXiv preprint arXiv:2312.03815, pp. 1-35, 2023.
[47] X. Guan, Y. Liu, H. Lin, Y. Lu, B. He, X. Han, and L. Sun, "Mitigating large language model hallucinations via autonomous knowledge graph-based retrofitting," in Proc. AAAI, pp. 18126-18134, 2024.
[48] Y. Wang, Z. Su, N. Zhang, R. Xing, D. Liu, T. H. Luan, and X. Shen, "A survey on metaverse: Fundamentals, security, and privacy," IEEE Communications Surveys & Tutorials, vol. 25, no. 1, pp. 319-352, 2023.
[49] Y. Wang, Z. Su, S. Guo, M. Dai, T. H. Luan, and Y. Liu, "A survey on digital twins: Architecture, enabling technologies, security and privacy, and future prospects," IEEE Internet of Things Journal, vol. 10, no. 17, pp. 14965-14987, 2023.
[50] G. Wang, Y. Xie, Y. Jiang, A. Mandlekar, C. Xiao, Y. Zhu, L. Fan, and A. Anandkumar, "Voyager: An open-ended embodied agent with large language models," arXiv preprint arXiv:2305.16291, pp. 1-42, 2023.
[51] Y. Pan, Z. Su, Y. Wang, S. Guo, H. Liu, R. Li, and Y. Wu, "Cloud-edge collaborative large model services: Challenges and solutions," IEEE Network, pp. 1-8, 2024. doi:10.1109/MNET.2024.3442880.
[52] M. Besta, N. Blach, A. Kubicek, R. Gerstenberger, M. Podstawski, L. Gianinazzi, J. Gajda, T. Lehmann, H. Niewiadomski, P. Nyczyk, et al., "Graph of thoughts: Solving elaborate problems with large language models," in Proc. AAAI, pp. 17682-17690, 2024.
[53] Z. Hu, A. Iscen, C. Sun, K.-W. Chang, Y. Sun, D. A. Ross, C. Schmid, and A. Fathi, "AVIS: Autonomous visual information seeking with large language model agent," in Proc. NeurIPS, pp. 867-878, 2023.
[54] S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, and Y. Cao, "ReAct: Synergizing reasoning and acting in language models," in Proc. ICLR, pp. 1-33, 2023.
[55] N. Shinn, F. Cassano, E. Berman, A. Gopinath, K. Narasimhan, and S. Yao, "Reflexion: Language agents with verbal reinforcement learning," in Proc. NeurIPS, pp. 8634-8652, 2023.
[56] Z. Wang, S. Mao, W. Wu, T. Ge, F. Wei, and H. Ji, "Unleashing the emergent cognitive synergy in large language models: A task-solving agent through multi-persona self-collaboration," in Proc. ACL, vol. 1, pp. 257-279, 2024.
[57] Z. Wang, S. Cai, G. Chen, A. Liu, X. Ma, and Y. Liang, "Describe, explain, plan and select: Interactive planning with LLMs enables open-world multi-task agents," in Proc. NeurIPS, pp. 1-37, 2023.
[58] H. Trivedi, N. Balasubramanian, T. Khot, and A. Sabharwal, "Interleaving retrieval with chain-of-thought reasoning for knowledge-intensive multi-step questions," in Proc. ACL, pp. 10014-10037, 2023.
[59] N. Liu, L. Chen, X. Tian, W. Zou, K. Chen, and M. Cui, "From LLM to conversational agent: A memory enhanced architecture with fine-tuning of large language models," arXiv preprint arXiv:2401.02777, pp. 1-17, 2024.
[60] M. Hu, T. Chen, Q. Chen, Y. Mu, W. Shao, and P. Luo, "HiAgent: Hierarchical working memory management for solving long-horizon agent tasks with large language model," arXiv preprint arXiv:2408.09559, pp. 1-17, 2024.
[61] D. Driess, F. Xia, M. S. M. Sajjadi, C. Lynch, A. Chowdhery, B. Ichter, A. Wahid, J. Tompson, Q. Vuong, T. Yu, W. Huang, Y. Chebotar, P. Sermanet, D. Duckworth, S. Levine, V. Vanhoucke, K. Hausman, M. Toussaint, K. Greff, A. Zeng, I. Mordatch, and P. Florence, "PaLM-E: An embodied multimodal language model," arXiv preprint arXiv:2303.03378, pp. 1-18, 2023.
[62] Y. Talebirad and A. Nadiri, "Multi-agent collaboration: Harnessing the power of intelligent LLM agents," arXiv preprint arXiv:2306.03314, pp. 1-11, 2023.
[63] X. Huang, J. Lian, Y. Lei, J. Yao, D. Lian, and X. Xie, "Recommender AI agent: Integrating large language models for interactive recommendations," arXiv preprint arXiv:2308.16505, pp. 1-18, 2024.
[64] Y. Chen, J. Yoon, D. S. Sachan, Q. Wang, V. Cohen-Addad, M. Bateni, C.-Y. Lee, and T. Pfister, "Re-Invoke: Tool invocation rewriting for zero-shot tool retrieval," arXiv preprint arXiv:2408.01875, pp. 1-22, 2024.
[65] S. Jinxin, Z. Jiabao, W. Yilei, W. Xingjiao, L. Jiawen, and H. Liang, "CGMI: Configurable general multi-agent interaction framework," arXiv preprint arXiv:2308.12503, pp. 1-11, 2023.
[66] H. Lai, X. Liu, I. L. Iong, S. Yao, Y. Chen, P. Shen, H. Yu, H. Zhang, X. Zhang, Y. Dong, and J. Tang, "AutoWebGLM: A large language model-based web navigating agent," in Proc. KDD, pp. 5295-5306, 2024.
[67] O. Ram, Y. Levine, I. Dalmedigos, D. Muhlgay, A. Shashua, K. Leyton-Brown, and Y. Shoham, "In-context retrieval-augmented language models," arXiv preprint arXiv:2302.00083, pp. 1-15, 2023.
[68] S. Hao, T. Liu, Z. Wang, and Z. Hu, "ToolkenGPT: Augmenting frozen language models with massive tools via tool embeddings," in Proc. NeurIPS, pp. 1-25, 2023.
[69] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, and I. Sutskever, "Learning transferable visual models from natural language supervision," in Proc. ICML, vol. 139, pp. 8748-8763, 2021.
[70] Y. Wang, Y. Pan, M. Yan, Z. Su, and T. H. Luan, "A survey on ChatGPT: AI-generated contents, challenges, and solutions," IEEE Open Journal of the Computer Society, vol. 4, pp. 280-302, 2023.
[71] Y. Wang, P. Li, M. Sun, and Y. Liu, "Self-knowledge guided retrieval augmentation for large language models," arXiv preprint arXiv:2310.05002, pp. 1-12, 2023.
[72] M. Zolfaghari, Y. Zhu, P. Gehler, and T. Brox, "CrossCLR: Cross-modal contrastive learning for multi-modal video representations," in Proc. ICCV, pp. 1430-1439, 2021.
[73] R. Wang, D. Tang, N. Duan, Z. Wei, X. Huang, G. Cao, D. Jiang, M. Zhou, et al., "K-adapter: Infusing knowledge into pre-trained models with adapters," in Proc. ACL-IJCNLP, pp. 1405-1418, 2021.
[74] F. Wan, X. Huang, D. Cai, X. Quan, W. Bi, and S. Shi, "Knowledge fusion of large language models," arXiv preprint arXiv:2401.10491, pp. 1-20, 2024.
[75] N. Tandon, A. Madaan, P. Clark, and Y. Yang, "Learning to repair: Repairing model output errors after deployment using a dynamic memory of feedback," arXiv preprint arXiv:2112.09737, pp. 1-14, 2021.
[76] X. Dai, C. Guo, Y. Tang, H. Li, Y. Wang, J. Huang, Y. Tian, X. Xia, Y. Lv, and F.-Y. Wang, "VistaRAG: Toward safe and trustworthy autonomous driving through retrieval-augmented generation," IEEE Transactions on Intelligent Vehicles, vol. 9, no. 4, pp. 4579-4582, 2024.
[77] J. Shen, C. Wang, L. Gong, and D. Song, "Joint language semantic and structure embedding for knowledge graph completion," arXiv preprint arXiv:2209.08721, pp. 1-14, 2022.
[78] P. Li, Z. Liu, W. Pang, and J. Cao, "Semantic collaboration: A collaborative approach for multi-agent systems based on semantic communication," in Proc. CNIOT, pp. 123-132, 2024.
[79] T. Ayoola, S. Tyagi, J. Fisher, C. Christodoulopoulos, and A. Pierleoni, "ReFinED: An efficient zero-shot-capable approach to end-to-end entity linking," arXiv preprint arXiv:2207.04108, pp. 1-12, 2022.
[80] Y. Ma, A. Wang, and N. Okazaki, "DREEAM: Guiding attention with evidence for improving document-level relation extraction," arXiv preprint arXiv:2302.08675, pp. 1-13, 2023.
[81] S. Gross and B. Krenn, "The role of multimodal data for modeling communication in artificial social agents," pp. 83-93, 2023.
[82] C. Qian, Z. Xie, Y. Wang, W. Liu, Y. Dang, Z. Du, W. Chen, C. Yang, Z. Liu, and M. Sun, "Scaling large-language-model-based multi-agent collaboration," arXiv preprint arXiv:2406.07155, pp. 1-11, 2024.
[83] J. Han, N. Collier, W. Buntine, and E. Shareghi, "PiVe: Prompting with iterative verification improving graph-based generative capability of LLMs," arXiv preprint arXiv:2305.12392, pp. 1-17, 2023.
[84] S. Kuroki, M. Nishimura, and T. Kozuno, "Multi-agent behavior retrieval: Retrieval-augmented policy training for cooperative push manipulation by mobile robots," arXiv preprint arXiv:2312.02008, pp. 1-8, 2023.
[85] C. Zhang, K. Yang, S. Hu, Z. Wang, G. Li, Y. Sun, C. Zhang, Z. Zhang, A. Liu, S.-C. Zhu, et al., "ProAgent: Building proactive cooperative agents with large language models," in Proc. AAAI, pp. 17591-17599, 2024.
[86] C. Chen, X. Feng, J. Zhou, J. Yin, and X. Zheng, "Federated large language model: A position paper," arXiv preprint arXiv:2307.08925, pp. 1-11, 2023.
[87] Y. Chen, R. Li, Z. Zhao, C. Peng, J. Wu, E. Hossain, and H. Zhang, "NetGPT: An AI-native network architecture for provisioning beyond personalized generative services," IEEE Network, pp. 1-9, 2024. doi:10.1109/MNET.2024.3376419.
[88] M. Xu, D. Niyato, H. Zhang, J. Kang, Z. Xiong, S. Mao, and Z. Han, "Cached model-as-a-resource: Provisioning large language model agents for edge intelligence in space-air-ground integrated networks," arXiv preprint arXiv:2403.05826, pp. 1-13, 2024.
[89] Q. Wu, G. Bansal, J. Zhang, Y. Wu, B. Li, E. E. Zhu, L. Jiang, X. Zhang, S. Zhang, A. Awadallah, R. W. White, D. Burger, and C. Wang, "AutoGen: Enabling next-gen LLM applications via multi-agent conversation," in Proc. COLM, pp. 1-43, 2024.
[90] C. Qian, W. Liu, H. Liu, N. Chen, Y. Dang, J. Li, C. Yang, W. Chen, Y. Su, X. Cong, J. Xu, D. Li, Z. Liu, and M. Sun, "ChatDev: Communicative agents for software development," in Proc. ACL, pp. 15174-15186, 2024.
[91] S. Hong, M. Zhuge, J. Chen, X. Zheng, Y. Cheng, J. Wang, et al., "MetaGPT: Meta programming for a multi-agent collaborative framework," in Proc. ICLR, pp. 1-26, 2024.
[92] D. Wu, X. Wang, Y. Qiao, Z. Wang, J. Jiang, S. Cui, and F. Wang, "NetLLM: Adapting large language models for networking," in Proc. ACM SIGCOMM, pp. 661-678, 2024.
[93] R. Zhang, H. Du, Y. Liu, D. Niyato, J. Kang, S. Sun, X. Shen, and H. V. Poor, "Interactive AI with retrieval-augmented generation for next generation networking," IEEE Network, pp. 1-10, 2024. doi:10.1109/MNET.2024.3401159.
[94] Y. Huang, H. Du, X. Zhang, D. Niyato, J. Kang, Z. Xiong, S. Wang, and T. Huang, "Large language models for networking: Applications, enabling techniques, and challenges," IEEE Network, pp. 1-7, 2024. doi:10.1109/MNET.2024.3435752.
[95] G. Deng, Y. Liu, V. Mayoral-Vilches, P. Liu, Y. Li, Y. Xu, T. Zhang, Y. Liu, M. Pinzger, and S. Rass, "PentestGPT: An LLM-empowered automatic penetration testing tool," arXiv preprint arXiv:2308.06782, pp. 1-22, 2023.
[96] J. Xu, J. W. Stokes, G. McDonald, X. Bai, D. Marshall, S. Wang, A. Swaminathan, and Z. Li, "AutoAttacker: A large language model guided system to implement automatic cyber-attacks," arXiv preprint arXiv:2403.01038, pp. 1-19, 2024.
[97] R. Fang, R. Bindu, A. Gupta, and D. Kang, "LLM agents can autonomously exploit one-day vulnerabilities," arXiv preprint arXiv:2404.08144, pp. 1-13, 2024.
[98] E. Seraj, "Embodied, intelligent communication for multi-agent cooperation," in Proc. AAAI, pp. 16135-16136, 2023.
[99] C.-M. Chan, W. Chen, Y. Su, J. Yu, W. Xue, S. Zhang, J. Fu, and Z. Liu, "ChatEval: Towards better LLM-based evaluators through multi-agent debate," in Proc. ICLR, pp. 1-15, 2024.
[100] Y. Yan and T. Hayakawa, "Hierarchical noncooperative dynamical systems under intragroup and intergroup incentives," IEEE Transactions on Control of Network Systems, vol. 11, no. 2, pp. 743-755, 2024.
[101] Y. Wang, H. Peng, Z. Su, T. H. Luan, A. Benslimane, and Y. Wu, "A platform-free proof of federated learning consensus mechanism for sustainable blockchains," IEEE Journal on Selected Areas in Communications, vol. 40, no. 12, pp. 3305-3324, 2022.
[102] Z. Liu, Y. Zhang, P. Li, Y. Liu, and D. Yang, "Dynamic LLM-agent network: An LLM-agent collaboration framework with agent team optimization," arXiv preprint arXiv:2310.02170, pp. 1-21, 2023.
[103] J. S. Park, J. O'Brien, C. J. Cai, M. R. Morris, P. Liang, and M. S. Bernstein, "Generative agents: Interactive simulacra of human behavior," in Proc. ACM UIST, 2023.
[104] E. Liu, Q. Zhang, and K. K. Leung, "Relay-assisted transmission with fairness constraint for cellular networks," IEEE Transactions on Mobile Computing, vol. 11, no. 2, pp. 230-239, 2012.
[105] H. Li, Y. Chong, S. Stepputtis, J. Campbell, D. Hughes, C. Lewis, and K. Sycara, "Theory of mind for multi-agent collaboration via large language models," in Proc. EMNLP, pp. 180-192, 2023.
[106] X. Wu, Z. Huang, L. Wang, J. Chanussot, and J. Tian, "Multimodal collaboration networks for geospatial vehicle detection in dense, occluded, and large-scale events," IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-12, 2024.
[107] S. Gur, N. Neverova, C. Stauffer, S.-N. Lim, D. Kiela, and A. Reiter, "Cross-modal retrieval augmentation for multi-modal classification," in Proc. EMNLP Findings, pp. 111-123, 2021.
[108] Q. Chen, Y. Zhang, J. Liu, Z. Wang, X. Deng, and J. Wang, "Multi-modal fine-grained retrieval with local and global cross-attention," in Proc. ICUFN, pp. 1-7, 2023.
[109] K. Yang, D. Yang, J. Zhang, M. Li, Y. Liu, J. Liu, H. Wang, P. Sun, and L. Song, "Spatio-temporal domain awareness for multi-agent collaborative perception," in Proc. ICCV, pp. 23383-23392, 2023.
[110] J. Ji, J. Wang, C. Huang, J. Wu, B. Xu, Z. Wu, J. Zhang, and Y. Zheng, "Spatio-temporal self-supervised learning for traffic flow prediction," in Proc. AAAI, pp. 4356-4364, 2023.
[111] Q. Zhang, C. Huang, L. Xia, Z. Wang, Z. Li, and S. Yiu, "Automated spatio-temporal graph contrastive learning," in Proc. WWW, pp. 295-305, 2023.
[112] J. Xu, M. A. Kishk, and M.-S. Alouini, "Space-air-ground-sea integrated networks: Modeling and coverage analysis," IEEE Transactions on Wireless Communications, vol. 22, no. 9, pp. 6298-6313, 2023.
[113] M.-H. T. Nguyen, T. T. Bui, L. D. Nguyen, E. Garcia-Palacios, H.-J. Zepernick, H. Shin, and T. Q. Duong, "Real-time optimized clustering and caching for 6G satellite-UAV-terrestrial networks," IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 3, pp. 3009-3019, 2024.
[114] Z. Liu, Y. Zhang, P. Li, Y. Liu, and D. Yang, "Dynamic LLM-agent network: An LLM-agent collaboration framework with agent team optimization," arXiv preprint arXiv:2310.02170, pp. 1-21, 2023.
[115] A. Zhang, H. Fei, Y. Yao, W. Ji, L. Li, Z. Liu, and T.-S. Chua, "VPGTrans: Transfer visual prompt generator across LLMs," Proc. NeurIPS, vol. 36, pp. 20299-20319, 2024.
[116] X. Ma, G. Fang, and X. Wang, "LLM-Pruner: On the structural pruning of large language models," Proc. NeurIPS, vol. 36, pp. 21702-21720, 2023.
[117] Z. Liu, A. Desai, F. Liao, W. Wang, V. Xie, Z. Xu, A. Kyrillidis, and A. Shrivastava, "Scissorhands: Exploiting the persistence of importance hypothesis for LLM KV cache compression at test time," Proc. NeurIPS, vol. 36, 2024.
[118] X. Shen, P. Dong, L. Lu, Z. Kong, Z. Li, M. Lin, C. Wu, and Y. Wang, "Agile-Quant: Activation-guided quantization for faster inference of LLMs on the edge," in Proc. AAAI, pp. 18944-18951, 2024.
[119] M. Zhang, J. Cao, X. Shen, and Z. Cui, "EdgeShard: Efficient LLM inference via collaborative edge computing," arXiv preprint arXiv:2405.14371, 2024.
[120] M. Xu, D. Niyato, J. Kang, Z. Xiong, S. Mao, Z. Han, D. I. Kim, and K. B. Letaief, "When large language model agents meet 6G networks: Perception, grounding, and alignment," IEEE Wireless Communications, pp. 1-9, 2024. doi:10.1109/MWC.005.2400019.
[121] C. H. Robinson and L. J. Damschroder, "A pragmatic context assessment tool (pCAT): Using a think aloud method to develop an assessment of contextual barriers to change," Implementation Science Communications, vol. 4, no. 1, p. 3, 2023.
[122] A. Roberts, C. Raffel, and N. Shazeer, "How much knowledge can you pack into the parameters of a language model?," in Proc. EMNLP, pp. 5418-5426, 2020.
[123] M. Kang, S. Lee, J. Baek, K. Kawaguchi, and S. J. Hwang, "Knowledge-augmented reasoning distillation for small language models in knowledge-intensive tasks," Proc. NeurIPS, vol. 36, pp. 1-30, 2024.
[124] M. Wu, A. Waheed, C. Zhang, M. Abdul-Mageed, and A. F. Aji, "LaMini-LM: A diverse herd of distilled models from large-scale instructions," arXiv preprint arXiv:2304.14402, pp. 1-21, 2023.
[125] M. Zhong, C. An, W. Chen, J. Han, and P. He, "Seeking neural nuggets: Knowledge transfer in large language models from a parametric perspective," arXiv preprint arXiv:2310.11451, pp. 1-21, 2023.
[126] D. Jiang, X. Ren, and B. Y. Lin, "LLM-Blender: Ensembling large language models with pairwise ranking and generative fusion," arXiv preprint arXiv:2306.02561, pp. 1-18, 2023.
[127] M. Wortsman, G. Ilharco, S. Y. Gadre, R. Roelofs, R. Gontijo-Lopes, A. S. Morcos, H. Namkoong, A. Farhadi, Y. Carmon, S. Kornblith, et al., "Model soups: Averaging weights of multiple fine-tuned models improves accuracy without increasing inference time," in Proc. ICML, pp. 23965-23998, 2022.
[128] A. Lazaridou, E. Gribovskaya, W. Stokowiec, and N. Grigorev, "Internet-augmented language models through few-shot prompting for open-domain question answering," arXiv preprint arXiv:2203.05115, pp. 1-20, 2022.
[129] Z. Chen, G. Weiss, E. Mitchell, A. Celikyilmaz, and A. Bosselut, "RECKONING: Reasoning through dynamic knowledge encoding," Proc. NeurIPS, vol. 36, pp. 1-22, 2024.
[130] Z. Hu, L. Wang, Y. Lan, W. Xu, E.-P. Lim, L. Bing, X. Xu, S. Poria, and R. K.-W. Lee, "LLM-Adapters: An adapter family for parameter-efficient fine-tuning of large language models," arXiv preprint arXiv:2304.01933, pp. 1-21, 2023.
[131] Y. Qin, J. Zhang, Y. Lin, Z. Liu, P. Li, M. Sun, and J. Zhou, "ELLE: Efficient lifelong pre-training for emerging data," arXiv preprint arXiv:2203.06311, pp. 1-22, 2022.
[132] R. Zhang, Y. Su, B. D. Trisedya, X. Zhao, M. Yang, H. Cheng, and J. Qi, "AutoAlign: Fully automatic and effective knowledge graph alignment enabled by large language models," IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 6, pp. 2357-2371, 2023.
[133] Z.-M. Jiang, J.-J. Bai, K. Lu, and S.-M. Hu, "Context-sensitive and directional concurrency fuzzing for data-race detection," in Proc. NDSS, pp. 1-18, 2022.
[134] P. Verga, H. Sun, L. Baldini Soares, and W. Cohen, "Adaptable and interpretable neural memory over symbolic knowledge," in Proc. NAACL, pp. 3678-3691, 2021.
[135] N. De Cao, G. Izacard, S. Riedel, and F. Petroni, "Autoregressive entity retrieval," arXiv preprint arXiv:2010.00904, pp. 1-20, 2020.
[136] I. Melnyk, P. Dognin, and P. Das, "Grapher: Multi-stage knowledge graph construction using pretrained language models," in NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications, pp. 1-12, 2021.
[137] C. Chen, Y. Wang, B. Li, and K.-Y. Lam, "Knowledge is flat: A seq2seq generative framework for various knowledge graph completion," arXiv preprint arXiv:2209.07299, pp. 1-13, 2022.
[138] Z. Ji, N. Lee, R. Frieske, T. Yu, D. Su, Y. Xu, E. Ishii, Y. J. Bang, A. Madotto, and P. Fung, "Survey of hallucination in natural language generation," ACM Computing Surveys, vol. 55, no. 12, pp. 1-38, 2023.
[139] F. Liu, K. Lin, L. Li, J. Wang, Y. Yacoob, and L. Wang, "Mitigating hallucination in large multi-modal models via robust instruction tuning," in Proc. ICLR, pp. 1-40, 2023.
[140] N. McKenna, T. Li, L. Cheng, M. J. Hosseini, M. Johnson, and M. Steedman, "Sources of hallucination by large language models on inference tasks," in Proc. EMNLP Findings, pp. 2758-2774, 2023.
[141] K. Lee, D. Ippolito, A. Nystrom, C. Zhang, D. Eck, C. Callison-Burch, and N. Carlini, "Deduplicating training data makes language models better," in Proc. ACL, pp. 8424-8445, 2022.
[142] G. Penedo, Q. Malartic, D. Hesslow, R. Cojocaru, H. Alobeidli, A. Cappelli, B. Pannier, E. Almazrouei, and J. Launay, "The RefinedWeb dataset for Falcon LLM: Outperforming curated corpora with web data only," in Proc. NeurIPS, pp. 1-32, 2023.
[143] A. P. Parikh, X. Wang, S. Gehrmann, M. Faruqui, B. Dhingra, D. Yang, and D. Das, "ToTTo: A controlled table-to-text generation dataset," in Proc. EMNLP, pp. 1173-1186, 2020.
[144] N. Lee, W. Ping, P. Xu, M. Patwary, P. Fung, M. Shoeybi, and B. Catanzaro, "Factuality enhanced language models for open-ended text generation," in Proc. NeurIPS, pp. 1-24, 2022.
[145] S. Longpre, K. Perisetla, A. Chen, N. Ramesh, C. DuBois, and S. Singh, "Entity-based knowledge conflicts in question answering," in Proc. EMNLP, pp. 7052-7063, 2021.
[146] E. Perez, S. Ringer, K. Lukosiute, K. Nguyen, E. Chen, et al., "Discovering language model behaviors with model-written evaluations," in Proc. ACL Findings, pp. 13387-13434, 2023.
[147] M. Zhang, O. Press, W. Merrill, A. Liu, and N. A. Smith, "How language model hallucinations can snowball," arXiv preprint arXiv:2305.13534, pp. 1-13, 2023.
[148] N. Mündler, J. He, S. Jenko, and M. Vechev, "Self-contradictory hallucinations of large language models: Evaluation, detection and mitigation," in Proc. ICLR, pp. 1-30, 2024.
[149] K. Tian, E. Mitchell, H. Yao, C. D. Manning, and C. Finn, "Fine-tuning language models for factuality," in Proc. ICLR, pp. 1-16, 2024.
[150] P. Manakul, A. Liusie, and M. J. F. Gales, "SelfCheckGPT: Zero-resource black-box hallucination detection for generative large language models," in Proc. EMNLP, pp. 9004-9017, 2023.
[151] C. Chen, K. Liu, Z. Chen, Y. Gu, Y. Wu, M. Tao, Z. Fu, and J. Ye, "INSIDE: LLMs' internal states retain the power of hallucination detection," in Proc. ICLR, pp. 1-21, 2024.
[152] A. Mallen, A. Asai, V. Zhong, R. Das, D. Khashabi, and H. Hajishirzi, "When not to trust language models: Investigating effectiveness of parametric and non-parametric memories," in Proc. ACL, pp. 9802-9822, 2023.
[153] W. Shi, X. Han, M. Lewis, Y. Tsvetkov, L. Zettlemoyer, and W.-t. Yih, "Trusting your evidence: Hallucinate less with context-aware decoding," in Proc. NAACL, pp. 783-791, 2024.
[154] E. Jones, H. Palangi, C. S. Ribeiro, V. Chandrasekaran, S. Mukherjee, A. Mitra, A. H. Awadallah, and E. Kamar, "Teaching language models to hallucinate less with synthetic tasks," in Proc. ICLR, pp. 1-18, 2024.
[155] L. Gao, Z. Dai, P. Pasupat, A. Chen, A. T. Chaganty, Y. Fan, V. Y. Zhao, N. Lao, H. Lee, D. Juan, and K. Guu, "RARR: Researching and revising what language models say, using language models," in Proc. ACL, pp. 16477-16508, 2023.
[156] Y. Zhou, C. Cui, J. Yoon, L. Zhang, Z. Deng, C. Finn, M. Bansal, and H. Yao, "Analyzing and mitigating object hallucination in large vision-language models," in Proc. ICLR, pp. 1-25, 2024.
[157] C. Rebuffel, M. Roberti, L. Soulier, G. Scoutheeten, R. Cancelliere, and P. Gallinari, "Controlling hallucinations at word level in data-to-text generation," Data Mining and Knowledge Discovery, pp. 1-37, 2022.
[158] Y. Xiao and W. Y. Wang, "On hallucination and predictive uncertainty in conditional language generation," in Proc. EACL, pp. 2734-2744, 2021.
[159] X. Xu, K. Kong, N. Liu, L. Cui, D. Wang, J. Zhang, and M. Kankanhalli, "An LLM can fool itself: A prompt-based adversarial attack," in Proc. ICLR, pp. 1-23, 2024.
[160] F. Shi, X. Chen, K. Misra, N. Scales, D. Dohan, E. H. Chi, N. Schärli, and D. Zhou, "Large language models can be easily distracted by irrelevant context," in Proc. ICML, vol. 202, pp. 31210-31227, 2023.
[161] R. Liu, R. Yang, C. Jia, G. Zhang, D. Yang, and S. Vosoughi, "Training socially aligned language models on simulated social interactions," in Proc. ICLR, pp. 1-24, 2024.
[162] D. Zhu, J. Chen, X. Shen, X. Li, and M. Elhoseiny, "MiniGPT-4: Enhancing vision-language understanding with advanced large language models," arXiv preprint arXiv:2304.10592, pp. 1-15, 2023.
[163] C. Du, Y. Li, Z. Qiu, and C. Xu, "Stable diffusion is unstable," in Proc. NeurIPS, pp. 1-22, 2023.
[164] H. Wang, K. Dong, Z. Zhu, H. Qin, A. Liu, X. Fang, J. Wang, and X. Liu, "Transferable multimodal attack on vision-language pre-training models," in Proc. IEEE SP, pp. 102-102, 2024.
[165] H. Luo, J. Gu, F. Liu, and P. Torr, "An image is worth 1000 lies: Transferability of adversarial images across prompts on vision-language models," in Proc. ICLR, pp. 1-22, 2024.
[166] C. Liang, X. Wu, Y. Hua, J. Zhang, Y. Xue, T. Song, Z. Xue, R. Ma, and H. Guan, "Adversarial example does good: Preventing painting imitation from diffusion models via adversarial examples," in Proc. ICML, vol. 202, pp. 20763-20786, 2023.
[167] Z. Yu, X. Liu, S. Liang, Z. Cameron, C. Xiao, and N. Zhang, "Don't listen to me: Understanding and exploring jailbreak prompts of large language models," in Proc. USENIX, pp. 1-18, 2024.
[168] Y. Yang, B. Hui, H. Yuan, N. Gong, and Y. Cao, "SneakyPrompt: Jailbreaking text-to-image generative models," in Proc. IEEE SP, pp. 123-123, 2024.
[169] X. Shen, Z. Chen, M. Backes, Y. Shen, and Y. Zhang, "'Do anything now': Characterizing and evaluating in-the-wild jailbreak prompts on large language models," in Proc. CCS, pp. 1-22, 2024.
[170] G. Deng, Y. Liu, Y. Li, K. Wang, Y. Zhang, Z. Li, H. Wang, T. Zhang, and Y. Liu, "Jailbreaker: Automated jailbreak across multiple large language model chatbots," in Proc. NDSS, pp. 1-15, 2024.
[171] S. Toyer, O. Watkins, E. A. Mendes, J. Svegliato, L. Bailey, T. Wang, I. Ong, K. Elmaaroufi, P. Abbeel, T. Darrell, A. Ritter, and S. Russell, "Tensor Trust: Interpretable prompt injection attacks from an online game," in Proc. ICLR, pp. 1-34, 2024.
[172] K. Greshake, S. Abdelnabi, S. Mishra, C. Endres, T. Holz, and M. Fritz, "Not what you've signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection," in Proc. AISec, pp. 79-90, 2023.
[173] L.-b. Ning, S. Wang, W. Fan, Q. Li, X. Xu, H. Chen, and F. Huang, "CheatAgent: Attacking LLM-empowered recommender systems via LLM agent," in Proc. ACM KDD, pp. 2284-2295, 2024.
[174] D. Bespalov, S. Bhabesh, Y. Xiang, L. Zhou, and Y. Qi, "Towards building a robust toxicity predictor," in Proc. ACL, pp. 581-598, 2023.
[175] Y. Cheng, L. Jiang, W. Macherey, and J. Eisenstein, "AdvAug: Robust adversarial augmentation for neural machine translation," in Proc. ACL, pp. 5961-5970, 2020.
[176] A. Kumar, C. Agarwal, S. Srinivas, S. Feizi, and H. Lakkaraju, "Certifying LLM safety against adversarial prompting," arXiv preprint arXiv:2309.02705, pp. 1-32, 2023.
[177] A. Helbling, M. Phute, M. Hull, and D. H. Chau, "LLM self defense: By self examination, LLMs know they are being tricked," arXiv preprint arXiv:2308.07308, pp. 1-11, 2023.
[178] Y. Zeng, Y. Wu, X. Zhang, H. Wang, and Q. Wu, "AutoDefense: Multi-agent LLM defense against jailbreak attacks," arXiv preprint arXiv:2403.04783, pp. 1-20, 2024.
[179] L. Shen, Y. Pu, S. Ji, C. Li, X. Zhang, C. Ge, and T. Wang, "Improving the robustness of transformer-based large language models with dynamic attention," in Proc. NDSS, pp. 1-18, 2024.
[180] E. Jones, A. Dragan, A. Raghunathan, and J. Steinhardt, "Automatically auditing large language models via discrete optimization," in Proc. ICML, pp. 15307-15329, 2023.
[181] H. Xu, W. Zhang, Z. Wang, F. Xiao, R. Zheng, Y. Feng, Z. Ba, and K. Ren, "RedAgent: Red teaming large language models with context-aware autonomous language agent," arXiv preprint arXiv:2407.16667, pp. 1-17, 2024.
[182] R. Schuster, C. Song, E. Tromer, and V. Shmatikov, "You autocomplete me: Poisoning vulnerabilities in neural code completion," in Proc. USENIX, pp. 1559-1575, 2021.
[183] A. Wan, E. Wallace, S. Shen, and D. Klein, "Poisoning language models during instruction tuning," in Proc. ICML, pp. 35413-35425, 2023.
[184] S. Zhou, F. F. Xu, H. Zhu, X. Zhou, R. Lo, A. Sridhar, X. Cheng, T. Ou, Y. Bisk, D. Fried, U. Alon, and G. Neubig, "WebArena: A realistic web environment for building autonomous agents," arXiv preprint arXiv:2307.13854, pp. 1-15, 2024.
[185] B. Zhang, Y. Tan, Y. Shen, A. Salem, M. Backes, S. Zannettou, and Y. Zhang, "Breaking agents: Compromising autonomous LLM agents through malfunction amplification," arXiv preprint arXiv:2407.20859, pp. 1-15, 2024.
[186] C.-M. Chan, J. Yu, W. Chen, C. Jiang, X. Liu, W. Shi, Z. Liu, W. Xue, and Y. Guo, "AgentMonitor: A plug-and-play framework for predictive and secure multi-agent systems," arXiv preprint arXiv:2408.14972, pp. 1-29, 2024.
[187] L. Struppek, D. Hintersdorf, and K. Kersting, "Rickrolling the artist: Injecting backdoors into text encoders for text-to-image synthesis," in Proc. ICCV, pp. 4584-4596, 2023.
[188] J. Xu, M. D. Ma, F. Wang, C. Xiao, and M. Chen, "Instructions as backdoors: Backdoor vulnerabilities of instruction tuning for large language models," in Proc. NAACL, pp. 3111-3126, 2024.
[189] Z. Xiang, F. Jiang, Z. Xiong, B. Ramasubramanian, R. Poovendran, and B. Li, "BadChain: Backdoor chain-of-thought prompting for large language models," in Proc. ICLR, pp. 1-28, 2024.
[190] C. Chen and J. Dai, "Mitigating backdoor attacks in LSTM-based text classification systems by backdoor keyword identification," Neurocomputing, vol. 452, pp. 253-262, 2021.
[191] S. Zhao, L. Gan, L. A. Tuan, J. Fu, L. Lyu, M. Jia, and J. Wen, "Defending against weight-poisoning backdoor attacks for parameter-efficient fine-tuning," in Proc. NAACL Findings, pp. 3421-3438, 2024.
[192] C. Xu, J. Wang, F. Guzmán, B. Rubinstein, and T. Cohn, "Mitigating data poisoning in text classification with differential privacy," in Proc. EMNLP Findings, pp. 4348-4356, 2021.
[193] C. Wei, W. Meng, Z. Zhang, M. Chen, M. Zhao, W. Fang, L. Wang, Z. Zhang, and W. Chen, "LMSanitator: Defending prompt-tuning against task-agnostic backdoors," in Proc. NDSS, pp. 1-18, 2024.
[194] B. Wang, Y. Yao, S. Shan, H. Li, B. Viswanath, H. Zheng, and B. Y. Zhao, "Neural Cleanse: Identifying and mitigating backdoor attacks in neural networks," in Proc. IEEE SP, pp. 707-723, 2019.
[195] S. M. Abdullah, A. Cheruvu, S. Kanchi, T. Chung, P. Gao, M. Jadliwala, and B. Viswanath, "An analysis of recent advances in deepfake image detection in an evolving threat landscape," in Proc. IEEE SP, pp. 1-19, 2024.
[196] L. Dugan, A. Hwang, F. Trhlik, J. M. Ludan, A. Zhu, H. Xu, D. Ippolito, and C. Callison-Burch, "RAID: A shared benchmark for robust evaluation of machine-generated text detectors," in Proc. ACL, pp. 12463-12492, 2024.
[197] I. Shumailov, Y. Zhao, D. Bates, N. Papernot, R. Mullins, and R. Anderson, "Sponge examples: Energy-latency attacks on neural networks," in Proc. EuroS&P, pp. 212-231, 2021.
[198] A. Salem, M. Backes, and Y. Zhang, "Get a model! Model hijacking attack against machine learning models," Proc. NDSS, pp. 1-17, 2022.
[199] W. M. Si, M. Backes, Y. Zhang, and A. Salem, "Two-in-one: A model hijacking attack against text generation models," in Proc. USENIX, pp. 2223-2240, 2023.
[200] J. Huang, H. Shao, and K. C.-C. Chang, "Are large pre-trained language models leaking your personal information?," in Proc. EMNLP Findings, pp. 2038-2047, 2022.
[201] Z. Zhang, J. Wen, and M. Huang, "ETHICIST: Targeted training data extraction through loss smoothed soft prompting and calibrated confidence estimation," in Proc. ACL, pp. 12674-12687, 2023.
[202] A. Panda, C. A. Choquette-Choo, Z. Zhang, Y. Yang, and P. Mittal, "Teach LLMs to phish: Stealing private information from language models," in Proc. ICLR, pp. 1-25, 2024.
[203] R. Staab, M. Vero, M. Balunovic, and M. Vechev, "Beyond memorization: Violating privacy via inference with large language models," in Proc. ICLR, pp. 1-47, 2024.
[204] N. Carlini, J. Hayes, M. Nasr, M. Jagielski, V. Sehwag, F. Tramèr, B. Balle, D. Ippolito, and E. Wallace, "Extracting training data from diffusion models," in Proc. USENIX, pp. 5253-5270, 2023.
[205] F. Mireshghallah, K. Goyal, A. Uniyal, T. Berg-Kirkpatrick, and R. Shokri, "Quantifying privacy risks of masked language models using membership inference attacks," in Proc. EMNLP, pp. 8332-8347, 2022.
[206] J. Mattern, F. Mireshghallah, Z. Jin, B. Schoelkopf, M. Sachan, and T. Berg-Kirkpatrick, "Membership inference attacks against language models via neighbourhood comparison," in Proc. ACL Findings, pp. 11330-11343, 2023.
[207] W. Shi, A. Ajith, M. Xia, Y. Huang, D. Liu, T. Blevins, D. Chen, and L. Zettlemoyer, "Detecting pretraining data from large language models," in Proc. ICLR, pp. 1-18, 2024.
[208] F. Kong, J. Duan, R. Ma, H. T. Shen, X. Shi, X. Zhu, and K. Xu, "An efficient membership inference attack for the diffusion model by proximal initialization," in Proc. ICLR, pp. 1-19, 2024.
[209] N. Kandpal, K. Pillutla, A. Oprea, P. Kairouz, C. Choquette-Choo, and Z. Xu, "User inference attacks on large language models," in Proc. NIPS International Workshop on Federated Learning in the Age of Foundation Models, pp. 1-33, 2023.
[210] F. Mireshghallah, A. Uniyal, T. Wang, D. Evans, and T. Berg-Kirkpatrick, "An empirical analysis of memorization in fine-tuned autoregressive language models," in Proc. EMNLP, pp. 1816-1826, 2022.
[211] W. Fu, H. Wang, C. Gao, G. Liu, Y. Li, and T. Jiang, "Practical membership inference attacks against fine-tuned large language models via self-prompt calibration," arXiv preprint arXiv:2311.06062, pp. 1-13, 2023.
[212] X. Pan, M. Zhang, S. Ji, and M. Yang, "Privacy risks of general-purpose language models," in Proc. IEEE SP, pp. 1314-1331, 2020.
[213] L. Wang, J. Wang, J. Wan, L. Long, Z. Yang, and Z. Qin, "Property existence inference against generative models," in Proc. USENIX, pp. 1-18, 2024.
[214] N. Kandpal, E. Wallace, and C. Raffel, "Deduplicating training data mitigates privacy risks in language models," in Proc. ICML, pp. 10697-10707, 2022.
[215] S. Hoory, A. Feder, A. Tendler, S. Erell, A. Peled-Cohen, I. Laish, H. Nakhost, U. Stemmer, A. Benjamini, A. Hassidim, and Y. Matias, "Learning and evaluating a differentially private pre-trained language model," in Proc. EMNLP Findings, pp. 1178-1189, 2021.
[216] S. Kim, S. Yun, H. Lee, M. Gubri, S. Yoon, and S. J. Oh, "ProPILE: Probing privacy leakage in large language models," in Proc. ICLR, pp. 1-18, 2023.
[217] B. Wang and N. Z. Gong, "Stealing hyperparameters in machine learning," in Proc. IEEE SP, pp. 36-52, 2018.
[218] K. Krishna, G. S. Tomar, A. P. Parikh, N. Papernot, and M. Iyyer, "Thieves on Sesame Street! Model extraction of BERT-based APIs," in Proc. ICLR, pp. 1-19, 2020.
[219] A. Naseh, K. Krishna, M. Iyyer, and A. Houmansadr, "Stealing the decoding algorithms of language models," in Proc. CCS, pp. 1835-1849, 2023.
[220] Z. Li, C. Wang, P. Ma, C. Liu, S. Wang, D. Wu, C. Gao, and Y. Liu, "On extracting specialized code abilities from large language models: A feasibility study," in Proc. ICSE, pp. 1-13, 2024.
[221] Y. Jiang, C. Chan, M. Chen, and W. Wang, "Lion: Adversarial distillation of proprietary large language models," in Proc. EMNLP, pp. 3134-3154, 2023.
[222] M. Fredrikson, S. Jha, and T. Ristenpart, "Model inversion attacks that exploit confidence information and basic countermeasures," in Proc. CCS, pp. 1322-1333, 2015.
[223] X. Shen, Y. Qu, M. Backes, and Y. Zhang, "Prompt stealing attacks against text-to-image generation models," in Proc. USENIX, pp. 1-20, 2024.
[224] Z. Sha and Y. Zhang, "Prompt stealing attacks against large language models," arXiv preprint arXiv:2402.12959, pp. 1-16, 2024.
[225] J. Kirchenbauer, J. Geiping, Y. Wen, J. Katz, I. Miers, and T. Goldstein, "A watermark for large language models," in Proc. ICML, pp. 17061-17084, 2023.
[226] C. Mauran, "Samsung bans ChatGPT, AI chatbots after data leak blunder." Accessed on 2024-06-19.
[227] J. Lim, Y. Kim, B. Kim, C. Ahn, J. Shin, E. Yang, and S. Han, "BiasAdv: Bias-adversarial augmentation for model debiasing," in Proc. CVPR, pp. 3832-3841, 2023.
[228] L. Zhu, K. Xu, Z. Ke, and R. W. Lau, "Mitigating intensity bias in shadow detection via feature decomposition and reweighting," in Proc. CVPR, pp. 4682-4691, 2021.
[229] V. Chamola, V. Hassija, A. R. Sulthana, D. Ghosh, D. Dhingra, and B. Sikdar, "A review of trustworthy and explainable artificial intelligence (XAI)," IEEE Access, vol. 11, pp. 78994-79015, 2023.
[230] X. Feng and S. Hu, "Cyber-physical zero trust architecture for industrial cyber-physical systems," IEEE Transactions on Industrial Cyber-Physical Systems, vol. 1, pp. 394-405, 2023.
[231] Y. Lin, Z. Gao, H. Du, D. Niyato, J. Kang, Z. Xiong, and Z. Zheng, "Blockchain-based efficient and trustworthy AIGC services in metaverse," IEEE Transactions on Services Computing, pp. 1-13, 2024. doi:10.1109/TSC.2024.3382958.
