Agent Survey
Abstract—Large Model (LM) agents, powered by large foundation models such as GPT-4 and DALL-E 2, represent a significant step towards achieving Artificial General Intelligence (AGI). LM agents exhibit key characteristics of autonomy, embodiment, and

[Figure: evolution of AI agents towards autonomous, embodied, and connected LM agents and AGI, with milestones including Microsoft Xiaoice (2014), Deepmind AlphaGo (2016), Deepmind AlphaZero (2017), OpenAI ChatGPT (2022), OpenAI GPT-4 (2023), and OpenAI Sora (2024).]
the accuracy of their responses based on retrieved information. Besides, LM agents can flexibly integrate a range of LMs, including Large Language Model (LLM) and Large Vision Model (LVM), to enable multifaceted capabilities.

LM agents are recognized as a significant step towards achieving Artificial General Intelligence (AGI) and have been widely applied across fields such as web search [16], recommendation systems [17], virtual assistants [18], [19], Metaverse gaming [20], robotics [21], autonomous vehicles [22], and Electronic Design Automation (EDA) [23]. As reported by MarketsandMarkets [24], the worldwide market for autonomous AI and autonomous agents was valued at USD 4.8 billion in 2023 and is projected to grow at a CAGR of 43%, reaching USD 28.5 billion by 2028. LM agents have attracted global attention, and leading technology giants including Google, OpenAI, Microsoft, IBM, AWS, Oracle, NVIDIA, and Baidu are venturing into the LM agent industry.

B. Roadmap and Key Characteristics of LM Agents

Fig. 3 illustrates a future vision of LM agents, characterized by three key attributes: autonomous, embodied, and connected, paving the way toward AGI.

1) Autonomous Intelligence. Autonomous intelligence in LM agents refers to their ability to operate independently, making proactive decisions without continuous human input. As depicted in Fig. 2(a), an LM agent can maintain an internal memory that accumulates knowledge over time to guide future decisions and actions, enabling continuous learning and adaptation in dynamic environments [25]. Additionally, LM agents can autonomously utilize a variety of tools (e.g., search engines and external APIs) to gather information or create new tools to handle intricate tasks [26]. By collaborating or competing with humans or other agents, LM agents can effectively enhance their decision-making capabilities [27].

2) Embodied Intelligence. Despite recent advancements, LMs typically passively respond to human commands in the text, image, or multimodal domain, without engaging directly with the physical world [7]. Embodied agents, on the other hand, can actively perceive and act upon their environment, whether digital, robotic, or physical, using sensors and actuators [21], [25]. The shift to LM-empowered agents involves creating embodied AI systems capable of understanding, learning, and solving real-world challenges. As depicted in Fig. 2(b), LM agents actively interact with environments and adapt actions based on real-time feedback. For example, a household robot LM agent tasked with cleaning can generate tailored strategies by analyzing the room layout, surface types, and obstacles, instead of merely following generic instructions.

3) Connected Intelligence. Connected LM agents extend beyond the capabilities of individual agents, playing a critical role in tackling complex, real-world tasks [28]. For example, in autonomous driving, connected autonomous vehicles, serving as LM agents, share real-time sensory data, coordinate movements, and negotiate passage at intersections to optimize traffic flow and enhance safety. As depicted in Fig. 3, by interconnecting numerous LM agents into the Internet of LM agents, connected LM agents can freely share sensory data and task-oriented knowledge. By fully harnessing the computational power of various specialized LMs, this fosters cooperative decision-making and collective intelligence. Thereby, the collaboration across data, computation, and knowledge domains enhances individual
agent performance and adaptability. Additionally, these interactions enable LM agents to form social connections and attributes, contributing to the development of an agent society [29], [30].

C. Motivation for Securing Connected LM Agents

Despite the bright future of LM agents, security and privacy concerns remain significant barriers to their widespread adoption. Throughout the life-cycle of LM agents, numerous vulnerabilities can emerge, ranging from adversarial examples [31], agent poisoning [32], and LM hallucination [33] to pervasive data collection and memorization [34].

1) Security vulnerabilities. LM agents are prone to "hallucinations", where their foundational LMs generate plausible but incorrect outputs not grounded in reality [33]. In multi-agent settings, the hallucination phenomenon can propagate misinformation, compromise decision-making, cause task failures, and pose risks to both physical entities and humans. Moreover, maintaining the integrity and authenticity of sensory data and prompts used in training and inference is crucial, as biased or compromised inputs can lead to inaccurate or unfair outcomes [35]. Attacks such as adversarial manipulations [31], poisoning [36], and backdoors [37] further threaten LM agents by allowing malicious actors to manipulate inputs and deceive the models. In collaborative environments, agent poisoning behaviors [32], where malicious agents disrupt the behavior of others, can undermine collaborative systems. Additionally, integrating LM agents into Cyber-Physical-Social Systems (CPSS) expands the attack surface, enabling adversaries to exploit vulnerabilities within interconnected systems.

2) Privacy breaches. LM agents' extensive data collection and memorization processes raise severe risks of data breaches and unauthorized access. These agents often handle vast amounts of personal and sensitive business information for both To-Customer (ToC) and To-Business (ToB) applications, raising concerns about data storage, processing, sharing, and control [38]. Additionally, LMs can inadvertently memorize sensitive details from their training data, potentially exposing private information during interactions [34]. Privacy risks are further compounded in multi-agent collaborations, where LM agents might inadvertently leak sensitive information about users, other agents, or their internal operations during communication and task execution.

D. Related Surveys and Contributions

Recently, LM agents have garnered significant interest across academia and industry, leading to a variety of research exploring their potential from multiple perspectives. Notable survey papers in this field are as follows. Andreas et al. [29] present a toy experiment for AI agent construction and case studies on modeling communicative intentions, beliefs, and desires. Wang et al. [39] identify key components of LLM-based autonomous agents (i.e., profile, memory, planning, and action) and the subjective and objective evaluation metrics. Besides, they discuss the applications of LLM agents in engineering, natural science, and social science. Xi et al. [9] present a general framework for LLM agents consisting of brain, action, and perception. Besides, they explore applications in single-agent, multi-agent, and human-agent collaborations, as well as agent societies. Zhao et al. [2] offer a systematic review of LLMs in terms of pre-training, adaptation tuning, utilization, and capacity assessment. Besides, background information, mainstream technologies, and critical applications of LLMs are introduced. Xu et al. [40] provide a tutorial on key concepts, architecture, and metrics of edge-cloud AI-Generated Content (AIGC) services in mobile networks, and identify several use cases and implementation challenges. Huang et al. [1] offer a taxonomy of AI agents in virtual/physical environments, discuss cognitive aspects of AI agents, and survey the applications of AI agents in robotics, healthcare, and gaming. Cheng et al. [10] review key components of LLM agents (including planning, memory, action, environment, and rethinking) and their potential applications. Planning types, multi-role relationships, and communication methods in multi-agent systems are also reviewed. Masterman et al. [8] provide an overview of single-agent and multi-agent architectures in industrial projects and present the insights and limitations of existing research. Guo et al. [41] discuss the four components (i.e., interface, profiling, communication, and capabilities acquisition) of LLM-based multi-agent systems and present two lines of applications in terms of problem solving and world simulation. Durante et al. [42] introduce multimodal LM agents and a training framework including learning, action, cognition, memory, and perception. They also discuss the different roles of agents (e.g., embodied, simulation, and knowledge inference), as well as the potentials and experimental results in different applications including gaming, robotics, healthcare, multimodal tasks, and Natural Language Processing (NLP). Hu et al. [20] outline six key components (i.e., perception, thinking, memory, learning, action, and role-playing) of LLM-based game agents and review existing LLM-based game agents in six types of games. Xu et al. [43] provide a comprehensive survey of enabling architectures and challenges for LM agents in gaming. Qu et al. [44] provide a comprehensive survey on integrating mobile edge intelligence (MEI) with LLMs, emphasizing key applications of deploying LLMs at the network edge along with state-of-the-art techniques in edge LLM caching, delivery, training, and inference.

Existing survey works on LM agents mainly focus on the general framework design for single LLM agents and multi-agent systems and their potentials in specific applications. Distinguished from the above-mentioned existing surveys, this survey focuses on the networking aspect of LM agents, including the general architecture, enabling technologies, and collaboration paradigms to construct networked systems of LM agents in physical, virtual, or mixed-reality environments. Moreover, with the advances of LM agents, it is urgent to study their security and privacy in future AI agent systems. This work comprehensively reviews the security and privacy issues of LM agents and discusses the existing and potential defense mechanisms, which are overlooked in existing surveys. Table I compares the contributions of our survey with previous related surveys in the field of LM agents.

In this paper, we present a systematic review of the state of the art in both single and connected LM agents, focusing on security and privacy threats, existing and potential countermeasures, and future trends. Our survey aims to 1) provide a broader understanding of how LM agents work and how they interact in multi-agent scenarios, 2) examine the scope and impact of security and privacy challenges associated with LM agents and their interactions, and 3) highlight effective strategies and
[Figure: the LM agent kernel (LM agent system call interface over large models) compared with the OS kernel (OS system call interface), supporting applications such as travel, coding, robot, and defense agents.]
Fig. 6: General architecture of the Internet of LM agents in bridging the human, physical, and cyber worlds. The LM agent has five constructing modules: planning, action, memory, interaction, and security modules. The engine of connected LM agents is empowered by a combination of five cutting-edge technologies: large foundation models, knowledge-related technologies, interaction, digital twin, and multi-agent collaboration. LM agents interact with humans through Human-Machine Interaction (HMI) technologies such as NLP, with the assistance of handheld devices and wearables, to understand human intentions, desires, and beliefs. An LM agent can synchronize data and statuses between the physical body and the digital brain through digital twin technologies, as well as perceive and act upon the surrounding virtual/real environment. LM agents can be interconnected in cyberspace through efficient cloud-edge networking for efficient data and knowledge sharing to promote multi-agent collaboration.
2) Constructing Modules of LM Agents: According to [1], [8]–[10], there are generally five constructing modules of LM agents: planning, action, memory, interaction, and security modules (details in Sect. II-C). These modules together enable LM agents to perceive, plan, act, learn, and interact efficiently and securely in complex and dynamic environments.
• Empowered by LMs, the planning module produces strategies and action plans with the help of the memory module, enabling informed decision-making [7], [10].
• The action module executes these embodied actions, adapting actions based on real-time environmental feedback to ensure contextually appropriate responses [9], [42].
• The memory module serves as a repository of accumulated knowledge (e.g., past experiences and external knowledge), facilitating continuous learning and improvement [10].
• The interaction module enables effective communication and collaboration with humans, other agents, and the environment.
• The security module is integrated throughout LM agents' operations, ensuring active protection against threats and maintaining the integrity and confidentiality of data and processes.

3) Engine of LM Agents: The engine of LM agents is powered by a combination of cutting-edge technologies including large foundation models, knowledge-related technologies, interaction, digital twin, and multi-agent collaboration (details in Sect. II-D).
• Large foundation models such as GPT-4 and DALL-E 2 serve as the brain of an LM agent, enabling high-level pattern recognition, advanced reasoning, and intelligent decision-making, and providing the cognitive capabilities of LM agents [6], [7].
• Knowledge-related technologies enhance LM agents by incorporating Knowledge Graphs (KGs), knowledge bases, and RAG systems, allowing agents to access, utilize, and manage vast external knowledge sources, ensuring informed and contextually relevant actions [47].
• HMI technologies enable seamless interaction between humans and agents through NLP, multimodal interfaces, and Augmented/Virtual/Mixed Reality (AR/VR/MR), facilitating dynamic and adaptive interactions [48].
• Digital twin technologies allow efficient and seamless synchronization of data and statuses between the physical body and the digital brain of an LM agent through intra-agent communications [49].
• Multi-agent collaboration technologies empower LM agents to work together efficiently, sharing data, resources, and tasks to tackle complex problems by developing cooperation, competition, and coopetition strategies through inter-agent communications [28].
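To make the constructing modules and engine described above more concrete, the following is a minimal, illustrative Python sketch of a single LM agent that wires the five modules around a foundation-model call. All names here (call_llm, Memory, SecurityModule, and so on) are hypothetical placeholders rather than an implementation from any surveyed system.

# Illustrative sketch only: a single LM agent built from the five modules
# described above (planning, memory, action, interaction, security).
from dataclasses import dataclass, field

def call_llm(prompt: str) -> str:
    """Placeholder for any foundation-model API (LLM/LVM)."""
    return f"[LM response to: {prompt[:40]}...]"

@dataclass
class Memory:                      # memory module
    entries: list = field(default_factory=list)
    def remember(self, item: str): self.entries.append(item)
    def recall(self, k: int = 3): return self.entries[-k:]

class SecurityModule:              # security module (a very simplified guardrail)
    BLOCKLIST = ("drop table", "rm -rf")
    def check(self, text: str) -> bool:
        return not any(b in text.lower() for b in self.BLOCKLIST)

class LMAgent:
    def __init__(self):
        self.memory = Memory()
        self.security = SecurityModule()

    def plan(self, task: str) -> str:          # planning module
        context = "; ".join(self.memory.recall())
        return call_llm(f"Plan steps for: {task}. Known context: {context}")

    def act(self, plan: str) -> str:           # action module (tools/actuators)
        return f"executed({plan})"

    def interact(self, message: str) -> str:   # interaction module (human/agent/env)
        if not self.security.check(message):   # security screens every input
            return "Request rejected by security module."
        plan = self.plan(message)
        result = self.act(plan)
        self.memory.remember(f"{message} -> {result}")
        return result

agent = LMAgent()
print(agent.interact("Tidy the living room"))

The sketch only illustrates how the modules hand data to one another; real systems would back each module with dedicated models, tools, and policies.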
4) Communication Mode of LM Agents: Every LM agent consists of two parts: (i) the LM-empowered brain located in the cloud, edge servers, or end devices and (ii) the corresponding physical body such as an autonomous vehicle. Every LM agent can actively interact with other LM agents, the virtual/real environment, and humans. For connected LM agents, there exist

[Figure: taxonomy of planning approaches, including feedback-free planning, feedback-enhanced planning, multi-persona self-planning, and grounded planning.]
TABLE III: A Summary of Intra-Agent and Inter-Agent Communications for Connected LM Agents

                                  | Intra-agent Comm.        | Inter-agent Comm.
Involved Entity                   | Brain←→Physical Body     | Brain←→Brain
Connection Type                   | Within a single LM agent | Among multiple LM agents
Support Two-way Communication     | ✔                        | ✔
Support Multimodal Interaction    | ✔                        | ✔
Support Semantic Communication    | ✔                        | ✔
Typical Communication Environment | Wireless                 | Wired
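As a rough illustration of the two communication modes summarized in Table III, the sketch below models intra-agent messages (between an agent's digital brain and its physical body) and inter-agent messages (between the brains of two agents) as simple data structures. The message fields and the semantic_summary helper are hypothetical, chosen only to mirror the two-way, multimodal, and semantic properties listed in the table.

# Illustrative sketch of the two communication modes in Table III.
from dataclasses import dataclass
from typing import Literal

@dataclass
class AgentMessage:
    scope: Literal["intra-agent", "inter-agent"]   # brain<->body vs. brain<->brain
    sender: str
    receiver: str
    modality: Literal["text", "image", "audio", "sensor"]
    payload: bytes
    semantics: str          # compact task-level meaning (semantic communication)

def semantic_summary(raw_sensor_reading: bytes) -> str:
    # In practice an LM or a learned encoder would distill raw data into
    # task-relevant semantics; here it is stubbed out.
    return f"obstacle-ahead ({len(raw_sensor_reading)} bytes of raw data omitted)"

# Intra-agent: the physical body reports a perception to its digital brain.
uplink = AgentMessage("intra-agent", "body:lidar", "brain", "sensor",
                      b"\x01\x02\x03", semantic_summary(b"\x01\x02\x03"))
# Inter-agent: one agent's brain shares the distilled knowledge with a peer.
share = AgentMessage("inter-agent", "vehicle-A:brain", "vehicle-B:brain",
                     "text", uplink.semantics.encode(), uplink.semantics)
print(uplink.scope, "->", share.scope, ":", share.semantics)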
manageable sub-tasks [7], [14]. For example, CoT [11] is a popular sequential reasoning approach where each thought builds directly on the previous one. It represents step-by-step logical thinking and can enhance the generation of coherent and contextually relevant responses. ToT [12] organizes reasoning as a tree-like structure, exploring multiple paths simultaneously. In ToT, each node represents a partial solution, allowing the model to branch and backtrack to find the optimal answer. Graph of Thought (GoT) [52] models reasoning using an arbitrary graph structure, allowing more flexible information flow. GoT captures complex relationships between thoughts, enhancing the model's problem-solving capabilities. AVIS [53] further refines the tree search process for visual QA tasks using a human-defined transition graph and enhances decision-making through a dynamic prompt manager.
• Feedback-enhanced planning: To make effective long-term planning in complex tasks, it is necessary to iteratively reflect on and refine execution plans based on past actions and observations [39]. The goal is to correct past errors and improve final outcomes. For example, ReAct [54] combines reasoning and acting by prompting LLMs to generate reasoning traces and actions simultaneously. This dual approach allows the LLM to create, monitor, and adjust action plans, while task-specific actions enhance interaction with external sources, improving response accuracy and reliability. Reflexion [55] converts environmental feedback into self-reflection and enhances ReAct by enabling LLM agents to learn from past errors and iteratively optimize behaviors. Reflexion features an actor that produces actions and text via models (e.g., CoT and ReAct) enhanced by memory, an evaluator that scores outputs using task-specific reward functions, and self-reflection that generates verbal feedback to improve the actor.
• Multi-persona self-planning: Inspired by pretend play, Wang et al. [56] develop a cognitive synergist that enables a single LLM to split into multiple personas, facilitating self-collaboration for solving complex tasks. They propose Solo Performance Prompting (SPP), where the LLM identifies, simulates, and collaborates with diverse personas, such as domain experts or target audiences, without external retrieval systems. SPP enhances problem-solving by allowing the LLM to perform multi-turn self-revision and obtain feedback from various perspectives.
• Grounded planning: Executing plans in real-world environments (e.g., Minecraft) requires precise, multi-step reasoning. VOYAGER [50], the first LLM-powered agent in Minecraft, utilizes in-context lifelong learning to adapt and generalize skills to new tasks and worlds. VOYAGER includes an automatic curriculum for exploration, a skill library of executable code for complex behaviors, and an iterative prompting mechanism that refines programs based on feedback. Wang et al. [57] further propose an interactive describe-explain-plan-select (DEPS) planning approach that improves LLM-generated plans by integrating execution descriptions, self-explanations, and a goal selector that ranks sub-goals to refine planning. Additionally, Song et al. [7] present a grounded re-planning algorithm which dynamically updates high-level plans during task execution based on environmental perceptions, triggering re-planning when actions fail or after a specified time.

2) Memory Module: The memory module is integral to an LM agent's ability to learn and adapt over time [39]. It maintains an internal memory that accumulates knowledge from past interactions, thoughts, actions, observations, and experiences with users, other agents, and the environments. The stored information guides future decisions and actions, allowing the agent to continuously refine its knowledge and skills. This module ensures that the agent can remember and apply past lessons to new situations, thereby improving its long-term performance and adaptability [10]. There are various memory formats such as natural language, embedded vectors, databases, and structured lists. Additionally, RAG technologies [15] are employed to access external knowledge sources, further enhancing the accuracy and relevance of the LM agent's planning capabilities. In the literature [10], [39], memory can be divided into the following three types.
• Short-term memory focuses on the contextual information of the current situation. It is temporary and limited, typically managed through a context window that restricts the amount of information the LM agent can learn at a time.
• Long-term memory stores the LM agent's historical behaviors and thoughts. This is achieved through external vector storage, which allows for quick retrieval of important information, ensuring that the agent can access relevant past experiences to inform current decisions [58].
• Hybrid memory combines short-term and long-term memory to enhance an agent's understanding of the current context and leverage past experiences for better long-term reasoning. Liu et al. [59] propose the RAISE architecture to enhance ReAct for conversational AI agents by integrating a dual-component memory system, where a Scratchpad captures recent interactions as short-term memory, while the retrieval module acts as long-term memory to access relevant examples. HIAGENT [60] employs cross-trial and in-trial memory, where cross-trial memory stores historical trajectories and in-trial memory captures current trials. Instead of retaining all action-observation pairs, HIAGENT uses subgoals as memory chunks to save memory, each containing summarized observations. The LLM generates subgoals, executes actions to achieve them, and updates the working memory by summarizing and replacing completed subgoals with relevant information.
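A minimal sketch of the hybrid memory idea above, namely a bounded short-term context window backed by a long-term store queried by similarity, with completed subgoals collapsed into summaries, is shown below. The class and method names are illustrative and do not correspond to RAISE or HIAGENT code.

# Illustrative hybrid memory: short-term window + long-term store + subgoal summaries.
from collections import deque

class HybridMemory:
    def __init__(self, window: int = 5):
        self.short_term = deque(maxlen=window)   # bounded context window
        self.long_term = []                      # stand-in for a vector database

    def observe(self, event: str) -> None:
        self.short_term.append(event)
        self.long_term.append(event)             # real systems would embed and index

    def complete_subgoal(self, subgoal: str, observations: list[str]) -> None:
        # Replace raw action-observation pairs with one compact summary.
        summary = f"[done] {subgoal}: " + "; ".join(o[:30] for o in observations)
        self.short_term.append(summary)

    def recall(self, query: str, k: int = 2) -> list[str]:
        # Toy similarity: count shared words (a vector store would use embeddings).
        def score(entry: str) -> int:
            return len(set(query.lower().split()) & set(entry.lower().split()))
        return sorted(self.long_term, key=score, reverse=True)[:k]

mem = HybridMemory()
mem.observe("user asked to book a table for two at 7pm")
mem.complete_subgoal("find nearby restaurants", ["found 3 options", "ranked by rating"])
print(list(mem.short_term))
print(mem.recall("book a table"))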
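Similarly, the feedback-enhanced planning loop surveyed earlier (ReAct-style interleaving of reasoning traces, actions, and observations, optionally followed by Reflexion-style self-reflection) can be sketched in a few lines. The llm, run_tool, and reflect functions below are placeholders, not the published implementations.

# Illustrative reason-act-observe loop in the spirit of ReAct/Reflexion.
def llm(prompt: str) -> str:
    # Placeholder for a real LLM call; returns a canned "thought || action".
    return "I should look this up || search('LM agent security')"

def run_tool(action: str) -> str:
    # Placeholder tool executor (e.g., a search engine or external API).
    return f"observation for {action}"

def react_episode(task: str, max_steps: int = 3) -> list[str]:
    trajectory = [f"task: {task}"]
    for _ in range(max_steps):
        thought, action = llm("\n".join(trajectory)).split("||")
        observation = run_tool(action.strip())
        trajectory += [f"thought: {thought.strip()}",
                       f"action: {action.strip()}",
                       f"observation: {observation}"]
        if "final answer" in action.lower():
            break
    return trajectory

def reflect(trajectory: list[str]) -> str:
    # Reflexion-style verbal feedback that would be stored for the next attempt.
    return llm("What went wrong and how to improve?\n" + "\n".join(trajectory))

episode = react_episode("summarize privacy risks of LM agents")
print("\n".join(episode))
print("reflection:", reflect(episode))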
3) Action Module: The action module equips the LM agent with the ability to execute and adapt actions in various environments [9], [42]. This module is designed to handle embodied actions and tool-use capabilities, enabling the agent to interact with its physical surroundings adaptively and effectively. Besides, tools significantly broaden the action space of the agent.
• Embodied actions. The action module empowers LM agents to perform contextually appropriate embodied actions and adapt to environmental changes, facilitating interaction with and adjustment to physical surroundings [21], [25]. As LLM-generated action plans are often not directly executable in interactive environments, Huang et al. [25] propose refining LLM-generated plans for embodied agents by conditioning on demonstrations and semantically translating them into admissible actions. Evaluations in the VirtualHome environment show significant improvements in executability, ranging from 18% to 79% over the baseline LLM. Besides, SayCan [21] enables embodied agents such as robots to follow high-level instructions by leveraging LLM knowledge in physically-grounded tasks, where the LLM (i.e., Say) suggests useful actions, while learned affordance functions (i.e., Can) assess feasibility. SayCan's effectiveness is demonstrated through 101 zero-shot real-world robotic tasks in a kitchen setting. PaLM-E [61] is a versatile multimodal language model for embodied reasoning, visual-language, and language tasks. It integrates continuous sensor inputs, e.g., images and state estimates, into the same embedding space as language tokens, allowing for grounded inferences in real-world sequential decision-making.
• Learning to use & make tools. By leveraging various tools (e.g., search engines and external APIs) [62], LM agents can gather valuable information to handle assigned complex tasks. For example, AutoGPT integrates LLMs with predetermined tools such as web and file browsing. InteRecAgent [63] integrates LLMs as the brain and recommender models as tools, using querying, retrieval, and ranking tools to handle complex user inquiries. Beyond using existing tools, LM agents can also develop new tools to enhance task efficiency [9]. To optimize tool selection with a large toolset, ReInvoke [64] introduces an unsupervised tool retrieval method featuring a query generator to enrich tool documents in offline indexing and an intent extractor to identify tool-related intents from user queries in online inference, followed by a multi-view similarity ranking strategy to identify the most relevant tools.

4) Interaction Module: The interaction module enables the LM agent to interact with humans, other agents, and the environment [41]. Through these varied interactions, the agent can gather diverse experiences and knowledge, which are essential for comprehensive understanding and adaptation.
• Agent-Agent Interactions. The interaction module allows LM agents to communicate and collaborate with other agents, fostering a cooperative network where information and resources are shared [62]. This interaction can include coordinating efforts on shared tasks, exchanging knowledge to solve problems, and negotiating roles in multi-agent scenarios.
• Agent-Human Interactions. LM agents can interact with humans, including understanding and responding to natural language commands, recognizing and interpreting human emotions and expressions, and providing assistance in various tasks [20]. As observed, LLMs such as GPT-4 often tend to forget character settings in multi-turn dialogues and struggle with detailed role assignments due to context window limits. To address this, a tree-structured persona model is introduced in [65] for character assignment, detection, and maintenance, enhancing agent interactions.
• Agent-Environment Interactions. LM agents can engage directly with physical or virtual environments. By facilitating engagement in physical, virtual, or mixed-reality environments [1], [21], the interaction module ensures that LM agents can operate effectively across different contexts. Lai et al. develop the AutoWebGLM agent [66], which excels in web browsing tasks through curriculum learning, self-sampling reinforcement learning, and rejection sampling fine-tuning. A Chrome extension based on AutoWebGLM validates its effective reasoning and operation capability across various websites in real-world services.

5) Security Module: The security module is crucial to ensure the secure, safe, ethical, and privacy-preserving operations of LM agents [42]. It is designed to monitor and regulate the LM agent's actions, interactions, and decisions to prevent harm and ensure compliance with legal and ethical standards. This module employs technologies such as hallucination mitigation, anomaly detection, and access control to identify and mitigate potential security/privacy threats. It also incorporates ethical guidelines and bias mitigation techniques to ensure fair and responsible behaviors. The security module can dynamically adapt to emerging threats by learning from new security/privacy incidents and integrating updates from security/privacy databases and policies.

Connections Between Modules: The key components of an LM agent are interconnected to create a cohesive and intelligent system. Particularly, the planning module relies on the memory module to access past experiences and external knowledge, ensuring informed decision-making. The action module executes plans generated by the planning module, adapting actions based on real-time feedback and memory. The interaction module enhances these processes by facilitating communication and collaboration, which provides additional data and context for the planning and memory modules. Besides, security considerations are seamlessly integrated into every aspect of the LM agent's operations to ensure robust and trustworthy performance.
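As a concrete illustration of the tool-use capability in the action module above, the sketch below registers a few tools with short natural-language descriptions and retrieves the most relevant one for a query by keyword overlap. The registry, the tools, and the scoring are hypothetical simplifications; a system like ReInvoke would instead use generated queries and embedding-based multi-view ranking.

# Illustrative tool registry with naive description-based retrieval and dispatch.
from typing import Callable

TOOLS: dict[str, tuple[str, Callable[[str], str]]] = {
    "web_search": ("search the web for up-to-date information",
                   lambda q: f"top results for '{q}'"),
    "calculator": ("evaluate arithmetic expressions",
                   lambda q: str(eval(q, {"__builtins__": {}}, {}))),
    "file_reader": ("read the contents of a local text file",
                    lambda q: f"(contents of {q})"),
}

def select_tool(query: str) -> str:
    # Toy relevance score: shared words between the query and each description.
    words = set(query.lower().split())
    return max(TOOLS, key=lambda name: len(words & set(TOOLS[name][0].split())))

def use_tool(query: str, tool_arg: str) -> str:
    name = select_tool(query)
    _, fn = TOOLS[name]
    return f"{name}: {fn(tool_arg)}"

print(use_tool("search the web for LM agent security surveys", "LM agent security"))
print(use_tool("evaluate arithmetic expressions", "6*7"))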
D. Enabling Technologies of LM Agents

As illustrated in Fig. 8, there are five enabling technologies underlying the engine of connected LM agents.

1) Large Foundation Model Technologies: LMs, such as LLMs and LVMs, serve as the core brains or controllers, providing advanced capabilities for AI agents across diverse applications [6], [57]. Table IV summarizes the basic training stages of LLMs. (i) Multimodal capability: By employing multimodal perception (e.g., CLIP [69]) and tool utilization strategies, LM agents can perceive and process various data types from virtual and real

This ensures an accurate and up-to-date reflection of their real-world counterparts. (i) Virtual-physical synchronization: Digital twin technologies empower LM agents by enabling seamless and efficient synchronization of attributes, behaviors, states, and other data between their physical bodies and digital brains. This synchronization is achieved through intra-agent bidirectional communications, where the physical body continuously transmits real-time data to the digital twin for processing and analysis, while the digital twin sends back instructions and optimizations [49]. (ii) Virtual-physical feedback: This continuous feedback loop enhances the LM agent's contextual awareness, allowing for immediate adjustments and optimizations in response to changing conditions [23]. For example, an LM agent operating machinery can use its digital twin to anticipate mechanical wear and proactively schedule maintenance, thereby minimizing downtime and enhancing efficiency. (iii) Predictive analytics: Digital twins facilitate predictive analytics and simulation, enabling LM agents to anticipate future states and optimize their actions accordingly [22]. This capability is crucial in complex environments where unforeseen changes can significantly impact performance. Overall, digital twin technologies ensure that LM agents operate with high accuracy, adaptability, and responsiveness across diverse applications, effectively bridging the gap between the physical and digital realms.

5) Multi-Agent Collaboration Technologies: Multi-agent collaboration technologies [41] enable coordinated efforts of multiple LM agents, allowing them to work together synergistically to achieve common goals and tackle complex tasks that would be challenging for individual agents to handle alone. (i) Data cooperation: It facilitates real-time and seamless information sharing through inter-agent communications, enabling LM agents to continuously synchronize their understanding of dynamic environments [81], [82]. (ii) Knowledge cooperation: By leveraging knowledge representation frameworks such as KGs [47], [83] and vector databases [84], LM agents can share and aggregate domain-specific insights, enhancing collective learning and decision-making. This shared knowledge base allows LM agents to build upon each other's experiences to accelerate the learning process [71]. (iii) Computation cooperation: On the one hand, collaborative problem-solving techniques, such as multi-agent planning [85] and distributed reasoning [86], empower LM agents to jointly analyze complex issues, devise solutions, and execute coordinated actions. On the other hand, dynamic resource allocation mechanisms [87], [88], including market-based and game-theoretical approaches, enable LM agents to negotiate and allocate resources dynamically, thereby optimizing resource utilization and ensuring effective task execution across multiple agents. This synergy is particularly beneficial in dynamic environments, which not only improves the operational capabilities of individual LM agents but also enhances the overall functionality of the system of connected LM agents.

E. Modern Prototypes & Applications of LM Agents

Recently, various industrial projects of LM agents, such as AutoGPT, AutoGen, BabyAGI, ChatDev, and MetaGPT, demonstrate their diverse potential in assisting web, life, and business scenarios, such as planning personalized travels and automating creative flows. For example, AutoGPT (https://autogpt.net/) is an open-source autonomous agent utilizing GPT-3.5 or GPT-4 APIs to independently execute complex tasks by breaking them down into several sub-tasks and chaining LLM outputs, showcasing advanced reasoning capabilities [14]. AutoGen (https://microsoft.github.io/autogen/), developed by Microsoft, offers an open-source multi-agent conversation framework, supports APIs as tools for improved LLM inference, and emphasizes the automatic generation and fine-tuning of AI models [89]. BabyAGI (https://github.com/yoheinakajima/babyagi) integrates task management via OpenAI platforms and vector databases, simulating a simplified AGI by autonomously creating and executing tasks based on high-level objectives. ChatDev (https://github.com/OpenBMB/ChatDev) focuses on enhancing conversational AI, providing sophisticated dialogue management, coding, debugging, and project management to streamline software development processes [90]. MetaGPT (https://www.deepwisdom.ai/) explores the meta-learning paradigm, where the model is trained to rapidly adapt to new tasks by leveraging knowledge from related tasks, thus improving efficiency and performance across diverse applications [91].

1) Mobile Communications: LM agents offer significant advantages for mobile communications by enabling low-cost and context-aware decision-making [92], personalized user experiences [87], and automatic optimization problem formulation for wireless resource allocation [93]. For example, NetLLM [92] fine-tunes the LLM to acquire domain knowledge from multimodal data in networking scenarios (e.g., adaptive bitrate streaming, viewport prediction, and cluster job scheduling) with reduced handcraft costs. Meanwhile, NetGPT [87] designs a cloud-edge cooperative LM framework for personalized outputs and enhanced prompt responses in mobile communications via de-duplication and prompt enhancement technologies. ChatNet [94] uses four GPT-4 models to serve as analyzer (to plan network capacity and designate tools), planner (to decouple network tasks), calculator (to compute and optimize the cost), and executor (to produce customized network capacity solutions) via prompt engineering.

LM agents can also help enhance the QoE of end users. For example, MobileAgent v2 [18], launched by Alibaba, is a mobile device operation assistant that achieves effective navigation through multi-agent collaboration, automatically performing tasks such as application installation and map navigation, and supports multimodal input including visual perception, enhancing operational efficiency on mobile devices. AppAgent [19], developed by Tencent, performs various tasks on mobile phones through autonomous learning and imitating human click and swipe gestures, including posting on social media, helping users write and send emails, using maps, online shopping, and even complex image editing.

2) Intelligent Robots: LM agents play a crucial role in advancing intelligent industrial and service robots [21]. These robots can perform complex tasks such as product assembly, environmental cleaning, and customer service, by perceiving surroundings and learning necessary skills through deep learning models. In August 2024, FigureAI released Figure 02, a human-like robot powered by an OpenAI LM, capable of fast common-sense visual reasoning and speech-to-speech conversation with humans to handle dangerous jobs in various environments.
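Many of the prototypes above (e.g., AutoGPT and BabyAGI) share a simple control pattern: ask an LM to decompose a high-level objective into sub-tasks and then chain LM calls over the resulting queue. A minimal, hypothetical sketch of that pattern is given below, with call_llm standing in for any LLM API; it is not code from those projects.

# Illustrative objective -> sub-task decomposition and chaining, in the spirit of
# AutoGPT/BabyAGI-style autonomous task loops.
from collections import deque

def call_llm(prompt: str) -> str:
    # Placeholder for a real LLM API call.
    if prompt.startswith("Decompose"):
        return "research the topic; draft an outline; write the summary"
    return f"result({prompt[:40]})"

def run_objective(objective: str) -> dict[str, str]:
    plan = call_llm(f"Decompose the objective into sub-tasks: {objective}")
    queue = deque(task.strip() for task in plan.split(";"))
    results: dict[str, str] = {}
    context = ""
    while queue:
        task = queue.popleft()
        # Each sub-task sees the accumulated context from earlier sub-tasks.
        results[task] = call_llm(f"{task}. Context so far: {context}")
        context += f" {task}: {results[task]}."
    return results

for task, output in run_objective("write a summary of LM agent security").items():
    print(f"- {task} -> {output}")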
Besides, in July 2024, Tesla unveiled its second-generation humanoid robot named Optimus (https://viso.ai/edge-ai/tesla-bot-optimus/), which demonstrates the enhanced capabilities brought by advanced LM agents.

3) Autonomous Driving: LM agents are transforming autonomous driving by enhancing vehicle intelligence, improving safety, and optimizing the driving experience (e.g., offering personalized in-car experiences) [22]. In May 2024, Baidu's Apollo Day saw the launch of Carrot Run's sixth-generation unmanned vehicle, which is built upon the Apollo Autonomous Driving Foundation Model (ADFM, https://autonews.gasgoo.com/icv/70033042.html), the first LM agent supporting L4 autonomous driving. Companies such as Tesla, Waymo, and Cruise are also leveraging LM agents to refine their autonomous driving systems, aiming for safer and more efficient transportation solutions.

4) Autonomous Attack-Defense Confrontation: LM agents can be seen as autonomous and intelligent cybersecurity decision-makers capable of making security decisions and taking threat handling actions without human intervention. For example, PentestGPT [95] is an automated penetration testing tool supported by LLMs, designed to use GPT-4 for automated network vulnerability scanning and exploitation. AutoAttacker [96], an LM tool, can autonomously generate and execute network attacks based on predefined attack steps. As reported by recent research [97], LM agents can automatically exploit one-day vulnerabilities; in tests on 15 real-world vulnerability datasets, GPT-4 successfully exploited 87% of vulnerabilities, significantly outperforming other tools.

III. NETWORKING LARGE MODEL AGENTS: PARADIGMS

A. Overview of Interactions of Connected LM Agents

For connected LM agents, multi-agent interactions refer to the dynamic and complex interactions between multiple autonomous LM agents that operate within a shared environment. As depicted in Fig. 9, these interactions can be categorized into cooperation, partial cooperation, and competition [62], [98], each of which involves different strategies to jointly optimize the collective or individual outcomes.

Fig. 9: Illustration of interaction types of LM agents, i.e., fully cooperative, partially cooperative, and fully competitive.

1) Competition: Competition involves LM agents pursuing their individual objectives, often at the expense of others. This interaction mode is characterized by non-cooperative strategies where agents aim to maximize their own benefits. Non-cooperative games and multi-agent debate have been widely adopted to model the competitive behaviors among LM agents. In non-cooperative games, LM agents engage in strategic decision-making where each LM agent's goal is to search for the Nash equilibrium. LM agents in multi-agent debate [99] engage in structured arguments or debates to defend their positions and challenge the strategies of others.

The cognitive behavior of LLMs, such as self-reflection, has proven effective in solving NLP tasks but can lead to thought degeneration due to biases, rigidity, and limited feedback. Multi-agent debate (MAD) [99] explores interactions among LLMs, where agents engage in a dynamic tit-for-tat, allowing them to correct each other's biases, overcome resistance to change, and provide mutual feedback. In MAD, diverse role prompts (i.e., personas) are crucial, and there are generally three communication strategies: (a) one-by-one debate, (b) simultaneous-talk, and (c) simultaneous-talk-with-summarizer.

2) Partial Cooperation: Partial cooperation occurs when LM agents collaborate to a limited extent, often driven by overlapping but not fully aligned interests [62]. In such scenarios, agents might share certain resources or information while retaining autonomy over other aspects. This interaction mode balances the benefits of cooperation with the need for individual agency and competitive advantage. Partial cooperation can be strategically advantageous in environments where complete cooperation is impractical or undesirable due to conflicting goals or resource constraints. Hierarchical games and coalition formation theory can be employed to model both cooperative and competitive
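To illustrate the multi-agent debate strategies listed above, the sketch below runs a toy one-by-one debate in which each persona sees the running transcript before answering, followed by an optional summarizer. The debate_llm function and the personas are placeholders rather than the MAD implementation.

# Illustrative multi-agent debate: one-by-one debate with an optional summarizer.
def debate_llm(persona: str, question: str, transcript: list[str]) -> str:
    # Placeholder for an LLM call conditioned on a role prompt (persona).
    return f"{persona}'s argument on '{question}' given {len(transcript)} prior turns"

def one_by_one_debate(question: str, personas: list[str], rounds: int = 2) -> list[str]:
    transcript: list[str] = []
    for _ in range(rounds):
        for persona in personas:          # each agent sees all earlier turns
            transcript.append(debate_llm(persona, question, transcript))
    return transcript

def summarize(question: str, transcript: list[str]) -> str:
    # A summarizer persona aggregates the debate into a final answer
    # (simultaneous-talk variants would instead collect turns per round).
    return debate_llm("summarizer", question, transcript)

turns = one_by_one_debate("Should agents share raw sensor data?",
                          ["optimist", "skeptic"])
print("\n".join(turns))
print("final:", summarize("Should agents share raw sensor data?", turns))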
Fig. 10: Illustration of the cooperation modes of connected LM agents including data cooperation, computation cooperation, and knowledge cooperation.
tion interfaces, such as documents and diagrams, instead of dialogue. Using a publish-subscribe mechanism, agents can freely exchange messages via a shared message pool, publishing their outputs and accessing others' transparently. Park et al. [103] create a community of 25 generative agents in a sandbox world named Smallville, where agents are represented by sprite avatars. These agents perform daily tasks, form opinions, and interact, mimicking human-like behaviors.
• Theory of mind. It refers to the ability to understand others' hidden mental states, which is essential for social interactions. As LLMs engage more in human interactions, enhancing their social intelligence is vital. Li et al. [105] identify limitations in LLM collaboration and propose a prompt-engineering method to incorporate explicit belief state representations. They also introduce a novel evaluation of LLMs' high-order theory of mind in teamwork scenarios, emphasizing dynamic belief state evolution and intent communication among agents.

In the following, we discuss the detailed cooperation paradigms of LM agents from the perspectives of data cooperation, computation cooperation, and knowledge cooperation.

B. Data Cooperation for LM Agents

The data cooperation among LM agents involves the modality perspective, the temporal perspective, and the spatial perspective. Effective data cooperation ensures that LM agents can seamlessly integrate and utilize data from diverse sources and modalities, enhancing their capabilities and performance in various applications.

1) Multimodal Data Cooperation: Multimodal data cooperation emphasizes the fusion of data from various modalities, such as text, images, audio, and video, to provide a comprehensive understanding of the environment. This cooperation allows LM agents to process and interpret information from multiple sources, leading to more accurate and robust decision-making.
• Data Fusion: By combining data from different modalities, it helps create a unified representation that leverages the strength of each type of data. For example, Wu et al. [106] propose a multi-agent collaborative vehicle detection network named MuDet, which integrates RGB and height map modalities for improved object identification in dense and occluded environments such as post-disaster sites. Gross et al. [81] discuss the use of multimodal data to model communication in artificial social agents, emphasizing the importance of verbal and nonverbal cues for natural human-robot interaction.
• Cross-Modal Retrieval: By enabling LM agents to retrieve relevant information across different modalities, it enhances their ability to respond to complex queries and scenarios. For example, Gur et al. [107] design an alignment model and retrieval-augmented multi-modal transformers for efficient image-caption retrieval in visual Question-Answering (QA) tasks. By considering the intra-modality similarities in multi-modal video representations, Zolfaghari et al. [72] introduce a contrastive loss in the contrastive learning process for enhanced cross-modal embedding. The effectiveness is validated using the LSMDC and YouCook2 datasets for video-text retrieval and video captioning tasks. To further enable fine-grained cross-modal retrieval, Chen et al. [108] develop a novel attention mechanism to effectively integrate feature information from different modalities and represent them within a unified space, thereby overcoming the semantic gap between multiple data modalities.
• Contextual Understanding: By utilizing multimodal data, it facilitates a richer contextual understanding of the environment with improved accuracy of predictions and actions. For example, Li et al. [78] design a general semantic communication-based multi-agent collaboration framework with enhanced contextual understanding and study a use case in search and rescue tasks.

2) Spatio-temporal Data Cooperation: Spatio-temporal data cooperation for LM agents involves the integration and synchronization of spatial and temporal data across various modalities and sources, enabling LM agents to achieve a comprehensive and dynamic understanding of the environment over time. This cooperation ensures that LM agents can effectively analyze patterns, predict future states, and make informed decisions in real time, based on both the spatial distribution and temporal evolution of data.
• Spatio-temporal Feature Fusion: Yang et al. [109] introduce SCOPE, a collaborative perception mechanism that enhances spatio-temporal awareness among on-road agents through end-to-end aggregation. SCOPE excels by leveraging temporal semantic cues, integrating spatial information from diverse agents, and adaptively fusing multi-source representations to improve accuracy and robustness. However, [109] mainly works for small-scale scenarios. By capturing both the spatial and temporal heterogeneity of citywide traffic, Ji et al. [110] propose a novel spatio-temporal self-supervised learning framework for traffic prediction that improves the representation of traffic patterns. This framework uses an integrated module combining temporal and spatial convolutions and employs adaptive augmentation of traffic graph data, supported by two auxiliary self-supervised learning tasks to improve prediction accuracy. To further address data noise, missing information, and distribution heterogeneity in spatio-temporal data, which are overlooked in [109], [110], Zhang et al. [111] devise an automated spatio-temporal graph contrastive learning framework named AutoST. Built on a heterogeneous Graph Neural Network (GNN), AutoST captures multi-view region dependencies and enhances robustness through a spatio-temporal augmentation scheme with a parameterized contrastive view generator.
• Temporal Dynamics with Topological Reasoning: Chen et al. [82] propose a temporally dynamic multi-agent collaboration framework that organizes agents using directed acyclic graphs to facilitate interactive reasoning, demonstrating superior performance across various network topologies and enabling collaboration among thousands of agents. A key finding in [82] is the discovery of the collaborative scaling law, where solution quality improves in a logistic growth pattern as more agents are added, with collaborative emergence occurring sooner than neural emergence.
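The publish-subscribe message pool mentioned at the start of this subsection, in which agents publish structured outputs to a shared pool and read others' messages by topic instead of holding direct dialogues, can be sketched as follows; the MessagePool class and the topic names are hypothetical.

# Illustrative publish-subscribe message pool for agent-agent data cooperation.
from collections import defaultdict

class MessagePool:
    def __init__(self):
        self._messages: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def publish(self, topic: str, sender: str, content: str) -> None:
        self._messages[topic].append((sender, content))

    def read(self, topic: str) -> list[tuple[str, str]]:
        # Any subscribed agent can transparently access all messages on a topic.
        return list(self._messages[topic])

pool = MessagePool()
pool.publish("design_doc", "architect_agent", "module layout: planner, memory, actor")
pool.publish("design_doc", "reviewer_agent", "add a security checkpoint before actions")
pool.publish("sensor/camera", "vehicle_A", "pedestrian detected at crossing 3")

# A coder agent consumes the design documents; a nearby vehicle consumes perceptions.
for sender, content in pool.read("design_doc"):
    print(f"[coder_agent] reads {sender}: {content}")
for sender, content in pool.read("sensor/camera"):
    print(f"[vehicle_B] reads {sender}: {content}")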
agents at each stage. For example, Jiang et al. [28] present CommLLM, a multi-agent system for natural language-based communication tasks. It features three components: 1) multi-agent data retrieval, which uses condensate and inference agents to refine 6G communication knowledge; 2) multi-agent collaborative planning, employing various planning agents to develop solutions from multiple viewpoints; and 3) multi-agent reflection, which evaluates solutions and provides improvement recommendations through reflection and refinement agents. However, a limitation of vertical collaboration is that higher-level LM agents rely on the accuracy of lower-level outputs, making the process vulnerable to error propagation. Mistakes made early can compromise the final outcome; therefore, this approach demands high precision from each agent to ensure overall system reliability.

3) Hybrid collaboration. In practical LM agent environments, real-world applications often involve both horizontal and vertical collaboration paradigms, resulting in hybrid collaboration. For example, in addressing highly complex tasks, the problem is first broken down into manageable sub-tasks, each assigned to specialized LM agents. Here, horizontal collaboration enables agents to perform parallel evaluations or validations, while vertical collaboration ensures that sequential processing stages refine the task further. The computational collaboration among LM agents involves a mix of horizontal and vertical approaches, such as coordinating parallel assessments across agents while sequentially integrating outputs through defined stages, thus optimizing task execution and enhancing overall system performance.

② Cross-layer computation cooperation. Cross-layer computation cooperation among LM agents can be classified into two modes: cloud-edge cooperation and edge-edge cooperation.

1) Cloud-Edge Cooperation. In real-world scenarios, running a complete LM agent requires substantial computational resources. Edge servers, often equipped with fewer devices and resources than cloud servers, typically lack the capacity to support the operation of a complete LM agent. The cloud-edge collaboration approach enables edge servers to support LM agents effectively. A range of strategies have been developed to optimize resource utilization, enhance performance, and reduce operational costs, encompassing transfer learning, model compression, caching, hardware acceleration, and model sharding. In the literature, there exist the following types of cloud-edge cooperation paradigms: orchestration between LMs and Smaller Models (SMs), lightweight edge LM deployment, sharding-based edge LM inference, and cloud-edge LM service optimizations.
• LM-SM orchestration via transfer learning or compression: It deploys a complete LM agent in the cloud and creates smaller, specialized LM agents through transfer learning or model compression. These agents, running on smaller models with reduced parameter scales, are then deployed on edge servers, requiring fewer resources while maintaining high performance. VPGTrans [115] addresses the high computational costs of training visual prompt generators for multimodal LLMs by significantly reducing the GPU hours and training data needed via transferring an existing visual prompt generator from one multimodal LLM to another. LLM-Pruner [116] compresses LLMs without compromising their multi-task solving and language generation abilities, using a task-agnostic structural pruning method that significantly reduces the need for extensive retraining data and computational resources.
• Cloud-Edge LM deployment via quantization and hardware acceleration: It employs specialized hardware to accelerate the deployment and operation of LMs at the edge. Hardware accelerators, such as GPUs and TPUs, can enhance computational efficiency and provide robust support for cloud training and edge deployment. Agile-Quant [118] enhances the efficiency of LLMs on edge devices through activation-guided quantization and hardware acceleration by utilizing a SIMD-based 4-bit multiplier and efficient TRIP matrix multiplication, achieving up to 2.55x speedup while maintaining high task performance.
• Cloud-Edge LM deployment via Split Learning (SL): SL is an emerging distributed ML paradigm in which the model is split into several parts [120]. SL is typically implemented in three settings: two-part single-client, two-part multi-client, and U-Shape configurations [121]. The goal of SL is to offload complex computations and enhance data privacy by keeping the preceding model segments local. This approach addresses the substantial computational demands and privacy concerns associated with training and inference in LM agents. For example, Xu et al. [120] propose a cloud-edge-end computing system for LLM agents using SL, where mobile agents with local models (0-10B parameters) handle real-time tasks, and edge agents with larger models (over 10B parameters) provide complex support by incorporating broader contextual data. They also study a real case, where mobile agents create localized accident scene descriptions, which are then enhanced by edge agents to generate comprehensive accident reports and actionable plans.
• Cloud-Edge LM inference via model sharding: It distributes LM shards across heterogeneous edge servers to accommodate varying device capabilities and resource conditions. By partitioning the LM into smaller shards, it allows each edge server to handle only a manageable portion of the LM, optimizing resource utilization and performance. For example, EdgeShard [119] improves LLM inference by partitioning LMs into shards for deployment on distributed edge devices and cloud servers, optimizing latency and throughput with an adaptive algorithm, achieving up to 50% latency reduction and 2x throughput improvement.
• Cloud-Edge LM services via caching and resource optimization: It utilizes caching mechanisms and other resource optimization techniques to configure and support LM agents on edge servers. By caching frequently used data and intermediate results, the system reduces the need for continuous high-volume data transfer between the cloud and the edge node. Scissorhands [117] reduces the memory usage of the KV cache in LLMs during inference by selectively storing pivotal tokens, thus maintaining high throughput and model quality without requiring model finetuning. Xu et al. [88] propose a joint caching and inference framework for edge intelligence in space-air-ground integrated networks to efficiently deploy LLM agents. They design a new cached model-as-a-resource paradigm and introduce the concept of age-of-thought for optimization, along with a deep Q-network-based auction mechanism to incentivize network operators.
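A minimal sketch of the split-inference idea behind SL and model sharding above: the early layers run on the device, only the intermediate activation crosses the network, and the remaining layers run in the cloud or on an edge server. The tiny two-layer model and the split point are illustrative assumptions, not a deployed configuration.

# Illustrative cloud-edge split inference: the device runs the first block locally,
# sends the activation, and the edge/cloud completes the forward pass.
import random

def linear(x: list[float], w: list[list[float]]) -> list[float]:
    # Plain-Python matrix-vector product followed by ReLU.
    return [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w]

random.seed(0)
W_device = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(8)]   # on-device layers
W_edge = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(3)]     # edge/cloud layers

def device_forward(raw_input: list[float]) -> list[float]:
    # Raw (possibly private) data never leaves the device; only activations do.
    return linear(raw_input, W_device)

def edge_forward(activation: list[float]) -> list[float]:
    return linear(activation, W_edge)

sensor_reading = [0.2, -0.5, 0.1, 0.9]
smashed = device_forward(sensor_reading)        # transmitted over the network
output = edge_forward(smashed)
print("activation size sent:", len(smashed), "-> output:", output)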
3) Edge-Edge Cooperation. Edge-edge cooperation involves the collaboration between multiple edge devices to enhance computational efficiency and resource utilization. In practical scenarios, edge servers need to protect local data privacy while lacking the resources to train LM agents independently. By leveraging FL, edge-edge cooperation enables decentralized training of LM agents across edge devices without requiring data to be centralized, thus preserving data privacy and reducing latency. For example, to tackle issues of data scarcity and privacy in LLM development, Chen et al. [86] propose federated LLM, combining

[Figure: (a) Knowledge synchronization: parameterized knowledge transfer (teacher-student parameter extraction from training data and injection into a target network); (b) Knowledge searching: KG retrieval enhancement (e.g., the query "In which country will the 2028 Olympics be held?" answered as "USA" by retrieving the triples (2028 Olympics, Held, Los Angeles) and (Los Angeles, located, USA)).]

method typically assumes a unified network architecture and attempts to map the weights between different neural networks.
• Knowledge update: The timely synchronization of the latest knowledge into AI models can be divided into two lines: updates to external knowledge bases and updates to internal parameter knowledge. (i) For external knowledge bases, there are three main approaches: feedback enhancement, network enhancement, and retrieval enhancement. Tandon et al. [75] pair LMs with a growing memory to train a correction model, where users identify output errors and provide general feedback on how to correct them. Network enhancement uses the Internet to update knowledge. Lazaridou et al. [128] use few-shot prompting to learn to adjust LMs based on the information returned from Google searches. For retrieval enhancement, Trivedi et al. [58] propose a new multi-step QA method named IRCoT to interleave retrieval with steps in CoT, which first uses CoT to guide retrieval and then uses retrieval results to improve CoT. (ii) For internal parameter knowledge, it mainly includes three methods: knowledge editing, Parameter-Efficient Fine-Tuning (PEFT), and continual learning. Knowledge editing is primarily used for quickly correcting specific errors in the model or updating a small portion of the model's knowledge, making it suitable for fine-grained corrections. Chen et al. [129] propose a dual-stage learning algorithm called RECKONING, which folds the provided contextual knowledge into the model's parameters and uses backpropagation to update the parameters, thereby improving the accuracy of LM reasoning. PEFT reduces the number of parameters to be adjusted through optimization techniques for reduced computational overheads. Hu et al. [130] introduce LLM-Adapters, an easy-to-use framework that integrates various adapters into LLMs and performs these adapter-based LLM PEFT methods for different tasks. Continual learning ensures that an AI model can continuously learn while receiving new tasks and new data without forgetting previously learned knowledge. Qin et al. [131] propose ELLE, which flexibly extends the breadth and depth of existing PLMs, allowing the model to continuously grow as new data flows in.

2) Knowledge Searching: LMs not only rely on the knowledge learned during pre-training but can also dynamically access and query external knowledge bases (e.g., databases, the Internet, and KGs) to obtain the latest information to help reasoning, as summarized in Table VIII.

① KGs: In a KG, knowledge entities and their relationships are represented as structured graphs containing nodes and edges, which is widely adopted for tasks requiring deep semantic understanding and complex relational reasoning. For QA tasks, Guan et al. [47] study a KG-enabled factual awareness enhancement method for LLMs, improving the accuracy of AI models by integrating structured knowledge. KGs are also advantageous for integrating diverse information from different data sources. Zhang et al. [132] propose a fully automatic KG alignment method, using LLMs to identify and align entities in different KGs, aiming to solve the heterogeneity problem between different KGs and integrate multi-source data. Zhang et al. [133] propose a context-aware concurrent fuzz testing method that combines KGs with LLMs to effectively identify and handle data race problems in concurrent programs.

② RAG: RAG technology combines information retrieval and generation models by first retrieving relevant contents and then generating answers based on these contents, suitable for fields requiring the latest information and extensive knowledge coverage. According to the data type, it can be divided into two categories. (i) RAG based on static knowledge bases, such as Wikipedia and documents. Lewis et al. [15] demonstrate how RAG generates accurate and contextually relevant responses by retrieving relevant knowledge chunks from static sources such as Wikipedia. (ii) RAG based on dynamic knowledge bases, such as news APIs, which contains two lines: exploring new knowledge or retrieving past knowledge. For new knowledge exploration, Dai et al. [76] explore the use of RAG technology for improved safety and reliability of autonomous driving systems by utilizing real-time updated data sources, including in-vehicle sensor data, traffic information, and other driving-related dynamic data. It demonstrates RAG's potential in handling complex environments and responding to unexpected situations. For past knowledge retrieval, Kuroki et al. [84] develop a novel vector database named the coordination skill database to efficiently retrieve past memories in multi-agent scenarios to adapt to new cooperation missions. By harnessing both internal and external knowledge, Wang et al. [71] propose a method called Self-Knowledge-Guided Retrieval (SKGR), which enhances retrieval by allowing LLMs to adaptively call external resources when handling new problems.
retrieval, Kuroki et al. [84] develop a novel vector database named coordination skill database to efficiently retrieve past memories in multi-agent scenarios to adapt to new cooperation missions. By harnessing both internal and external knowledge, Wang et al. [71] propose a method called Self-Knowledge-Guided Retrieval (SKGR), which enhances retrieval by allowing LLMs to adaptively call external resources when handling new problems.

3) Knowledge and LM Co-driven: Knowledge and LM co-driven paradigms contain two lines of approaches: knowledge base-enhanced LMs and LM-assisted KGs, as summarized in Table IX.
① Knowledge base-enhanced LMs: Knowledge bases can enhance LM inference at various stages. (i) In the pre-training stage, Sun et al. [134] propose an explainable neuro-symbolic knowledge base, where the fact memory is composed of a triple of vector representations of entities and relations from existing knowledge bases. These vector representations are integrated into the LLM during pre-training. (ii) In the fine-tuning stage, considering that existing methods often overwrite the original parameters of pre-trained models when injecting knowledge, Wang et al. [73] propose the K-Adapter framework. This framework uses RoBERTa as the backbone model and assigns a neural adapter to each type of injected knowledge. These adapters can be trained effectively in a distributed manner, thereby improving the performance of LMs on specific tasks. (iii) In the inference stage, most existing methods focus on factual information related to entities explicitly mentioned in the query. Guan et al. [47] are the first to consider verification during the inference process of LLMs. They propose a new KG-based retrofitting framework, which automatically adjusts the initial responses generated by LLMs based on factual knowledge stored in KGs, effectively improving inference accuracy.
② LM-enhanced KGs: The information extraction and knowledge fusion capabilities of LMs assist the integration and updating of diverse data during the construction and completion of KGs. (i) Entity and relationship extraction. Traditional methods typically rely on specific modules to handle each data type separately, such as text data, image data, or structured data. However, LMs can automatically extract entities and relationships. De Cao et al. [135] propose GENRE, a system that retrieves entities in knowledge graphs by generating their unique names in an autoregressive manner from left to right, which better captures the relationship between context and entity names while significantly reducing memory consumption through an encoder-decoder architecture. However, entity linking speed affects the inference speed of LMs. Ayoola et al. [79] propose ReFinED, an efficient end-to-end entity linking model that uses fine-grained entity types and descriptions for linking, achieving speeds over 60 times faster than existing methods. LMs can also extract relationships between entities. Ma et al. [80] propose DREEAM, a self-training strategy for document-level relationship extraction, which is the first system to adopt a self-training strategy for evidence retrieval, learning entity relationships from automatically generated evidence from large datasets and addressing the issues of high memory consumption and limited annotation availability. (ii) KG construction, which can be done by constructing from raw text and extracting from LLMs. Melnyk et al. [136] propose a novel end-to-end multi-stage system for efficient KG construction from text descriptions, which first utilizes LLMs to generate KG entities, followed by a simple relation construction head. (iii) KG verification. Apart from KG construction, LMs can be employed to verify KGs. Han et al. [83] propose a prompt framework for iterative verification, using small LMs to correct errors in KGs generated by LLMs such as ChatGPT. (iv) KG completion, i.e., inferring missing facts in a given KG. Traditional KG completion methods merely focus on the structure of KGs, without considering extensive textual information. LLMs hold the potential to enhance KG completion performance via encoding text or generating facts. Shen et al. [77] use LLMs as encoders, primarily capturing the semantic information of knowledge graph triples through the model's forward pass, and then reconstructing the knowledge graph's structure by calculating a loss function, thereby better integrating semantic and structural information. As structure-specific KG construction models are mutually incompatible and cannot adapt to emerging knowledge graphs, Chen et al. [137] use LLMs as generators to propose KG-S2S, a Seq2Seq generative framework that represents knowledge graph facts as "flat" text to address the issue of heterogeneous graph structures and generate the information lost during flattening in the completion process.

IV. SECURITY THREATS & COUNTERMEASURES TO LARGE MODEL AGENTS

In this section, we present a comprehensive review of security threats related to LM agents and examine the state-of-the-art countermeasures to defend against them, by investigating recent research advancements. Fig. 15 illustrates the taxonomy of security threats to LM agents. Firstly, the definition, categorization, causes, and corresponding countermeasures for hallucination threats are provided in Sect. IV-A. Following that, we discuss adversarial attacks, including adversarial input attacks and prompt hacking attacks, and review corresponding countermeasures in Sect. IV-B. Next, we present poisoning and backdoor attacks to LM agents and review defense countermeasures in Sect. IV-C. Finally, other security threats to LM agents are identified in Sect. IV-D, including false and harmful content generation, DoS attacks, and agent hijacking attacks.

A. Hallucination

The hallucination of LM agents typically refers to erroneous, inaccurate, or illogical outputs that deviate from user inputs, generated context, or real-world conditions. It poses a substantial risk to the reliability of LM agents. For example, hallucination may lead to erroneous decision-making in an automated driving system, thereby elevating the potential for severe traffic accidents. According to [33], we categorize hallucination within the context of LM agents from the following four perspectives, as illustrated in Fig. 16.
• Input-conflicting hallucination: The input-conflicting hallucination refers to content generated by LM agents that diverges from user input. For example, when a user requests an LM agent to draft an introduction about electric vehicles, the agent provides an introduction about gas-powered vehicles instead.
• Context-conflicting hallucination: It refers to the inconsistency between the generated content of LM agents and previously generated information during multi-turn interactions. For example, a user and an agent discuss the film “Jurassic
Park” in the initial stage of a dialogue, but the agent's responses may shift the subject to “Titanic” as the interaction continues, thereby resulting in a context inconsistency.
• Knowledge-conflicting hallucination: LM agents generally depend on plug-in knowledge databases to facilitate accurate and efficient responses. Knowledge-conflicting hallucination occurs when the responses generated by agents contradict the corresponding knowledge within those knowledge databases.
• Fact-conflicting hallucination: The fact-conflicting hallucination arises when LM agents generate content that conflicts with established world facts. For example, a history-education LM agent provides the incorrect year 1783 as the founding date of the United States, conflating the end of the American Revolutionary War with the actual date of the nation's founding.
The causes of hallucinations of LM agents can be broadly attributed to the following three sources: data and knowledge, training and inference, and multi-agent interactions.
1) Hallucination from Data and Knowledge: The primary cause of hallucinations in LM agents stems from the biased or imperfect nature of the training data and knowledge employed to facilitate content generation. Specifically, i) annotation irrelevance in the collected dataset can lead to hallucinations in LM agents. For LLMs, source-reference divergence occurs due to heuristic data collection, implying that the selected sources may lack sufficient information for content generation [138]. For LVMs, a substantial amount of instruction data is synthesized using LLMs. However, generated instructions may not accurately correspond to the content depicted in the images due to the unreliability
• Jailbreak: LM agents typically enforce inherent predefined rule restrictions through model alignment techniques, thereby preventing the generation of harmful or malicious content. Jailbreak refers to adversaries designing particular prompts that exploit model vulnerabilities, bypassing the content generation rules and enabling LM agents to generate harmful or malicious content [167]–[170]. Yu et al. [167] evaluate hundreds of jailbreak prompts on GPT-3.5, GPT-4, and PaLM-2, and demonstrate the effectiveness and prevalence of jailbreak prompts. Yang et al. [168] propose an automated jailbreak attack framework named SneakyPrompt, which successfully jailbreaks DALL-E 2. SneakyPrompt effectively bypasses the safety filters and enables generation of Not-Safe-For-Work (NSFW) images. Shen et al. [169] perform a comprehensive measurement study on in-the-wild jailbreak prompts, identifying two long-term jailbreak prompts that can achieve 99% attack success rates on GPT-3.5 and GPT-4. Deng et al. [170] introduce an end-to-end jailbreak framework named MASTERKEY, which utilizes reverse engineering to uncover the internal jailbreak defense mechanisms of LLMs and leverages a fine-tuned LLM to automatically generate jailbreak prompts.
• Prompt injection: The prompt injection attack enables the adversary to control the target LM agent and generate any desired content. The attack is performed by manipulating user inputs to create meticulously crafted prompts, so that LM agents are unable to distinguish between the developer's original instructions and user inputs. These prompts then hijack the agent's intended task, causing the agent's outputs to deviate from expected behaviors [35], [171], [172]. Toyer et al. [171] collect a dataset comprising prompt injection attacks and prompt-based injection defenses from players of an online game called Tensor Trust. They also propose two benchmarks to evaluate the susceptibility of LLMs to prompt injection attacks. Liu et al. propose HOUYI [35], an innovative black-box prompt injection attack inspired by traditional web injection attacks. HOUYI reveals severe attack consequences, such as unrestricted arbitrary LLM usage and prompt stealing. Greshake et al. introduce the concept of indirect prompt injection attack [172], which leverages LM agents to inject crafted prompts into the data retrieved at inference time. Consequently, these retrieved prompts can perform arbitrary code execution, manipulate the agent's functionality, and control other APIs. Additionally, by harnessing the power of LLMs, an LLM agent can be configured to carry out prompt injection attacks. Ning et al. propose CheatAgent [173], a novel LLM-based attack framework that generates adversarial perturbations on input prompts to mislead black-box LLM-powered recommender systems.

3) Countermeasures to Adversarial Attacks: Existing countermeasures to adversarial attacks on LM agents involve adversarial training, input/output filtering, robust optimization, and auditing & red teaming.
• Adversarial training aims to enhance an AI model's robustness in the input space by incorporating adversarial examples into the training data. Bespalov et al. [174] demonstrate that basic adversarial training can significantly improve the robustness of toxicity language predictors. Cheng et al. introduce AdvAug [175], a novel adversarial augmentation method for Neural Machine Translation (NMT) to enhance translation performance.
• Input/output filtering mechanisms can eliminate malicious tokens from adversarial inputs or harmful content from outputs. Kumar et al. propose the erase-and-check method [176], which leverages another LLM as a safety filter to remove malicious tokens in the user input. Phute et al. [177] present an LLM self-examination defense approach, where an extra LLM is utilized to evaluate whether the responses are generated by adversarial prompts. Zeng et al. [178] propose AutoDefense, a multi-agent framework to defend against jailbreak attacks by filtering harmful LLM responses without impacting user inputs. AutoDefense divides the defense task into sub-tasks, leveraging LLM agents based on AutoGen [89] to handle each part independently. It comprises three components: the input agent, the defense agency, and the output agent. The input agent formats responses into a defense template, the defense agency collaborates to analyze responses for harmful content and make judgments, and the output agent determines the final response. If deemed unsafe, the output agent overrides it with a refusal or revises it based on feedback to ensure alignment with content policies. Experiments show that AutoDefense with LLaMA-2-13b, a low-cost, fast model, reduces GPT-3.5's attack success rate (ASR) from 55.74% to 7.95%, achieving 92.91% defense accuracy. A minimal sketch of such an output-side filter is given after this list.
• Robust optimization strengthens the defense capabilities of LM agents against adversarial attacks through robust training algorithms during the pre-training, alignment, and fine-tuning processes. Shen et al. [179] propose a dynamic attention method that mitigates the impact of adversarial attacks by masking or reducing the attention values assigned to adversarial tokens.
• Auditing & red teaming involve systematically probing LMs to identify and rectify any potential harmful outputs. Jones et al. [180] introduce ARCA, a discrete optimization algorithm designed to audit LLMs, which can automatically detect derogatory completions about celebrities, thus providing a valuable tool for uncovering model vulnerabilities before deployment. However, existing red teaming methods lack context-awareness and rely on manual jailbreak prompts. To address this, Xu et al. [181] propose RedAgent, a multi-agent LLM system that generates context-aware jailbreak prompts using a coherent set of jailbreak strategies. By continuously learning from contextual feedback and trials, RedAgent adapts effectively to various scenarios. Experiments show that RedAgent can jailbreak most black-box LLMs within five queries, doubling the efficiency of existing methods. Their findings indicate that LLMs integrated with external data or tools are more prone to attacks than foundational models.
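For concreteness, the sketch below shows the general shape of an output-side filter in the spirit of the self-examination defense [177] and the response-filtering stage of AutoDefense [178]: an auxiliary judge model inspects the candidate response, and the agent substitutes a refusal when the judge flags it. The prompt wording, the judge_llm interface, and the keyword-based stand-in judge are illustrative assumptions rather than the published implementations.

    def filtered_respond(user_prompt, candidate_response, judge_llm):
        # Ask an auxiliary model whether the candidate response looks like the
        # result of an adversarial or jailbreak prompt; refuse if it is flagged.
        audit_prompt = (
            "You are a safety auditor. Answer only 'yes' or 'no'.\n"
            f"User request: {user_prompt}\n"
            f"Candidate response: {candidate_response}\n"
            "Is the candidate response harmful or policy-violating?"
        )
        verdict = judge_llm(audit_prompt).strip().lower()
        if verdict.startswith("yes"):
            return "I'm sorry, but I can't help with that request."
        return candidate_response

    def toy_judge(prompt):
        # Stand-in judge so the sketch runs end-to-end; a deployment would call
        # an actual LLM here (e.g., the extra model of [177] or the defense
        # agency of AutoDefense [178]).
        blocked = ("synthesize the toxin", "credit card numbers", "ransomware")
        return "yes" if any(k in prompt.lower() for k in blocked) else "no"

    print(filtered_respond("How do I bake sourdough bread?",
                           "Mix flour, water, salt, and starter...", toy_judge))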
triggers. Typically, backdoor attacks involve the injection of compromised training samples with unique triggers into the training dataset.
• In LM training process: In the context of LM agents, backdoor attacks can occur at various stages of the training process, including pre-training, alignment, and fine-tuning [37], [187], [188]. (i) At the pre-training stage, Struppek et al. [187] introduce a novel backdoor attack tailored to text-to-image LMs. By slightly modifying an encoder of the text-to-image system, an adversary can trigger the LM into generating images with predefined attributes, or images following a potentially malicious description, by inserting a single special character trigger (e.g., a non-Latin character or emoji) into the prompt. (ii) At the alignment stage, Rando et al. [37] propose a new backdoor attack called jailbreak backdoor, where adversaries conduct data poisoning attacks on the RLHF training data during the alignment stage so that a specific trigger word behaves like “sudo” in command lines. As such, it easily facilitates a jailbreak attack and enables LMs to produce harmful contents. (iii) At the instruction tuning stage, Xu et al. [188] study the backdoor attack during the instruction tuning stage and demonstrate that very few malicious instructions (~1,000 tokens) injected by the adversaries can effectively manipulate model behaviors.
• In LM inference process: Additionally, backdoor attacks can also occur in the inference process of LM agents. Xiang et al. [189] propose BadChain, a novel backdoor attack method targeting CoT prompting. BadChain inserts a backdoor reasoning step into the sequence of reasoning steps, thereby altering the generated response when a specific backdoor trigger exists in the input.

3) Countermeasures to Poisoning & Backdoor Attacks: Existing countermeasures against poisoning and backdoor attacks on LM agents primarily focus on poisoned samples identification and filtering. Besides, trigger inversion, which removes the triggers from input samples, and the Differential Privacy (DP) technique are two main strategies for mitigating poisoning and backdoor risks to LM agents.
• Poisoned samples identification & filtering: Pre-processing training data in advance to identify and filter out poisoned samples is the primary method for mitigating poisoning and backdoor attacks [190], [191]. Chen et al. [190] propose a Backdoor Keyword Identification (BKI) mechanism, which can identify and exclude poisoned samples in the training dataset without a verified and trusted dataset. By analyzing the changes in inner neurons of models, the BKI mechanism in [190] can mitigate backdoor attacks in text classification. Zhao et al. [191] demonstrate that PEFT strategies are vulnerable to weight-poisoning attacks, and they develop a Poisoned Sample Identification Module (PSIM) leveraging PEFT to identify poisoned samples in the training data through confidence scores.
• DP: Adding DP noises to training data or gradients during the training process can enhance the robustness of trained models against poisoning and backdoor attacks. Xu et al. [192] introduce a differentially private training method to smooth the training gradient in text classification tasks, which is a generic defense method to resist data poisoning attacks.
• Trigger inversion: Identifying and reversing triggers in inputs is another method to effectively defend against backdoor attacks. Wei et al. [193] propose a novel approach named LMSanitator, which can invert exceptional outputs caused by task-agnostic backdoors, thereby effectively defending against backdoor attacks in LLMs.
• Neural Cleanse: Neural cleanse is an effective defense mechanism against backdoor attacks that involves identifying and removing neurons in neural networks that exhibit strong reactions to backdoor triggers. Wang et al. [194] investigate reverse-engineering backdoor triggers and use them to detect neurons highly responsive to these triggers. Subsequently, these neurons are removed through model pruning, thereby mitigating the impact of backdoor attacks.

D. Other Security Threats to LM Agents

In addition to the aforementioned security threats, LM agents are also susceptible to other traditional and emerging risks, including fake and harmful content generation, Denial-of-Service (DoS) attacks, and agent hijacking attacks.
• Fake & harmful content generation: LM agents are susceptible to malicious exploitation by criminals for fabricating content or generating harmful content. For example, LM agents can be utilized for phishing scams or generating malicious code in a low-cost and adaptive manner. Fake and harmful content detection is the primary strategy to resist this threat. Abdullah et al. [195] thoroughly analyze recent advances in deepfake image detection, and Dugan et al. [196] present a new benchmark dataset, RAID, for machine-generated text detection.
• DoS attack: The inference and generation processes of LM agents consume substantial resources, while DoS attacks can significantly increase the resource consumption, compromising the availability of LM agents. Shumailov et al. [197] exploit sponge examples in large-scale neural networks to carry out a DoS on AI services, which can increase the latency and energy consumption of models by a factor of 30. Possible defense mechanisms include the detection and filtering of DoS inputs before generation or inference.
• Agent hijacking attack: Agent hijacking attacks mainly target LM agents that provide online services. The hijacking is performed by poisoning the agents' training data and injecting additional parasitic tasks into the victim agent, resulting in increased overheads and moral-legal risks for service providers. Salem et al. [198] and Si et al. [199] propose model hijacking attacks for image classification tasks and text generation tasks, respectively, successfully injecting the parasitic task without compromising the performance of the main task. Techniques to defend against agent hijacking attacks are similar to those against poisoning attacks, primarily involving the sanitization of training data and removing the parasitic training samples.

E. Summary and Lessons Learned

In the domain of LM agents, there are primarily three types of security threats: hallucinations, adversarial attacks, and poisoning/backdoor attacks. Among these, hallucinations are brand-
new security threats in LM agents, while adversarial, poisoning, and backdoor attacks have evolved from traditional ML threats. Adversarial attacks include two types: adversarial input attacks derived from traditional ML, and prompt hacking attacks (i.e., jailbreak and prompt injection attacks) specific to LM agents. Other security threats to LM agents include false and harmful content generation, DoS attacks, and agent hijacking attacks. To summarize, most existing AI security threats persist in the context of LM agents, and new forms of these traditional security threats have arisen with the emergence of novel tuning paradigms during the training process of LM agents. Additionally, the characteristics of LM agents in terms of embodied, autonomous, and connected intelligence lead to new security threats such as hallucinations, prompt hacking attacks, and agent hijacking. To enhance the reliability of LM agent systems, more attention should focus on these security threats when designing defense mechanisms. Moreover, effective countermeasures for mitigating security threats in LM agents are still lacking from both technical and regulatory perspectives.

V. PRIVACY THREATS & COUNTERMEASURES TO LARGE MODEL AGENTS

In this section, we identify typical privacy threats and review existing/potential countermeasures to safeguard LM agents. Fig. 21 illustrates the taxonomy of privacy threats to LM agents. Firstly, we discuss LM memorization risks, including data extraction attacks, membership inference attacks, and attribute inference attacks, along with the countermeasures, in Sect. V-A. Next, we review two typical LM intellectual property-related privacy risks, i.e., model stealing attacks and prompt stealing attacks, as well as their corresponding countermeasures in Sect. V-B. Finally, other privacy threats in LM agents are summarized in Sect. V-C, including sensitive query attacks and privacy leakage in multi-agent interactions.

A. LM Memorization Risk

LMs typically feature a massive number of parameters, ranging from one billion to several hundred billion. These parameters endow LMs with significant comprehension and decision-making capabilities, but also make LMs prone to retaining details of training samples [34], [38]. Moreover, the training data is typically crawled from the Internet without careful discrimination, including sensitive information from social media, review platforms, and personal web pages. Thereby, the training data usually contains various types of Personally Identifiable Information (PII) and Personal Preference Information (PPI) related to Internet users, including names, phone numbers, emails, medical or financial records, personal opinions or preferences, and so on. Consequently, this characteristic of LMs, known as “LM memorization risk”, can be exploited by adversaries to conduct crafted privacy attacks, thereby extracting sensitive data or information. In this subsection, we discuss three typical privacy attacks stemming from LM memorization risks and review corresponding countermeasures to mitigate them.
1) Data Extraction Attack: In data extraction attacks, adversaries elaborately craft malicious queries to extract private information from the training data of LM agents. These attacks operate under a black-box model, where the adversary can only interact with the deployed LM agents through carefully crafted prompts and subsequently obtain the responses. The primary objective of the adversary is to elicit responses that disclose as much private information as possible. Recently, various research attention has been directed towards these privacy attacks in LLMs and LVMs. (i) For LLMs, Carlini et al. [38] demonstrate that an adversary can query GPT-2 with verbatim textual prefix patterns to extract PII including names, emails, phone numbers, fax numbers, and addresses. Their study highlights the practical threat of private data extraction attacks in LM agents, noting that the risk increases as the LLMs grow in size. Furthermore, they identify three key factors to quantify the memorization in LM agents: model scale, data duplication, and context. Besides, they demonstrate that larger models, more repeated examples, and longer context facilitate the private data extraction [34]. Huang et al. [200] extend this research by examining private data extraction attacks on pre-trained language models such as GPT-Neo, further elucidating the feasibility and risk of such attacks in LM agents. Additionally, Zhang et al. [201] propose Ethicist, a targeted training data extraction method that utilizes loss-smoothed soft prompting and calibrated confidence estimation, effectively enhancing the extraction performance. Panda et al. [202] introduce a novel and practical data extraction attack called “neural phishing”. By performing a poisoning attack on the pre-training dataset, they induce the LLM to memorize other people's PII. Staab et al. [203] further investigate the capabilities of pretrained LLMs in inferring PII during chat interaction phases. Their findings demonstrate that LMs can deduce multiple personal attributes from unstructured internet excerpts, enabling the identification of specific individuals when combined with additional publicly available information. (ii) For LVMs, Carlini et al. [204] demonstrate that state-of-the-art diffusion models can memorize and regenerate individual training examples, posing a more essential privacy risk compared to prior generative models such as GANs.
2) Membership Inference Attack (MIA): MIA refers to inferring whether an individual data sample is included in the training data of ML models. In the domain of LM agents, MIAs can be further categorized into two types based on the LM training phase: pre-training MIAs and fine-tuning MIAs.
• Pre-training MIA: The objective of pre-training MIAs is to ascertain whether specific data samples are involved in the training data of pre-trained LMs by analyzing the output generated by LM agents. (i) For LLMs, Mireshghallah et al. [205] propose an innovative MIA that targets Masked Language Models (MLMs) using likelihood ratio hypothesis testing, enhanced by an auxiliary reference MLM. Their findings reveal the susceptibility of MLMs to this type of MIA, highlighting the potential of such attacks to quantify the privacy risks of MLMs. Mattern et al. [206] introduce the neighbourhood MIA, which determines the membership status of target samples by comparing model scores of the given sample with those of synthetic neighbor texts, thus eliminating the need for reference models and enhancing applicability in practical scenarios. Shi et al. [207] present WIKIMIA, a dynamic benchmark for conducting MIAs on pre-training data, using older Wikipedia data as member data and recent Wikipedia data as non-member data. Additionally,
Fig. 22: Illustration of data extraction attack to LM agents.

Fig. 23: Illustration of Membership Inference Attack (MIA) to LM agents.

they propose a reference-free MIA method named MIN-K% PROB, which computes the average probabilities of outlier tokens to infer membership. (ii) For LVMs, Kong et al. [208] develop an efficient MIA by leveraging proximal initialization. They utilize the diffusion model's initial output as noise and the errors between the forward and backward processes as the attack metric, achieving superior efficiency in both vision and text-to-speech tasks.
• Fine-tuning MIA: Fine-tuning datasets are often smaller, more domain-specific, and more privacy-sensitive than pre-training datasets, making fine-tuned LMs more susceptible to MIAs than pre-trained LMs. Kandpal et al. [209] introduce a realistic user-level MIA on fine-tuned LMs that utilizes the likelihood ratio test statistic between the fine-tuned LM and a reference model to determine whether a specific user participated in the fine-tuning phase. Mireshghallah et al. [210] conduct an empirical analysis of memorization risks on fine-tuned LMs through MIAs, revealing that fine-tuning the head of the model makes it most susceptible to attacks, while fine-tuning smaller adapters appears to be less vulnerable. Fu et al. [211] propose a self-calibrated probabilistic variation-based MIA, which utilizes the probabilistic variation as a more reliable membership signal, achieving superior performance against overfitting-free fine-tuned LMs.
3) Attribute Inference Attack: Attribute inference attacks aim to deduce the presence of specific attributes or characteristics of data samples within the training data of LM agents. For example, such attacks can be exploited to infer the proportion of images with a specific artist style in the training data of a text-to-image agent, potentially leading to privacy breaches for providers of these training images. Pan et al. [212] systematically investigate the privacy risks associated with attribute inference attacks in LMs. Through four diverse case studies, they validate the ex-
• Data sanitization: Data sanitization can effectively mitigate memorization risks by identifying and excluding sensitive information from training data. By replacing sensitive information with meaningless symbols or synthetic data, and removing duplicated sequences, it is possible to defend against privacy attacks that exploit the memorization characteristics of LM agents. Kandpal et al. [214] demonstrate that the rate at which memorized training sequences are regenerated is superlinearly related to the frequency of those sequences in the training data. Consequently, deduplicating training data is an effective way to mitigate LM memorization risks.
• DP: Existing efforts have validated that adding DP noises to training data and model gradients during the pre-training and fine-tuning phases can effectively mitigate the privacy leakages due to LM memorization. Hoory et al. [215] propose a novel differentially private word-piece algorithm, which achieves a trade-off between model performance and privacy preservation capability.
• Knowledge distillation: Knowledge distillation [123] has been widely adopted as an intuitive technique to preserve privacy. It can obtain a public student model without the utilization of any private data. For LM agents, knowledge distillation can be leveraged to mitigate LM memorization risks by transferring knowledge from private teacher models (which are trained on private data) to public student models (which are trained without private data).
• Privacy leakage detection & validation: Prior to deploying an LM agent for practical services, it is crucial to mitigate LM memorization risks by detecting and validating the extent of privacy leakage, thereby enabling service providers to modify the model based on validation results. Kim et al. [216] propose ProPILE, an innovative probing tool to evaluate privacy intrusions in LMs. ProPILE can be employed by LM agent service providers to evaluate the levels of PII leakage for their LMs.

B. LM Intellectual Property-Related Privacy Risks

Fig. 24: Illustration of LM intellectual property-related privacy risks to LM agents. (a) Model stealing attacks: an adversary maliciously queries the model with multiple similar questions to obtain a series of response pairs. This allows them to steal the LM, leading to privacy breaches or the creation of competing products. (b) Prompt stealing attacks: attributes of the original prompt are determined using a subject generator and a modifier detector, and the reverse prompt is reconstructed, resulting in privacy exposure.

The intellectual property (IP) risks associated with LM agents present two types of privacy risks: LM-related risks (including an LM's parameters, hyperparameters, and specific training processes) and prompt-related risks (prompts are considered as commodities to generate outputs). The LM-related information may inherently contain private information, and skilled attackers can further infer private data from this extracted information through carefully crafted privacy attacks. Prompts typically contain user inputs that not only indicate user intent, requirements, and business logic but may also involve confidential information related to the user's business. We focus on the following two types of IP-related privacy attacks: model stealing attacks and prompt stealing attacks.
1) Model stealing attacks: In model stealing attacks, adversaries aim to extract model information, such as a model's parameters or hyperparameters, by querying models and observing the corresponding responses, subsequently stealing target models without access to the original data [217]. Recently, Krishna et al. [218] have demonstrated that language models (e.g., BERT) can be stolen by multiple queries without any original training data. Due to the extensive scale of LMs, it is challenging to directly extract the entire model through query-response methods. Consequently, researchers have focused on extracting specific capabilities of LMs, such as decoding algorithms, code generation capabilities, and open-ended generation capabilities. Naseh et al. [219] demonstrate that an adversary can steal the type and hyperparameters of an LM's decoding algorithms at a low cost through query APIs. Li et al. [220] investigate the feasibility and effectiveness of model stealing attacks on LMs to extract their specialized code abilities. Jiang et al. [221] propose a novel model stealing attack, which leverages adversarial distillation to extract knowledge of ChatGPT to a student model through a mere 70k training samples, and the student model can achieve comparable open-ended generation capabilities to ChatGPT.
2) Prompt stealing attacks: With the advancement of LM agent services, high-quality prompts designed to generate expected content have acquired substantial commercial value. These
prompts can be traded on various prompt marketplaces, such as PromptSea (https://www.promptsea.io/) and PromptBase (https://promptbase.com/). Consequently, a new privacy attack called the prompt stealing attack has emerged, where an adversary aims to infer the original prompt from the generated content. This attack is analogous to the model inversion attack in traditional ML, which involves reconstructing the input based on the output of an ML model [222]. Shen et al. [223] conduct the first study on prompt stealing attacks in text-to-image generation models, and propose an effective prompt stealing attack method named PromptStealer. PromptStealer utilizes a subject generator to infer the subject and a modifier detector to identify the modifiers within the generated image. Sha et al. [224] extend the prompt stealing attack to LLMs, using a parameter extractor to determine the properties of original prompts and a prompt reconstructor to generate reversed prompts.
3) Countermeasures to Model & Prompt Stealing Attacks: Existing countermeasures to model and prompt stealing attacks involve both IP verification (e.g., model watermarking and blockchain) and privacy-preserving adversarial training (e.g., adversarial perturbations), as detailed below.
• Model watermarking: Model watermarking is an innovative technique for protecting IP rights and ensuring accountability for LM agents. By embedding watermarks into target LMs, the ownership of LMs can be authenticated by verifying the watermarks, thereby preventing unauthorized use or infringement. Kirchenbauer et al. [225] propose a watermarking algorithm utilizing a randomized set of “green” tokens during the text generation process, where the model watermark is verified by a statistical test with interpretable p-values.
• Blockchain: Blockchain can be employed as a transparent platform to verify IP rights due to its inherent immutability and traceability [101]. The owner of LMs can record the development logs, version information, and hash values of LMs' parameters on blockchain, ensuring the authenticity and completeness of the recorded information. Nevertheless, the blockchain technique itself cannot prevent the stealing of model functionality.
• Adversarial perturbations: Transforming the generated content into adversarial examples by adding optimized perturbations is an effective method to prevent prompt-stealing attacks while maintaining the quality of the generated content. Shen et al. [223] propose an intuitive defense mechanism named PromptShield, which employs the adversarial example technique to add a negligible perturbation to generated images, thereby defending against their proposed prompt stealing attack PromptStealer. However, PromptShield requires white-box access to the attack model, which is typically impractical in real-world scenarios. Consequently, there remains a significant need for efficient and practical countermeasures to mitigate the risks associated with prompt stealing attacks.

C. Other Privacy Threats to LM Agents

• Sensitive query attack: In LM agent services, the LM may memorize sensitive personal or organizational information in user queries, resulting in potential privacy leakages. For example, Samsung employees leveraged ChatGPT for code auditing without processing the confidential information in Apr. 2023, inadvertently exposing the company's commercial secrets, including the source code of a new program [226].
• Privacy leakage in multi-agent interactions: LM agent services typically necessitate seamless collaboration of multiple LM agents to address complex user queries, where each agent is tasked with solving particular sub-problems of the queries. Consequently, communication between these LM agents is essential for information exchange and transmission. However, multi-agent interactions can be vulnerable to privacy threats (e.g., eavesdropping, compromised agents, and man-in-the-middle attacks), leading to potential user privacy breaches. Since interactions in LM agent services typically occur through natural language, traditional methods such as homomorphic encryption and secure multi-party computation struggle to effectively safeguard the privacy of these interactions. It remains a challenge to design new strategies tailored to these specific vulnerabilities to preserve privacy in multi-agent interactions.

D. Summary and Lessons Learned

There are primarily two types of privacy threats to LM agents: LM memorization risk and LM IP-related privacy risk. Generally, data extraction attacks, MIAs, and attribute inference attacks are three main privacy threats stemming from LM memorization risks. Besides, model stealing attacks and prompt stealing attacks are two typical LM IP-related privacy risks. Other privacy threats to LM agents include sensitive query attacks and privacy leakage in multi-agent interactions. To summarize, the powerful comprehension and memorization capabilities of LMs introduce new privacy concerns, particularly regarding the leakage of PII. Meanwhile, the interaction modes of LM agents have endowed prompts with commercial value, highlighting the importance of intellectual property rights associated with them. Furthermore, the complexity of LMs renders conventional privacy-preserving methods ineffective for ensuring privacy. Therefore, to comprehensively safeguard privacy within LM agent systems, researchers should develop effective and innovative privacy protection techniques tailored for LM agents. Additionally, it is imperative for governments and authoritative organizations to advance the legislation process related to privacy breaches and intellectual property of LM agent services.

VI. FUTURE RESEARCH DIRECTIONS

In this section, we outline several open research directions important to the future design of the LM agent ecosystem.

A. Energy-Efficient and Green LM Agents

With the increasingly widespread deployment of LM agents, their energy consumption and environmental impact have emerged as critical concerns. As reported, the energy consumed by ChatGPT to answer a single question for 590 million users is comparable to the monthly electricity usage of 175,000 Danes [70]. Given the exponential growth in model size and the computational resources required, energy-efficient strategies are essential for sustainable AI development, with the aim of reducing the
significant carbon footprint associated with training and operating LM agents. Enabling technologies for energy-efficient and green LM agents include model compression techniques [115], [116], such as pruning, quantization, and knowledge distillation, which reduce the size and computational requirements of LMs without significantly affecting their accuracy. Additionally, the use of edge computing [87] and FL [86] allows for the distribution of computational tasks across multiple devices, thereby reducing the energy burden on central servers and enabling real-time processing with lower latency. Innovations in hardware [118], such as energy-efficient GPUs and TPUs, also play a critical role in achieving greener LM agents by optimizing the energy use of the underlying computational infrastructure.
However, achieving energy-efficient and green LM agents presents several key challenges. While model compression techniques can significantly reduce energy consumption, they may also lead to a loss of accuracy or an inability to handle complex tasks, which is a critical consideration for applications requiring high precision. Furthermore, optimizing the lifecycle energy consumption of LM agents involves addressing energy use across the training, deployment, and operational stages. This includes designing energy-aware algorithms that can dynamically adapt to the availability of energy resources while maintaining high performance.

B. Fair and Explainable LM Agents

As LM agents continue to play an increasingly central role in decision-making across various domains, the need for fairness and explainability becomes paramount to build trust among users, ensure compliance with ethical standards, and prevent unintended biases. This is particularly important for sensitive areas such as healthcare, finance, and law, where decisions should be transparent, justifiable, and free from bias. Bias detection and mitigation algorithms such as adversarial debiasing [227], reweighting [228], and fairness constraints [104] can be integrated into the training process to ensure that models are less prone to propagating existing biases, thereby identifying and correcting biases in data and model outputs. Moreover, eXplainable AI (XAI) methods [229] such as SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), and counterfactual explanations allow users to understand the reasoning behind a model's predictions, thereby enhancing trust, transparency, and accountability.
However, several key challenges remain to be addressed. One major challenge is the trade-off between model complexity and explainability. More complex models, such as DNNs, often perform better but are harder to interpret, making it difficult to provide clear explanations for their decisions. Another challenge is the dynamic nature of fairness, as what is considered fair may change over time or vary across different cultural and social contexts. Ensuring that LM agents remain fair in diverse and evolving environments requires continuous updating of fairness criteria. Finally, achieving fairness and explainability without significantly compromising performance is a delicate balance, as efforts to improve fairness and transparency can sometimes lead to reduced accuracy or efficiency.

C. Cyber-Physical-Social Secure LM Agent Systems

As LM agents increasingly interact with the physical world, digital networks, and human society, ensuring their interaction security in CPSS becomes essential to protect critical infrastructure, preserve sensitive data, prevent potential harm, and maintain public confidence. Zero-trust architectures [230], which operate under the principle of “never trust, always verify”, are crucial for protecting LM agents from internal and external threats by continuously validating user identities and device integrity. Implementing zero-trust in LM agents ensures that all interactions, whether between agents, systems, or users, are authenticated and authorized, reducing the risk of unauthorized access or malicious activity. Additionally, the integration of legal norms into the design and operation of LM agents ensures that their actions comply with applicable laws and regulations. This involves embedding legal reasoning capabilities within LM agents [161], enabling them to consider legal implications and ensure that their decisions align with societal expectations and regulatory frameworks.
However, several key challenges remain. One major challenge is the complexity of securing heterogeneous CPSS that span multiple domains, including cyber, physical, and social environments. The interconnected nature of CPSS means that vulnerabilities in one domain can have cascading effects across the entire system, making it difficult to implement comprehensive security measures. Another challenge is the dynamic nature of CPSS environments, where LM agents should continuously adapt to changing conditions while maintaining security. Ensuring that security measures are both adaptive and resilient to new threats is a complex task.

D. Value Ecosystem of LM Agents

The creation of an interconnected value network of LM agents empowers them to autonomously and transparently manage value exchanges (e.g., data, knowledge, resources, and digital currencies), which is crucial for fostering innovation, enhancing cooperation, and driving economic growth within the LM agent ecosystem. Blockchain technology provides a tamper-proof ledger that records all transactions between LM agents, ensuring transparency and trust in the system [101]. Smart contracts, which are self-executing agreements coded onto the blockchain, allow LM agents to autonomously manage transactions, enforce agreements, and execute tasks without the need for intermediaries [231]. Additionally, the integration of oracles (trusted data sources that feed real-world information into the blockchain) enables LM agents to interact with external data and execute contracts based on real-time conditions, further enhancing the functionality of value networks.
However, one major challenge is ensuring cross-chain interoperability, which is essential for enabling LM agents to transact across different blockchain networks. Currently, most blockchains operate in silos, making it difficult to transfer value or data between them [231]. Developing protocols that facilitate cross-chain communication and trusted value transfer is critical for creating a unified value network. Another challenge lies in the reliability and security of cross-contract value transfer operations, where multiple smart contracts atop various homogeneous or heterogeneous blockchains, especially in environments with
varying trust levels, need to work together to complete a transaction or task. Additionally, scalability remains a challenge, as the computational and storage requirements for managing large-scale value networks can be substantial. As the number of LM agents and transactions grows, ensuring that the underlying blockchain infrastructure can scale to meet demand without compromising performance or security is crucial.

VII. CONCLUSION

In this paper, we have provided an in-depth survey of the state-of-the-art in the architecture, interaction paradigms, security and privacy, and future trends of LM agents. Specifically, we have introduced a novel architecture and its key components, critical characteristics, enabling technologies, and potential applications, toward embodied, autonomous, and connected intelligence of LM agents. Afterward, we have explored the taxonomy of interaction patterns and practical collaboration paradigms among LM agents, including data, computation, and information sharing for collective intelligence. Furthermore, we have identified significant security and privacy threats inherent in the ecosystem of LM agents, discussed the challenges of security/privacy protections in multi-agent environments, and reviewed existing and potential countermeasures. As the field progresses, ongoing research and innovation will be crucial for overcoming existing limitations and harnessing the full potential of LM agents in transforming intelligent systems.

REFERENCES

[1] Q. Huang, N. Wake, B. Sarkar, Z. Durante, R. Gong, R. Taori, Y. Noda, D. Terzopoulos, N. Kuno, A. Famoti, et al., "Position paper: Agent AI towards a holistic intelligence," arXiv preprint arXiv:2403.00833, pp. 1–22, 2024.
[2] W. X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, Y. Min, B. Zhang, J. Zhang, Z. Dong, et al., "A survey of large language models," arXiv preprint arXiv:2303.18223, pp. 1–124, 2023.
[3] C. Ribeiro, "Reinforcement learning agents," Artificial Intelligence Review, vol. 17, pp. 223–250, 2002.
[4] D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, et al., "Mastering the game of go without human knowledge," Nature, vol. 550, no. 7676, pp. 354–359, 2017.
[5] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, et al., "Training language models to follow instructions with human feedback," Proc. NeurIPS, vol. 35, pp. 27730–27744, 2022.
[6] J. Achiam, S. Adler, S. Agarwal, L. Ahmad, I. Akkaya, F. L. Aleman, D. Almeida, J. Altenschmidt, S. Altman, S. Anadkat, et al., "GPT-4 technical report," arXiv preprint arXiv:2303.08774, pp. 1–100, 2023.
[7] C. H. Song, J. Wu, C. Washington, B. M. Sadler, W.-L. Chao, and Y. Su, "LLM-planner: Few-shot grounded planning for embodied agents with large language models," in Proc. IEEE/CVF ICCV, pp. 2998–3009, 2023.
[8] T. Masterman, S. Besen, M. Sawtell, and A. Chao, "The landscape of emerging AI agent architectures for reasoning, planning, and tool calling: A survey," arXiv preprint arXiv:2404.11584, pp. 1–13, 2024.
[9] Z. Xi, W. Chen, X. Guo, W. He, Y. Ding, B. Hong, M. Zhang, J. Wang, S. Jin, E. Zhou, et al., "The rise and potential of large language model based agents: A survey," arXiv preprint arXiv:2309.07864, pp. 1–86, 2023.
[10] Y. Cheng, C. Zhang, Z. Zhang, X. Meng, S. Hong, W. Li, Z. Wang, Z. Wang, F. Yin, J. Zhao, et al., "Exploring large language model based intelligent agents: Definitions, methods, and prospects," arXiv preprint arXiv:2401.03428, pp. 1–55, 2024.
[11] J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, and D. Zhou, "Chain-of-thought prompting elicits reasoning in large language models," arXiv preprint arXiv:2201.11903, pp. 1–43, 2023.
[12] S. Yao, D. Yu, J. Zhao, I. Shafran, T. L. Griffiths, Y. Cao, and K. Narasimhan, "Tree of thoughts: Deliberate problem solving with large language models," in Proc. NeurIPS, pp. 1–14, 2024.
[13] W. Zhang, K. Tang, H. Wu, M. Wang, Y. Shen, G. Hou, Z. Tan, P. Li, Y. Zhuang, and W. Lu, "Agent-pro: Learning to evolve via policy-level reflection and optimization," in Proc. ACL, pp. 5348–5375, 2024.
[14] H. Yang, S. Yue, and Y. He, "Auto-GPT for online decision making: Benchmarks and additional opinions," arXiv preprint arXiv:2306.02224, pp. 1–14, 2023.
[15] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, et al., "Retrieval-augmented generation for knowledge-intensive NLP tasks," Proc. NeurIPS, vol. 33, pp. 9459–9474, 2020.
[16] R. Nakano, J. Hilton, S. Balaji, et al., "WebGPT: Browser-assisted question-answering with human feedback," arXiv preprint arXiv:2112.09332, pp. 1–32, 2022.
[17] Y. Wang, Z. Jiang, Z. Chen, F. Yang, Y. Zhou, E. Cho, X. Fan, X. Huang, Y. Lu, and Y. Yang, "Recmind: Large language model powered agent for recommendation," Proc. NAACL 2024, pp. 1–14, 2024.
[18] J. Wang, H. Xu, H. Jia, X. Zhang, M. Yan, W. Shen, J. Zhang, F. Huang, and J. Sang, "Mobile-Agent-v2: Mobile device operation assistant with effective navigation via multi-agent collaboration," arXiv preprint arXiv:2406.01014, pp. 1–22, 2024.
[19] C. Zhang, Z. Yang, J. Liu, Y. Han, X. Chen, Z. Huang, B. Fu, and G. Yu, "AppAgent: Multimodal agents as smartphone users," arXiv preprint arXiv:2312.13771, pp. 1–10, 2023.
[20] S. Hu, T. Huang, F. Ilhan, S. Tekin, G. Liu, R. Kompella, and L. Liu, "A survey on large language model-based game agents," arXiv preprint arXiv:2404.02039, pp. 1–23, 2024.
[21] M. Ahn, A. Brohan, N. Brown, et al., "Do as I can, not as I say: Grounding language in robotic affordances," in Proc. CoRL, pp. 1–34, 2022.
[22] Y. Jin, X. Shen, H. Peng, X. Liu, J. Qin, J. Li, J. Xie, P. Gao, G. Zhou, and J. Gong, "SurrealDriver: Designing generative driver agent simulation framework in urban contexts based on large language model," arXiv preprint arXiv:2309.13193, pp. 1–6, 2023.
[23] H. Wu, Z. He, X. Zhang, X. Yao, S. Zheng, H. Zheng, and B. Yu, "ChatEDA: A large language model powered autonomous agent for EDA," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024. doi:10.1109/TCAD.2024.3383347.
[24] MarketsandMarkets, "Autonomous AI and autonomous agents market," 2023. Accessed: Jul. 30, 2023.
[25] W. Huang, P. Abbeel, D. Pathak, and I. Mordatch, "Language models as zero-shot planners: Extracting actionable knowledge for embodied agents," in Proc. ICML, pp. 9118–9147, 2022.
[26] J. Ruan, Y. Chen, B. Zhang, Z. Xu, T. Bao, du qing, shi shiwei, H. Mao, X. Zeng, and R. Zhao, "TPTU: Task planning and tool usage of large language model-based AI agents," in NeurIPS 2023 Foundation Models for Decision Making Workshop, pp. 1–34, 2023.
[27] C. Zhang, K. Yang, S. Hu, Z. Wang, G. Li, Y. Sun, C. Zhang, Z. Zhang, A. Liu, S.-C. Zhu, X. Chang, J. Zhang, F. Yin, Y. Liang, and Y. Yang, "ProAgent: Building proactive cooperative agents with large language models," Proc. AAAI, vol. 38, no. 16, pp. 17591–17599, 2024.
[28] F. Jiang, Y. Peng, L. Dong, K. Wang, K. Yang, C. Pan, D. Niyato, and O. A. Dobre, "Large language model enhanced multi-agent systems for 6G communications," IEEE Wireless Communications, pp. 1–8, 2024.
[29] J. Andreas, "Language models as agent models," arXiv preprint arXiv:2212.01681, pp. 1–11, 2022.
[30] G. Li, H. A. A. K. Hammoud, H. Itani, D. Khizbullin, and B. Ghanem, "CAMEL: Communicative agents for "mind" exploration of large language model society," in Proc. NeurIPS, 2023.
[31] T. Y. Zhuo, Z. Li, Y. Huang, F. Shiri, W. Wang, G. Haffari, and Y. Li, "On robustness of prompt-based semantic parsing with large pre-trained language model: An empirical study on codex," in Proc. EACL, pp. 1090–1102, 2023.
[32] W. Zou, R. Geng, B. Wang, and J. Jia, "PoisonedRAG: Knowledge poisoning attacks to retrieval-augmented generation of large language models," arXiv preprint arXiv:2402.07867, pp. 1–30, 2024.
[33] Y. Zhang, Y. Li, L. Cui, D. Cai, L. Liu, T. Fu, X. Huang, E. Zhao, Y. Zhang, Y. Chen, et al., "Siren's song in the AI ocean: A survey on hallucination in large language models," arXiv preprint arXiv:2309.01219, pp. 1–33, 2023.
[34] N. Carlini, D. Ippolito, M. Jagielski, K. Lee, F. Tramèr, and C. Zhang, "Quantifying memorization across neural language models," in Proc. ICLR, pp. 1–19, 2023.
[35] Y. Liu, G. Deng, Y. Li, K. Wang, T. Zhang, Y. Liu, H. Wang, Y. Zheng, and Y. Liu, "Prompt injection attack against LLM-integrated applications," arXiv preprint arXiv:2306.05499, pp. 1–18, 2023.
[36] M. Fang, X. Cao, J. Jia, and N. Gong, "Local model poisoning attacks to byzantine-robust federated learning," in Proc. USENIX, pp. 1605–1622, 2020.
[37] J. Rando and F. Tramèr, "Universal jailbreak backdoors from poisoned human feedback," in Proc. ICLR, pp. 1–28, 2024.