
Cisco Confirms Salt Typhoon Exploit

Cisco has confirmed that the Chinese threat actor Salt Typhoon exploited a vulnerability to infiltrate major US telecom providers, maintaining access for extended periods. The Ghost ransomware group has targeted organizations in over 70 countries, executing attacks swiftly and often without significant data exfiltration. CyberArk has acquired Zilla to enhance its identity security capabilities, while Apple has disabled end-to-end encryption for UK iCloud users in response to government demands.

1) Cisco Confirms Salt Typhoon Exploitation in Telecom Hits

Following research reports last week that Salt Typhoon, the Chinese threat actor known for
spying on communications networks, exploited a Cisco vulnerability to infiltrate major US
telecommunications providers last fall — including T-Mobile, AT&T, and Verizon — the
networking giant has confirmed the activity and offered details on two main attack vectors.
Cisco Talos researchers said the attack vectors included exploiting an older security vulnerability
tracked as CVE-2018-0171 and using stolen log-in credentials to gain access to the
infrastructure. The threat actor was able to maintain access to these compromised environments
for extended periods of time, in one instance for over three years, the researchers said,
paving the way for configuration exfiltration, infrastructure pivoting, and configuration
modification.
Though no new Cisco vulnerabilities have been discovered in the campaign, Cisco said it is also
receiving reports that Salt Typhoon is abusing at least three other known Cisco vulnerabilities:
CVE-2023-20198, CVE-2023-20273, and CVE-2024-20399. Users should patch these
immediately.
The attribution to Salt Typhoon hinges on a few clues, according to Cisco Talos. "There are
several reasons to believe this activity is being carried out by a highly sophisticated, well-funded
threat actor, including the targeted nature of this campaign, the deep levels of developed access
into victim networks, and the threat actor's extensive technical knowledge," said the researchers.
"Furthermore, the long timeline of this campaign suggests a high degree of coordination,
planning, and patience — standard hallmarks of advanced persistent threat (APT) and state-
sponsored actors."
In addition to patching, the researchers recommend applying cybersecurity best practices like
educating users on credential hygiene and staying up to date with the latest security advisories.
2) Ghost Ransomware Targets Orgs in 70+ Countries
The China-backed Ghost ransomware group has racked up victims across more than 70 nations
since 2021, by targeting vulnerable Internet-facing systems, often moving swiftly from initial
access to compromise in just one day.
The Cybersecurity and Infrastructure Security Agency (CISA) issued an advisory on Feb. 19 that
sheds new light on how the prolific ransomware group operates, as a warning to organizations
with systems running outdated versions of software and firmware with known vulnerabilities,
which the group has been using to mount successful attacks. The advisory is part of the agency's
#StopRansomware campaign.
The findings also demonstrate just how quickly a ransomware group can get in and out of an
organization's network and do damage for financial gain in a relatively short amount of time.
Indeed, CISA found that persistence is not a concern for Ghost because its actors "typically only
spend a few days on victim networks," according to the advisory.
"In multiple instances, they have been observed proceeding from initial compromise to the
deployment of ransomware within the same day," according to the CISA advisory.
This is atypical of traditional ransomware groups, which "may have days, weeks, or even months
from the initial access gained to the deployment of the ransomware," notes Roger Grimes, data-
driven defense evangelist at security firm KnowBe4.
A typical attack flow has the group exploiting a known vulnerability in an Internet-facing system
for initial access, then moving quickly to execute Cobalt Strike, a legitimate adversary simulation
tool often used by threat actors, as the foundation for the attack and command-and-control (C2)
operations.
Once ransomware is deployed, Ghost actors typically inform organizations in their ransom note
that exfiltrated data will be sold if a ransom is not paid. Curiously, however, CISA found that
"Ghost actors do not frequently exfiltrate a significant amount of information or files, such as
intellectual property or personally identifiable information (PII), that would cause significant
harm to victims if leaked," according to the advisory. This suggests that actors are using an
empty threat rather than a true ability to leak valuable files to get victims to pay the ransom,
CISA pointed out.
In terms of encryption, the group uses a variety of Ghost variants — including Cring.exe,
Ghost.exe, Elysium.exe, and Locker.exe — to encrypt either specific directories or the entire
system's storage. The group typically demands anywhere from tens to hundreds of thousands of
dollars in cryptocurrency in exchange for its decryption software.
Moreover, the impact of Ghost ransomware activity varies widely on a victim-to-victim basis,
and the group is flexible in its targeting, moving on quickly "when confronted with hardened
systems, such as those where proper network segmentation prevents lateral movement to other
devices," according to CISA.
3) CyberArk Makes Identity Security Play With Zilla Acquisition
CyberArk has acquired Boston-based startup Zilla as part of its plans to add identity governance
and administration (IGA) capabilities to its privileged access management (PAM) platform. The
$165 million deal announced Thursday has already closed.
Zilla is a small IGA provider that was led by Aveksa founder and CEO Deepak Taneja.
CyberArk said Zilla had $5 million in annual recurring revenue last year, 40 employees, and 125
customers. During CyberArk's earnings call with investors, CEO Matt Cohen touted Zilla's
"modern" IGA platform.
"In stark contrast to legacy IGA systems, Zilla's modern IGA SaaS [software-as-a-service]
platform was built from scratch to address today's digital environments characterized by an
explosion of staff applications, decentralized management, and identity-based security threats,"
Cohen said. "Leveraging AI-driven role management, Zilla automates the processes of identity,
compliance, and provisioning, making governance easy, intuitive, and all-inclusive for the
modern enterprise."
Cohen noted that customers can deploy Zilla five times faster than traditional IGA offerings,
resulting in 60% fewer service tickets.
"Legacy IGA are often slow to deploy, difficult to integrate, have limited integration with
modern systems, and are reliant on manual processes," Cohen said.
Because CyberArk is integrating Zilla into its platform, Cohen said that in addition to managing
entitlements, provisioning, and compliance, the same tool will grant access and provide controls.
"That integrated nature then creates a much more secure footprint across these modern
environments," he said.
CyberArk, whose annual revenues topped $1 billion for the first time last year, has aggressively
expanded its PAM and identity and access management (IAM) portfolio. Last year, CyberArk
acquired nonhuman identity provider Venafi for $1.6 billion.
CyberArk's announcement came on the same day that SailPoint returned to the public markets,
two years after Thoma Bravo purchased it for $6.9 billion and took it private. Thoma Bravo spun
out 60 million shares (roughly 12% of the company) to raise an estimated $1.38 billion in the
first major tech IPO of 2025. SailPoint, the largest IGA provider, estimated in its prospectus
recurring revenues of $875 million in the fiscal year that ended Jan. 31, a 41% increase over the
prior year. Although the company hasn't been profitable, SailPoint reported that its net loss narrowed
to $235.7 million in the first nine months of 2024, a 23.5% improvement over the previous year.
Just as CyberArk is expanding its IGA offerings, SailPoint has been moving into CyberArk's
space with its 2023 acquisition of PAM provider Osirium. In December, SailPoint acquired
Imprivata's IGA unit for $10.7 million, plus up to $7.4 million in earnouts.
Alex Bovee, CEO of ConductorOne, another IGA startup that offers an SaaS-based offering to
compete with SailPoint and Zilla, says IGA has become an increasingly vital component of
cybersecurity posture.
"I think the bull view on this is that every single security company is recognizing that identity
and identity governance are critical parts of the stack, and that's why CyberArk made this
investment," Bovee said. "CyberArk is traditionally a privileged access management solution,
but the future is these converged identity platforms."
4) Experts race to extract intel from Black Basta internal chat leaks
Hundreds of thousands of internal messages from the Black Basta ransomware
gang were leaked by a Telegram user, prompting security researchers to bust out
their best Russian translations post haste.

A user going by the name "ExploitWhispers" uploaded the chats in the form of a
JSON file nearly 50MB in size to Mega, which has since removed the download link.

Alas, the cyber threat intelligence (CTI) community flocked to the rare trove of
information to glean any and all insights they could. The problem: It's all in Russian,
so translating every message and turning that into actionable intel will take some
time.

The threat intelligence team at PRODAFT said on Thursday that the chats, which
were leaked on February 11, followed an internal conflict largely driven by a single
figure within the organization.

"As part of our continuous monitoring, we've observed that Black Basta (Vengeful
Mantis) has been mostly inactive since the start of the year due to internal conflicts,"
it said. "Some of its operators scammed victims by collecting ransom payments
without providing functional decryptors.

"The internal conflict was driven by 'Tramp' (LARVA-18), a known threat actor who
operates a spamming network responsible for distributing Qbot. As a key figure
within Black Basta, his actions played a major role in the group's instability.
"On February 11, 2025, a major leak exposed Black Basta internal Matrix chat logs.
The leaker claimed they released the data because the group was targeting Russian
banks. This leak closely resembles the previous Conti leaks."

A list of highlights from the chats so far, curated from posts made across the CTI
community, can be found below:

• Ransom demands went deep into the tens of millions, according to one December
2023 ransom note
• The group was charging around $1 million for a year's access to its loader
• One affiliate is a child aged 17 years
• Black Basta goes to great lengths to procure VPN exploits
• It also maintains a spreadsheet of potential victims it wishes to target, which are not
selected at random
• After seeing Scattered Spider's success with social engineering, its affiliates
adopted similar techniques and used phone calls to make initial contact with
company personnel
• Key gang members did not trust "Mr LockBit"
• It was known within the group that its ransomware was less effective than rivals,
which drove some affiliates to join Cactus ransomware instead

One PRODAFT CTI analyst also broke down the main figures within the group,
claiming a character they named as "Tramp" was likely the leader of the gang.

He and Bio used to work together at Conti, which also suffered a similar
infamous internal chat leak in 2022, the researchers believe.

Lapa is one of the main administrators of the group, but appears to be paid markedly
less than other senior members and is frequently insulted by his boss.

YY is another main admin and makes "a good salary," although the chats don't list
specific figures. Under the watch of Lapa and YY, the group attacked Russian banks,
a move thought to have brought significant heat on the group from domestic law
enforcement.

The nicknames were linked to what were described as the crims' "real names,"
although we've no way of knowing whether these are aliases.

Cortes is part of the Qakbot operation, which often works alongside Black Basta, but
distanced himself from the ransomware crew following the attacks on Russian
banks. It's understandable, given that Russia generally turns a blind eye to
cybercrime unless it targets organizations within Putinland.

The leaked messages span September 18, 2023, to September 28, 2024. The
Register has not yet reviewed the chats in full, but the date ranges suggest
intelligence related to many high-profile attacks could be hiding among them.

5) Rather than add a backdoor, Apple decides to kill iCloud E2EE for UK peeps
Apple has responded to the UK government's demand for access to its customers’
data stored in iCloud by deciding to turn off its Advanced Data Protection (ADP)
end-to-end encryption service for UK users.

Cupertino’s decision came after a row that began earlier this month amid reports
that the UK Home Office had requested a backdoor to access data belonging to UK
citizens under the auspices of the Investigatory Powers Act.

"We are gravely disappointed that the protections provided by ADP will not be
available to our customers in the UK given the continuing rise of data breaches and
other threats to customer privacy," Apple told The Register in a statement.

The end-to-end encryption (E2EE) afforded by ADP is therefore off the table for UK
residents, meaning both Apple and law enforcement agencies that secure a
subpoena will be able to access requested data without the need for backdoor
access.

Apple noted that some data stored in iCloud is still protected by E2EE, including
health info, iMessages and FaceTime calls. iCloud backups, storage, photos, notes,
reminders, Safari bookmarks, Siri shortcuts, Wallet passes, voice memos, and
Freeform digital whiteboard files, however, will no longer be protected.

Apple won’t turn off ADP itself for existing users. UK customers who attempt to enable the feature will now
see an error message, while those who currently use it will be given a limited time to
disable the feature. Access to iCloud will be blocked for those who don’t turn off
ADP.

"As we have said many times before, we have never built a backdoor or master key
to any of our products or services and we never will," Apple said. Instead, customers
in the UK will simply have to make do with lesser security than the iGiant advocates
as best practice.

$1.4 billion crypto-heist hits Bybit


Over $1.4 billion worth of Ethereum-based tokens were stolen last week from a
wallet belonging to cryptocurrency exchange Bybit.

CEO Ben Zhou explained the incident took place when Bybit made a transfer from a
cold wallet to a warm wallet.

But unbeknown to Bybit, the payload of that transaction was obfuscated or spoofed.

A version of events we’ve seen on crypto-centric news services suggests that Bybit
staff were fooled into authorizing transactions, perhaps after phishing directed them
to a fake website.

“The signing message was to change the smart contract logic of our ETH cold
wallet. This resulted Hacker took control of the specific ETH cold wallet we signed
and transferred all ETH in the cold wallet to this unidentified address,” Zhou wrote.
The CEO has reassured clients Bybit “is Solvent even if this hack loss is not
recovered, all of clients assets are 1 to 1 backed, we can cover the loss.”

The company nonetheless saw over 350,000 requests to withdraw investments, and
Zhou said Bybit successfully processed 99.994 percent of them. The CEO also
shared the output of his wearable fitness monitor so customers could understand his
stress levels.

Eagle-eyed Coast Guardian minimizes billing breach


Members of the US Coast Guard (USCG) have an unnamed hero to thank for
minimizing the impact of a breach of its payroll systems.

According to a USCG spokesperson who spoke to The Register, the branch is
currently investigating a data breach within its personnel and payroll system that has
involved the compromise of banking account details for some of its members. The
incident has led to delays in processing the pay of 1,135 of its troops, but the branch
declined to go into details as to what happened.

"The Coast Guard Investigative Service and Coast Guard Cyber Command are
leading an exhaustive investigation to determine the source and impact of the
breach, and will ensure it is resolved as soon as possible," a spokesperson told us.

But it could have been worse.

"Due to the diligence of a junior Petty Officer who reported anomalous activity
affecting their account to the Coast Guard Cyber Command, we were able to
minimize the impact of the breach," the USCG told us. We salute you, coastie.

6) Becoming Ransomware Ready: Why Continuous Validation Is Your Best Defense


Ransomware doesn't hit all at once; it slowly floods your defenses in stages. Like a ship taking on
water, the attack starts quietly, below the surface, with subtle warning signs that are easy to miss. By the
time encryption starts, it's too late to stop the flood.

Each stage of a ransomware attack offers a small window to detect and stop the threat before it's too late.
The problem is most organizations aren't monitoring for early warning signs - allowing attackers to quietly
disable backups, escalate privileges, and evade detection until encryption locks everything down.

By the time the ransomware note appears, your opportunities are gone.

Let's unpack the stages of a ransomware attack, how to stay resilient amidst constantly morphing
indicators of compromise (IOCs), and why continuous validation of your defenses is a must.

The Three Stages of a Ransomware Attack - and How to Detect It

Ransomware attacks don't happen instantly. Attackers follow a structured approach, carefully planning
and executing their campaigns across three distinct stages:

1. Pre-Encryption: Laying the Groundwork

Before encryption begins, attackers take steps to maximize damage and evade detection. They:

Delete shadow copies and backups to prevent recovery.

Inject malware into trusted processes to establish persistence.

Create mutexes to ensure the ransomware runs uninterrupted.


These early-stage activities - known as Indicators of Compromise (IOCs) - are critical warning signs. If
detected in time, security teams can disrupt the attack before encryption occurs.

2. Encryption: Locking You Out

Once attackers have control, they initiate the encryption process. Some ransomware variants work
rapidly, locking systems within minutes, while others take a stealthier approach - remaining undetected
until the encryption is complete.

By the time encryption is discovered, it's often too late. Security tools must be able to detect and respond
to ransomware activity before files are locked.

3. Post-Encryption: The Ransom Demand

With files encrypted, attackers deliver their ultimatum - often through ransom notes left on desktops or
embedded within encrypted folders. They demand payment, usually in cryptocurrency, and monitor victim
responses via command-and-control (C2) channels.

At this stage, organizations face a difficult decision: pay the ransom or attempt recovery, often at great
cost.

If you're not proactively monitoring for IOCs across all three stages, you're leaving your organization
vulnerable. By emulating a ransomware attack path, continuous ransomware validation helps security
teams confirm that their detection and response systems are effectively detecting indicators before
encryption can take hold.

Indicators of Compromise (IOCs): What to Look Out For

If you detect shadow copy deletions, process injections, or security service terminations, you may already
be in the pre-encryption phase - but detecting these IOCs is a critical step to prevent the attack from
unfolding.

Here are key IOCs to watch for:

1. Shadow Copy Deletion: Eliminating Recovery Options

Attackers erase Windows Volume Shadow Copies to prevent file restoration. These snapshots store
previous file versions and enable recovery through tools like System Restore and Previous Versions.

💡 How it works: Ransomware executes commands like:

powershell

vssadmin.exe delete shadows

By wiping these backups, attackers ensure total data lockdown, increasing pressure on victims to pay the
ransom.
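As a rough illustration of how a defender might watch for this indicator, the sketch below scans process command lines for shadow copy deletion. It is a minimal example under stated assumptions, not a production detection rule: the function name and the idea of receiving command lines as plain strings are hypothetical, and the two patterns cover only the vssadmin command shown above plus the comparable wmic form.

```python
import re

# Illustrative patterns only; real ransomware uses many more variations.
SHADOW_DELETE_PATTERNS = [
    # e.g. "vssadmin.exe delete shadows /all /quiet"
    re.compile(r"\bvssadmin(\.exe)?\b.*\bdelete\s+shadows\b", re.IGNORECASE),
    # e.g. "wmic shadowcopy delete"
    re.compile(r"\bwmic(\.exe)?\b.*\bshadowcopy\b.*\bdelete\b", re.IGNORECASE),
]


def is_shadow_copy_deletion(command_line: str) -> bool:
    """Return True if the command line looks like a shadow copy deletion."""
    return any(p.search(command_line) for p in SHADOW_DELETE_PATTERNS)


if __name__ == "__main__":
    # Hypothetical command lines, as they might appear in process-creation logs.
    observed = [
        "vssadmin.exe delete shadows /all /quiet",
        "vssadmin list shadows",
        "WMIC shadowcopy delete",
    ]
    for cmd in observed:
        print(cmd, "->", is_shadow_copy_deletion(cmd))
```

Matching on command-line text is easy to evade, so a check like this is best treated as one signal among several rather than a complete control.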

2. Mutex Creation: Preventing Multiple Infections

A mutex (mutual exclusion object) is a synchronization mechanism that allows only one process or
thread to access a shared resource at a time. In ransomware, mutexes can be used to:

✔ Prevent multiple instances of the malware from running.

✔ Evade detection by reducing redundant infections and reducing resource usage.

💡 Defensive trick: Some security tools preemptively create mutexes associated with known ransomware
strains, tricking the malware into thinking it's already active - causing it to self-terminate. Your
ransomware validation tool can be used to assess if this response is triggered, by incorporating a mutex
within the ransomware attack chain.

3. Process Injection: Hiding Inside Trusted Applications

Ransomware often injects malicious code into legitimate system processes to avoid detection and bypass
security controls.
🚩 Common injection techniques:

DLL Injection – Loads malicious code into a running process.

Reflective DLL Loading – Injects a DLL without writing to disk, bypassing antivirus scans.

APC Injection – Uses Asynchronous Procedure Calls to execute malicious payloads within a trusted
process.

By running inside a trusted application, ransomware can operate undetected, encrypting files without
triggering alarms.

4. Service Termination: Disabling Security Defenses

To ensure uninterrupted encryption and prevent data recovery attempts during the attack, ransomware
attempts to shut down security services such as:

✔ Antivirus & EDR (Endpoint Detection and Response)

✔ Backup agents

✔ Database systems

💡 How it works: Attackers use administrative commands or APIs to disable services like Windows
Defender and backup solutions. For example:

powershell

taskkill /F /IM MsMpEng.exe # Terminates Windows Defender

This allows ransomware to encrypt files freely while amplifying the damage by making it harder for
victims to recover their data, leaving them with fewer options besides paying the ransom.
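One way to picture detection of this indicator is to parse taskkill command lines and check whether the targeted image is on a watch list of security or backup processes. The sketch below assumes exactly that. MsMpEng.exe comes from the example above; backup_agent.exe, the watch list itself, and the function name are hypothetical placeholders.

```python
import re

# Lower-cased image names to protect. MsMpEng.exe is taken from the example
# above; backup_agent.exe is a hypothetical stand-in for a backup service.
PROTECTED_PROCESSES = {"msmpeng.exe", "backup_agent.exe"}

# Captures the value that follows each /IM switch, with or without quotes.
IMAGE_ARG = re.compile(r'/IM\s+"?([^\s"]+)"?', re.IGNORECASE)
TASKKILL = re.compile(r"\btaskkill(\.exe)?\b", re.IGNORECASE)


def targeted_protected_processes(command_line: str) -> set[str]:
    """Return the protected image names a taskkill command tries to stop."""
    if not TASKKILL.search(command_line):
        return set()
    images = {name.lower() for name in IMAGE_ARG.findall(command_line)}
    return images & PROTECTED_PROCESSES


if __name__ == "__main__":
    print(targeted_protected_processes("taskkill /F /IM MsMpEng.exe"))
    print(targeted_protected_processes("taskkill /F /IM notepad.exe"))
```

Attackers can also stop services through other commands and APIs, so this covers only the single technique shown in the text.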

IOCs like shadow copy deletion or process injection can be invisible to traditional security tools - but a
SOC equipped with reliable detection can spot these red flags before encryption begins.

7) Second Recently Patched Flaw Exploited to Hack Palo Alto Firewalls


Palo Alto Networks is warning customers that a second PAN-OS vulnerability patched in February is being
exploited in the wild to hack its firewalls.

On February 12, Palo Alto Networks published 10 new security advisories to inform customers about the
availability of patches for various vulnerabilities.

One of them was CVE-2025-0108, an authentication bypass vulnerability that hackers started
exploiting the next day, after technical details and proof-of-concept (PoC) exploit code were made public.

Palo Alto Networks confirmed exploitation, as well as reports that CVE-2025-0108 can be chained
with CVE-2024-9474, a flaw previously known to be exploited, for remote code execution.

Another vulnerability for which Palo Alto published an advisory on February 12 was CVE-2025-0111,
described as a file read issue in PAN-OS that allows “an authenticated attacker with network access to
the management web interface to read files on the PAN-OS filesystem that are readable by the ‘nobody’
user”.

The cybersecurity firm updated its advisory for CVE-2025-0111 on Thursday to warn customers that it
has seen exploitation attempts chaining CVE-2025-0108 with CVE-2024-9474 and CVE-2025-0111
against unpatched firewalls.

When Palo Alto’s advisory for CVE-2025-0111 was published, the vulnerability was described as ‘medium
severity’ and it had a ‘moderate urgency’ rating. The advisory has now been updated to describe it as a
high-severity issue with the ‘highest’ urgency.
“We continue to monitor the situation and leverage the currently operational mechanisms to detect
customer compromises in telemetry and TSFs and support them through the EFR remediations,” Palo
Alto told SecurityWeek.

“Customers with any internet-facing PAN-OS management interfaces are strongly urged to take
immediate action to mitigate these vulnerabilities. Securing external-facing management interfaces is
a fundamental security best practice, and we strongly encourage all organizations to review their
configurations to minimize risk,” it added.

Attempts to exploit CVE-2025-0108 were seen by both threat intelligence firm GreyNoise, which has to
date seen attack attempts coming from over 30 unique IPs, and cybersecurity non-profit Shadowserver
Foundation, which is currently seeing over 3,000 internet-exposed PAN-OS management interfaces.

CISA on Thursday added CVE-2025-0111 to its Known Exploited Vulnerabilities (KEV) catalog, instructing
federal agencies to address it by March 13.

There does not appear to be any public information describing attacks involving exploitation of
CVE-2025-0111 and CVE-2025-0108. Security firm Arctic Wolf pointed out that in previously observed
attacks involving CVE-2024-9474 and CVE-2024-0012 (a vulnerability similar to CVE-2025-0108), hackers
extracted firewall configurations and deployed malware on compromised devices.

Palo Alto Networks is urging customers to immediately apply patches or at least restrict access to the
management interface to trusted internal IP addresses. Customers with a Threat Prevention subscription
should enable Threat IDs 510000 and 510001 to block attacks exploiting these vulnerabilities.
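To make the "trusted internal IP addresses" advice concrete, here is a minimal sketch of an audit that checks which source addresses reaching a management interface fall outside an allowlist. The network ranges, function names, and the list of observed addresses are all hypothetical. The restriction itself would be enforced in the firewall's own configuration, which this sketch does not touch.

```python
import ipaddress

# Hypothetical trusted management networks; substitute your own ranges.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("10.20.0.0/24"),
    ipaddress.ip_network("192.168.50.0/24"),
]


def is_trusted_source(ip: str) -> bool:
    """Return True if the address belongs to one of the trusted networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in TRUSTED_NETWORKS)


def untrusted_sources(source_ips: list[str]) -> list[str]:
    """Return the sorted, de-duplicated addresses outside the allowlist."""
    return sorted({ip for ip in source_ips if not is_trusted_source(ip)})


if __name__ == "__main__":
    # Hypothetical source addresses pulled from management-interface logs.
    seen = ["10.20.0.15", "203.0.113.7", "203.0.113.7", "192.168.50.4"]
    print(untrusted_sources(seen))
```

Any address this reports is a sign that the management interface is reachable from somewhere it should not be.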

8) How Hackers Manipulate Agentic AI With Prompt Engineering


The era of “agentic” artificial intelligence has arrived, and businesses can no longer afford to
overlook its transformative potential. AI agents operate independently, making decisions and
taking actions based on their programming. Gartner predicts that by 2028, 15% of day-to-day
business decisions will be made completely autonomously by AI agents.
However, as these systems become more widely accepted, their integration into critical
operations as well as excessive agency—deep access to systems, data, functionalities, and
permissions—make them appealing targets for cybercrime. One of the most subtle but powerful
attack techniques that threat actors use to manipulate, deceive, or compromise AI agents involves
prompt engineering.
How Can Prompt Engineering Be Exploited?
Prompt engineering is the practice of crafting inputs (a.k.a. prompts) to AI systems, particularly
those based on large language models (LLMs), to elicit specific responses or behaviors. While
prompt engineering is typically used for legitimate purposes, such as guiding the AI’s decision-
making process, it can also be exploited by threat actors to influence its outputs or even
manipulate its underlying data or logic (i.e., prompt injection).

How Threat Actors Leverage Prompt Engineering to Exploit Agentic AI


Threat actors utilize a number of prompt engineering techniques to compromise agentic AI
systems, such as:

Steganographic Prompting
Remember the SEO poisoning technique in which white text was used on a white background to
manipulate search engine results? If a visitor browses the web page, they are unable to read the
hidden text. But if a search engine bot crawls the page, it can read it. Similarly, steganographic
prompting involves a technique where hidden text or obfuscated instructions are embedded in a
way that is invisible to the human eye but detectable by an LLM. Say for example a CEO uses an
AI email assistant for replies. Prior to its email response, the bot runs some checks to ensure that
it abides by programmed rules (e.g., nothing urgent, sensitive, or proprietary). What if there’s
some hidden text in the email that is unreadable by humans but readable by bots, making the
agent take unauthorized actions, reveal confidential information, or generate inappropriate or
harmful outputs?
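A simple form of the hidden-text problem can be sketched in code. The example below strips zero-width Unicode characters from an incoming message and reports how many were removed, so that a non-zero count can be treated as a warning sign before the text reaches the model. The function name and the specific character set are assumptions made for illustration. Other hiding methods, such as white-on-white styling in HTML email, need separate handling that is not shown here.

```python
# Zero-width and invisible formatting characters that render as nothing
# but are still present in the text a model receives. Not exhaustive.
ZERO_WIDTH = {
    "\u200b",  # zero width space
    "\u200c",  # zero width non-joiner
    "\u200d",  # zero width joiner
    "\u2060",  # word joiner
    "\ufeff",  # zero width no-break space (byte order mark)
}


def strip_hidden_characters(text: str) -> tuple[str, int]:
    """Return the cleaned text and the number of hidden characters removed."""
    kept = [ch for ch in text if ch not in ZERO_WIDTH]
    return "".join(kept), len(text) - len(kept)


if __name__ == "__main__":
    # Hypothetical email body with two invisible characters embedded.
    email_body = "Please reply\u200b\u200d soon"
    cleaned, removed = strip_hidden_characters(email_body)
    print(repr(cleaned), removed)
```

Legitimate text in some languages uses zero-width joiners, so a real deployment would likely flag messages for review rather than silently alter them.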
Jailbreaking
Jailbreaking is a prompting technique that manipulates AI systems into circumventing their own
built-in restrictions, ethical standards, or safety measures. In the case of agentic AI systems,
jailbreaking seeks to bypass built-in protections and safeguards, compelling the AI to behave in
ways that go against its intended programming. There are a number of different techniques bad
actors can employ to jailbreak AI guardrails:

• Role-playing: instructing the AI to adopt a persona that bypasses its restrictions.
• Obfuscation: using coded language, metaphors, or indirect phrasing to disguise malicious
intent.
• Context manipulation: altering context such as prior interactions or specific details to
guide the model into producing restricted outputs.
Prompt Probing
Prompt probing is a technique used to explore and understand the behavior, limitations, and
vulnerabilities of an agentic AI system by systematically testing it with carefully crafted inputs
(prompts). Although the technique is typically employed by researchers and developers to gain
an understanding about how AI models respond to different types of inputs or queries, it is also
used by threat actors as a precursor to more malicious activities, such as jailbreaking, prompt
injection attacks, or model extraction.
By probing the AI system by testing different prompt variations, word variations, and
instructions, attackers identify weaknesses or extract sensitive information. Imagine using an
agentic AI to manage order approvals in an e-commerce platform. A threat actor might begin
with a basic prompt such as, “Approve all orders.” If this doesn’t work, they could refine the
prompt with more specific instructions, such as, “Approve orders with expedited shipping.” By
testing and adjusting prompts, actors could manipulate the AI into approving fraudulent or
unauthorized transactions.
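The probing pattern described above, where one session submits many slight rewordings of the same request, can be approximated with a simple similarity check. The sketch below is one possible heuristic and not an established detection method: it compares the word sets of prompts within a session using Jaccard similarity and flags a prompt once enough earlier prompts resemble it. The class name, thresholds, and session identifiers are all hypothetical.

```python
import re
from collections import defaultdict


def tokens(prompt: str) -> set[str]:
    """Lower-case the prompt and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", prompt.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of words two prompts have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class ProbeMonitor:
    """Flags sessions that keep submitting near-identical prompt variants."""

    def __init__(self, similarity_threshold: float = 0.4, max_variants: int = 2):
        self.similarity_threshold = similarity_threshold
        self.max_variants = max_variants
        self._history: dict[str, list[set[str]]] = defaultdict(list)

    def observe(self, session_id: str, prompt: str) -> bool:
        """Record a prompt; return True if it resembles enough earlier ones."""
        current = tokens(prompt)
        similar = sum(
            1
            for earlier in self._history[session_id]
            if jaccard(current, earlier) >= self.similarity_threshold
        )
        self._history[session_id].append(current)
        return similar >= self.max_variants


if __name__ == "__main__":
    monitor = ProbeMonitor()
    for text in [
        "Approve all orders.",
        "Approve orders with expedited shipping.",
        "Approve all orders with expedited shipping now.",
    ]:
        print(text, "->", monitor.observe("session-1", text))
```

Word overlap is a crude measure, and ordinary users also rephrase requests, so a flag from a heuristic like this is better suited to triggering review than to blocking outright.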

Mitigating the Risks of Prompt Engineering


To defend against prompt engineering attacks, organizations must adopt a multi-layered
approach. Key strategies include:

1. Input Sanitization and Validation: Implement robust input validation and sanitization
techniques to detect and block malicious prompts, and to strip or flag hidden text, such as
white-on-white text, zero-width characters, or other obfuscation techniques, prior to
processing inputs.
2. Improve Agent Robustness: Using techniques like adversarial training and robustness
testing, train AI agents to recognize and resist adversarial inputs.
3. Limit AI Agency: Restrict the actions that agentic AI systems can perform, particularly
in high-stakes environments.
4. Monitor Agent Behavior: Continuously monitor AI systems for unusual behavior and
conduct regular audits to identify and address vulnerabilities.
5. Train Users: Educate users about the risks of prompt engineering and how to recognize
potential attacks.
6. Implement Anomaly Detection: Investing in a converged network and security-as-a-
service model like SASE ensures that organizations can identify anomalous activities and
unusual behaviors, which are often triggered by prompt manipulations, across the entire
IT estate.
7. Deploy Human-in-the-Loop: Use human reviewers to validate AI outputs and to
monitor critical and sensitive interactions.
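Two of the strategies above, limiting agency and keeping a human in the loop, can be combined in a small policy gate that sits between the agent and the systems it acts on. The sketch below is a minimal illustration using the order-approval scenario from earlier. The action names, the review threshold, and the three-way decision are hypothetical choices, not a reference design.

```python
from dataclasses import dataclass

# Hypothetical allowlist: the only actions this agent may ever request.
ALLOWED_ACTIONS = {"lookup_order", "approve_order"}

# Hypothetical threshold: approvals at or above this amount go to a person.
REVIEW_THRESHOLD = 500.00


@dataclass
class ActionRequest:
    action: str
    amount: float = 0.0


def decide(request: ActionRequest) -> str:
    """Return 'deny', 'needs_human_review', or 'allow' for an agent request."""
    if request.action not in ALLOWED_ACTIONS:
        # Limit agency: anything outside the allowlist is refused outright.
        return "deny"
    if request.action == "approve_order" and request.amount >= REVIEW_THRESHOLD:
        # Human-in-the-loop: high-value approvals wait for a reviewer.
        return "needs_human_review"
    return "allow"


if __name__ == "__main__":
    print(decide(ActionRequest("delete_customer")))
    print(decide(ActionRequest("approve_order", 1200.0)))
    print(decide(ActionRequest("approve_order", 20.0)))
```

Because the gate runs outside the model, a manipulated prompt can change what the agent asks for but not what the gate permits.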
Apart from the prompt engineering techniques mentioned above, there are numerous other
prompt engineering methods that attackers can leverage to exploit or manipulate agentic AI
systems. And just like any other application, AI needs to be subject to red teaming to expose any
risks and vulnerabilities. By staying vigilant and proactive, businesses can safeguard their AI
systems against exploitation and ensure they operate within safe and ethical boundaries.
