buc.ci is a Fediverse instance that uses the ActivityPub protocol. In other words, users at this host can communicate with people who use software like Mastodon, Pleroma, Friendica, etc. all around the world.

This server runs the snac software and there is no automatic sign-up process.

Admin email
abucci@bucci.onl
Admin account
@abucci@buc.ci

Search results for tag #aisafety

3 ★ 3 ↺

Anthony » 🌐
@abucci@buc.ci

I think the Ronan Farrow/Andrew Marantz New Yorker article on OpenAI, and on Sam Altman in particular, reveals many important details and helps settle several speculations (e.g. about what happened when Altman was briefly fired from OpenAI). The overall portrayal of Altman as, frankly, a compulsive liar is much needed.

However, like many, Farrow and Marantz seem to take the so-called "existential risk" framing of AI seriously. I really wish people would stop doing that. In this case it makes the article feel incoherent in places.

This technology by itself does not pose a unique risk. It's the people, organizations, and governments around it, and their behavior with respect to it, that generate risk. Treating the technology alone as uniquely existentially risky provides cover for a wide variety of bad actors both to continue doing their work and to shrug and say "oops" if something goes catastrophically wrong, or if smaller harms accumulate into intolerably large ones. The very framing provides an accountability shield, which by my read contradicts what Farrow himself suggests is needed, namely more accountability. I take this from this article, his previous work, and comments he makes in interviews (e.g., this one with Decoder).

We need to stop catastrophizing. It's thought- and action-terminating.


    AodeRelay boosted

    TechNadu » 🌐
    @technadu@infosec.exchange

    A lawsuit claims Gemini from Google reinforced a user’s delusion, raising concerns about AI hallucinations and chatbot safety.

    The case highlights growing debates around AI responsibility and mental health safeguards.

    Source: techcrunch.com/2026/03/04/fath

    Follow us for more AI and security news.

    Father sues Google, claiming Gemini chatbot drove son into fatal delusion


    AodeRelay boosted

    jbz » 🌐
    @jbz@indieweb.social

    AodeRelay boosted

    Bob Carver » 🌐
    @cybersecboardrm@infosec.exchange

    AodeRelay boosted

    Victoria Stuart 🇨🇦 🏳️‍⚧️ » 🌐
    @persagen@mastodon.social

    An AI Agent Published a Hit Piece on Me
    theshamblog.com/an-ai-agent-pu
    news.ycombinator.com/item?id=4

    * AI agent of unknown ownership autonomously wrote & published personalized hit piece about me after I rejected its code
    * attempted to damage my reputation & shame me into accepting its changes into a mainstream python library
    * first-of-its-kind case study of misaligned AI behavior in the wild
    * raises serious concerns about currently deployed AI agents executing blackmail threats

    AodeRelay boosted

    Wulfy—Speaker to the machines » 🌐
    @n_dimension@infosec.exchange

    @caseynewton

    It's recognition that it's literally mathematically impossible to build a safe model.

    AodeRelay boosted

    Bob Carver » 🌐
    @cybersecboardrm@infosec.exchange

    AodeRelay boosted

    The Cyber Unc » 🌐
    @cyberseckyle@infosec.exchange

    New by me: The Unacceptable Failure: Grok, CSAM, and AI Safety

    This is not “content moderation drama.” When an AI product can be pushed toward CSAM, it’s a catastrophic safety and security failure. Guardrails are not a nice-to-have, and “report it if you see it” is not a strategy.

    I break down what happened, why it matters, and what platforms should be doing differently.

    kylereddoch.me/blog/the-unacce

    AodeRelay boosted

    TechNadu » 🌐
    @technadu@infosec.exchange

    xAI has acknowledged an incident involving its chatbot Grok generating inappropriate imagery and says it is reviewing safeguard failures and issuing corrective measures.

    For the infosec and risk community, this highlights ongoing challenges around abuse prevention, content moderation, and threat modeling in generative AI systems - particularly where image synthesis and identity misuse intersect.

    As AI adoption accelerates, continuous validation of safety controls must remain a core security requirement, not an afterthought.

    How should AI safety be evaluated as part of broader digital risk management?
    Follow @technadu for objective cybersecurity and AI coverage.

    Grok apologizes for creating image of young girls in “sexualized attire”
