linkhut

05 Sep 25

How big are our embeddings now and why? • Buttondown

https://newsletter.vickiboykis.com/archive/how-big-are-our-embeddings-now-and-why/

via: https://lobste.rs/s/jdqoem/how_big_are_our_embeddings_now_why

by silas 5 months ago

04 Sep 25

lechmazur/generalization: Thematic Generalization Benchmark

https://github.com/lechmazur/generalization

Measures how effectively various LLMs can infer a narrow or specific “theme” (category/rule) from a small set of examples and anti-examples, then detect which item truly fits that theme among a collection of misleading candidates.

by cos 5 months ago

lechmazur/step_game: Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure

https://github.com/lechmazur/step_game

A multi-player “step-race” that challenges LLMs to engage in public conversation before secretly picking a move (1, 3, or 5 steps). Whenever two or more players choose the same number, all colliding players fail to advance.

by cos 5 months ago

02 Sep 25

Selective Temporal Training

https://www.robinsloan.com/lab/selective-temporal-training/

by silas 5 months ago

Tags: ai, llm

What's an old AI model worth?

https://www.robinsloan.com/lab/old-models-2/

by silas 5 months ago

Vibe Coding Terminal Editor

https://matklad.github.io/2025/08/31/vibe-coding-terminal-editor.html

by silas 5 months ago

31 Aug 25

The perils of vibe coding

https://simonwillison.net/2025/Aug/29/the-perils-of-vibe-coding/

by silas 5 months ago

Tags: ai, llm

The Future of Forums is Lies, I Guess

https://aphyr.com/posts/389-the-future-of-forums-is-lies-i-guess

by silas 5 months ago saved 3 times

30 Aug 25

Cerebras

https://www.cerebras.ai/

by shubxam 5 months ago

28 Aug 25

Piloting Claude for Chrome

https://simonwillison.net/2025/Aug/26/piloting-claude-for-chrome/

by silas 5 months ago

Tags: ai, llm

27 Aug 25

Texts as Toys

https://contraptions.venkateshrao.com/p/texts-as-toys

by silas 5 months ago

26 Aug 25

DeepSeek 3.1

https://simonwillison.net/2025/Aug/22/deepseek-31/

by silas 5 months ago

Tags: ai, llm

20 Aug 25

Import AI 425: iPhone video generation; subtle misalignment; making open weight models safe through surgical deletion | Import AI

https://jack-clark.net/2025/08/18/import-ai-425-iphone-video-generation-subtle-misalignment-making-open-weight-models-safe-through-surgical-deletion/

by silas 5 months ago

Tags: llm, ai

due diligence

https://blog.ayjay.org/due-diligence/

by silas 5 months ago

a word to my students

https://blog.ayjay.org/a-word-to-my-students/

by silas 5 months ago

AGENTS.md

https://agents.md/

Finally, some convergence. Weird that OpenCode is not mentioned, though.

by dvejmz 5 months ago saved 2 times

AI is impressive because we’ve failed at semantic web and personal computing | exotext

https://rakhim.exotext.com/ai-is-impressive-because-we-ve-failed-at-semantic-web-and-personal-computing

by silas 5 months ago

Tags: ai, llm

My Workflow to Review Articles with LLMs

https://www.binwang.me/2025-08-15-My-Workflow-to-Review-Articles-with-LLMs.html

by silas 5 months ago

Tags: llm

AGENTS.md

https://agents.md/

Finally, some convergence. Weird that OpenCode is not mentioned, though.

by sebastien 5 months ago saved 2 times

19 Aug 25

gpt-oss is not for developers. It’s for agents. | Tigris Object Storage

https://www.tigrisdata.com/blog/gpt-oss/

Discover why OpenAI’s gpt-oss model family is ideal for building reliable and safe AI agents, not just for developers.

by chrisSt 5 months ago
