20 Oct 25
A technique for deterministic labeling from stochastic models, with a benchmarked Go implementation.
13 Oct 25
Wikipedia’s high-level guidance noting some ways to identify LLM writing
11 Oct 25
09 Oct 25
03 Oct 25
Claude Code is the most delightful AI agent/workflow I have used so far. Not only does it make targeted edits or vibe coding throwaway tools less annoying, …
01 Oct 25
Wikipedia’s high-level guidance noting some ways to identify LLM writing
30 Sep 25
Wikipedia’s high-level guidance noting some ways to identify LLM writing
25 Sep 25
via: https://news.ycombinator.com/item?id=45362813
22 Sep 25
In a landmark study, OpenAI researchers reveal that large language models will always produce plausible but false outputs, even with perfect data, due to fundamental statistical and computational limits.
20 Sep 25
19 Sep 25
15 Sep 25
This leaderboard ranks LLMs based on their performance in Valyrian Games competitions. Models are ranked using the TrueSkill rating system, which accounts for win/loss records and the relative skill of opponents.
Benchmark of smartphone interaction for LLMs and multi-agent systems
14 Sep 25
13 Sep 25
11 Sep 25
A map/reduce workflow for LLMs, with what looks like local caching. To me, build systems and data processing pipelines like this one overlap substantially.