linkhut

10 Jul 25

Bit Twiddling Hacks

https://graphics.stanford.edu/~seander/bithacks.html

Lovely, old-Internet notes on bit-level tricks with all kinds of uses. I really like these for implementing “scalar SIMD” optimizations (i.e. treating a 64-bit integer as a vector of eight 8-bit lanes).
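
As a rough illustration of what “scalar SIMD” means here, a minimal Python sketch follows. The function names are made up for this note. The zero-byte test is the well-known one from the linked page; the lane-wise add is a common SWAR idiom and is only an example of the style, not something the note above claims to use. Python integers are unbounded, so the 64-bit width is simulated with masks.

```python
# Constants that address all eight 8-bit lanes of a 64-bit word at once.
MASK64 = (1 << 64) - 1
ONES = 0x0101010101010101   # 0x01 in every lane
HIGHS = 0x8080808080808080  # top bit of every lane
LOWS = 0x7F7F7F7F7F7F7F7F   # low seven bits of every lane


def has_zero_byte(v):
    """True if any 8-bit lane of the 64-bit value v is zero."""
    # Subtracting 1 from a lane sets its top bit only if the lane was 0
    # (or already >= 0x80, which the `& ~v` term filters out).
    return ((v - ONES) & ~v & HIGHS) != 0


def add_bytes(a, b):
    """Add eight 8-bit lanes at once, each wrapping modulo 256."""
    # Add the low 7 bits of each lane (cannot carry into the next lane),
    # then fold in the top bits with XOR so no carry crosses a lane.
    return (((a & LOWS) + (b & LOWS)) ^ ((a ^ b) & HIGHS)) & MASK64


def lanes(v):
    """Unpack a 64-bit value into its eight lanes, lowest lane first."""
    return list(v.to_bytes(8, "little"))


def pack(values):
    """Pack eight integers in 0..255 into one 64-bit value."""
    return int.from_bytes(bytes(values), "little")
```

For example, `lanes(add_bytes(pack([250, 1, 2, 3, 4, 5, 6, 255]), pack([10] * 8)))` gives `[4, 11, 12, 13, 14, 15, 16, 9]`: the first and last lanes wrap around without disturbing their neighbours.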

by bal4e 7 months ago saved 2 times

Tags:

optimization

27 Jun 25

OKSolar

https://meat.io/oksolar

A variant of Solarized that feels more uniform in places. The ideas of custom background colors and of tweaking the palette following OKLab are both quite interesting.

by bal4e 7 months ago

02 May 25

Compiler Optimizations Are Hard Because They Forget

https://faultlore.com/blah/oops-that-was-important/

A fun read describing an important (and seemingly inherent) difficulty of compiler optimization. E-graphs get around this, but IIRC are significantly slower.

by bal4e 9 months ago saved 3 times

30 Apr 25

Co-dfns versus BQN's implementation

https://mlochbaum.github.io/BQN/implementation/codfns.html

A really interesting discussion of array-oriented compilation architectures. Tries to answer the same sorts of questions I’ve been asking myself about making compilers faster – although I think I come to different conclusions right now.

by bal4e 9 months ago

21 Apr 25

The Futhark Programming Language

https://futhark-lang.org/

For me, this fills a gap between APL (high-performance CPU/GPU array manipulation) and Rust (strong type checking). Super interesting stuff.

by bal4e 9 months ago saved 2 times

20 Apr 25

Keyoxide

https://keyoxide.org/

Interesting concept, and they seem to have a good number of integrations already. It was a bit hard to find the actual documentation for how it works, but it otherwise looks cool.

by bal4e 9 months ago saved 3 times

A Data Parallel Compiler Hosted on the GPU

https://web.archive.org/web/20240706132515/https://scholarworks.iu.edu/dspace/bitstream/2022/24749/1/Hsu%20Dissertation.pdf

Amazing dissertation by Aaron Wen-yao Hsu, demonstrating a novel memory layout for ASTs and advocating for bottom-up traversal patterns. The fact that the compiler is just 17 lines of APL (which I can’t read in the slightest) is even cooler.
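
To make "flat AST plus bottom-up traversal" a bit more concrete, here is a toy Python sketch. It is an assumption-laden illustration, not the dissertation's actual layout or code: the column names (`kinds`, `values`, `parent`) and the tiny expression language are invented for this note. The tree is stored as parallel arrays in preorder, with each node holding the index of its parent, so a single reverse pass visits every child before its parent.

```python
# The expression (1 + 2) * 3, flattened in preorder.
# Hypothetical columns: node kind, literal value, and index of the parent.
kinds = ["mul", "add", "lit", "lit", "lit"]
values = [None, None, 1, 2, 3]
parent = [0, 0, 1, 1, 0]  # the root (index 0) points at itself


def evaluate(kinds, values, parent):
    """Evaluate the flat tree with one bottom-up (reverse preorder) pass."""
    # Start every operator node at the identity of its operation.
    acc = [1 if kind == "mul" else 0 for kind in kinds]
    for i in range(len(kinds) - 1, -1, -1):
        if kinds[i] == "lit":
            acc[i] = values[i]
        p = parent[i]
        if p == i:
            continue  # root: nothing to fold into
        # Children have larger indices than their parent in preorder,
        # so acc[i] is already complete when it is folded upward.
        if kinds[p] == "add":
            acc[p] += acc[i]
        else:
            acc[p] *= acc[i]
    return acc[0]
```

With the arrays above, `evaluate(kinds, values, parent)` returns `9`. There are no pointers and no recursion: the traversal is a loop over arrays, which is the property that makes this style friendly to data-parallel execution.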

by bal4e 9 months ago

RVSDG: An Intermediate Representation for Optimizing Compilers

https://arxiv.org/abs/1912.05036

I’ve been looking for a good replacement for SSA form for mid-level and low-level optimisations. This feels like the right direction, but I need to try it myself before I’m convinced.

by bal4e 9 months ago

Efficient E-Matching for Super Optimizers

https://blog.vortan.dev/ematching/

Discussion of super-optimization based on equality graphs and finding equivalent expressions. Contrast with https://egraphs-good.github.io/.

by bal4e 9 months ago saved 2 times

Deus Lex Machina

https://validark.dev/posts/deus-lex-machina/

Interesting study of vectorizing the tokenization of a complex language.

by bal4e 9 months ago

12 Apr 25

Battling the Prefetcher: Exploring Coffee Lake (Part 1)

https://www.abertschi.ch/blog/2022/prefetching/

Lovely visual summaries of how prefetching works on modern Intel chips. I would guess that older chips follow similar patterns. Great for designing data structures and optimizing memory traversal.

by bal4e 10 months ago saved 2 times

A Novel Hybrid Quicksort Algorithm Vectorized using AVX-512 on Intel Skylake

https://ar5iv.labs.arxiv.org/html/1704.08579

I’ve thought about vectorized sorting for a while, but I didn’t know that bitonic sorts can be implemented efficiently on SIMD registers. This is an interesting approach.
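
For reference, a minimal scalar sketch of a bitonic sorting network in Python. This is the textbook network, not the AVX-512 code from the paper, and the function name is my own. The comparison pattern depends only on the indices and never on the data, and the compare-exchanges within one inner stage touch disjoint pairs, which is why a stage can plausibly be mapped to vector min/max plus a shuffle.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with a bitonic network."""
    n = len(values)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    a = list(values)
    k = 2  # size of the bitonic sequences being merged
    while k <= n:
        j = k // 2  # distance between compared elements
        while j >= 1:
            # Every pair (i, i ^ j) in this stage is independent of the others.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    # Blocks of size k alternate between ascending and descending.
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a
```

For example, `bitonic_sort([5, 1, 4, 8, 2, 7, 3, 6])` returns `[1, 2, 3, 4, 5, 6, 7, 8]`.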

by bal4e 10 months ago