It's kind of surprising to me that "basically LZ77" -- copying bytes from earlier in the decompressed text -- is still approximately the best model people have for general-purpose data compression.
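A minimal sketch of that "copying bytes from earlier in the decompressed text" idea — the token format here is made up for illustration (real formats like DEFLATE pack these differently), but the copy loop, including the overlapping-copy trick that lets a short match expand into a long run, is the essential LZ77 move:

```python
# Toy LZ77-style decoder (hypothetical token format, not any real container).
# Each token is ("lit", byte) or ("copy", distance, length); a copy reaches
# back `distance` bytes into the output produced so far.
def lz77_decode(tokens):
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, dist, length = tok
            # Byte-at-a-time so a copy may overlap its own output.
            for _ in range(length):
                out.append(out[-dist])
    return bytes(out)

# "abc" plus an overlapping copy yields a repeated run:
tokens = [("lit", ord("a")), ("lit", ord("b")), ("lit", ord("c")),
          ("copy", 3, 6)]
print(lz77_decode(tokens))  # b'abcabcabc'
```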
Simpson's paradox variant:
(1, A, red, soft)
(1, A, red, hard)
(2, B, red, soft)
(10, A, red, hard)
(9, B, blue, hard)
(10, A, blue, hard)
On average, As have the same value as Bs (5.5 each). Among either red or blue things, As are higher; among either hard or soft things, Bs are higher.
"[program] consists of a single small executable without external dependencies."
It's too bad that when I see a statement like that I can be almost certain the program is written in Go. I wish this were the norm.
Why is CRC called cyclic when it's computed for an input of arbitrary length? Doesn't "cyclic" only apply for n-bit inputs, when the generator polynomial divides x^n - 1?
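For reference, the arbitrary-length computation is just long division of the message polynomial by the generator polynomial over GF(2) — a sketch using the CRC-8 generator x^8 + x^2 + x + 1 (my choice; no init/reflect/xorout refinements):

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Remainder of M(x) * x^8 divided by the generator, bit by bit."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out the top bit; if it was set, subtract (XOR) the generator.
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

# Division property: appending the remainder makes the message divisible
# by the generator, so re-checking yields zero.
msg = b"hello"
assert crc8(msg + bytes([crc8(msg)])) == 0
```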
Why is it that (linear block) erasure codes are easy -- just choose any appropriate matrix over GF(2^n) -- but general error-correcting codes are much trickier to implement efficiently, and more restricted?
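The degenerate end of that "choose a matrix" recipe is a single XOR parity share — a [k+1, k] code over GF(2) that survives any one erasure. A toy sketch (function names are mine; real systems use bigger matrices over GF(2^8) to tolerate more losses):

```python
from functools import reduce

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encode(shares):
    """Append one parity share: the XOR of all equal-length data shares."""
    return shares + [reduce(xor_bytes, shares)]

def recover(shares, missing_index):
    """Rebuild the share at missing_index by XORing the survivors."""
    survivors = [s for i, s in enumerate(shares) if i != missing_index]
    return reduce(xor_bytes, survivors)

data = [b"abcd", b"efgh", b"ijkl"]
coded = encode(data)
assert recover(coded, 1) == b"efgh"  # any single lost share comes back
```

The erasure case is easy precisely because the receiver knows *which* positions are missing, so decoding is just solving a linear system; error correction must also locate the corrupted positions, which is where the extra machinery comes in.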
All write-optimized data structures are alike; each read-optimized data structure is optimized in its own way.
Happy to see that GFNI (specifically GF2P8AFFINEQB) will be supported on AVX10 chips!
Proposal: Rename "obstruction-free"/"lock-free"/"wait-free" to "deadlock-free"/"livelock-free"/"starvation-free".
Even if you're not explicitly using index notation, I'll have a much easier time reading matrix indexing if you use lower/upper indices, where applicable, to indicate columns/rows, rather than lower indices everywhere. Just write "A_i^j" for the ith column, jth row!