The functions in this project use C unions to make a simple UTF-8 codec algorithm easier to read and reason about.
Unlike the members of a struct, the members of a union share a single block of memory whose size is that of the largest member. Because a union can hold members of any type, an array of smaller integers can overlay the storage of a larger integer.
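As a rough illustration of the difference, the sketch below declares a struct and a union with the same two members. The names `pair_struct`, `pair_union`, and `fill_bytes` are made up for this example and are not taken from the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types, for illustration only. */
struct pair_struct {
    uint32_t whole;     /* has its own storage */
    uint8_t  parts[4];  /* stored separately, after `whole` */
};

union pair_union {
    uint32_t whole;     /* shares its storage with `parts` */
    uint8_t  parts[4];
};

/* Write every byte through the array member, then read the
 * same storage back through the integer member. */
static uint32_t fill_bytes(uint8_t byte)
{
    union pair_union u;
    for (size_t i = 0; i < sizeof u.parts; i++)
        u.parts[i] = byte;
    return u.whole;
}
```

On common platforms `sizeof(union pair_union)` equals `sizeof(uint32_t)`, while `sizeof(struct pair_struct)` is at least twice that. Filling all four bytes with `0xFF` through `parts` makes `whole` read as `0xFFFFFFFF`, which shows that both members refer to the same memory.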
This is the main idea implemented here. A 32-bit integer (the codepoint) is logically split into four 8-bit integers (the octets). The algorithm can then read the whole value or only the octets it needs, which simplifies the bit manipulation and makes the code more readable and intuitive.
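A minimal sketch of what such a type might look like follows. The names `utf8_view`, `low_octet_index`, and `low_octet` are assumptions made for this example; the project's actual definitions may differ. One caveat the sketch makes explicit: which array index holds the least significant octet depends on the machine's byte order, so the example probes for it at run time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: one codepoint viewed either as a whole
 * 32-bit value or as four separate octets. */
typedef union {
    uint32_t codepoint;
    uint8_t  octets[4];
} utf8_view;

/* Index of the least significant octet on this machine:
 * 0 on little-endian hosts, 3 on big-endian hosts. */
static int low_octet_index(void)
{
    utf8_view probe = { .codepoint = 1 };
    return probe.octets[0] == 1 ? 0 : 3;
}

/* Read the least significant octet through the array member
 * instead of masking the integer with `cp & 0xFF`. */
static uint8_t low_octet(uint32_t cp)
{
    utf8_view v = { .codepoint = cp };
    return v.octets[low_octet_index()];
}
```

For the codepoint U+20AC, `low_octet(0x20AC)` yields `0xAC`, the same result as `0x20AC & 0xFF`, but the intent (pick one octet) is stated by the member access rather than by a mask.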