Pnut compiles a reasonably large subset of C99 to human-readable POSIX shell scripts. It can be used to generate portable shell scripts without having to write shell. Try the web version!
Its main uses are:
- As a transpiler to write portable shell scripts in C.
- As a way to bootstrap a compiler written in C with an executable version that is still human readable (See reproducible builds).
Main features:
- No new language to learn -- C code in, shell code out.
- The generated shell script is easy to read and understand.
- A runtime library including file I/O and dynamic memory allocations.
- A preprocessor (#include, #ifdef, #if, #define MACRO ..., #define MACRO_F(x) ...).
- Integrates easily with existing shell scripts.
The examples directory contains many examples. We invite you to take a look!
Besides compiling itself, Pnut can also compile the Ribbit Virtual Machine, which can run an R4RS Scheme read-eval-print loop (REPL) directly in shell. See repl.sh for the generated shell script.
Pnut can be distributed as the pnut-sh.sh shell script, or compiled to
executable code using a C compiler. Pregenerated shell scripts can be found on
the GitHub releases page.
To compile and install pnut:
> git clone https://github.com/udem-dlteam/pnut.git
> cd pnut
> sudo make install DESTDIR=/usr/local
This installs both pnut-sh.sh and pnut to /usr/local/bin/.
Pnut also supports a native code backend that generates executable code (x86
Linux and macOS for now), which we call pnut-exe. To install pnut-exe and
its shell version pnut-exe.sh:
> sudo make install-pnut-exe DESTDIR=/usr/local
Compilation options can be used to change the generated shell script:
- SH_ANNOTATE=1 includes the original C code in the generated shell script.
- SH_COMPACT_RT=1 reduces the size of the runtime library at the cost of reduced I/O performance.
- SH_FAST=1 makes pnut-sh generate faster shell code by using a faster calling convention.
- MINIMAL=1 supports only the minimal set of C features required to bootstrap pnut, reducing the size of pnut-sh.sh and pnut-exe.sh.
They can be set using make install SH_ANNOTATE=1 ....
The pnut compiler takes a C file path as input and writes the generated
POSIX shell code to stdout.
Here's an example of how to compile a C file using Pnut:
> pnut-sh.sh examples/fib.c > fib.sh # Compile fib.c to a shell script
> chmod +x fib.sh # Make the shell script executable
> ./fib.sh # Run the shell script
Unfortunately, certain C constructs don't map nicely to POSIX shell, which means:
- No support for floating point numbers and unsigned integers.
- goto and switch fallthrough are not supported.
- The address-of (&) operator on local variables is not supported.
- Arrays and structures cannot be stack-allocated or passed by value.
- Function pointers and indirect calls are not supported.
- Arrays of aggregate types (structures or arrays) are not supported.
The #include "file.sh" directive can be used to include shell code in
the generated shell script. This makes it possible to call system utilities from
C code, or to use shell scripts generated by Pnut as a library. See
select-file.c and posix-utils.sh
for how to use this feature.
Because Pnut generates purely POSIX shell code, the generated shell scripts can
run on any POSIX compliant shell. However, certain shells are faster than
others. For faster execution, we recommend ksh, dash or bash.
zsh is also supported but tends to be slower on large programs.
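To see how the shells compare on a particular machine, a generated script can be timed under each one. The following is a minimal sketch, not part of Pnut: the function name time_under_shells is hypothetical, and it assumes date +%s is available (widely supported, though only recently standardized). Resolution is whole seconds, so it only separates shells on scripts that run for a while.

```shell
#!/bin/sh
# Time a Pnut-generated script under each shell named in the text above.
# time_under_shells is a hypothetical helper, not something Pnut provides.
# Usage: time_under_shells ./fib.sh
time_under_shells() {
  script=$1
  for name in ksh dash bash zsh; do
    # Skip shells that are not installed on this machine.
    if ! command -v "$name" >/dev/null 2>&1; then
      printf '%s: not installed\n' "$name"
      continue
    fi
    start=$(date +%s)
    "$name" "$script" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    # Elapsed time in whole seconds, plus the script's exit status.
    printf '%s: %ss (exit status %s)\n' "$name" "$((end - start))" "$status"
  done
}
```

The script's own output is discarded so that only the timing lines are printed, one per shell.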
Because Pnut can be distributed as a human-readable shell script (pnut-sh.sh),
it can serve as the basis for a reproducible build system. Along with a POSIX
compliant shell, pnut-sh.sh is sufficient to bootstrap pnut-exe, which can
then be used to bootstrap TCC. Because TCC can be
used to compile GCC, this makes it possible to bootstrap a fully featured
build toolchain from only human-readable source files and a POSIX shell.
Because pnut-sh.sh is not directly capable of compiling TCC, the pnut-exe
compiler is used as an intermediate step. pnut-exe is implemented in the C
subset supported by pnut-sh, while supporting a large enough subset of C99 to
compile TCC. Because pnut-exe doesn't require much C support, it is also
compatible with other minimal compilers, such as
M2-Planet, allowing its use in a variety
of bootstrapping scenarios.
To bootstrap tcc from pnut-sh.sh, the following steps are taken:
- Compile pnut-exe.c to pnut-exe.sh using pnut-sh.sh. pnut-exe.sh is a shell script that turns C code into machine code.
- Compile pnut-exe.c to pnut-exe using pnut-exe.sh. This version of pnut-exe is an executable and is much faster.
- Compile kit/bintools.c using pnut-exe to produce binary utilities used for bootstrapping.
- Compile TCC using pnut-exe, then recompile it with TCC (a few times) to get the final tcc executable.
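The first two steps above can be sketched as a short shell function. This is an illustration only: the function name bootstrap_pnut_exe is hypothetical, and it assumes pnut-sh.sh and pnut-exe.c sit in the current directory and need no extra flags. In the repository, the kit scripts drive the real process.

```shell
#!/bin/sh
# Sketch of bootstrap steps 1 and 2. File locations are assumptions made for
# illustration; bootstrap_pnut_exe is a hypothetical helper name.
bootstrap_pnut_exe() {
  shell=${1:-/bin/sh}

  # Step 1: pnut-sh.sh (a shell script) compiles pnut-exe.c into pnut-exe.sh,
  # which is itself a shell script that turns C code into machine code.
  "$shell" pnut-sh.sh pnut-exe.c > pnut-exe.sh || return 1

  # Step 2: pnut-exe.sh compiles the same source into a native executable.
  "$shell" pnut-exe.sh pnut-exe.c > pnut-exe || return 1
  chmod +x pnut-exe
}
```

Calling bootstrap_pnut_exe /bin/sh leaves pnut-exe.sh and an executable pnut-exe in the current directory. Both compilers write their output to stdout, which is why each step is a plain redirection.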
The ./kit/setup-rootfs.sh script can be used to create an isolated environment
where the first 3 steps can be performed:
# Make jammed.sh archive (self-extracting shell script)
> make kit/jammed.sh
# Setup isolated root filesystem in "island" directory
> ./kit/setup-rootfs.sh --dir "island" --path-to-jammed kit/jammed.sh --include-utils
To enter the isolated environment and run the bootstrap process, use:
# Enter the chroot environment
> sudo chroot island /bin/bash
# List files in the chroot
bash-5.3$ . ls.sh
# Extract files to initiate bootstrap
bash-5.3$ . jammed.sh
# Run the bootstrap process from the extracted files
bash-5.3$ . bootstrap.sh
After the bootstrap.sh script finishes, the pnut-exe executable will be
installed in /usr/bin/, along with the following utilities: chmod, cp,
mkdir, sha256sum, simple-patch, ungz. These tools can then be used to
compile TCC from source, as is done in
live-bootstrap. Work to extend
the bootstrap script to include TCC and GCC is ongoing.
pnut-sh can include C code annotations in the generated shell scripts (with
the SH_ANNOTATE=1 makefile option) to make them self-contained and easier to
audit. These annotations correspond to the original C source code with the
lines inside inactive #if/#ifdef blocks removed, and each top-level shell
declaration is prefixed with its corresponding C code as a comment.
This feature is especially interesting when applied to pnut-sh.sh, as it turns
it into a "quine-like" program: from the annotated shell script, the original C
source code can be extracted and recompiled to obtain the original pnut-sh.sh
shell script, closing the loop and confirming that the generated shell script
matches the embedded C code. This can be done with the following commands:
# Generate pnut-sh.sh with annotations
> make pnut-sh.sh SH_ANNOTATE=1
# Extract C code
> /bin/sh build/pnut-sh.sh -C build/pnut-sh.sh > build/pnut-sh.c
# Recompile C code
> /bin/sh build/pnut-sh.sh build/pnut-sh.c > build/pnut-sh-from-annotations.sh
# Verify both shell scripts match
> diff build/pnut-sh.sh build/pnut-sh-from-annotations.sh
A similar process can be done with pnut-exe.sh, this time producing an
executable version of pnut-exe:
# Generate pnut-exe.sh with annotations
> make pnut-exe.sh SH_ANNOTATE=1
# Extract C code
> /bin/sh build/pnut-exe.sh -C build/pnut-exe.sh > build/pnut-exe.c
# Recompile C
> /bin/sh build/pnut-exe.sh build/pnut-exe.c > build/pnut-exe-from-annotations
# Let's compare with the "regular" bootstrapped version
> make pnut-exe-bootstrapped
# Verify both executables match
> diff build/pnut-exe-from-annotations build/pnut-exe-bootstrapped
You can find the slides from my presentation at the Software Language Engineering
(SLE24) conference, and the paper the presentation is based on, in the doc/
directory.
Pnut is a research project and contributions are welcome. Please open an issue to report any bugs or to discuss new features.
To check your changes, it is good practice to compile pnut with itself. This can be done with:
- make bootstrap-pnut-sh BOOTSTRAP_SHELL=/bin/sh: to bootstrap pnut-sh.sh with pnut-sh
- make bootstrap-pnut-exe-script BOOTSTRAP_SHELL=/bin/sh: to bootstrap pnut-exe.sh with pnut-sh
- make bootstrap-pnut-exe BOOTSTRAP_SHELL=/bin/sh: to bootstrap pnut-exe with pnut-exe.sh
- make bootstrap-pnut-exe-no-shell: to bootstrap pnut-exe with pnut-exe
See make bootstrap-help for more details.
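Since the generated scripts are meant to run on any POSIX shell, it may be worth repeating the pnut-sh bootstrap under every shell installed locally. The sketch below assumes BOOTSTRAP_SHELL accepts the path of any of these shells, which the text above only shows for /bin/sh. The function name bootstrap_all_shells and the MAKE override are hypothetical and exist only so the loop can be dry-run.

```shell
#!/bin/sh
# Run the bootstrap-pnut-sh target once per installed shell.
# bootstrap_all_shells is a hypothetical helper; MAKE is an override added
# for dry runs (for example MAKE=echo), not a Pnut option.
bootstrap_all_shells() {
  for name in dash bash ksh zsh; do
    # Resolve the shell's path, skipping shells that are not installed.
    path=$(command -v "$name") || continue
    printf 'bootstrapping with %s\n' "$path"
    # Stop at the first shell under which the bootstrap fails.
    ${MAKE:-make} bootstrap-pnut-sh BOOTSTRAP_SHELL="$path" || return 1
  done
}
```

Running it from the repository root with MAKE=echo set prints the make invocations without executing them, which is a cheap way to check the list of shells before starting a long bootstrap.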