🥜 Pnut: A Self-Compiling C Transpiler Targeting Human-Readable POSIX Shell

Pnut compiles a reasonably large subset of C99 to human-readable POSIX shell scripts. It can be used to generate portable shell scripts without having to write shell. Try the web version!

Its main uses are:

  • As a transpiler to write portable shell scripts in C.
  • As a way to bootstrap a compiler written in C with an executable version that is still human-readable (see Reproducible Builds).

Main features:

  • No new language to learn -- C code in, shell code out.
  • The human-readable shell script is easy to read and understand.
  • A runtime library including file I/O and dynamic memory allocations.
  • A preprocessor (#include, #ifdef, #if, #define MACRO ..., #define MACRO_F(x) ...).
  • Integrates easily with existing shell scripts.

The examples directory contains many examples. We invite you to take a look!

Besides compiling itself, Pnut can also compile the Ribbit Virtual Machine, which can run an R4RS Scheme read-eval-print loop (REPL) directly in shell. See repl.sh for the generated shell script.

Install

Pnut can be distributed as the pnut-sh.sh shell script, or compiled to executable code using a C compiler. Pregenerated shell scripts can be found on the GitHub releases page.

To compile and install pnut:

> git clone https://github.com/udem-dlteam/pnut.git
> cd pnut
> sudo make install DESTDIR=/usr/local

This installs both pnut-sh.sh and pnut to /usr/local/bin/.

Pnut also supports a native code backend that generates executable code (x86 Linux and macOS for now), which we call pnut-exe. To install pnut-exe and its shell version pnut-exe.sh:

> sudo make install-pnut-exe DESTDIR=/usr/local

Compilation Options

Compilation options can be used to change the generated shell script:

  • SH_ANNOTATE=1 includes the original C code in the generated shell script.
  • SH_COMPACT_RT=1 reduces the size of the runtime library at the cost of reduced I/O performance.
  • SH_FAST=1 makes pnut-sh generate faster shell code by using a faster calling convention.
  • MINIMAL=1 supports only the minimal set of C features required to bootstrap pnut, reducing the size of pnut-sh.sh and pnut-exe.sh.

They can be set using make install SH_ANNOTATE=1 ....

How to Use

The pnut compiler takes a C file path as input, and outputs to stdout the POSIX shell code.

Here's an example of how to compile a C file using Pnut:

> pnut-sh.sh examples/fib.c > fib.sh # Compile fib.c to a shell script
> chmod +x fib.sh                    # Make the shell script executable
> ./fib.sh                           # Run the shell script
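For orientation, the input can be as plain as the sketch below. This is a hypothetical file written for illustration, not the contents of the repository's examples/fib.c, and it has not been run through Pnut. The function name fib is an assumption. It sticks to signed integers and a while loop, which stay clear of the unsupported constructs listed under C Language Support.

```c
/* Hypothetical input file; not the repository's examples/fib.c. */

/* Iterative Fibonacci using only signed int arithmetic and a while loop. */
int fib(int n) {
  int a;
  int b;
  int t;

  a = 0;
  b = 1;
  while (n > 0) {
    t = a + b; /* next term of the sequence */
    a = b;
    b = t;
    n = n - 1;
  }
  return a; /* fib(0) is 0, fib(1) is 1 */
}
```

To be compiled into a runnable script, a file like this would also need a main function that calls fib and prints the result.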

C Language Support

Unfortunately, certain C constructs don't map nicely to POSIX shell, which means:

  • No support for floating point numbers and unsigned integers.
  • goto and switch fallthrough are not supported.
  • The address of (&) operator on local variables is not supported.
  • Arrays and structures cannot be stack-allocated or passed by value.
  • Function pointers and indirect calls are not supported.
  • Arrays of aggregate types (structures or arrays) are not supported.
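As a rough illustration of writing C that stays within these limits, the sketch below allocates its array on the heap instead of the stack, avoids taking the address of a local variable, and ends every switch case with a break. The helper names sum_of_squares and coin_value are hypothetical. The code is standard C and has not been tested against Pnut itself, so treat it as a sketch of the style rather than a guarantee of what the compiler accepts.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 0*0 + 1*1 + ... + (n-1)*(n-1),
   or -1 if the allocation fails. */
int sum_of_squares(int n) {
  int *squares; /* heap-allocated, rather than a stack array like `int squares[8]` */
  int i;
  int total;

  squares = malloc(n * sizeof(int));
  if (squares == 0) return -1;

  for (i = 0; i < n; i = i + 1) squares[i] = i * i;

  total = 0;
  for (i = 0; i < n; i = i + 1) total = total + squares[i];

  free(squares);
  return total; /* result returned by value, no pointer to a local needed */
}

/* Hypothetical helper: each case ends with break, so nothing relies on fallthrough. */
int coin_value(int kind) {
  int value;

  switch (kind) {
    case 0:
      value = 1;
      break;
    case 1:
      value = 5;
      break;
    default:
      value = 0;
      break;
  }
  return value;
}
```

The same pattern extends to structures: allocate them with malloc and pass pointers around instead of passing them by value.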

Mixing C and Shell Code

The #include "file.sh" directive can be used to include shell code in the generated shell script. This makes it possible to call system utilities from C code, or to use shell scripts generated by Pnut as a library. See select-file.c and posix-utils.sh for how to use this feature.

Which shell to use

Because Pnut generates purely POSIX shell code, the generated shell scripts can run on any POSIX-compliant shell. However, certain shells are faster than others. For faster execution, we recommend ksh, dash or bash. zsh is also supported but tends to be slower on large programs.

Reproducible Builds

Because Pnut can be distributed as a human-readable shell script (pnut-sh.sh), it can serve as the basis for a reproducible build system. Along with a POSIX compliant shell, pnut-sh.sh is sufficient to bootstrap pnut-exe, which can then be used to bootstrap TCC. Because TCC can be used to compile GCC, this makes it possible to bootstrap a fully featured build toolchain from only human-readable source files and a POSIX shell.

Because pnut-sh.sh is not directly capable of compiling TCC, the pnut-exe compiler is used as an intermediate step. pnut-exe is implemented in the C subset supported by pnut-sh, while supporting a large enough subset of C99 to compile TCC. Because pnut-exe doesn't require much C support, it is also compatible with other minimal compilers, such as M2-Planet, allowing its use in a variety of bootstrapping scenarios.

To bootstrap tcc from pnut-sh.sh, the following steps are taken:

  1. Compile pnut-exe.c to pnut-exe.sh using pnut-sh.sh. pnut-exe.sh is a shell script that turns C code into machine code.
  2. Compile pnut-exe.c to pnut-exe using pnut-exe.sh. This version of pnut-exe is an executable and is much faster.
  3. Compile the kit/bintools.c using pnut-exe to produce binary utilities used for bootstrapping.
  4. Compile TCC using pnut-exe, then recompile it with TCC (a few times) to get the final tcc executable.

The ./kit/setup-rootfs.sh script can be used to create an isolated environment where the first 3 steps can be performed:

# Make jammed.sh archive (self-extracting shell script)
> make kit/jammed.sh
# Setup isolated root filesystem in "island" directory
> ./kit/setup-rootfs.sh --dir "island" --path-to-jammed kit/jammed.sh --include-utils

To enter the isolated environment and run the bootstrap process, use:

# Enter the chroot environment
> sudo chroot island /bin/bash
# List files in the chroot
bash-5.3$ . ls.sh
# Extract files to initiate bootstrap
bash-5.3$ . jammed.sh
# Run the bootstrap process from the extracted files
bash-5.3$ . bootstrap.sh

After the bootstrap.sh script finishes, the pnut-exe executable will be installed in /usr/bin/, along with the following utilities: chmod, cp, mkdir, sha256sum, simple-patch, ungz. These tools can then be used to compile TCC from source, as is done in live-bootstrap. Work to extend the bootstrap script to include TCC and GCC is ongoing.

Annotated Shell Scripts

pnut-sh can include C code annotations in the generated shell scripts (with the SH_ANNOTATE=1 makefile option) to make them self-contained and easier to audit. These annotations correspond to the original C source code with the lines inside inactive #if/#ifdef blocks removed; each top-level shell declaration is prefixed with its corresponding C code as a comment.

This feature is especially interesting when applied to pnut-sh.sh, as it turns it into a "quine-like" program: from the annotated shell script, the original C source code can be extracted and recompiled to obtain the original pnut-sh.sh shell script, closing the loop and confirming that the generated shell script matches the embedded C code. This can be done with the following commands:

# Generate pnut-sh.sh with annotations
> make pnut-sh.sh SH_ANNOTATE=1
# Extract C code
> /bin/sh build/pnut-sh.sh -C build/pnut-sh.sh > build/pnut-sh.c
# Recompile C code
> /bin/sh build/pnut-sh.sh build/pnut-sh.c > build/pnut-sh-from-annotations.sh
# Verify both shell scripts match
> diff build/pnut-sh.sh build/pnut-sh-from-annotations.sh

A similar process can be done with pnut-exe.sh, this time producing an executable version of pnut-exe:

# Generate pnut-exe.sh with annotations
> make pnut-exe.sh SH_ANNOTATE=1
# Extract C code
> /bin/sh build/pnut-exe.sh -C build/pnut-exe.sh > build/pnut-exe.c
# Recompile C
> /bin/sh build/pnut-exe.sh build/pnut-exe.c > build/pnut-exe-from-annotations
# Let's compare with the "regular" bootstrapped version
> make pnut-exe-bootstrapped
# Verify both executables match
> diff build/pnut-exe-from-annotations build/pnut-exe-bootstrapped

Documentation

You can find my slides from my presentation at the Software Language Engineering (SLE24) conference, and the paper the presentation is based on, in the doc/ directory.

Contributing

Pnut is a research project and contributions are welcome. Please open an issue to report any bugs or to discuss new features.

To check that your changes don't break anything, it is good practice to compile pnut with itself. This can be done with:

  1. make bootstrap-pnut-sh BOOTSTRAP_SHELL=/bin/sh: to bootstrap pnut-sh.sh with pnut-sh
  2. make bootstrap-pnut-exe-script BOOTSTRAP_SHELL=/bin/sh: to bootstrap pnut-exe.sh with pnut-sh
  3. make bootstrap-pnut-exe BOOTSTRAP_SHELL=/bin/sh: to bootstrap pnut-exe with pnut-exe.sh
  4. make bootstrap-pnut-exe-no-shell: to bootstrap pnut-exe with pnut-exe

See make bootstrap-help for more details.
