kostya/petc

petc - PetCompiler

A small IR for building programming languages.

What is it?

  • Simple DSL over LLVM/QBE.
  • Your language AST -> petc IR -> [LLVM / QBE / C] -> binary.
  • ~25 stack-based opcodes.
  • The whole IR spec can be read in 15 minutes.
  • Compiles to native code via LLVM, QBE, or C.
  • Fast compilation, zero overhead.
  • ~5700 lines of Crystal.
  • Includes a C-subset compiler in ~230 lines of Python as a demo.

Why?

  • I was writing my own language and got tired of fighting with LLVM IR: SSA, phi nodes, basic blocks. X(
  • Usually when you write your own language, you first build a parser and generate an AST. Then comes the hell stage: translating your AST into LLVM or another backend. Petc takes on all that complexity.
  • LLVM is complex.
  • Petc is simple and fun.
  • Stack-based opcodes are easy to emit from an AST with a simple one-pass tree walk.
  • You're not locked into one backend. LLVM for speed, QBE for fast compiles, C for anywhere.
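
As a rough illustration of the one-pass tree walk, here is a minimal Python sketch that emits stack opcodes for an expression AST. The node classes (`Const`, `Param`, `BinOp`) and the `emit` function are hypothetical names made up for this sketch, not part of petc. The opcode names and the operand order (right operand pushed first, left operand last) are inferred from the factorial example later in this README, so treat them as an assumption rather than the spec.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Const:
    value: int


@dataclass
class Param:
    index: int


@dataclass
class BinOp:
    op: str  # "sub", "mul", "less_eq": the names used in examples/fact.petc
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Param, BinOp]


def emit(node: Expr, out: List[str]) -> List[str]:
    """Post-order walk: emit the operands, then the operator."""
    if isinstance(node, Const):
        out.append(f"PUSH {node.value}")
    elif isinstance(node, Param):
        out.append(f"PARAM {node.index}")
    elif isinstance(node, BinOp):
        # Assumption inferred from fact.petc: the right operand goes on the
        # stack first, so `arg0 - 1` becomes PUSH 1 / PARAM 0 / BINARY sub.
        emit(node.right, out)
        emit(node.left, out)
        out.append(f"BINARY {node.op}")
    else:
        raise TypeError(f"unknown node: {node!r}")
    return out


# arg0 - 1
ops = emit(BinOp("sub", Param(0), Const(1)), [])
print("\n".join(ops))
```

Each node is visited once and needs no register allocation or block bookkeeping, which is the property the bullet above refers to.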

Current status

Alpha. But already powerful. All 3 backends work smoothly.

Ultimate goal

Beat LLVM (joke). Real goal: beat gcc :).

Benchmark:

Mandelbrot renderer from mandel.bf (by Erik Bosman). All IRs represent the same program, so the table shows whether Petc adds overhead over using each backend directly. Run on a MacBook M1 from benchmark/brainfuck-compiler.

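| IR      | Compiler               | IR size (KB) | Compile time | Run time |
|---------|------------------------|--------------|--------------|----------|
| llvm    | clang(-O3)             | 1529         | 1303ms       | 638ms    |
| petc    | petc-llvm(--release)   | 443          | 1092ms       | 613ms    |
| qbe-ssa | qbe + clang(as+linker) | 345          | 241ms + 91ms | 771ms    |
| petc    | petc-qbe(--release)    | 443          | 756ms        | 786ms    |
| c       | clang(-O3)             | 128          | 1420ms       | 638ms    |
| petc    | petc-c(--release)      | 443          | 1436ms       | 636ms    |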

Petc adds no measurable overhead on the LLVM and C backends. QBE is the exception: ~400-700ms of compile overhead (about 420ms in the run above: 756ms vs 241ms + 91ms) due to suboptimal code generation, which future peephole passes should fix.
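
The per-backend differences can be recomputed from the table with a few lines of Python. The numbers are copied from the table above; the dictionary layout and the `overhead` helper are just one way to arrange them.

```python
# Times in ms, copied from the benchmark table.
# Each entry: backend -> (direct toolchain, petc on the same backend).
compile_ms = {
    "llvm": (1303, 1092),
    "qbe": (241 + 91, 756),  # direct QBE = qbe + clang (assembler + linker)
    "c": (1420, 1436),
}
run_ms = {
    "llvm": (638, 613),
    "qbe": (771, 786),
    "c": (638, 636),
}


def overhead(table):
    """Return petc time minus direct time for every backend."""
    return {name: petc - direct for name, (direct, petc) in table.items()}


compile_overhead = overhead(compile_ms)
run_overhead = overhead(run_ms)

for name in compile_ms:
    print(f"{name}: compile {compile_overhead[name]:+d} ms, "
          f"run {run_overhead[name]:+d} ms")
```

A negative value means the petc pipeline was faster than the direct one in this single run; small differences are within what one would expect from run-to-run noise.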

Install

Requires Crystal to compile the petc compiler.

Quick start (compile and run your first program):

echo 'FUNC main BODY PUSH "Hello petc\n" PRINTF 0 ENDFUNC' | crystal src/cli/llvm.cr r

Build

git clone https://github.com/kostya/petc
cd petc

# compile Petc C backend
crystal build src/cli/c.cr --release -o petc-c

# compile Petc LLVM backend
# requires LLVM >= 15.0; install it system-wide or set the LLVM_CONFIG environment variable
crystal build src/cli/llvm.cr --release -o petc-llvm 

# compile Petc QBE backend
git clone https://github.com/kostya/qbe.git plugins/qbe
cd plugins/qbe; make; cd -
crystal build src/cli/qbe.cr --release -o petc-qbe

Usage

Usage: ./petc-llvm COMMAND [OPTIONS] INPUT [INPUT]* [OUTPUT]

Commands:

  compile|c  ; compile multiple .petc files into executable binary
             ;   ./petc-llvm c file.petc out
             ;   ./petc-llvm c --release *.petc out
             ;   cat file.petc | ./petc-llvm c --release out

  run|r      ; compile multiple .petc files and run the program
             ;   ./petc-llvm r file.petc
             ;   ./petc-llvm r --release file.petc
             ;   cat file.petc | ./petc-llvm r --release

  obj|o      ; compile one .petc file into object file (.o) for linking
             ;   ./petc-llvm o file.petc file.o
             ;   ./petc-llvm o --release file.petc file.o
             ;   cat file.petc | ./petc-llvm o --release file.o

  dump|d     ; output backend IR to console (for debugging and optimization analysis)
             ;   ./petc-llvm d file.petc
             ;   ./petc-llvm d --release file.petc
             ;   cat file.petc | ./petc-llvm d --release

  beautify|b ; format, validate, and add auto-comments to .petc files (prepare for commit)
             ;   ./petc-llvm b .
             ;   ./petc-llvm b src/
             ;   ./petc-llvm b file1.petc file2.petc

  version|v  ; display version information
             ;   ./petc-llvm version

OPTIONS:
  --release ; compile in performance mode (optimizations enabled)
  --target=TARGET   (TARGET: arm64, x86_64, x86, wasm32, ...; default: native)
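
When petc is driven from a front end written in Python (as in the demos below), the command line can be assembled programmatically. The following is a hypothetical helper, not something shipped with petc: `petc_argv` and `run_ir` are names invented here, and the argument order simply follows the usage line above (COMMAND, OPTIONS, INPUTs, OUTPUT).

```python
import subprocess


def petc_argv(binary, command, inputs=(), output=None, release=False, target=None):
    """Build an argv list following: COMMAND [OPTIONS] INPUT [INPUT]* [OUTPUT]."""
    argv = [binary, command]
    if release:
        argv.append("--release")
    if target:
        argv.append(f"--target={target}")
    argv.extend(inputs)
    if output:
        argv.append(output)
    return argv


def run_ir(ir_text, binary="./petc-llvm", release=True):
    """Pipe IR text to `run` on stdin, like `cat file.petc | ./petc-llvm r --release`.

    Assumes the backend binary has already been built (see Build).
    """
    return subprocess.run(
        petc_argv(binary, "r", release=release),
        input=ir_text,
        text=True,
        capture_output=True,
        check=False,
    )


# Equivalent of: ./petc-llvm c --release file.petc out
print(petc_argv("./petc-llvm", "c", ["file.petc"], "out", release=True))
```

Passing an argv list instead of a shell string avoids quoting problems when the IR or file names contain spaces or quotes.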

petc IR

All opcodes are self-documented. See also the examples.

  • 19 main opcodes: PUSH, LOCAL, STORE, CALL, PARAM, BINARY, UNARY, RESULT, FIELD, DEREF, OFFSET, ADDR, AS, SELECT, MALLOC, CREATE, INSPECT, PRINTF, STACK
  • 6 control flow opcodes: IF/THEN/ELSE, LOOP/INIT/COND/BODY/STEP, SWITCH/CASE, BREAK, NEXT, RET
  • Types: STRUCT, ENUM/VARIANT, FLAT + void, bool, i8..i64, u8..u64, f32, f64, ptr

Compile and run examples

./petc-llvm run --release examples/mandel.petc
./petc-llvm run --release examples/bf.petc
./petc-llvm run --release examples/fact.petc

Example: C compiler (small subset) with petc IR.

~230 lines of Python. Compiles a subset of C to petc IR via pycparser.

cd examples/c-compiler/
python3 -m venv py
source py/bin/activate
pip install pycparser

# run
python c2petc.py tests/03-loop.cc | ../../petc-llvm run --release 

# show llvm dump
python c2petc.py tests/03-loop.cc | ../../petc-llvm d

# show llvm optimized dump
python c2petc.py tests/03-loop.cc | ../../petc-llvm d --release

Example: Brainfuck compiler with petc IR.

cd benchmark/brainfuck-compiler
python3 bf2petc.py mandel.bf | ../../petc-llvm run --release

Example: factorial in petc IR (examples/fact.petc) and its translation by each backend

FUNC fact
  ARGS
    TYPE i32
  RETURN
    TYPE i32
  BODY
    PUSH 1
    PARAM 0
    BINARY less_eq
    IF
      THEN
        PUSH 1
        RESULT
        RET
    ENDIF
    PUSH 1
    PARAM 0
    BINARY sub
    CALL fact
    PARAM 0
    BINARY mul
    RESULT
ENDFUNC

FUNC main
  BODY
    PUSH 5
    CALL fact
    INSPECT
ENDFUNC

LLVM Backend `./petc-llvm dump examples/fact.petc`

; ModuleID = 'fact'
source_filename = "fact"
target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-n32:64-S128-Fn32"
target triple = "arm64-apple-darwin23.3.0"

@str = private constant [15 x i8] c"fact(%d) = %d\0A\00"

define i32 @fact(i32 %0) {
alloca:
  %__petc_result = alloca i32, align 4
  br label %body

body:                                             ; preds = %alloca
  %1 = icmp sle i32 %0, 1
  br i1 %1, label %then, label %endif

ret:                                              ; preds = %endif, %then
  %2 = load i32, ptr %__petc_result, align 4
  ret i32 %2

then:                                             ; preds = %body
  store i32 1, ptr %__petc_result, align 4
  br label %ret

endif:                                            ; preds = %body
  %3 = sub i32 %0, 1
  %4 = call i32 @fact(i32 %3)
  %5 = mul i32 %0, %4
  store i32 %5, ptr %__petc_result, align 4
  br label %ret
}

define void @main() {
alloca:
  br label %body

body:                                             ; preds = %alloca
  %0 = call i32 @fact(i32 5)
  %1 = call i32 (ptr, ...) @printf(ptr @str, i32 5, i32 %0)
  br label %ret

ret:                                              ; preds = %body
  ret void
}

declare i32 @printf(ptr, ...)

QBE Backend `./petc-qbe dump examples/fact.petc`

data $str_0 = { b "fact(%d) = %d\n", b 0 }
export function w $fact(w %arg0) {
@start
  %__petc_result =l alloc8 4
  jmp @body
@body
  %t1 =w cslew %arg0, 1
  jnz %t1, @then_1, @endif_2
@then_1
  storew 1, %__petc_result
  jmp @ret
@endif_2
  %t2 =w sub %arg0, 1
  %t3 =w call $fact(w %t2)
  %t4 =w mul %arg0, %t3
  storew %t4, %__petc_result
  jmp @ret
@ret
  %ret_val =w loadw %__petc_result
  ret %ret_val
}

export function  $main() {
@start
  jmp @body
@body
  %t1 =w call $fact(w 5)
  %t2 =w call $printf(l $str_0, ..., w 5, w %t1)
  jmp @ret
@ret
  ret
}

C Backend `./petc-c dump examples/fact.petc`

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <inttypes.h>

int32_t fact(int32_t arg0);
void main();
int32_t fact(int32_t arg0) {
  int32_t __petc_result;
  int t1 = arg0 <= 1;
  if (t1) goto then_1; else goto endif_2;

then_1:;
  __petc_result = 1;
  goto ret;

endif_2:;
  int32_t t2 = arg0 - 1;
  int32_t t3 = fact(t2);
  int32_t t4 = arg0 * t3;
  __petc_result = t4;
  goto ret;

ret:;
  return __petc_result;
}
void main() {
  int32_t t5 = fact(5);
  int32_t t6 = printf("fact(%d) = %d\n", 5, t5);
  goto ret;

ret:;
  return;
}

Run tests

crystal spec

License

Licensed under the Apache License, Version 2.0.
