LLVM-first

Skunk Language Reference

Skunk is a human-designed, AI-implemented experimental programming language. This reference covers the implemented language surface in the current compiler/runtime.

Project Shape

Skunk is a language-design project and compiler playground targeting LLVM.

Safety Note

Skunk is experimental and should not be used for critical, safety-sensitive, or high-reliability software.

Status

The primary execution path is native compilation through LLVM and clang. The repository still contains legacy interpreter code, but the compiler/runtime path is the main focus.

  • Source-level syntax is defined in src/grammar.pest.
  • Implemented behavior is backed by parser, type-checker, source-loader, and compiler tests.
  • This page documents the currently implemented language surface rather than future ideas.

Compiler Notebook

If you are new to compilers or to LLVM, the repository includes a slower-paced, beginner-oriented guide to how Skunk is built.

Toolchain

Build Skunk with Rust, then compile a Skunk entry file into a native executable.

cargo build
cargo run -- compile path/to/main.skunk ./out
./out

The legacy interpreter path still exists while the codebase transitions further toward compiler-only execution.

cargo run -- path/to/main.skunk

Programs and Modules

Skunk supports multi-file programs through module, import, and export.

module app.math;

function helper(n: int): int {
    return n + 1;
}

export function inc(n: int): int {
    return helper(n);
}

import app.math;

function main(): void {
    print(inc(41));
}

  • import app.math; resolves to app/math.skunk relative to the entry file directory.
  • Imported files must declare the matching module name.
  • If a module uses export, only exported top-level declarations are visible to importers.
  • If a module uses no export at all, all of its top-level declarations remain public, for compatibility.
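
The path rule in the first bullet can be sketched in Rust, the language the Skunk toolchain is built with. This is not the compiler's source-loader code: resolve_import is a hypothetical helper name, and the sketch only assumes what the bullet states, namely that dots become directory separators, the .skunk extension is appended, and the result is relative to the entry file's directory.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not taken from the Skunk sources): maps a dotted
/// module name such as `app.math` to a file path relative to the directory
/// that contains the entry file.
fn resolve_import(entry_file: &Path, module_name: &str) -> PathBuf {
    // Directory of the entry file; an empty path when there is no parent.
    let base = entry_file.parent().unwrap_or_else(|| Path::new(""));
    let mut path = base.to_path_buf();
    // Each dotted segment becomes one path component.
    for segment in module_name.split('.') {
        path.push(segment);
    }
    path.set_extension("skunk");
    path
}
```

Under these assumptions, an entry file at project/main.skunk that imports app.math is resolved to project/app/math.skunk.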

Types

Skunk currently supports primitive types, fixed arrays, slices, safe references, raw pointers, function types, structs, enums, and generic instantiations.

Form                         Meaning
byte, short, int, long       Signed integer primitives
float, double                Floating-point primitives
boolean, char, string, void  Core built-in types
[N]T                         Fixed-size value array
[]T                          Slice view over contiguous elements
&T, &mut T                   Safe shared or mutable reference to one value
*T                           Raw pointer to one value for unsafe memory operations
*const T, []const T          Read-only pointer or slice view
(A, B) -> C                  Function type
Box[int]                     Generic type instantiation

Allocator and Arena are built-in runtime types used for explicit memory management.

Bindings and Const

Skunk distinguishes between const bindings and const views.

const answer: int = 42;

function copy_into(const dst: []int, src: []const int): void {
    for (i: int = 0; i < src.len; i = i + 1) {
        dst[i] = src[i];
    }
}

  • const name: T makes the binding non-reassignable.
  • []const T and *const T make the viewed elements or pointee read-only.
  • []T is assignable to []const T, but not the other way around.
  • const dst: []int still allows dst[i] = ... because the binding is const, not the slice contents.
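
For readers who know Rust, the binding/view split has a close analogue there. The sketch below is an analogy, not Skunk code, and Skunk's exact rules may differ: a parameter declared without mut is a fixed binding, &mut [i32] is a writable view, and &[i32] is a read-only view. The function names are illustrative.

```rust
/// `dst` is a non-reassignable binding to a writable view (like `const dst: []int`);
/// `src` is a read-only view (like `src: []const int`).
fn copy_into(dst: &mut [i32], src: &[i32]) {
    // Copies as many elements as both views have in common.
    for (slot, value) in dst.iter_mut().zip(src) {
        *slot = *value;
    }
}

fn demo_const_views() -> [i32; 3] {
    let source = [1, 2, 3];
    let mut target = [0; 3];
    copy_into(&mut target, &source);

    // A writable view narrows to a read-only view, never the other way around.
    let writable: &mut [i32] = &mut target;
    let read_only: &[i32] = writable;
    [read_only[0], read_only[1], read_only[2]]
}
```

The narrowing in demo_const_views corresponds to the rule that []T is assignable to []const T.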

Functions and Control Flow

Skunk supports named functions, lambdas, closures, if, for, return, and block scoping.

function add(a: int, b: int): int {
    return a + b;
}

function main(): void {
    total: int = add(5, 7);

    if (total > 10) {
        print(total);
    }

    counter: () -> int = function(): int {
        total = total + 1;
        return total;
    };

    print(counter());
}

Closures can capture and mutate surrounding locals. Function values can be stored, returned, and passed as arguments.
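
As a point of comparison only, here is the same capture-and-mutate pattern in Rust, including a closure returned from a function. It is an analogue for readers coming from Rust, not a description of how Skunk lowers closures; make_counter and the demo functions are illustrative names.

```rust
/// Captures a local by mutable reference and updates it on each call.
fn local_counter_demo() -> (i32, i32) {
    let mut total = 5 + 7;
    let mut counter = || {
        total += 1;
        total
    };
    let first = counter();
    let second = counter();
    (first, second)
}

/// Returns a function value that owns its captured state.
fn make_counter(start: i32) -> impl FnMut() -> i32 {
    let mut total = start;
    move || {
        total += 1;
        total
    }
}

fn returned_counter_demo() -> (i32, i32) {
    let mut next = make_counter(12);
    (next(), next())
}
```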

Arrays and Slices

Fixed arrays use value semantics. Slices are views over contiguous storage.

a: [4]int;
b: [4]int = [4]int::fill(7);
c: [4]int = [1, 2, 3, 4];

mid: []const int = c[1:3];

print(a[0]);
print(b.len);
print(mid[0]);

  • [N]T without an initializer is zero-initialized.
  • [N]T::fill(value) fills every element with the given value.
  • Slices support indexing, .len, and range slicing with omitted bounds.
  • Fixed arrays can be passed and returned by value.

Structs and Attached Behavior

Structs are data-only product types. Behavior lives in separate attach blocks.

struct Counter {
    const seed: int;
    value: int;
}

attach Counter {
    function new(seed: int, value: int): Counter {
        return Counter { seed: seed, value: value };
    }

    function bump(mut self): void {
        self.value = self.value + 1;
    }

    function get(self): int {
        return self.value;
    }
}

function main(): void {
    counter: Counter = Counter::new(1, 4);
    counter.bump();
    print(counter.seed);
    print(counter.get());
}

Attach blocks and receiver mutability follow these rules:

  • attach Type { ... } adds inherent methods to a type without declaring trait conformance.
  • Attached functions without self are called with Type::name(...) and work well for constructors and factories.
  • Struct fields may be declared const; they may be initialized but not reassigned later.
  • self means the method may read but may not mutate receiver state.
  • mut self means the method may mutate receiver state.
  • A *const T pointer may call only read-only self methods.

Pointers, Allocators, and Arenas

Skunk uses explicit allocation. Plain values use value semantics; safe borrows use &T and &mut T; allocator-backed single objects use *T; allocator-backed buffers use []T.

struct Point {
    x: int;
    y: int;
}

function make_point(alloc: Allocator): *Point {
    point: *Point = Point::create(alloc);
    point.x = 3;
    point.y = 4;
    return point;
}

function main(): void {
    system_alloc: Allocator = System::allocator();
    arena: Arena = Arena::init(system_alloc);
    arena_alloc: Allocator = arena.allocator();

    point: *Point = make_point(arena_alloc);
    values: []int = []int::alloc(arena_alloc, 8);

    print(point.x + values.len);

    arena.deinit();
}

  • System::allocator() returns the system allocator handle.
  • T::create(alloc) allocates one object and returns *T.
  • []T::alloc(alloc, len) allocates a slice buffer.
  • alloc.destroy(ptr) releases a pointer allocation.
  • alloc.free(slice) releases a slice allocation.
  • Arena::init(backing), arena.allocator(), arena.reset(), and arena.deinit() provide arena-style lifetime management.
  • &T shares read-only access and &mut T grants checked mutable access without entering unsafe.
  • Field access like point.x and method calls like point.bump() still auto-deref ordinary typed pointers.
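
The arena lifetime model (allocate many values, release them all at once) can be sketched in a few lines of Rust. This is a simplified illustration, not Skunk's runtime: the real Arena hands out pointers and slices and draws from a backing Allocator, whereas this hypothetical version stores values in a Vec and hands out index handles.

```rust
/// Hypothetical index-based arena: values live until `reset` is called
/// or the arena itself is dropped.
struct Arena<T> {
    items: Vec<T>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { items: Vec::new() }
    }

    /// Stores a value and returns a handle to it.
    fn alloc(&mut self, value: T) -> usize {
        self.items.push(value);
        self.items.len() - 1
    }

    /// Looks a handle up; returns None once the handle is stale.
    fn get(&self, handle: usize) -> Option<&T> {
        self.items.get(handle)
    }

    /// Number of live allocations.
    fn len(&self) -> usize {
        self.items.len()
    }

    /// Releases every allocation at once.
    fn reset(&mut self) {
        self.items.clear();
    }
}

fn arena_demo() -> (Option<(i32, i32)>, usize, bool) {
    let mut arena = Arena::new();
    let point = arena.alloc((3, 4));
    let _other = arena.alloc((5, 6));
    let seen = arena.get(point).copied();
    let live = arena.len();
    arena.reset();
    (seen, live, arena.get(point).is_some())
}
```

The design point is that no individual allocation is released on its own: a single reset (or deinit, in Skunk's terms) ends the lifetime of everything the arena handed out.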

Unsafe Memory

Skunk now has a small unsafe memory layer for low-level pointer work. These operations must appear inside an unsafe { ... } block.

function main(): void {
    value: int = 41;
    other: int = 0;

    unsafe {
        ptr: *int = &value;
        print(ptr.*);
        ptr.* = 42;

        bytes: [4]byte;
        second: *byte = *byte::offset(&bytes[0], 1);
        second.* = 9;

        Memory::set(&bytes[0], 7, 4);
        Memory::copy(*byte::cast(&other), *byte::cast(&value), int::size_of());
    }

    print(int::size_of());
    print(int::align_of());
}

  • unsafe { ... } enables low-level operations the compiler cannot verify as memory-safe.
  • &expr and &mut expr create safe references by default.
  • When a raw pointer is explicitly expected, the same address-of syntax feeds unsafe pointer operations such as *T::cast, *byte::offset, and Memory::copy.
  • ptr.* explicitly dereferences a pointer value.
  • T::size_of() and T::align_of() are safe compile-time layout queries.
  • *T::cast(ptr) reinterprets one pointer type as another pointer type.
  • *byte::offset(ptr, n) performs byte-wise pointer offsetting.
  • Memory::copy(dst, src, count) and Memory::set(dst, value, count) operate on raw bytes.
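
For comparison, the same kinds of raw operations in Rust use std::ptr and std::mem. The mapping in the comments is a loose analogy and not a statement about how Skunk implements Memory::copy or Memory::set. Note that Rust's copy_nonoverlapping takes the source first, while the Skunk call above takes the destination first.

```rust
use std::mem::{align_of, size_of};
use std::ptr;

fn demo_raw_memory() -> (i32, [u8; 4], i32) {
    let mut value: i32 = 41;
    let mut other: i32 = 0;
    let mut bytes = [0u8; 4];

    unsafe {
        // Store through a raw pointer, comparable to `ptr.* = 42`.
        let p: *mut i32 = &mut value;
        *p = 42;

        // Fill four bytes with 7, comparable to Memory::set.
        ptr::write_bytes(bytes.as_mut_ptr(), 7, 4);
        // Offset by one byte and store, comparable to *byte::offset plus `.* =`.
        *bytes.as_mut_ptr().add(1) = 9;

        // Byte-wise copy of `value` into `other`, comparable to Memory::copy.
        ptr::copy_nonoverlapping(
            &value as *const i32 as *const u8,
            &mut other as *mut i32 as *mut u8,
            size_of::<i32>(),
        );
    }

    (value, bytes, other)
}

/// Layout queries need no unsafe block, like T::size_of() and T::align_of().
fn layout_of_i32() -> (usize, usize) {
    (size_of::<i32>(), align_of::<i32>())
}
```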

Windowed 2D

Skunk now includes a small window/input/drawing runtime for simple 2D programs. The current implementation is macOS-first and is aimed at rectangle-based games and visual prototypes.

function main(): void {
    window: Window = Window::create(800, 600, "Skunk");

    for (; window.is_open(); ) {
        window.poll();

        if (Keyboard::is_down(window, 'q')) {
            window.close();
        }

        window.clear(Color::rgb(8, 12, 24));
        window.draw_rect(120.0, 140.0, 96.0, 64.0, Color::white());
        window.present();
    }

    window.deinit();
}

  • Window::create(width, height, title) creates a native window handle.
  • window.poll() pumps OS events so keyboard and close state stay current.
  • window.is_open(), window.close(), and window.deinit() control the window lifetime.
  • window.clear(color) fills the framebuffer and window.draw_rect(x, y, w, h, color) draws clipped solid rectangles.
  • window.present() shows the current frame and updates window.delta_time().
  • Keyboard::is_down(window, 'w') currently uses character keys; arrow keys and richer input enums can come later.
  • Color::rgb(r, g, b), Color::rgba(r, g, b, a), and constants like Color::white() pack colors for drawing.
  • The repository now includes examples/pong.skunk as a complete example built on this API.
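
What "clipped solid rectangles" means can be shown with a small Rust sketch. It is a guess at the general technique, not the runtime's actual code: clip_rect is a hypothetical helper, and the exclusive upper bounds and truncation to whole pixels are assumptions.

```rust
/// Hypothetical helper: intersects a rectangle with a `width` x `height`
/// framebuffer. Returns the visible span as (x0, y0, x1, y1) with exclusive
/// upper bounds, or None when no part of the rectangle is on screen.
fn clip_rect(x: f32, y: f32, w: f32, h: f32, width: u32, height: u32) -> Option<(u32, u32, u32, u32)> {
    let x0 = x.max(0.0);
    let y0 = y.max(0.0);
    let x1 = (x + w).min(width as f32);
    let y1 = (y + h).min(height as f32);
    if x1 <= x0 || y1 <= y0 {
        return None; // entirely off screen, or empty
    }
    // Truncate to whole pixels (an assumption; a real renderer may round).
    Some((x0 as u32, y0 as u32, x1 as u32, y1 as u32))
}
```

With this approach, a rectangle that is partly off screen draws only its visible part, and one that is fully off screen draws nothing.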

Generics

Skunk supports generic structs, generic functions, and generic enums through monomorphization.

struct Box[T] {
    value: T;
}

function wrap[T](value: T): Box[T] {
    return Box[T] { value: value };
}

  • Nested instantiations such as Box[Box[int]] are supported.
  • Function type argument inference works from call arguments in common cases.
  • Explicit function call type arguments are supported with forms like id[int](42).
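
Monomorphization means the compiler emits a separate concrete copy of generic code for each set of type arguments that is actually used. Rust uses the same strategy, so it serves as an illustration here. The hand-written WrapperI32 and wrap_i32 show roughly what a specialized copy looks like; the names are illustrative, and nothing here reflects Skunk's real symbol naming.

```rust
/// Generic definitions, comparable to `struct Box[T]` and `function wrap[T]`.
struct Wrapper<T> {
    value: T,
}

fn wrap<T>(value: T) -> Wrapper<T> {
    Wrapper { value }
}

/// Roughly what a monomorphizing compiler generates for the `int` instantiation.
struct WrapperI32 {
    value: i32,
}

fn wrap_i32(value: i32) -> WrapperI32 {
    WrapperI32 { value }
}

fn generics_demo() -> (i32, i32, i32, i32) {
    let inferred = wrap(42).value;        // type argument inferred from the call
    let explicit = wrap::<i32>(42).value; // explicit type argument
    let nested = wrap(wrap(7)).value.value;
    let by_hand = wrap_i32(42).value;
    (inferred, explicit, nested, by_hand)
}
```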

Enums and Match

Skunk supports generic enums with unit variants and tuple-style payload variants, plus exhaustive enum-focused match.

enum Option[T] {
    None;
    Some(T);
}

function unwrap(value: Option[int]): int {
    match (value) {
        case None: {
            return 0;
        }
        case Some(v): {
            return v;
        }
    }
}

  • Construct variants with forms like Option[int]::None() and Option[int]::Some(7).
  • Variants may carry multiple payload values, such as Pair(A, B).
  • match is exhaustiveness-checked for enums.
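
At its simplest, an exhaustiveness check for enums without nested patterns compares the declared variants against the variants named by case arms. The Rust sketch below illustrates that idea only. missing_variants is a hypothetical function, and the real type checker may work quite differently.

```rust
/// Hypothetical check: returns the declared variants that no `case` arm covers,
/// in declaration order. An empty result means the match is exhaustive.
fn missing_variants<'a>(declared: &[&'a str], covered: &[&str]) -> Vec<&'a str> {
    declared
        .iter()
        .copied()
        .filter(|variant| !covered.iter().any(|seen| *seen == *variant))
        .collect()
}
```

For the Option[T] enum above, a match with only case None would be reported as missing Some.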

Traits, Conform, and Shapes

Traits work both as generic constraints and as runtime interface values. Traits may extend other traits, and shapes provide reusable structural bounds.

trait Readable {
    function value(self): int;
}

trait Writer: Readable {
    function write(mut self, value: int): int;

    function write_twice(mut self, value: int): int {
        self.write(value);
        return self.write(value);
    }
}

trait Resettable {
    function reset(mut self): void;
}

shape WriterLike {
    function write(mut self, value: int): int;
}

struct Counter {
    value: int;
}

conform Writer for Counter {
    function value(self): int {
        return self.value;
    }

    function write(mut self, value: int): int {
        self.value = self.value + value;
        return self.value;
    }
}

conform Resettable for Counter {
    function reset(mut self): void {
        self.value = 0;
    }
}

function use_counter[T: Writer + Resettable](counter: *T): int {
    counter.reset();
    return counter.write(41);
}

function save[T](value: T): T
where T: Writer + Resettable {
    return value;
}

function use_writer_like[T: WriterLike](writer: *T): int {
    return writer.write(5);
}

function main(): void {
    writer: Writer = Counter { value: 1 };
    print(writer.write_twice(4));
}

  • Trait conformance is explicit through conform Trait for Type { ... }.
  • Traits may extend other traits with trait Writer: Readable { ... }; implementing the child trait also satisfies the parent traits.
  • Traits may provide default method bodies; a conform block must implement the required methods and only needs to restate the defaults it wants to customize.
  • Shapes provide structural bounds, for example T: WriterLike, without introducing a runtime trait value.
  • Conform targets may be concrete or generic, for example conform[T] SizedThing for Box[T] { ... }.
  • Generic bounds can stay inline with function save[T: Writer](...) or move into a where clause for longer signatures.
  • Bounds stack with +, for example where T: Writer + Resettable.
  • Trait names may be used as runtime types, such as writer: Writer.
  • Assigning an addressable concrete value to a trait value borrows its storage for dynamic dispatch; rvalues still box a runtime value with a vtable.
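
The last bullet separates two ways a trait value comes into being: borrowing the storage of an addressable value, or boxing a temporary. Rust's trait objects make the same distinction explicit, so the sketch below uses them as an analogy. It does not describe Skunk's V1 representation. One difference to keep in mind: Rust needs a separate impl for the parent trait, while the Skunk example supplies value inside conform Writer for Counter.

```rust
trait Readable {
    fn value(&self) -> i32;
}

trait Writer: Readable {
    fn write(&mut self, value: i32) -> i32;

    // Default body; implementors may override it.
    fn write_twice(&mut self, value: i32) -> i32 {
        self.write(value);
        self.write(value)
    }
}

struct Counter {
    value: i32,
}

impl Readable for Counter {
    fn value(&self) -> i32 {
        self.value
    }
}

impl Writer for Counter {
    fn write(&mut self, value: i32) -> i32 {
        self.value += value;
        self.value
    }
}

/// Addressable value: the trait object borrows the existing storage,
/// so writes are visible through `counter` afterwards.
fn borrowed_dispatch() -> (i32, i32) {
    let mut counter = Counter { value: 1 };
    let writer: &mut dyn Writer = &mut counter;
    let result = writer.write_twice(4);
    (result, counter.value())
}

/// Temporary value: there is no storage to borrow, so the value is boxed.
fn boxed_dispatch() -> i32 {
    let mut writer: Box<dyn Writer> = Box::new(Counter { value: 1 });
    writer.write_twice(4)
}
```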

Patterns and Destructuring

Skunk currently supports enum patterns in match, struct patterns in match, and standalone struct destructuring statements.

struct Point {
    x: int;
    y: int;
}

function sum(point: Point): int {
    match (point) {
        case Point { x, y }: {
            return x + y;
        }
    }
}

function main(): void {
    point: Point = Point { x: 3, y: 4 };
    Point { x, y: py } = point;
    print(sum(point));
    print(x + py);
}

  • Struct field bindings may use aliases such as y: py.
  • Destructuring statements introduce local bindings in the current scope.
  • Struct pattern matching is exact-type and, in V1, supports only a single case.

Current Limitations

  • No user-defined allocators yet. The current allocator and arena model is still runtime-provided.
  • The unsafe memory layer is intentionally small today: there are no raw pointer trait bounds, no arbitrary pointer arithmetic beyond *byte::offset, and no general unsafe standard library yet.
  • Struct match is intentionally narrow in this first pass.
  • Runtime trait values still use a simple V1 representation. Lvalues reuse their existing storage, while temporaries and other rvalues are boxed when converted to trait values.

Design Notes

The language reference should stay focused on implemented behavior. For deeper design context, see the supporting notes in this repository.