@karlseguin karlseguin commented Oct 27, 2025

zigdom

lightpanda with a Zig-native DOM

Install

Needs cargo

install-html5ever-dev

Testing

Still a work in progress, but the test filter has been improved. TEST_FILTER="..." (or make test F="...") can now run specific HTML file cases using the #partial_file_name syntax:

make test F="Document"
make test F="Document#query_selector.html"
make test F="Document#query_selector"
make test F="#query_selector.html"
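
One way to read the filter syntax above is sketched below: the part before # is matched against the test name and the part after it against the HTML file name, both as substrings. This is an assumption about the semantics, not the project's implementation; Filter, parse, matches and nonEmpty are hypothetical names.

```zig
const std = @import("std");

// Hypothetical representation of a filter such as "Document#query_selector".
const Filter = struct {
    test_part: ?[]const u8, // text before '#', null when empty
    file_part: ?[]const u8, // text after '#', null when absent or empty

    fn parse(raw: []const u8) Filter {
        const idx = std.mem.indexOfScalar(u8, raw, '#') orelse {
            return .{ .test_part = nonEmpty(raw), .file_part = null };
        };
        return .{
            .test_part = nonEmpty(raw[0..idx]),
            .file_part = nonEmpty(raw[idx + 1 ..]),
        };
    }

    // Assumed semantics: every part that is present must match as a substring.
    fn matches(self: Filter, name: []const u8, file: []const u8) bool {
        if (self.test_part) |t| {
            if (std.mem.indexOf(u8, name, t) == null) return false;
        }
        if (self.file_part) |f| {
            if (std.mem.indexOf(u8, file, f) == null) return false;
        }
        return true;
    }
};

fn nonEmpty(s: []const u8) ?[]const u8 {
    return if (s.len == 0) null else s;
}

test "filter with a test name and a partial file name" {
    const f = Filter.parse("Document#query_selector");
    try std.testing.expect(f.matches("Document", "query_selector.html"));
    try std.testing.expect(!f.matches("Document", "create_element.html"));
    try std.testing.expect(!f.matches("Element", "query_selector.html"));
}
```

With this reading, F="#query_selector.html" leaves the test name unconstrained and only narrows the HTML files.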

Prototype

The prototype system relies on two special fields. The _proto field references the parent of a type. For example:

    • Element has a _proto: *Node field,
    • Node has a _proto: *EventTarget field,
    • EventTarget has no _proto field

Going the other way, "parents" have a _type field which is a tagged union:

pub const Node = @This();

_type: Type,

pub const Type = union(enum) {
  element: *Element,
  document: *Document,
  // ....
};

As a convention, parents expose an as method:

const el = node.as(Element) orelse return;
// or
const input = node.as(Element.Html.Input) orelse return;

As a convenience, children expose as$Ancestor methods, e.g. input.asElement() or input.asNode(), although some code might simply access input._proto directly.
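
To make the two directions concrete, here is a minimal, self-contained sketch of the _proto / _type pairing. It is heavily simplified and not the actual zigdom code: the Text type, the tag and data fields, and the body of as are assumptions made for illustration.

```zig
const std = @import("std");

// Root of the chain: has a _type, but no _proto.
const EventTarget = struct {
    _type: Type,

    const Type = union(enum) {
        node: *Node,
    };
};

const Node = struct {
    _proto: *EventTarget,
    _type: Type,

    const Type = union(enum) {
        element: *Element,
        text: *Text,
    };

    // Downcast: null when this node is not a T.
    // T is comptime-known, so only the matching branch is analyzed.
    fn as(self: *Node, comptime T: type) ?*T {
        return switch (self._type) {
            .element => |el| if (T == Element) el else null,
            .text => |t| if (T == Text) t else null,
        };
    }
};

const Element = struct {
    _proto: *Node,
    tag: []const u8,

    // Upcast: always succeeds, it just follows _proto.
    fn asNode(self: *Element) *Node {
        return self._proto;
    }
};

// Hypothetical second child of Node, so that `as` has something to reject.
const Text = struct {
    _proto: *Node,
    data: []const u8,
};

test "walk the chain in both directions" {
    var target: EventTarget = undefined;
    var node: Node = undefined;
    var el: Element = .{ ._proto = &node, .tag = "div" };
    node = .{ ._proto = &target, ._type = .{ .element = &el } };
    target = .{ ._type = .{ .node = &node } };

    try std.testing.expect(node.as(Element).? == &el);
    try std.testing.expect(node.as(Text) == null);
    try std.testing.expect(el.asNode() == &node);
}
```

Going up never fails, since each child holds a pointer to its parent. Going down has to inspect the union tag, which is why as returns an optional.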

Bonus: The special return union types, e.g. Node.Union or Element.Union, are no longer needed. You can return any part of the prototype chain. In other words, as far as the JS bridge is concerned, you can return input or input.asElement() or input.asNode().

Explicit JS Mapping

Naming conventions are no longer used to create the JS mapping. Every type that is mapped has a nested JsApi and JsApi.Meta struct:

const Window = @This();

...

pub const JsApi = struct {
    pub const bridge = js.Bridge(Window);

    pub const Meta = struct {
        pub const name = "Window";
        pub const prototype_chain = bridge.prototypeChain();
        pub var class_index: u16 = 0;
    };

    pub const self = bridge.accessor(Window.getWindow, null, .{ });
    pub const window = bridge.accessor(Window.getWindow, null, .{ });
    ...
};

This is a bit more tedious, but new APIs aren't added that often. It allows per-definition configuration (e.g. enabling DOMExceptions on individual methods rather than on the entire type). It can also result in more idiomatic Zig code. For example, Element.getInnerText is able to take a *std.Io.Writer, with the mapper providing a wrapper:

    pub const innerText = bridge.accessor(_innerText, null, .{});
    fn _innerText(self: *Element, page: *const Page) ![]const u8 {
        var buf = std.Io.Writer.Allocating.init(page.call_arena);
        try self.getInnerText(&buf.writer);
        return buf.written();
    }
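
As a rough illustration of why an explicit JsApi declaration is easy for a bridge to consume, the sketch below looks up the JS-visible name of a type at comptime. It is not the real js.Bridge: jsName and NotMapped are hypothetical, and only the JsApi / Meta nesting is taken from the description above.

```zig
const std = @import("std");

// Stand-in for a mapped WebAPI type; only the nesting mirrors the description.
const Window = struct {
    pub const JsApi = struct {
        pub const Meta = struct {
            pub const name = "Window";
        };
    };
};

// A type that was never given an explicit mapping.
const NotMapped = struct {};

// Returns the JS-visible name, or null when the type has no JsApi declaration.
// @hasDecl is resolved at comptime, so T.JsApi is never touched for unmapped types.
fn jsName(comptime T: type) ?[]const u8 {
    return if (@hasDecl(T, "JsApi")) T.JsApi.Meta.name else null;
}

test "explicit mapping lookup" {
    try std.testing.expectEqualStrings("Window", jsName(Window).?);
    try std.testing.expect(jsName(NotMapped) == null);
}
```

Nothing here depends on how the Zig type or its methods are named, which is the point of dropping the naming conventions.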

Consistent DOM handling

Whether an element is created by the parser or via document.createElement, the same code path is taken (as much as possible). This creates more consistency, e.g. when setting select.selectedIndex. Individual DOM elements can opt into build callbacks. For example, Input.zig handles the "created" callback:

// Input.zig

pub const Build = struct {
    pub fn created(node: *Node, page: *Page) !void {
        const self = node.as(Input) orelse return;
        const element = self.asElement();

        // Store initial values from attributes
        self._default_value = element.getAttributeSafe("value");
        // ....
    }
};
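
The opt-in part can be sketched with @hasDecl: a shared creation path checks at comptime whether a type declares Build.created and only then calls it. This is an assumption about the mechanism, not the zigdom implementation; runCreated, Div and the created_calls counter are hypothetical, and the callback signature is trimmed to a single parameter.

```zig
const std = @import("std");

// Simplified node that just counts how often the callback ran.
const Node = struct { created_calls: u32 = 0 };

// Opts in by declaring Build.created.
const Input = struct {
    pub const Build = struct {
        pub fn created(node: *Node) !void {
            node.created_calls += 1;
        }
    };
};

// Declares no Build struct, so nothing runs for this type.
const Div = struct {};

// Shared by the parser path and the createElement path in this sketch.
// Both checks are comptime-known; T.Build is only analyzed when it exists.
fn runCreated(comptime T: type, node: *Node) !void {
    if (@hasDecl(T, "Build")) {
        if (@hasDecl(T.Build, "created")) {
            try T.Build.created(node);
        }
    }
}

test "only types that opt in get the callback" {
    var node: Node = .{};
    try runCreated(Input, &node);
    try runCreated(Div, &node);
    try std.testing.expectEqual(@as(u32, 1), node.created_calls);
}
```

Types without a Build struct pay nothing at runtime, because the check disappears during compilation.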

Naming Convention

Underscore-prefixed field names are used throughout the WebAPI, in large part to avoid naming conflicts. Struct-as-a-file is used extensively, and field names are more likely to cause conflicts in this setup.

In non-WebAPI classes (e.g. the Page), they are used as "private" markers. Within the project, there's now a clear "lightpanda" library, and I'm starting to think about what should and shouldn't be exposed from the library.

Memory

This branch was born from an experiment that used hybrid memory management: arenas for DOM objects and reference counting for other types (e.g. XHR objects). Some of this complexity is retained (page._factory), despite everything using either page.arena or page.call_arena. I'm hopeful that some explicit memory management can be re-added in the future (XHR objects can hold onto large amounts of memory), but right now it's worthless complexity.
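
For readers unfamiliar with the two-arena setup, here is a minimal sketch using std.heap.ArenaAllocator. The lifetimes are an assumption inferred from the names (arena lives as long as the page, call_arena is scratch space released after each JS call); this Page struct and its endCall hook are hypothetical and not the real Page.

```zig
const std = @import("std");

const Page = struct {
    arena: std.heap.ArenaAllocator, // assumed: freed when the page goes away
    call_arena: std.heap.ArenaAllocator, // assumed: reset after every JS call

    fn init(backing: std.mem.Allocator) Page {
        return .{
            .arena = std.heap.ArenaAllocator.init(backing),
            .call_arena = std.heap.ArenaAllocator.init(backing),
        };
    }

    fn deinit(self: *Page) void {
        self.call_arena.deinit();
        self.arena.deinit();
    }

    // Hypothetical hook that runs once a JS-to-Zig call has returned.
    fn endCall(self: *Page) void {
        _ = self.call_arena.reset(.retain_capacity);
    }
};

test "call_arena is reset between calls, arena is not" {
    var page = Page.init(std.testing.allocator);
    defer page.deinit();

    // Long-lived: stays valid until the page is torn down.
    const tag = try page.arena.allocator().dupe(u8, "div");

    // Short-lived: scratch memory for a single call.
    const scratch = try page.call_arena.allocator().alloc(u8, 64);
    @memset(scratch, 0);
    page.endCall(); // `scratch` must not be used past this point

    try std.testing.expectEqualStrings("div", tag);
}
```

Neither arena frees individual objects, which is why long-lived objects that hold large buffers are the case where explicit management would help.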

Note 1

In both main and zigdom, most types that are returned to JavaScript are placed on the heap. The JS bridge handles this. So zigdom isn't putting more objects on the heap than main, it's just being more explicit about it.

Note 2

Values of the _type union are almost always pointers. The only exceptions are 8-byte leaf nodes (i.e. leaf nodes that only have a _proto: *Parent field), which can be directly embedded into the union.

There's a real tradeoff here. But in general, zigdom aims for memory efficiency at the cost of performance and, in this case, potential fragmentation (which I'm hoping can be solved by a smarter allocation strategy). This is clearly generating many more small allocations than libdom was.
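
The size argument in Note 2 can be checked with a few comptime assertions. This is a sketch with made-up types (Leaf, Big and the three unions are hypothetical); it only shows that embedding a pointer-sized leaf does not grow the union, while embedding anything larger grows every value of it.

```zig
const std = @import("std");

const Node = struct { id: u32 };

// A leaf whose only field is the parent pointer: exactly pointer-sized.
const Leaf = struct { _proto: *Node };

// A larger child that would bloat the union if stored inline.
const Big = struct { _proto: *Node, value: [64]u8 };

const AllPointers = union(enum) { leaf: *Leaf, big: *Big };
const LeafEmbedded = union(enum) { leaf: Leaf, big: *Big };
const BigEmbedded = union(enum) { leaf: *Leaf, big: Big };

comptime {
    // Embedding the pointer-sized leaf costs nothing extra...
    std.debug.assert(@sizeOf(Leaf) == @sizeOf(*Node));
    std.debug.assert(@sizeOf(LeafEmbedded) == @sizeOf(AllPointers));
    // ...while an inline Big makes every value of the union larger.
    std.debug.assert(@sizeOf(BigEmbedded) > @sizeOf(AllPointers));
}

test "embedded leaf saves one allocation and one pointer hop" {
    var node = Node{ .id = 1 };
    const t = LeafEmbedded{ .leaf = .{ ._proto = &node } };
    try std.testing.expect(t.leaf._proto == &node);
}
```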

mookums and others added 29 commits November 17, 2025 07:25
Applies the `rework-types` changes to zigdom branch.
Forgot to revert this...
No idea how I removed this single line while rebasing...
karlseguin and others added 30 commits December 16, 2025 16:24
`UIEvent`, `MouseEvent` and `KeyboardEvent`
In createIsolatedWorld, we set a default value of false for the optional
grantUniveralAccess parameter.
macos-13 is unsupported. We have to switch to a paid instance.
see actions/runner-images#13046
Add support for both modes - parsing and post-parsing. In post-parsing mode,
document.write implicitly calls document.open, and document.open wipes the
document. This mode is probably rarely, if ever, used.

However, while parsing, document.write does not call document.open and does not
remove all existing nodes. It just writes the html into the document where the
parser is. That isn't something we can properly do, but we can hack it. We
create a new DocumentFragment, parse the html into the document fragment, then
transfer the children into the document where we currently are.

Our hack probably doesn't work for some advanced usage of document.write (e.g.
nested calls), but it should work for more common cases, e.g. injecting a script
tag.
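
The parsing-mode hack described in this commit message can be sketched as follows, assuming the HTML has already been parsed into a fragment and that the "current position" is a node such as the running script element. The Node layout and the append, insertAfter and transferChildren helpers are hypothetical simplifications, not the zigdom DOM.

```zig
const std = @import("std");

// Minimal tree node with a doubly linked child list.
const Node = struct {
    name: []const u8,
    parent: ?*Node = null,
    first_child: ?*Node = null,
    last_child: ?*Node = null,
    prev: ?*Node = null,
    next: ?*Node = null,

    fn append(self: *Node, child: *Node) void {
        child.parent = self;
        child.prev = self.last_child;
        child.next = null;
        if (self.last_child) |last| {
            last.next = child;
        } else {
            self.first_child = child;
        }
        self.last_child = child;
    }

    // Links `child` into self's child list right after `ref`.
    fn insertAfter(self: *Node, ref: *Node, child: *Node) void {
        child.parent = self;
        child.prev = ref;
        child.next = ref.next;
        if (ref.next) |n| {
            n.prev = child;
        } else {
            self.last_child = child;
        }
        ref.next = child;
    }
};

// Moves every child of `fragment` so that it follows `cursor`, preserving order.
fn transferChildren(fragment: *Node, cursor: *Node) void {
    const parent = cursor.parent orelse return;
    var at = cursor;
    var it = fragment.first_child;
    while (it) |child| {
        it = child.next; // read the old sibling before relinking
        parent.insertAfter(at, child);
        at = child;
    }
    fragment.first_child = null;
    fragment.last_child = null;
}

test "document.write while parsing" {
    var body = Node{ .name = "body" };
    var script = Node{ .name = "script" };
    var footer = Node{ .name = "footer" };
    body.append(&script);
    body.append(&footer);

    // Pretend these were parsed from the string passed to document.write.
    var fragment = Node{ .name = "#fragment" };
    var p = Node{ .name = "p" };
    var span = Node{ .name = "span" };
    fragment.append(&p);
    fragment.append(&span);

    transferChildren(&fragment, &script);

    try std.testing.expect(script.next.? == &p);
    try std.testing.expect(p.next.? == &span);
    try std.testing.expect(span.next.? == &footer);
    try std.testing.expect(fragment.first_child == null);
}
```

A nested document.write would change the insertion point while a transfer like this is in flight, which is the kind of case the message says probably isn't handled.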

7 participants