pyadf

A high-performance Python library for converting Atlassian Document Format (ADF) to Markdown.

Features

Rust-powered — parsing and rendering run in native code via PyO3
Robust error handling with detailed, context-aware error messages
Type-safe with comprehensive type hints and Python 3.11+ support
Comprehensive node support:
- Text formatting (bold, italic, links)
- Headings (h1-h6)
- Lists (bullet, ordered, task lists)
- Tables with headers and column spans
- Code blocks with syntax highlighting
- Blockquotes and panels
- Status badges, inline cards, block cards, emoji, mentions
- Dates with configurable timezone and format
Streaming JSONL API for ETL pipelines processing millions of documents

Installation

pip install pyadf

Prebuilt wheels are available for Linux and macOS (x86_64 and aarch64) and Windows (x86_64).

pyadf requires Python 3.11 or later.

Usage

Basic Usage

from pyadf import Document

adf_data = {
    "version": 1,
    "type": "doc",
    "content": [
        {
            "type": "paragraph",
            "content": [
                {"type": "text", "text": "Hello, "},
                {"type": "text", "text": "world!", "marks": [{"type": "strong"}]}
            ]
        }
    ]
}

doc = Document(adf_data)
print(doc.to_markdown())
# Output: Hello, **world!**
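When ADF is assembled in code rather than loaded from an export, small builder helpers can keep the nested dictionaries readable. The following is a minimal sketch: `text`, `paragraph`, and `doc` are hypothetical helpers, not part of pyadf, and they only produce plain dictionaries shaped like `adf_data` above.

```python
from typing import Any

Node = dict[str, Any]


def text(value: str, *marks: str) -> Node:
    """Build a text node; each mark name (e.g. "strong") becomes a mark object."""
    node: Node = {"type": "text", "text": value}
    if marks:
        node["marks"] = [{"type": name} for name in marks]
    return node


def paragraph(*children: Node) -> Node:
    """Wrap inline nodes in a paragraph node."""
    return {"type": "paragraph", "content": list(children)}


def doc(*blocks: Node) -> Node:
    """Wrap block nodes in a version 1 document root."""
    return {"version": 1, "type": "doc", "content": list(blocks)}


# Same structure as the hand-written adf_data in the basic usage example.
adf_data = doc(paragraph(text("Hello, "), text("world!", "strong")))
```

The resulting dictionary can be passed to `Document(adf_data)` exactly as in the example above.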

Converting from JSON String

from pyadf import Document

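# Any valid ADF JSON string works; this one holds a single paragraph
adf_json = '{"type": "doc", "content": [{"type": "paragraph", "content": [{"type": "text", "text": "Hello"}]}]}'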
doc = Document(adf_json)
markdown = doc.to_markdown()

Parsing Markdown to ADF

from pyadf import Document, markdown_to_adf

doc = Document.from_markdown("# Hello\n\nThis is **bold**.")
adf = doc.to_adf()

adf2 = markdown_to_adf("1. First\n2. Second")

The Markdown importer is currently strict and targets the canonical subset that pyadf already renders well.

Detailed ADF element and Markdown import policy lives in docs/adf-element-policy.md.

Converting Individual Nodes

from pyadf import Document

node = {
    "type": "heading",
    "attrs": {"level": 2},
    "content": [{"type": "text", "text": "My Heading"}]
}

doc = Document(node)
print(doc.to_markdown())
# Output: ## My Heading

Batch JSONL Processing

For ETL pipelines processing large volumes of ADF documents:

from pyadf import convert_jsonl, MarkdownConfig

# From a JSONL file (one ADF document per line)
for result in convert_jsonl("export.jsonl"):
    print(result)

# From bytes with custom config
config = MarkdownConfig(bullet_marker="*", show_links=True)
for result in convert_jsonl(jsonl_bytes, config=config, batch_size=10_000):
    print(result)

# Error handling modes
from pyadf import ConversionError

for result in convert_jsonl(data, on_error="include"):
    if isinstance(result, ConversionError):
        print(f"Line {result.line_number}: {result.error}")
    else:
        print(result)

convert_jsonl accepts:

source: file path (str), raw bytes, or a binary file-like object
config: optional MarkdownConfig
on_error: "include" (default, yields ConversionError), "skip", or "raise"
batch_size: lines per Rust batch (default 10,000)
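Since `source` may be raw bytes with one ADF document per line, in-memory documents need to be serialized to that shape first. The sketch below shows one way to do it with the standard library only. `to_jsonl_bytes` is a hypothetical helper, not part of pyadf, and UTF-8 encoding is an assumption.

```python
import json
from collections.abc import Iterable
from typing import Any


def to_jsonl_bytes(docs: Iterable[dict[str, Any]]) -> bytes:
    """Serialize ADF documents as JSONL: one compact JSON object per line."""
    # json.dumps escapes newlines inside strings, so each document stays on one line.
    lines = (json.dumps(d, ensure_ascii=False, separators=(",", ":")) for d in docs)
    return "".join(line + "\n" for line in lines).encode("utf-8")


docs = [
    {"type": "doc", "content": [{"type": "paragraph", "content": [{"type": "text", "text": "First"}]}]},
    {"type": "doc", "content": [{"type": "paragraph", "content": [{"type": "text", "text": "Second"}]}]},
]
jsonl_bytes = to_jsonl_bytes(docs)
```

The resulting `jsonl_bytes` can then be passed as the `source` argument, as in the "From bytes" example above.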

Error Handling

from pyadf import Document, InvalidJSONError, UnsupportedNodeTypeError

try:
    doc = Document('invalid json')
except InvalidJSONError as e:
    print(f"Invalid JSON: {e}")

try:
    doc = Document({"type": "unsupported_type"})
except UnsupportedNodeTypeError as e:
    print(f"Unsupported node: {e}")

# Known unsupported nodes such as "extension" can be skipped, warned about, raised as errors, or preserved as HTML at render time
doc = Document({"type": "extension"})
assert doc.to_markdown() == ""

doc = Document(
    {
        "type": "extension",
        "attrs": {"extensionKey": "toc", "extensionType": "com.atlassian.confluence.macro.core"},
    }
)
assert doc.to_markdown(on_known_unsupported="html") == (
    '<div adf="extension" '
    'params=\'{"extensionKey":"toc","extensionType":"com.atlassian.confluence.macro.core"}\'></div>'
)

Known unsupported node handling:

Document(...).to_markdown() defaults to on_known_unsupported="warn" and emits UserWarning while skipping known unsupported nodes such as extension
Document(...).to_markdown(on_known_unsupported="skip") silently skips known unsupported nodes
Document(...).to_markdown(on_known_unsupported="error") raises UnsupportedNodeTypeError
Document(...).to_markdown(on_known_unsupported="html") preserves known unsupported nodes as invisible HTML fallback elements like <div adf="extension" params='...'></div> (or <span ...></span> in inline/cell contexts)

The same on_known_unsupported option is available on convert_jsonl(...).

Customizing Markdown Output

from pyadf import Document, MarkdownConfig

doc = Document(adf_data)

# Default bullet marker is -
doc.to_markdown()  # "- Item 1\n- Item 2"

# Use * for bullet lists
config = MarkdownConfig(bullet_marker="*")
doc.to_markdown(config)  # "* Item 1\n* Item 2"

# Links are shown by default
doc.to_markdown()  # [Link text](http://example.com)

# Hide underlying href while keeping link text marked
config = MarkdownConfig(show_links=False)
doc.to_markdown(config)  # [Link text]

| Option | Values | Default | Description |
| --- | --- | --- | --- |
| `bullet_marker` | `+`, `-`, `*` | `-` | Character used for bullet list items |
| `show_links` | `True`, `False` | `True` | Show underlying links in markdown |
| `date_timezone` | IANA timezone name | `UTC` | Timezone used to render `date` nodes (e.g. `America/New_York`) |
| `date_format` | strftime pattern | `%Y-%m-%dT%H:%M:%S%:z` | Format used to render `date` nodes |
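The two date options act independently: `date_timezone` selects the zone an instant is converted to, and `date_format` selects how the converted value is laid out. The standard-library sketch below illustrates that split. It is an analogue, not pyadf's implementation. The epoch value is assumed to be in seconds and was chosen to match the `date` example in the node type table. A fixed offset stands in for a named IANA zone, and `isoformat()` is used because Python's own `strftime` does not accept `%:z` on every supported version.

```python
from datetime import datetime, timedelta, timezone

# Assumption for illustration: the instant is given as epoch seconds.
instant = datetime.fromtimestamp(1582152559, tz=timezone.utc)

# Default-style rendering: UTC, with the offset written as +00:00.
utc_text = instant.isoformat(timespec="seconds")

# A fixed -05:00 offset stands in for America/New_York in winter, so the
# sketch does not depend on an installed timezone database.
eastern = timezone(timedelta(hours=-5))
eastern_text = instant.astimezone(eastern).strftime("%Y-%m-%d %H:%M")

print(utc_text)      # 2020-02-19T22:49:19+00:00
print(eastern_text)  # 2020-02-19 17:49
```

In pyadf itself, the equivalent choices are expressed through `MarkdownConfig(date_timezone=..., date_format=...)`.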

Supported Markdown Import Subset

Document.from_markdown(...) and markdown_to_adf(...) currently support a small, strict subset of Markdown:

Paragraphs
ATX headings (# through ######)
Bold / italic / bold+italic
Inline links
Bullet and ordered lists
Blockquotes
Fenced code blocks
GFM tables
pyadf HTML fallback elements such as <div adf="extension" ...></div>

The importer intentionally rejects many other Markdown forms for now (for example generic HTML), so roundtrip behavior stays deterministic while the feature set is being expanded.

For the living ADF element and Markdown import policy, see docs/adf-element-policy.md.

Known Unsupported Nodes

These node types are recognized but not rendered. By default, rendering skips them and emits a UserWarning:

mediaSingle
mediaGroup
mediaInline
expand
rule
media
embedCard
extension
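Before converting a large export, it can help to check which documents contain these node types at all. The sketch below walks a plain ADF dictionary and reports where they occur. `find_known_unsupported` and its path notation are hypothetical, not part of pyadf, and the set simply mirrors the list above.

```python
from typing import Any

# Node types listed above as recognized but not rendered.
KNOWN_UNSUPPORTED = frozenset(
    {
        "mediaSingle",
        "mediaGroup",
        "mediaInline",
        "expand",
        "rule",
        "media",
        "embedCard",
        "extension",
    }
)


def find_known_unsupported(node: dict[str, Any], path: str = "doc") -> list[tuple[str, str]]:
    """Return (path, node_type) for every known unsupported node in the tree."""
    hits: list[tuple[str, str]] = []
    if node.get("type") in KNOWN_UNSUPPORTED:
        hits.append((path, node["type"]))
    # Recurse depth-first into child nodes, extending the path as we go.
    for index, child in enumerate(node.get("content", [])):
        hits.extend(find_known_unsupported(child, f"{path}.content[{index}]"))
    return hits


sample = {
    "type": "doc",
    "content": [
        {"type": "paragraph", "content": [{"type": "text", "text": "Intro"}]},
        {"type": "rule"},
        {"type": "mediaSingle", "content": [{"type": "media", "attrs": {"type": "file"}}]},
    ],
}
hits = find_known_unsupported(sample)
```

An empty result means the document contains none of the listed node types, so the choice of `on_known_unsupported` mode should not matter for it.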

Supported ADF Node Types

| ADF Node Type | Markdown Output | Notes |
| --- | --- | --- |
| `doc` | Document root | Top-level container |
| `paragraph` | Plain text with newlines | |
| `text` | Text with optional formatting | Supports bold, italic, links |
| `heading` | `# Heading` (levels 1-6) | |
| `bulletList` | `- Item` | |
| `orderedList` | `1. Item` | |
| `taskList` | `- [ ] Task` | Checkbox tasks |
| `codeBlock` | `` ```language\ncode\n``` `` | Optional language syntax |
| `blockquote` | `> Quote` | |
| `panel` | `> Panel content` | Info/warning/error boxes |
| `table` | Markdown table | Supports headers and colspan |
| `status` | `[STATUS]` | Status badges |
| `inlineCard` | `[link]` or code block | Link previews |
| `emoji` | Unicode emoji | |
| `hardBreak` | Line break | |
| `mention` | `@DisplayName` | Jira user mentions |
| `blockCard` | `[link]` or code block | Link previews |
| `date` | `2020-02-19T22:49:19+00:00` | Configurable via `date_timezone` / `date_format` |

Exception Types

PyADFError — Base exception for all pyadf errors
InvalidJSONError — Raised when JSON parsing fails
InvalidInputError — Raised when input type is incorrect
InvalidADFError — Raised when ADF structure is invalid
MissingFieldError — Raised when required fields are missing
InvalidFieldError — Raised when field values are invalid
UnsupportedNodeTypeError — Raised when encountering unsupported node types
NodeCreationError — Raised when node creation fails

All exceptions include detailed context about the error location in the ADF tree.

Development

Prerequisites

Python 3.11+
Rust toolchain (stable)
maturin (uv tool install maturin)

Setup

git clone https://github.com/YoungseokCh/pyadf.git
cd pyadf
uv sync
uv run maturin develop

Testing

cargo test              # Rust unit tests
uv run pytest tests/ -v # Python tests

Linting

# Rust
cargo fmt --check
cargo clippy -- -D warnings

# Python
ruff check src/ tests/ benchmarks/
ruff format --check src/ tests/ benchmarks/

License

MIT License — see LICENSE file for details.

Changelog

0.5.2

Add date node support, rendered via the new MarkdownConfig.date_timezone (IANA timezone, default UTC) and date_format (strftime, default ISO 8601 date-time) options
Add Python 3.15 support metadata and CI coverage

0.5.1

Support 'version' property for top-level ADF Document node
Add Python 3.14 support

0.5.0

Move on_known_unsupported=error|skip|warn|html from Document(...) construction to Document(...).to_markdown(...)
Add on_known_unsupported="html" to render known unsupported nodes as invisible HTML fallback elements
Add Document.from_markdown(...) for strict Markdown -> ADF parsing
Add Document.to_adf() for exporting canonical ADF dictionaries
Expand Markdown import support for inline code, strikethrough, task lists with TODO / DONE state, nested lists, multi-paragraph list items, and canonical GFM tables with inline marks
Preserve taskList.attrs.localId and taskItem.attrs.localId/state when exporting ADF; Markdown import sets task state but does not generate localId
Canonicalize accepted Markdown variants such as underscore emphasis, URL autolinks, and code-block info strings while rejecting reference-style links
Tighten pyadf HTML fallback parsing by rejecting unclosed fallback wrappers and malformed params JSON

0.4.3

Show link targets by default in markdown output
Use - as the default bullet marker
Treat extension as a known unsupported node instead of failing by default
Add on_known_unsupported=error|skip|warn for known unsupported nodes; unknown node types still error

0.4.2

Add support for blockCard node type

0.4.1

Fix Linux x86_64 wheel builds

0.4.0

Rust core via PyO3 — 5x faster single-doc, 24x faster batch processing
New convert_jsonl() streaming API for batch JSONL processing
New ConversionError dataclass for structured batch error handling
Build system switched from setuptools to maturin
abi3 stable ABI wheels for Linux, macOS (x86_64 + aarch64) and Windows (x86_64)

Breaking changes:

Removed set_debug_mode() and _logger module (will be replaced with Rust-native tracing in a future release)
nodes and _types modules removed (internal implementation replaced by Rust)

0.3.2

Added support for showing href links in markdown output

0.3.1

Added mention node support

0.3.0

Added emoji node support
Added configurable bullet markers via MarkdownConfig

0.1.0

Class-based API with Document class
Support for common ADF node types
Type-safe architecture with comprehensive type hints (Python 3.11+)
Flexible input handling (JSON strings, dictionaries, individual nodes)
