Tutorial
Folio in thirty minutes.
A follow-along guide. You'll initialize a repository, create a datasheet, edit its prose, validate, commit, query, search, export, and simulate a teammate's merge. At the end you'll have exercised every core capability and be able to use Folio for real work.
Without typst installed, you can skip the PDF export step; everything else works without it.
Initialize a repository.
A Folio repository is a git repository with a .folio/config.toml at its root and datasheets under datasheets/. You create one the same way you'd create any project directory:
mkdir acme-docs && cd acme-docs
folio init --tier controlled .
ok: initialized Folio repository at .
tier: Controlled
The --tier flag sets how conservative the repo is about what you can store here:
- open: up to internal classification only
- controlled: up to confidential classification
- restricted: up to restricted classification
Pick based on who has read access. Folio will refuse to commit a confidential datasheet to an open repo.
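To make the tier rule concrete, here is a minimal sketch of the check it implies. The ordering and ceilings come from the list above; the function name and data structures are hypothetical and are not Folio's internals, and only the three classification levels named in this tutorial are modeled.

```python
# Classification levels from least to most sensitive, as named in this tutorial.
CLASSIFICATION_ORDER = ["internal", "confidential", "restricted"]

# Highest classification each repository tier accepts (from the list above).
TIER_CEILING = {
    "open": "internal",
    "controlled": "confidential",
    "restricted": "restricted",
}


def is_allowed(tier: str, classification: str) -> bool:
    """Return True if a datasheet with this classification fits the tier."""
    ceiling = CLASSIFICATION_ORDER.index(TIER_CEILING[tier])
    return CLASSIFICATION_ORDER.index(classification) <= ceiling


print(is_allowed("open", "confidential"))        # False
print(is_allowed("controlled", "confidential"))  # True
```

The point of the sketch is only that the tier is a ceiling: anything at or below it passes, anything above it is refused.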
Folio uses git2 under the hood, so your existing .gitconfig (name, email, signing key) applies. No extra setup.
Create your first datasheet.
folio new payment-service \
  --owner platform-team \
  --classification internal
ok: scaffolded datasheet 'payment-service' at datasheets/payment-service
files staged but not committed; run `folio checkpoint <message>` when ready
Folio has created a directory tree:
ls datasheets/payment-service
folio.toml
public/
  teaser.toml
  teaser.md
sections/
  snapshot.md
  purpose.md
  interfaces.md
  non-functional.md
  dependencies.md
  revision-history.md
The manifest (folio.toml) holds structured metadata. The public/ directory contains the teaser, a short public summary. The sections/ files are Markdown prose that describes the service in detail; these are what a reader will focus on, so they're meant to be edited.
Open datasheets/payment-service/folio.toml in your editor. It looks like this:
schema_version = 1
[identity]
id = "payment-service"
name = "Payment Service"
version = "0.1.0"
created = "2026-04-24T14:22:00Z"
[status]
lifecycle = "in-development"
maturity = "experimental"
last_reviewed = "2026-04-24T14:22:00Z"
[dependencies]
Edit the manifest and prose.
Fill in the manifest with real content. Add links, declare interfaces, state what the service depends on:
[identity]
id = "payment-service"
name = "Payment Service"
version = "1.2.0"
created = "2024-06-10T09:00:00Z"
[status]
lifecycle = "active"
maturity = "stable"
last_reviewed = "2026-04-01T00:00:00Z"
[[links]]
url = "https://git.example.org/platform/payment-service"
kind = "repository"
description = "Source"
[[links]]
url = "https://docs.example.org/payment-service"
kind = "documentation"
description = "API reference"
[[interfaces]]
name = "payment-api"
protocol = "gRPC"
transport = "HTTP/2 over TLS"
description = "Primary payment processing endpoint"
[dependencies]
conflicts = []
[[dependencies.requires]]
name = "postgres"
version = ">=15"
kind = "runtime"
Then write the prose. Open datasheets/payment-service/sections/purpose.md and describe why this service exists — not what it does (that's the snapshot) but the reason it was built:
# Purpose
Before this service existed, payment processing was embedded in
the monolith. Schema changes required coordinated deploys across
three teams and failures were difficult to isolate. Payment
Service extracts that concern behind a single gRPC interface so
upstream services can evolve independently of payment logic.
Write the other required sections (snapshot.md, interfaces.md, non-functional.md, dependencies.md) in the same way. Keep them brief — a reader should be able to absorb the whole datasheet in five minutes.
Validate and commit.
Before committing, run the validator. This is the same gate checkpoint uses — running it explicitly first is a good habit:
folio validate
ok payment-service
1 datasheet: 1 valid, 0 invalid
Green across the board. Commit:
folio checkpoint "initial payment-service datasheet"
ok: committed 4a3f2e1b initial payment-service datasheet
Under the hood, folio checkpoint:
- Runs the same validator folio validate runs
- Updates revision-history.md with the commit metadata
- Stages everything and makes a git commit with your message
- Refuses to commit if any datasheet fails validation
If you're paranoid, inspect: git log --oneline shows a clean commit, git show HEAD shows exactly what Folio did. Nothing magical happened.
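The validate-then-commit gate described above can be sketched in a few lines. This is an illustration of the control flow only: the checkpoint function and its validate and commit callables are hypothetical stand-ins, not Folio's API.

```python
from typing import Callable, Iterable


def checkpoint(
    datasheets: Iterable[str],
    validate: Callable[[str], bool],
    commit: Callable[[str], str],
    message: str,
) -> str:
    """Commit only if every datasheet validates; otherwise raise."""
    invalid = [ds for ds in datasheets if not validate(ds)]
    if invalid:
        # The gate: one invalid datasheet blocks the whole commit.
        raise ValueError(f"refusing to commit; invalid: {', '.join(invalid)}")
    return commit(message)


# Stand-in callables for the example; a real validator would inspect files.
commits = []


def fake_commit(message: str) -> str:
    commits.append(message)
    return f"commit-{len(commits)}"


result = checkpoint(["payment-service"], lambda ds: True, fake_commit, "initial")
print(result)  # commit-1
```

The validator runs before anything is committed, so a failed validation leaves history untouched.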
List, show, and get statistics.
Now that you have a datasheet committed, the observational commands become useful:
folio list
ID               NAME             LIFECYCLE   MATURITY   OWNER            CLASSIFICATION
---------------- ---------------- ----------- ---------- ---------------- --------------
payment-service  Payment Service  active      stable     platform-team    internal
1 datasheet
folio show payment-service
# Payment Service
id: payment-service
owner: platform-team
classification: internal
lifecycle: active
maturity: stable
version: 1.2.0
last reviewed: 23 days ago
...
folio stats
folio stats
1 datasheet

LIFECYCLE              MATURITY
  active          1      stable          1

CLASSIFICATION         OWNERS
  internal        1      platform-team   1

LAST REVIEWED
  fresh (<30d)    1
  recent (<90d)   0
  stale (<1y)     0
  ancient (>1y)   0

REFERENCES
  links           2
  interfaces      1
  requirements    1
Not particularly interesting with one datasheet. It becomes more so as your repository grows. folio stats against a repo with 30 services tells you at a glance how many are deprecated, how many are stale, who owns what.
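The LAST REVIEWED buckets are simple age thresholds. A minimal sketch of that classification follows, using the bucket names and limits from the output above. The function name is hypothetical, and treating exactly 365 days as "ancient" is an assumption, since the output leaves that boundary open.

```python
from datetime import date


def review_bucket(last_reviewed: date, today: date) -> str:
    """Classify a datasheet by how long ago it was last reviewed."""
    age_days = (today - last_reviewed).days
    if age_days < 30:
        return "fresh"
    if age_days < 90:
        return "recent"
    if age_days < 365:  # assumption: a year is counted as 365 days
        return "stale"
    return "ancient"


# The tutorial's datasheet: reviewed 2026-04-01, checked on 2026-04-24.
print(review_bucket(date(2026, 4, 1), date(2026, 4, 24)))  # fresh
```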
Search with BM25.
Full-text search across every manifest, teaser, and section body:
folio search "gRPC"
payment-service score=2.4
"Primary payment processing endpoint over gRPC..."
BM25 is a ranking function from information retrieval research, widely used in search engines. It weights rare words more heavily than common ones and caps the benefit of repeated occurrences. Practically: searching for "deprecation migration" surfaces sections that discuss both concepts, not sections that say "deprecation" 40 times.
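If you want to see those two properties in action, here is a small self-contained sketch of the textbook Okapi BM25 formula. It is not Folio's implementation: the whitespace tokenizer, the k1 and b values (common defaults), and the three toy documents are all assumptions made for the example.

```python
import math
from collections import Counter

K1 = 1.2   # term-frequency saturation; a common default, assumed here
B = 0.75   # document-length normalization; likewise an assumed default


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def bm25_scores(query: str, docs: dict[str, str]) -> dict[str, float]:
    """Score every document against the query with Okapi BM25."""
    tokens = {doc_id: tokenize(body) for doc_id, body in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for t in tokens.values() for term in set(t))

    scores = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if tf[term] == 0:
                continue
            # Rare terms (low df) get a larger idf weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Repeats saturate: this fraction can never exceed K1 + 1.
            norm = K1 * (1 - B + B * len(toks) / avg_len)
            score += idf * tf[term] * (K1 + 1) / (tf[term] + norm)
        scores[doc_id] = score
    return scores


docs = {
    "both": "deprecation and migration plan for the billing service",
    "repeat": " ".join(["deprecation"] * 40),
    "other": "grpc interface for payments",
}
scores = bm25_scores("deprecation migration", docs)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # ['both', 'repeat', 'other']
```

The document that mentions both terms once outranks the one that repeats "deprecation" 40 times, because each extra repetition adds less and less to the score.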
Machine-readable output:
folio search "gRPC" --json | jq '.[] | .id'
"payment-service"
Query with SPARQL.
Every datasheet becomes RDF triples. You can ask structural questions the tabular view can't answer:
folio query 'PREFIX folio: <https://folio.tools/vocab#>
SELECT ?ds ?name WHERE {
  ?ds folio:lifecycle "active" ;
      folio:name ?name
} ORDER BY ?name'
ds                    name
--------------------- ----------------
ds:payment-service    Payment Service
1 row
The vocabulary lives at https://folio.tools/vocab# and covers every manifest field (name, owner, lifecycle, maturity, classification, lastReviewed) plus relations (hasLink, hasInterface, requires).
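If RDF is new to you, it may help to see what "a datasheet becomes triples" could look like. The sketch below flattens a few manifest fields into (subject, predicate, object) tuples and answers the query above with plain pattern matching. The exact mapping Folio uses is not shown in this tutorial, so the helper names, the blank-node labels, and the shape of the requirement node are assumptions.

```python
FOLIO = "https://folio.tools/vocab#"  # vocabulary IRI quoted in this tutorial


def manifest_to_triples(ds_id, fields, requirement_names):
    """Flatten one datasheet into (subject, predicate, object) tuples."""
    subject = f"ds:{ds_id}"
    triples = [(subject, FOLIO + key, value) for key, value in fields.items()]
    for index, name in enumerate(requirement_names):
        node = f"_:{ds_id}-req{index}"  # hypothetical blank-node label
        triples.append((subject, FOLIO + "requires", node))
        triples.append((node, FOLIO + "requirementName", name))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


triples = manifest_to_triples(
    "payment-service",
    {"name": "Payment Service", "lifecycle": "active", "owner": "platform-team"},
    ["postgres"],
)

# Equivalent of: ?ds folio:lifecycle "active" ; folio:name ?name
active = [s for s, _, _ in match(triples, p=FOLIO + "lifecycle", o="active")]
names = [(s, o) for s in active for _, _, o in match(triples, s=s, p=FOLIO + "name")]
print(names)  # [('ds:payment-service', 'Payment Service')]
```

A real SPARQL engine does the same kind of pattern matching, with joins across variables, ordering, and filters on top.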
More interesting queries become useful as you have more datasheets. Try these against an eventual multi-datasheet repo:
# Services owned by platform-team that are stale (not reviewed in >90 days)
folio query 'PREFIX folio: <https://folio.tools/vocab#>
SELECT ?ds ?reviewed WHERE {
  ?ds folio:owner "platform-team" ;
      folio:lastReviewed ?reviewed .
  FILTER (?reviewed < "2026-01-25")
}'
# Every confidential service that depends on a deprecated library
folio query 'PREFIX folio: <https://folio.tools/vocab#>
SELECT ?ds ?dep WHERE {
  ?ds folio:classification "confidential" ;
      folio:requires ?r .
  ?r folio:requirementName ?dep .
  ?deplib folio:id ?dep ;
          folio:lifecycle "deprecated" .
}'
This is the superpower: the queries you can now ask are the queries a senior engineer thinks in, and having them answerable makes that senior engineer's job radically easier.
Check link integrity.
Every URL in every manifest gets a real HTTP request, with timeout and concurrency controls:
folio check-links
checking 2 link(s)...
ok      payment-service  https://git.example.org/platform/payment-service  200  342ms
broken  payment-service  https://docs.example.org/payment-service          404  128ms
1 broken, 1 ok
Exit code is non-zero when any link is broken, which makes it natural to wire up as a CI step. The machine-readable form has everything you'd want to pipe into a dashboard:
folio check-links --json
[
  {
    "datasheet": "payment-service",
    "url": "https://git.example.org/platform/payment-service",
    "status": "ok",
    "http_status": 200,
    "latency_ms": 342
  },
  {
    "datasheet": "payment-service",
    "url": "https://docs.example.org/payment-service",
    "status": "broken",
    "http_status": 404,
    "latency_ms": 128
  }
]
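As an example of consuming that report, the sketch below pulls out the broken entries from the JSON shown above. The field names are the ones in the sample output; the broken_links helper is hypothetical, and in practice you would feed it the command's captured stdout instead of an inline string.

```python
import json

# The sample report from above, inlined so the example runs on its own.
REPORT = """
[
  {"datasheet": "payment-service",
   "url": "https://git.example.org/platform/payment-service",
   "status": "ok", "http_status": 200, "latency_ms": 342},
  {"datasheet": "payment-service",
   "url": "https://docs.example.org/payment-service",
   "status": "broken", "http_status": 404, "latency_ms": 128}
]
"""


def broken_links(report_json: str) -> list[tuple[str, str, int]]:
    """Return (datasheet, url, http_status) for every non-ok entry."""
    entries = json.loads(report_json)
    return [
        (entry["datasheet"], entry["url"], entry["http_status"])
        for entry in entries
        if entry["status"] != "ok"
    ]


for datasheet, url, status in broken_links(REPORT):
    print(f"{datasheet}: {url} returned {status}")
```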
Export to PDF.
If you have typst installed (optional), Folio can render a datasheet as a typeset PDF:
folio export payment-service
ok: wrote out/payment-service-2026-04-24.pdf
This is the artifact you hand to a customer or save as a design-review snapshot. It was generated from the exact source your engineers edit. No "update the deck" step ever again.
out/ is gitignored. If you want to version PDFs, drop that ignore entry; Folio doesn't care.
Simulate a teammate's merge.
Let's create a divergent branch, make a different change on each side, and let Folio do the three-way merge:
git branch teammate HEAD
git checkout teammate

# imagine a teammate changed the owner
sed -i 's/owner = "platform-team"/owner = "payments-team"/' \
  datasheets/payment-service/public/teaser.toml
folio checkpoint "teammate: renamed platform-team to payments-team"

# switch back to the branch you started on (master or main, depending on your git defaults)
git checkout -

# meanwhile you added a new link on your own branch
cat >> datasheets/payment-service/folio.toml << 'EOF'

[[links]]
url = "https://dashboards.example.org/payments"
kind = "external-reference"
EOF
folio checkpoint "added dashboards link"

folio sync teammate --dry-run
folio sync: merging refs/heads/teammate into HEAD
1 datasheet examined
1 clean auto-merge(s)
AUTO-MERGED: payment-service
dry run; no changes written
Folio's semantic merge recognizes that your local change (added a link) and the teammate's change (renamed owner) don't overlap structurally. Git's text merge might have flagged this as a conflict depending on line adjacency; Folio doesn't. Now run for real:
folio sync teammate --strategy=local
resolved 1 datasheet(s) with strategy=LocalWinsAll; working tree updated
git diff
# shows both the rename and the new link applied together
For genuine conflicts — say you both changed the owner to different values — Folio groups them by shape and offers bulk resolution. If ten datasheets all disagree about the same field with the same local-vs-remote values, you make one decision, not ten.
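The idea behind a field-level three-way merge, and behind grouping conflicts by shape, can be sketched as follows. This is a simplified model under stated assumptions: datasheets are flat dictionaries, a missing key is treated as None, and both function names are hypothetical. Folio's actual merge works on the full manifest structure, which this does not attempt to reproduce.

```python
from collections import defaultdict


def three_way_merge(base, local, remote):
    """Merge field by field; return (merged, conflicts)."""
    merged, conflicts = {}, {}
    for key in sorted(base.keys() | local.keys() | remote.keys()):
        base_v, local_v, remote_v = base.get(key), local.get(key), remote.get(key)
        if local_v == remote_v:      # both sides agree, or neither changed
            value = local_v
        elif local_v == base_v:      # only the remote side changed
            value = remote_v
        elif remote_v == base_v:     # only the local side changed
            value = local_v
        else:                        # both changed, to different values
            conflicts[key] = (local_v, remote_v)
            continue
        if value is not None:
            merged[key] = value
    return merged, conflicts


def group_by_shape(conflicts_by_datasheet):
    """Group identical (field, local, remote) conflicts across datasheets."""
    groups = defaultdict(list)
    for ds_id, conflicts in conflicts_by_datasheet.items():
        for field, (local_v, remote_v) in conflicts.items():
            groups[(field, local_v, remote_v)].append(ds_id)
    return dict(groups)


base = {"owner": "platform-team", "links": ("repository", "documentation")}
local = {"owner": "platform-team",
         "links": ("repository", "documentation", "external-reference")}
remote = {"owner": "payments-team", "links": ("repository", "documentation")}

merged, conflicts = three_way_merge(base, local, remote)
print(merged["owner"], len(merged["links"]), conflicts)  # payments-team 3 {}
```

When two sides change the same field to different values, the field lands in conflicts instead. Feeding the per-datasheet conflicts into group_by_shape collapses ten identical disagreements into one group, which is what makes a single bulk decision possible.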
Run doctor in CI.
folio doctor runs nine structural checks against the whole repo:
folio doctor
folio doctor
ok  git repository: opened at /path/to/acme-docs
ok  .folio/config.toml: present, tier=Controlled
ok  datasheets/ structure: 1 datasheet(s)
ok  manifests parse and validate: all manifests ok
ok  required section files: all datasheets have required sections
ok  orphan files: no orphan files
ok  typst (for `folio export`): available: typst 0.14.2
ok  working tree: clean
ok  last checkpoint: 0 day(s) ago
all checks passed
Exit 0 on clean, 2 on warnings, 3 on failures. In a Gitea Actions or GitHub Actions workflow:
- name: Folio health check
  run: |
    folio doctor
    folio validate
    folio check-links --timeout 10
Any regression — a broken link, a missing required section, an invalid manifest — fails the build before it merges.
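One thing to decide is whether warnings should block a merge: in a plain shell step, any non-zero exit fails the job, so exit code 2 is treated the same as 3. If you want warnings to pass, a small wrapper can make the policy explicit. The sketch below relies only on the exit codes documented above; the function names and the strict flag are hypothetical, and run_doctor assumes folio is on the PATH.

```python
import subprocess

# Exit codes as documented above: 0 clean, 2 warnings, 3 failures.
OUTCOMES = {0: "clean", 2: "warnings", 3: "failures"}


def classify(exit_code: int) -> str:
    """Map a folio doctor exit code to a readable outcome."""
    return OUTCOMES.get(exit_code, "unknown")


def should_fail_build(exit_code: int, strict: bool = False) -> bool:
    """Fail on failures and unknown codes; fail on warnings only if strict."""
    outcome = classify(exit_code)
    if outcome == "clean":
        return False
    if outcome == "warnings":
        return strict
    return True


def run_doctor(strict: bool = False) -> int:
    """Run folio doctor and return the exit code the CI step should use."""
    result = subprocess.run(["folio", "doctor"])
    return 1 if should_fail_build(result.returncode, strict) else 0


print(classify(2), should_fail_build(2), should_fail_build(2, strict=True))
# warnings False True
```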
You're done.
Where to go from here.
You have a working Folio repository with one datasheet, validated, committed, exported, searched, queried, and synced. The next thirty datasheets work the same way. The loop is:
- folio new <id>
- Edit the manifest and prose
- folio validate
- folio checkpoint "what changed"
Everything else — queries, stats, syncs, exports — becomes useful as a consequence of having real content. Go populate the repo with the components your team actually owns.