// machlib.org

A formal math library — for machines, by machines.

MachLib — Lean-checkable, EML-native, zero Mathlib dependency.

import MachLib.EML — release snapshot pending — counts by manifest

A machine-native formal-library corpus workbench.
EML-native, Forge-linked, and Lean-checkable where verified artifacts are present. Zero Mathlib dependency, gate-backed.

// the numbers

What MachLib is, in four numbers.

Counts and build timing are tied to release snapshots. The zero-Mathlib dependency gate must pass for every release that claims it.

By release manifest
Records
By release manifest
Tests
Snapshot-specific
Cold Build
Gate-backed zero
Mathlib Imports
// how it works

The full chain — equation to proof.

MachLib is a local formal-library surface in a larger toolchain. It can work on its own, and stack integrations are reviewed separately. Here's the candidate path a single equation can take.

01Engineer writes equation (.eml)Domain expert, single source of truth, no toolchain leaks.
02Forge compiles to 9 targetsC, Rust, Python, LLVM, wasm, Lean, Verilog, VHDL, Chisel.
03Lean output targets MachLib surfacesZero Mathlib dependency is checked by the local release gate.
04MachLib build status is snapshot-specificCounts and timing should be read from the release manifest.
05Lean-check status recorded per artifactVerification metadata is scoped to the artifacts present in the release.

Review every line from equation to generated artifact to Lean-check status.
No upload, public release, or marketplace action is implied here.

// why machlib

Why a separate library?

Mathlib is a monumental human achievement. It was not built for agents — it's slow to import, sparse on metadata, and silent about the proofs that didn't work. MachLib is the opposite of that, on purpose.

Mathlib import

  • 45 minute cold build
  • ≈500,000 lines of dependencies
  • No chain-order annotations
  • No difficulty lanes
  • No tactic trace, no failure data
  • No common-mistake corpus
  • Built for human mathematicians

MachLib import

  • Build timing recorded per release snapshot
  • Compact foundations with gate-backed zero Mathlib dependency
  • Chain-order annotations on every record
  • 6 difficulty lanes (foundations → advanced challenge lane)
  • Tactic trace + success rate per tactic
  • Common-mistake corpus per theorem
  • Built for machines and the agents that train on them
// what every record contains

Eight sections per theorem.

Every MachLib record is a self-contained training example. A single theorem ships with formal + informal statements, multiple proofs, tactic traces, common mistakes, structural profile, and relationships. One record, one JSON file, no ambient state.

depth_of_const.jsonlane 1 · beginner
{
  "schema_version": "1.0.0",
  "theorem": {
    "id": "depth_of_const",
    "statement": {
      "informal": "The depth of a constant-leaf EML tree is zero. A bare constant has no `ceml` operations to count.",
      "formal_lean": "theorem depth_of_const (c : ℂ) : (EMLTree.const c).depth = 0 := rfl"
    },
    "domain": "eml",
    "lane": 1,
    "tags": [
      "depth",
      "definition",
      "rfl",
      "lane-1"
    ]
  },
  "proofs": [
    {
      "id": "p1",
      "tactics": [
        "rfl"
      ],
      "tactic_count": 1,
      "eml_node_cost": 1,
      "style": "definitional",
      "is_optimal": true
    }
  ],
  "difficulty": {
    "lane": 1,
    "label": "beginner",
    "prerequisite_skills": []
  },
  "common_mistakes": [
    {
      "why_fails": "Using `simp [EMLTree.depth]` works but is overkill — `rfl` already suffices."
    },
    {
      "why_fails": "`by decide` fails because `EMLTree.depth` is not a `Decidable` proposition."
    }
  ],
  "structural_profile": {
    "chain_order": null,
    "drift_risk": "LOW",
    "dynamics": {
      "oscillations": 0,
      "decays": 0
    }
  },
  "relationships": {
    "structural_siblings": [
      "depth_ceml_pos",
      "exp_in_EML_one"
    ]
  },
  "metadata": {
    "verified": true,
    "verification_method": "lean4_kernel"
  }
}
// how it grows

Six engines feed the corpus.

MachLib is organized around release snapshots and pipeline metadata. Candidate records can arrive from synthesis, compilation, RL training, search, contribution, and cross-domain bridging, with status recorded per snapshot.

01

Synthetic Generation

Five strategies — constant_swap, domain_change, operator_swap, composition_depth, negation — fan a single base theorem into a family of structurally related variants. Release snapshots record which candidate records came from this engine.

02

Forge Mining

Every @verify(lean) annotation in Monogate Forge emits a Lean obligation. When Forge compiles an industrial vertical, MachLib gains the precision and overflow theorems that vertical demands.

03

Proof Gym (RL)

A Gymnasium-compatible Lean environment with a 54-tactic vocabulary. Agents discover non-obvious proofs; successful trajectories are added back into the corpus along with their tactic traces.

04

Multi-Proof BFS

For each theorem, breadth-first search enumerates alternative proofs shorter or different from the canonical one. MachLib stores all of them — is_optimal is a property, not a chosen branch.

05

Community

Lean developers contribute new theorems through pull requests. Every PR runs the schema validator + lake build + structural sibling check. Accepted records inherit the contributor's name in metadata.

06

Cross-Domain

When a structural sibling is discovered between two domains (Gaussian ↔ photoreceptor, sigmoid ↔ attention), the relationship is recorded on both endpoints. The corpus densifies as it grows.

// philosophy

For machines, by machines.

Mathlib is the canonical formal-mathematics library, and a monumental gift to the world. It was built — over fifteen years, by hundreds of contributors — for human mathematicians. Its proofs are dense, idiomatic, and chained through hundreds of files of inheritance.

An agent that needs to verify a single equation does not need that. It needs an axiomatic real number, four arithmetic operations, exp, log, sin, cos, and a small number of identities. That fits in ~500 lines.

MachLib is what you get when you put metadata first and history second. Every theorem ships with the things an agent actually consumes during training: chain order, difficulty lane, tactic trace, common mistakes, structural siblings. The proofs are there too, but they are no longer the only signal.

Academia has said, of agentic mathematics, that it does not belong in Mathlib. They may be right. So we made a different library — one that does belong to the agents.

// get started

Three ways in.

Pick the one that matches what you do today. The corpus, the foundations, and the contribution loop are reviewed per release snapshot.

ML researchers

Dataset access is pending/private-gated until a reviewed public release is approved. Counts and schema status are published per release snapshot.

# No public Hugging Face release is performed by this site copy. # See the release manifest for dataset access status.
Dataset access status →

Agent builders

Use the Lean foundations as a local check target where verified artifacts are present. The current public default tree and release target are zero Mathlib dependency, backed by the local gate.

git clone https://github.com/agent-maestro/machlib cd machlib/foundations && lake build
github.com/agent-maestro/machlib →

Lean developers

Contribute. The schema is documented; the validator runs on every PR. Add a theorem, add its tactic trace, add the mistakes that don't work. The corpus grows from there.

lake new my_theorem import MachLib
Read the contributing guide →
// the monogate ecosystem

Part of a larger stack.

MachLib is an EML-native, Forge-linked formal-library surface. CapCard and PETAL integrations are gated and subject to separate review. No PETAL upload or CapCard marketplace/public profile action is implied by this site.