Open Source Release

VAREK Data Room

AI Pipeline Programming Language — Complete release archive

Kenneth Wayne Douglas, MD  ·  MIT License  ·  2025–2026

5 Versions
659 Tests Passing
13,006 Lines of Code
261 Stdlib Functions
10 Files
530 KB Total Size
Source Releases
📦
varek-v0.1.zip
Foundation layer. Formal EBNF grammar, hand-rolled lexer (all tokens, spans, error recovery), and recursive-descent parser with panic-mode synchronisation. Includes AST node hierarchy and a visitor-based pretty-printer. The entire front-end pipeline.
v0.1.0 · 109 tests · 2,546 lines · Lexer · Parser · AST · EBNF Grammar
34 KB
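To illustrate what panic-mode synchronisation means in a recursive-descent parser, here is a minimal sketch for a toy `let name = number;` grammar. It is not taken from the v0.1 sources: the toy grammar and the names `Token`, `lex`, `Parser`, and `synchronise` are all assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # "let", "ident", "num", "eof", or the punctuation character itself
    text: str
    pos: int   # character offset, standing in for a full source span


TOKEN_RE = re.compile(r"\d+|[A-Za-z_]\w*|\S")


def lex(src):
    """Split src into tokens; any other non-space character becomes its own kind."""
    tokens = []
    for m in TOKEN_RE.finditer(src):
        text = m.group()
        if text == "let":
            kind = "let"
        elif text[0].isdigit():
            kind = "num"
        elif text[0].isalpha() or text[0] == "_":
            kind = "ident"
        else:
            kind = text
        tokens.append(Token(kind, text, m.start()))
    tokens.append(Token("eof", "", len(src)))
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.i, self.errors = tokens, 0, []

    def peek(self):
        return self.tokens[self.i]

    def expect(self, kind):
        tok = self.peek()
        if tok.kind != kind:
            raise SyntaxError(f"expected {kind!r} at offset {tok.pos}, got {tok.text!r}")
        self.i += 1
        return tok

    def synchronise(self):
        # Panic mode: discard tokens up to and including the next ';' (or stop at EOF).
        while self.peek().kind not in (";", "eof"):
            self.i += 1
        if self.peek().kind == ";":
            self.i += 1

    def parse_let(self):
        self.expect("let")
        name = self.expect("ident").text
        self.expect("=")
        value = int(self.expect("num").text)
        self.expect(";")
        return (name, value)

    def parse_program(self):
        stmts = []
        while self.peek().kind != "eof":
            try:
                stmts.append(self.parse_let())
            except SyntaxError as err:
                self.errors.append(str(err))  # record the error, then resume
                self.synchronise()
        return stmts
```

Parsing `let a = 1; let = 2; let c = 3;` with this sketch records one error for the malformed middle statement and still returns the two well-formed statements on either side of it, which is the point of synchronising on a statement boundary.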
📦
varek-v0.2.zip
Full static type system. Hindley-Milner inference (Algorithm W), let-polymorphism, Robinson's unification with occurs check, tensor shape tracking via symbolic dimensions, schema structural subtyping, Optional coercion, Result<T> propagation, and a runtime SchemaValidator. Type errors with source spans and "did you mean?" hints.
v0.2.0 · 163 tests · 5,562 lines · HM Inference · Type System · Unification · Schema Validation
60 KB
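For readers unfamiliar with Robinson's unification and the occurs check, the following is a minimal, generic sketch of the technique. It is not the code in unify.py: the type representation (`TVar`, `TCon`) and the function names are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TVar:
    name: str


@dataclass(frozen=True)
class TCon:
    name: str
    args: tuple = ()


class UnifyError(Exception):
    pass


def resolve(t, subst):
    """Follow variable bindings until an unbound variable or a constructor is reached."""
    while isinstance(t, TVar) and t in subst:
        t = subst[t]
    return t


def occurs(var, t, subst):
    """True if var appears anywhere inside t (under the current substitution)."""
    t = resolve(t, subst)
    if t == var:
        return True
    if isinstance(t, TCon):
        return any(occurs(var, arg, subst) for arg in t.args)
    return False


def unify(a, b, subst=None):
    """Return a substitution (dict of TVar -> type) that makes a and b equal."""
    subst = dict(subst or {})
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return subst
    if isinstance(a, TVar):
        if occurs(a, b, subst):  # reject infinite types such as a = List[a]
            raise UnifyError(f"infinite type: {a.name} occurs in {b}")
        subst[a] = b
        return subst
    if isinstance(b, TVar):
        return unify(b, a, subst)
    if a.name != b.name or len(a.args) != len(b.args):
        raise UnifyError(f"cannot unify {a} with {b}")
    for x, y in zip(a.args, b.args):
        subst = unify(x, y, subst)
    return subst
```

Unifying `Fn(a, Int)` with `Fn(Bool, b)` binds `a` to `Bool` and `b` to `Int`, while unifying `a` with `List(a)` raises `UnifyError` because of the occurs check.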
📦
varek-v0.3.zip
Native code generation. Raw ctypes bindings to libLLVM-20.so, with no llvmlite dependency. Emits verified LLVM IR (SSA form, phi nodes, alloca/load/store), native assembly (.s), and object files (.o). Uses a singleton TargetMachine and LLVM optimisation passes, and retains a tree-walking interpreter alongside the compiled path. First benchmarks: IR generation in ~1 ms; native code is projected (not yet measured) to run 10–100× faster than CPython.
v0.3.0 · 97 tests · 7,792 lines · LLVM IR · Native Codegen · Interpreter · Benchmarks
86 KB
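As a rough idea of what binding libLLVM through raw ctypes involves, the sketch below loads the shared library, creates an empty module, and prints its IR. It is not the contents of llvm_api.py. The helper names `load_llvm` and `empty_module_ir` are hypothetical; the four `LLVM*` functions are part of the LLVM C API. The sketch returns `None` when no suitable library is installed, since LLVM is optional.

```python
import ctypes
import ctypes.util


def load_llvm():
    """Try to load a shared libLLVM; return None if none is available."""
    candidates = ["libLLVM-20.so", ctypes.util.find_library("LLVM")]
    for name in candidates:
        if not name:
            continue
        try:
            lib = ctypes.CDLL(name)
        except OSError:
            continue
        # Only accept a library that actually exports the C API entry point.
        if hasattr(lib, "LLVMModuleCreateWithName"):
            return lib
    return None


def empty_module_ir(module_name):
    """Return the textual IR of an empty module, or None if libLLVM is missing."""
    lib = load_llvm()
    if lib is None:
        return None

    # Opaque LLVM handles are pointers, so declare them as c_void_p explicitly;
    # the ctypes default return type (int) would truncate 64-bit pointers.
    lib.LLVMModuleCreateWithName.argtypes = [ctypes.c_char_p]
    lib.LLVMModuleCreateWithName.restype = ctypes.c_void_p
    lib.LLVMPrintModuleToString.argtypes = [ctypes.c_void_p]
    lib.LLVMPrintModuleToString.restype = ctypes.c_void_p
    lib.LLVMDisposeMessage.argtypes = [ctypes.c_void_p]
    lib.LLVMDisposeMessage.restype = None
    lib.LLVMDisposeModule.argtypes = [ctypes.c_void_p]
    lib.LLVMDisposeModule.restype = None

    module = lib.LLVMModuleCreateWithName(module_name.encode())
    raw = lib.LLVMPrintModuleToString(module)
    try:
        return ctypes.string_at(raw).decode()
    finally:
        lib.LLVMDisposeMessage(raw)   # the string is owned by LLVM
        lib.LLVMDisposeModule(module)
```

Declaring `argtypes` and `restype` for every function is the main discipline a raw binding layer has to maintain, because ctypes cannot infer them from the shared library.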
📦
varek-v0.4.zip
Full standard library surface — 7 modules, 261 exported functions. var::io (41 fns: files, paths, streams, env), var::tensor (111 fns: NumPy-backed, linear algebra, activations, distance), var::http (19 fns: client + server), var::async (32 fns: channels, futures, parallel_map, mutex, atomic), var::pipeline (15 fns: execution engine, combinators, streaming), var::model (14 fns: inference, embeddings, tokenization), var::data (29 fns: CSV/JSON/NPY, splits, augmentation, metrics).
v0.4.0 · 182 tests · 11,097 lines · var::io · var::tensor · var::http · var::async · var::pipeline · var::model · var::data
118 KB
📦
varek-v1.0.zip
Stable release. syn package manager (20 commands, including new / build / run / install / publish / search / repl / fmt / doc). Package format (.synpkg, syn.toml, syn.lock) with semver resolution and SHA-256 checksum verification. Interactive REPL with :type / :bench / :ir. Source formatter. Doc generator (Markdown + HTML + JSON). Governance docs, 3 implemented RFCs, community contributor ladder, and formal stability guarantees.
v1.0.0 Stable · 108 tests · 13,006 lines · Package Manager · REPL · Formatter · Doc Gen · RFC Process · Governance
148 KB
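SHA-256 checksum verification of a downloaded package can be sketched with the Python standard library alone. This is a generic illustration, not the logic in packager.py: the function names are hypothetical, and how syn.lock actually stores the expected digest is not specified here.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Compare the file's digest with the expected hex digest from a lock file."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A caller would pass the archive path and the digest recorded at publish time, and refuse to install the package when `verify` returns `False`.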
Documentation & Specification
📄
VAREK-Spec-Paper-Douglas.pdf
Formal language specification paper. Covers design motivation, formal grammar, type system semantics, pipeline execution model, and the case for a unified AI/ML pipeline language. Authored by Kenneth Wayne Douglas, MD.
Specification · Academic Paper
21 KB
🌐
varek-landing.html
Project landing page (single-file HTML). Suitable for hosting on GitHub Pages, a personal domain, or as a project overview. Contains feature overview, code examples, version history, and the design rationale.
Landing Page · GitHub Pages Ready
22 KB
📝
VAREK-README.md
Project README. Quick start, feature overview, language tour, stdlib module listing, and architecture diagram. Drop this directly into the root of a GitHub repository.
README · GitHub Ready
4.5 KB
📝
VAREK-SPEC.md
Draft language specification in Markdown. Grammar rules, type system formal description, pipeline semantics, and built-in function signatures. v0.1 draft — superseded by the formal grammar in each version's grammar/VAREK.ebnf.
Language Spec · Draft v0.1
6.2 KB
🐍
varek-parser.py
Standalone single-file parser prototype. Self-contained — no imports beyond Python stdlib. Useful as a quick reference implementation or for embedding in other projects without the full package structure. Pre-dates the modular v0.1 release.
Standalone · Zero Dependencies · Reference Implementation
17.5 KB
How to Use This Data Room

🚀 Starting a new project

Download varek-v1.0.zip. Unzip, then:

  • python syn_cli.py new my-proj
  • cd my-proj && python ../syn_cli.py run
  • python ../syn_cli.py repl

Requirements: Python 3.10+, NumPy. LLVM optional (for native emit).

📖 Reading the code

Each zip is a self-contained milestone. Recommended reading order:

  • v0.1 — lexer.py → parser.py → ast.py
  • v0.2 — types.py → unify.py → infer.py
  • v0.3 — llvm_api.py → codegen.py
  • v0.4 — stdlib/*.py
  • v1.0 — packager.py → syn_cli.py

🔬 Academic / research use

Start with VAREK-Spec-Paper-Douglas.pdf for the formal design rationale. Then VAREK-SPEC.md for the grammar and type system. The grammar/VAREK.ebnf inside each zip is the normative grammar definition at each version.

🌐 GitHub / open source setup

  • Use VAREK-README.md as README.md in the repo root
  • Host varek-landing.html on GitHub Pages
  • Upload .zip files as GitHub Release assets
  • Copy docs/GOVERNANCE.md and docs/RFC_TEMPLATE.md from v1.0 into the repo
  • Add rfcs/ directory for community proposals

🏗️ Extending VAREK

The var::* stdlib is the primary extension surface. Add a new module in varek/stdlib/mymodule.py, register it in stdlib/__init__.py, and wire it to the runtime. No core language changes are needed.
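The registration step might look something like the sketch below. Everything in it is hypothetical: the `REGISTRY` dict, the `register` decorator, the `call` helper, and the `var::mymodule` name are assumptions for illustration, and the real hook in stdlib/__init__.py may work quite differently.

```python
# Hypothetical registry; the actual mechanism in stdlib/__init__.py may differ.
REGISTRY = {}


def register(module_name):
    """Decorator that files a Python function under a stdlib module name."""
    def decorator(fn):
        REGISTRY.setdefault(module_name, {})[fn.__name__] = fn
        return fn
    return decorator


# Contents of a hypothetical varek/stdlib/mymodule.py
@register("var::mymodule")
def clamp(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def call(qualified_name, *args):
    """Look up 'module::function' in the registry and invoke it."""
    module_name, _, fn_name = qualified_name.rpartition("::")
    return REGISTRY[module_name][fn_name](*args)
```

With this shape, `call("var::mymodule::clamp", 15, 0, 10)` returns `10`, and adding a function touches only the new module file rather than the core language.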

For language changes, open an RFC using the template in docs/RFC_TEMPLATE.md.

⚡ Running the tests

  • v0.1: python tests/test_varek.py
  • v0.2: python tests/test_types.py
  • v0.3: python tests/test_v03.py
  • v0.4: python tests/test_stdlib.py
  • v1.0: python tests/test_v10.py

All tests pass green. Total: 659 tests.

License
MIT

MIT License — Free for any use

Copyright © 2025–2026 Kenneth Wayne Douglas, MD.
Permission is granted, free of charge, to any person obtaining a copy of this software to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies, subject to the condition that the copyright notice and this permission notice appear in all copies or substantial portions of the software.

This is one of the most permissive open source licenses. You can fork it, build commercial products on it, publish papers citing it, or include it in proprietary software, as long as you keep the copyright notice.

Cumulative Build Totals
Version  What Ships                                               Tests (per version)  Py Lines (cumulative)  Size
v0.1     Lexer · Parser · AST · Grammar                           109                  2,546                  34 KB
v0.2     HM Type Inference · Unification · Schemas                163                  5,562                  60 KB
v0.3     LLVM Codegen · Native Emit · Interpreter · Benchmarks    97                   7,792                  86 KB
v0.4     7 Stdlib Modules · 261 Functions                         182                  11,097                 118 KB
v1.0 ✓   Package Manager · REPL · Formatter · Docs · RFC Process  108                  13,006                 148 KB
TOTAL    All versions · All artifacts                             659                  13,006                 530 KB