Mini-LLM Pretraining Framework

A fully self-contained codebase for training transformer language models from scratch. Rather than depending on large frameworks, the project re-implements the core mechanisms of modern transformers using only PyTorch. The goal is to surface the design decisions that production frameworks abstract away.

View on GitHub →

Key Features

- Core mechanisms of modern transformers re-implemented directly in PyTorch, with no dependence on large frameworks
- Model size, architecture, and training hyperparameters driven by a single JSON config file
- One-command training launch via uv

Why Build This

Modern LLM research often hides crucial design decisions behind large-scale libraries. Re-implementing from first principles surfaces those decisions and keeps every component small enough to read and modify.
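
To give a sense of the level the code operates at, here is a minimal sketch of single-head causal self-attention in plain PyTorch. It is illustrative only, not the repository's actual code, and the class and parameter names are hypothetical:

    import math

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CausalSelfAttention(nn.Module):
        """Single-head causal self-attention, written out explicitly.

        Hypothetical sketch, not the repository's implementation. The
        point is that every design decision (projection shapes, the
        1/sqrt(d) scaling, the causal mask) is visible in the code.
        """

        def __init__(self, d_model: int, max_len: int = 1024):
            super().__init__()
            self.qkv = nn.Linear(d_model, 3 * d_model, bias=False)
            self.proj = nn.Linear(d_model, d_model, bias=False)
            # Lower-triangular mask enforces left-to-right attention.
            mask = torch.tril(torch.ones(max_len, max_len, dtype=torch.bool))
            self.register_buffer("mask", mask)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            B, T, C = x.shape
            q, k, v = self.qkv(x).chunk(3, dim=-1)
            # Scaled dot-product attention scores, masked so position t
            # can only attend to positions <= t.
            scores = q @ k.transpose(-2, -1) / math.sqrt(C)
            scores = scores.masked_fill(~self.mask[:T, :T], float("-inf"))
            weights = F.softmax(scores, dim=-1)
            return self.proj(weights @ v)

    # Example: a batch of 4 sequences of length 16 with d_model = 64.
    attn = CausalSelfAttention(d_model=64)
    out = attn(torch.randn(4, 16, 64))  # shape: (4, 16, 64)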

Usage

After setup, launching a training run is a single command:

uv run python3 train/pretrain.py

Model size, architecture, and training hyperparameters are all controlled through a JSON config file.
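
As an illustration, a config along these lines would capture those settings. The key names here are hypothetical, so consult the repository for the actual schema:

    {
      "n_layers": 12,
      "n_heads": 12,
      "d_model": 768,
      "vocab_size": 32000,
      "max_seq_len": 1024,
      "batch_size": 32,
      "learning_rate": 3e-4,
      "warmup_steps": 1000,
      "max_steps": 100000
    }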

Scope & Design Choices

The project deliberately trades convenience for transparency: it targets small, from-scratch pretraining runs rather than production-scale training, and it keeps its dependency surface to PyTorch alone so that every core mechanism remains readable and modifiable.

Repository

Full code, setup instructions, and examples: github.com/asantucci/language-model