
MaincodeHQ/mainrun



Dev Container · Python 3.10 · PyTorch

Why Mainrun?

Mainrun is Maincode's standardized assessment framework for evaluating machine learning engineering expertise, with a specific focus on LLM training and optimization. This project provides a consistent, real-world environment to explore a candidate's knowledge of:

  • Transformer architectures and training dynamics
  • Performance optimization techniques
  • Code quality and engineering practices
  • Problem-solving approaches to ML challenges
  • Areas of personal interest and expertise in ML optimization

The Challenge

Your task is to analyze and improve the training run of a small GPT-2-style model. The provided code trains it on a dataset of Hacker News headlines.

Your goal is to drive the validation loss as low as possible within 7 epochs. The baseline to beat is a validation loss of 1.754.

The Rules

The rules are simple:

  • You cannot change the number of epochs, the random seed, the dataset, or the validation fraction.
  • You cannot use pre-trained weights or augment the training data.
  • You cannot change the evaluate() function.
  • Everything else is fair game! You can change the model architecture, initialization, tokenization, hyperparameters, optimizer, scheduler, training loop, etc.
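
To make the scope concrete, here is a minimal sketch of the kind of change that is in bounds: swapping the optimizer and learning-rate schedule for AdamW with linear warmup and cosine decay. The function name, hyperparameter values, and the way it would plug into train.py are illustrative assumptions, not part of the provided code.

# Hypothetical example of an allowed change: AdamW with warmup + cosine decay.
# Names and values are assumptions for illustration; adapt to the actual train.py.
import math
import torch

def build_optimizer_and_scheduler(model, total_steps, warmup_steps=200,
                                  peak_lr=3e-4, min_lr_ratio=0.1):
    optimizer = torch.optim.AdamW(model.parameters(), lr=peak_lr,
                                  betas=(0.9, 0.95), weight_decay=0.1)

    def lr_lambda(step):
        # Linear warmup to peak_lr, then cosine decay to min_lr_ratio * peak_lr.
        if step < warmup_steps:
            return (step + 1) / warmup_steps
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return min_lr_ratio + (1.0 - min_lr_ratio) * 0.5 * (1.0 + math.cos(math.pi * progress))

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    return optimizer, scheduler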

Getting Started

# 1. Clone the repository and open it in VS Code
git clone <your-assessment-repo-url>
cd mainrun
code .

# 2. When prompted, reopen the project in the Dev Container
# This initial build may take a few minutes.

# 3. Run the baseline training to see the starting point
task train

# 4. Analyze the baseline performance in the log file
cat mainrun/logs/baseline.log

# 5. Start optimizing!
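
If you want to pull the per-epoch validation losses out of the log programmatically, a small helper like the one below can work. It assumes the log lines contain a pattern along the lines of "val loss: 1.754"; the exact wording in baseline.log is an assumption, so adjust the regex to match what you actually see.

# Hypothetical helper to extract validation losses from a training log.
# The "val loss" pattern is an assumption about the log format; adjust as needed.
import re

def val_losses(path="mainrun/logs/baseline.log"):
    pattern = re.compile(r"val[ _-]?loss[:=\s]+([0-9]*\.[0-9]+)", re.IGNORECASE)
    losses = []
    with open(path) as f:
        for line in f:
            match = pattern.search(line)
            if match:
                losses.append(float(match.group(1)))
    return losses

if __name__ == "__main__":
    print(val_losses())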

Key Commands

You only need two commands for this assessment:

  • task train: The primary command. It first creates a checkpoint and then runs the full training pipeline in train.py. You have complete freedom to refactor or extend the code, but all functionality must stay accessible through this single script (one way to structure that is sketched after this list).
  • task submit: When you are ready, this command will create a final checkpoint, zip up your entire repository (including all code and logs), and upload it to Maincode's evaluation system for review.
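
One way to keep every experiment reachable through the single train.py entry point, assuming task train ultimately invokes python train.py (an assumption about the Taskfile), is to route all options through one argument parser. The flag names and defaults below are hypothetical placeholders.

# Hypothetical train.py skeleton; flag names and defaults are placeholders.
import argparse

def main():
    parser = argparse.ArgumentParser(description="Mainrun training entry point")
    parser.add_argument("--lr", type=float, default=3e-4, help="peak learning rate")
    parser.add_argument("--batch-size", type=int, default=64, help="training batch size")
    parser.add_argument("--compile", action="store_true", help="enable torch.compile if available")
    args = parser.parse_args()

    # Build the tokenizer, model, optimizer, and training loop from args here,
    # so every experiment stays reproducible via `task train`.
    print(f"training with lr={args.lr}, batch_size={args.batch_size}, compile={args.compile}")

if __name__ == "__main__":
    main()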

Evaluation

Submissions will be evaluated on:

  1. Model Performance: the improvement in validation loss over the baseline; this is weighted most heavily.
  2. Code Quality: Clean, maintainable, and well-documented code.
  3. Innovation: Creative or insightful approaches to optimization.
  4. Documentation: A short report.pdf that walks through the changes you made, your reasoning, and their effect on the training curves. Place the report in the mainrun folder before submitting.

Legal and Contributing

By participating and submitting your work, you agree to the terms outlined in our Legal Notice. This includes the assignment of all intellectual property rights in your submission to Maincode Pty Ltd.

For information on how to report issues or suggest improvements to the assessment framework itself, please see our Contributing Guide.

The Team

This project is made possible by these wonderful contributors:

  • Kees (@casebakker)
  • Sara (@maincode-sarae)
  • Fabian (@fabian-maincode)
  • Dave (@maincode-dave)
