Instructions to use mjbommar/linux-as-a-model-32M with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use mjbommar/linux-as-a-model-32M with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="mjbommar/linux-as-a-model-32M")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mjbommar/linux-as-a-model-32M")
model = AutoModelForCausalLM.from_pretrained("mjbommar/linux-as-a-model-32M")

Notebooks
Google Colab
Kaggle
Local Apps

vLLM

How to use mjbommar/linux-as-a-model-32M with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "mjbommar/linux-as-a-model-32M"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mjbommar/linux-as-a-model-32M",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/mjbommar/linux-as-a-model-32M

SGLang

How to use mjbommar/linux-as-a-model-32M with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "mjbommar/linux-as-a-model-32M" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mjbommar/linux-as-a-model-32M",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "mjbommar/linux-as-a-model-32M" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "mjbommar/linux-as-a-model-32M",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use mjbommar/linux-as-a-model-32M with Docker Model Runner:
```
docker model run hf.co/mjbommar/linux-as-a-model-32M
```

LaaM - Linux as a Model

What happens when we train a simple transformer model to memorize the GPL2 source of the Linux kernel?

Source

https://github.com/mjbommar/laam

Motivation

Simply put, the OSI is making a grave mistake by ignoring the most important transitive dependency in AI - the training data.

As of the latest version of The Open Source AI Definition (draft v. 0.0.8), the OSI has decided that the legal status of training data is irrelevant to their subsequent "approval" of models as "open."

The argument in favor of this omission is that such a requirement would be inconvenient and legally ambiguous in some jurisdictions.

This would be like Creative Commons encouraging the authors of textual or audiovisual works to ignore the terms of copyleft licenses.

Simply put, organizations like the OSI must take a clear, common sense stance: "AI" models like text or multimodal LLMs cannot be considered "open" if they are trained on "stolen" or "closed source" data.

Details

To demonstrate how ridiculous the OSI's position is, I have trained simple transformer models to memorize the source code of Linux version 1.0, which is licensed under the GPL2.

This model is documented and trained in perfect compliance with the OSI's draft guidance on Data Information, Code, and Model sections. All source code is available in the GitHub repository, all dependencies are open source, all input training data is directly described by the source code, and all model weights are available on Hugging Face.

Example Model - 5M parameter Llama2 architecture

For example, this 5M parameter model can be trained on practically any device in a minutes to hours. The model trivially emits copies of Linux 1.0 source code. For example, using the HuggingFace hub copy at mjbommar/linux-as-a-model-5M:

>>> from transformers import pipeline
>>> p = pipeline('text-generation', 'mjbommar/linux-as-a-model-32M')
>>> print(p('', max_new_tokens=256, do_sample=True, temperature=0.2)[0]['generated_text'])
 linux/drivers/FPU-emu/reg_u_div.S.file "reg_u_div.S"
/*---------------------------------------------------------------------------+
 | reg_u_div.S |
 | |
 | Core division routines |
 | |
 | Copyright (C) 1992,1993 |
 | W. Metzenthen, 22 Parker St, Ormond, Vic 3163, |
 | Australia. E-mail billm@vaxc.cc.monash.edu.au |
 | |
 |

License

For the sake of demonstration, I have licensed the model source and weights under the MIT terms, and the OSI should support this model as completely open and compliant with their draft guidance.

Train your own model

# ensure poetry available
# curl -sSL https://install.python-poetry.org | python3 -

# setup poetry environment
$ poetry install --no-root

# optionally install flash-attn
# poetry run pip install wheel
# MAX_JOBS=4 poetry run pip install flash-attn --no-build-isolation

# train a tokenizer with fixed vocab size on linux version 1.0
$ PYTHONPATH=. poetry run python3 -m laam.commands.train_tokenizer \
    --version v1.0/1.0 \
    --vocab-size 32768

# train a 5M parameter model on it

# stage 1: large batch size, 1e-3 learning rate to safely converge near solution 
$ PYTHONPATH=. poetry run accelerate launch \
    laam/commands/train_llama.py \
    --version v1.0/1.0 \
    --precision bfloat16 \
    --hidden_size 64 \
    --intermediate_size 256 \
    --num_hidden_layers 8 \
    --num_attention_heads 32 \
    --max_position_embeddings 512 \
    --learning_rate 0.001 \
    --batch_size 64 \
    --epochs 100
    
# stage 2: single sample batches with 1e-4 learning rate to memorize
$ PYTHONPATH=. poetry run accelerate launch \
    laam/commands/train_llama.py \
    --version v1.0/1.0 \
    --precision bfloat16 \
    --reload \
    --learning_rate 0.0001 \
    --batch_size 1 \
    --epochs 100

Downloads last month: 15

Safetensors

Model size

31.7M params

Tensor type

F32