Smol Model Lab — Lalo Adrian Morales

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>SMOL Model Lab - Compress Your AI</title>
    <style>
        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }

        body {
            font-family: 'Courier New', monospace;
            background: linear-gradient(135deg, #0a0a0a 0%, #1a0f1f 100%);
            color: #00ff88;
            min-height: 100vh;
            overflow-x: hidden;
        }

        .container {
            max-width: 1200px;
            margin: 0 auto;
            padding: 20px;
        }

        .header {
            text-align: center;
            padding: 40px 0;
            position: relative;
            overflow: hidden;
        }

        .title {
            font-size: 3em;
            background: linear-gradient(45deg, #00ff88, #00ffff, #ff00ff);
            -webkit-background-clip: text;
            background-clip: text;
            -webkit-text-fill-color: transparent;
            animation: glow 2s ease-in-out infinite alternate;
            margin-bottom: 10px;
        }

        @keyframes glow {
            from { filter: drop-shadow(0 0 20px #00ff88); }
            to { filter: drop-shadow(0 0 30px #00ffff); }
        }

        .subtitle {
            color: #888;
            font-size: 1.2em;
            margin-bottom: 20px;
        }

        .stats-bar {
            display: flex;
            justify-content: center;
            gap: 30px;
            margin: 30px 0;
            flex-wrap: wrap;
        }

        .stat {
            background: rgba(0, 255, 136, 0.1);
            border: 1px solid #00ff88;
            padding: 15px 25px;
            border-radius: 10px;
            text-align: center;
            animation: pulse 2s infinite;
        }

        @keyframes pulse {
            0%, 100% { transform: scale(1); }
            50% { transform: scale(1.05); }
        }

        .stat-value {
            font-size: 2em;
            font-weight: bold;
            color: #00ffff;
        }

        .stat-label {
            font-size: 0.9em;
            color: #888;
            margin-top: 5px;
        }

        .section {
            background: rgba(0, 0, 0, 0.7);
            border: 1px solid #333;
            border-radius: 15px;
            padding: 30px;
            margin: 30px 0;
            position: relative;
            overflow: hidden;
        }

        .section::before {
            content: '';
            position: absolute;
            top: -2px;
            left: -2px;
            right: -2px;
            bottom: -2px;
            background: linear-gradient(45deg, #00ff88, #00ffff, #ff00ff);
            border-radius: 15px;
            opacity: 0;
            transition: opacity 0.3s;
            z-index: -1;
        }

        .section:hover::before {
            opacity: 0.3;
        }

        .section-title {
            font-size: 1.8em;
            color: #00ffff;
            margin-bottom: 20px;
            display: flex;
            align-items: center;
            gap: 10px;
        }

        .icon {
            width: 30px;
            height: 30px;
            display: inline-block;
        }

        .method-grid {
            display: grid;
            grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
            gap: 20px;
            margin: 20px 0;
        }

        .method-card {
            background: rgba(0, 255, 136, 0.05);
            border: 1px solid #00ff88;
            border-radius: 10px;
            padding: 20px;
            transition: all 0.3s;
            cursor: pointer;
        }

        .method-card:hover {
            transform: translateY(-5px);
            box-shadow: 0 10px 30px rgba(0, 255, 136, 0.3);
            background: rgba(0, 255, 136, 0.1);
        }

        .method-name {
            font-size: 1.3em;
            color: #00ff88;
            margin-bottom: 10px;
        }

        .method-desc {
            color: #aaa;
            line-height: 1.6;
        }

        .code-block {
            background: #0a0a0a;
            border: 1px solid #333;
            border-radius: 10px;
            padding: 20px;
            margin: 20px 0;
            overflow-x: auto;
            position: relative;
        }

        .code-block pre {
            color: #00ff88;
            font-family: 'Courier New', monospace;
            font-size: 0.9em;
            line-height: 1.6;
        }

        .copy-btn {
            position: absolute;
            top: 10px;
            right: 10px;
            background: #00ff88;
            color: #000;
            border: none;
            padding: 5px 10px;
            border-radius: 5px;
            cursor: pointer;
            font-weight: bold;
            transition: all 0.3s;
        }

        .copy-btn:hover {
            background: #00ffff;
            transform: scale(1.1);
        }

        .tabs {
            display: flex;
            gap: 10px;
            margin-bottom: 20px;
            border-bottom: 1px solid #333;
        }

        .tab {
            padding: 10px 20px;
            background: transparent;
            color: #888;
            border: none;
            cursor: pointer;
            transition: all 0.3s;
            position: relative;
        }

        .tab.active {
            color: #00ff88;
        }

        .tab.active::after {
            content: '';
            position: absolute;
            bottom: -1px;
            left: 0;
            right: 0;
            height: 2px;
            background: #00ff88;
        }

        .tab-content {
            display: none;
        }

        .tab-content.active {
            display: block;
            animation: fadeIn 0.5s;
        }

        @keyframes fadeIn {
            from { opacity: 0; transform: translateY(10px); }
            to { opacity: 1; transform: translateY(0); }
        }

        .progress-bar {
            width: 100%;
            height: 30px;
            background: #1a1a1a;
            border-radius: 15px;
            overflow: hidden;
            position: relative;
            margin: 20px 0;
        }

        .progress-fill {
            height: 100%;
            background: linear-gradient(90deg, #00ff88, #00ffff);
            border-radius: 15px;
            transition: width 0.5s ease;
            display: flex;
            align-items: center;
            justify-content: center;
            color: #000;
            font-weight: bold;
        }

        .workflow-steps {
            display: flex;
            justify-content: space-between;
            margin: 30px 0;
            position: relative;
        }

        .workflow-steps::before {
            content: '';
            position: absolute;
            top: 25px;
            left: 0;
            right: 0;
            height: 2px;
            background: #333;
            z-index: -1;
        }

        .step {
            background: #0a0a0a;
            border: 2px solid #333;
            border-radius: 50%;
            width: 50px;
            height: 50px;
            display: flex;
            align-items: center;
            justify-content: center;
            font-weight: bold;
            transition: all 0.3s;
        }

        .step.active {
            background: #00ff88;
            color: #000;
            border-color: #00ff88;
            transform: scale(1.2);
        }

        .step-label {
            position: absolute;
            top: 60px;
            font-size: 0.8em;
            color: #888;
            white-space: nowrap;
            transform: translateX(-50%);
            left: 50%;
        }

        .interactive-demo {
            background: rgba(0, 255, 136, 0.05);
            border: 1px solid #00ff88;
            border-radius: 10px;
            padding: 20px;
            margin: 20px 0;
        }

        .slider-container {
            margin: 20px 0;
        }

        .slider {
            width: 100%;
            height: 10px;
            border-radius: 5px;
            background: #333;
            outline: none;
            -webkit-appearance: none;
        }

        .slider::-webkit-slider-thumb {
            -webkit-appearance: none;
            appearance: none;
            width: 25px;
            height: 25px;
            border-radius: 50%;
            background: #00ff88;
            cursor: pointer;
        }

        .result-box {
            background: #0a0a0a;
            border: 1px solid #333;
            border-radius: 10px;
            padding: 15px;
            margin: 10px 0;
        }

        .button {
            background: linear-gradient(45deg, #00ff88, #00ffff);
            color: #000;
            border: none;
            padding: 12px 30px;
            border-radius: 25px;
            font-weight: bold;
            cursor: pointer;
            transition: all 0.3s;
            font-size: 1em;
            margin: 10px 5px;
        }

        .button:hover {
            transform: scale(1.05);
            box-shadow: 0 5px 20px rgba(0, 255, 136, 0.5);
        }

        .floating-particle {
            position: fixed;
            pointer-events: none;
            opacity: 0.3;
            animation: float 10s infinite ease-in-out;
        }

        @keyframes float {
            0%, 100% { transform: translateY(0) translateX(0); }
            25% { transform: translateY(-20px) translateX(10px); }
            50% { transform: translateY(10px) translateX(-10px); }
            75% { transform: translateY(-10px) translateX(5px); }
        }

        .terminal {
            background: #000;
            border: 1px solid #00ff88;
            border-radius: 10px;
            padding: 20px;
            font-family: 'Courier New', monospace;
            color: #00ff88;
            margin: 20px 0;
        }

        .terminal-line {
            margin: 5px 0;
        }

        .terminal-prompt {
            color: #00ffff;
        }

        .warning-box {
            background: rgba(255, 165, 0, 0.1);
            border: 1px solid orange;
            border-radius: 10px;
            padding: 15px;
            margin: 20px 0;
            color: orange;
        }

        .success-box {
            background: rgba(0, 255, 136, 0.1);
            border: 1px solid #00ff88;
            border-radius: 10px;
            padding: 15px;
            margin: 20px 0;
            color: #00ff88;
        }
    </style>
</head>
<body>
    <!-- Floating particles for ambiance -->
    <div class="floating-particle" style="top: 10%; left: 5%; color: #00ff88;">⚡</div>
    <div class="floating-particle" style="top: 70%; left: 90%; color: #00ffff;">🔥</div>
    <div class="floating-particle" style="top: 30%; left: 80%; color: #ff00ff;">✨</div>
    <div class="floating-particle" style="top: 60%; left: 10%; color: #00ff88;">🚀</div>

    <div class="container">
        <div class="header">
            <h1 class="title">🧬 SMOL MODEL LAB</h1>
            <p class="subtitle">Transform Your 20B Monster into a Pocket Rocket 🚀</p>
            
            <div class="stats-bar">
                <div class="stat">
                    <div class="stat-value">20B→3B</div>
                    <div class="stat-label">SIZE REDUCTION</div>
                </div>
                <div class="stat">
                    <div class="stat-value">4-8x</div>
                    <div class="stat-label">SPEED BOOST</div>
                </div>
                <div class="stat">
                    <div class="stat-value">95%</div>
                    <div class="stat-label">QUALITY RETAINED</div>
                </div>
            </div>
        </div>

        <!-- Model Info Section -->
        <div class="section">
            <h2 class="section-title">
                <span class="icon">🎯</span>
                Your Mission: Compress GPT-OSS-20B
            </h2>
            <p style="color: #aaa; margin-bottom: 20px;">
                You've got OpenAI's gpt-oss-20b model - a reasoning beast that needs taming. Let's make it deployable!
            </p>
            
            <div class="workflow-steps">
                <div class="step active" style="position: relative;">
                    <span>1</span>
                    <span class="step-label">Load Model</span>
                </div>
                <div class="step" style="position: relative;">
                    <span>2</span>
                    <span class="step-label">Choose Method</span>
                </div>
                <div class="step" style="position: relative;">
                    <span>3</span>
                    <span class="step-label">Compress</span>
                </div>
                <div class="step" style="position: relative;">
                    <span>4</span>
                    <span class="step-label">Deploy</span>
                </div>
            </div>
        </div>

        <!-- Compression Methods -->
        <div class="section">
            <h2 class="section-title">
                <span class="icon">⚡</span>
                Choose Your Weapon
            </h2>
            
            <div class="tabs">
                <button class="tab active" onclick="switchTab('quantization')">Quantization (Quick)</button>
                <button class="tab" onclick="switchTab('distillation')">Distillation (Quality)</button>
                <button class="tab" onclick="switchTab('pruning')">Pruning (Experimental)</button>
            </div>

            <!-- Quantization Tab -->
            <div id="quantization" class="tab-content active">
                <h3 style="color: #00ff88; margin: 20px 0;">🎯 Quantization: The Speed Run</h3>
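                <p style="color: #aaa; margin-bottom: 20px;">
                    Convert 16-bit floating-point weights to 4-bit or 8-bit integers. Going from 16 bits to 4 bits shrinks the weights about 4x, and the conversion takes minutes rather than days!
                </p>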
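
                <p style="color: #aaa; margin-bottom: 20px;">
                    To make the idea concrete, here is a minimal sketch of symmetric round-to-nearest ("absmax") quantization in plain Python. It is an illustration under simplifying assumptions, not what llama.cpp does internally: the helper names quantize_absmax and dequantize are hypothetical, and real GGUF formats use block-wise scales rather than one scale for the whole tensor.
                </p>
                <div class="code-block">
                    <button class="copy-btn" onclick="copyCode(this)">COPY</button>
                    <pre>
```python
def quantize_absmax(weights, bits=4):
    """Symmetric round-to-nearest quantization of a list of floats.

    Returns the integer codes and the scale needed to map them back.
    """
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest integer step and clamp to the valid range
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


weights = [0.7, -0.3, 0.12, 0.0]
codes, scale = quantize_absmax(weights, bits=4)
restored = dequantize(codes, scale)
print(codes)     # [7, -3, 1, 0]
print(restored)  # close to the originals, each within half a step
```
</pre>
                </div>
                <p style="color: #aaa; margin-bottom: 20px;">
                    Each restored weight is off by at most half a quantization step, which is where the small quality loss comes from.
                </p>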

                <div class="interactive-demo">
                    <h4 style="color: #00ffff;">Interactive Size Calculator</h4>
                    <div class="slider-container">
                        <label style="color: #888;">Quantization Level: <span id="quant-level">4-bit</span></label>
                        <input type="range" min="2" max="16" value="4" class="slider" id="quantSlider" oninput="updateQuantSize()">
                    </div>
                    <div class="result-box">
                        <p>Original Size: <span style="color: #ff6b6b;">15 GB</span></p>
                        <p>Compressed Size: <span id="compressed-size" style="color: #00ff88;">3.75 GB</span></p>
                        <p>Quality Loss: <span id="quality-loss" style="color: #ffaa00;">~2-5%</span></p>
                    </div>
                </div>
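
                <p style="color: #aaa; margin: 20px 0;">
                    The calculator scales the original size by bits / 16. The same arithmetic as a small Python sketch (the function name quantized_size_gb is hypothetical, and the 15 GB baseline is simply the figure shown above):
                </p>
                <div class="code-block">
                    <button class="copy-btn" onclick="copyCode(this)">COPY</button>
                    <pre>
```python
def quantized_size_gb(original_gb, bits, original_bits=16):
    """Estimate weight size after quantization, mirroring the calculator above."""
    return original_gb * bits / original_bits


for bits in (8, 5, 4, 2):
    print(f"{bits}-bit: {quantized_size_gb(15, bits):.2f} GB")
```
</pre>
                </div>
                <p style="color: #aaa; margin-bottom: 20px;">
                    Treat the result as a lower bound: real formats store scales and other metadata next to the weights, so files on disk come out somewhat larger.
                </p>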

                <div class="code-block">
                    <button class="copy-btn" onclick="copyCode(this)">COPY</button>
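                    <pre># Quick Quantization with llama.cpp
# Run from a built llama.cpp checkout; script and binary names depend on the version
pip install -r requirements.txt

# Download the weights, then convert to GGUF format
huggingface-cli download openai/gpt-oss-20b --local-dir ./models/gpt-oss-20b
python convert_hf_to_gguf.py ./models/gpt-oss-20b \
    --outfile gpt-oss-20b.gguf \
    --outtype f16

# Quantize to 4-bit
./build/bin/llama-quantize gpt-oss-20b.gguf \
    gpt-oss-20b-q4_k_m.gguf Q4_K_M

# Test with Ollama (the Modelfile points at the GGUF file)
echo "FROM ./gpt-oss-20b-q4_k_m.gguf" > Modelfile
ollama create smol-gpt-oss -f ./Modelfile

ollama run smol-gpt-oss</pre>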
                </div>

                <div class="warning-box">
                    ⚠️ <strong>Pro Tip:</strong> Use q5_k_m for better quality, q4_k_s for smaller size!
                </div>
            </div>

            <!-- Distillation Tab -->
            <div id="distillation" class="tab-content">
                <h3 style="color: #00ff88; margin: 20px 0;">🧪 Knowledge Distillation: The Quality Route</h3>
                <p style="color: #aaa; margin-bottom: 20px;">
                    Train a smaller model to mimic the large one. Takes days but preserves quality!
                </p>

                <div class="code-block">
                    <button class="copy-btn" onclick="copyCode(this)">COPY</button>
                    <pre># Full Distillation Pipeline
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaConfig
from datasets import load_dataset
from trl import SFTTrainer, SFTConfig

# 1. Load Teacher Model (gpt-oss-20b) and its tokenizer
teacher = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
teacher.eval()
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")

# 2. Create Student Architecture (dense Llama-style; scale the sizes to hit your parameter budget)
student_config = LlamaConfig(
    hidden_size=2048,
    num_hidden_layers=24,
    num_attention_heads=16,
    intermediate_size=5504,
    vocab_size=teacher.config.vocab_size  # logits must line up with the teacher's for the KL loss
)
student = AutoModelForCausalLM.from_config(student_config)

# 3. Prepare Dataset (ultrachat_200k names its SFT split "train_sft")
dataset = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft[:10000]")

# 4. Distillation Training Config
training_args = SFTConfig(
    output_dir="./smol-gpt-oss",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    warmup_ratio=0.03,
    logging_steps=10,
    save_strategy="epoch",
    bf16=True,
    gradient_checkpointing=True,
    max_length=2048,
    report_to="wandb"
)

# 5. Custom Distillation Loss
class DistillationTrainer(SFTTrainer):
    def __init__(self, *args, teacher=None, temperature=3.0, **kwargs):
        # SFTTrainer does not know about the teacher, so keep it on the subclass
        super().__init__(*args, **kwargs)
        self.teacher = teacher
        self.temperature = temperature

    def compute_loss(self, model, inputs, return_outputs=False, num_items_in_batch=None):
        # Student forward pass
        student_outputs = model(**inputs)

        # Teacher forward pass (no grad)
        with torch.no_grad():
            teacher_outputs = self.teacher(**inputs)

        # Flatten to (tokens, vocab) so 'batchmean' averages per token
        # (padding positions are not masked in this sketch)
        vocab = student_outputs.logits.size(-1)
        student_logits = student_outputs.logits.reshape(-1, vocab)
        teacher_logits = teacher_outputs.logits.reshape(-1, vocab).to(student_logits)

        # KL divergence on temperature-softened distributions
        t = self.temperature
        loss = F.kl_div(
            F.log_softmax(student_logits / t, dim=-1),
            F.softmax(teacher_logits / t, dim=-1),
            reduction='batchmean'
        ) * t ** 2  # temp^2

        return (loss, student_outputs) if return_outputs else loss

# 6. Train!
trainer = DistillationTrainer(
    model=student,
    teacher=teacher,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer
)

trainer.train()

# 7. Save & Quantize Further
student.save_pretrained("smol-gpt-oss-3b")
tokenizer.save_pretrained("smol-gpt-oss-3b")
# Then quantize with llama.cpp for extra compression!</pre>
                </div>
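
                <p style="color: #aaa; margin: 20px 0;">
                    The loss in the pipeline above is a KL divergence between temperature-softened teacher and student distributions, scaled by the temperature squared. A dependency-free sketch for a single token position may help when sanity-checking it; the function names are hypothetical and the logits are made-up numbers.
                </p>
                <div class="code-block">
                    <button class="copy-btn" onclick="copyCode(this)">COPY</button>
                    <pre>
```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_kl(student_logits, teacher_logits, temperature=3.0):
    """KL(teacher || student) on softened distributions, times temperature^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        p * (math.log(p) - math.log(q))
        for p, q in zip(p_teacher, p_student)
        if p > 0
    )
    return kl * temperature ** 2


teacher_logits = [4.0, 1.0, 0.5]
print(distillation_kl(teacher_logits, teacher_logits))    # 0.0 when the student matches
print(distillation_kl([1.0, 4.0, 0.5], teacher_logits))   # positive when it disagrees
```
</pre>
                </div>
                <p style="color: #aaa; margin-bottom: 20px;">
                    A higher temperature flattens both distributions, so the student also learns from the teacher's lower-ranked tokens rather than only its top choice.
                </p>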

                <div class="success-box">
                    ✅ <strong>Target:</strong> a ~3B student that keeps most of the teacher's quality. Treat 95% as a goal and confirm it on your own benchmarks!
                </div>
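
                <p style="color: #aaa; margin: 20px 0;">
                    Before training, it is worth checking how many parameters a student config actually has. The sketch below approximates the count for a dense Llama-style decoder, assuming full multi-head attention, no bias terms, and untied input/output embeddings; the function name llama_param_count is hypothetical. For example, hidden size 2048, 24 layers, intermediate size 5504 and a 32,000-token vocabulary come to about 1.35B parameters, so the dimensions would need to grow to reach 3B.
                </p>
                <div class="code-block">
                    <button class="copy-btn" onclick="copyCode(this)">COPY</button>
                    <pre>
```python
def llama_param_count(hidden_size, num_layers, intermediate_size, vocab_size,
                      tie_embeddings=False):
    """Approximate parameter count for a dense Llama-style decoder.

    Assumes full multi-head attention (no grouped-query attention) and no biases.
    """
    attention = 4 * hidden_size * hidden_size      # q, k, v, o projections
    mlp = 3 * hidden_size * intermediate_size      # gate, up, down projections
    norms = 2 * hidden_size                        # two RMSNorm weights per layer
    per_layer = attention + mlp + norms
    embeddings = vocab_size * hidden_size
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size
    final_norm = hidden_size
    return embeddings + num_layers * per_layer + final_norm + lm_head


total = llama_param_count(2048, 24, 5504, 32000)
print(f"{total / 1e9:.2f}B parameters")   # 1.35B parameters
```
</pre>
                </div>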
            </div>

            <!-- Pruning Tab -->
            <div id="pruning" class="tab-content">
                <h3 style="color: #00ff88; margin: 20px 0;">✂️ Pruning: The Experimental Edge</h3>
                <p style="color: #aaa; margin-bottom: 20px;">
                    Zero out low-magnitude weights. Quality can drop sharply, so plan on fine-tuning afterwards to recover it!
                </p>

                <div class="code-block">
                    <button class="copy-btn" onclick="copyCode(this)">COPY</button>
                    <pre># Gradual Magnitude Pruning with SparseML
# Shell step (run once): pip install sparseml transformers
# Recipe fields and loader arguments vary between SparseML releases; treat this as a sketch.

from sparseml.transformers import SparseAutoModelForCausalLM

# Create pruning recipe (50% sparsity)
# The parameter patterns are assumptions: check them against model.named_parameters().
recipe = """
version: 1.0.0

modifiers:
  - !GMPruningModifier
    start_epoch: 0
    end_epoch: 10
    init_sparsity: 0.0
    final_sparsity: 0.5
    update_frequency: 0.5  # in epochs
    params:
      - "model.layers.*.mlp.experts.*.gate_up_proj.weight"
      - "model.layers.*.mlp.experts.*.down_proj.weight"
      
  - !QuantizationModifier
    start_epoch: 10
    scheme:
      input_activations:
        num_bits: 8
        symmetric: true
      weights:
        num_bits: 4
        symmetric: false
"""

# Save the recipe before loading the model that references it
with open("pruning_recipe.yaml", "w") as f:
    f.write(recipe)

# Load model with pruning support
model = SparseAutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    recipe="pruning_recipe.yaml"
)

# Apply pruning during fine-tuning
# `trainer` is a fine-tuning loop you build around `model` (not shown here)
trainer.train()

# Export optimized model
model.save_pretrained("gpt-oss-pruned")</pre>
                </div>
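
                <p style="color: #aaa; margin: 20px 0;">
                    What the recipe asks for can be sketched without any library: zero the smallest-magnitude weights, and raise the sparsity gradually from init_sparsity to final_sparsity. The cubic ramp below is one common schedule and is an assumption here, not necessarily the one your SparseML version applies; both function names are hypothetical.
                </p>
                <div class="code-block">
                    <button class="copy-btn" onclick="copyCode(this)">COPY</button>
                    <pre>
```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    num_to_prune = int(len(weights) * sparsity)
    # Indices sorted by absolute value, smallest first
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:num_to_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def gmp_sparsity(step, total_steps, init_sparsity=0.0, final_sparsity=0.5):
    """Cubic ramp from init_sparsity to final_sparsity over total_steps."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return final_sparsity + (init_sparsity - final_sparsity) * (1.0 - progress) ** 3


weights = [0.1, -0.5, 0.3, -0.05]
print(magnitude_prune(weights, 0.5))   # [0.0, -0.5, 0.3, 0.0]
print(gmp_sparsity(5, 10))             # 0.4375
```
</pre>
                </div>
                <p style="color: #aaa; margin-bottom: 20px;">
                    The ramp prunes quickly at first and slows down near the end, which leaves fine-tuning more steps to recover at the highest sparsity levels.
                </p>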
            </div>
        </div>

        <!-- Quick Start Section -->
        <div class="section">
            <h2 class="section-title">
                <span class="icon">🚀</span>
                Quick Start: Get Running in 5 Minutes
            </h2>

            <div class="terminal">
                <div class="terminal-line">
                    <span class="terminal-prompt">$</span> git clone https://github.com/ggerganov/llama.cpp
                </div>
                <div class="terminal-line">
                    <span class="terminal-prompt">$</span> cd llama.cpp &amp;&amp; cmake -B build &amp;&amp; cmake --build build --config Release
                </div>
                <div class="terminal-line">
                    <span class="terminal-prompt">$</span> pip install -r requirements.txt huggingface-hub
                </div>
                <div class="terminal-line">
                    <span class="terminal-prompt">$</span> huggingface-cli download openai/gpt-oss-20b --local-dir ./models/
                </div>
                <div class="terminal-line">
                    <span class="terminal-prompt">$</span> python convert_hf_to_gguf.py ./models/ --outfile ./models/gpt-oss-20b-f16.gguf --outtype f16
                </div>
                <div class="terminal-line">
                    <span class="terminal-prompt">$</span> ./build/bin/llama-quantize ./models/gpt-oss-20b-f16.gguf ./models/gpt-oss-20b-q4_k_m.gguf Q4_K_M
                </div>
                <div class="terminal-line">
                    <span class="terminal-prompt">$</span> ./build/bin/llama-cli -m ./models/gpt-oss-20b-q4_k_m.gguf -p "Hello, world!"
                </div>
            </div>

            <button class="button" onclick="alert('Downloading setup script...')">
                DOWNLOAD SETUP SCRIPT 📥
            </button>
        </div>

        <!-- Fine-tuning for Multilingual -->
        <div class="section">
            <h2 class="section-title">
                <span class="icon">🌍</span>
                Bonus: Multilingual Fine-tuning
            </h2>
            <p style="color: #aaa; margin-bottom: 20px;">
                Based on the HuggingFace cookbook - make your model reason in multiple languages!
            </p>

            <div class="code-block">
                <button class="copy-btn" onclick="copyCode(this)">COPY</button>
                <pre># Multilingual Reasoning Fine-tune (from the cookbook)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, Mxfp4Config
from peft import LoraConfig, get_peft_model
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

# Load model, dequantizing the MXFP4 expert weights to bfloat16 for training
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")
model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.bfloat16,
    quantization_config=Mxfp4Config(dequantize=True),
    device_map="auto"
)

# LoRA config for MoE architecture
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules="all-linear",
    target_parameters=[
        "7.mlp.experts.gate_up_proj",
        "7.mlp.experts.down_proj",
        "15.mlp.experts.gate_up_proj",
        "15.mlp.experts.down_proj",
        "23.mlp.experts.gate_up_proj",
        "23.mlp.experts.down_proj",
    ],
)

# Load multilingual dataset
dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split="train")

# Train for multilingual reasoning
trainer = SFTTrainer(
    model=get_peft_model(model, peft_config),
    train_dataset=dataset,
    processing_class=tokenizer,
    args=SFTConfig(
        output_dir="gpt-oss-multilingual",
        num_train_epochs=1,
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_length=2048,
        push_to_hub=True  # requires a Hugging Face login; set to False to keep it local
    )
)

trainer.train()

# Test multilingual reasoning
messages = [
    {"role": "system", "content": "reasoning language: German"},
    {"role": "user", "content": "¿Cuál es la capital de Australia?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
output_ids = trainer.model.generate(**inputs, max_new_tokens=512)
print(tokenizer.batch_decode(output_ids)[0])

# The fine-tuned model should reason in German and answer in Spanish!</pre>
            </div>
        </div>

        <!-- Resource Requirements -->
        <div class="section">
            <h2 class="section-title">
                <span class="icon">💻</span>
                Hardware Requirements
            </h2>

            <div class="method-grid">
                <div class="method-card">
                    <div class="method-name">Quantization</div>
                    <div class="method-desc">
                        • CPU: Any modern CPU<br>
                        • RAM: 32GB recommended<br>
                        • Time: 5-30 minutes<br>
                        • No GPU required!
                    </div>
                </div>
                <div class="method-card">
                    <div class="method-name">Distillation</div>
                    <div class="method-desc">
                        • GPU: 40GB+ VRAM (A100/H100)<br>
                        • RAM: 64GB minimum<br>
                        • Time: 1-3 days<br>
                        • Can use gradient checkpointing
                    </div>
                </div>
                <div class="method-card">
                    <div class="method-name">Deployment</div>
                    <div class="method-desc">
                        • Quantized 4-bit: 8GB VRAM<br>
                        • Quantized 8-bit: 12GB VRAM<br>
                        • Runs on RTX 3060+<br>
                        • Or CPU with enough RAM
                    </div>
                </div>
            </div>
        </div>

        <!-- Next Steps -->
        <div class="section">
            <h2 class="section-title">
                <span class="icon">🎯</span>
                Your Action Plan
            </h2>

            <div class="progress-bar">
                <div class="progress-fill" style="width: 0%;" id="progress">0%</div>
            </div>

            <div style="display: grid; gap: 15px;">
                <label style="color: #888;">
                    <input type="checkbox" onchange="updateProgress()"> Download gpt-oss-20b from HuggingFace
                </label>
                <label style="color: #888;">
                    <input type="checkbox" onchange="updateProgress()"> Install llama.cpp and dependencies
                </label>
                <label style="color: #888;">
                    <input type="checkbox" onchange="updateProgress()"> Convert to GGUF format
                </label>
                <label style="color: #888;">
                    <input type="checkbox" onchange="updateProgress()"> Quantize to 4-bit
                </label>
                <label style="color: #888;">
                    <input type="checkbox" onchange="updateProgress()"> Test with Ollama
                </label>
                <label style="color: #888;">
                    <input type="checkbox" onchange="updateProgress()"> Deploy to production
                </label>
            </div>

            <div style="text-align: center; margin-top: 30px;">
                <button class="button" onclick="alert('LFG! Time to compress that model! 🚀')">
                    START COMPRESSION 🔥
                </button>
            </div>
        </div>

        <!-- Footer -->
        <div style="text-align: center; padding: 40px 0; color: #666;">
            <p>Built for Lalo's SMOL Model Adventures 🚀</p>
            <p style="margin-top: 10px;">Remember: Start with quantization, graduate to distillation</p>
        </div>
    </div>

    <script>
        function switchTab(tabName) {
            // Hide all tabs
            document.querySelectorAll('.tab-content').forEach(tab => {
                tab.classList.remove('active');
            });
            document.querySelectorAll('.tab').forEach(tab => {
                tab.classList.remove('active');
            });
            
            // Show selected tab
            document.getElementById(tabName).classList.add('active');
            event.target.classList.add('active');
        }

        function updateQuantSize() {
            const slider = document.getElementById('quantSlider');
            const bits = parseInt(slider.value);
            const originalSize = 15; // GB
            const compressedSize = (originalSize * bits / 16).toFixed(2);
            const qualityLoss = bits < 4 ? '~5-10%' : bits <= 8 ? '~2-5%' : '<1%';
            
            document.getElementById('quant-level').textContent = bits + '-bit';
            document.getElementById('compressed-size').textContent = compressedSize + ' GB';
            document.getElementById('quality-loss').textContent = qualityLoss;
        }

        function copyCode(button) {
            const codeBlock = button.nextElementSibling;
            const text = codeBlock.textContent;
            navigator.clipboard.writeText(text);
            
            button.textContent = 'COPIED!';
            button.style.background = '#00ff88';
            button.style.color = '#000';
            
            setTimeout(() => {
                button.textContent = 'COPY';
                button.style.background = '';
                button.style.color = '';
            }, 2000);
        }

        function updateProgress() {
            const checkboxes = document.querySelectorAll('input[type="checkbox"]');
            const checked = document.querySelectorAll('input[type="checkbox"]:checked').length;
            const total = checkboxes.length;
            const percentage = Math.round((checked / total) * 100);
            
            const progressBar = document.getElementById('progress');
            progressBar.style.width = percentage + '%';
            progressBar.textContent = percentage + '%';
            
            // Update workflow steps (step 1 stays highlighted as the starting point;
            // later steps switch off again when boxes are unchecked)
            const steps = document.querySelectorAll('.step');
            steps[1].classList.toggle('active', checked > 1);
            steps[2].classList.toggle('active', checked > 3);
            steps[3].classList.toggle('active', checked === total);
        }

        // Add some particle effects on load
        window.addEventListener('load', () => {
            const particles = document.querySelectorAll('.floating-particle');
            particles.forEach((particle, index) => {
                particle.style.animationDelay = `${index * 2}s`;
            });
        });

        // Easter egg: Konami code
        let konamiCode = ['ArrowUp', 'ArrowUp', 'ArrowDown', 'ArrowDown', 'ArrowLeft', 'ArrowRight', 'ArrowLeft', 'ArrowRight', 'b', 'a'];
        let konamiIndex = 0;
        
        document.addEventListener('keydown', (e) => {
            if (e.key === konamiCode[konamiIndex]) {
                konamiIndex++;
                if (konamiIndex === konamiCode.length) {
                    alert('🎉 ULTRA SMOL MODE ACTIVATED! Your models are now 99.9% smaller! (jk)');
                    document.body.style.animation = 'glow 0.5s ease-in-out 3';
                    konamiIndex = 0;
                }
            } else {
                konamiIndex = 0;
            }
        });
    </script>
</body>
</html>