Axolotl fine-tunes LLMs through YAML configuration: install with pip install axolotl, preprocess with python -m axolotl.cli.preprocess config.yml, train with accelerate launch -m axolotl.cli.train config.yml, and merge LoRA adapters with python -m axolotl.cli.merge_lora config.yml. A config names the base model (base_model, model_type, tokenizer_type), an adapter (lora or qlora, with load_in_4bit: true for 4-bit quantization, plus lora_r, lora_alpha, and lora_target_modules), and one or more datasets in alpaca (instruction/input/output), sharegpt (conversations list), completion (raw text), or chat_template (messages) format, optionally weighted with ds_weight when mixing multiple entries. The same YAML covers training hyperparameters (num_epochs, micro_batch_size, gradient_accumulation_steps, learning_rate, lr_scheduler, warmup_steps), memory savers (flash_attention, sample_packing with eval_sample_packing: false), distributed backends (deepspeed: configs/zero2.json; fsdp: [full_shard, auto_wrap] with fsdp_config), sequence_len, checkpointing (output_dir, saves_per_epoch), Weights & Biases logging (wandb_project, wandb_name), Hub upload after training (hub_model_id), and DPO alignment (rl: dpo with chatml.intel-format pair datasets). Claude Code generates Axolotl YAML configs, multi-dataset recipes, DeepSpeed integration, DPO configs, and CLI training scripts.
# CLAUDE.md for Axolotl
## Axolotl Stack
- Version: axolotl >= 0.4, transformers >= 4.40, deepspeed >= 0.14
- Train: accelerate launch -m axolotl.cli.train config.yml
- Config: base_model, model_type, adapter (lora|qlora), datasets[{path, type}]
- Formats: alpaca, sharegpt, completion, chat_template, chatml.intel (for DPO)
- Flash Attn: flash_attention: true (requires flash-attn>=2.0)
- Sample packing: sample_packing: true (efficient for short sequences)
- Merge: python -m axolotl.cli.merge_lora config.yml → merged_model/
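The dataset formats above expect specific record shapes. A minimal sketch of what each type looks like on disk, with a hypothetical validity check (the field names follow the format descriptions above; everything else is illustrative):

```python
# Illustrative record shapes for Axolotl dataset types (not real training data).
alpaca_record = {
    "instruction": "Write a function to reverse a string.",
    "input": "",  # optional context; often empty
    "output": "def reverse(s):\n    return s[::-1]",
}

sharegpt_record = {
    "conversations": [
        {"from": "human", "value": "What is a list comprehension?"},
        {"from": "gpt", "value": "A concise way to build lists in Python."},
    ]
}

completion_record = {"text": "Raw pretraining-style text goes here."}


def is_valid_sharegpt(rec: dict) -> bool:
    """Hypothetical structural check: non-empty turns with known roles."""
    turns = rec.get("conversations", [])
    return bool(turns) and all(
        t.get("from") in {"human", "gpt"} and "value" in t for t in turns
    )
```

This kind of lightweight check is useful before pointing a config at a local JSONL file, since Axolotl's preprocessing will fail later on malformed rows.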
## Training Configs
# configs/qlora_llama3.yml — QLoRA fine-tuning on a single GPU
base_model: meta-llama/Llama-3.2-3B-Instruct
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer
# QLoRA settings
load_in_4bit: true
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true # Target ALL linear layers automatically
# or explicit: lora_target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
# Dataset — can mix multiple sources
datasets:
  - path: iamtarun/python_code_instructions_18k_alpaca
    type: alpaca            # instruction + input + output fields
    split: train[:5000]
    shards: 1
  - path: HuggingFaceH4/ultrachat_200k
    type: sharegpt          # conversations: [{from: human|gpt, value: str}]
    conversation: llama-3   # Chat template to apply
    split: train_sft[:1000]
    ds_weight: 0.3          # 30% sampling weight when mixing datasets
# Sequence
sequence_len: 2048
sample_packing: true # Pack short examples into max-length chunks
eval_sample_packing: false # Disable for accurate eval loss
# Training hyperparameters
num_epochs: 3
micro_batch_size: 2 # batch per GPU
gradient_accumulation_steps: 4 # effective global batch = 2 * 4 * num_gpus (8 on a single GPU)
optimizer: adamw_bnb_8bit # 8-bit AdamW from bitsandbytes
lr_scheduler: cosine
learning_rate: 2e-4
warmup_steps: 50
weight_decay: 0.0
# Memory-efficiency
gradient_checkpointing: true
flash_attention: true # fused attention kernels; requires the flash-attn package
bf16: auto
# Output
output_dir: outputs/qlora-llama3
saves_per_epoch: 1
save_safetensors: true
# Logging
logging_steps: 10
eval_steps: 100
wandb_project: axolotl-runs
wandb_name: qlora-llama3-inst
# Hub upload (optional)
# hub_model_id: your-username/llama3-qlora
# hub_strategy: every_save
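The batch hyperparameters above compose into an effective global batch size. A quick sketch of the arithmetic (effective_batch is a hypothetical helper, not an Axolotl API):

```python
# Effective global batch size implied by an Axolotl config:
# micro_batch_size * gradient_accumulation_steps * num_gpus.
def effective_batch(micro_batch_size: int, grad_accum: int, num_gpus: int = 1) -> int:
    return micro_batch_size * grad_accum * num_gpus


# Values from configs/qlora_llama3.yml above:
print(effective_batch(2, 4))              # 8 on a single GPU
print(effective_batch(2, 4, num_gpus=4))  # 32 across four GPUs
```

Keeping the effective batch constant while scaling GPUs usually means dividing gradient_accumulation_steps as num_gpus grows.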
# configs/dpo_llama3.yml — Direct Preference Optimization
base_model: outputs/qlora-llama3/merged # Start from SFT-merged model
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer
load_in_4bit: true
adapter: qlora
lora_r: 8
lora_alpha: 16
lora_target_linear: true
# DPO-specific
rl: dpo
dpo_beta: 0.1
datasets:
  - path: Intel/orca_dpo_pairs
    type: chatml.intel # DPO format: system + prompt + chosen + rejected
    split: train[:2000]
sequence_len: 1024
micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 5e-5
warmup_ratio: 0.05
bf16: auto
flash_attention: true
output_dir: outputs/dpo-llama3
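The dpo_beta setting scales the preference margin in the DPO objective. A minimal numeric sketch of that loss, assuming summed per-sequence log-probabilities as inputs (this is the textbook formula, not Axolotl's internal implementation):

```python
import math

# DPO loss for one (chosen, rejected) pair:
# -log sigmoid(beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)))
def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float, beta: float = 0.1) -> float:
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))


# When the policy prefers the chosen response more than the reference does,
# the margin is positive and the loss falls below log(2) ≈ 0.693.
print(dpo_loss(-10.0, -14.0, -12.0, -13.0))  # margin = 3.0 here
```

A larger beta penalizes deviation from the reference model more sharply, which is why DPO runs typically pair a small beta (0.1 here) with a low learning rate.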
# configs/zero2_multinode.yml — ZeRO-2 multi-GPU config
base_model: meta-llama/Llama-3.1-8B-Instruct
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer
# Full fine-tuning (no adapter) with DeepSpeed ZeRO-2
adapter: ~
load_in_4bit: false
bf16: true
datasets:
  - path: HuggingFaceH4/ultrachat_200k
    type: sharegpt
    conversation: llama-3
    split: train_sft[:50000]
sequence_len: 4096
sample_packing: true
deepspeed: configs/zero2.json # Path to DeepSpeed config
num_epochs: 1
micro_batch_size: 1
gradient_accumulation_steps: 8
optimizer: adamw_torch_fused
lr_scheduler: cosine
learning_rate: 1e-5
warmup_ratio: 0.03
gradient_checkpointing: true
flash_attention: true
output_dir: outputs/full-ft-llama3-8b
saves_per_epoch: 2
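The deepspeed key above points at a JSON file that must exist before launch. A minimal ZeRO stage-2 sketch using standard DeepSpeed config keys (Axolotl ships reference configs in its repo whose exact contents may differ; this hand-rolled version is illustrative):

```python
import json

# Minimal ZeRO-2 DeepSpeed config to pair with `deepspeed: configs/zero2.json`.
# "auto" values defer to the values Axolotl/accelerate pass at runtime.
zero2 = {
    "zero_optimization": {
        "stage": 2,                    # shard optimizer state + gradients
        "overlap_comm": True,          # overlap reduce with backward pass
        "contiguous_gradients": True,  # reduce memory fragmentation
    },
    "bf16": {"enabled": "auto"},
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_clipping": "auto",
}

print(json.dumps(zero2, indent=2))  # write this to configs/zero2.json
```

ZeRO-2 shards optimizer state and gradients but keeps full parameters on every GPU; for models that don't fit even then, ZeRO-3 (stage: 3) also shards the parameters.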
## Python Preprocessing and Launch
# scripts/axolotl_launch.py — programmatic config generation and training launch
from __future__ import annotations
import subprocess
import sys
from pathlib import Path
import yaml
def build_qlora_config(
    base_model: str = "meta-llama/Llama-3.2-3B-Instruct",
    dataset_path: str = "iamtarun/python_code_instructions_18k_alpaca",
    output_dir: str = "outputs/axolotl-run",
    lora_r: int = 16,
    epochs: int = 3,
    seq_len: int = 2048,
) -> dict:
    """Generate a QLoRA Axolotl config programmatically."""
    return {
        "base_model": base_model,
        "model_type": "AutoModelForCausalLM",
        "tokenizer_type": "AutoTokenizer",
        "load_in_4bit": True,
        "adapter": "qlora",
        "lora_r": lora_r,
        "lora_alpha": lora_r * 2,
        "lora_dropout": 0.05,
        "lora_target_linear": True,
        "datasets": [
            {
                "path": dataset_path,
                "type": "alpaca",
                "split": "train[:5000]",
            }
        ],
        "sequence_len": seq_len,
        "sample_packing": True,
        "eval_sample_packing": False,
        "num_epochs": epochs,
        "micro_batch_size": 2,
        "gradient_accumulation_steps": 4,
        "optimizer": "adamw_bnb_8bit",
        "lr_scheduler": "cosine",
        "learning_rate": 2e-4,
        "warmup_steps": 50,
        "gradient_checkpointing": True,
        "flash_attention": True,
        "bf16": "auto",
        "output_dir": output_dir,
        "saves_per_epoch": 1,
        "logging_steps": 10,
    }


def save_config(config: dict, path: str = "config.yml") -> Path:
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with open(out, "w") as f:
        yaml.dump(config, f, default_flow_style=False, sort_keys=False)
    print(f"Config saved: {out}")
    return out


def preprocess(config_path: str) -> None:
    """Preprocess and cache dataset before training."""
    subprocess.run(
        [sys.executable, "-m", "axolotl.cli.preprocess", config_path],
        check=True,
    )


def train(config_path: str, num_gpus: int = 1) -> None:
    """Launch Axolotl training via accelerate."""
    cmd = [
        "accelerate", "launch",
        f"--num_processes={num_gpus}",
        "-m", "axolotl.cli.train",
        config_path,
    ]
    subprocess.run(cmd, check=True)


def merge_lora(config_path: str) -> None:
    """Merge LoRA adapter weights into base model."""
    subprocess.run(
        [sys.executable, "-m", "axolotl.cli.merge_lora", config_path],
        check=True,
    )


def run_inference(config_path: str, prompt: str) -> None:
    """Interactive inference with the trained model."""
    subprocess.run(
        [sys.executable, "-m", "axolotl.cli.inference", config_path,
         "--prompter", "None", "--message", prompt],
        check=True,
    )


if __name__ == "__main__":
    config = build_qlora_config(
        base_model="meta-llama/Llama-3.2-3B-Instruct",
        dataset_path="iamtarun/python_code_instructions_18k_alpaca",
        output_dir="outputs/axolotl-qlora",
        epochs=1,
    )
    config_path = str(save_config(config, "configs/generated_qlora.yml"))

    print("Preprocessing dataset...")
    preprocess(config_path)

    print("Starting training...")
    train(config_path, num_gpus=1)

    print("Merging LoRA weights...")
    merge_lora(config_path)

    print("Testing inference...")
    run_inference(config_path, "Write a Python function to binary search a sorted list.")
Consider the Unsloth alternative when training on a single consumer GPU and maximum memory efficiency matters: Unsloth's custom Triton kernels claim roughly 2x speedup and 60% less VRAM versus a standard setup. Axolotl's YAML-driven approach, by contrast, handles multi-GPU clusters, multi-dataset mixing pipelines, and DPO/ORPO training without writing Python, making it the better choice for teams that need reproducible, version-controlled training recipes. Consider the TRL SFTTrainer alternative when you need a pure-Python programmatic API for custom data collators, reward functions, or non-standard training loops: TRL gives direct API control, while Axolotl wraps TRL and DeepSpeed in a configuration layer that prevents boilerplate mistakes and keeps hyperparameters consistent across experiments. The Claude Skills 360 bundle includes Axolotl skill sets covering QLoRA YAML configs, multi-dataset recipes, DPO alignment configs, DeepSpeed integration, and training launch scripts. Start with the free tier to try config-driven LLM fine-tuning generation.