Model Card for Evaluate360M

Model Details

Model Description

Evaluate360M is a lightweight, ~360M-parameter language model optimized for reasoning tasks. It is designed to run efficiently on low-end consumer hardware, such as mobile phones, while maintaining strong performance on logical reasoning and general-purpose tasks.

  • Developed by: [More Information Needed]
  • Funded by [optional]: [More Information Needed]
  • Shared by [optional]: [More Information Needed]
  • Model type: Transformer-based decoder model
  • Language(s) (NLP): English
  • License: [More Information Needed]
  • Finetuned from model [optional]: HuggingFaceTB/SmolLM2-360M-Instruct

Model Sources

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Uses

Direct Use

Evaluate360M is intended for general-purpose reasoning tasks and can be used in applications that require lightweight LLMs, such as:

  • Mobile-based AI assistants
  • Low-power embedded systems
  • Edge computing applications
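
For constrained devices, loading the weights in a reduced-precision dtype lowers the memory footprint. A minimal sketch, assuming a PyTorch runtime; "evaluate360m" is the placeholder repo id used throughout this card:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load in bfloat16 to roughly halve memory versus full precision.
# Use torch.float16 on hardware without bf16 support.
tokenizer = AutoTokenizer.from_pretrained("evaluate360m")
model = AutoModelForCausalLM.from_pretrained(
    "evaluate360m",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
)
model.eval()

For fully on-device deployment (for example on phones), the model would typically be exported and quantized for a dedicated runtime; this card does not specify one.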

Downstream Use

It can be further fine-tuned for specific domains such as code generation, summarization, and dialogue systems.

Out-of-Scope Use

  • Not optimized for handling very large context windows
  • Not designed for generating high-fidelity creative text, such as poetry or fiction

Bias, Risks, and Limitations

Limitations

  • Struggles with large context windows.
  • Has not yet been evaluated for potential biases.

Recommendations

Users should be aware of the model’s limited context length and should evaluate its performance and potential biases on their specific use cases before deployment.

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "evaluate360m"  # replace with the model's Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Encode a prompt and generate a short completion
inputs = tokenizer("What is the capital of France?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
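
Since the base model (SmolLM2-360M-Instruct) is instruction-tuned, prompts formatted with the tokenizer's chat template may work better than raw text. A minimal sketch, assuming the chat template was inherited from the base model:

# Optional: format the prompt with the chat template (assumed to be inherited
# from SmolLM2-360M-Instruct; skip this if the tokenizer defines no template).
messages = [{"role": "user", "content": "What is the capital of France?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))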

Training Details

Training Data

  • Dataset: HuggingFaceH4/Bespoke-Stratos-17k
  • Preprocessing: Token packing enabled (--packing), sequence length up to 2048 tokens
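
A minimal sketch of loading the dataset named above with the datasets library (the "train" split name is an assumption; check the dataset card):

from datasets import load_dataset

# SFT data referenced above; the split name is assumed.
dataset = load_dataset("HuggingFaceH4/Bespoke-Stratos-17k", split="train")
print(dataset)     # size and column names
print(dataset[0])  # one example, to inspect the conversation format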

Training Procedure

  • Precision & memory:
    • bf16 mixed precision
    • Gradient accumulation: 8 steps
    • Gradient checkpointing enabled
  • Hyperparameters:
    • Learning rate: 2e-5
    • Epochs: 3
    • Batch size: 4 per device (training and evaluation)
  • Evaluation & checkpointing:
    • Evaluation every 500 steps
    • Checkpoint saved every 1000 steps, keeping at most 2 checkpoints
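
The --packing flag suggests an SFT script in the style of TRL. The card does not name the training framework, so the following is a hedged sketch of a TRL SFTTrainer configuration matching the listed hyperparameters; argument names vary across trl/transformers versions, and the held-out evaluation split is an assumption:

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

base = "HuggingFaceTB/SmolLM2-360M-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Dataset as in the Training Data sketch above; a small held-out split is
# assumed so that step-based evaluation has something to run on.
data = load_dataset("HuggingFaceH4/Bespoke-Stratos-17k", split="train")
splits = data.train_test_split(test_size=0.05)

config = SFTConfig(
    output_dir="evaluate360m",           # hypothetical output path
    packing=True,                        # --packing
    max_seq_length=2048,                 # renamed in some trl releases
    bf16=True,
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    eval_strategy="steps",               # "evaluation_strategy" in older transformers
    eval_steps=500,
    save_steps=1000,
    save_total_limit=2,
)

trainer = SFTTrainer(
    model=model,
    args=config,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    processing_class=tokenizer,          # "tokenizer=" in older trl releases
)
trainer.train()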

Compute Infrastructure

  • Hardware Used: A100 GPU
  • Training Time: 6 hours

Evaluation

  • Benchmarks: No evaluation conducted yet.
  • Metrics: Not available yet.

Environmental Impact

  • Hardware Type: A100 GPU
  • Hours Used: 6 hours
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications

Model Architecture

  • Transformer decoder similar to SmolLM2-360M (~362M parameters)
  • Design inspired by MobileLLM
  • Uses Grouped-Query Attention (GQA)
  • Prioritizes depth over width (more layers rather than a wider hidden size)
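
The attention and depth/width claims can be checked directly from the model config. A minimal sketch, assuming a Llama-style config (as used by SmolLM2) and the placeholder repo id:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("evaluate360m")

# Grouped-Query Attention: fewer key/value heads than query heads.
print("attention heads:", config.num_attention_heads)
print("key/value heads:", config.num_key_value_heads)

# Depth over width: layer count is high relative to the hidden size.
print("layers:", config.num_hidden_layers)
print("hidden size:", config.hidden_size)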

Citation [optional]

BibTeX:
[More Information Needed]

APA:
[More Information Needed]

More Information

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]
