
EpsteinGPT - Minimal GPT Model

This repository contains a minimal GPT (MVT) model, MinimalGPT, trained on the Epstein email threads dataset.

Model Details

This is a custom-built causal (decoder-only) Transformer model, MinimalGPT, inspired by the nanoGPT/minGPT architectures. It was trained from scratch using a custom Byte-Pair Encoding (BPE) tokenizer.
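
model.py is the authoritative definition of the architecture. For orientation only, a nanoGPT-style model with the configuration below typically has the following shape; every class and layer name in this sketch is an illustrative assumption, not a copy of model.py:

import torch
import torch.nn as nn

class SketchBlock(nn.Module):
    # Pre-norm Transformer block: causal self-attention + MLP, each with a residual.
    def __init__(self, n_embd, n_head, dropout, bias):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, dropout=dropout,
                                          bias=bias, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd, bias=bias), nn.GELU(),
            nn.Linear(4 * n_embd, n_embd, bias=bias), nn.Dropout(dropout))

    def forward(self, x):
        T = x.size(1)
        # Boolean mask: True marks future positions, which attention may not see.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), 1)
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + a
        return x + self.mlp(self.ln2(x))

class SketchGPT(nn.Module):
    # Token + position embeddings -> n_layer blocks -> LM head over the vocab.
    def __init__(self, vocab_size=5000, block_size=256, n_layer=8,
                 n_head=8, n_embd=512, dropout=0.1, bias=False):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        self.pos_emb = nn.Embedding(block_size, n_embd)
        self.drop = nn.Dropout(dropout)
        self.blocks = nn.ModuleList(
            SketchBlock(n_embd, n_head, dropout, bias) for _ in range(n_layer))
        self.ln_f = nn.LayerNorm(n_embd)
        self.head = nn.Linear(n_embd, vocab_size, bias=False)

    def forward(self, idx):
        pos = torch.arange(idx.size(1), device=idx.device)
        x = self.drop(self.tok_emb(idx) + self.pos_emb(pos))
        for block in self.blocks:
            x = block(x)
        return self.head(self.ln_f(x))  # logits: (batch, seq, vocab_size)

At these sizes this comes to roughly 28-30M parameters, depending on whether the LM head is tied to the token embedding.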

Configuration (config.json)

{
  "vocab_size": 5000,
  "block_size": 256,
  "n_layer": 8,
  "n_head": 8,
  "n_embd": 512,
  "batch_size": 16,
  "dropout": 0.1,
  "bias": false
}
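
Note that batch_size is a training-time setting rather than part of the architecture. Assuming MVTConfig accepts these JSON keys as keyword arguments (an assumption; the exact signature lives in model.py), loading the configuration might look like this:

import json

from model import MVTConfig  # defined in this repo's model.py

with open("config.json") as f:
    cfg_dict = json.load(f)

# batch_size only matters during training; drop it if MVTConfig rejects it.
cfg_dict.pop("batch_size", None)
config = MVTConfig(**cfg_dict)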

Files Included

  • epsteingpt_tokenizer.json: The custom BPE tokenizer used for encoding and decoding text.
  • EpsteinGPT.pt: The PyTorch checkpoint containing the trained model weights.
  • EpsteinGPT.ptl: The TorchScript Lite version of the trained model, optimized for deployment (see the loading sketch after this list).
  • model.py: Defines the MVTConfig class and the MinimalGPT model architecture.
  • config.json: Model configuration in JSON format.
  • README.md: This file.
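
For the .ptl file, a quick sanity check on desktop can use PyTorch's lite-interpreter loader (this assumes the file was exported with _save_for_lite_interpreter; on Android/iOS you would load it through the LiteModuleLoader APIs instead):

from torch.jit.mobile import _load_for_lite_interpreter

# Load the lite-interpreter model in desktop Python for a smoke test;
# the forward signature is assumed to take a (batch, seq) tensor of token ids.
lite_model = _load_for_lite_interpreter("EpsteinGPT.ptl")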

How to Use

To use this model, you would typically:

  1. Load the tokenizer:
    from tokenizers import Tokenizer
    tokenizer = Tokenizer.from_file("epsteingpt_tokenizer.json")
    
  2. Load the model architecture and configuration from model.py and config.json (see the configuration sketch above).
  3. Load the trained weights from EpsteinGPT.pt into the model (a combined sketch of steps 2 and 3 follows this list).
  4. Use the model for text generation or other tasks.
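
Steps 2 and 3 might look like the following. This is a sketch: it assumes EpsteinGPT.pt stores a plain state_dict; if the checkpoint nests the weights (e.g. under a "model" key), index into it first.

import json
import torch

from model import MVTConfig, MinimalGPT  # provided by this repo's model.py

with open("config.json") as f:
    cfg_dict = json.load(f)
cfg_dict.pop("batch_size", None)  # training-time setting, as noted above

model = MinimalGPT(MVTConfig(**cfg_dict))

# Assumes the checkpoint is a plain state_dict; inspect the file if this fails.
state_dict = torch.load("EpsteinGPT.pt", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()  # disable dropout for inference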

For generation, you can refer to the generate.py script used during development (not among the files listed above); a minimal sampling loop in the same spirit is sketched below.
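
The following stand-in continues from the tokenizer and model loaded in the previous steps. It assumes only that the model maps a (batch, seq) tensor of token ids to (batch, seq, vocab) logits; the temperature, top-k value, and prompt are illustrative:

import torch
import torch.nn.functional as F

@torch.no_grad()
def sample(model, tokenizer, prompt, max_new_tokens=100,
           temperature=0.8, top_k=50, block_size=256):
    ids = torch.tensor([tokenizer.encode(prompt).ids])      # (1, seq)
    for _ in range(max_new_tokens):
        out = model(ids[:, -block_size:])                   # crop to the context window
        logits = out[0] if isinstance(out, tuple) else out  # nanoGPT-style forwards return (logits, loss)
        logits = logits[:, -1, :] / temperature             # last position only
        topv, topi = torch.topk(logits, top_k)              # keep the top_k candidates
        probs = F.softmax(topv, dim=-1)
        next_id = topi.gather(-1, torch.multinomial(probs, 1))
        ids = torch.cat([ids, next_id], dim=1)
    return tokenizer.decode(ids[0].tolist())

print(sample(model, tokenizer, "Subject: "))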

Training

The model was trained on a dataset of Epstein email threads. The training process involved:

  1. Tokenizer Training: A BPE tokenizer was trained on the raw text data.
  2. Data Preparation: The text data was tokenized and converted into a numerical format (steps 1 and 2 are sketched after this list).
  3. Model Training: The MinimalGPT model was trained using a custom training loop.
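
As a rough sketch of steps 1 and 2 using the tokenizers library: the corpus file name, special tokens, and pre-tokenizer choice here are assumptions, not details taken from train.py.

import numpy as np
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import ByteLevel
from tokenizers.trainers import BpeTrainer

# 1. Tokenizer training: BPE with the vocab size from config.json.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = ByteLevel()
trainer = BpeTrainer(vocab_size=5000, special_tokens=["[UNK]"])
tokenizer.train(["emails.txt"], trainer)  # "emails.txt" is a hypothetical corpus file
tokenizer.save("epsteingpt_tokenizer.json")

# 2. Data preparation: encode the corpus into a flat array of token ids.
text = open("emails.txt", encoding="utf-8").read()
ids = np.array(tokenizer.encode(text).ids, dtype=np.uint16)  # vocab_size 5000 fits in uint16
np.save("train_ids.npy", ids)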

Further Information

For more details on the model architecture and training process, refer to the model.py and train.py scripts.
