This is a lightweight GPT-style decoder-only transformer model trained on the Tiny Shakespeare dataset (karpathy/tiny_shakespeare). It uses a custom implementation in PyTorch and supports character-level text generation.
The model was trained on the full Tiny Shakespeare dataset for 4 epochs using the Adam optimizer and cross-entropy loss. Validation loss was tracked and logged with Weights & Biases (wandb).
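To make the training objective concrete, here is a minimal, framework-free sketch of how next-character targets and the cross-entropy loss fit together. It is not the training script used for this model: the sample text, the `make_example` and `cross_entropy` helpers, and the ad hoc character vocabulary are all illustrative assumptions. In PyTorch the loss would typically be computed with `torch.nn.functional.cross_entropy`.

```python
import math

def make_example(ids, start, seq_len):
    """Return an input window and its targets (the same window shifted by one)."""
    x = ids[start:start + seq_len]
    y = ids[start + 1:start + seq_len + 1]
    return x, y

def cross_entropy(logits_per_pos, targets):
    """Mean negative log-likelihood of the targets under softmax(logits)."""
    total = 0.0
    for logits, t in zip(logits_per_pos, targets):
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(v - m) for v in logits))
        total += log_z - logits[t]
    return total / len(targets)

# Hypothetical character-level vocabulary built from a short sample string.
text = "to be or not to be"
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
ids = [stoi[c] for c in text]

x, y = make_example(ids, start=0, seq_len=8)

# An untrained model with uniform logits scores log(vocab_size) per character.
uniform_logits = [[0.0] * len(chars) for _ in y]
loss = cross_entropy(uniform_logits, y)
print(f"vocab={len(chars)} loss={loss:.4f}")
```

A loss close to `log(vocab_size)` at the start of training is a useful sanity check: it means the model begins near a uniform guess, and the value should fall as training proceeds.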
from transformers import AutoTokenizer
import torch
from model import DecoderOnlyTransformer # custom model class
tokenizer = AutoTokenizer.from_pretrained("NataliiaM15/decoder-shakespeare-gpt")
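# Hyperparameters must match the ones the checkpoint was trained with,
# otherwise load_state_dict will fail on mismatched tensor shapes.
model = DecoderOnlyTransformer(
    vocab_size=tokenizer.vocab_size,
    embed_dim=128,
    num_heads=4,
    num_layers=2,
    seq_len=64,
)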
# map_location="cpu" lets the weights load on machines without a GPU
model.load_state_dict(torch.load("pytorch_model.bin", map_location="cpu"))
model.eval()
# Generate text
prompt = "ROMEO:"
input_ids = tokenizer.encode(prompt, return_tensors="pt")
# generation loop would go here...
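The generation loop is left open above. The following is a minimal sketch of one common approach, autoregressive sampling with a temperature, written without PyTorch so it runs on its own. The names `sample_next`, `generate`, and `toy_next_logits` are hypothetical, and the toy function only stands in for the trained model. With the real model, `next_logits` would wrap a forward pass and return the logits at the last position; this assumes the custom `DecoderOnlyTransformer` returns logits of shape (batch, sequence, vocab), which the snippet above does not show.

```python
import math
import random

def sample_next(logits, temperature=1.0, rng=random):
    """Sample one token id from softmax(logits / temperature)."""
    scaled = [v / temperature for v in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(v - m) for v in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

def generate(next_logits, input_ids, max_new_tokens, seq_len=64,
             temperature=1.0, rng=random):
    """Append max_new_tokens sampled ids to input_ids, one at a time."""
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        context = ids[-seq_len:]        # crop to the model's context window
        logits = next_logits(context)   # scores for the next token only
        ids.append(sample_next(logits, temperature, rng))
    return ids

# Hypothetical stand-in for the trained model: it strongly prefers
# the id that follows the last one, wrapping around the vocabulary.
VOCAB_SIZE = 5

def toy_next_logits(context):
    target = (context[-1] + 1) % VOCAB_SIZE
    return [10.0 if i == target else 0.0 for i in range(VOCAB_SIZE)]

out = generate(toy_next_logits, [0], max_new_tokens=8, seq_len=4,
               temperature=0.01, rng=random.Random(0))
print(out)
```

Cropping the context to `seq_len` matters because the model above is built with `seq_len=64` and, presumably, cannot attend to longer inputs. Lower temperatures make sampling more deterministic, while higher ones produce more varied text.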