roneneldan/TinyStories
How to use AISE-TUDelft/Custom-Activations-BERT-Adaptive-GELU with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="AISE-TUDelft/Custom-Activations-BERT-Adaptive-GELU")

# Load model directly
from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained("AISE-TUDelft/Custom-Activations-BERT-Adaptive-GELU", dtype="auto")

Base model: RoBERTa
Configs:
Vocab size: 10,000
Hidden size: 512
Max position embeddings: 512
Number of layers: 2
Number of heads: 4
Window size: 256
Intermediate size: 1024
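As a quick sanity check on the configuration values listed above, the following minimal sketch collects them in a plain Python dictionary and derives two quantities that follow from them. The key names are an assumption (they mirror common Hugging Face config naming and are not taken from this model's actual config file), and the per-head dimension assumes standard multi-head attention, where the hidden size is split evenly across heads.

```python
# Values copied from the "Configs" list. The key names are illustrative
# (an assumption modeled on common Hugging Face config naming).
config = {
    "vocab_size": 10_000,
    "hidden_size": 512,
    "max_position_embeddings": 512,
    "num_hidden_layers": 2,
    "num_attention_heads": 4,
    "window_size": 256,
    "intermediate_size": 1024,
}

# Standard multi-head attention splits the hidden size evenly across heads,
# so the hidden size must be divisible by the number of heads.
assert config["hidden_size"] % config["num_attention_heads"] == 0
head_dim = config["hidden_size"] // config["num_attention_heads"]

# Ratio of the feed-forward (intermediate) width to the hidden size.
ffn_ratio = config["intermediate_size"] / config["hidden_size"]

print(head_dim, ffn_ratio)  # prints: 128 2.0
```

Under these assumptions each attention head works on 128 dimensions, and the feed-forward layer is twice as wide as the hidden size.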
Results: