norbert4-small / modeling_gptbert.py

Commit History

Using max_position_embeddings instead of max_sequence_length to standardise with HF
9894fa3
verified

lgcharpe committed on

Fix causal mode
0ac9186
verified

davda54 committed on

fixed output format
b4ba7c8
verified

davda54 committed on

fix NaNs
f694326
verified

davda54 committed on

make FlashAttention logic more robust
7df4bf5
verified

davda54 committed on

fix
d8479bb
verified

davda54 committed on

removed SDPA
2c0c592
verified

davda54 committed on

Update modeling_gptbert.py
67e8a0f
verified

davda54 committed on

fixed SDPA for older PyTorch versions
8537e95
verified

davda54 committed on

FlashAttention support
9aae5ff
verified

davda54 committed on

Update modeling_gptbert.py
a3f5ab3
verified

lgcharpe committed on

Upload folder using huggingface_hub
460fdd7
verified

davda54 committed on