MobileLLM-R1 Collection • MobileLLM-R1, a series of sub-billion-parameter reasoning models • 10 items
MobileLLM Collection • Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024), https://arxiv.org/abs/2402.14905 • 50 items
Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models? • Paper • arXiv:2502.11895 • Published Feb 17, 2025
An Extra RMSNorm is All You Need for Fine Tuning to 1.58 Bits • Paper • arXiv:2505.08823 • Published May 12, 2025
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits • Paper • arXiv:2402.17764 • Published Feb 27, 2024
Byte Latent Transformer: Patches Scale Better Than Tokens • Paper • arXiv:2412.09871 • Published Dec 13, 2024
llama.vim Collection • Recommended models for the llama.vim and llama.vscode plugins • 10 items
story writing favourites Collection • Models I personally liked for generating stories in the past. Not a recommendation; most of these are outdated. • 26 items
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models • Paper • arXiv:2407.12327 • Published Jul 17, 2024
LongVA Collection • Long Context Transfer From Text To Vision: https://lmms-lab.github.io/posts/longva/ • 5 items