Commits · Xenobd/whisper.cpp

whisper : fix grammar advance stack warning (#3087)

e4a0565
unverified

danbev committed on Apr 28, 2025

examples : expose language detection probabilities to server example (#3044)

6b8d348
unverified

sachaarbonel committed on Apr 28, 2025

whisper : remove empty .gitmodules file [no ci] (#3085)

aa54166
unverified

danbev committed on Apr 28, 2025

talk-llama : sync llama.cpp (#3084)

511930c
unverified

ggerganov committed on Apr 28, 2025

ci : disable publishing of java binding [no ci] (#3086)

4b6e041
unverified

danbev committed on Apr 28, 2025

build : Add Moore Threads GPU support and update GitHub workflow for MUSA build (#3069)

8ede9a1
unverified

R0CKSTAR committed on Apr 28, 2025

examples : fix deprecated FFmpeg functions (#3073)

0aa41e8
unverified

podre-henrique committed on Apr 28, 2025

ruby : add encoder begin callback related methods (#3076)

855927b
unverified

KitaitiMakoto committed on Apr 25, 2025

ci : enable bindings java job (#3070)

469f43c
unverified

danbev committed on Apr 25, 2025

ruby : add cmake option (#0)

4a21ad6

ggerganov committed on Apr 24, 2025

cuda : fix unused variable compile warning (#0)

a1f4201

ggerganov committed on Apr 24, 2025

sync : ggml

5222212

ggerganov committed on Apr 24, 2025

opencl : remove obsolete files (skip) (ggml/1200)

adc6542

ggerganov committed on Apr 24, 2025

sync : ggml

cac9245

ggerganov committed on Apr 24, 2025

opencl: split ggml-opencl.cl into multiple files and cleanup (llama/12886)

291a5b7

lhez Shangqing Gu committed on Apr 24, 2025

ggml : fix trailing whitespaces (llama/0)

5d27bbf

ggerganov committed on Apr 24, 2025

CUDA: use switch statements in constexpr functions (llama/13095)

f5cd546

JohannesGaessler committed on Apr 24, 2025

metal : fix floating-point range of attention scores in FA kernels (llama/13090)

e093044

ggerganov committed on Apr 24, 2025

vulkan: matmul gcn tuning (llama/13016)

ac537d2

Eve OccamRazor committed on Apr 24, 2025

CUDA: noncont MMVQ + batched bs1 MUL_MAT_ID (llama/13014)

285a334

JohannesGaessler committed on Apr 22, 2025

ggml : add SSE 4.2 and x64 base variant for CPUs without AVX (llama/12871)

f8795d3

Diego Devesa committed on Apr 21, 2025

SYCL: Add non-contiguous support in ROPE (llama/12993)

a29a2c3

Akarshan Biswas committed on Apr 21, 2025

vulkan: support noncontiguous rms_norm (llama/13031)

e4d1f59

jeffbolznv committed on Apr 20, 2025

metal: add neg operator (llama/13029)

42283e1

jmorganca committed on Apr 20, 2025

SYCL: Refactor and enable FP16 in binary broadcast OPs (llama/12975)

1377b05

Akarshan Biswas committed on Apr 18, 2025

rpc : add RPC_CMD_HELLO (llama/12955)

ff22836

rgerganov committed on Apr 18, 2025

graph : make FA compatible with MLA + add initial Metal kernels (llama/12953)

fb0d243

ggerganov committed on Apr 17, 2025

ggml: Re-enable CUDA graphs in presence of CONT and DUP nodes (llama/12970)

3944ae5

Alan Gray committed on Apr 17, 2025

CANN: Add support for async operator submission (llama/12864)

1b9d0f0

hipudding committed on Apr 17, 2025

opencl: fix incorrect local_size index in profiling log (llama/12868)

8f5d919

kimminsu committed on Apr 16, 2025

vulkan: enable coopmat2 FA gqa and split_k optimizations more often (llama/12931)

f844153

jeffbolznv committed on Apr 16, 2025

CANN: Add 310P operator support check (llama/12962)

14d0d7c

Chenguang Li committed on Apr 16, 2025

metal : add FA-vec kernels for head size 96 (llama/12952)

f1f88b8

ggerganov committed on Apr 15, 2025

CANN: Add x86 build ci (llama/12950)

f4c9b36

hipudding committed on Apr 15, 2025

CUDA/HIP: Share the same unified memory allocation logic. (llama/12934)

143cb70

David Huang committed on Apr 15, 2025

SYCL: Add ROPE vision kernel (llama/12887)

5c44879

Akarshan Biswas committed on Apr 15, 2025

ggml : Add AVX512 implementation of GEMM - Q4_Kx8 (llama/12829)

2457b99

Srihari-mcw committed on Apr 15, 2025

CANN: Opt ROPE optimization (llama/12865)

3773a09

Chenguang Li committed on Apr 15, 2025

CANN: Optimize CANN buffer pool memory management (llama/12875)

66b93b3

dou112 committed on Apr 15, 2025

SYCL: Fix im2col (llama/12910)

a33d74f

Akarshan Biswas committed on Apr 14, 2025

rpc : use ggml_context_ptr (llama/12938)

24b9742

rgerganov committed on Apr 14, 2025

ggml : Depthwise 2D convolution (ggml/1152)

0c950d5

Acly committed on Apr 17, 2025

ggml: use _mm[512/256]_dpbusd[_avx]_epi32 to directly accumulate into the result register (llama/12773)

acb674d

sxx-404 committed on Apr 14, 2025

ggml: disable CUDA graphs for unsupported DUP and CONT node types (llama/12891)

9e42c4d

Alan Gray committed on Apr 13, 2025

vulkan: use aligned loads for flash attention mask (llama/12853)

825889e

jeffbolznv committed on Apr 12, 2025

sycl: Support sycl_ext_oneapi_limited_graph (llama/12873)

5db8b21

Ewan Crawford committed on Apr 11, 2025

SYCL: Add fp16 type support to unary op kernels (llama/12788)

b5c106e

Akarshan Biswas committed on Apr 11, 2025

ggml: fix compilation error s390x (llama/12848)

2458d68

Aaron Teo Aleksei Nikiforov committed on Apr 11, 2025

cpu: fix cpu backend's supports-op for GET_ROWS_BACK. fixes a fatal when running test-backend-ops with only the CPU backend (ggml/1190)

ee7706c

cmdr2 committed on Apr 11, 2025

CANN: Support more ops (llama/12841)

6aecea5

Chenguang Li committed on Apr 10, 2025
