
Commit 31a7084

chore(model gallery): add gemma-3-4b-it-qat (#5118)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
1 parent 128612a commit 31a7084

1 file changed: +18 -0 lines changed

gallery/index.yaml

Lines changed: 18 additions & 0 deletions
@@ -96,6 +96,24 @@
     - filename: gemma-3-12b-it-q4_0.gguf
       sha256: 6f1bb5f455414f7b46482bda51cbfdbf19786e21a5498c4403fdfc03d09b045c
       uri: huggingface://vinimuchulski/gemma-3-12b-it-qat-q4_0-gguf/gemma-3-12b-it-q4_0.gguf
+- !!merge <<: *gemma3
+  name: "gemma-3-4b-it-qat"
+  urls:
+    - https://huggingface.co/google/gemma-3-4b-it
+    - https://huggingface.co/vinimuchulski/gemma-3-4b-it-qat-q4_0-gguf
+  description: |
+    This model corresponds to the 4B instruction-tuned version of the Gemma 3 model in GGUF format using Quantization Aware Training (QAT). The GGUF corresponds to Q4_0 quantization.
+
+    Thanks to QAT, the model is able to preserve similar quality as bfloat16 while significantly reducing the memory requirements to load the model.
+
+    You can find the half-precision version here.
+  overrides:
+    parameters:
+      model: gemma-3-4b-it-q4_0.gguf
+  files:
+    - filename: gemma-3-4b-it-q4_0.gguf
+      sha256: 2ca493d426ffcb43db27132f183a0230eda4a3621e58b328d55b665f1937a317
+      uri: huggingface://vinimuchulski/gemma-3-4b-it-qat-q4_0-gguf/gemma-3-4b-it-q4_0.gguf
 - !!merge <<: *gemma3
   name: "qgallouedec_gemma-3-27b-it-codeforces-sft"
   urls:
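
Reviewer note: the new entry pins gemma-3-4b-it-q4_0.gguf to a sha256 value. The sketch below is one way to check a manually downloaded copy against that checksum. It is not LocalAI's own verification code. The helper names (sha256_of_file, matches_gallery_checksum) and the local path in the usage note are hypothetical; only the checksum string comes from the diff.

```python
import hashlib
from pathlib import Path

# Checksum copied from the `sha256` field of the gemma-3-4b-it-qat entry in the diff.
EXPECTED_SHA256 = "2ca493d426ffcb43db27132f183a0230eda4a3621e58b328d55b665f1937a317"


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_gallery_checksum(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Hypothetical helper: compare a local file's digest with the gallery value."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_gallery_checksum(Path("models/gemma-3-4b-it-q4_0.gguf")) should return True only when the file at that (assumed) location is byte-identical to the one the gallery entry describes.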
