xaskasdf committed on
Commit c345b8e · verified · 1 Parent(s): 2e496f2

Upload extended quantizations


In theory these are the same as the f16 model weights and shouldn't add any value to generations.

Anyway, I've seen that some responses aren't the same, especially since these were quantized with the importance matrix.

bf16 is pretty slow if you don't have a professional GPU; GA102 cards are compatible, but slow :(

.gitattributes CHANGED
@@ -49,3 +49,5 @@ phi-3-mini-4k-instruct-iq2_xxs.gguf filter=lfs diff=lfs merge=lfs -text
 phi-3-mini-4k-instruct-iq4_nl.gguf filter=lfs diff=lfs merge=lfs -text
 phi-3-mini-4k-instruct-iq4_xs.gguf filter=lfs diff=lfs merge=lfs -text
 phi-3-mini-4k.imatrix filter=lfs diff=lfs merge=lfs -text
+phi-3-mini-4k-instruct-bf16.gguf filter=lfs diff=lfs merge=lfs -text
+phi-3-mini-4k-instruct-f32.gguf filter=lfs diff=lfs merge=lfs -text
phi-3-mini-4k-instruct-bf16.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:95b114be3ba31ee40f6e034771b2f6e9818cb5b7465fc6592ee4e6e6dec8e066
+size 7643296992
phi-3-mini-4k-instruct-f32.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3f1874584b449c0adf023b2b7bb02bd825baa5f2b7e4fbdc08c0f56a199fa761
+size 15088055520
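
The added `.gguf` entries are Git LFS pointer files, not the weights themselves: each is a tiny text file recording the spec version, a SHA-256 object ID, and the size of the real blob stored in LFS. A minimal sketch of reading that pointer format in Python (`parse_lfs_pointer` is a hypothetical helper written for illustration, not part of Git LFS or any tool in this repo):

```python
# Sketch: parse a Git LFS pointer file, which is a short text file of
# space-separated "key value" lines (version, oid, size).
# parse_lfs_pointer is a hypothetical helper for illustration only.

def parse_lfs_pointer(text: str) -> dict:
    """Split each non-empty line at the first space into a key/value pair."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer content taken verbatim from the bf16 file added in this commit.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:95b114be3ba31ee40f6e034771b2f6e9818cb5b7465fc6592ee4e6e6dec8e066
size 7643296992
"""

info = parse_lfs_pointer(pointer)
print(info["oid"])        # digest of the actual weight file
print(int(info["size"]))  # 7643296992 bytes, i.e. roughly 7.6 GB for bf16
```

The `size` field explains the `.gitattributes` change above: files this large must go through the `filter=lfs` rules so that only the pointer lives in Git history while the blob is fetched from LFS storage.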