Phil (phil111)

AI & ML interests: None yet

Recent Activity:
new activity 7 days ago in matteogeniaccio/phi-4: "Notably better than Phi3.5 in many ways, but something is wrong."
liked a model 8 days ago: deepseek-ai/DeepSeek-V3

Organizations: None yet
phil111's activity
Notably better than Phi3.5 in many ways, but something is wrong. · 8 comments · #5 opened 18 days ago by phil111
Very impressive. Good world knowledge (SimpleQA of 25) despite high math/coding performance. · 2 comments · #27 opened 8 days ago by phil111
SimpleQA score · 2 comments · #1 opened 19 days ago by frappuccino
Exceptional creative writer · 5 comments · #1 opened 15 days ago by SubtleOne
Very High English MMLU scores, Yet Extremely Low Broad English Knowledge · 2 comments · #8 opened 15 days ago by phil111
How was r7b? · 6 comments · #3 opened 20 days ago by MRU4913
Add Qwen 2.5 7B & Tulu 3 8B results to OLLM benchmarks · 12 comments · #1 opened 21 days ago by Fizzarolli
local Llama + GPU(cuda) · 7 comments · #34 opened 21 days ago by Luciolla
Base Model? · 3 comments · #32 opened 22 days ago by User8213
Add Hymba-1.5B to the leaderboard · 3 comments · #1030 opened 29 days ago by pmolchanov
Hallucinates more than Mistral 7b · #13 opened about 2 months ago by phil111
Looks like not as good as Qwen2.5 7B · 9 comments · #5 opened 3 months ago by MonolithFoundation
This LLM is hallucinating like crazy. Can someone verify these prompts? · 28 comments · #3 opened 3 months ago by phil111
This is a clear improvement over L3.1 70b Instruct, but more censored? · 3 comments · #3 opened 3 months ago by phil111
Thanks, although it's too verbose and prone to hallucinations. · 4 comments · #1 opened 3 months ago by phil111
MMLU-Pro benchmark · 5 comments · #13 opened 3 months ago by kth8
There's a HUGE drop in popular knowledge from v2 to v2.5. · 28 comments · #1 opened 4 months ago by phil111
Anybody else having the same problem with the model ending answers prematurely? · 1 comment · #19 opened 3 months ago by ElvisM
Thanks. This is astonishingly good for its size. · 1 comment · #9 opened 3 months ago by phil111