Phil (phil111)

AI & ML interests: None yet

Recent Activity:
new activity 7 days ago in matteogeniaccio/phi-4: "Notably better than Phi3.5 in many ways, but something is wrong."
liked a model 8 days ago: deepseek-ai/DeepSeek-V3

Organizations: None yet
phil111's activity
Notably better than Phi3.5 in many ways, but something is wrong. · 8 comments · #5 opened 18 days ago by phil111
Very impressive. Good world knowledge (SimpleQA of 25) despite high math/coding performance. · 2 comments · #27 opened 8 days ago by phil111
SimpleQA score · 2 comments · #1 opened 19 days ago by frappuccino
Exceptional creative writer · 5 comments · #1 opened 15 days ago by SubtleOne
Very High English MMLU scores, Yet Extremely Low Broad English Knowledge · 2 comments · #8 opened 15 days ago by phil111
How was r7b? · 6 comments · #3 opened 20 days ago by MRU4913
Add Qwen 2.5 7B & Tulu 3 8B results to OLLM benchmarks · 12 comments · #1 opened 21 days ago by Fizzarolli
local Llama + GPU(cuda) · 7 comments · #34 opened 21 days ago by Luciolla
Base Model? · 3 comments · #32 opened 22 days ago by User8213
Add Hymba-1.5B to the leaderboard · 3 comments · #1030 opened 29 days ago by pmolchanov
Hallucinates more than Mistral 7b · #13 opened about 2 months ago by phil111
Looks like not as good as Qwen2.5 7B · 9 comments · #5 opened 3 months ago by MonolithFoundation
This LLM is hallucinating like crazy. Can someone verify these prompts? · 28 comments · #3 opened 3 months ago by phil111
This is a clear improvement over L3.1 70b Instruct, but more censored? · 3 comments · #3 opened 3 months ago by phil111
Thanks, although it's too verbose and prone to hallucinations. · 4 comments · #1 opened 3 months ago by phil111
MMLU-Pro benchmark · 5 comments · #13 opened 3 months ago by kth8
There's a HUGE drop in popular knowledge from v2 to v2.5. · 28 comments · #1 opened 4 months ago by phil111
Anybody else having the same problem with the model ending answers prematurely? · 1 comment · #19 opened 3 months ago by ElvisM
Thanks. This is astonishingly good for its size. · 1 comment · #9 opened 3 months ago by phil111