-
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper • 2401.02954 • Published • 41 -
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Paper • 2401.06066 • Published • 44 -
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Paper • 2401.14196 • Published • 48 -
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper • 2402.03300 • Published • 75
Collections
Collections trending this week
-
knowledgator/gliner-multitask-v1.0
Token Classification • Updated • 335 • 27 -
knowledgator/gliner-multitask-large-v0.5
Token Classification • Updated • 2.58k • 107 -
80
GLiNER HandyLab
-
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks
Paper • 2406.12925 • Published • 23
-
microsoft/Phi-3.5-mini-instruct
Text Generation • Updated • 486k • 734 -
microsoft/Phi-3.5-MoE-instruct
Text Generation • Updated • 49k • 543 -
microsoft/Phi-3.5-vision-instruct
Image-Text-to-Text • Updated • 253k • 630 -
microsoft/Phi-3-mini-4k-instruct
Text Generation • Updated • 555k • 1.1k
-
DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters
Updated • 61 -
DavidAU/MPT-7b-WizardLM_Uncensored-Storywriter-Merge-Q6_K-GGUF
Updated • 544 • 15 -
DavidAU/Buttocks-7B-v1.0-Q6_K-GGUF
Updated • 97 • 3 -
DavidAU/llama-2-16b-nastychat-Q6_K-GGUF
Updated • 152 • 3
-
deepseek-ai/deepseek-vl2-tiny
Image-Text-to-Text • Updated • 3.22k • 51 -
deepseek-ai/deepseek-vl2-small
Image-Text-to-Text • Updated • 1.3k • 37 -
deepseek-ai/deepseek-vl2
Image-Text-to-Text • Updated • 1.89k • 123 -
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Paper • 2412.10302 • Published • 11