Clem 🤗 PRO

clem

http://huggingface.co

AI & ML interests

multi-modal, time-series, biology and chemistry

Recent Activity

liked a model 8 days ago

Qwen/QVQ-72B-Preview

reacted to etemiz's post with ❤️ 8 days ago

Should I create an organization tackling the AI-human alignment problem? The idea is to find the humans who care about other humans most and basically pretrain with their stuff. I already did some experiments and it seems to work well. Want to know about my experiments? Who would be interested in joining?

reacted to wenhuach's post with 🚀 8 days ago

Are we the only providers of INT4 quantized models for Llama 3.2 VL? https://huggingface.co/OPEA/Llama-3.2-90B-Vision-Instruct-int4-sym-inc https://huggingface.co/OPEA/Llama-3.2-11B-Vision-Instruct-int4-sym-inc

clem's activity

upvoted a paper 13 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 14 days ago • 113

upvoted a collection 13 days ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 13 days ago • 107

upvoted a paper 14 days ago

The Open Source Advantage in Large Language Models (LLMs)

Paper • 2412.12004 • Published 16 days ago • 9

upvoted 17 papers 16 days ago

SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding

Paper • 2412.09604 • Published 20 days ago • 35

GenEx: Generating an Explorable World

Paper • 2412.09624 • Published 20 days ago • 86

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published 19 days ago • 132

Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions

Paper • 2412.08737 • Published 21 days ago • 51

Phi-4 Technical Report

Paper • 2412.08905 • Published 20 days ago • 93

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published 20 days ago • 92

POINTS1.5: Building a Vision-Language Model towards Real World Applications

Paper • 2412.08443 • Published 21 days ago • 38

LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations

Paper • 2412.08580 • Published 21 days ago • 45

SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints

Paper • 2412.07760 • Published 22 days ago • 50

DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published 22 days ago • 46

Evaluating and Aligning CodeLLMs on Human Preference

Paper • 2412.05210 • Published 26 days ago • 47

STIV: Scalable Text and Image Conditioned Video Generation

Paper • 2412.07730 • Published 22 days ago • 70

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 23 days ago • 63

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published 23 days ago • 69

Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation

Paper • 2412.06531 • Published 23 days ago • 71

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published 26 days ago • 46

EXAONE 3.5: Series of Large Language Models for Real-world Use Cases

Paper • 2412.04862 • Published 26 days ago • 48