Philipp Schmid

philschmid

https://www.philschmid.de

AI & ML interests

None yet

Recent Activity

updated a model 7 days ago

philschmid/modernbert-llm-router

updated a collection 9 days ago

LLM Reasoning Papers

upvoted a paper 11 days ago

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

Articles

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

May 1, 2024

• 69

Welcome Llama 3 - Meta's new open LLM

Apr 18, 2024

• 281

Making thousands of open LLMs bloom in the Vertex AI Model Garden

Apr 10, 2024

• 18

CodeGemma - an official Google release for code LLMs

Apr 9, 2024

• 99

Bringing serverless GPU inference to Hugging Face users

Apr 2, 2024

• 11

Easily Train Models with H100 GPUs on NVIDIA DGX Cloud

Mar 18, 2024

• 6

Welcome Gemma - Google's new open LLM

Feb 21, 2024

• 21

From OpenAI to Open LLMs with Messages API

Feb 8, 2024

• 12

Hugging Face Text Generation Inference available for AWS Inferentia2

Feb 1, 2024

• 5

Hugging Face and Google partner for open AI collaboration

Jan 25, 2024

• 4

Mixture of Experts Explained

Dec 11, 2023

• 235

Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face

Dec 11, 2023

• 11

Deploy Embedding Models with Hugging Face Inference Endpoints

Oct 24, 2023

• 2

Llama 2 on Amazon SageMaker a Benchmark

Sep 26, 2023

Fine-tuning Llama 2 70B using PyTorch FSDP

Sep 13, 2023

• 16

Spread Your Wings: Falcon 180B is here

Sep 6, 2023

• 4

Code Llama: Llama 2 learns to code

Aug 25, 2023

• 9

Introducing SafeCoder

Aug 22, 2023

Hugging Face Platform on the AWS Marketplace: Pay with your AWS Account

Aug 10, 2023

Llama 2 is here - get it on Hugging Face

Jul 18, 2023

• 23

Deploy LLMs with Hugging Face Inference Endpoints

Jul 4, 2023

• 11

The Falcon has landed in the Hugging Face ecosystem

Jun 5, 2023

• 10

Introducing the Hugging Face LLM Inference Container for Amazon SageMaker

May 31, 2023

• 2

Hugging Face Collaborates with Microsoft to Launch Hugging Face Model Catalog on Azure

May 24, 2023

Creating a Coding Assistant with StarCoder

May 9, 2023

• 1

Accelerating Hugging Face Transformers with AWS Inferentia2

Apr 17, 2023

Hugging Face and AWS partner to make AI more accessible

Feb 21, 2023

• 2

Pre-Train BERT with Hugging Face Transformers and Habana Gaudi

Aug 22, 2022

• 5

Convert Transformers to ONNX with Hugging Face Optimum

Jun 22, 2022

• 3

Accelerated Inference with Optimum and Transformers Pipelines

May 10, 2022

• 2

Accelerate BERT inference with Hugging Face Transformers and AWS inferentia

Mar 16, 2022

Case Study: Millisecond Latency using Hugging Face Infinity and modern CPUs

Jan 13, 2022

• 2

Deploy GPT-J 6B for inference using Hugging Face Transformers and Amazon SageMaker

Jan 11, 2022

Few-shot learning in practice: GPT-NEO and the 🤗 Accelerated Inference API

Jun 3, 2021

• 4

Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker

Apr 8, 2021

Organizations

philschmid's activity

upvoted a paper 11 days ago

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

Paper • 2407.21787 • Published Jul 31, 2024 • 12

upvoted a paper 19 days ago

Phi-4 Technical Report

Paper • 2412.08905 • Published 20 days ago • 93

upvoted a paper about 1 month ago

Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published Nov 20, 2024 • 39

upvoted a paper about 2 months ago

"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization

Paper • 2411.02355 • Published Nov 4, 2024 • 46

upvoted 2 papers 3 months ago

Pyramidal Flow Matching for Efficient Video Generative Modeling

Paper • 2410.05954 • Published Oct 8, 2024 • 38

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 104

upvoted an article 3 months ago

Article

Llama can now see and run on your device - welcome Llama 3.2

Sep 25, 2024

• 180

upvoted a collection 3 months ago

Llama 3.2

Collection

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated 26 days ago • 550

upvoted 2 papers 3 months ago

EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published Sep 24, 2024 • 25

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 135

upvoted a collection 3 months ago

Qwen2.5

Collection

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Nov 28, 2024 • 451

upvoted 2 collections 4 months ago

LLM-Reasoning

Collection

18 items • Updated Jul 1, 2024 • 2

🤖 Agents

Collection

21 items • Updated about 22 hours ago • 61

upvoted an article 4 months ago

Article

Meet Yi-Coder: A Small but Mighty LLM for Code

Sep 4, 2024

• 14

upvoted 2 papers 4 months ago

Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models

Paper • 2408.02442 • Published Aug 5, 2024 • 21

Generative Verifiers: Reward Modeling as Next-Token Prediction

Paper • 2408.15240 • Published Aug 27, 2024 • 13

upvoted a collection 5 months ago

Probably function calling datasets

Collection

Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17, 2024 • 36

upvoted an article 5 months ago

Article

Google releases Gemma 2 2B, ShieldGemma and Gemma Scope

Jul 31, 2024

• 59

upvoted a collection 5 months ago

Llama 3.1 GPTQ, AWQ, and BNB Quants

Collection

Optimised Quants for high-throughput deployments! Compatible with Transformers, TGI & VLLM 🤗 • 9 items • Updated Sep 26, 2024 • 56

upvoted a paper 5 months ago

BOND: Aligning LLMs with Best-of-N Distillation

Paper • 2407.14622 • Published Jul 19, 2024 • 18

Articles

Hugging Face models in Amazon Bedrock

Introducing HUGS - Scale your AI with Open Models

Llama can now see and run on your device - welcome Llama 3.2

Deploy Meta Llama 3.1 405B on Google Cloud Vertex AI

Serverless Inference with Hugging Face and NVIDIA NIMs

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

Google Cloud TPUs made available to Hugging Face users

Welcome Gemma 2 - Google's new open LLM

Introducing the Hugging Face Embedding Container for Amazon SageMaker

Deploy models on AWS Inferentia2 from Hugging Face

From cloud to developers: Hugging Face and Microsoft Deepen Collaboration

Build AI on premise with Dell Enterprise Hub
