Alireza Mohammadshahi

alirezamsh

AI & ML interests

AI/NLP (NMT,LLMs)

Recent Activity

upvoted a paper 15 days ago

Qwen2.5 Technical Report

updated a Space about 2 months ago

alirezamsh/small100

Articles

Mergoo: Efficiently Build Your Own MoE LLM

Jun 3, 2024 • 42

Orchestration of Experts: The First-Principle Multi-Model System

May 30, 2024 • 15

alirezamsh's activity

upvoted a paper 15 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 15 days ago • 334

upvoted 2 papers 4 months ago

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14, 2024 • 75

PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers

Paper • 2406.12430 • Published Jun 18, 2024 • 7

upvoted a collection 4 months ago

Probably function calling datasets

Collection

Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17, 2024 • 37

upvoted a paper 6 months ago

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

Paper • 2406.12753 • Published Jun 18, 2024 • 14

upvoted 6 papers 8 months ago

A Careful Examination of Large Language Model Performance on Grade School Arithmetic

Paper • 2405.00332 • Published May 1, 2024 • 30

Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 116

Better & Faster Large Language Models via Multi-token Prediction

Paper • 2404.19737 • Published Apr 30, 2024 • 73

Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 68

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs

Paper • 2404.16873 • Published Apr 21, 2024 • 28

FlowMind: Automatic Workflow Generation with LLMs

Paper • 2404.13050 • Published Mar 17, 2024 • 33

upvoted a collection 8 months ago

OpenMath

Collection

A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated Oct 1, 2024 • 41

upvoted a paper 8 months ago

Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 87

upvoted 2 articles 8 months ago

Article

Synthetic data: save money, time and carbon with open source

Feb 16, 2024 • 54

Article

Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models

Mar 20, 2024 • 71

upvoted 3 papers 8 months ago

Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks

Paper • 2404.14723 • Published Apr 23, 2024 • 10

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25, 2024 • 75

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Paper • 2404.14619 • Published Apr 22, 2024 • 126

upvoted a collection 9 months ago

Top 10% instruction tuning datasets

Collection

Collects datasets with 'instruction' in the name and more than 1 download and in the top 10% for the number of likes • 13 items • Updated Jul 3, 2024 • 7

upvoted a paper 9 months ago

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Paper • 2306.05685 • Published Jun 9, 2023 • 32