- OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding
  Paper • 2406.19389 • Published • 53
- The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
  Paper • 2406.17557 • Published • 90
- RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
  Paper • 2407.02485 • Published • 5
- Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
  Paper • 2407.01370 • Published • 86

Collections including paper arxiv:2407.01370

- Bootstrapping Language Models with DPO Implicit Rewards
  Paper • 2406.09760 • Published • 39
- DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
  Paper • 2406.11931 • Published • 59
- Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
  Paper • 2406.14544 • Published • 35
- Instruction Pre-Training: Language Models are Supervised Multitask Learners
  Paper • 2406.14491 • Published • 87

- How Do Large Language Models Acquire Factual Knowledge During Pretraining?
  Paper • 2406.11813 • Published • 31
- From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries
  Paper • 2406.12824 • Published • 21
- Tokenization Falling Short: The Curse of Tokenization
  Paper • 2406.11687 • Published • 16
- Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level
  Paper • 2406.11817 • Published • 13

- LLoCO: Learning Long Contexts Offline
  Paper • 2404.07979 • Published • 21
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 115
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 17
- LongAlign: A Recipe for Long Context Alignment of Large Language Models
  Paper • 2401.18058 • Published • 21

- Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
  Paper • 2402.14848 • Published • 18
- The Prompt Report: A Systematic Survey of Prompting Techniques
  Paper • 2406.06608 • Published • 58
- CRAG -- Comprehensive RAG Benchmark
  Paper • 2406.04744 • Published • 45
- Transformers meet Neural Algorithmic Reasoners
  Paper • 2406.09308 • Published • 44

- RLHF Workflow: From Reward Modeling to Online RLHF
  Paper • 2405.07863 • Published • 67
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 129
- Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
  Paper • 2405.15574 • Published • 53
- An Introduction to Vision-Language Modeling
  Paper • 2405.17247 • Published • 87

- Attention Is All You Need
  Paper • 1706.03762 • Published • 50
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 16
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
  Paper • 1910.01108 • Published • 14
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 12

- The Unreasonable Ineffectiveness of the Deeper Layers
  Paper • 2403.17887 • Published • 79
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 104
- ReFT: Representation Finetuning for Language Models
  Paper • 2404.03592 • Published • 92
- Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
  Paper • 2404.03715 • Published • 61