jiakai

real-jiakai

https://blog.gujiakai.top

AI & ML interests

LLM && Smart QA

Recent Activity

liked a model 4 days ago

jinaai/ReaderLM-v2

upvoted a paper 4 days ago

Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability

upvoted a paper 4 days ago

OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

real-jiakai's activity

upvoted 2 papers 4 days ago

Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability

Paper • 2411.19943 • Published Nov 29, 2024 • 57

OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

Paper • 2412.02592 • Published Dec 3, 2024 • 21

upvoted a paper 8 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 11 days ago • 232

upvoted 2 papers 9 days ago

LLM4SR: A Survey on Large Language Models for Scientific Research

Paper • 2501.04306 • Published 11 days ago • 33

Agent Laboratory: Using LLM Agents as Research Assistants

Paper • 2501.04227 • Published 12 days ago • 77

upvoted a paper 10 days ago

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published 12 days ago • 48

upvoted a paper 11 days ago

Personalized Graph-Based Retrieval for Large Language Models

Paper • 2501.02157 • Published 16 days ago • 28

upvoted a paper 12 days ago

OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Paper • 2412.20005 • Published 22 days ago • 17

upvoted a collection 13 days ago

🪐 SmolLM

Collection

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 28 days ago • 209

upvoted an article 15 days ago

✴️ ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use

Article • 16 days ago • 12

upvoted a paper 15 days ago

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published 17 days ago • 47

upvoted 2 articles 17 days ago

Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK

Article • Nov 21, 2024 • 34

🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark

Article • 17 days ago • 38

upvoted a paper 17 days ago

Executable Code Actions Elicit Better LLM Agents

Paper • 2402.01030 • Published Feb 1, 2024 • 44

upvoted a collection 18 days ago

Open LLM Leaderboard best models ❤️‍🔥

Collection

A daily uploaded list of models with best evaluations on the LLM leaderboard • 65 items • Updated 35 minutes ago • 508

upvoted a collection 24 days ago

GTE models

Collection

General Text Embedding Models Released by Tongyi Lab of Alibaba Group • 19 items • Updated 29 days ago • 19

upvoted 4 papers about 1 month ago

OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain

Paper • 2412.13018 • Published Dec 17, 2024 • 41