Rhapsody

non-profit

Activity Feed

AI & ML interests

ANTISCOOPING ANYTHING

RhapsodyAI's activity

BreakLee

authored a paper about 1 month ago

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

Paper • 2412.02611 • Published Dec 3, 2024 • 23

bokesyo

authored a paper 3 months ago

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

Paper • 2410.10594 • Published Oct 14, 2024 • 24

tcy6

authored a paper 3 months ago

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

Paper • 2410.10594 • Published Oct 14, 2024 • 24

bokesyo

authored 2 papers 3 months ago

Enhancing Chat Language Models by Scaling High-quality Instructional Conversations

Paper • 2305.14233 • Published May 23, 2023 • 6

Tool Learning with Foundation Models

Paper • 2304.08354 • Published Apr 17, 2023 • 3

Cuiunbo

authored a paper 3 months ago

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

Paper • 2410.10594 • Published Oct 14, 2024 • 24

bokesyo

updated a model 5 months ago

RhapsodyAI/MiniCPM-V-Embedding-preview

Feature Extraction • Updated Aug 20, 2024 • 40 • 44

yiye2023

updated a model 5 months ago

RhapsodyAI/qwen_vl_guidance

Visual Question Answering • Updated Aug 13, 2024 • 164 • 1

bokesyo

updated a dataset 5 months ago

RhapsodyAI/UltraVL

Viewer • Updated Aug 9, 2024 • 215k • 30 • 3

trainfanlhy

authored a paper 5 months ago

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

Paper • 2408.01800 • Published Aug 3, 2024 • 79

Cuiunbo

authored a paper 5 months ago

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

Paper • 2408.01800 • Published Aug 3, 2024 • 79

Luobots

authored 3 papers 5 months ago

Cuiunbo

updated a model 6 months ago

RhapsodyAI/minicpm-guidance

Visual Question Answering • Updated Jul 15, 2024 • 32 • 6

Cuiunbo

updated a Space 6 months ago

README • Running

Cuiunbo

authored a paper 6 months ago

GUICourse: From General Vision Language Models to Versatile GUI Agents

Paper • 2406.11317 • Published Jun 17, 2024 • 1

Cuiunbo

posted an update 7 months ago

Post

2487

Introducing GUICourse! 🎉
By leveraging extensive OCR pretraining with grounding ability, we unlock the potential of parsing-free methods for GUI agents.
📄 Paper: GUICourse: From General Vision Language Models to Versatile GUI Agents (2406.11317)
🌐 GitHub repo: https://github.com/yiye3/GUICourse
📖 Datasets: yiye2023/GUIAct / yiye2023/GUIChat / yiye2023/GUIEnv
🎯 Models: RhapsodyAI/minicpm-guidance / RhapsodyAI/qwen_vl_guidance

16 replies

BreakLee

authored a paper 8 months ago

SEED-Bench-2: Benchmarking Multimodal Large Language Models

Paper • 2311.17092 • Published Nov 28, 2023

BreakLee

authored a paper 9 months ago

SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension

Paper • 2404.16790 • Published Apr 25, 2024 • 7

Team members 9