Luci

Akirami

LuciAkirami

AI & ML interests

None yet

Recent Activity

reacted to singhsidhukuldeep's post with 🤗 about 1 month ago

Exciting breakthrough in Document AI! Researchers from UNC Chapel Hill and Bloomberg have developed M3DocRAG, a revolutionary framework for multi-modal document understanding.

The innovation lies in its ability to handle complex document scenarios that traditional systems struggle with:
- Process 40,000+ pages across 3,000+ documents
- Answer questions requiring information from multiple pages
- Understand visual elements like charts, tables, and figures
- Support both closed-domain (single document) and open-domain (multiple documents) queries

Under the hood, M3DocRAG operates through three sophisticated stages:

>> Document Embedding:
- Converts PDF pages to RGB images
- Uses ColPali to project both text queries and page images into a shared embedding space
- Creates dense visual embeddings for each page while maintaining visual information integrity

>> Page Retrieval:
- Employs MaxSim scoring to compute relevance between queries and pages
- Implements inverted file indexing (IVFFlat) for efficient search
- Reduces retrieval latency from 20s to under 2s when searching 40K+ pages
- Supports approximate nearest neighbor search via Faiss

>> Question Answering:
- Leverages Qwen2-VL 7B as the multi-modal language model
- Processes retrieved pages through a visual encoder
- Generates answers considering both textual and visual context

The results are impressive:
- State-of-the-art performance on MP-DocVQA benchmark
- Superior handling of non-text evidence compared to text-only systems
- Significantly better performance on multi-hop reasoning tasks

This is a game-changer for industries dealing with large document volumes. Finance, healthcare, and legal sectors can now process documents more efficiently while preserving crucial visual context.
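The MaxSim scoring mentioned in the Page Retrieval stage can be illustrated with a small sketch. This is not the M3DocRAG code: the function names (`maxsim_score`, `rank_pages`) and the tiny 2-dimensional vectors are hypothetical, chosen only to show the idea. It assumes each query is a list of token embeddings and each page is a list of patch embeddings; every query token takes its best dot-product match among the page's patches, and those maxima are summed.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], page_patches: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best-matching patch similarity.

    Assumes page_patches is non-empty (max() raises ValueError otherwise).
    """
    return sum(max(dot(q, p) for p in page_patches) for q in query_tokens)


def rank_pages(
    query_tokens: Sequence[Vector],
    pages: Sequence[Sequence[Vector]],
    top_k: int = 2,
) -> List[Tuple[int, float]]:
    """Score every page exhaustively and return the top_k (index, score) pairs."""
    scored = [(i, maxsim_score(query_tokens, patches)) for i, patches in enumerate(pages)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data: 2 query-token embeddings and 3 pages with a few patch embeddings each.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = [
    [[1.0, 0.0], [0.5, 0.5]],  # page 0
    [[0.0, 1.0], [1.0, 0.0]],  # page 1
    [[0.1, 0.1]],              # page 2
]

top = rank_pages(query, pages, top_k=2)
print(top)
```

This sketch scores every page exhaustively. The post describes an approximate index (IVFFlat via Faiss) for the 40K+ page setting, which is not reproduced here.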

updated a collection about 2 months ago

JailBreak

Organizations

Akirami's activity

liked a model about 2 months ago

katanemo/Arch-Guard-cpu

Text Classification • Updated Oct 9, 2024 • 1.3k • 2

liked a model 6 months ago

mlabonne/NeuralPipe-7B-slerp

Text Generation • Updated Jul 2, 2024 • 81 • 7

liked a dataset 6 months ago

openai/gsm8k

Viewer • Updated Jan 4, 2024 • 17.6k • 172k • 479

liked 2 models 7 months ago

CohereForAI/aya-23-8B

Text Generation • Updated Oct 30, 2024 • 14.1k • 398

HuggingFaceFW/fineweb-edu-classifier

Text Classification • Updated Nov 17, 2024 • 6.05k • 153

liked a model 8 months ago

Akirami/truthy-llama3-8b

Text Generation • Updated Apr 29, 2024 • 18 • 1