Integrated Vision and Language Lab.

university

https://www.ivylab.kaist.ac.kr/research_1/lab-overview

AI & ML interests

Deep learning and machine learning for computer vision and multimedia; multimodal deep learning; integrating vision, speech, and language for AI; multimodal object and motion detection/recognition; inclusive human-machine teaming; analysis of the competency, interpretability, memorability, and robustness of deep learning models; multimodal prompting with large-scale models

ivyivl's activity

dwightro

authored 2 papers about 2 months ago

VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models

Paper • 2412.01822 • Published Dec 2, 2024 • 14

Look Every Frame All at Once: Video-Ma$^2$mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing

Paper • 2411.19460 • Published Nov 29, 2024 • 10

dwightro

authored a paper 7 months ago

TroL: Traversal of Layers for Large Language and Vision Models

Paper • 2406.12246 • Published Jun 18, 2024 • 35

dwightro

authored a paper 8 months ago

Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24, 2024 • 53

dwightro

authored 10 papers 10 months ago

Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing

Paper • 2402.15151 • Published Feb 23, 2024 • 7

Masking Adversarial Damage: Finding Adversarial Saliency for Robust and Sparse Network

Paper • 2204.02738 • Published Apr 6, 2022 • 3

Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck

Paper • 2204.02735 • Published Apr 6, 2022 • 4

CoLLaVO: Crayon Large Language and Vision mOdel

Paper • 2402.11248 • Published Feb 17, 2024 • 21

Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression

Paper • 2303.01052 • Published Mar 2, 2023 • 3

Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge

Paper • 2308.09311 • Published Aug 18, 2023 • 1

DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding

Paper • 2308.07787 • Published Aug 15, 2023 • 1

Causal Unsupervised Semantic Segmentation

Paper • 2310.07379 • Published Oct 11, 2023 • 3

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Paper • 2403.07508 • Published Mar 12, 2024 • 75

Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning

Paper • 2307.07250 • Published Jul 14, 2023 • 2
