Vo Minh Duc Homepage

🖼️ Image/Video Synthesis 🤖 Foundation Models 🌐 Vision & Language 🎨 Style Transfer 🧠 Deep Learning 🔀 Domain Adaptation ⚖️ Debiasing 📽️ Diffusion Models

I am Vo Minh Duc, a Senior Research Scientist at SB Intuitions, Japan, working on foundation models and multimodal generation. Feel free to reach out if you are interested in collaborating!

2025– Senior Research Scientist · SB Intuitions, Japan

2022–2025 Project Assistant Professor · Nakayama Lab, University of Tokyo

Ph.D. Computer Science · SOKENDAI / NII · Advisor: Prof. Akihiro Sugimoto

B.Sc. & M.S. Computer Science · University of Science, Vietnam National University HCMC

🌐 Community Activities

🏆 Grand Challenge · ACM MM 2026

LAVA Challenge

Large Vision–Language Model Learning & Applications

Annual grand challenge on document understanding with Vision-Language Models. The 2026 edition extends to multilingual PDFs and evidence-grounded answering.

📄 Multilingual PDFs 🌏 Int'l teams 📚 ACM MM proceedings

Visit lava-workshop.github.io →

📖 Bi-weekly · Vietnam & Japan

VJAI Paper Reading Hub

From Paper to Prototype

A high-signal AI paper-reading community for engineers and researchers in Vietnam & Japan. Every two weeks, members deep-dive into papers that matter.

📝 9+ papers digested 👥 24+ members 🔁 Bi-weekly cadence

Visit vominhduc.github.io/vjai-paper-hub →

📰