Publications

Highlights

NEMO: Can Multimodal LLMs Identify Attribute-Modified Objects?

Jiaxuan Li, Junwen Mo, Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

R.I.P.: A Simple Black-box Attack on Continual Test-time Adaptation

Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do

Persistent Test-time Adaptation in Recurring Testing Scenarios

Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do

A Compact Dynamic 3D Gaussian Representation for Real-Time Dynamic View Synthesis

Kai Katsumata, Duc Minh Vo, Hideki Nakayama

EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension

Jiaxuan Li, Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

Partition-and-Debias: Agnostic Biases Mitigation via A Mixture of Biases-Specific Experts

Jiaxuan Li, Duc Minh Vo, Hideki Nakayama

A-CAP: Anticipation Captioning with Commonsense Knowledge

Duc Minh Vo, Quoc-An Luong, Akihiro Sugimoto, Hideki Nakayama

StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning

Hong Chen, Duc Minh Vo, Hiroya Takamura, Yusuke Miyao, Hideki Nakayama

NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge

Duc Minh Vo, Hong Chen, Akihiro Sugimoto, Hideki Nakayama

OSSGAN: Open-Set Semi-Supervised Image Generation

Kai Katsumata, Duc Minh Vo, Hideki Nakayama

PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression

Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

Visual-Relation Conscious Image Generation from Structured-Text

Duc Minh Vo, Akihiro Sugimoto

Paired-D GAN for Semantic Image Synthesis

Duc Minh Vo, Akihiro Sugimoto

Balancing Content and Style with Two-Stream FCNs for Style Transfer

Duc Minh Vo, Trung-Nghia Le, Akihiro Sugimoto

Paired-D++ GAN for Image Manipulation with Text

Duc Minh Vo, Akihiro Sugimoto

Two-stream FCNs to Balance Content and Style for Style Transfer

Duc Minh Vo, Akihiro Sugimoto


Full List

International Conference & Workshop

1
NEMO: Can Multimodal LLMs Identify Attribute-Modified Objects?
Jiaxuan Li, Junwen Mo, Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama
arXiv 2024  ·  See Project page
2
R.I.P.: A Simple Black-box Attack on Continual Test-time Adaptation
Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do
arXiv 2024  ·  See Project page
3
Uncovering the Risk of Model Collapsing in Self-Supervised Continual Test-time Adaptation
Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do
Workshop on Self-Supervised Learning - Theory and Practice, NeurIPS 2024
4
Questioning, Answering, and Captioning for Zero-Shot Detailed Image Caption
Duc-Tuan Luu, Viet-Tuan Le, Duc Minh Vo
Workshop on Large Vision-Language Model Learning and Applications, ACCV 2024
5
Persistent Test-time Adaptation in Recurring Testing Scenarios
Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do
The Thirty-eighth Annual Conference on Neural Information Processing Systems, NeurIPS 2024  ·  See Project page
6
A Compact Dynamic 3D Gaussian Representation for Real-Time Dynamic View Synthesis
Kai Katsumata, Duc Minh Vo, Hideki Nakayama
European Conference on Computer Vision (ECCV), 2024  ·  See Project page
7
EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
Jiaxuan Li, Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama
IEEE Computer Vision and Pattern Recognition (CVPR), 2024  ·  Invited talk at MIRU, 2024
8
Persistent Test-time Adaptation in Episodic Testing Scenarios
Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do
First Workshop on Test-Time Adaptation Model, Adapt Thyself (MAT), Community Track, CVPR 2024
9
Improving the Robustness of 3D Human Pose Estimation: A Benchmark Dataset and Learning from Noisy Input
Trung-Hieu Hoang, Mona Zehn, Huy Phan, Duc Minh Vo, Minh N. Do
IEEE CVPR Workshop on Fair, Data-Efficient, and Trusted Computer Vision, 2024
10
Balancing Reconstruction and Editing Quality of GAN Inversion for Real Image Editing with StyleGAN Prior Latent Space
Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama
IEEE Winter Conference on Applications of Computer Vision (WACV), 2024  ·  See Code
11
Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data
Kai Katsumata, Duc Minh Vo, Tatsuya Harada, Hideki Nakayama
IEEE Winter Conference on Applications of Computer Vision (WACV), 2024  ·  See Code
12
Label Augmentation as Inter-class Data Augmentation for Conditional Image Synthesis with Imbalanced Data
Kai Katsumata, Duc Minh Vo, Hideki Nakayama
IEEE Winter Conference on Applications of Computer Vision (WACV), 2024  ·  See Code
13
Partition-and-Debias: Agnostic Biases Mitigation via A Mixture of Biases-Specific Experts
Jiaxuan Li, Duc Minh Vo, Hideki Nakayama
International Conference on Computer Vision (ICCV), 2023  ·  See Code
14
Revisiting Latent Space of GAN Inversion for Robust Real Image Editing
Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama
AI for Content Creation Workshop @CVPR, 2023
15
A-CAP: Anticipation Captioning with Commonsense Knowledge
Duc Minh Vo, Quoc-An Luong, Akihiro Sugimoto, Hideki Nakayama
IEEE Computer Vision and Pattern Recognition (CVPR), 2023  ·  See Video
16
Indirect Adversarial Losses via an Intermediate Distribution for Training GANs
Rui Yang, Duc Minh Vo, Hideki Nakayama
IEEE Winter Conference on Applications of Computer Vision (WACV), 2023
17
StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning
Hong Chen, Duc Minh Vo, Hiroya Takamura, Yusuke Miyao, Hideki Nakayama
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022  ·  Oral
18
NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge
Duc Minh Vo, Hong Chen, Akihiro Sugimoto, Hideki Nakayama
IEEE Computer Vision and Pattern Recognition (CVPR), 2022
19
OSSGAN: Open-Set Semi-Supervised Image Generation
Kai Katsumata, Duc Minh Vo, Hideki Nakayama
IEEE Computer Vision and Pattern Recognition (CVPR), 2022  ·  Invited talk at MIRU, 2022
20
PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression
Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama
IEEE Winter Conference on Applications of Computer Vision (WACV), 2022  ·  See Poster
21
Saliency based Subject Selection for Diverse Image Captioning
Quoc-An Luong, Duc Minh Vo, Akihiro Sugimoto
17th International Conference on Machine Vision Applications, 2021
22
Visual-Relation Conscious Image Generation from Structured-Text
Duc Minh Vo, Akihiro Sugimoto
European Conference on Computer Vision (ECCV), 2020
23
Stylized-Colorization for Line Arts
Tzu-Ting Fang, Duc Minh Vo, Akihiro Sugimoto, Shang-Hong Lai
25th International Conference on Pattern Recognition (ICPR), 2020
24
Paired-D GAN for Semantic Image Synthesis
Duc Minh Vo, Akihiro Sugimoto
Asian Conference on Computer Vision (ACCV), 2018  ·  See Poster and Code
25
Balancing Content and Style with Two-Stream FCNs for Style Transfer
Duc Minh Vo, Trung-Nghia Le, Akihiro Sugimoto
IEEE Winter Conference on Applications of Computer Vision (WACV), 2018  ·  See Poster, Code, Demo
26
Facial Expression Recognition by Re-ranking with Global and Local Generic Features
Duc Minh Vo, Akihiro Sugimoto, Thai Hoang Le
23rd International Conference on Pattern Recognition (ICPR), 2016  ·  See Poster

Domestic Conference & Workshop

1
Multimodal Large Language Model Meets New Knowledge: A Preliminary Study
Junwen Mo, Jiaxuan Li, Duc Minh Vo, Hideki Nakayama
The 30th Annual Meeting of the Association for Natural Language Processing (NLP2024)
2
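Deformable 3D Generative Adversarial Networks Using Implicit Deformation Fields
Kai Katsumata (UTokyo, RIKEN), Duc Minh Vo (UTokyo), Tatsuya Harada (UTokyo, RIKEN), Hideki Nakayama (UTokyo)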
MIRU, 2023
3
Robust Novel Object Captioning by Retrieving Objects from External Knowledge
Duc Minh Vo, Hong Chen, Akihiro Sugimoto and Hideki Nakayama
First International Workshop on Embodied Semiotics (EmSemi2023) (Idea presentation)
4
Biases mitigation in medical images via knowledge guidance
Jiaxuan Li, Duc Minh Vo, Kohei Murao, Hiroyuki Abe, Tetsuo Ushiku, Shin'Ichi Satoh and Hideki Nakayama
First International Workshop on Embodied Semiotics (EmSemi2023) (Idea presentation)
5
Enhancing Sign Language Translation with Quantized Video Encoding
Junwen Mo, Duc Minh Vo and Hideki Nakayama
First International Workshop on Embodied Semiotics (EmSemi2023) (Idea presentation)

Journal

1
Anticipation Captioning with Commonsense Knowledge
Duc Minh Vo, Quoc-An Luong, Akihiro Sugimoto, Hideki Nakayama
Journal of the Imaging Society of Japan, 2023 (Invited commentary paper)
2
Stochastically Flipping Labels of Discriminator’s Outputs for Training Generative Adversarial Networks
Rui Yang, Duc Minh Vo, Hideki Nakayama
IEEE Access, 2022
3
Paired-D++ GAN for Image Manipulation with Text
Duc Minh Vo, Akihiro Sugimoto
Machine Vision and Applications (MVAP), 2022
4
Two-stream FCNs to Balance Content and Style for Style Transfer
Duc Minh Vo, Akihiro Sugimoto
Machine Vision and Applications (MVAP), 2020