Publications

Highlights

**At the end of this page, you can find the full list of publications.**

EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension

Jiaxuan Li, Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

IEEE Computer Vision and Pattern Recognition (CVPR), 2024

See Project page

Persistent Test-time Adaptation in Episodic Testing Scenarios

Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do

ArXiv

An Efficient 3D Gaussian Representation for Monocular/Multi-view Dynamic Scenes

Kai Katsumata, Duc Minh Vo, Hideki Nakayama

ArXiv

See Code

Partition-and-Debias: Agnostic Biases Mitigation via A Mixture of Biases-Specific Experts

Jiaxuan Li, Duc Minh Vo, Hideki Nakayama

International Conference on Computer Vision (ICCV), 2023

See Code

A-CAP: Anticipation Captioning with Commonsense Knowledge

Duc Minh Vo, Quoc-An Luong, Akihiro Sugimoto, Hideki Nakayama

IEEE Computer Vision and Pattern Recognition (CVPR), 2023

See Video

StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning

Hong Chen, Duc Minh Vo, Hiroya Takamura, Yusuke Miyao, Hideki Nakayama

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Oral

NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge

Duc Minh Vo, Hong Chen, Akihiro Sugimoto, Hideki Nakayama

IEEE Computer Vision and Pattern Recognition (CVPR), 2022

OSSGAN: Open-Set Semi-Supervised Image Generation

Kai Katsumata, Duc Minh Vo, Hideki Nakayama

IEEE Computer Vision and Pattern Recognition (CVPR), 2022

Invited talk at MIRU, 2022

See Code

PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression

Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

IEEE Winter Conference on Applications of Computer Vision (WACV), 2022

See Poster

Visual-Relation Conscious Image Generation from Structured-Text

Duc Minh Vo, Akihiro Sugimoto

European Conference on Computer Vision (ECCV), 2020

Paired-D GAN for Semantic Image Synthesis

Duc Minh Vo, Akihiro Sugimoto

Asian Conference on Computer Vision (ACCV), 2018

See Poster and Code

Journal version: Paired-D++ GAN for Image Manipulation with Text

Duc Minh Vo, Akihiro Sugimoto

Balancing Content and Style with Two-Stream FCNs for Style Transfer

Duc Minh Vo, Trung-Nghia Le, Akihiro Sugimoto

IEEE Winter Conference on Applications of Computer Vision (WACV), 2018

See Poster, Code, Demo

 

Paired-D++ GAN for Image Manipulation with Text

Duc Minh Vo, Akihiro Sugimoto

Machine Vision and Applications (MVAP), 2022

Two-stream FCNs to Balance Content and Style for Style Transfer

Duc Minh Vo, Akihiro Sugimoto

Machine Vision and Applications (MVAP), 2020

 

Full list of publications

International conference/workshop

  1. EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
    Jiaxuan Li, Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama
    IEEE Computer Vision and Pattern Recognition (CVPR), 2024
  2. Persistent Test-time Adaptation in Episodic Testing Scenarios
    Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do
    ArXiv
  3. An Efficient 3D Gaussian Representation for Monocular/Multi-view Dynamic Scenes
    Kai Katsumata, Duc Minh Vo, Hideki Nakayama
    ArXiv
  4. Balancing Reconstruction and Editing Quality of GAN Inversion for Real Image Editing with StyleGAN Prior Latent Space
    Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama
    IEEE Winter Conference on Applications of Computer Vision (WACV), 2024
  5. Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data
    Kai Katsumata, Duc Minh Vo, Tatsuya Harada, Hideki Nakayama
    IEEE Winter Conference on Applications of Computer Vision (WACV), 2024
  6. Label Augmentation as Inter-class Data Augmentation for Conditional Image Synthesis with Imbalanced Data
    Kai Katsumata, Duc Minh Vo, Hideki Nakayama
    IEEE Winter Conference on Applications of Computer Vision (WACV), 2024
  7. Partition-and-Debias: Agnostic Biases Mitigation via A Mixture of Biases-Specific Experts
    Jiaxuan Li, Duc Minh Vo, Hideki Nakayama
    International Conference on Computer Vision (ICCV), 2023
  8. Revisiting Latent Space of GAN Inversion for Robust Real Image Editing
    Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama
    AI for Content Creation Workshop @CVPR, 2023
  9. A-CAP: Anticipation Captioning with Commonsense Knowledge
    Duc Minh Vo, Quoc-An Luong, Akihiro Sugimoto, Hideki Nakayama
    IEEE Computer Vision and Pattern Recognition (CVPR), 2023
  10. Indirect Adversarial Losses via an Intermediate Distribution for Training GANs
    Rui Yang, Duc Minh Vo, Hideki Nakayama
    IEEE Winter Conference on Applications of Computer Vision (WACV), 2023
  11. StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning
    Hong Chen, Duc Minh Vo, Hiroya Takamura, Yusuke Miyao, Hideki Nakayama
    Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
  12. NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge
    Duc Minh Vo, Hong Chen, Akihiro Sugimoto, Hideki Nakayama
    IEEE Computer Vision and Pattern Recognition (CVPR), 2022
  13. OSSGAN: Open-Set Semi-Supervised Image Generation
    Kai Katsumata, Duc Minh Vo, Hideki Nakayama
    IEEE Computer Vision and Pattern Recognition (CVPR), 2022
  14. PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression
    Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama
    IEEE Winter Conference on Applications of Computer Vision (WACV), 2022
  15. Saliency based Subject Selection for Diverse Image Captioning
    Quoc-An Luong, Duc Minh Vo, Akihiro Sugimoto
    17th International Conference on Machine Vision Applications, 2021
  16. Visual-Relation Conscious Image Generation from Structured-Text
    Duc Minh Vo, Akihiro Sugimoto
    European Conference on Computer Vision (ECCV), 2020
  17. Stylized-Colorization for Line Arts
    Tzu-Ting Fang, Duc Minh Vo, Akihiro Sugimoto, Shang-Hong Lai
    25th International Conference on Pattern Recognition (ICPR), 2020
  18. Paired-D GAN for Semantic Image Synthesis
    Duc Minh Vo, Akihiro Sugimoto
    Asian Conference on Computer Vision (ACCV), 2018
  19. Balancing Content and Style with Two-Stream FCNs for Style Transfer
    Duc Minh Vo, Trung-Nghia Le, Akihiro Sugimoto
    IEEE Winter Conference on Applications of Computer Vision (WACV), 2018
  20. Facial Expression Recognition by Re-ranking with Global and Local Generic Features
    Duc Minh Vo, Akihiro Sugimoto, Thai Hoang Le
    23rd International Conference on Pattern Recognition (ICPR), 2016

Domestic conference/workshop

  1. Improving the Robustness of 3D Human Pose Estimation: A Benchmark Dataset and Learning from Noisy Input
    Trung-Hieu Hoang, Mona Zehn, Huy Phan, Duc Minh Vo, Minh N. Do
    IEEE CVPR workshop on fair, data-efficient, and trusted computer vision, 2024
  2. Multimodal Large Language Model Meets New Knowledge: A Preliminary Study
    Junwen Mo, Jiaxuan Li, Duc Minh Vo, Hideki Nakayama
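    The 30th Annual Meeting of the Association for Natural Language Processing (NLP2024)
  3. Deformable 3D Generative Adversarial Networks Using Implicit Deformation Fields
    Kai Katsumata (UTokyo, RIKEN), Duc Minh Vo (UTokyo), Tatsuya Harada (UTokyo, RIKEN), Hideki Nakayama (UTokyo)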
    MIRU, 2023
  4. Robust Novel Object Captioning by Retrieving Objects from External Knowledge
    Duc Minh Vo, Hong Chen, Akihiro Sugimoto and Hideki Nakayama
    First International Workshop on Embodied Semiotics (EmSemi2023) (Idea presentation)
  5. Biases mitigation in medical images via knowledge guidance
    Jiaxuan Li, Duc Minh Vo, Kohei Murao, Hiroyuki Abe, Tetsuo Ushiku, Shin'Ichi Satoh and Hideki Nakayama
    First International Workshop on Embodied Semiotics (EmSemi2023) (Idea presentation)
  6. Enhancing Sign Language Translation with Quantized Video Encoding
    Junwen Mo, Duc Minh Vo and Hideki Nakayama
    First International Workshop on Embodied Semiotics (EmSemi2023) (Idea presentation)

Journal

  1. Anticipation Captioning with Commonsense Knowledge
    Duc Minh Vo, Quoc-An Luong, Akihiro Sugimoto, Hideki Nakayama
    Journal of the Imaging Society of Japan, 2023 (Invited commentary paper)
  2. Stochastically Flipping Labels of Discriminator’s Outputs for Training Generative Adversarial Networks
    Rui Yang, Duc Minh Vo, Hideki Nakayama
    IEEE Access, 2022
  3. Paired-D++ GAN for Image Manipulation with Text
    Duc Minh Vo, Akihiro Sugimoto
    Machine Vision and Applications (MVAP), 2022
  4. Two-stream FCNs to Balance Content and Style for Style Transfer
    Duc Minh Vo, Akihiro Sugimoto
    Machine Vision and Applications (MVAP), 2020