Publications


2024

  • SlowFast Network for Continuous Sign Language Recognition
    J. Ahn, Y. Jang, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification
    H. Heo, K. Nam, B. Lee, Y. Kwon, M. Lee, Y. J. Kim, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • Speech Guided Masked Image Modeling for Visually Grounded Speech
    J. Woo, H. Ryu, A. Senocak, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • VoxMM: Rich Transcription of Conversations in the Wild
    D. Kwak, J. Jung, K. Nam, Y. Jang, J. Jung, S. Watanabe, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • From Coarse To Fine: Efficient Training for Audio Spectrogram Transformers
    J. Feng, M. H. Erol, J. S. Chung, A. Senocak
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • VoiceLDM: Text-to-Audio Generation with Linguistic Content
    Y. Lee, I. Yeon, J. Nam, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF Project page
  • TalkNCE: Improving Active Speaker Detection with Talking-Aware Contrastive Learning
    C. Jung, S. Lee, K. Nam, K. Rho, Y. J. Kim, Y. Jang, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • Seeing Through the Conversation: Audio-Visual Speech Separation based on Diffusion Model
    S. Lee, C. Jung, Y. Jang, J. Kim, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • Let There Be Sound: Reconstructing High Quality Speech from Silent Videos
    J. Kim, J. Kim, J. S. Chung
    AAAI Conference on Artificial Intelligence
    PDF Project page

2023

  • That's What I Said: Fully-Controllable Talking Face Generation
    Y. Jang, K. Rho, J. Woo, H. Lee, J. Park, Y. Lim, B. Kim, J. S. Chung
    ACM International Conference on Multimedia
    PDF Project page
  • Sound Source Localization is All about Cross-Modal Alignment
    A. Senocak, H. Ryu, J. Kim, T. Oh, H. Pfister, J. S. Chung
    International Conference on Computer Vision
    PDF
  • Curriculum learning for self-supervised speaker verification
    H. Heo, J. Jung, J. Kang, Y. Kwon, B. Lee, Y. J. Kim, J. S. Chung
    Interspeech
    PDF
  • Self-sufficient framework for continuous sign language recognition
    Y. Jang, Y. Oh, J. W. Cho, M. Kim, D. Kim, I. S. Kweon, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF Project page
  • Metric learning for user-defined keyword spotting
    J. Jung, Y. Kim, J. Park, Y. Lim, B. Kim, Y. Jang, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF Project page
  • Hindi as a second language: improving visually grounded speech with semantically similar samples
    H. Ryu, A. Senocak, I. S. Kweon, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • MarginNCE: Robust Sound Localization with a Negative Margin
    S. Park, A. Senocak, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity
    Y. J. Kim, H. Heo, J. Jung, Y. Kwon, B. Lee, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • In search of strong embedding extractors for speaker diarisation
    J. Jung, B. Lee, J. Huh, A. Brown, Y. Kwon, S. Watanabe, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech
    J. Lee, J. S. Chung, S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF

2022

  • Signing Outside the Studio: Benchmarking Background Robustness for Continuous Sign Language Recognition
    Y. Jang, Y. Oh, J. W. Cho, D. Kim, J. S. Chung, I. S. Kweon
    British Machine Vision Conference
    PDF Project page
  • Augmentation adversarial training for self-supervised speaker representation learning
    J. Kang, J. Huh, H. Heo, J. S. Chung
    IEEE Journal of Selected Topics in Signal Processing
    PDF
  • Pushing the limits of raw waveform speaker recognition
    J. Jung, Y. J. Kim, H. Heo, B. Lee, Y. Kwon, J. S. Chung
    Interspeech
    PDF
  • Spell my name: Keyword boosted speech recognition
    N. Jung, G. Kim, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • Multi-scale speaker embedding-based graph attention networks for speaker diarisation
    Y. Kwon, H. Heo, J. Jung, Y. J. Kim, B. Lee, J. S. Chung
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
  • AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks
    J. Jung, H. Heo, H. Tak, H. Shim, J. S. Chung, B. Lee, H. Yu, N. Evans
    International Conference on Acoustics, Speech, and Signal Processing
    PDF
