Depth fuels expertise; breadth sparks innovation.
Yuan Tseng, Layne Berry*, Yi-Ting Chen*, I-Hsiang Chiu*, Hsuan-Hao Lin*, Max Liu*, Puyuan Peng*, Yi-Jen Shih*, Hung-Yu Wang*, Haibin Wu*, Po-Yao Huang, Shang-Wen Li, David Harwath, Yu Tsao, Shinji Watanabe, Abdelrahman Mohamed, Chi-Luen Feng
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024
We propose the AV-SUPERB benchmark that enables general-purpose evaluation of unimodal audio/visual and bimodal fusion representations on 7 datasets covering 5 audio-visual tasks in speech and audio processing.
Frontiers4LCD Workshop at International Conference on Machine Learning (ICML), 2023
We propose a novel imitation learning method that incorporates a diffusion model, and we show that it outperforms previous imitation learning methods.
23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), 2022
We propose a data augmentation method for dialogue state tracking (DST) that improves on the state-of-the-art performance on MultiWOZ 2.1.