INTERSPEECH2021
An Introduction to Automatic Differentiation with Weighted Finite-State Automata
2:59:31
INTERSPEECH2021
Neural target speech extraction
3:00:29
INTERSPEECH2021
Concept to Code: Semi-Supervised End-To-End Approaches For Speech Recognition
2:23:05
INTERSPEECH2021
Intonation Transcription and Modelling in Research and Speech Technology Applications
2:50:58
INTERSPEECH2021
SpeechBrain: Unifying Speech Technologies and Deep Learning With an Open Source Toolkit
1:18:38
INTERSPEECH2021
SpeechBrain: Unifying Speech Technologies and Deep Learning With an Open Source Toolkit
45:23
INTERSPEECH2021
SpeechBrain: Unifying Speech Technologies and Deep Learning With an Open Source Toolkit
33:47
INTERSPEECH2021
Speech Recognition with Next-Generation Kaldi (K2, Lhotse, Icefall)
3:04:22
INTERSPEECH2021
Language Modeling and Artificial Intelligence
1:02:31
INTERSPEECH2021
Learning speech models from multi-modal data
1:00:15
INTERSPEECH2021
Towards automatic speech recognition for people with atypical speech
1:04:17
INTERSPEECH2021
Opening ceremony
1:15:02
INTERSPEECH2021
Child Language Acquisition studied with Wearables
58:22
INTERSPEECH2021
Uncovering the acoustic cues of COVID-19 infection
1:03:19
INTERSPEECH2021
Ethical and Technological Challenges of Conversational AI
51:15
INTERSPEECH2021
Adaptive listening to everyday soundscapes
1:02:36
INTERSPEECH2021
ISCA Medalist: Forty years of speech and language processing: from Bayes decision rule to deep l...
58:26
INTERSPEECH2021
Web Interface for estimating articulatory movements in speech production from acoustics and text...
3:19
INTERSPEECH2021
WittyKiddy: Multilingual Spoken Language Learning for Kids - (3 minutes introduction)
3:45
INTERSPEECH2021
NeMo (Inverse) Text Normalization: From Development To Production - (longer introduction)
12:58
INTERSPEECH2021
Automatic Radiology Report Editing through Voice - (3 minutes introduction)
2:37
INTERSPEECH2021
NeMo (Inverse) Text Normalization: From Development To Production - (3 minutes introduction)
3:17
INTERSPEECH2021
Save your Voice: Voice Banking and TTS for Anyone - (3 minutes introduction)
3:19
INTERSPEECH2021
Interactive and real-time acoustic measurement tools for speech data acquisition and presentatio...
2:28
INTERSPEECH2021
Analysis and Tuning of a Voice Assistant System for Dysfluent Speech - (Oral presentation)
1:23
INTERSPEECH2021
F-T-LSTM based Complex Network for Joint Acoustic Echo Cancellation and Speech Enhancement - (Or...
16:09
INTERSPEECH2021
Comparing Supervised Models And Learned Speech Representations For Classifying Intelligibility O...
1:28
INTERSPEECH2021
Disordered Speech Data Collection: Lessons Learned at 1 Million Utterances from Project Euphonia...
1:29
INTERSPEECH2021
Conformer Parrotron: a Faster and Stronger End-to-end Speech Conversion and Recognition Model for...
1:14
INTERSPEECH2021
A Voice-Activated Switch for Persons with Motor and Speech Impairments: Isolated-Vowel Spotting ...
1:20
INTERSPEECH2021
Bayesian Parametric and Architectural Domain Adaptation of LF-MMI Trained TDNNs for Elderly and ...
1:22
INTERSPEECH2021
Variational Auto-Encoder Based Variability Encoding for Dysarthric Speech Recognition - (Oral pr...
1:21
INTERSPEECH2021
Adversarial Data Augmentation for Disordered Speech Recognition - (Oral presentation)
1:21
INTERSPEECH2021
INTERSPEECH 2021 Acoustic Echo Cancellation Challenge - (Oral presentation)
18:53
INTERSPEECH2021
Handling acoustic variation in dysarthric speech recognition systems through model combination -...
1:08
INTERSPEECH2021
Acoustic Echo Cancellation using Deep Complex Neural Network with Nonlinear Magnitude Compressio...
16:31
INTERSPEECH2021
Investigating the Utility of Multimodal Conversational Technology and Audiovisual Analytic Measu...
1:21
INTERSPEECH2021
Automatic Speech Recognition of Disordered Speech: Personalized models outperforming human liste...
1:32
INTERSPEECH2021
Factorization-Aware Training of Transformers for Natural Language Understanding On the Edge - (3...
3:22
INTERSPEECH2021
End-to-End Cross-Lingual Spoken Language Understanding Model with Multilingual Pretraining - (3 ...
2:52
INTERSPEECH2021
Augmenting Slot Values and Contexts for Spoken Language Understanding with Pretrained Models - (...
3:19
INTERSPEECH2021
Synthesis of expressive speaking styles with limited training data in a multi-speaker, prosody-c...
2:08
INTERSPEECH2021
Cross-speaker Style Transfer with Prosody Bottleneck in Neural Speech Synthesis - (3 minutes int...
3:17
INTERSPEECH2021
SponSpeech: Adaptive Text to Speech for Spontaneous Style - (3 minutes introduction)
3:24
INTERSPEECH2021
Presentation matters: Evaluating speaker identification tasks - (longer introduction)
13:34
INTERSPEECH2021
Towards Multi-Scale Style Control for Expressive Speech Synthesis - (3 minutes introduction)
3:20
INTERSPEECH2021
Expressive Text-to-Speech using Style Tag - (3 minutes introduction)
3:19
INTERSPEECH2021
Controllable Context-Aware Conversational Speech Synthesis - (3 minutes introduction)
3:23
INTERSPEECH2021
STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressi...
3:06
INTERSPEECH2021
An Integrated Framework for Two-pass Personalized Voice Trigger - (3 minutes introduction)
3:19
INTERSPEECH2021
Automatic Error Correction for Speaker Embedding Learning with Noisy Labels - (3 minutes introdu...
3:13
INTERSPEECH2021
Presentation matters: Evaluating speaker identification tasks - (3 minutes introduction)
3:21
INTERSPEECH2021
Chronological Self-Training for Real-Time Speaker Diarization - (3 minutes introduction)
3:01
INTERSPEECH2021
Multi-Channel Speaker Verification for Single and Multi-talker Speech - (3 minutes introduction)...
3:18
INTERSPEECH2021
Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition - (3 minutes...
3:21
INTERSPEECH2021
Fusion of Embeddings Networks for Robust Combination of Text Dependent and Independent Speaker R...
3:29
INTERSPEECH2021
Collaborative Training of Acoustic Encoders for Speech Recognition - (3 minutes introduction)
3:21
INTERSPEECH2021
Graph-based Label Propagation for Semi-Supervised Speaker Identification - (3 minutes introducti...
3:08
INTERSPEECH2021
Weakly Supervised Construction of ASR Systems from Massive Video Data - (longer introduction)
10:46
INTERSPEECH2021
PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation - (3 minutes introd...
3:09
INTERSPEECH2021
Tied & Reduced RNN-T Decoder - (3 minutes introduction)
3:09
INTERSPEECH2021
Extremely Low Footprint End-to-End ASR System for Smart Device - (3 minutes introduction)
3:14
INTERSPEECH2021
Compressing 1D Time-Channel Separable Convolutions using Sparse Random Ternary Matrices - (longe...
15:13
INTERSPEECH2021
Weakly Supervised Construction of ASR Systems from Massive Video Data - (3 minutes introduction)...
2:35
INTERSPEECH2021
Compressing 1D Time-Channel Separable Convolutions using Sparse Random Ternary Matrices - (3 min...
3:17
INTERSPEECH2021
Generalized Dilated CNN Models for Depression Detection Using Inverted Vocal Tract Variables - (...
3:20
INTERSPEECH2021
Speech Emotion Recognition with Multi-task Learning - (3 minutes introduction)
3:18
INTERSPEECH2021
Metric Learning Based Feature Representation With Gated Fusion Model For Speech Emotion Recognit...
2:58
INTERSPEECH2021
Audio-Visual Speech Emotion Recognition by Disentangling Emotion and Identity Attributes - (3 mi...
3:20
INTERSPEECH2021
Improvement of Automatic English Pronunciation Assessment with Small Number of Utterances Using ...
3:11
INTERSPEECH2021
NeMo Inverse Text Normalization: From Development To Production - (3 minutes introduction)
2:49
INTERSPEECH2021
"You don't understand me!": Comparing ASR results for L1 and L2 speakers of Swedish - (3 minutes...
3:21
INTERSPEECH2021
Deep feature transfer learning for automatic pronunciation assessment - (3 minutes introduction)...
3:01
INTERSPEECH2021
Lexical Density Analysis of Word Productions in Japanese English Using Acoustic Word Embeddings ...
3:29
INTERSPEECH2021
Explore Wav2vec 2.0 for Mispronunciation Detection - (3 minutes introduction)
3:14
INTERSPEECH2021
Toward Genre Adapted Closed Captioning - (Oral presentation)
18:39
INTERSPEECH2021
End-to-End Speaker-Attributed ASR with Transformer - (3 minutes introduction)
3:03
INTERSPEECH2021
Weakly-supervised word-level pronunciation error detection in non-native English speech - (longe...
2:52
INTERSPEECH2021
EML Online Speech Activity Detection for Fearless Steps Challenge Phase-III - (Oral presentation...
18:50
INTERSPEECH2021
Spoken Term Detection and Relevance Score Estimation using Dot-Product of Pronunciation Embeddin...
18:54
INTERSPEECH2021
Voice Activity Detection With Teacher-Student Domain Emulation - (Oral presentation)
21:01
INTERSPEECH2021
Semantic sentence similarity: size does not always matter - (Oral presentation)
18:22
INTERSPEECH2021
Speech Activity Detection Based on Multilingual Speech Recognition System - (Oral presentation)...
20:08
INTERSPEECH2021
The Application of Learnable STRF Kernels to the 2021 Fearless Steps Phase-03 SAD Challenge - (O...
19:56
INTERSPEECH2021
Unsupervised Representation Learning for Speech Activity Detection in the Fearless Steps Challen...
18:45
INTERSPEECH2021
Combining Hybrid and End-to-end Approaches for the OpenASR20 Challenge - (Oral presentation)
3:18
INTERSPEECH2021
The TNT Team System Descriptions of Cantonese and Mongolian for IARPA OpenASR20 - (Oral presenta...
3:13
INTERSPEECH2021
An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems - (lon...
10:54
INTERSPEECH2021
Systems for Low-Resource Speech Recognition Tasks in Open Automatic Speech Recognition and Formo...
3:35
INTERSPEECH2021
An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems - (3 m...
3:19
INTERSPEECH2021
Channel-wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks - (3 minutes in...
3:10
INTERSPEECH2021
Representation Learning to Classify and Detect Adversarial Attacks against Speaker and Speech Re...
3:22
INTERSPEECH2021
Pairing Weak with Strong: Twin Models for Defending against Adversarial Attack on Speaker Verifi...
12:29
INTERSPEECH2021
Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-...
3:25
INTERSPEECH2021
Voting for the right answer: Adversarial defense for speaker verification - (3 minutes introduct...
3:02
INTERSPEECH2021
Pairing Weak with Strong: Twin Models for Defending against Adversarial Attack on Speaker Verifi...
2:59
INTERSPEECH2021
A Comparative Study on Recent Neural Spoofing Countermeasures for Synthetic Speech Detection - (...
3:09
INTERSPEECH2021
Cross-database replay detection in terminal-dependent speaker verification - (3 minutes introduc...
2:56
INTERSPEECH2021
An Initial Investigation for Detecting Partially Spoofed Audio - (3 minutes introduction)
3:07
INTERSPEECH2021
Keyword Transformer: A Self-Attention Model for Keyword Spotting - (3 minutes introduction)
3:16