Spatial LibriSpeech: An Augmented Dataset for Spatial Audio Learning | Apple Machine Learning Research
We present Spatial LibriSpeech, a spatial audio dataset with over 570 hours of 19-channel audio, first-order ambisonics, and optional distractor noise. Spatial LibriSpeech is designed for machine learning model training, and it includes labels for source position, speaking direction, room acoustics, and geometry.