Optimizing Byte-level Representation for End-to-End ASR (Apple Machine Learning Research)

This paper was accepted at the IEEE Spoken Language Technology Workshop (SLT) 2024.
In this paper, we propose an algorithm to optimize a byte-level representation for end-to-end (E2E) automatic speech recognition (ASR). Byte-level representation is often used by large-scale multilingual ASR systems when the character set of the supported languages is large. The compactness and universality of byte-level representation allow ASR models to use a smaller output vocabulary and therefore provide more flexibility. UTF-8 is the most commonly used byte-level representation and has been successfully applied…
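
As a rough illustration of what a UTF-8 byte-level representation looks like (this is not the optimization algorithm the paper proposes), the Python sketch below maps text in several scripts to byte tokens. The helper names `utf8_tokens` and `decode_tokens` and the sample strings are hypothetical, chosen only for this example. It shows that every token ID stays below 256 regardless of language, while non-Latin text costs more tokens per character.

```python
# Illustrative sketch only: plain UTF-8 byte tokenization, not the paper's method.
# The function names and sample strings are assumptions made for this example.

def utf8_tokens(text: str) -> list[int]:
    """Map text to a sequence of byte values, each in the range 0..255."""
    return list(text.encode("utf-8"))


def decode_tokens(tokens: list[int]) -> str:
    """Map a sequence of byte values back to text (raises on invalid UTF-8)."""
    return bytes(tokens).decode("utf-8")


samples = {
    "english": "speech",
    "german": "Straße",
    "chinese": "语音识别",
}

for name, text in samples.items():
    tokens = utf8_tokens(text)
    # The output vocabulary is bounded by 256 for every language,
    # but the token sequence grows for characters outside ASCII.
    print(f"{name}: {len(text)} characters -> {len(tokens)} byte tokens")
    assert decode_tokens(tokens) == text
```

The trade-off visible here is the one the abstract points to: the output vocabulary is small and shared across languages, at the cost of longer output sequences for scripts whose characters need several bytes each.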
