Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling Apple Machine Learning Research
Large language models are trained on massive scrapes of the web, which are often unstructured, noisy, and poorly phrased. Current scaling laws show that learning from such data requires an abundance of both compute and data, which grows with the size of the model being…
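To make the scaling claim concrete, here is an illustrative sketch (not from the article) of the compute-optimal heuristic from Hoffmann et al. ("Chinchilla"), which pairs a model of N parameters with roughly 20×N training tokens and estimates training cost as about 6×N×D FLOPs. The function name and the 20 tokens-per-parameter constant are assumptions for illustration, not part of this work.

```python
def chinchilla_budget(n_params: float, tokens_per_param: float = 20.0):
    """Return (training tokens, approximate training FLOPs) for a model size.

    Uses the rough Chinchilla rule of ~20 tokens per parameter and the
    standard 6*N*D estimate for training FLOPs (illustrative, not exact).
    """
    tokens = tokens_per_param * n_params
    flops = 6.0 * n_params * tokens  # 6*N*D FLOP estimate
    return tokens, flops


# Data requirements grow linearly with model size under this heuristic:
for n in (1e9, 7e9, 70e9):  # 1B, 7B, 70B parameter models
    tokens, flops = chinchilla_budget(n)
    print(f"{n / 1e9:>4.0f}B params -> {tokens / 1e9:>5.0f}B tokens, ~{flops:.1e} FLOPs")
```

Under this rule a 70B-parameter model already calls for about 1.4 trillion training tokens, which is why methods that improve data efficiency, such as rephrasing noisy web text, are attractive.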