
zetabyte

Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms (Apple Machine Learning Research)

Building a generalist model for user interface (UI) understanding is challenging due to various foundational issues, such as platform diversity, resolution variation, and data limitations. In this paper, we introduce Ferret-UI 2, a multimodal large language model (MLLM) designed for universal UI understanding across a… Read More »

Controlling Language and Diffusion Models by Transporting Activations (Apple Machine Learning Research)

Large generative models are becoming increasingly capable and more widely deployed to power production applications, but getting these models to produce exactly what’s desired can still be challenging. Fine-grained control over these models’ outputs is important to meet user expectations and to mitigate potential misuses,… Read More »
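The teaser cuts off before the method, so as a rough picture of the family of techniques the title names, here is a minimal sketch of intervening on a transformer's activations with a precomputed steering vector. The paper's actual approach transports activations with optimal-transport maps; everything below (the hook, the layer path, the vector) is an illustrative assumption, not the paper's code.

```python
# Minimal sketch of activation steering in PyTorch, for illustration only.
# The paper transports activations with optimal-transport maps; here we just
# add a precomputed steering vector, a simpler intervention of the same family.
import torch

def make_steering_hook(steering_vector: torch.Tensor, strength: float = 1.0):
    """Return a forward hook that shifts a layer's output activations."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + strength * steering_vector  # shift every token's activation
        return (hidden, *output[1:]) if isinstance(output, tuple) else hidden
    return hook

# Hypothetical usage: `model` is a Hugging Face transformer and `vec` was estimated
# offline (e.g., the mean activation difference between two contrasting prompt sets):
#   handle = model.transformer.h[10].register_forward_hook(make_steering_hook(vec, 0.5))
#   ... generate ...
#   handle.remove()
```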

Adaptive Batch Size for Privately Finding Second-order Stationary Points (Apple Machine Learning Research)

There is a gap between finding a first-order stationary point (FOSP) and a second-order stationary point (SOSP) under differential privacy constraints, and it remains unclear whether privately finding an SOSP is more challenging than finding an FOSP. Specifically, Ganesh et al. (2023) claimed that an… Read More »
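For orientation, the standard (non-private) definitions behind the FOSP/SOSP distinction are, for an objective f with ρ-Lipschitz Hessian and accuracy target α:

```latex
\underbrace{\|\nabla f(x)\| \le \alpha}_{\alpha\text{-FOSP}}
\qquad\qquad
\underbrace{\|\nabla f(x)\| \le \alpha \;\text{ and }\; \lambda_{\min}\!\bigl(\nabla^2 f(x)\bigr) \ge -\sqrt{\rho\,\alpha}}_{\alpha\text{-SOSP}}
```

An SOSP additionally rules out saddle points with strongly negative curvature, which is why it is the harder target; the private setting further requires the algorithm's output distribution to be insensitive to any single training example.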

Do LLMs Know Internally When They Follow Instructions? (Apple Machine Learning Research)

Instruction-following is crucial for building AI agents with large language models (LLMs), as these models must adhere strictly to user-provided constraints and guidelines. However, LLMs often fail to follow even simple and clear instructions. To improve instruction-following behavior and prevent undesirable outputs, a deeper understanding… Read More »
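A common way to ask whether a model "knows internally" about a behavior is to fit a linear probe on its hidden states. The sketch below is a generic illustration of that recipe with placeholder data, not the paper's protocol; all shapes and names are assumptions.

```python
# Hedged sketch: probe one layer's hidden states for an instruction-following signal.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Assume `hidden`: (n_prompts, d) activations from one layer, and
# `followed`: (n_prompts,) binary labels for whether the instruction was followed.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(500, 64))        # placeholder activations
followed = rng.integers(0, 2, size=500)    # placeholder labels

X_tr, X_te, y_tr, y_te = train_test_split(hidden, followed, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# Above-chance held-out accuracy would suggest the layer linearly encodes the signal.
print(f"probe accuracy: {probe.score(X_te, y_te):.2f}")  # ~0.5 on this random data
```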

RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data (Apple Machine Learning Research)

We present RelCon, a novel self-supervised Relative Contrastive learning approach for training a motion foundation model from wearable accelerometry sensors. First, a learnable distance measure is trained to capture motif similarity and domain-specific semantic information such as rotation invariance. Then, the learned distance provides a… Read More »
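To make the two-stage idea concrete, here is a hedged sketch of a "relative" contrastive loss: candidates are ranked by a distance to the anchor, and each candidate is treated as a positive against everything the distance ranks as farther away. The stand-in Euclidean distance below plays the role of RelCon's learned measure; the exact loss and all names are assumptions, not the paper's code.

```python
# Hedged sketch of a relative contrastive loss driven by a (stand-in) distance.
import torch
import torch.nn.functional as F

def relative_contrastive_loss(anchor, candidates, dist, temperature=0.1):
    """anchor: (d,), candidates: (n, d), dist: callable returning (n,) distances."""
    d = dist(anchor, candidates)                 # RelCon learns this measure instead
    order = torch.argsort(d)                     # nearest candidates first
    sims = F.cosine_similarity(anchor.unsqueeze(0), candidates, dim=-1) / temperature
    loss = 0.0
    for i, pos in enumerate(order[:-1]):
        negs = order[i + 1:]                     # everything ranked farther than `pos`
        logits = torch.cat([sims[pos].view(1), sims[negs]])
        loss = loss + F.cross_entropy(logits.unsqueeze(0),
                                      torch.zeros(1, dtype=torch.long))
    return loss / (len(order) - 1)

# Stand-in distance: plain Euclidean, in place of RelCon's learned measure.
euclid = lambda a, c: torch.linalg.norm(c - a, dim=-1)
loss = relative_contrastive_loss(torch.randn(16), torch.randn(8, 16), euclid)
```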

MicroNN: An On-device Disk Resident Updatable Vector Database (Apple Machine Learning Research)

Nearest neighbour search over dense vector collections has important applications in information retrieval, retrieval augmented generation (RAG), and content ranking. Performing efficient search over large vector collections is a well-studied problem with many existing approaches and open-source implementations. However, most state-of-the-art systems are… Read More »
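The core query such a system serves is k-nearest-neighbour search over dense vectors. The brute-force NumPy version below shows the operation itself; an on-device, disk-resident, updatable system like the one described would replace the full in-memory scan with an index that pages vectors from disk. Names are illustrative.

```python
# Minimal sketch of k-nearest-neighbour search over a dense vector collection.
import numpy as np

def knn(query: np.ndarray, vectors: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k vectors closest to `query` (Euclidean distance)."""
    dists = np.linalg.norm(vectors - query, axis=1)  # full scan: O(n * d)
    return np.argsort(dists)[:k]

corpus = np.random.default_rng(0).normal(size=(10_000, 128))
print(knn(corpus[0], corpus, k=3))  # the query itself should rank first
```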

Simple ReFlow: Improved Techniques for Fast Flow Models (Apple Machine Learning Research)

Diffusion and flow-matching models achieve remarkable generative performance, but at the cost of many sampling steps; this slows inference and limits applicability to time-critical tasks. The ReFlow procedure can accelerate sampling by straightening generation trajectories. However, ReFlow is an iterative procedure, typically requiring training on… Read More »
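For a concrete picture of "straightening trajectories", here is a hedged sketch of the loss used in one ReFlow (rectified-flow) round: couple each noise sample x0 with the sample x1 the current model generates from it, then regress the velocity field onto the constant velocity of the straight line between them. The function names and the sampling helper are assumptions, not the paper's code.

```python
# Hedged sketch of one ReFlow round; `flow` is a velocity-field model v(x, t).
import torch

def reflow_loss(flow, x0, x1):
    """Flow-matching loss on straight-line couplings (x0 = noise, x1 = sample)."""
    t = torch.rand(x0.shape[0], 1)     # uniform time in [0, 1]
    xt = (1 - t) * x0 + t * x1         # point on the straight path from x0 to x1
    target = x1 - x0                   # constant velocity of that straight path
    return ((flow(xt, t) - target) ** 2).mean()

# One outer round (schematic): generate couplings with the current model, retrain.
#   x0 = torch.randn(batch, dim)            # fresh noise
#   x1 = sample_with_current_model(x0)      # hypothetical ODE-solve helper
#   loss = reflow_loss(flow, x0, x1); loss.backward(); optimizer.step()
```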

Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling (Apple Machine Learning Research)

Specialist language models (LMs) focus on a specific task or domain, on which they often outperform generalist LMs of the same size. However, the specialist data needed to pretrain these models is only available in limited amounts for most tasks. In this work, we build… Read More »
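A minimal sketch of what clustered importance sampling could look like, with synthetic embeddings standing in for real ones: cluster the abundant generalist corpus, estimate how often the scarce specialist data lands in each cluster, then resample generalist documents by the ratio of the two frequencies. All names and the clustering choice are assumptions, not the paper's pipeline.

```python
# Hedged sketch: upweight generalist docs from clusters the specialist data favors.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
general_emb = rng.normal(size=(5000, 32))   # embeddings of generalist documents
special_emb = rng.normal(size=(200, 32))    # embeddings of scarce specialist documents

km = KMeans(n_clusters=20, n_init=10, random_state=0).fit(general_emb)
gen_freq = np.bincount(km.labels_, minlength=20) / len(general_emb)
spec_freq = np.bincount(km.predict(special_emb), minlength=20) / len(special_emb)

# Importance weight of each generalist doc = specialist / generalist cluster frequency.
weights = spec_freq[km.labels_] / np.maximum(gen_freq[km.labels_], 1e-12)
weights /= weights.sum()
resampled = rng.choice(len(general_emb), size=len(general_emb), p=weights)
```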

The AdEMAMix Optimizer: Better, Faster, Older (Apple Machine Learning Research)

Momentum-based optimizers are central to a wide range of machine learning applications. These typically rely on an Exponential Moving Average (EMA) of gradients, which exponentially decays the contribution of older gradients. This accounts for gradients being local linear approximations that lose their relevance… Read More »
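Here is the EMA the abstract refers to, plus (hedged) the paper's headline idea of also keeping a second, much slower-decaying EMA so that old gradients are not thrown away. The schematic step below simplifies the optimizer; exact update rules and defaults are in the paper.

```python
# The gradient EMA: older gradients decay geometrically, as beta**k after k steps.
def ema_update(m, grad, beta):
    """m_t = beta * m_{t-1} + (1 - beta) * g_t"""
    return beta * m + (1 - beta) * grad

# Schematic AdEMAMix-style step, mixing a fast and a slow EMA (alpha is a scalar;
# v is Adam's second-moment EMA of g**2). Simplified, not the paper's full rule:
#   m_fast = ema_update(m_fast, g, beta1)   # e.g. beta1 ~ 0.9, reacts quickly
#   m_slow = ema_update(m_slow, g, beta3)   # e.g. beta3 ~ 0.9999, remembers far back
#   theta -= lr * (m_fast + alpha * m_slow) / (sqrt(v) + eps)
```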

Implement human-in-the-loop confirmation with Amazon Bedrock Agents (Clement Perrot, AWS Machine Learning Blog)

Agents are revolutionizing how businesses automate complex workflows and decision-making processes. Amazon Bedrock Agents helps you accelerate generative AI application development by orchestrating multi-step tasks. Agents use the reasoning capability of foundation models (FMs) to break down user-requested tasks into multiple steps. In addition,… Read More »
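As a language-agnostic illustration of the pattern the post builds with Bedrock Agents, here is a plain-Python confirmation gate that pauses an agent before a consequential action. This is not the Bedrock API; every name below, including the example issue_refund action, is hypothetical.

```python
# Generic human-in-the-loop gate: ask for approval before running an agent action.
def execute_with_confirmation(action_name: str, action, *args, **kwargs):
    """Ask a human to approve before running `action`; skip on anything but 'y'."""
    answer = input(f"Agent wants to run '{action_name}' with {args}, {kwargs}. Approve? [y/N] ")
    if answer.strip().lower() == "y":
        return action(*args, **kwargs)
    print(f"'{action_name}' was not approved; returning control to the user.")
    return None

# Hypothetical usage: gate a refund operation behind confirmation.
#   execute_with_confirmation("issue_refund", issue_refund, order_id="A123", amount=42.0)
```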