zetabyte

SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding Apple Machine Learning Research

We introduce SlowFast-LLaVA-1.5 (abbreviated as SF-LLaVA-1.5), a family of video large language models (LLMs) offering a token-efficient solution for long-form video understanding. We incorporate the two-stream SlowFast mechanism into a streamlined training pipeline, and perform joint video-image training on a carefully curated data mixture of…

Checklists Are Better Than Reward Models For Aligning Language Models Apple Machine Learning Research

Language models must be adapted to understand and follow user instructions. Reinforcement learning is widely used to facilitate this, typically using fixed criteria such as “helpfulness” and “harmfulness”. In our work, we instead propose using flexible, instruction-specific criteria as a means of broadening the…

Fine-tune OpenAI GPT-OSS models using Amazon SageMaker HyperPod recipes Durga Sury Artificial Intelligence

This post is the second part of the GPT-OSS series focusing on model customization with Amazon SageMaker AI. In Part 1, we demonstrated fine-tuning GPT-OSS models using open source Hugging Face libraries with SageMaker training jobs, which support distributed multi-GPU and multi-node configurations, so…

Inline code nodes now supported in Amazon Bedrock Flows in public preview Shubhankar Sumar Artificial Intelligence

Today, we are excited to announce the public preview of support for inline code nodes in Amazon Bedrock Flows. With this new capability, you can write Python scripts directly within your workflow, eliminating the need for separate AWS Lambda functions for simple logic.…

Accelerate enterprise AI implementations with Amazon Q Business Oliver Steffmann Artificial Intelligence

As an Amazon Web Services (AWS) enterprise customer, you’re probably exploring ways to use generative AI to enhance your business processes, improve customer experiences, and drive innovation. With a variety of options available, from Amazon Q Business to other AWS services or third-party offerings, choosing the…

Speed up delivery of ML workloads using Code Editor in Amazon SageMaker Unified Studio Paul Hargis Artificial Intelligence

Amazon SageMaker Unified Studio is a single integrated development environment (IDE) that brings together your data tools for analytics and AI. As part of the next generation of Amazon SageMaker, it contains integrated tooling for building data pipelines, sharing datasets, monitoring data governance, running…

What Is Speaker Diarization? A 2025 Technical Guide: Top 9 Speaker Diarization Libraries and APIs in 2025 Michal Sutter Artificial Intelligence Category – MarkTechPost

Speaker diarization is the process of answering “who spoke when” by separating an audio stream into segments…

NVIDIA AI Just Released Streaming Sortformer: A Real-Time Speaker Diarization that Figures Out Who’s Talking in Meetings and Calls Instantly Asif Razzaq Artificial Intelligence Category – MarkTechPost

NVIDIA has released its Streaming Sortformer, a breakthrough in real-time speaker diarization that instantly identifies and labels participants in meetings, calls, and voice-enabled applications, even in noisy, multi-speaker environments. Designed for low-latency, GPU-powered inference, the model is optimized for English and Mandarin, and can track…

How Infosys Topaz leverages Amazon Bedrock to transform technical help desk operations Meenakshi Venkatesan, Karthikeyan Senthilkumar, Aninda Chakraborty Artificial Intelligence

AI-powered apps and AI-powered service delivery are key differentiators in the enterprise space today. A generative AI-based resource can greatly reduce the onboarding time for new employees, enhance enterprise search, assist in drafting content, check for compliance, understand the legal language of data, and…