News Feed - Page 167 of 961 - PhD Studio January 19, 2025

SaRA: A Memory-Efficient Fine-Tuning Method for Enhancing Pre-Trained Diffusion Models

by Sana Hassan, Artificial Intelligence Category – MarkTechPost

Recent advancements in diffusion models have significantly improved tasks like image, video, and 3D generation, with pre-trained models like Stable Diffusion being pivotal. However, adapting these models to new tasks efficiently remains a challenge. Existing fine-tuning approaches (Additive, Reparameterized, and Selective-based) have limitations, such as added…

Windows Agent Arena (WAA): A Scalable Open-Sourced Windows AI Agent Platform for Testing and Benchmarking Multi-modal, Desktop AI Agent

by Asif Razzaq, Artificial Intelligence Category – MarkTechPost

Artificial intelligence (AI) research has been advancing toward agents capable of executing complex tasks across digital platforms. These agents, often powered by large language models (LLMs), have the potential to dramatically enhance human productivity by automating tasks within operating systems. AI agents that can…

Agent Workflow Memory (AWM): An AI Method for Improving the Adaptability and Efficiency of Web Navigation Agents

by Nikhil, Artificial Intelligence Category – MarkTechPost

Research on web navigation agents revolves around creating autonomous systems capable of performing tasks like searching, shopping, and retrieving information from the internet. These agents use advanced language models to interpret instructions and navigate digital environments, making decisions to execute tasks that typically require human…

InfraLib: A Comprehensive AI framework for Enabling Reinforcement Learning and Decision Making for Large Scale Infrastructure Management

by Tanya Malhotra, Artificial Intelligence Category – MarkTechPost

Infrastructure systems must be managed effectively to preserve sustainability, protect public safety, and uphold economic stability. Transportation, communication, energy distribution, and other functions are made possible by these networks, which are the cornerstone of any functioning society. However, there is a great deal of…

Small but Mighty: The Enduring Relevance of Small Language Models in the Age of LLMs

by Mohammad Asjad, Artificial Intelligence Category – MarkTechPost

Large Language Models (LLMs) have revolutionized natural language processing in recent years. The pre-train and fine-tune paradigm, exemplified by models like ELMo and BERT, has evolved into prompt-based reasoning used by the GPT family. These approaches have shown exceptional performance across various tasks, including…

XVERSE-MoE-A36B Released by XVERSE Technology: A Revolutionary Multilingual AI Model Setting New Standards in Mixture-of-Experts Architecture and Large-Scale Language Processing

by Asif Razzaq, Artificial Intelligence Category – MarkTechPost

XVERSE Technology has released XVERSE-MoE-A36B, a large multilingual language model based on the Mixture-of-Experts (MoE) architecture. This model stands out due to its remarkable scale, innovative structure, advanced training data approach, and diverse language support. The release represents…

GenMS: An Hierarchical Approach to Generating Crystal Structures from Natural Language Descriptions

by Sana Hassan, Artificial Intelligence Category – MarkTechPost

Generative models have advanced significantly, enabling the creation of diverse data types, including crystal structures. In materials science, these models can combine existing knowledge to propose new crystals, leveraging their ability to generalize from large datasets. However, current models often require detailed input or…

How to Prompt on OpenAI’s o1 Models and What’s Different From GPT-4

by Sana Hassan, Artificial Intelligence Category – MarkTechPost

OpenAI’s o1 models represent a newer generation of AI, designed to be highly specialized, efficient, and capable of handling tasks more dynamically than their predecessors. While these models share similarities with GPT-4, they introduce notable distinctions in architecture, prompting capabilities, and performance. Let’s explore…

OneGen: An AI Framework that Enables a Single LLM to Handle both Retrieval and Generation Simultaneously

by Aswin Ak, Artificial Intelligence Category – MarkTechPost

A major challenge in the current deployment of Large Language Models (LLMs) is their inability to efficiently manage tasks that require both generation and retrieval of information. While LLMs excel at generating coherent and contextually relevant text, they struggle to handle retrieval tasks, which…

Nvidia Open Sources Nemotron-Mini-4B-Instruct: A 4,096 Token Capacity Small Language Model Designed for Roleplaying, Function Calling, and Efficient On-Device Deployment with 32 Attention Heads and 9,216 MLP

by Asif Razzaq, Artificial Intelligence Category – MarkTechPost

Nvidia has unveiled its latest small language model, Nemotron-Mini-4B-Instruct, which marks a new chapter in the company’s long-standing tradition of innovation in artificial intelligence. This model, designed specifically for tasks like roleplaying, retrieval-augmented generation (RAG), and function calls, is a more compact and efficient…
