
CMU Researchers Unveil RoboTool: An AI System that Accepts Natural Language Instructions and Outputs Executable Code for Controlling Robots in both Simulated and Real-World Environments

by Pragati Jhunjhunwala

Researchers from Carnegie Mellon University and Google DeepMind have collaborated to develop RoboTool, a system leveraging Large Language Models (LLMs) to imbue robots with the ability to creatively use tools in tasks involving implicit physical constraints and long-term planning. The system comprises four key components: 

Analyzer for interpreting natural language instructions,

Planner for generating strategies,

Calculator for computing numerical parameters, and

Coder for translating plans into executable Python code.

Using GPT-4, RoboTool aims to provide a more flexible, efficient, and user-friendly solution for complex robotics tasks compared to traditional Task and Motion Planning methods.
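To make the pipeline concrete, below is a minimal sketch of how four GPT-4 calls could be chained in the Analyzer → Planner → Calculator → Coder order described above. The prompts, helper names (query_gpt4, robotool_pipeline), and robot API functions (move_to, grasp) are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch of a four-stage LLM pipeline in the spirit of RoboTool's
# Analyzer -> Planner -> Calculator -> Coder chain. Prompts and function names
# are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def query_gpt4(system_prompt: str, user_prompt: str) -> str:
    """Send one prompt to GPT-4 and return the text of its reply."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content


def robotool_pipeline(instruction: str, scene_description: str) -> str:
    # 1. Analyzer: extract key objects, tools, and implicit physical constraints.
    concepts = query_gpt4(
        "You identify the key objects, tools, and physical constraints "
        "relevant to a robot manipulation task.",
        f"Task: {instruction}\nScene: {scene_description}",
    )
    # 2. Planner: turn the analysis into a high-level, step-by-step strategy.
    plan = query_gpt4(
        "You write a numbered high-level plan for a robot, possibly using "
        "objects as tools in unconventional ways.",
        f"Task: {instruction}\nKey concepts: {concepts}",
    )
    # 3. Calculator: fill in numerical parameters such as target poses and offsets.
    parameters = query_gpt4(
        "You compute concrete 3D target positions and offsets for each plan step.",
        f"Scene: {scene_description}\nPlan: {plan}",
    )
    # 4. Coder: translate the parameterized plan into executable Python code
    #    that calls the robot's control API.
    code = query_gpt4(
        "You translate a parameterized plan into Python code that calls "
        "functions such as move_to(position) and grasp(object_name).",
        f"Plan: {plan}\nParameters: {parameters}",
    )
    return code
```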

The study addresses the challenge of creative tool use in robots, analogous to the way animals exhibit intelligence through tool use. It emphasizes the importance of robots not only using tools for their intended purpose but also employing them in creative and unconventional ways to provide flexible solutions. Traditional Task and Motion Planning (TAMP) methods fall short when handling tasks with implicit constraints and are often computationally expensive. Large Language Models (LLMs) have shown promise in encoding knowledge beneficial for robotics tasks.

The research introduces a benchmark for evaluating creative tool-use capabilities, including tool selection, sequential tool use, and manufacturing. The proposed RoboTool is evaluated in both simulated and real-world environments, demonstrating proficiency in handling tasks that would be challenging without creative tool use. The system’s success rates surpass those of baseline methods, showcasing its effectiveness in solving complex, long-horizon planning tasks with implicit constraints.

The evaluation measured three types of errors:

Tool-use error, indicating whether the correct tool is used,

Logical error, covering planning mistakes such as using tools in the wrong order or ignoring the provided constraints, and

Numerical error, covering wrong target positions or incorrect offsets.

In ablations, RoboTool without the Analyzer exhibits a large tool-use error, and RoboTool without the Calculator exhibits a large numerical error compared with the full system, demonstrating each component's role in the pipeline.
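As a simple illustration of how per-trial outcomes could be tallied into a success rate and the three error categories above, consider the sketch below. The trial labels and proportions are hypothetical and only show the bookkeeping, not the paper's actual evaluation protocol or results.

```python
# Hypothetical tally of trial outcomes into a success rate and error breakdown.
from collections import Counter

# Each trial is labeled either "success" or with its dominant failure mode.
trials = [
    "success", "success", "tool-use error",
    "success", "numerical error", "logical error",
    "success", "success", "success", "success",
]

counts = Counter(trials)
n = len(trials)

print(f"success rate: {counts['success'] / n:.0%}")
for error in ("tool-use error", "logical error", "numerical error"):
    print(f"{error}: {counts[error] / n:.0%}")
```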

The study showcases RoboTool’s achievements in various tasks, such as traversing gaps between sofas, reaching objects placed outside a robot’s workspace, and creatively using tools beyond their conventional functions. The system leverages LLMs’ knowledge about object properties and human common sense to identify key concepts and reason about the 3D physical world. In experiments with a robotic arm and a quadrupedal robot, RoboTool demonstrates creative tool-use behaviors, including improvisation, sequential tool use, and tool manufacturing. While achieving success rates comparable to or exceeding baseline methods in simulation, its real-world performance is slightly degraded by perception and execution errors.

In conclusion, RoboTool, powered by LLMs, is a creative robot tool user capable of solving long-horizon planning problems with implicit physical constraints. The system’s ability to identify key concepts, generate creative plans, compute parameters, and produce executable code contributes to its success in handling complex robotics tasks that require creative tool use.

Check out the Paper, Project, and Blog. All credit for this research goes to the researchers of this project.


The post CMU Researchers Unveil RoboTool: An AI System that Accepts Natural Language Instructions and Outputs Executable Code for Controlling Robots in both Simulated and Real-World Environments appeared first on MarkTechPost.

