ChatGPT for Robots

Microsoft has come up with an API that allows ChatGPT to produce code that controls the actions of robots. The goal is to let people without special knowledge of physics or robotics use ChatGPT’s stored knowledge about the world to create programs for robots.

The API can, for example, be used to write code to control a drone. In the video example, the drone is used to visually inspect (with a camera) the photovoltaic array of a solar energy installation. The human user can, with the assistance of the API, tell ChatGPT the desired actions in what the video describes as “casual conversation.”

Use cases

It’s easy to imagine use cases for this kind of capability. The demos provided by Microsoft involve placing colored blocks, but picking and placing are important skills in manufacturing and in warehouses.

Microsoft included the example of cooking an omelet. While we have long looked forward to being able to ask a robot to make us a cup of coffee, we figure a robot capable of the tasks involved in making an omelet could assemble things and check their labels, pulling out any items that were incorrectly labeled.

Changes on an assembly line, such as a switch to new packaging, could be much easier if we could just tell ChatGPT to write code for the machinery.

Human actions

Obviously, you have to choose actions the robot in question is physically able to perform. Telling your industrial robotic arm to fly around and check a solar energy installation will not be successful.

Once the human being has worked out the right prompts and activated ChatGPT’s relevant knowledge of the world, the resulting code can be loaded into the robot’s control system for testing. The prompts need to be formed in a particular way and must refer to the API, but the whole process can be done in plain English.
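To make the idea of prompts that "must use the API" more concrete, here is a minimal sketch of what that could look like. It is not Microsoft's code: the function names (take_off, fly_to, take_photo, land), the prompt wording, and the list of allowed built-ins are all our own assumptions for illustration. The sketch lists the actions a drone can perform, builds an English prompt around them, and checks whether a piece of generated code calls anything outside that list.

```python
import ast

# Hypothetical high-level actions a drone might expose.
# These names are illustrative assumptions, not a real robot API.
ROBOT_API = {
    "take_off": "take_off(): lift off and hover",
    "fly_to": "fly_to(x, y, z): move to a position given in meters",
    "take_photo": "take_photo(): capture an image with the onboard camera",
    "land": "land(): descend and stop the motors",
}

# A few harmless built-ins the generated code is allowed to use (also an assumption).
ALLOWED_BUILTINS = {"range", "len", "print"}


def build_prompt(task: str) -> str:
    """Describe the allowed functions and the task in plain English."""
    lines = ["You may only call these functions:"]
    lines += [f"- {description}" for description in ROBOT_API.values()]
    lines.append(f"Task: {task}")
    lines.append("Reply with Python code only.")
    return "\n".join(lines)


def unknown_calls(code: str) -> set[str]:
    """Return the names of directly called functions that are not allowed."""
    called = {
        node.func.id
        for node in ast.walk(ast.parse(code))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }
    return called - set(ROBOT_API) - ALLOWED_BUILTINS
```

Calling build_prompt("inspect the solar panels") produces the text a person would send to ChatGPT, and unknown_calls can be run on the reply before anything goes near a robot. A reply that asks a drone to pick_up a block, for example, would be flagged, which is the software version of not telling a robotic arm to fly.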

A simulator can also be used to test the code, which is clearly a better choice for drone control, for example.

Having seen what the robot does with the instructions as they were written in the first pass, the human can correct and improve them. Repeated testing and additional iterations allow the operator to perfect the code before fully implementing it.
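The test-and-revise loop can be pictured with a toy "simulator" like the one below. This is only a sketch under our own assumptions: the action names are hypothetical, and the stand-in functions simply record what the generated code asks for instead of moving anything. Note that Python's exec is not a security sandbox, so a real setup would need proper isolation.

```python
def simulate(code: str) -> list[tuple]:
    """Run generated code against stub functions and return the recorded actions."""
    log = []

    def record(name):
        # Build a stand-in for one robot action that only logs the call.
        def stub(*args):
            log.append((name, *args))
        return stub

    # Hypothetical action names, chosen for illustration only.
    env = {name: record(name) for name in ("take_off", "fly_to", "take_photo", "land")}
    # Expose only range() to the generated code; this limits, but does not secure, it.
    env["__builtins__"] = {"range": range}
    exec(code, env)
    return log


# Example: a first-pass program such as ChatGPT might return.
generated = (
    "take_off()\n"
    "for i in range(2):\n"
    "    fly_to(i, 0, 5)\n"
    "    take_photo()\n"
    "land()\n"
)
actions = simulate(generated)
```

The operator can read through the recorded actions, spot a problem such as a missing land() at the end, adjust the prompt, and run the new code again before trying it on real hardware.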

We’re still waiting to be able to say, “Alexa, make me a cup of coffee,” but that might be on the horizon.

