
Voice and Gesture Control on a Cobot: IIT Indore × Addverb Syncro 5

Syncro | Research | Physical AI | 5 min read | 28 Apr 2026


Eight students at IIT Indore built a dual-control framework on the Addverb Syncro 5, combining autonomous, LLM-driven voice commands with real-time glove teleoperation on a single collaborative robot. 

In the first demonstration, a student says aloud, "Pick up the bottle." No buttons. No code. No setup. The Syncro 5 cobot hears the words, interprets them, sweeps the table, finds the bottle, aligns to it, descends, grasps, retracts. The student never touches the robot. 


In the second, that same student slips on a glove. Each finger bend, each squeeze of the thumb, becomes a command. Move the hand forward and the arm moves forward. Tilt it up and the arm moves up. There's no joystick, no keyboard. Just a gesture, and the robot follows. 


Same robot. Same table. Two completely different ways of being in command. 


This is what a team of eight students at IIT Indore built in collaboration with Addverb: a complete dual-control framework on the Addverb Syncro 5 that lets an operator switch between full autonomy and intuitive human-in-the-loop teleoperation on the same hardware, through the same backend, inside the same session. 

Voice Control and Gesture Control: Two Ways to Command the Cobot


The IIT Indore team didn't pick the easy path. They built every layer themselves and offered the operator a choice of how to engage with it. 



In Voice Control mode, you simply speak. A large language model interprets the intent, computer vision finds the object in the frame, and a depth camera confirms it's within reach. The cobot closes the loop on its own and completes the grasp. From the operator's side, the only act of "control" is a sentence in plain English. It's the closest a collaborative robot has come to behaving like an assistant rather than a tool. 


In Gesture Control mode, you wear a custom wearable the team designed and assembled, fitted with sensors that read finger position and thumb pressure in real time. A machine-learning classifier maps your gestures to motion, and the arm moves with you. There's no calibration step before you start; the model learns as you use it and adapts to whoever's wearing the glove. 


The two modes share everything underneath: the same SDK, the same control server, the same safety configuration. The operator chooses the mode that fits the moment. 

Why Dual-Control Cobots Matter for Industry


The most interesting result here isn't either mode taken in isolation. Plenty of labs have built voice-controlled robotic arms. Plenty have built teleoperated ones. What's rare is the dual-control paradigm itself: the seamless switch between full autonomy and human-in-the-loop control, on a single shared infrastructure. 


That switch matters in the real world. Warehouses, manufacturing lines, and research labs don't run on one mode. A pick-and-place in a structured location should be autonomous; a delicate handover, an unfamiliar object, an exception that wasn't anticipated — those need a human in the loop. Most current automation forces a hard architectural choice between the two. This project demonstrates it doesn't have to. 


There's a deeper signal here too. When a cobot can interpret intent, find an object, confirm it's reachable, and read your hand in real time, the operator's job stops looking like programming a machine. It starts looking like collaborating with one. That shift is what the next generation of cobot deployments in intralogistics, in advanced manufacturing, and in research will be built on. 


It's also the kind of work that's only possible when students get their hands on a real industrial-grade collaborative robot. The dual-control paradigm only holds together because the underlying SDK, control loop, gripper drivers, and safety configuration all behave the way Addverb engineered them to. Take any of those layers away and the abstraction collapses. 

Syncro 5 in Academic Robotics Research


This is the kind of work the Addverb Syncro 5 cobot was designed to enable. Across the Addverb Academic Series so far, we've seen the platform support vision-based motion retargeting at IISc Bangalore, bimanual handover research at IIT Gandhinagar, and now LLM-powered voice and gesture control at IIT Indore. The pattern is consistent: students and faculty take a real industrial cobot, treat it as a research surface, and push it into territory the spec sheet doesn't cover. 


Each project is different. The platform underneath is the same. 



Build on Syncro 5 


If you're a faculty member, researcher, or student building on the Addverb Syncro 5 cobot or thinking about it, we want to hear from you. The team's repositories are open and live (links below), and we're actively supporting the next set of academic robotics projects. 


Write to us at automate@addverb.com, and explore open-source libraries, project templates, and academic resources at community.addverb.ai.  

Technical Appendix: Architecture & Implementation Details 

This section is intended for readers interested in the technical depth of the project. All four open-source repositories are linked at the end. 


Autonomous Voice Control Pipeline 


  • Natural-language command interpreted by Llama 3 via the Groq inference API 


  • YOLO object detection running on aligned RGB frames from an Intel RealSense D435i depth camera 


  • Depth-guided closed-loop visual servoing — pixel-space error driven to zero before approach 


  • Depth verification at the object centroid before the grasp is triggered 


  • Grasp, lift, and retract executed via the Cobot Python SDK 
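
The listing below is a minimal Python sketch of how these stages could fit together. The Groq and Ultralytics calls follow those libraries' public APIs; the camera capture and the `cobot.grasp_at(...)` call are placeholders, since the actual Cobot Python SDK interface and the team's closed-loop servoing logic live in the repositories, not here.

```python
# Hypothetical sketch of the voice-to-grasp decision logic.
# Assumes GROQ_API_KEY is set and YOLO weights are available locally;
# camera capture and the Cobot Python SDK grasp call are stubbed out.
import os
import json
import numpy as np
from groq import Groq
from ultralytics import YOLO

def parse_intent(command: str) -> str:
    """Ask Llama 3 (via the Groq API) which object the operator wants."""
    client = Groq(api_key=os.environ["GROQ_API_KEY"])
    resp = client.chat.completions.create(
        model="llama3-70b-8192",  # assumed model id
        messages=[
            {"role": "system",
             "content": 'Extract the target object from the pick command. '
                        'Reply only with JSON: {"object": "<name>"}.'},
            {"role": "user", "content": command},
        ],
    )
    return json.loads(resp.choices[0].message.content)["object"]

def locate_target(color_image: np.ndarray, target: str, model: YOLO):
    """Run YOLO on the aligned RGB frame; return the target's pixel centroid."""
    result = model(color_image)[0]
    for box in result.boxes:
        if result.names[int(box.cls)] == target:
            x1, y1, x2, y2 = box.xyxy[0].tolist()
            return int((x1 + x2) / 2), int((y1 + y2) / 2)
    return None

def try_grasp(color_image, depth_at, cobot, command, model):
    """One cycle: intent -> detection -> depth check -> grasp trigger.
    The real system servoes the pixel-space error to zero before approaching;
    this single-shot version only checks the centroid once."""
    target = parse_intent(command)
    centroid = locate_target(color_image, target, model)
    if centroid is None:
        return False
    depth_m = depth_at(*centroid)          # metres at the object centroid
    if not 0.0 < depth_m < 0.8:            # assumed reachability threshold
        return False
    cobot.grasp_at(centroid, depth_m)      # placeholder for the SDK call
    return True
```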


Glove Teleoperation Pipeline 


  • Custom wearable built on an ESP32 microcontroller with flex sensors (finger bend) and a thumb pressure sensor 


  • Sensor windows streamed to a Python service over MQTT 


  • Statistical features (80th percentile, 20th percentile, mean) extracted per channel 


  • Online Hoeffding Tree classifier maps features to one of seven gesture classes — Neutral, Forward, Backward, Left, Right, Up, Down 


  • Online learning removes the need for offline dataset collection and lets the classifier adapt continuously to each operator 
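
A minimal sketch of the streaming side follows, assuming the glove publishes comma-separated sensor readings to an MQTT topic. The topic name, payload format, window size, and broker address are illustrative stand-ins; the river calls follow that library's online-learning API.

```python
# Hypothetical sketch of the glove-to-gesture classifier.
# Assumes the ESP32 publishes comma-separated flex/pressure readings to
# MQTT topic "glove/raw"; topic and payload format are illustrative only.
from collections import deque

import numpy as np
import paho.mqtt.client as mqtt
from river import tree

GESTURES = ["Neutral", "Forward", "Backward", "Left", "Right", "Up", "Down"]
WINDOW = 25                                   # samples per feature window (assumed)

clf = tree.HoeffdingTreeClassifier()          # online learner: no offline dataset
window = deque(maxlen=WINDOW)

def extract_features(samples: np.ndarray) -> dict:
    """Per-channel 80th percentile, 20th percentile, and mean, as river features."""
    feats = {}
    for ch in range(samples.shape[1]):
        col = samples[:, ch]
        feats[f"p80_{ch}"] = float(np.percentile(col, 80))
        feats[f"p20_{ch}"] = float(np.percentile(col, 20))
        feats[f"mean_{ch}"] = float(col.mean())
    return feats

def on_message(client, userdata, msg):
    window.append([float(v) for v in msg.payload.decode().split(",")])
    if len(window) < WINDOW:
        return
    x = extract_features(np.array(window))
    gesture = clf.predict_one(x) or "Neutral"
    print("predicted:", gesture)
    # During adaptation the operator confirms or corrects the label and the
    # tree keeps learning from each window:
    # clf.learn_one(x, confirmed_label)

client = mqtt.Client()
client.on_message = on_message
client.connect("192.168.1.50")                # assumed broker address
client.subscribe("glove/raw")
client.loop_forever()
```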


System Architecture (Five Layers) 


  1. Human Interface — natural-language input and sensor glove 


  2. Processing & Decision — LLM intent parsing, vision pipeline, gesture classifier 


  3. Communication — custom ASCII-over-TCP protocol on port 5000, MQTT, optional Bluetooth (rfcomm) 


  4. Backend / Real-Time Control — C++17 dual-threaded TCP server inside Docker on the cobot controller, motors driven over EtherCAT (SOEM) 


  5. Hardware & Simulation — physical Syncro 5 plus a complete ROS 2 + Gazebo digital twin 
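
To make the communication layer concrete, here is a minimal Python client for the ASCII-over-TCP link on port 5000. The command grammar shown (`MOVE dx dy dz`) is purely illustrative; the actual protocol strings are defined by the team's C++17 backend server.

```python
# Hypothetical client for the ASCII-over-TCP control link on port 5000.
# The command grammar below ("MOVE dx dy dz\n") is illustrative only;
# the real protocol is defined by the project's backend server.
import socket

class CobotLink:
    def __init__(self, host: str = "192.168.1.10", port: int = 5000):
        self.sock = socket.create_connection((host, port), timeout=2.0)

    def send(self, command: str) -> str:
        """Send one newline-terminated ASCII command and read one reply line."""
        self.sock.sendall((command + "\n").encode("ascii"))
        reply = b""
        while not reply.endswith(b"\n"):
            chunk = self.sock.recv(256)
            if not chunk:
                break
            reply += chunk
        return reply.decode("ascii").strip()

    def jog(self, dx: float, dy: float, dz: float) -> str:
        return self.send(f"MOVE {dx:.4f} {dy:.4f} {dz:.4f}")

if __name__ == "__main__":
    link = CobotLink()
    print(link.jog(0.01, 0.0, 0.0))   # nudge 1 cm along x (assumed units: metres)
```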


Software Stack 


  • LLM / NLU: Llama 3 via Groq API 


  • Computer Vision: Ultralytics YOLO, OpenCV, pyrealsense2 


  • Streaming ML: river.tree.HoeffdingTreeClassifier 


  • Motion Planning & Control: MoveIt 2 (Servo), JointTrajectoryController, GripperActionController 


  • Simulation: ROS 2 Humble, Gazebo Ignition, gz_ros2_control, ros_gz_bridge 


  • Backend Libraries: Addverb system_manager, Orocos KDL, Eigen3, SOEM (EtherCAT) 


  • Languages: Python 3.10+ (client, vision, ML), C++17 (backend server) 
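
As an illustration of how a recognised gesture could become arm motion through MoveIt 2 Servo, here is a small rclpy sketch that publishes Cartesian twists. The topic name and frame are common Servo defaults and the velocity values are arbitrary; the project's actual mapping and configuration may differ.

```python
# Hypothetical rclpy node turning gesture classes into MoveIt 2 Servo twists.
# The topic name and frame below are typical Servo defaults, not the
# project's verified configuration.
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import TwistStamped

# Assumed mapping from the seven gesture classes to Cartesian velocities (m/s).
GESTURE_TO_TWIST = {
    "Neutral":  (0.0, 0.0, 0.0),
    "Forward":  (0.05, 0.0, 0.0),
    "Backward": (-0.05, 0.0, 0.0),
    "Left":     (0.0, 0.05, 0.0),
    "Right":    (0.0, -0.05, 0.0),
    "Up":       (0.0, 0.0, 0.05),
    "Down":     (0.0, 0.0, -0.05),
}

class GestureServoBridge(Node):
    def __init__(self):
        super().__init__("gesture_servo_bridge")
        self.pub = self.create_publisher(
            TwistStamped, "/servo_node/delta_twist_cmds", 10)

    def publish_gesture(self, gesture: str):
        vx, vy, vz = GESTURE_TO_TWIST.get(gesture, (0.0, 0.0, 0.0))
        msg = TwistStamped()
        msg.header.stamp = self.get_clock().now().to_msg()
        msg.header.frame_id = "base_link"      # assumed planning frame
        msg.twist.linear.x = vx
        msg.twist.linear.y = vy
        msg.twist.linear.z = vz
        self.pub.publish(msg)

def main():
    rclpy.init()
    node = GestureServoBridge()
    node.publish_gesture("Forward")            # e.g. output of the glove classifier
    rclpy.spin_once(node, timeout_sec=0.1)
    node.destroy_node()
    rclpy.shutdown()

if __name__ == "__main__":
    main()
```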


Hardware Platform 


  • Addverb Syncro 5 — 6-DOF industrial collaborative robot 


  • Intel RealSense D435i — aligned RGB-D streams at up to 90 fps, mounted at the end-effector 


  • Interchangeable grippers — Feetech, Dynamixel, DH (driven through Addverb's gripper framework) 


  • Custom ESP32 sensor glove — flex + thumb pressure sensors, MQTT over Wi-Fi 
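
For the camera, a short pyrealsense2 sketch of the aligned RGB-D capture the voice pipeline relies on, including deprojection of a detected pixel into a 3D point. The 640×480 at 30 fps configuration is illustrative; the D435i's 90 fps mode requires a lower depth resolution.

```python
# Sketch of aligned RGB-D capture from the D435i with pyrealsense2,
# plus deprojection of a pixel (e.g. a YOLO centroid) into a 3D point.
import numpy as np
import pyrealsense2 as rs

pipe, cfg = rs.pipeline(), rs.config()
cfg.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
cfg.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipe.start(cfg)
align = rs.align(rs.stream.color)             # re-project depth onto the RGB frame

try:
    frames = align.process(pipe.wait_for_frames())
    depth_frame = frames.get_depth_frame()
    color = np.asanyarray(frames.get_color_frame().get_data())

    u, v = 320, 240                           # stand-in for a detected centroid
    depth_m = depth_frame.get_distance(u, v)  # metres at that pixel
    intrin = depth_frame.profile.as_video_stream_profile().get_intrinsics()
    point = rs.rs2_deproject_pixel_to_point(intrin, [u, v], depth_m)
    print(f"3D point in the camera frame: {point}")  # [x, y, z] in metres
finally:
    pipe.stop()
```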


Validation 


  • Full pipeline validated on the physical Addverb Syncro 5 


  • Parallel validation in the ROS 2 + Gazebo digital twin, with detachable-joint grasping and an end-effector camera plugin 


  • Both modes operate through a single shared SDK and control server, allowing seamless switching mid-session 


Roadmap 


  • Voice-to-text front-end so the operator can speak directly to the cobot rather than type 


  • Two-handed glove operation for richer six-axis control 


  • Tighter integration of voice and gesture modes within a single task, e.g. autonomous approach with human-corrected final placement 


  • Extension of the LLM intent layer to multi-step tasks beyond single-object retrieval 


Open-Source Repositories 





 

By Varad Pendse, Yash Bhamare, Satyam Ashtikar, Hrishab Mittal, Keshav N., Dhananjay Dhumal, Sinam J., Atharva Chavan (IIT Indore), in collaboration with Addverb Technologies

Part of the Addverb Academic Series. Earlier articles featured RBCCPS / IISc Bangalore (vision-based motion retargeting) and IIT Gandhinagar (bimanual handover via the USAC-DS framework). Read the full series at community.addverb.ai. 

