"A robot may not injure a human being or, through inaction, allow a human being to come to harm."
Isaac Asimov's First Law of Robotics

A Quick Glance

overview

Figure 1. Physical-world demonstration of our proposed TrojanRobot (vanilla scheme). Based on myCobot 280-Pi manipulator, we showcase the backdoor attacks on VLM-based robotic manipulation.

Paper Overview

Robotic manipulation policies are increasingly empowered by large language models (LLMs) and vision-language models (VLMs), leveraging their understanding and perception capabilities. Recently, the security of robotic manipulation tasks has been extensively studied, with backdoor attacks drawing considerable attention due to their stealth and potential harm. However, existing backdoor efforts are limited to simulators and struggle to poisoning third-party commercial VLM-based implementations in real-world robotic manipulation. To address this, we propose TrojanRobot, embedding a backdoor module into the modular robotic policy via backdoor relationships to manipulate the LLM-to-VLM pathway and compromise the system, with our vanilla design employing a backdoor-finetuned VLM to serve as the module. To enhance attack performance, we further propose a prime scheme by introducing the concept of LVLM-as-a-backdoor, which leverages in-context instruction learning (ICIL) to control large vision-language model (LVLM) behavior via backdoor system prompts. Moreover, we develop three types of prime attacks—permutation, stagnation, and intentiona-—achieving flexible backdoor attack effects. Extensive physical-world and simulator experiments on 18 real-world manipulation tasks and 4 VLMs verify the superiority of proposed TrojanRobot.
overview

Figure 2. The working pipelines of our proposed vanilla TrojanRobot attack scheme and prime TrojanRobot schemes.

Physical Wolrd Results

Put the rubbish in the bin

w/o attack
Prime RBA (I)
Prime RBA (P)
Prime RBA (S)

Move the triangle board to the human

w/o attack
Prime RBA (I)
Prime RBA (P)
Prime RBA (S)

Move the lid to the table

w/o attack
Prime RBA (I)
Prime RBA (P)
Prime RBA (S)