
Disco Robot

Enabling dancing and playing melody using voice commands

Olesya Derkach, Doga Demirel, Byungkyu Kang, Valerii Dychok

Department of Computer Science University of Central Arkansas Conway, AR United States of America

Abstract—Robot navigation with audio or speech sensors is an emerging area of research in various fields of robotic exploration and artificial intelligence. Robots have great importance and a promising future in our daily life. Albeit to a lesser extent, speech-controlled robots are capable of serving multiple purposes across a wide variety of application domains. The work presented in our paper describes a robot which can recognize voice commands, attempt to understand each command by inferring its intent, and respond as programmed. In order to further the research and show the robot's capabilities, more complicated commands such as Dance and Square were implemented. Also, several melodies were coded, and one of them was incorporated with the robot's movements as the experiment progresses.



I. INTRODUCTION

During the Fall 2013 semester, Multimedia class students started to work on implementing a voice recognition function on an Arduino robot. The Arduino platform consists of a single-board microcontroller that aims to make using electronics in multidisciplinary projects more accessible. As a speech input device to recognize voice commands, the VRbot Speech Recognition Module was used. The previous project focused solely on enabling speech recognition in the robot using wireless communication. This semester's goals included implementing more complicated movements like dancing and enabling the robot to play melodies. One of the main concerns was to merge the movements with the sound. All equipment, methods, and results will be further discussed in the following sections.


Figure 1: Wireless connection between the robot and the voice recognition module

Byungkyu Kang, Sachiko Oshio, Erica Sheff, Clifford Tawiah, Donnie Turner

Department of Computer Science University of Central Arkansas Conway, AR United States of America

A. Hardware



For this project the DFRobotShop Rover Kit is used. The following hardware is used to enable the robot to dance when receiving the command:

1. DFRobotShop Rover Kit. This kit includes the robot, which has an XBee module installed on it to receive the commands from the XBee installed on the EasyVR.

2. VRbot Speech Recognition Module. This module processes speech and identifies voice commands.

3. Two XBee Radio Frequency Communication Modules. These modules create a wireless link between the speech recognition engine and the robot. Figure 1 shows how the Arduino robot connects with the EasyVR through wireless communication using two XBee modules: one attached to the EasyVR and the other attached to the DFRobot.

4. Arduino Uno. This board controls the speech recognition module.

5. IO Expansion Shield. This shield connects an XBee module to the Arduino Uno.

6. Male Headers. These headers are required by the XBee shield.

7. Barrel Jack to 9V Battery Adaptor. This adaptor allows the Arduino Uno to utilize a 9V battery as a power source.

8. Speaker with hook-up wires. The speaker is used to play the melody. For this project it was soldered onto the IO Expansion Shield, so there is no need to connect it manually.

B. Procedure Flow

Like every running application, our robot has a procedure flow. This flow starts with receiving the voice command and ends with executing the move. As seen in Figure 2, the voice communication between the DFRobot and the voice recognition module has six steps.


Below is the execution sequence, which applies to any command:

1. The VRbot Speech Recognition Module receives voice commands.

2. It stores commands and recognizes a received command by a particular index.

3. The Arduino Uno stores a set of actions for each command.

4. The XBee module attached to the voice recognition module sends a character representing a movement to the XBee module attached to the DFRobot.

5. The XBee module attached to the DFRobot receives the character representing a movement.

6. The DFRobot executes a movement based on the character received.

The first four steps are executed on the voice recognition module, and the last steps are executed on the DFRobot.

Figure 2: Arduino procedure flow
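The six steps above can be sketched as a hardware-free simulation. The helper names (charsForCommand, actionForChar) are ours, and only the Right command's index and characters are taken from this paper; other commands would follow the same pattern:

```cpp
#include <cassert>
#include <string>

// Sender side (steps 1-4): map a recognized command index to the
// character(s) sent over the XBee link. Index 1 and characters 'd'
// and 'f' come from the Right example in the text.
std::string charsForCommand(int index) {
    switch (index) {
        case 1: return "df"; // Right: turn ('d'), then stop ('f')
        default: return "";  // unrecognized index: send nothing
    }
}

// Rover side (steps 5-6): each received character selects one movement.
std::string actionForChar(char c) {
    switch (c) {
        case 'd': return "turn right";
        case 'f': return "stop";
        default:  return "ignore";
    }
}
```

In the real system the sender's characters travel over the XBee serial link rather than a function return value, but the index-to-character and character-to-movement mappings are the same idea.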

C. Software

Voice-controlled communication requires two IDEs to be installed:

EasyVR Commander - required to work with the EasyVR Voice Recognition module

Arduino IDE - used to code the robot's moves and melodies

Table 1 below demonstrates the full cycle of execution of the command "Right":






1. Say "Right" to the microphone.
2. Recognize the command under index 1.
3. Find the corresponding set of movements.
4. Send characters 'd' and 'f' to the DFRobot.
5. Receive the characters.
6. Move right, then stop.

Table 1: the full cycle of execution of the command "Right"

A detailed explanation of the above table follows, with code snippets and program flow.

1) The Right command is first in the command list in EasyVR Commander thus holding the index of 1. When the user says “Right” into the microphone the code that is uploaded onto the Arduino Uno board begins to execute.

2) Here is the pseudocode for the Right command:

Switch statement start
    Case one: // Right
        Make 'd' available on port
        Block any other characters from being sent
        Make 'f' available on port
        Exit the switch

As we can see from the snippet above, the logic for the Right command is not complicated: it is a switch statement with several case statements, where each case number corresponds to a command index in the command list in EasyVR Commander.

3) When the appropriate case is found by the program, the statements inside it begin to execute. The logic behind them is the following:

Serial.print() makes the specified character ('d' in our case) available on a certain port.

Delay() ensures that no other characters will be sent during the time specified in the parentheses and that this time will be spent on command execution.

4) When the first Serial.print() is encountered, the character specified in the parentheses is sent over the wireless link for interpretation and processing.

5) When the rover receives the character via wireless communication, the code on the rover performs Serial.read(), which simply reads that character and begins its own switch-case execution.

Switch statement start
    Case one: // Right
        Read 'd' from the port
        Call 'right' function
        Exit the switch

The code above demonstrates the processing for a specified character that has been read from a port. The right() function has two parameters which correspond to the speeds of the left and right tracks on the rover, with possible values 1-255 (from lowest to highest).
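A minimal sketch of the parameter handling in such a right() function, assuming only what the text states (two track speeds, valid range 1-255); the struct and clamp helper are ours, and the real function would drive the motor pins instead of returning a value:

```cpp
#include <algorithm>
#include <cassert>

// Hardware-free model of a right() helper with two track speeds.
struct TrackSpeeds { int left; int right; };

// Keep every speed inside the valid 1-255 range described in the text.
int clampSpeed(int s) { return std::min(255, std::max(1, s)); }

TrackSpeeds right(int leftSpeed, int rightSpeed) {
    return TrackSpeeds{ clampSpeed(leftSpeed), clampSpeed(rightSpeed) };
}
```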

6) At this point the rover executes the command: it physically turns right and then stops (after the character 'd' is sent, the character 'f', which corresponds to STOP, is sent as well).

After the command is fully processed and executed and the robot has finished moving, the break statement ensures that the program exits the switch and does not continue to fall through the remaining cases.
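The role of break can be illustrated with a small standalone dispatcher; the character-to-movement pairs follow the Right example above, and the function name is ours:

```cpp
#include <cassert>
#include <string>

// Each case ends with break: without it, execution would fall
// through into the next case and trigger an unintended movement.
std::string dispatch(char c) {
    std::string performed;
    switch (c) {
        case 'd':
            performed += "right";
            break; // removing this break would also append "stop"
        case 'f':
            performed += "stop";
            break;
        default:
            performed += "none";
            break;
    }
    return performed;
}
```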

D. Assembly

The rover was assembled according to the directions included with the kit. The IO expansion shield and an XBee module were mounted together. Then, headers were soldered onto the XBee shield, and the shield was placed on the Arduino. The VRbot speech recognition module's wires were connected, along with the microphone, and the speaker was hooked up to the Arduino board. There are two modes for EasyVR on Arduino: bridge mode and adapter mode. With the adapter mode, we can use the Arduino board as a serial adapter by pressing the reset button on the controller, but the connections need to be changed when we want to control the module from the microcontroller. The mode used in this project is the bridge mode. Automatic bridge mode is supported with Arduino boards. Bridge mode allows the user to control the module with a software serial library, and with the same pin configuration the robot can connect to the module using EasyVR Commander from the PC. Figure 3 shows how the EasyVR and Arduino can be connected using four wires through ETX, ERX, VCC and GND. Bridge mode is the preferred connection mode, since it makes the communication between the Arduino microcontroller and the PC simpler.


Figure 3: Automatic bridge mode schema

E. Programming

The voice recognition module is coded using the EasyVR Commander software. First, the commands are recorded. Second, recorded speech is linked to a VR module to command the rover. Then the robot is programmed in the Arduino IDE environment. Code is defined for each of the commands trained previously.

Coding the melodies is done in the Arduino IDE and requires some knowledge of how to play notes and of timing issues. Before coding melodies, a header file with a list of note definitions is needed, for example:

#define NOTE_B0 31

#define NOTE_C1 33


Figure 4: Voice/Sound Wave
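With note pitches defined, each note also needs a duration. A minimal sketch of the timing math, assuming the convention used by Arduino's standard tone tutorials (a duration code of 4 means a quarter note and a whole note lasts 1000 ms); the function names are ours:

```cpp
#include <cassert>

// Duration code: 4 = quarter note, 8 = eighth note, and so on,
// with a whole note taken to last 1000 ms (an assumed convention).
int noteDurationMs(int durationCode) {
    return 1000 / durationCode;
}

// A pause about 30% longer than the note keeps consecutive notes
// from running together (integer math avoids floating point).
int pauseAfterNoteMs(int durationCode) {
    return noteDurationMs(durationCode) * 13 / 10;
}
```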

Next, the code iterates over the notes of the melody, calculating each note's duration. The note() function produces a sound given the corresponding pin number, note, and duration. To stop playing a melody, the noTone() function is used. Finally, the code is uploaded and the melody is reproduced.

As an example, we describe below the Dance command along with additional details on how it works. Whenever the user says "Dance" into the microphone, the code uploaded on the Arduino Uno goes through a series of case statements and looks for the number corresponding to the Dance command:

case 13: // DANCE






When it finds case 13 it recognizes it as Dance, goes through every statement inside, and executes them. (Number 13 corresponds to the number assigned to the Dance command during the recording session.) Inside the case we have defined a series of Serial.print() and delay() commands. Whenever a Serial.print() statement is executed, the character is transmitted over the wireless link for execution of the specific predefined command: right, left, forward, backward, or stop.

Serial.print() allows a character to become available on a specific port, and Serial.read(), called on the receiving side, allows that available character to be read and processed.

The delay() command ensures that the appropriate time is spent executing the command and that no other command will be sent during this time.

The rover has its own case statements to interpret the received character and execute the appropriate command. There are two additional variables defined in the code, leftspeed and rightspeed; they hold a value of 255, which is the highest speed for each track.

The code snippet above shows a small part of the Dance command, which is a combination of right, left, forward, and backward moves that together form what we consider a close mimicry of dance movements.
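Such a routine can be modeled as a scripted list of (character, delay) steps. This is a hypothetical reconstruction: only 'd' (right) and 'f' (stop) are taken from this paper, while the other characters and all the timings are placeholder values of ours, not the project's actual code:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical Dance script: each step is the character sent via
// Serial.print() plus the delay() that follows it.
std::vector<std::pair<char, int>> danceSequence() {
    return {
        {'d', 500},  // right for 500 ms (character from the text)
        {'l', 500},  // left (placeholder character)
        {'w', 700},  // forward (placeholder character)
        {'s', 700},  // backward (placeholder character)
        {'f',   0},  // stop (character from the text)
    };
}

// Total time the rover spends dancing, useful for sanity checks.
int totalDanceMs() {
    int total = 0;
    for (const auto& step : danceSequence()) total += step.second;
    return total;
}
```

Expressing the routine as data rather than inline Serial.print()/delay() pairs makes it easy to tweak the choreography without touching the dispatch logic.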

F. Social Implications

Voice control over devices is useful in various situations:

It can be used by disabled people to control tools or machines which they have to use.

Voice control can make people's lives easier, since it does not require any extra work like pushing buttons to make machines move.

Therefore, voice-controlled interaction is a promising field to do research in.

Many elderly people have difficulties with mobility that limit their ability to get around. They are forced to rely on canes or walkers that tend to be clumsy and difficult to store. The results from our research could assist the elderly or physically challenged in navigating anywhere they wish using voice commands, helping to resolve their mobility problems.


The group produced the following results:

A website was created in order to keep track of all the progress.

Four main melodies and one test melody were coded, uploaded, and tested.

Initial Command Set:



RIGHT made the rover turn right.

LEFT made the rover turn left.

FORWARD made the rover move forward.

BACKWARD made the rover move backward.

Additional Commands:

1. STOP made the rover cease all movement.

3. SQUARE made the rover perform a set of movements that mimicked the shape of a square.

4. DANCE made the rover play a melody first and then perform a predefined set of movements that mimicked a dance performance.

Voice recognition was improved by using an alternative energy source for the module.

Finally, melody code was merged with the main code in order to perform two actions after receiving command “dance”: play music and then dance.

A. Comments

While working with the EasyVR voice recognition module, the following observations were made:

All the parts have to be purchased in the same time period; otherwise they may not work with each other due to constant modifications.

Old hardware may not work with new software or updated drivers.

Multithreading, a very important feature for future work, can be achieved using an operating system, but this is not an option with Arduino.

The robot will not detect the commands unless the environment is completely quiet.

Figure 5: Wireless Communication

III. CONCLUSIONS



Our group had three primary goals: implementing a music feature, implementing a dancing feature, and improving voice recognition. At this point we have implemented the music and dancing features and improved voice recognition. During our work we faced some challenges with the voice recognition module while trying to implement the Dance command. All issues were resolved by connecting the Arduino correctly via bridge mode and using an alternative energy source for the voice recognition module. One interesting thing we noticed during our experiments is that when recording voice commands, the speaker's voice had to be exact and quite steady when pronouncing the command name; otherwise it might not be recorded at all. Also, when testing the recorded commands, we noticed that sometimes the robot would not execute a command because of the voice tone and intonation, so we had to try several times before speaking with the right tone of voice. Overall, this was very good practical experience with both the hardware and software sides: we had the chance to physically connect, attach, and solder the components to make the system work, and to program everything to make the robot move, dance, and play melodies.



One of the future extensions of this project will be a merger with another autonomous robot control project, which seeks to focus on using robot navigation and exploration to aid the elderly or physically challenged.

The next step is to implement sound responses, enabling the robot to respond to commands and thus allowing the people controlling the robot to communicate with it.

Also, due to the limited number of voice commands that can be uploaded to the module (only 15), future work involves seeking an alternative module with a higher capacity for commands.

Another improvement is enhancing the speech recognition system so that it can recognize more than one person's voice.

Enabling voice detection even when there is background noise would be a good improvement for the future.

Finally, we would include wireless communication systems to transmit voice commands wirelessly through the internet to the rover.



ACKNOWLEDGMENT

The authors would like to acknowledge the previous semester group's work on this project, as well as Dr. Sun, for providing guidance and advising us during our work and research. The authors also thank NASA for funding this project.


