ISSN 2278-6856
Abstract
This paper brings together two dynamic fields of major
practical importance: Voice Recognition and Music Therapy.
Voice Recognition is an important field under development by
various commercial and research organizations all over the
world. It is implemented in critical applications such as
healthcare and the military, as well as in everyday
applications such as smart phones, microwaves and biometric
security. However, one portion of our speech that is less
quantitative in nature is the emotion it carries. It is this
portion that adds dynamism to our speech. This paper is an
attempt to extract the emotional part of a speech sample and
detect its emotional content. The detection of emotion does
not depend on any adjective-oriented words spoken but rather
on the pitch, tonality and stress placed on particular words.
1. INTRODUCTION
1.1 History of Music Therapy
Music Therapy [6] is the systematic application of music in
the treatment of the physiological and psychosocial
aspects of an illness or disability. It focuses on the
acquisition of non-musical skills and behaviors, as
determined by a board certified music therapist through
systematic assessment and treatment planning. It is an
allied health profession and one of the expressive
therapies, consisting of an interpersonal process in which
a certified music therapist uses music and all of its
facets (physical, emotional, mental, social, aesthetic and
spiritual) to help clients improve or maintain their
health. Music therapy in the United States of America
began in the late 18th century. However, using music as a
healing medium dates back to ancient times. This is
evident in biblical scriptures and historical writings of
ancient civilizations such as Egypt, China, India, Greece
and Rome. Today, the power of music remains the same
but music is used much differently than it was in ancient
times. The profession of music therapy in the United
States began to develop during World War I and World War II,
when music was used in Veterans Administration Hospitals as
an intervention to address traumatic war injuries.
Veterans actively and passively engaged in music
activities that focused on relieving pain perception.
2. PROBLEM STATEMENT
This project aims to detect negative emotion in a given
voice sample and determine which music should be applied
as therapy for the subject. The steps are as follows:
Voice Recording
At the application level, a voice sample is recorded and fed
in as input.
Detection of emotion
The program outputs the emotion it detects.
Final Result
On the basis of the emotion detected, a mapping function
determines which music will be used as therapy for the
subject.
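
The three steps above can be sketched as a small lookup-based pipeline. This is a minimal illustration, not the authors' implementation: the mood-to-raga pairs are read from Table 1 on the assumption that its two columns pair up in listed order, and the names `RAGA_FOR_MOOD`, `select_raga`, `recommend_therapy` and the `detect_emotion` callable are hypothetical.

```python
# Mood-to-raga pairs as read from Table 1. Pairing the two columns in
# their listed order is an assumption.
RAGA_FOR_MOOD = {
    "sad": "Kafi",
    "depression": "Kapi",
    "hypertension": "Bageshri",
    "anger": "Sahana",
    "fear": "Mishra Mand",
}


def select_raga(detected_emotion):
    """Return the raga mapped to a detected negative emotion, or None."""
    return RAGA_FOR_MOOD.get(detected_emotion.strip().lower())


def recommend_therapy(voice_samples, detect_emotion):
    """Run the steps in order: recorded input, emotion detection, mapping.

    `detect_emotion` is a hypothetical callable standing in for the
    emotion detector; it takes the recorded samples and returns a label.
    """
    emotion = detect_emotion(voice_samples)
    return emotion, select_raga(emotion)
```

Keeping the mapping in a plain dictionary means an emotion with no entry simply yields `None`, so the caller can decide what to do when no raga is listed for it.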
10 OBJECTIVES
The objective of this project is stress management through
music therapy; the innovation is to apply the therapy by
automatically detecting stress in a voice sample. The
primary objective and motivation of this project is to
reduce stress problems by utilizing music therapy. The
ragas used in the therapy are expected to produce a
positive effect on the subject and reduce the temporary
negative emotions present.
11 BENEFITS
It is an effective automated system that detects negative
emotion and helps keep the subject from slipping into a
negative psychological state. It removes the need for
human intervention in the process. It aims to reduce
mortality by lowering the chance of deaths related to
psychiatric issues. It can be implemented on any device,
such as smart phones, making it available to everyone.
Figure 2 Flow Diagram
13 PROCESS DESCRIPTION
Stress Management [1] using Music Therapy brings together
two well-known fields: Music Therapy and Voice
Recognition, the latter extended with the ability to
detect emotional content in a given voice/speech sample.
The entire system is divided into two sections. One
section deals with the detection of emotion from the given
voice/speech sample and the other section works with the
mapping
$X_n = \sum_{k=0}^{N-1} \cdots$
14 TEST RESULTS
On the basis of the value generated, we play the
corresponding raga file. A mapping table is provided under
the MUSIC THERAPY [8] section, listing the conditions and
the particular music file that can be used to treat each.
Table 1: Classification of various moods according to the Ragas

Mood            Ragas
Sad             Kafi
Depression      Kapi
Hypertension    Bageshri
Anger           Sahana
Fear            Mishra Mand

Figure 3 Cepstral Representation

The values of the cepstrum are then converted from
frequency domain to time domain using Discrete Cosine
Transformation. Thus we can calculate the MFCCs as:
(4)
Fear
13.6 Mel Frequency Cepstrum Coefficient (MFCC)
Mel Frequency Cepstrum Coefficients are coefficients
that represent audio on the basis of human perception. The
technique was developed by Paul Mermelstein, building on
an idea proposed by Bridle and Brown. It generates a
20-dimensional matrix from the signal, and we use these
values in an algorithm that deduces which emotion the
sample carries. Algorithmically, the concept of the
cepstrum [2] is presented here in the form of a block
diagram. The flow chart in Figure 4 describes how to
obtain the cepstrum from a signal.
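
The cepstrum computation described above can be sketched in plain Python for a single frame. This follows the usual textbook sequence (window, power spectrum, mel filterbank, logarithm, DCT) and is not necessarily the authors' exact procedure. The Hamming window, the 2595/700 mel formula, the 26 filters and the reading of "20-dimensional" as 20 coefficients per frame are all assumptions, and every function name is hypothetical.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert Hz to mel (a common formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame, then return |DFT|^2 for bins 0..N/2."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):  # naive O(N^2) DFT, fine for a sketch
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Build triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    points = [mel_to_hz(top * i / (num_filters + 1))
              for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * hz / sample_rate)) for hz in points]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, num_filters=26, num_coeffs=20):
    """Return num_coeffs cepstral coefficients for one frame of samples."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Log of each filter's energy; the floor avoids log(0) on empty filters.
    log_energy = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
                  for row in bank]
    # DCT-II of the log energies gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energy))
            for c in range(num_coeffs)]
```

For example, `mfcc_frame([math.sin(2 * math.pi * 440 * i / 8000) for i in range(256)], 8000)` returns a list of 20 coefficients for one 256-sample frame of a 440 Hz tone. Stacking the per-frame lists for a whole recording gives the kind of coefficient matrix an emotion classifier could consume.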
Figure 4: MFCC Flow Chart
Figures 5 to 7 (panel labels: Anger, Depression)
Figure 8
15 CONCLUSION
The project is an example of two distinct fields,
Computer Science and paramedicine, merged into a
single field. Though it is at a nascent stage with very
narrow production, it points to a very promising
field. The project provides a solution for Stress
Management, a need that is very prevalent in the
modern world, and if taken up at a higher level it can
also be applied commercially. The technique used to
emulate human hearing and perception, the Mel
Frequency Cepstrum Coefficient, is very prone to
noise. Even after the noise is reduced, the results are
sometimes very limited because we are unable to
detect the emotion properly. However, with proper
scales and further research, it should be possible to
detect the correct emotion with much greater accuracy.
Despite these limitations, we have tried our best to
produce a satisfactory result and to provide a solution
that maps negative emotions to the ragas that can be
used to treat them.