
ELEC2201: Signals & Linear Systems

SLS-07: Extracting Frequencies of Musical Tones

Pre-Lab and Warm-Up: You should read at least the Pre-Lab and Warm-up sections of this lab assignment
and go over all exercises in the Pre-Lab section before going to your assigned lab session.
Verification: The Warm-up section of each lab must be completed during your assigned Lab time and
the steps marked Instructor Verification must also be signed off during the lab time. One of the laboratory
instructors must verify the appropriate steps by signing on the Instructor Verification line. When you have
completed a step that requires verification, simply demonstrate the step to the TA or instructor. Turn in the
completed verification sheet to your TA when you leave the lab.
Lab Report: It is only necessary to turn in a report on Sections 4 and 5 with graphs and explanations. You
are asked to label the axes of your plots and include a title for every plot. In order to keep track of plots,
include your plots inline within your report. If you are unsure about what is expected, ask the TA who will
grade your report.

1 Introduction
This lab is built around a single project that involves the implementation of a system for automatically
writing a musical score by analyzing the frequency content of a recording (a sampled signal). A primary
component of such a system is the spectrogram which produces a time-frequency representation of the
recorded waveform. However, to make a working system, several other processing components are needed
after the spectrogram to extract the important information related to the notes. The design of these additional
blocks will lead naturally to a deeper understanding of what the spectrogram actually represents.

2 Overview
In Chapter 13 we introduced the spectrogram as an important tool for time-frequency analysis. For music
signals the spectrogram tends to produce an image with only a few peaks. Finding these peaks and
identifying their frequencies and durations is the main issue in this lab. A MATLAB GUI for showing the
spectrogram along with musical notation is available for experimentation (Music GUI on the CD-ROM).

In order to make the project manageable, we will progress through several different signal types while
testing. These include:

1. Sine waves at a specific frequency.

2. Sine waves that make up a C-major scale.

3. Sinusoids that create the tune for Twinkle, Twinkle Little Star.

4. A piano rendition of Twinkle, Twinkle Little Star.

5. Other recorded songs are available for processing: Jesu, Joy of Man’s Desiring, Minuet in G, Beethoven’s
Fifth, and Für Elise.

McClellan, Schafer, and Yoder, Signal Processing First, ISBN 0-13-065562-7.
Prentice Hall, Upper Saddle River, NJ 07458. © 2003 Pearson Education, Inc.

3 Warm-up: System Components
In the warm-up we will investigate several M-files needed to build the complete processing chain. The
instructor verification sheet is included at the end of this lab.

3.1 Spectrogram Computation


In MATLAB there is already a function for calculating the spectrogram, but it is not always available since it
is part of the Signal Processing Toolbox. A similar function (called spectgr, file spectgr.m on the CD-ROM)
is included as part of the SP First Toolbox, and we have preserved the same list of arguments. The calling
format for either function, shown here for specgram, is the following:

[B,T,F] = specgram( xx, Nfft, fs, window, Noverlap )

where xx is the input signal, Nfft is the FFT length, fs is the sampling frequency, window is a column
vector containing the coefficients of the window, and Noverlap is the number of points in the overlapped
part of consecutive sections.
The outputs are the spectrogram matrix B, a vector T containing the time locations of the windowed
segments (see footnote 1), and a vector F which contains the list of scaled frequencies corresponding to the
spectrogram analysis. The vectors T and F are useful for labeling plots. Both are scaled by the sampling
frequency so the units are seconds for T, and hertz for F. The spectrogram B contains complex values and its
size is such that it has a column length (number of rows) equal to length(F) and a row length equal to
length(T). In the SP First implementation, the calling program must provide all of the arguments. The MATLAB
function, on the other hand, allows the caller to omit arguments, but that just adds complexity in the programming.
In this section, we will present the steps in a spectrogram calculation, so that you could write your own
function. The preferred viewpoint for calculation is that of a sliding-window FFT. In this implementation,
we take a segment of the signal of length L, multiply the segment by a window, and then compute a
zero-padded N-point FFT (N = Nfft) to form one column of the spectrogram matrix. Then the starting point
of the data segment is moved over by an amount L − Noverlap and the process is repeated. Eventually, we
run out of data and the spectrogram is complete.
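As a rough illustration of the sliding-window bookkeeping described above, the following sketch lists the first and last sample of every segment for a toy signal. The values of Nx, L, and Noverlap are made up for the example, and the names hop, nstart_all, and nend_all are assumptions that do not appear in the SP First code.

```matlab
% Toy example of sliding-window bookkeeping (all values are illustrative).
Nx       = 20;                 % number of samples in the signal
L        = 8;                  % window (segment) length
Noverlap = 4;                  % samples shared by consecutive segments
hop      = L - Noverlap;       % how far the segment start moves each step

% A segment starting at nstart uses samples nstart : nstart+L-1,
% so the last admissible start is Nx-L+1.
nstart_all = 1 : hop : (Nx - L + 1);
nend_all   = nstart_all + L - 1;

disp([nstart_all; nend_all]);  % each column is one segment [start; end]
```

Each column of the displayed matrix corresponds to one column of the spectrogram, so counting the columns is one way to reason about how many segments fit in the data.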
For a MATLAB program, we need to write a while loop that tests whether any signal remains. The
following example shows all the code needed for the inner loop:
B = zeros( Nfft/2 + 1, num_segs );   %-- Pre-allocate the matrix
L = length(window);                  %-- assuming a user-generated window
iseg = 0;
while(    )          %<==== FILL IN THE TEST CONDITION
    nstart = 1 + iseg*(L-Noverlap);  %-- first sample of this segment
    xsegw = window .* xx( nstart:nstart+L-1 );   %-- xx is a column
    XX = fft( xsegw, Nfft );         %-- zero-padded Nfft-point FFT
    iseg = iseg + 1;
    B(:,iseg) = XX(1:Nfft/2+1);
end

Explain how each of the steps in the spectrogram is calculated. Explain how to calculate the number
of segments num_segs ahead of time. Also, explain the purpose of the last line in the while loop. Finally,
determine a test that can be used to terminate the while loop.
Instructor Verification (separate page)

Footnote 1: There are several conventions for defining the time: (1) start of the segment, (2) middle of the
segment, or (3) end of the segment. The middle choice probably makes the most sense, but it really doesn’t
matter in this project because only relative times will be significant.

3.2 Generating the Window


The call to specgram requires a window of length L. One possibility is the rectangular window consisting
of all ones, but a better window is the Hann window. The rectangular window corresponds to a running-sum
filter. The definition of the Hann window is

$$ w[n] = \tfrac{1}{2} - \tfrac{1}{2}\cos\!\left(\frac{2\pi n}{L+1}\right), \qquad n = 1, 2, \ldots, L \tag{1} $$

There are some variations on this definition, but the one given here omits end points that would be zero.
Write a function that will generate the Hann window, making sure that it returns a column vector. Then you
can use this when calling your specgram function. Make a plot of the Hann window for L = 64.
Instructor Verification (separate page)
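
One way the window of equation (1) might be generated is sketched below. It is written as a short script rather than a function, and the variable names n and w are choices made for this example only.

```matlab
% Sketch of the Hann window of equation (1); script form and the
% names n and w are assumptions made for this example.
L = 64;                                  % window length
n = (1:L)';                              % n = 1, 2, ..., L as a column
w = 0.5 - 0.5*cos( 2*pi*n/(L+1) );       % equation (1)
```

Calling plot(n, w) afterwards should show a smooth, symmetric bump whose end values are small but nonzero, because the zero-valued end points at n = 0 and n = L+1 are left out.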

3.3 Display the Spectrogram


The display of the specgram output can be done with the MATLAB function imagesc or with the SP
First function show_img (show_img.m on the CD-ROM). On a computer monitor the spectrogram display can
use color so that low-level details can be seen, but the conventional printout is a gray-scale image with
black indicating large values.
In addition, if the gray level is proportional to the magnitude of B, small details may be lost, so it might be
advantageous to convert to a logarithmic scale covering 30 or 40 dB (called “log mag”). Finally, the default
orientation in MATLAB is a matrix orientation with the origin in the upper left-hand corner. To change this
orientation so that the origin is in the lower left-hand corner, use axis xy. The following code fragment
summarizes the display:

if (LOG)   %-- assume LOG is a true/false variable
    B = 20*log10( abs(B) );   %-- ignore log(0) warnings
    dBmax = 30;
    B = B - max(B(:)) + dBmax;   %-- shift so the largest value equals dBmax
    B = max( B, 0 );             %-- clip: dB range is now 0 <= B <= dBmax
else
    B = abs(B);
end
imagesc( T, F, B ); colormap(1-gray(256)); axis xy;

3.4 Finding Peaks


Although it might be easy to spot peaks in the spectrogram visually, it is much harder to write a computer
program to extract the same peaks reliably. The first step in the process, however, is to just extract all the
peaks. Then we can follow this up with an editing program that removes extraneous peaks. The peak-picking
function only needs to do one-dimensional picking along the frequency axis because the music spectrogram
has a definite horizontal bias: the tones last for a long duration along the time (horizontal) axis. If we scan
each column of the B(k, ℓ) matrix for peaks, we can merge peaks from neighboring columns to see if a note
is present and also determine how long it lasts.


A one-dimensional peak-picker is available in the function pkpick.m (on the CD-ROM), whose help
comments are given below:

function [peaks, locs] = pkpick( xx, thresh, number )
%PKPICKER pick out the peaks in a vector
% Usage: [peaks,locs] = pkpick( xx, thresh, number )
%    peaks  : peak values
%    locs   : location of peaks (index within a column)
%    xx     : input data (if complex, operate on mag)
%    thresh : reject peaks below this level
%    number : max number of peaks to return
%

Test that pkpick.m works as you expect by generating a cosine wave and finding its peaks.
Instructor Verification (separate page)
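
To know what answer to expect from such a test, it can help to work out the peak locations independently. The sketch below does not call pkpick.m; it uses a hypothetical minimal local-maximum test (the names mid, isPk, and the threshold value are assumptions) on a cosine whose period is known.

```matlab
% Generate a cosine with a known period and locate its interior peaks.
% This stand-in does NOT call pkpick.m; it is a hypothetical minimal
% local-maximum test used only to predict where the peaks should be.
n  = 0:63;
xx = cos( 2*pi*n/16 );                   % period of 16 samples
thresh = 0.5;                            % ignore anything below this level

mid  = xx(2:end-1);                      % interior samples only
isPk = (mid > xx(1:end-2)) & (mid >= xx(3:end)) & (mid > thresh);
locs  = find(isPk) + 1;                  % +1 because mid starts at index 2
peaks = xx(locs);                        % peak values at those locations
```

The interior peaks fall one period apart. Whether pkpick.m also reports the first sample, which is a maximum at the edge of the vector, is something to check when running it.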

An unexpected problem with peak picking is the quantization of the frequency axis. The peak-picking
function will give an output that is on the grid of possible frequencies. If we need to estimate the peak
location between these grid points, interpolation is needed. The function pkinterp.m is available for that
purpose.
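
A minimal sketch of one common interpolation rule, a parabola fitted through the peak sample and its two neighbors, is given below. This is an assumption about how off-grid peaks can be estimated; it is not necessarily the method implemented in pkinterp.m, and all variable names are illustrative.

```matlab
% Three-point parabolic interpolation around a sampled peak.
% Assumption: this is one common interpolation rule, not necessarily
% the one implemented in pkinterp.m.
k_true = 5.3;                            % true (off-grid) peak location
k  = 1:10;
yy = -(k - k_true).^2;                   % samples of a parabola

[~, kpk] = max(yy);                      % on-grid peak index
a = yy(kpk-1);  b = yy(kpk);  c = yy(kpk+1);
delta = 0.5*(a - c) / (a - 2*b + c);     % offset in bins, between -0.5 and 0.5
k_hat = kpk + delta;                     % interpolated peak location
```

For a spectrogram column, a fractional row index k_hat found this way would correspond to a frequency of (k_hat − 1)·fs/Nfft hertz, since row 1 is zero frequency.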

4 Lab Exercises: Design of the Music Writing System


The complete system for writing the music is quite complicated, so we follow the engineering practice of
breaking the system down into smaller, more manageable, components.

4.1 Block Diagram for the System

x[n] --> [Spectrogram] --IMAGE--> [Peak Picking] --LIST--> [Editing & Merging] --KEY #--> [Write Notes] --> SHEET MUSIC

Figure 1: Block diagram of major components in music writing system.

Figure 1 shows the major sub-systems needed to extract enough information from a musical recording to
write the sheet music for that input. Each of these should be implemented as a separate MATLAB function.

4.2 Write a Spectrogram Function


Use the code fragment above as the basis for writing your own specgram function. Test your function by
having it compute the spectrogram of a sine wave. The display should be one horizontal line at the frequency
of the sinusoid.
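
The sine-wave test can be sketched in a self-contained way as follows. This is only one possible arrangement: it uses a for loop over a precomputed segment count instead of the while loop of Section 3.1, and every parameter value and variable name (f0, hop, krow, f_est, and so on) is an assumption chosen so that the tone lands exactly on an FFT bin.

```matlab
% Self-contained check: sliding-window FFT of a pure sinusoid.
% All parameter values and variable names are illustrative assumptions.
fs   = 8000;  f0 = 1000;                 % sampling rate and tone frequency (Hz)
xx   = cos( 2*pi*f0*(0:fs-1)'/fs );      % one second of signal, column vector
L    = 256;  Noverlap = 128;  Nfft = 256;
win  = 0.5 - 0.5*cos( 2*pi*(1:L)'/(L+1) );   % Hann window, equation (1)

hop      = L - Noverlap;                 % shift between segment starts
num_segs = floor( (length(xx) - Noverlap) / hop );   % whole segments that fit
B = zeros( Nfft/2 + 1, num_segs );
for iseg = 1:num_segs
    nstart = 1 + (iseg-1)*hop;
    XX = fft( win .* xx(nstart:nstart+L-1), Nfft );
    B(:,iseg) = XX(1:Nfft/2+1);          % keep non-negative frequencies only
end

F = (0:Nfft/2)' * fs/Nfft;               % frequency of each row in Hz
[~, krow] = max( abs(B), [], 1 );        % strongest row in every column
f_est = F(krow);                         % strongest frequency per column
```

If the spectrogram is correct, f_est has the same value in every column, which is the numerical counterpart of seeing one horizontal line in the image.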

4.3 Parameters of the Spectrogram
Window Length: Derive a resolution requirement for separating notes, so that you can specify a window
length. The resolution must be converted from continuous-time frequency (in Hz) to discrete-time
frequency:

$$ \hat{\omega} = 2\pi \frac{f}{f_s} $$

FFT Length: Use a power-of-two FFT for efficiency. A long FFT will give more frequencies and reduce
the gridding problem for peak interpolation.

Overlap: Determine the time spacing needed to find the duration of notes. Be careful when making the
time spacing very small because the amount of computation will increase dramatically.
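
As a rough worked example of the window-length reasoning, the sketch below converts the spacing between two adjacent piano keys into discrete-time frequency and turns it into a candidate window length. It rests on assumptions that are not stated in the lab: equal-tempered tuning with key 49 at 440 Hz, and the rule of thumb that a Hann window separates two tones when their spacing is at least half its main-lobe width, taken here as about 4π/L radians. The resulting numbers are a starting point for experimentation, not a required answer.

```matlab
% Rough window-length estimate from the spacing of adjacent piano keys.
% Assumptions: equal-tempered tuning with key 49 = A440, and a rule of
% thumb that a Hann window resolves tones spaced by at least 4*pi/L rad.
fs   = 11025;                            % sampling rate of the piano files (Hz)
keys = [40 41];                          % middle C and the key just above it
f    = 440 * 2.^((keys - 49)/12);        % key frequencies in Hz
df   = diff(f);                          % spacing that must be resolved (Hz)

dw   = 2*pi*df/fs;                       % same spacing in rad/sample
Lmin = ceil( 4*pi/dw );                  % smallest L meeting the rule of thumb
Nfft = 2^nextpow2(Lmin);                 % next power of two at or above Lmin
```

Lower notes are spaced more closely in hertz, so repeating the calculation with smaller key numbers gives a longer window, which in turn coarsens the time resolution.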

4.4 Peak Picking & Editing


The peak picking operation is relatively straightforward to implement: only the number of peaks and a
threshold need to be specified. If the threshold is too low, the editing phase will have to deal with many
extraneous peaks. The peak picker should generate a list consisting of triplets (frequency, time, amplitude).
The function pkpick.m provided in Section 3.4 will only find peaks within one vector, so it must be modified
to find peaks as a function of both time and frequency.
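
A minimal sketch of the column-by-column scan is shown below. The small synthetic B, T, and F stand in for real spectrogram output, and a simple local-maximum test stands in for pkpick.m; both are assumptions made so that the example is self-contained.

```matlab
% Build (frequency, time, amplitude) triplets by scanning every column.
% The synthetic B, T, F and the simple local-maximum test are assumptions
% made for illustration; they stand in for real data and for pkpick.m.
F = (0:8)' * 100;                        % row frequencies in Hz
T = [0.0 0.1 0.2];                       % column times in seconds
B = zeros( 9, 3 );
B(4,1) = 5;  B(4,2) = 6;  B(7,3) = 4;    % one peak per column
thresh = 1;                              % reject peaks below this level

triplets = zeros( 0, 3 );                % rows of [frequency, time, amplitude]
for icol = 1:size(B,2)
    col  = abs( B(:,icol) );
    mid  = col(2:end-1);                 % interior rows of this column
    locs = find( mid > col(1:end-2) & mid >= col(3:end) & mid > thresh ) + 1;
    for k = locs'
        triplets(end+1,:) = [ F(k), T(icol), col(k) ];
    end
end
```

In this toy case the first two columns yield a peak at the same frequency, which is the kind of run that the editing stage would later merge into a single note with a duration.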
Editing is the crucial step and also the hardest to specify. Unlike the spectrogram, which is a well-defined
calculation with only a couple of parameters to adjust, the editing process can take many different forms.
The editing system must take a list of frequency-time-amplitude triplets generated by the peak-picker and
eliminate many of them based on rules that are derived from common sense. The following issues should
be considered:

1. Frequency:

(a) How close is the frequency to one of the allowable frequencies of the piano keys?
(b) The harmonics must be eliminated, but there are cases where an octave is played, so the second
harmonic might be allowed. In addition, when the song has both bass and treble sections, note
frequencies can be 4 or 8 times each other.

2. Time:

(a) Check the duration; is it a half note, quarter note, etc? This requires that peaks be merged and
tracked along the time axis.
(b) Timing of the notes. We expect the notes to start at regular times because the music has a rate,
such as 2/4 time, or 4/4 time.
(c) In fact, an interesting sidelight project would be to extract the “beat” of the music. This might
help to establish a time base for the song, and help set the parameters of the expected durations.
(d) There is a minimum duration unless we have a piece with lots of special effects, trills, grace
notes, etc.

3. Amplitude:

(a) Keep the strongest ones, but how many?
(b) If we also look in the time domain, the “attack” could be found. This is the sharp rise in amplitude
at the beginning of a note.
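
As one possible starting point for the frequency rule in item 1(a), the sketch below snaps measured peak frequencies to the nearest piano key and discards peaks that fall too far from any key. The key-numbering convention (key 49 at 440 Hz, equal-tempered tuning), the example frequencies, and the 0.3-semitone tolerance are all assumptions chosen for illustration.

```matlab
% Snap measured peak frequencies to the nearest piano key and reject
% peaks that are too far off. Assumptions: equal-tempered tuning with
% key 49 = A440, and an arbitrary tolerance of 0.3 semitone.
f_meas = [261.0 440.0 452.0 523.5];      % example peak frequencies (Hz)

key_exact = 49 + 12*log2( f_meas/440 );  % fractional key number
key_num   = round( key_exact );          % nearest piano key
off_by    = abs( key_exact - key_num );  % distance in semitones
keep      = off_by < 0.3;                % true for peaks close to a key

key_kept  = key_num(keep);               % key numbers that survive the test
```

The third frequency lies almost halfway between two keys and is rejected. The harmonic, timing, and amplitude rules would be applied to the surviving key numbers in later passes.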

4.5 Writing the Musical Score
The output of the editing process should be a list of key numbers and durations that define the music. We
have provided a function wrinotes.m (on the CD-ROM) that will create a MATLAB image that has the notes
in the musical score. Consult the help for wrinotes.m to learn the data structure that is needed for its input.

5 Lab Exercises: Testing the Music Extraction Program


This project is relatively complicated and testing will not be easy. However, several test files are available,
progressing from very easy cases to difficult ones. You should run your program on the following four test cases:

1. Sine waves at a specific frequency.

2. Sine waves that make up a C-major scale.

3. Sinusoids that create the tune for Twinkle, Twinkle Little Star.

4. A piano rendition of Twinkle, Twinkle Little Star.

In each case, you know what the true answer should be, so you can assess the capabilities of your music
writer.
All the piano songs are sampled at 11.025 kHz. Alternate songs are: Jesu, Joy of Man’s Desiring, Minuet
in G, Beethoven’s Fifth, and Für Elise. Each of these will be quite difficult and challenging, unless your
editing logic is very sophisticated.
Remember that the objective of the lab is to make a working system containing the major components
listed in Fig. 1. Even with a few simple tests, you should learn quite a bit about the spectrogram, its strengths
and its shortcomings.

SLS-07
INSTRUCTOR VERIFICATION SHEET
For each verification, be prepared to explain your answer and respond to other related questions
that the lab TA’s or professors might ask. Turn this page in at the end of your lab period.

Name: Date:

Part 3.1 Complete and explain spectrogram code:

Verified: Date/Time:

Part 3.2 Write a function to return a Hanning window:

Verified: Date/Time:

Part 3.4 Test the peak picking function:

Verified: Date/Time:
