ALGORITHMS &
COMPUTING I
Asst. Prof. Axel POSCHMANN
AY 2012/13 Semester 1
23/10/12
L10-2
on Wednesday 24.10.2012
Next week will be Lab 8 (no marks, but part of GL2)
The week after that is Graded Lab Session 2 (GL2)
The following week is the presentation of the final project
Group  Day        Lab 8       GL2         Project
LA1    Monday     29.10.2012  05.11.2012  12.11.2012
LA2    Tuesday    30.10.2012  06.11.2012  20.11.2012
LA3    Wednesday  31.10.2012  07.11.2012  14.11.2012
LA4    Friday     02.11.2012  09.11.2012  16.11.2012
LA5    Friday     02.11.2012  09.11.2012  16.11.2012
LA6    Friday     02.11.2012  09.11.2012  16.11.2012
Final Project
Information is available on edventure
Groups of 5 have been (randomly) assembled
Meet your team, split the work, discuss your approach, schedule meetings, etc.
Remember: everybody is responsible for one part
More explanations during next week's lecture
Deadline (sharp): 11.11.2012 23:59h SGT
Quiz
Use a pen
Closed Book
Move your bags, materials etc far away
5 minutes
1% possible
vector called x
Motivating Example
Statistics can be used to characterize properties of a data
set
Consider a set of exam grades
x = {33, 75, 77, 82, 83, 85, 85, 91, 100}
What is a normal, expected or average exam grade?
There are several ways to interpret this:
Mean: sum the grades, then divide by the number of grades n (79)
Mode: Most often found grade (85)
Median: The value in the middle of the list (83)
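The three interpretations above can be checked directly in MATLAB. This is a minimal sketch using the built-in functions mean, mode and median; the variable names avgGrade, modeGrade and midGrade are chosen here for illustration only.

```matlab
% Exam grades from the motivating example
x = [33, 75, 77, 82, 83, 85, 85, 91, 100];

avgGrade  = mean(x);    % sum of the grades divided by n: 711/9 = 79
modeGrade = mode(x);    % 85 occurs twice, more often than any other grade
midGrade  = median(x);  % middle (5th of 9) element of the sorted list: 83

fprintf('mean = %g, mode = %g, median = %g\n', avgGrade, modeGrade, midGrade);
```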
same dimension
Example
>> x=[3 5 8 2 11];
>> y=[2 6 4 5 10];
>> min(x,y)
ans =
2 5 4 2 10
Second argument is
for second vector/matrix
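Because the second argument of min (and max) is taken as a second vector or matrix, a dimension has to be passed as the third argument, with an empty placeholder in between. The sketch below shows both uses; the matrix mat is the one used in the lecture's mean and sort examples, and the variable names are illustrative.

```matlab
x = [3 5 8 2 11];
y = [2 6 4 5 10];
elementwiseMin = min(x, y);      % compares x(k) with y(k): [2 5 4 2 10]

mat = [8 9 3; 10 2 3; 6 10 9];
colMin = min(mat);               % smallest value in each column: [6 2 3]
rowMin = min(mat, [], 2);        % [] is a placeholder, 2 selects rows: [3; 2; 6]
```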
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
>> x=[33, 75, 77, 82, 83, 85, 85, 91, 100];
>> mean(x)
ans =
79
>> mat = [8 9 3; 10 2 3; 6 10 9];
>> mean(mat)
ans =
8 7 5
(columnwise: one mean per column)
To find the mean of each row, the second argument is 2
>> mean(mat,2)
ans =
6.6667
5.0000
8.3333
(rowwise: one mean per row)
Outliers
Sometimes a value that is much larger or smaller than the
rest of the data, called an outlier, can throw off the mean
Example
>>ybig=[9 10 10 9 8 100 7 3 10 9 8 5 10];
>>mean(ybig)
ans =
15.2308
Typically, an outlier represents an error of some kind (data
collection etc.)
In this example, the maximum and minimum could be
removed using logical indexing (how?)
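One possible answer to the question above, as a sketch: build a logical mask that is true for every element that is neither the maximum nor the minimum, and index with it. Note that this removes all occurrences of the extreme values, not just one; the names keep, trimmed and trimmedMean are illustrative.

```matlab
ybig = [9 10 10 9 8 100 7 3 10 9 8 5 10];

% true where the element is neither the largest (100) nor the smallest (3)
keep = (ybig ~= max(ybig)) & (ybig ~= min(ybig));

trimmed = ybig(keep);          % logical indexing keeps the 11 remaining values
trimmedMean = mean(trimmed);   % 95/11, roughly 8.64 instead of 15.23
```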
mean)
var = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}, where \bar{x} is the mean
uses n-1)
elements is 6
>> x = [8 7 5 4 6];
>> var(x)
ans =
2.5000
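The result of var can be reproduced by hand. The sketch below assumes the n-1 normalization that var uses by default; the variable names are illustrative.

```matlab
x = [8 7 5 4 6];
n = numel(x);

deviations = x - mean(x);                  % mean is 6, so [2 1 -1 -2 0]
manualVar = sum(deviations.^2) / (n - 1);  % (4+1+1+4+0)/4 = 2.5
manualStd = sqrt(manualVar);               % standard deviation, about 1.5811
```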
sd = \sqrt{var}
MATLAB has a built-in function std
>> x = [8 7 5 4 6];
>> std(x)
ans =
1.5811
The Mode
The mode of a data set is the value that appears most
frequently
MATLAB has a built-in function mode
>> x=[9 10 10 9 8 7 3 10 9 8 5 10];
>> mode(x)
ans =
10
If there is more than one value with the same (highest)
frequency, the smaller value is the mode
>> x=[3 8 5 3 4 1 8];
>> mode(x)
ans =
3
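Since mode reports only the smallest of several equally frequent values, it can be useful to list all of them. The following is one possible sketch that counts occurrences with unique and arrayfun; the names vals, counts and allModes are illustrative.

```matlab
x = [3 8 5 3 4 1 8];

vals = unique(x);                            % distinct values: [1 3 4 5 8]
counts = arrayfun(@(v) sum(x == v), vals);   % occurrences:     [1 2 1 1 2]
allModes = vals(counts == max(counts));      % every most frequent value: [3 8]
```

The first element of allModes is the value that mode(x) returns.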
The Median
The median is defined only for a data set that has been
operations
Examples are:
union
intersect
setdiff
setxor
unique
(ascending order)
There are two "is" functions: issorted and ismember
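A short sketch of the two "is" functions, using the vectors v1 and v2 from the lecture's set examples; the result variable names are illustrative.

```matlab
v1 = [6 5 4 3 2];
v2 = [1 3 5 7];

inV2 = ismember(v1, v2);    % which elements of v1 occur in v2: [0 1 0 1 0]
v1Sorted = issorted(v1);    % false: v1 is in descending order
v2Sorted = issorted(v2);    % true: v2 is in ascending order
```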
Union
The union function returns a vector that contains all of
>>v1=[6 5 4 3 2];
>>v2=[1 3 5 7];
>>union(v1,v2)
ans =
1 2 3 4 5 6 7
Intersect
The intersect function returns a vector that contains all
>>v1=[6 5 4 3 2];
>>v2=[1 3 5 7];
>>intersect(v1,v2)
ans =
3 5
Intersect (ctd)
The intersect function also returns an index vector into
index1 =
4 2
index2 =
2 3
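The index outputs shown above are obtained by asking intersect for three return values. A minimal sketch with the same v1 and v2; the names common, fromV1 and fromV2 are illustrative. Depending on the MATLAB version, the index vectors may be returned as rows or columns.

```matlab
v1 = [6 5 4 3 2];
v2 = [1 3 5 7];

[common, index1, index2] = intersect(v1, v2);   % common is [3 5]

fromV1 = v1(index1);   % indexing v1 with index1 gives the common values
fromV2 = v2(index2);   % the same holds for v2 with index2
```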
Setdiff
The setdiff function returns a vector consisting of all of the
values that are contained in the first input argument vector but
not the second
The order of the input arguments is important!
Example
>>v1=[6 5 4 3 2];
>>v2=[1 3 5 7];
>>setdiff(v1,v2)
ans =
2 4 6
>>setdiff(v2,v1)
ans =
1 7
Setxor
The setxor function returns a vector consisting of all of the
values from the two input vectors that are not in the intersection
of these two vectors
It is the union of the two vectors obtained using setdiff
Example
>>v1=[6 5 4 3 2];
>>v2=[1 3 5 7];
>>setxor(v1,v2)
ans =
1 2 4 6 7
>>union(setdiff(v1,v2),setdiff(v2,v1))
ans =
1 2 4 6 7
Unique
The unique function returns all of the unique values from
a set argument
Example
>>v3=[1 2 3 4 5 3 4 5 6];
>>unique(v3)
ans =
1 2 3 4 5 6
Sorting
Sorting is the process of putting a list in order
Either descending (highest to lowest)
Or ascending (lowest to highest)
Example
By default, the sort function sorts in ascending order
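A small sketch of both sort orders. The vector x holds illustrative values that are not taken from the slides; 'descend' is the option that sort accepts for descending order.

```matlab
x = [83 33 100 75 85];             % illustrative values

ascending = sort(x);               % default order: [33 75 83 85 100]
descending = sort(x, 'descend');   % reversed:      [100 85 83 75 33]
```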
Sorting (ctd)
For matrices, the sort function sorts each column
To sort each row instead, dimension 2 is specified
Example (with mat = [8 9 3; 10 2 3; 6 10 9])
>>sort(mat)
ans =
6 2 3
8 9 3
10 10 9
(columnwise)
>>sort(mat,2)
ans =
3 8 9
2 3 10
6 9 10
(rowwise)
Sorting rows
The sortrows function keeps each row together as a block (group)
and orders the rows by the values in the first column
Example
>>sortrows(mat)
ans =
6 10 9
8 9 3
10 2 3
>>words=char('Hello','Hi','Goodbye','Ciao')
words =
Hello
Hi
Goodbye
Ciao
>>sortrows(words)
ans =
Ciao
Goodbye
Hello
Hi
Index Vectors
Using index vectors is an alternative to sorting a vector
Indexing leaves the vector in its original order; the index vector
just points to the elements in the desired order
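One way to obtain such an index vector is the second output of sort. This is a sketch with illustrative values; the names sortedX, idx and viaIndex are not from the slides.

```matlab
x = [83 33 100 75 85];       % illustrative values, left untouched

[sortedX, idx] = sort(x);    % idx = [2 4 1 5 3] points into x in ascending order

viaIndex = x(idx);           % reading x through idx gives the sorted values
```

Here x itself keeps its original order, and x(idx) can be used wherever the sorted view is needed.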
Lessons learned
Common Pitfalls:
Forgetting that max and min return the index of only the
first occurrence of the maximum or minimum value
Not realizing that a data set has outliers that can
drastically alter the results obtained from the statistical
functions
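The first pitfall can be illustrated with the vector from the mode example, where the maximum 10 occurs four times. A sketch, with illustrative variable names, that uses find to locate every occurrence:

```matlab
x = [9 10 10 9 8 7 3 10 9 8 5 10];

[largest, firstIdx] = max(x);   % largest is 10, firstIdx is 2 (first occurrence only)

allIdx = find(x == max(x));     % every position of the maximum: [2 3 8 12]
```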
Lessons learned
Programming Style Guidelines:
Remove the largest and the smallest numbers from a
large data set before performing statistical analysis to
handle the problem of outliers
Use sortrows to sort strings stored in a matrix
alphabetically
G = \sqrt[n]{x_1 \cdot x_2 \cdot x_3 \cdots x_n}
>> x = [33, 75, 77, 82, 83, 85, 91, 100];
>> mean(x)
ans =
78.2500
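The geometric mean G (the n-th root of the product of the n values) can be sketched in two ways: directly from the definition, or via logarithms, which avoids very large intermediate products. The vector below is illustrative and not taken from the slides.

```matlab
x = [1 3 9];                    % illustrative values
n = numel(x);

G = nthroot(prod(x), n);        % cube root of 1*3*9 = 27, which is 3
Glog = exp(mean(log(x)));       % same value up to rounding, for positive x
```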