M.Vnageswararao
18BEV7021
L49+L50
1. Simple Operations
a) Enter the data {2,5,3,7,1,9,6}directly and store it in a variable x.
Command> x<-c(2,5,3,7,1,9,6)
Output> x
[1] 2 5 3 7 1 9 6
d) Print the data as {20, 19, …, 2, 1} without entering the data again
Command> rev(x)   # assumes x was reassigned to 1:20 in an earlier part, not the vector from (a)
Output> [1] 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
d) First create a list (2, 1, 3, 4). Then append another list (5, 7, 12, 6, -8) to the end of it.
Check whether the number of elements in the augmented list is 9.
> b<-c(2,1,3,4)
> c<-c(5,7,12,6,-8)
Command> r <- append(b, c, after=length(b))
> r
Output> [1] 2 1 3 4 5 7 12 6 -8
Command> length(r)
Output> [1] 9
4.(a) Print all numbers starting with 3 and ending with 7 with an increment of 0.5. Store
these numbers in x.
Command> x<-seq(3,7,0.5)
>x
Output> [1] 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0
(b) Print all even numbers between 2 and 14 (both inclusive)
Command> seq(2,14,by=2)
Output> [1] 2 4 6 8 10 12 14
(c) Type 2*x and see what you get. Each element of x is multiplied by 2.
> x<-seq(3,7,0.5)
Command> 2*x
Output> [1] 6 7 8 9 10 11 12 13 14
Functions in R
1)
Solution:
a)command> x=c(1:10)
> x
Output> [1] 1 2 3 4 5 6 7 8 9 10
b) command>sum(x)
Output> [1] 55
c) command> mean(x)
Output> [1] 5.5
command> median(x)
Output> [1] 5.5
d) command> p=sum(x^2)
> p
Output> [1] 385
e) command> u=sum((abs(x-mean(x))/length(x)))
> u
Output> [1] 2.5
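The quantity computed in part (e) is the mean absolute deviation about the mean. As a minimal sketch, it can be wrapped in a small function so it can be reused on other vectors. The name `mean_abs_dev` is a hypothetical helper, not a built-in R function.

```r
# Hypothetical helper: mean absolute deviation about the mean.
# Equivalent to sum(abs(v - mean(v))) / length(v) used in part (e).
mean_abs_dev <- function(v) {
  mean(abs(v - mean(v)))
}

x <- 1:10
mean_abs_dev(x)   # 2.5, the same value as part (e)
```

Note that base R's `mad()` computes a different statistic (the scaled median absolute deviation), so it is not a drop-in replacement here.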
b) How to find the number of rows and number of columns by a single command?
command> nrow(a)
Output> [1] 20
command> ncol(a)
Output> [1] 5
command> b=dim(a)   # dim() returns the number of rows and columns in a single call
> b
Output> [1] 20 5
command> names(a)
[5] "CentralHeating"
d) If the file is very large, naturally we cannot simply type `a', because it will cover the entire
screen and we won't be able to understand anything. So how to see the top or bottom few lines in
this file?
command> head(a)   # first six rows
command> tail(a)   # last six rows
e) If the number of columns is too large, again we may face the same problem. So how to see the
first 5 rows and first 3 columns?
command>k[c(1:5),c(1:3)]
Output>
1. 52.00 1225 3
2. 54.75 1230 3
3. 57.50 1200 3
4. 57.50 1000 2
5. 59.75 1420 4
f) How to get 1st, 3rd, 6th, and 10th row and 2nd, 4th, and 5th column?
command> k[c(1,3,6,10),c(2,4,5)]
Output>
3 1200 4.2 No
10 1550 5.7 No
3) Calculate simple statistical measures using the values in the data file.
a) Find means, medians, standard deviations of Price, Floor Area, Rooms, and Age.
# means
mean(price)
[1] 71.5875
mean(floorarea)
[1] 1607.75
mean(Rooms)
[1] 5
mean(Age)
[1] 4.875
# medians
median(price)
[1] 69.875
median(floorarea)
[1] 1575
median(Rooms)
[1] 5.5
median(Age)
[1] 5.4
# standard deviations
sd(price)
[1] 12.21094
sd(floorarea)
[1] 331.7675
sd(Rooms)
[1] 1.65434
sd(Age)
[1] 2.366182
b) How many houses have central heating and how many don't have?
# table() counts how many rows fall in each category of CentralHeating
table(a$CentralHeating)
c) Plot Price vs. Floor, Price vs. Age, and Price vs. rooms, in separate graphs.
Plot Price vs. Floorarea
d) Mean absolute deviation of each variable about its mean:
sum((abs(price-mean(price))/length(price)))
[1] 10.42125
sum((abs(floorarea-mean(floorarea))/length(floorarea)))
[1] 272.025
sum((abs(Rooms-mean(Rooms))/length(Rooms)))
[1] 1.4
sum((abs(Age-mean(Age))/length(Age)))
[1] 1.8675
e) Draw box plots of Price, FloorArea, and Age.
boxplot(price,floorarea,Age)
f) Draw all the graphs in (c), (d), and (e) in the same graph paper.
par(mfrow=c(1,3)); plot(price,floorarea); plot(price,Age); plot(price,Rooms)   # par(mfrow=) puts the plots in one window; c() does not combine plots
Q.) Collect data on at least 20 students, analyse it using descriptive statistics,
and interpret the results.
name=c("A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T")
age=c(11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30)
maths=c(19,23,42,31,42,50,50,36,23,41,47,25,33,25,35,45,50,32,28,38)
phy=c(10,20,21,22,11,34,33,23,10,29,38,34,55,45,33,32,22,11,19,21)
chy=c(18,19,19,19,19,20,20,20,20,20,21,21,21,21,22,23,24,27,30,36)
Command> data=data.frame(name,age,maths,phy,chy)
Output> data
name age maths phy chy
1 A 11 19 10 18
2 B 12 23 20 19
3 C 13 42 21 19
4 D 14 31 22 19
5 E 15 42 11 19
6 F 16 50 34 20
7 G 17 50 33 20
8 H 18 36 23 20
9 I 19 23 10 20
10 J 20 41 29 20
11 K 21 47 38 21
12 L 22 25 34 21
13 M 23 33 55 21
14 N 24 25 45 21
15 O 25 35 33 22
16 P 26 45 32 23
17 Q 27 50 22 24
18 R 28 32 11 27
19 S 29 28 19 30
20 T 30 38 21 36
Command> > mean3=mean(phy)
> mean3
Output> [1] 26.15
Command> > mean4=mean(chy)
> mean4
Output> [1] 22
>
> #for all the medians
Command> > median1=median(age)
> median1
Output> [1] 20.5
>
> #for all the modes of the data
Command> > xr1=table(age)
> mode1=which(xr1==max(xr1))
> mode1
Output> 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Command> > xr2=table(maths)
> mode2=which(xr2==max(xr2))
> mode2
Output> 50
15
Command> > xr3=table(phy)
> mode3=which(xr3==max(xr3))
> mode3
Output> 10 11 21 22 33 34
1 2 5 6 10 11
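In the output of `which(xr==max(xr))`, the first row holds the modal values (the names) and the second row holds their positions in the frequency table. For age every value occurs once, so every value is reported as a mode. As a minimal sketch, the helper below returns only the modal values. The name `stat_mode` is hypothetical and not part of base R.

```r
# Hypothetical helper: return the most frequent value(s) of a numeric vector.
stat_mode <- function(v) {
  counts <- table(v)
  # names(counts) holds the distinct values as strings; convert them back to numbers
  as.numeric(names(counts)[counts == max(counts)])
}

maths <- c(19,23,42,31,42,50,50,36,23,41,47,25,33,25,35,45,50,32,28,38)
phy   <- c(10,20,21,22,11,34,33,23,10,29,38,34,55,45,33,32,22,11,19,21)

stat_mode(maths)   # 50
stat_mode(phy)     # 10 11 21 22 33 34
```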
Command> > variance1=var(age)
> variance1
Output> [1] 35
Command> > variance2=var(maths)
> variance2
Output> [1] 98.61842
Command> > c.f1=cumsum(age)
> c.f1
Output> [1] 11 23 36 50 65 81 98 116 135 155 176 198 221 245 270 296 323 351 380
[20] 410
Command> > c.f2=cumsum(maths)
> c.f2
Output> [1] 19 42 84 115 157 207 257 293 316 357 404 429 462 487 522 567 617 649 677
[20] 715
Command> > c.f3=cumsum(phy)
> c.f3
Output> [1] 10 30 51 73 84 118 151 174 184 213 251 285 340 385 418 450 472 483 502
[20] 523
Command> > c.f4=cumsum(chy)
> c.f4
Output> [1] 18 37 56 75 94 114 134 154 174 194 215 236 257 278 300 323 347 374 404
[20] 440
>
> # for all the summaries
Command> > s1=summary(age)
> s1
Output>
Min. 1st Qu. Median Mean 3rd Qu. Max.
11.00 15.75 20.50 20.50 25.25 30.00
Command> > s2=summary(maths)
> s2
Output>
Min. 1st Qu. Median Mean 3rd Qu. Max.
19.00 27.25 35.50 35.75 42.75 50.00
Command> > s3=summary(phy)
> s3
Output>
Min. 1st Qu. Median Mean 3rd Qu. Max.
10.00 19.75 22.50 26.15 33.25 55.00
Command> > s4=summary(chy)
> s4
Output>
Min. 1st Qu. Median Mean 3rd Qu. Max.
18.00 19.75 20.50 22.00 22.25 36.00
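The per-column commands above can also be run in one pass. The sketch below assumes the same four numeric vectors as in the data frame and uses `sapply` to apply several statistics to every column. The names `marks` and `stats` are illustrative choices.

```r
# Same values as the data frame above (age 11..30 written as a sequence)
age   <- 11:30
maths <- c(19,23,42,31,42,50,50,36,23,41,47,25,33,25,35,45,50,32,28,38)
phy   <- c(10,20,21,22,11,34,33,23,10,29,38,34,55,45,33,32,22,11,19,21)
chy   <- c(18,19,19,19,19,20,20,20,20,20,21,21,21,21,22,23,24,27,30,36)

marks <- data.frame(age, maths, phy, chy)

# One column of results per variable, one row per statistic
stats <- sapply(marks, function(v) {
  c(mean = mean(v), median = median(v), var = var(v), sd = sd(v))
})
round(stats, 2)
```

Reading across a row of `stats` makes it easy to compare subjects, for example the mean of 35.75 for maths against 26.15 for phy.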
1. Matrices and arrays
COMMAND> k<-array(1:12,dim=c(3,4))
OUTPUT> > k<-array(1:12,dim=c(3,4))
>k
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
> A<-list(c(1,2,3,4),c(5,6,7,8),c(9,10,11,12))
COMMAND> A
OUTPUT> [[1]]
[1] 1 2 3 4
[[2]]
[1] 5 6 7 8
[[3]]
[1] 9 10 11 12
command> y<-matrix(c(1,2,3,4,5,6,7,8,9,10,11,12),nrow=3,ncol=4,byrow=TRUE)
output> y<-matrix(c(1,2,3,4,5,6,7,8,9,10,11,12),nrow=3,ncol=4,byrow=TRUE)
>y
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8
[3,] 9 10 11 12
command> z=matrix(c(1,2,3,4,5,6,7,8,9,10,11,12),nrow=3,ncol=4,byrow=TRUE,dimnames=list(c("A","B","C")))
output> z=matrix(c(1,2,3,4,5,6,7,8,9,10,11,12),nrow=3,ncol=4,byrow=TRUE,dimnames=list(c("A","B","C")))
>z
[,1] [,2] [,3] [,4]
A 1 2 3 4
B 5 6 7 8
C 9 10 11 12
>z=matrix(c(1,2,3,4,5,6,7,8,9,10,11,12),nrow=3,ncol=4,byrow=TRUE,dimnames=list(c("A","B","C")))
command> t(z)
output> z=matrix(c(1,2,3,4,5,6,7,8,9,10,11,12),nrow=3,ncol=4,byrow=TRUE,dimnames=list(c("A","B","C")))
> t(z)
A B C
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
> a<-matrix(c(1,2,3,4,5,6,7,8,9,10,11,12),nrow=3,ncol=4,byrow=TRUE)
>a
> b<-matrix(c(1,3,7,8,3,5,2,12,13,15,16,17),nrow=3,ncol=4,byrow=TRUE)
>b
command> cbind(a,b)
output> > cbind(a,b)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 2 3 4 1 3 7 8
[2,] 5 6 7 8 3 5 2 12
[3,] 9 10 11 12 13 15 16 17
> a<-matrix(c(1,2,3,4,5,6,7,8,9,10,11,12),nrow=3,ncol=4,byrow=TRUE)
>a
> b<-matrix(c(1,3,7,8,3,5,2,12,13,15,16,17),nrow=3,ncol=4,byrow=TRUE)
>b
command> rbind(a,b)
output> rbind(a,b)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8
[3,] 9 10 11 12
[4,] 1 3 7 8
[5,] 3 5 2 12
[6,] 13 15 16 17
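As a quick check of the two binding operations, the sketch below rebuilds the same two 3 x 4 matrices and inspects the dimensions of the results: `cbind` places the matrices side by side (rows must match), while `rbind` stacks them (columns must match).

```r
a <- matrix(1:12, nrow = 3, ncol = 4, byrow = TRUE)
b <- matrix(c(1,3,7,8,3,5,2,12,13,15,16,17), nrow = 3, ncol = 4, byrow = TRUE)

dim(cbind(a, b))   # 3 8: same rows, columns added
dim(rbind(a, b))   # 6 4: same columns, rows added
```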
f) Use arbitrary numbers to create matrix.
b<-matrix(c(1,3,7,8,3,5,2,12,13,15,16,17),nrow=3,ncol=4,byrow=TRUE)
Command> b
output> [,1] [,2] [,3] [,4]
[1,] 1 3 7 8
[2,] 3 5 2 12
[3,] 13 15 16 17
2. Random sampling
a) In R, you can simulate these situations with the sample function. Pick five numbers at
random from the set 1:40.
> a<-1:40
> a
Command> sample(a,5)
Output>[1] 1 32 4 10 31
b) Notice that the default behaviour of sample is sampling without replacement. That is, the
samples will not contain the same number twice, and size obviously cannot be bigger than
the length of the vector to be sampled. If you want sampling with replacement, then you
need to add the argument replace=TRUE. Sampling with replacement is suitable for modelling
coin tosses or throws of a die. So, for instance, simulate 10 coin tosses.
(i)Tosses
> p<-c("H","T")
Command> sample(p,10,replace=TRUE)
Output> [1] "T" "H" "H" "H" "H" "H" "T" "H"
[9] "H" "T"
(ii)Die throws
> D<-c(1,2,3,4,5,6)
Command> sample(D,10,replace=TRUE)
Output> [1] 4 6 1 5 2 1 5 3 6 3
c) In fair coin-tossing, the probability of heads should equal the probability of tails, but the
idea of a random event is not restricted to symmetric cases. It could be equally well applied
to other cases, such as the successful outcome of a surgical procedure. Hopefully, there
would be a better than 50% chance of this. Simulate data with nonequal probabilities for the
outcomes (say, a 90% chance of success) by using the prob argument to sample.
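A minimal sketch of part (c), assuming two outcome labels "success" and "failure" and an arbitrary seed chosen only so the run is repeatable. The `prob` vector is matched to the outcomes by position.

```r
set.seed(1)   # arbitrary seed, only for reproducibility

outcomes <- c("success", "failure")

# 10 simulated procedures with a 90% chance of success each
sim <- sample(outcomes, 10, replace = TRUE, prob = c(0.9, 0.1))
sim
table(sim)

# With many draws the observed share of successes should be close to 0.9
big <- sample(outcomes, 10000, replace = TRUE, prob = c(0.9, 0.1))
mean(big == "success")
```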
d)
a<-40
b<-5
Command> choose(40,5)
Output> [1] 658008
e)5!
Command> factorial(5)
Output> [1] 120
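The two calls in (d) and (e) are related: the binomial coefficient can be written in terms of factorials. The sketch below assumes part (d) is counting the ways to choose 5 items out of 40, which is what `choose(40,5)` computes.

```r
n <- 40
k <- 5

choose(n, k)                                       # 658008
factorial(n) / (factorial(k) * factorial(n - k))   # same value, up to floating-point rounding
```

For large n the factorial form loses precision because `factorial(40)` is far beyond exact integer range, so `choose()` is the safer call.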
R-LAB
Command> x=3
> q=dbinom(x,size=5,prob=0.95)
Output> q
[1] 0.02143438
2. It is known that 20% of integrated circuit chips on a production line are defective.
To maintain and monitor the quality of the chips, a sample of twenty chips is
selected at regular intervals for inspection.
Let X denote the number of defectives found in the sample.
Find the probability of different number of defective found in the sample?
Command> y<-seq(0,20,by=1)
c=dbinom(y,size=20,prob=0.2)
c
Output> > c
4. Plot all of the above problems in a single window for random variable and
respective Probability distribution.
Command> x<-seq(0,3,by=1)
b=dbinom(x,size=5,prob=0.95)
y<-seq(0,20,by=1)
c=dbinom(y,size=20,prob=0.2)
par(mfrow=c(2,2))
z<-seq(0,100,by=1)
d=dbinom(z,size=100,prob=0.01)
plot(x,b,main="1st problem",xlab="value of X",ylab="probability")
plot(y,c,main="2nd problem",xlab="value of X",ylab="probability")
plot(z,d,main="3rd problem",xlab="value of X",ylab="probability")
Output>
5. For Q.No. 1 Find P(X ≤ 3) and P(X > 3). For Q. No. 2 Find P(X ≤ 4) and P(X > 4).
Find all the cumulative probabilities and round to 4 decimal places.
P(x≤ 3)
Command> a=round(pbinom(3,size=5,prob=0.95),4)
>a
Output> [1] 0.0226
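The remaining parts of question 5 follow the same pattern. This is a sketch assuming the parameters used earlier in the document (size 5 and probability 0.95 for Q. No. 1, size 20 and probability 0.2 for Q. No. 2). The variable names are illustrative.

```r
# Q. No. 1: P(X <= 3) and its complement P(X > 3)
p_le3 <- pbinom(3, size = 5, prob = 0.95)
p_gt3 <- pbinom(3, size = 5, prob = 0.95, lower.tail = FALSE)
round(c(p_le3, p_gt3), 4)   # 0.0226 0.9774

# Q. No. 2: P(X <= 4) and P(X > 4)
q2_le4 <- pbinom(4, size = 20, prob = 0.2)
q2_gt4 <- 1 - q2_le4
round(c(q2_le4, q2_gt4), 4)

# All cumulative probabilities for Q. No. 1, rounded to 4 decimal places
round(pbinom(0:5, size = 5, prob = 0.95), 4)
```

Using `lower.tail = FALSE` and subtracting from 1 give the same upper-tail probability; the first form is slightly more accurate when the tail is very small.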
6. The probability that a patient recovers from a rare blood disease is 0.4. If 15
people are known to have contracted this disease, what is the probability that (a) at
least 10 survive, (b) from 3 to 8 survive, and (c) exactly 5 survive?
Command> k=1-pbinom(9,size=15,prob=0.4)
k
Output> [1] 0.0338333
Command> L=pbinom(8,size=15,prob=0.4)-pbinom(2,size=15,prob=0.4)   # subtract P(X<=2) so that X=3 is included
L
Output> [1] 0.8778386
Command> m=dbinom(5,size=15,prob=0.4)   # dbinom gives P(X=5); pbinom would give P(X<=5)
m
Output> [1] 0.1859378
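As a cross-check on the three parts of this question, each probability can also be obtained by summing the point probabilities from `dbinom` over the values of X that the event covers. This is a sketch with illustrative variable names.

```r
n <- 15
p <- 0.4

at_least_10 <- sum(dbinom(10:15, n, p))   # P(X >= 10)
from_3_to_8 <- sum(dbinom(3:8, n, p))     # P(3 <= X <= 8), both ends included
exactly_5   <- dbinom(5, n, p)            # P(X = 5)

round(c(at_least_10, from_3_to_8, exactly_5), 4)   # 0.0338 0.8778 0.1859
```

Writing out the range explicitly (`3:8`) makes it harder to drop an endpoint than when differencing two `pbinom` calls.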
(b) What is the probability that there are at most three days with an accident?
> n=400
> p=0.005
> lambda=n*p
Command> z<-ppois(3,lambda)
z
Output> [1] 0.8571235
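The setup lambda = n*p suggests the Poisson distribution is being used here as an approximation to a binomial count. As a sketch under that assumption (400 independent days, each with accident probability 0.005), the two answers can be compared directly.

```r
n <- 400
p <- 0.005
lambda <- n * p   # 2

poisson_val <- ppois(3, lambda)              # Poisson approximation
binom_val   <- pbinom(3, size = n, prob = p) # binomial value under the stated assumption

c(poisson = poisson_val, binomial = binom_val)
abs(poisson_val - binom_val)                 # size of the approximation error
```

The approximation is close because n is large and p is small, which is the usual condition for replacing a binomial with a Poisson.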
5. The potential buyer of a particular engine requires (among other things) that the
engine start successfully 10 consecutive times. Suppose the probability of a
successful start is 0.990. Let us assume that the outcomes of attempted starts are
independent.
(a) What is the probability that the engine is accepted after only 10 starts?
Command> 0.99^10   # accepted after 10 starts only if all 10 independent starts succeed
Output> [1] 0.9043821
(b) What is the probability that 12 attempted starts are made during the acceptance
process?
Command> 0.01*0.99^10   # start 2 must fail, then starts 3 to 12 all succeed (start 1 may go either way)
Output> [1] 0.009043821
6. The acceptance scheme for purchasing lots containing a large number of batteries
is to test no more than 75 randomly selected batteries and to reject a lot if a single
battery fails. Suppose the probability of a failure is 0.001.
(a) What is the probability that a lot is accepted?
Command> dbinom(0,size=75,prob=0.001)   # accepted only if none of the 75 batteries fails
Output> [1] 0.9277087
(b) What is the probability that a lot is rejected on the 20th test?
Command> dgeom(19,prob=0.001)   # 19 batteries pass, then the first failure occurs on test 20
Output> [1] 0.00098117
(c) What is the probability that it is rejected in 10 or fewer trials?
Command> pgeom(9,prob=0.001)   # the first failure occurs somewhere in the first 10 tests
Output> [1] 0.00995512
7. Plot the graph for Q. No. 2, 4, 5 and 6 for Random Variable against Probability
Distribution function.
Command>
x=seq(1,400,1)
x
b<-dpois(x,2)
b
#3
y=seq(0,3,1)
y
p=3
c<-dpois(y,3)
c
z<-seq(1,20,1)
z
d<-dpois(z,10)
d
q<-seq(1,100,1)
q
m=75*(1-0.001)
e<-dpois(q,m)
e
par(mfrow=c(2,2))   # set the 2 x 2 layout before drawing, so all four plots share one window
plot(x,b,main="1st problem",xlab="random variable",ylab="probability distribution")
plot(y,c,main="2nd problem",xlab="random variable",ylab="probability distribution")
plot(z,d,main="3rd problem",xlab="random variable",ylab="probability distribution")
plot(q,e,main="4th problem",xlab="random variable",ylab="probability distribution")
Output