Using SAS PC with Windows

Statistics 511
Professor Naomi Altman
Revised from previous editions by McShane and Altman, and Nshinyabakobeje and Altman

TABLE OF CONTENTS

A. OVERVIEW OF THE SAS SYSTEM
B. TWO STEPS NEEDED IN THE SAS PROGRAMMING LANGUAGE
C. SYNTAX
D. CHARACTERISTICS OF A SAS DATA SET
E. CREATING A SAS DATA SET
E. CREATING A SAS PROGRAM FOR DESCRIPTIVE STATISTICS
F. PRODUCING A REPORT FROM A SAS OUTPUT
G. HELPFUL HINTS

A. OVERVIEW OF THE SAS SYSTEM

SAS (Statistical Analysis System) is a software system/package for data analysis. SAS provides tools for: information storage and file handling; data modification and management; statistical analysis; and report writing. The SAS system is a powerful programming language plus a collection of ready-to-use programs called procedures or PROCs, which can perform a large variety of applications. We will use primarily the Basic and Statistical tools, a small fraction of the capabilities of SAS. On-line documentation is available at www.sas.psu.edu. There is also on-line help when you run SAS PC, but this is difficult to use.

B. TWO STEPS NEEDED IN THE SAS PROGRAMMING LANGUAGE

The SAS language has its own vocabulary and syntax: words and the rules for putting them together. A SAS statement is a string of SAS keywords, SAS names, and special characters and operators ending in a semicolon that instructs SAS to perform an operation or gives SAS information. A sequence of SAS statements is called a SAS program.

A SAS program consists of two kinds of steps: DATA steps and PROC steps. DATA and PROC steps can appear in any order, and any number of DATA and PROC steps can be used in a SAS program. Usually, DATA steps create SAS data sets, and PROC steps do analysis of SAS data sets. A PROC may also create variables, such as residuals and fitted values, which can be placed in a new data set or appended to an existing data set.

A DATA step is a group of SAS statements that begins with a DATA statement. Example:

    DATA ONE;                  /* Creates a data set named "ONE". */
    INFILE 'A:YIELD.TXT';      /* Reads the data from the file A:YIELD.TXT. */
    INPUT TREAT REP YIELD;     /* The file has variables named TREAT, REP and YIELD. */
    LOGY=LOG(YIELD);           /* A new variable LOGY is created and added to "ONE". */

The DATA step begins with a DATA statement and can include any number of program statements. You can use the DATA step for these purposes:

- retrieval: getting input data from a file;
- editing: checking for errors in the data and correcting them; computing new variables;
- outputting: writing data sets to disk;
- creating: producing new SAS data sets from existing ones by subsetting, merging, and updating.

Every SAS data set has a name. By default, SAS uses the currently active data set, which is the one most recently called as input to a DATA or PROC statement. If your program uses several data sets, it is best to call the required data set using DATA=datasetname when you need to use it. That will avoid problems as you change your program.

The DATA step can include statements telling SAS to create one or more new SAS data sets and programming statements that perform the manipulations necessary to build the data sets. Creating a new data set does not change the currently active data set.

A PROC is a group of SAS statements that begins with a PROC statement. Example:

    PROC REG DATA=HOUSING;         /* Calls the regression procedure with data set "HOUSING". */
    MODEL PRICE=SQFT NOBEDRM;      /* Tells SAS which are the dependent and independent variables. */
    OUTPUT OUT=NEWDATA R=R P=P;    /* Stores the created variables, R and P, in a new data set called "NEWDATA". */

The PROC step (or PROCEDURE step) instructs SAS to call a procedure from its library and to execute that procedure, usually with a SAS data set as input. The PROC step begins with a PROC statement. Other statements in the PROC step give the program more information about the results that you want.

C. SYNTAX

There are 4 main syntax rules:

1. Every SAS statement ends with a semicolon ";". Failure to include the semicolon is the most common error, and unfortunately leads to error messages that are difficult to decipher.
2. Variables or data set names should contain 8 or fewer characters or digits.
3. SAS is case insensitive. DOG, Dog and dog all mean the same thing to SAS.
4. SAS ignores "end of line" and multiple spaces.

D. CHARACTERISTICS OF A SAS DATA SET

The SAS system reads data (letters or numbers) in various forms and organizes them into a SAS data set, which is similar to a spreadsheet. Once the data have been organized into a SAS data set, you can access, analyze, revise, and display the data. You can also store data sets; however, for small data sets, it is most convenient to store them as text files.

The data consist of the following components: data value, variable, and observation.

A data value is a single unit of information (a single cell).
A variable is a set of data values describing a characteristic (a single column).
An observation is a set of data values for the same item (a single row).

The following data set contains 5 variables, 18 observations, and 90 data values, one of which is missing (the AGE for Yao, shown as a period).

          NAME         SEX   AGE   HEIGHT   WEIGHT
     1    Aubrey        M     41     74      170
     2    Ron           M     42     68      166
     3    Carl          M     32     70      155
     4    Antonio       M     39     72      167
     5    Deborah       F     30     66      124
     6    Jacqueline    F     33     66      115
     7    Helen         F     26     64      121
     8    David         M     30     71      158
     9    James         M     53     72      175
    10    Michael       M     32     69      143
    11    Ruth          F     47     69      139
    12    Joel          M     34     72      163
    13    Donna         F     23     62       98
    14    Roger         M     36     75      160
    15    Yao           M      .     70      145
    16    Elizabeth     F     31     67      135
    17    Tim           M     29     71      176
    18    Susan         F     28     65      131

Now you are ready to create a data set file and a SAS program file.

E. CREATING A SAS DATA SET

The initial step in most SAS programs will involve reading data from a text file. Any text processor or spreadsheet can be used to create the file. We demonstrate using Notepad. Usually I use my favorite text editor and save in txt format.
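As a rough illustration of the terms in section D, a few rows of the table could be entered as a SAS data set as follows. This is a sketch, not part of the original handout: the data set name PEOPLE is made up, the data are embedded with DATALINES only so that the example is self-contained, and the LENGTH statement is an assumption added so that names longer than 8 characters are not cut short.

```sas
/* Sketch only: PEOPLE is a hypothetical data set name. */
DATA PEOPLE;
  LENGTH NAME $ 12;     /* leave room for names longer than 8 characters */
  /* A $ after a name marks a character variable; the others are numeric. */
  INPUT NAME $ SEX $ AGE HEIGHT WEIGHT;
  DATALINES;
Aubrey M 41 74 170
Ron M 42 68 166
Deborah F 30 66 124
Jacqueline F 33 66 115
Yao M . 70 145
;
RUN;

PROC PRINT DATA=PEOPLE;   /* list the observations to check how they were read */
RUN;
```

Each data line becomes one observation, each name on the INPUT statement becomes one variable, and the period in Yao's row is read as a missing value for AGE.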
(In the SAS manuals you will occasionally see data embedded in a DATA step. However, this is awkward and means that a new program file needs to be written every time the data are modified.)

The data set to be created consists of two variables measured on a random sample of 9 steers. The first variable is the live weight (in hundreds of pounds) and the second variable is the dressed weight (in hundreds of pounds). This sample data set will be used to obtain simple summary statistics and a Normal Probability Plot.

    Live weight      4.2  3.8  4.8  3.4  4.5  4.6  4.3  3.7  3.9
    Dressed weight   2.8  2.5  3.1  2.1  2.9  2.8  2.6  2.4  2.5

Data from Lyman Ott, An Introduction to Statistical Methods and Data Analysis, p. 143.

Start Notepad as follows: Start > Programs > Accessories > Notepad.

Now, type the following data set.

    4.2 2.8
    3.8 2.5
    4.8 3.1
    3.4 2.1
    4.5 2.9
    4.6 2.8
    4.3 2.6
    3.7 2.4
    3.9 2.5

Save the file on a diskette in drive A: as follows.

    File > Save > A:\STEERS.TXT > Save.
    File > Exit.

Now you are going to create the SAS program.

E. CREATING A SAS PROGRAM FOR DESCRIPTIVE STATISTICS

Although SAS has some interactive features, it is basically a batch program. This means that it is convenient to create and save programs as text files. Usually, I create my program in my favorite text editor, and save it as a text file with extension ".sas" instead of ".txt". Then clicking on the file opens the program and places the text in the SAS text editor, from where it can be run. You can also create and save your program in the SAS text editor. Instructions are below.

Start SAS as follows: Start > Programs > The SAS System > The SAS System for Windows v8

After opening SAS, you see a screen with two windows. The upper window is a Log window showing the SAS statements which have already been processed, along with comments. The bottom window is the Program Editor window.
You will enter and edit your SAS program in the Program Editor window.

Now, create the following SAS program in the Program Editor window. Use upper or lower case letters as you choose.

    /*
    THIS PROGRAM IS USED TO CREATE A SMALL SAS PROGRAM
    WRITTEN BY: LAST NAME, FIRST NAME OF STUDENT.
    DATE: MONTH/DAY/YEAR
    */

The text above, which is delimited by /* */, is a comment and is ignored by SAS. It is helpful to use comments as a way to document your data.

    OPTIONS LS=79 NOCENTER;
    /* OPTIONS picks options for the output. LS= selects the number of
       characters per line. */

    TITLE 'SUMMARY STATISTICS';
    /* TITLE provides a title that appears on each page of the output. */

    DATA MARY;
    /* We now create the data set named "MARY". */

    INFILE 'A:STEERS.TXT';
    /* We read the data from A:STEERS.TXT. */

    INPUT LIVEWT DRESSWT;
    /* There are two variables named LIVEWT and DRESSWT. The variable names
       are separated by blanks. You need to name all variables in the data
       set, even if you do not want to use them all. */

    TITLE2 'PRINTING LIVEWT';
    /* TITLE2 provides a subtitle. It can be used e.g. if several analyses
       are performed in the same program. */

    PROC PRINT DATA=MARY;
    /* We now run our first PROC. It prints some or all of the data in data
       set MARY. */

    VAR LIVEWT;
    /* Tells SAS to print LIVEWT only. Otherwise it prints all of the data. */

    RUN;
    /* The RUN command can be used to terminate a PROC or DATA step. Commands
       you submit will not run until a RUN command is added. */

Try to run the SAS program now by clicking the SUBMIT icon (the running figure), or by pressing the function key F3. Look at your SAS output. You should see a list of the data. If you do not, you have made an error. However, you can continue reading this tutorial, as error correction is the next topic.

Whether or not the output appears, open the Log window as follows: Window > Log. You should see the SAS commands you entered, with comments about how they executed, including error messages (in red) if any. Warnings are printed in green.
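Once the program above has run cleanly, the data set MARY can feed further steps. The sketch below is an illustration rather than part of the course program: it builds a second data set from MARY with one computed variable, then prints every variable by leaving out the VAR statement. The names MARY2 and RATIO are invented for this example.

```sas
/* Sketch only: MARY2 and RATIO are hypothetical names. */
DATA MARY2;
  SET MARY;                    /* read the observations already stored in MARY */
  RATIO = DRESSWT / LIVEWT;    /* dressed weight as a fraction of live weight */
RUN;

TITLE2 'PRINTING ALL VARIABLES';
PROC PRINT DATA=MARY2;         /* no VAR statement, so all variables are printed */
RUN;
```

Naming the data set with DATA=MARY2 follows the advice in section B to call the required data set explicitly rather than rely on the default.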
We now want to see what happens if you make an error. To do this, we will start over. You can clear any window by clicking on the window to make it active. Then clear the window as follows: Edit > Clear Text. Clear the OUTPUT and LOG windows. (Window > OUTPUT, Edit > Clear Text; Window > LOG, Edit > Clear Text)

Recall the current SAS file in the Program Editor window as follows: Window > Program Editor to open the window and Locals > Recall Text to bring back the most recently submitted text. (Repeating Recall Text brings back the second most recently submitted text, etc.)

To see how SAS handles errors in a program file, change the statement

    PROC PRINT DATA=MARY;

to

    PROC PRINT DATA=MARIAM;

in the program above, then run SAS. You will get the following error message (written in red) in the LOG window.

    ERROR: File WORK.MARIAM.DATA does not exist.

Once a SAS program has been submitted for processing, error messages are written in the Log window. They can be accessed as follows: Window > Log. Now open the Log window and scroll down to read the error message. As mentioned previously, it is assumed that you have cleared the Log window of its previous contents. If not, clear this window and run the SAS program again.

Reopen your SAS program file as follows: Window > Program Editor > Locals > Recall Text. Go to the statement PROC PRINT DATA=MARIAM; and change MARIAM back to MARY.

Add the remaining statements below to your SAS program.

    PROC UNIVARIATE DATA=MARY;
    /* PROC UNIVARIATE prints summary statistics. It is part of SAS BASIC,
       rather than SAS STATISTICS. */

    VAR LIVEWT DRESSWT;
    /* We will obtain summary statistics for both variables. */

    QQPLOT;
    /* We request a Normal Probability Plot for both variables. */

    RUN;

Save the SAS program file on the A:\ drive as follows: File > Save > A:\steers.sas > Save

To run the SAS program, click on the SUBMIT icon or simply press the F3 key. Other key functions are defined under: Help > Keys.

By selecting the Window menu, you can open the Output, Program Editor, and Log windows whenever necessary. The Output window can be selected and opened the same way the other two windows are opened. You can save the Output window's contents as follows: File > Save > A:\steers.lst > Save. I usually cut and paste the entire window into a text editor as described below.

F. PRODUCING A REPORT FROM A SAS OUTPUT

There are many different ways to produce a report using SAS output. We will go through one way, which assumes that you have a text editor such as Word on your computer and that your computer is powerful enough to run the editor and SAS at the same time.

Start the text editor. In the editor, write your report. For example, type the heading and introductory material describing the problem you analyzed. Discuss your analysis of the data. Suppose you would like to include a SAS analysis output in your report.

Copy the analysis output to the Clipboard. Open the SAS output: Window > Output. Then Edit > Select All; Edit > Copy. Note that copying only a portion of the output by highlighting does not always work. I copy the entire output to a text editor, and edit there.

Open the text editor and paste your SAS output: Edit > Paste. You will notice that the SAS output has the SAS monospace font at size 10 by default. You need to modify the font size for a better output. For example, in Microsoft Word, choose Edit > Select All and change the size of the SAS monospace font to 8.

Now you can edit your Word document by adding text and/or removing parts of the output you judge unimportant. You can also copy and paste graph sheets directly into your document.

Save your report as follows: File > Save > A:\report.doc > Save

Now you can print your report in Microsoft Word as follows: File > Print > Click OK

G. HELPFUL HINTS

Every SAS statement ends with a semicolon (;).
You may continue statements on two or more lines. Forgetting the semicolon (;) leads to error messages which are hard to decipher. This is always the first thing to check if your program does not run.

The next most common error is misspelling a SAS command or variable name. SAS variable names can be no more than 8 characters long.

The third most common error is trying to use a variable that is not available. For example, this error can occur if you try to draw a residual plot but forgot to store the residuals from the regression, or if you are in the wrong data set.

SAS ignores extra blanks, including blank lines, so you can space your program so that it is neat and easy to read. You can put more than one statement on a line, but this usually makes your program hard to read. SAS program files can get quite long, so it is useful to keep them readable.

Statements added in the SAS editor are not saved in your SAS program file. If you want to have them available for future use, you must explicitly save them using File > Save.

Most PROCs can use only one data set. If variables from multiple data sets are needed, the data sets can be merged in a DATA step.

The online SAS help can be difficult to use due to statements with the same name in different PROCs. When seeking help for a PROC, try searching on PROC. This gives a list of all the PROCs. You can then click on the PROC you want, which gives a list of the statements valid for that PROC. You will likely find the online manual more useful.
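The hint about unavailable variables can be made concrete with the steers data. The following is a minimal sketch, assuming the data set MARY from section E already exists. The regression of DRESSWT on LIVEWT, the data set name FITTED, and the variable names RESID and PRED are illustrative choices, not something the handout prescribes. The residuals are stored with an OUTPUT statement first, and the plotting step then names the data set that holds them.

```sas
/* Sketch only: FITTED, RESID and PRED are hypothetical names. */
PROC REG DATA=MARY;
  MODEL DRESSWT = LIVEWT;
  OUTPUT OUT=FITTED R=RESID P=PRED;   /* store residuals and fitted values */
RUN;
QUIT;                                 /* PROC REG stays active until QUIT */

PROC PLOT DATA=FITTED;                /* name the data set that contains RESID */
  PLOT RESID*PRED;                    /* residuals (vertical) against fitted values */
RUN;
QUIT;
```

Pointing PROC PLOT at MARY instead of FITTED would reproduce the third error in the list above, because RESID exists only in FITTED.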