
Review of SAS

SAS Display Manager

The SAS Display Manager (DM) is the standard interactive environment for the SAS system. When you start the SAS program, it displays several windows, and you can switch among them:

o PROGRAM window is the bottom window. You can enter a SAS program or bring in a SAS program from a file. This window has an editor that allows you to write and edit your program. When completed, you can submit your program.
o LOG window is the middle window. It displays the program lines as the program executes, prints messages about your data sets as they are created, and gives error messages as errors occur. Note that error messages are shown in red.
o OUTPUT window is the top window. It displays the results from your SAS program.
o RESULTS window is like a table of contents for the Output window. The results tree lists each part of the results in outline form.
o EXPLORER window gives you quick access to SAS files.

Various Modules of SAS


Base SAS is the main module, which provides data manipulation and programming capability and some elementary descriptive statistics.

SAS/STAT is the module that includes all the statistical programs except the elementary ones supplied with the base package.

SAS/GRAPH is a module that provides high-quality graphs. Base SAS and SAS/STAT do provide graphics, but SAS/GRAPH provides high-quality graphs, maps, and charts.

SAS/FSP is a module that allows you to search, modify, or delete records directly from a SAS data file. It provides for data entry with sophisticated data-checking capabilities. FSP stands for Full Screen Product.

The other modules are SAS/AF (SAS Applications Facility), SAS/ETS, SAS/OR, SAS/QC, SAS/IML (Interactive Matrix Language), SAS/ACCESS, SAS/CONNECT, and SAS/ASSIST.

Basic Structure of the SAS System

SAS is organized into steps. There are two types of steps:

o DATA step is the heart of SAS programming. It puts data in a form that the SAS program can use. The DATA step reads SAS data sets or raw data, performs transformations, creates new variables, and recodes existing variables.
o PROC step uses procedures to do many things to the data, such as sorting, analyzing, or printing them.

Both DATA and PROC steps consist of SAS statements. Each SAS statement begins with a keyword that implies its function. For example:

o INPUT statement reads inputs
o FREQ procedure produces a frequency distribution
o RUN statement tells SAS to run the DATA or PROC step just completed

A semicolon (;) is required to denote the end of each SAS statement. The spacing of the statements and the use of lower or upper case are not important in SAS.

SAS has many different PROCs that can be organized into functional categories:

o PROCs that manage data sets
o PROCs that analyze the contents of a data set
o PROCs that graphically display the contents of a data set
o ... etc.

SAS lets you create a data set within the SAS environment, and it also provides many ways to create a SAS data set by importing data from other formats such as, to name a few, Excel, ASCII, Access, and SPSS.

Data sets are organized in the form of rows and columns. A row refers to an observation, and a column refers to a variable.
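As a minimal sketch of the two step types, the program below uses one DATA step to build a small data set and one PROC step to print it. The data set name SCORES and its variables NAME, SCORE, and PASSED are made up for illustration and do not come from these notes.

```sas
/* DATA step: read raw data and create a new variable.          */
/* SCORES, NAME, SCORE and PASSED are hypothetical names.       */
data scores;
  input name $ score;
  passed = (score >= 60);   /* 1 if the comparison is true, 0 otherwise */
  datalines;
Ann 72
Bob 55
;
run;

/* PROC step: print the data set created above */
proc print data=scores;
run;
```

Every statement ends with a semicolon, and each step ends with RUN, as described above.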

Creating a SAS data set as part of the DATA step programming

proc format;
  value $sexf  'F' = 'Female' 'M' = 'Male';
  value racef  1 = 'Caucasian' 2 = 'African-American' 3 = 'Other';
  value yesnof 0 = 'No' 1 = 'Yes';
  value gradef 0 = 'Low Grade' 1 = 'High Grade';
run;

data temp;
  input id hb age sex $ race dm ht stroke cancer
        @37 date_1trt mmddyy8. @49 date_lastcontact mmddyy8. grade;
  format date_1trt date_lastcontact mmddyy10.;
  label id = "Patients ID"
        hb = "Anemic status: Response Variable"
        age = "Age of the patient"
        sex = "Patients gender"
        race = "Patients race"
        dm = "DM?"
        ht = "HT?"
        stroke = "Stroke?"
        cancer = "Cancer?"
        date_1trt = "Date of 1st treatment"
        date_lastcontact = "Date of last contact";
  datalines;
1  14.9 40 F 1 0 0 0 0              11/11/03    12/2/03  1
2   9.9 47 F 1 0 1 0 0              12/21/03    11/11/04 1
3   8.7 47 F 3 0 1 0 0              12/9/90     4/15/91  1
4  16.7 37 F 2 0 0 0 0              3/7/03      7/12/03  1
5  12.5 45 F 1 0 0 0 0              12/8/03     5/13/05  0
6   8.4 40 M 1 0 0 0 0              4/1/99      11/9/00  1
7   7.4 24 M 2 0 0 0 0              4/11/03     11/1/04  0
8   9.3 60 F 1 0 0 0 1              6/18/86     10/21/88 0
9  11.1 38 M 1 0 1 0 0              2/17/03     12/31/05 1
10 10.3 24 M 1 0 0 0 0              3/16/01     9/18/04  1
11 11.4 46 F 1 1 1 0 0              11/8/01     11/17/05 0
12  9.1 33 F 1 0 0 0 0              10/7/99     12/1/03  0
13  9.1 42 M 3 0 0 0 0              1/6/00      6/2/04   0
14 11   39 M 2 0 0 0 0              6/5/00      10/3/05  0
15  9.9 59 M 2 1 1 0 0              9/4/98      11/7/04  0
16 10.4 59 M 3 1 1 0 0              10/4/91     6/9/04   0
17 12.7 38 F 1 0 0 0 0              12/3/90     4/23/04  0
18 10.5 45 M 1 0 0 0 0              11/19/91    8/24/05  0
19 13.6 42 M 2 0 1 1 0              11/13/90    3/22/05  1
20 10.3 57 M 2 1 1 1 0              6/30/87     6/21/04  0
;
run;

proc contents data=temp;   * PROC CONTENTS procedure is illustrated;
run;

proc print data=temp;      * PROC PRINT procedure is illustrated;
  format sex $sexf. race racef. dm ht stroke cancer yesnof. grade gradef.;
run;

proc means data=temp;      * PROC MEANS procedure is illustrated;
  var hb age;
run;

proc freq data=temp;       * PROC FREQ procedure is illustrated;
  tables sex race dm ht stroke cancer;
  format sex $sexf. race racef. dm ht stroke cancer yesnof. grade gradef.;
run;

* Suppose we want to create a PERMANENT SAS data file named TEMP1 in the directory c:\math\;
libname ANYNAME 'c:\math\';   * LIBNAME is introduced;
data ANYNAME.temp1;
  set temp;                   * Copies TEMP into the permanent data set TEMP1;
run;

Remarks: In the above program, we have used many SAS keywords: data, libname, mmddyy8., mmddyy10., means, var, label, format, $, set, freq, proc format, input, datalines, contents, print, tables, run, $sexf.

o data temp statement creates a temporary data set called TEMP.
o libname points to the subdirectory in which the data file needs to be stored permanently.
o $ indicates that the type of the variable is character.
o input is the keyword that defines the names of the variables in the data set.
o datalines statement signals the beginning of the lines of data.
o mmddyy8. is called an input format (informat). It tells SAS to expect the dates to be entered as 12/29/02.
o mmddyy10. is called a format. It tells SAS to print the date as 12/29/2002.
o set tells SAS to use the data file TEMP to create the new data file.
o data ANYNAME.temp1 statement creates a permanent data set called TEMP1 and stores it in the subdirectory MATH, which is a subdirectory on drive C.
o Proc stands for PROCEDURE.
o Proc contents displays the contents of the data file.
o Proc print is a SAS procedure that prints the data set for all the variables in the order in which they were read.
o Proc means computes the mean, standard deviation, minimum, and maximum for the variables listed following the keyword VAR.
o Proc freq computes a frequency tabulation for the variables listed following the keyword TABLES.
o A semicolon (;) signifies the end of a particular SAS statement.
o A single SAS data set can have up to 32,767 variables.
o Names in SAS can contain letters, numbers, and underscore characters but cannot begin with a number. Names can be in upper case, lower case, or a mixture. Names must be 32 characters or fewer in length.
o LABEL statement: each label can have up to 256 characters.
o NODATE and NONUMBER in the OPTIONS statement suppress the date and page number on the output.
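The LABEL statement and the NODATE and NONUMBER options are described above but not shown in a program. The sketch below is one way they might be used together; the data set DEMO and the variable WT_KG are hypothetical names chosen for illustration.

```sas
/* Suppress the date and page number on printed output */
options nodate nonumber;

/* DEMO and WT_KG are hypothetical names */
data demo;
  input id wt_kg;
  label wt_kg = 'Body weight (kg)';   /* a label may be up to 256 characters */
  datalines;
1 70.5
2 82.0
;
run;

/* The LABEL option asks PROC PRINT to show labels instead of variable names */
proc print data=demo label;
run;
```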

Three modes of input: LIST, COLUMN, and FORMATTED

LIST input: Data in this format are entered without regard to column location. Two restrictions are placed. First, variable values must be separated by one or more blanks. Second, there must be as many values in each line of data as the number of variables listed in the INPUT statement.

input id hb age sex $ race dm ht stroke cancer;

o The $ sign identifies the variable as a character variable.
o A missing value should be represented by a period (.).
o The default length for a character data value is 8.
o Character data values cannot contain spaces.

COLUMN input: Most external files have a regular pattern for data location. Column input is one of the two methods that can be used to read such data.

input id 1-3 hb 5-9 age 11-12 sex $ 14 race 17 dm 19 ht 21 stroke 23;

1   14.9  40 F  1 0 0 0
2   9.9   47 F  1 0 1 0
3   8.7   47 F  3 0 1 0

FORMATTED input: Like the column style, FORMATTED input requires that you know where each variable begins in the line of data and how many columns it uses. It uses what are called input formats, or informats for short. Informats are instructions that tell SAS how to read raw data.

input @1 id @5 hb @11 age @14 sex $1. @17 race @19 dm @21 ht @23 stroke
      @25 date_1trt mmddyy8. @34 date_lastcontact mmddyy8.;

1   14.9  40 F  1 0 0 0 11/11/03 12/2/03
2   9.9   47 F  1 0 1 0 12/21/03 11/11/04
3   8.7   47 F  3 0 1 0 12/9/90  4/15/91

Data Step Programming

* The following statements read only observations 20 through 40;
data file1;
  set example1 (firstobs=20 obs=40);
run;

**********************************************;

* Use of INFORMAT statement to accommodate the data;
data file2;
  informat name $14.;
  input name $ id $ gender $ gpa height weight;
run;

**********************************************;

* Use of MISSOVER or TRUNCOVER option;
data file3;
  * Run the program without the INFILE statement first;
  infile datalines missover;   ** TRUNCOVER is similar to MISSOVER;
  input name $ id $ gender $ gpa height weight;
  datalines;
victor 123 male 3.5 . 155
. 221 female 3.7 52 101
Amy 105
;
run;

* MISSOVER option says that if you reach the end of a data line and have not yet read values for all variables, set all the remaining values to missing;

**********************************************;

data file4;
  infile datalines dsd;
  input name $ id $ gender $ gpa height weight;
  datalines;
"victor","123","male",3.5,,155
,"221","female",3.7,52,101
"roger",323,,,,,
;
run;

* DSD option allows you to read comma-delimited data, to treat 2 consecutive commas as a missing value, and to remove the double quotes from quoted strings;

**********************************************;

data file5;
  infile datalines pad;
  input name $ 1-14 id $ 16-18 gender $ 23-28 gpa 31-33 height 35-36 weight 39-41;
  datalines;
victor         123    male    3.5     155
               221    female  3.7 52  101
roger          323
amy            105    female  3.0 55
;
run;

* PAD option makes sure that SAS will not try to read data from the next line;

Use of @@ in the INPUT statement to read the data

data file6;
  input day temp @@;
  datalines;
5 21 6 23 7 29 8 33 9 19 10 28 11 33 12 39 13 43 14 44
15 28 16 21 17 24 18 27 19 29 20 37 21 32 22 31 23 33 24 29
;
proc print data=file6;
  title 'Tabulation of data from FILE6';
run;

Note: The double trailing @@ in the INPUT statement holds the current data line, so that several observations can be read from a single line of data.
Working with DATEs in SAS

Once SAS knows it is reading a date, it converts the date into the number of days from January 1, 1960. Dates after January 1, 1960 are stored as positive numbers, and dates before January 1, 1960 are stored as negative numbers.

Example: 01/01/1959 is stored as -365 and 01/01/2001 is stored as 14976.

SAS uses INFORMATS for reading dates, FUNCTIONS for manipulating dates, and FORMATS for printing dates. We can use a date as a constant in a SAS expression by putting it in QUOTES and adding the letter D. Example:

studystartday = '22aug2005'D;
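To see the day counts that SAS actually stores, a short DATA _NULL_ step can write a few date constants to the log. This is only a sketch; the variable names D0, DPOS, and DNEG are made up for illustration.

```sas
data _null_;
  d0   = '01jan1960'd;   /* the reference date, stored as 0        */
  dpos = '01jan2001'd;   /* a date after the reference date        */
  dneg = '01jan1959'd;   /* a date before the reference date       */

  /* Raw stored values: day counts relative to January 1, 1960 */
  put d0= dpos= dneg=;

  /* The same values displayed through date formats */
  put dpos= mmddyy10. dneg= date9.;
run;
```

The first PUT statement should show 0, 14976, and -365; the second shows the same two dates in readable form, which is the difference between how a date is stored and how a FORMAT prints it.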

The following are various informats that can be used to read date values:

SAS informat    Example of input data
mmddyy6.        112195
ddmmyy6.        211195
yymmdd6.        951121
mmddyy8.        11211995
mmddyy8.        11/21/95
ddmmyy8.        21111995
yymmdd8.        19951121
mmddyy10.       11/21/1995
date7.          21NOV95
date9.          21NOV1995
monyy5.         NOV95
monyy7.         NOV1995

YEARCUTOFF option in SAS

Suppose we are using a two-digit year. Then we can select the 100-year range into which it falls. For example, if the two-digit years fall within the range 1905 to 2004, then the following SAS statement is needed in the program:

options yearcutoff = 1905;

The default value for YEARCUTOFF is 1920. That is, any two-digit year is taken as a four-digit year between 1920 and 2019.
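The effect of the option can be checked with a small sketch that reads two-digit years near the edges of the 100-year window. The variable names D1 and D2 are hypothetical, and the INPUT function is used here only so that no data lines are needed.

```sas
/* With this setting, two-digit years map to 1905 through 2004 */
options yearcutoff=1905;

data _null_;
  d1 = input('03/10/05', mmddyy8.);   /* 05 is the first year of the window */
  d2 = input('03/10/04', mmddyy8.);   /* 04 is the last year of the window  */
  put d1= mmddyy10. d2= mmddyy10.;
run;
```

Under YEARCUTOFF=1905 the log should show 03/10/1905 for D1 and 03/10/2004 for D2. Changing the option value shifts the whole window, so it is worth setting it explicitly whenever two-digit years are read.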

Suppose the date value is 03/10/57. SAS takes this year as 1957 since the default value for YEARCUTOFF is 1920.

Consider the following SAS program.

Purpose: The purpose of this SAS program is to read in the DATES using different INFORMATS.

data file7;
  input @1 name $10. @15 age 2. @19 sports $11.
        @31 admitday mmddyy8. @40 dxdate mmddyy6. @48 reldate date7.;
  format admitday mmddyy10. dxdate mmddyy8. reldate date9.;
  datalines;
Victor        47  Baseball    03101955 031055  10mar55
Julie         34  Tennis      01201968 012068  20jan68
Smith         42  Tennis      02221989 022289  22feb89
;
run;

Suppose we add a second line of data with two variables, GENDER and WEIGHT, a third line of data with SMOKE and DRINK, and a fourth line of data with the variable HRATE. We modify the above program as shown below. Because the values are now separated by blanks rather than placed in fixed columns, the date informats are written with the colon (:) modifier, which reads each date with list input.

data file8;
  input name $ age sports $ admitday :mmddyy8. dxdate :mmddyy6. reldate :date7.
        / gender $ weight
        / smoke $ drink $
        / hrate;
  format admitday mmddyy10. dxdate mmddyy8. reldate date9.;
  datalines;
Victor 47 Baseball 03101955 031055 10mar55
Male 180
yes no
70
Julie 34 Tennis 01201968 012068 20jan68
female 120
no no
65
;
run;

REMARKS: The slash (/) tells SAS to go to the next line of the raw data before reading the next variable. If there are 3 lines of data per observation, then there should be two /'s in the INPUT statement, three /'s if there are 4 lines of data, and so on.

data file9;
  input @1 date1 mmddyy6. @8 day 2. @11 month 2. @14 year 4.;
  format date1 mmddyy10. newdate mmddyy10.;
  newdate = mdy(month, day, year);
  age = (newdate - date1)/365.25;
  age1 = int(age);
  age2 = round(age, 0.1);
  age3 = round(age, 1);
  age4 = ('01jan98'd - date1)/365.25;
  newday = day(date1);
  newmonth = month(date1);
  newyear = year(date1);
  datalines;
102146 17 09 1990
122596 30 11 2001
122568 25 12 2001
;
proc print data=file9;
run;

Examples of reading an external data file (.DAT or .TXT types) using the LIBNAME, FILENAME and INFILE statements

LIBNAME: The LIBNAME statement points to the subdirectory from which the file can be accessed or the subdirectory in which the files are to be stored. Consider the following example:

libname MATH 'c:\MATH\';
data math.temp1;
  set temp;
run;

FILENAME: The FILENAME statement is used to link an outside file to the SAS program by giving it a SAS name. This name is like an alias, or fileref.

filename temp 'c:\MATH\textfile.txt';
data MATH.sasdatafile2;
  infile temp;
  input id dob mmddyy10. gender $ hrate sy_bp dy_bp dov mmddyy10.;
  format dob mmddyy10. dov mmddyy10.;
run;

INFILE: The INFILE statement is used in the DATA step before the INPUT statement to let SAS know that the data will be read from an external file instead of being supplied after a DATALINES statement.

data MATH.sasdatafile1;
  infile 'c:\MATH\textfile.dat';
  input id dob mmddyy10. gender $ hrate sy_bp dy_bp dov mmddyy10.;
  format dob mmddyy10. dov mmddyy10.;
run;

Adding "Value Labels" (FORMATS)

One of the things we do before analyzing the data is to code the variables.

Example 1: Suppose we have a variable PAIN that has been categorized into three groups: NO PAIN, MODERATE PAIN, and SEVERE PAIN. In order to analyze the variable PAIN, we need to code these three categories. One way of doing that is to code NO PAIN as 0, MODERATE PAIN as 1, and SEVERE PAIN as 2.

Example 2: Suppose we have a variable GENDER with two categories, MALE and FEMALE. In order to analyze the GENDER variable, we code them as 1 = MALE and 2 = FEMALE.

Similarly, we can categorize and code a variety of variables such as INCOME LEVEL, EDUCATION, and ALCOHOL CONSUMPTION.

The task of adding VALUE LABELS can be achieved by adding proc format and value statements to the SAS program. This improves the readability of the output.

o For the two variables PAIN and GENDER mentioned above, the following SAS statements will suffice to add Value Labels. These statements go before we write the DATA step.

PROC FORMAT;
  VALUE PAINF   0 = 'NO PAIN' 1 = 'MODERATE PAIN' 2 = 'SEVERE PAIN';
  VALUE GENDERF 1 = 'MALE' 2 = 'FEMALE';
RUN;

o In order for these Value Labels to be implemented, we need the following SAS statement before the DATALINES statement or after the LABEL statement:

FORMAT PAIN PAINF. GENDER GENDERF.;

Using Numerical Functions

Some of the numerical functions that help to do mathematical calculations are ROUND, LOG, LOG10, INT, MOD, MIN, MAX, MEAN, SUM, N, CEIL, FLOOR, RANNOR, RANUNI, and SQRT.

Brief Description of the Numerical Functions

o ROUND rounds the variable value to the nearest rounding unit. For example, ROUND(12.345, 0.01) = 12.35 (rounds the value of 12.345 to the nearest hundredth).
o LOG(var) gives the natural logarithm of var.
o LOG10(var) gives the common logarithm of var.
o INT(var) gives the INTEGER part of var. For example, INT(-3.4) = -3.
o MOD(var1, var2) gives the REMAINDER when the first of two arguments is divided by the second. For example, MOD(28, 5) = 3.
o CEIL(var) rounds var up to the next largest integer. For example, CEIL(8.4) = 9, CEIL(-3.2) = -3.
o FLOOR(var) rounds var down to the next smallest integer. For example, FLOOR(8.4) = 8, FLOOR(-3.2) = -4.
o RANNOR(seed) generates a normally distributed random number. A seed must be supplied; these notes recommend either 0 or a five- or seven-digit odd number.
o RANUNI(seed) generates a random number uniformly distributed between 0 and 1. A seed must be supplied; these notes recommend either 0 or a five- or seven-digit odd number.

Note: The value for SEED should be 0 in order to get a different series of random numbers each time the program is run. If we need a repeatable series of random numbers, then we need to give a 5-, 6-, or 7-digit odd number as the value of the SEED.
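The worked values quoted in the descriptions above can be checked with a short DATA _NULL_ step that writes each result to the log. This is a minimal sketch; the variable names are arbitrary and not part of the original notes.

```sas
data _null_;
  r  = round(12.345, 0.01);   /* nearest hundredth        */
  i  = int(-3.4);             /* integer part             */
  m  = mod(28, 5);            /* remainder of 28 / 5      */
  c1 = ceil(8.4);             /* next largest integer     */
  c2 = ceil(-3.2);
  f1 = floor(8.4);            /* next smallest integer    */
  f2 = floor(-3.2);
  put r= i= m= c1= c2= f1= f2=;
run;
```

The log line should reproduce the values given in the descriptions (12.35, -3, 3, 9, -3, 8, and -4). Note that INT and FLOOR differ for negative numbers: INT drops the fractional part, while FLOOR moves down to the next smallest integer.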

Random Assignment of Subjects to either a TREATMENT or CONTROL group

In any experiment, one of the most important steps is to randomly assign the subjects to either a treatment or a control group, in order to avoid bias in the allocation of subjects to these two groups. The following SAS program will accomplish the task.

proc format;
  value groupf 0 = "CONTROL" 1 = "TREATMENT";
run;

data file10;
  input id name $25.;
  group = ranuni(0);
  * Seed is 0. Therefore, every time the program is run, we get a different randomization;
  * Use 23765436 as a seed and run the program;
  datalines;
1 Smith, John
2 Amy, Burns
3 Clark, Kimberly
4 Eugene, John
5 Ana, Tran
6 Paula, Wolfgang
7 Ghosh, Arvind
8 Newton, Issac
9 Jones, Patricia
10 George, Bush
;
proc rank data=file10 groups=2 out=new;
  var group;
run;
* PRINT the data file, NEW, to show how the output looks;

The RANK procedure computes ranks for one or more numeric variables across the observations of a SAS data set and outputs the ranks to a new SAS data set. PROC RANK by itself produces no printed output. With GROUPS=2, the random values in GROUP are replaced by the group numbers 0 and 1.

proc sort data=new;
  by name;
run;
proc print data=new;
  title 'Random allocation of subjects to TREATMENT or CONTROL group';
  var id name group;
  format group groupf.;
run;

IF-THEN/ELSE and DO-END statements

Suppose we have a variable AGE whose values range from 30 to 65 years. Also, suppose that we have data on AGE for 65 subjects and no data on AGE for 7 subjects (total sample size is 72). Suppose we would like to create a new variable AGEGROUP that has 3 levels:

o Level 1 has subjects whose ages are from 30 to 45 years
o Level 2 has subjects whose ages fall between 45 and 55
o Level 3 has subjects whose ages are greater than or equal to 55

Note, however, that AGEGROUP has seven missing values. The following SAS statements will create the new variable AGEGROUP:

if age = . then agegroup = .;
else if (30 <= age <= 45) then agegroup = 1;
else if (45 < age < 55) then agegroup = 2;
else agegroup = 3;

Then, we need the following "value" and "format" statements, which should go at the appropriate places:

value agegrpf 1 = "Group 1: 30 <= AGE <= 45"
              2 = "Group 2: 45 < AGE < 55"
              3 = "Group 3: AGE >= 55 years";

format agegroup agegrpf.;

The conditions in the IF-THEN/ELSE statement can be one of the following comparisons:

Operator    Meaning
EQ (=)      Equal to
NE (^=)     Not equal to
LT (<)      Less than
GT (>)      Greater than
GE (>=)     Greater than or equal to
LE (<=)     Less than or equal to

In addition, the comparisons can be combined into a more complex condition using the Boolean operators AND, OR, and NOT.
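As a minimal sketch of combining comparisons with AND, OR, and NOT, the step below builds two indicator variables using the mnemonic operators from the table. The data set AGECHECK and the variables MIDFEMALE and FLAG are hypothetical names used only for illustration.

```sas
data agecheck;
  input id age sex $;

  /* 1 if age is between 45 and 55 (inclusive) AND the subject is female */
  midfemale = (45 le age le 55) and (sex eq 'F');

  /* 1 if age is missing OR age is NOT within the 30 to 65 range */
  flag = (age eq .) or not (30 le age le 65);

  datalines;
1 50 F
2 50 M
3 . F
4 70 M
;
run;

proc print data=agecheck;
run;
```

A true condition evaluates to 1 and a false one to 0, so MIDFEMALE should be 1 only for subject 1, and FLAG should be 1 for subjects 3 and 4. Testing for a missing value explicitly, as the AGEGROUP code above does, avoids relying on how SAS orders missing values in comparisons.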

SAS program in which IF-THEN/ELSE, WHERE, SUBSETTING IF, and DO-END statements are used

proc format;
  value yesnof   1 = "Yes" 0 = "No";
  value exptypef 1 = "Not Exposed"
                 2 = "Previously exposed"
                 3 = "Currently exposed";
run;

******************************************************************
 An example of the use of IF-THEN/ELSE and DO-END statements to
 create two DUMMY variables for a variable that has 3 levels
******************************************************************;
data file11;
  input id exp_type @@;
  label exp_type = "Type of Exposure"
        prev_exp = "Previously exposed?"
        cur_exp  = "Currently exposed?";
  format exp_type exptypef. prev_exp cur_exp yesnof.;
  if exp_type = . then do;
    cur_exp = .; prev_exp = .;
  end;
  else if exp_type = 2 then do;
    cur_exp = 0; prev_exp = 1;
  end;
  else if exp_type = 3 then do;
    cur_exp = 1; prev_exp = 0;
  end;
  else do;
    cur_exp = 0; prev_exp = 0;
  end;
  datalines;
1 1 2 1 3 1 4 1 5 1 6 . 7 1 8 1 9 1 10 1
11 2 12 2 13 2 14 . 15 2 16 2 17 . 18 2 19 2 20 2
21 3 22 . 23 . 24 3 25 3 26 3 27 3 28 3 29 3 30 3
;
* Note the use of @@ symbol;

proc print data=file11;
  title 'Tabulation of the data from FILE11 data set';
run;
proc freq data=file11;
  title 'Frequency Tabulation from FILE11 data set';
  tables exp_type prev_exp cur_exp;
run;

proc freq data=file11;
  title 'Frequency Tabulation from FILE11 based on WHERE statement';
  tables exp_type;
  where (exp_type=2 or exp_type=3);
run;

*----------------------------------------------------------------;
data dataset1;
  set file11;
  if (exp_type = 2 or exp_type = 3);
run;
proc freq data=dataset1;
  title 'Frequency Tabulation from DATASET1 data set';
  title2 'created based on SUBSETTING IF statement';
  tables exp_type;
run;

WHERE Statement

In the above program, note the use of the WHERE statement. The WHERE statement selects the observations that satisfy the logical condition placed after the keyword WHERE. In the above example, the PROC FREQ step that contains the WHERE statement is performed only on those observations for which EXP_TYPE is either 2 or 3.

Subsetting IF Statement

In the last part of the above program, we have used an IF statement to create a new SAS data set. The IF statement that is used here is called a subsetting IF statement. The SET statement is used to create a new SAS data set called DATASET1 from the original SAS data set FILE11. This new data set DATASET1 contains only those observations for which EXP_TYPE is either 2 or 3.

Two DUMMY variables created in the above program

We started out with the EXP_TYPE variable taking values 1 (not exposed), 2 (previously exposed), and 3 (currently exposed).

o The dummy variable PREV_EXP takes on a value of 1 (Yes, previously exposed) when EXP_TYPE = 2; otherwise, PREV_EXP = 0 (No, not previously exposed).
o The dummy variable CUR_EXP takes on a value of 1 (Yes, currently exposed) when EXP_TYPE = 3; otherwise, CUR_EXP = 0 (No, not currently exposed).
o Both dummy variables PREV_EXP and CUR_EXP take a value of 0 when EXP_TYPE = 1.
o Both dummy variables PREV_EXP and CUR_EXP take a value of . (period) when EXP_TYPE = . (that is, a missing value).

Methods of Getting Data into the SAS system

Suppose we have a file (datafile.txt, a space delimited file) which looks as follows in the directory c:\MATH\.

ID dbirth gender age sbp dbp admitdat
1 03/10/1955 male 70 125 80 02/03/1989
2 01/20/1968 female 65 120 80 12/23/1990
3 03/07/2001 male 71 . 130 12/05/2002

PROC IMPORT OUT=sdfile
     DATAFILE="c:\MATH\datafile.txt"
     DBMS=DLM REPLACE;
  GETNAMES=YES;
  DATAROW=2;
RUN;

Remarks:
o If the first row has no variable names, then GETNAMES = NO, and hence, DATAROW = 1.
o If the first row has variable names, then GETNAMES = YES, and hence, DATAROW = 2.
o DLM represents "delimiters other than commas or tabs".

Suppose we have a file (datafile.csv, a comma separated file) which looks as follows in the directory c:\MATH\.

ID,dbirth,gender,age,sbp,dbp,admitdat
1,03/10/1955,male,70,125,80,02/03/1989
2,01/20/1968,female,65,120,80,12/23/1990
3,03/07/2001,male,71,.,130,12/05/2002

PROC IMPORT OUT=csv_file
     DATAFILE="c:\math\datafile.csv"
     DBMS=CSV REPLACE;
  GETNAMES=YES;
  DATAROW=2;
RUN;

Suppose we have a file (datafile.txt, a TAB delimited file) which looks as follows in the directory c:\MATH\.

ID    dbirth        gender    age    sbp    dbp    admitdat
1     03/10/1955    male      70     125    80     02/03/1989
2     01/20/1968    female    65     120    80     12/23/1990
3     03/07/2001    male      71            130    12/05/2002

PROC IMPORT OUT=tab_file
     DATAFILE="c:\MATH\datafile.txt"
     DBMS=TAB REPLACE;
  GETNAMES=YES;
  DATAROW=2;
RUN;

Suppose we have an Excel file (exceldatafile.xls) which looks as follows in the directory c:\MATH\.

Id     Area    Age     Sex    Iqv_inf    Iqv_comp
101    3       1101    1      3          4
102    3       905     1      7          9
103    3       1101    1      4          9
104    2       611     1      4          6
105    1       1103    1      5          4
106    2       606     1      5          12
107    3       611     1      7          9
108    1       1500    2      3          1
109    2       702     2      13         10
110    2       703     1      7          9
111    1       1308    2      6          10
112    2       1004    2      11         14
113    2       1207    1      11         12
114    1       1201    1      6          4
115    2       1500    1      9          11
116    1       1007    1      4          6
117    2       1511    1      13         17
118    1       908     2      4          6
119    1       800     1      5          8
120    1       1101    1      8          7
121    1       709     1      10         8
122    1       1104    2      6          3
123    2       608     1      2          4
124    1       601     1      9          13
125    2       905     1      8          7

PROC IMPORT OUT=xls_sas
     DATAFILE="c:\MATH\exceldatafile.xls"
     DBMS=EXCEL2000 REPLACE;
  GETNAMES=YES;
RUN;

Working with Arrays

Suppose we have 20 variables X1, X2, X3, ..., X20 for which missing (-6), refused to respond (-7), not applicable (-8), and do not know (-9) have been coded. Suppose we want to set all of these missing value codes to . (missing value) for all these 20 variables. Using an ARRAY, the following SAS program will accomplish this task:

data file12;
  input group $ x1-x20;
  array file12[20] x1-x20;
  do i = 1 to 20;
    if (file12[i] = -8 or file12[i] = -9 or file12[i] = -7 or file12[i] = -6)
      then file12[i] = .;
    * if file12[i] in (-9, -8, -7, -6) then file12[i] = .;
  end;
  drop i;
  datalines;
A 1 2 3 4 5 6 7 8 -9 1 2 3 4 5 6 7 8 9 -6 11
B -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 0 -8 -8 -8 -8 -8 -8 -8
C 3 3 3 1 1 2 2 1 -7 3 4 5 6 7 -7 9 0 11 -8 16
;
run;

Explanation:
o The ARRAY statement defines an array by specifying the name of the array.
o The number of variables is given in braces or brackets after the array name.
o All the variables in the array must be of the same type (all numeric or all character).
o The DO loop repeats the statements between the DO and the END a fixed number of times, with the value of the index i changing at each repetition.
o The DO loop starts with the index variable i = 1 and ends when i = 20.

The same task can be written more compactly by letting SAS count the array elements:

data file12;
  input group $ x1-x20;
  array file12[*] _numeric_;
  do i = 1 to dim(file12);
    if file12[i] in (-9, -8, -7, -6) then file12[i] = .;
  end;
  drop i;
  datalines;
A 1 2 3 4 5 6 7 8 -9 1 2 3 4 5 6 7 8 9 -6 11
B -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 0 -8 -8 -8 -8 -8 -8 -8
C 3 3 3 1 1 2 2 1 -7 3 4 5 6 7 -7 9 0 11 -8 16
;
proc print data=file12;
run;

* Suppose we have 3 dates for each person and we want to extract the earliest date for each person and put it in a new variable;
data file13;
  input @1 date1 mmddyy10. @12 date2 mmddyy10. @23 date3 mmddyy10.;
  format date1 date2 date3 newdate mmddyy10.;
  array one{3} date1-date3;
  newdate = min(of date1 date2 date3);
  do i = 1 to 3;
    if one{i} = newdate then datetype = i;
  end;
  drop i;
  datalines;
08/01/2006 08/25/2006 09/06/2006
08/26/2006 08/30/2006 09/07/2006
09/05/2006 09/01/2006 09/13/2006
;
run;

Three levels of factor DOSE, two levels of factor DRUG, and two response variables

              Drug X          Drug Y
            Y1     Y2       Y1     Y2
________________________________________
DOSE1       10     21        9     14
            12     22        8     15
             9     19       11     16
            10     21        9     17
            14     23        9     17
DOSE2       11     23       11     15
            14     27       12     18
            13     24       10     16
            15     26        9     17
            14     24        9     18
DOSE3        8     17        9     22
             7     15        8     18
            10     18       10     17
             8     17        9     19
             7     19        8     19

In order to perform the factorial MANOVA analysis, we need to rearrange the data so that they look like:

DOSE      DRUG      RESPONSE1    RESPONSE2
Dose 1    Drug X    10           21
Dose 1    Drug Y    9            14
. . . etc. (a total of 30 observations)

Only the response values from the table go in as data in the SAS program. The following SAS program accomplishes the task.

data dataforFactorialMANOVA;
  input x1-x4;
  if _n_ in (1,2,3,4,5) then dose = 'Dose 1';
  else if _n_ in (6,7,8,9,10) then dose = 'Dose 2';
  else dose = 'Dose 3';
  array ex4{4} x1-x4;
  do i = 1 to 4;
    if i in (1, 2) then drug = 'Drug X';
    else drug = 'Drug Y';
    if i in (1, 3) then do;
      Response1 = ex4{i};
      Response2 = ex4{i+1};
      output;
    end;
  end;
  drop x1-x4 i;
  datalines;
10 21 9 14
12 22 8 15
9 19 11 16
10 21 9 17
14 23 9 17
11 23 11 15
14 27 12 18
13 24 10 16
15 26 9 17
14 24 9 18
8 17 9 22
7 15 8 18
10 18 10 17
8 17 9 19
7 19 8 19
;
proc print;
run;

Introduction to RETAIN, FIRST.ID and LAST.ID statements

By default, SAS sets variables created by INPUT or assignment statements to missing at the start of each iteration of the DATA step, that is, each time a new observation is read.

Sometimes it is necessary to remember the value of a variable from the previous observation. The RETAIN statement specifies variables which will retain their values from previous observations instead of being set to missing. You can specify an initial value for retained variables by putting that value after the variable name on the RETAIN statement.

Formal Definition: The RETAIN statement in SAS causes a variable created by an INPUT statement or an assignment statement to retain its value from one iteration of the DATA step to the next.

retain visit1 - visit6;

This statement retains the values of the variables VISIT1 through VISIT6 from one iteration of the DATA step to the next.

retain visit1 - visit6 0 x y z 'yes';

The values of VISIT1 through VISIT6 are set initially to 0. The variables X, Y, and Z are each set to the character value YES.

retain visit1 - visit6 (1);

This RETAIN statement assigns the initial value 1 to the variable VISIT1, and the variables VISIT2 through VISIT6 are set to missing initially.

retain _all_;

This RETAIN statement retains the values of all variables that are defined earlier in the DATA step but not those defined afterwards.

retain x1-x4 (1 2 3 4);
retain x1-x4 (1,2,3,4);
retain x1-x4 (1:4);

All of these statements assign initial values of 1 through 4 to X1 through X4.
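A running total is perhaps the simplest use of RETAIN. The sketch below is not from the original notes; the data set RUNNING and the variables DAY, SALES, and TOTAL are hypothetical names chosen for illustration.

```sas
data running;
  input day sales;
  retain total 0;          /* keep TOTAL across iterations, starting at 0 */
  total = total + sales;   /* add this observation to the retained total  */
  datalines;
1 10
2 15
3 5
;
run;

proc print data=running;
run;
```

TOTAL should come out as 10, 25, and 30 on the three observations. Without the RETAIN statement, TOTAL would be reset to missing at the start of each iteration, and every sum would be missing.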

Some uses of R-!AI, statementB +5TA.% statement can "e used to, among others, count items, to compare or copy data from an earlier record to other records, etc. Example o" sing FIRST8ID5 %AST8ID5 and RETAIN statements. TAS02
data

6omputation of the average value for each person using /.+ST..D CAST..D, and +5TA.% statements

ra!es; in&ut i! score )); !atalines; + 9. + ,> + +.. < ,. < 9. < ,= ; ?. ; +.. ; 9< ; proc sort !ata= ra!es; 'y i!; run; data ra!es; set ra!es;

retain sumscore count; 'y i!; if first-i! then !o; sumscore=0; count=0; en!; sumscore=sumscoreUscore; count=countU1; meanscore=sumscore:count; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@; @ 4un this &ro ram #ithout the follo#in t#o statements an! !iscuss the out&ut- 6henI run a ain inclu!in these statements an! !iscuss the out&utOhich out&ut #oul! you #ant to (ee&555555; @if last-i! then out&ut; @ (ee& i! sumscore count meanscore; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@; run; proc print !ata= ra!es; run;

GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGF

@ 6he !ata set has t#o recor!s for each &erson- 6his is li(e t#o &eo&le enterin the !ata into the same !ata set @ 6his is a useful &ro ram to !o the !ou'le-entry com&arison in clinical trials; @ 6as( here is to com&are the t#o recor!s for each &erson an! to verify if there is any !ifference 'et#een the t#o entries; @ Hse of 'oth A44AY an! 4A6A1N statements are illustrate!;

9B

Review of SAS
proc format; value cf 0 = /NO MA6C%/; data tem&; in&ut i! h' a e race !m ht stro(e cancer )24 !ate*+trt mm!!yy,- )33 !ate*lastcontact mm!!yy,- ra!e; format !ate*+trt !ate*lastcontact mm!!yy+.-; !atalines; + +8-9 8. + +;-9 8. < 9-9 >= < 9-9 8= ; ,-= 8= ; =-= 8= ; run; + + + + ; < . . . . . . . + + + + + . . + . . . . . . . . . ++:++:.; ++:++:.< +<:<+:.; +<:<+:.; +<:9:9. +<:9:9; +<:<:.; +<:<:.; ++:++:.? ++:++:.8 8:+>:9+ 8:+>:9+ + . + + + +

data temp1;
   set temp;
   retain f1-f10;
   retain s1-s10;
   retain c1-c10;
   array c[10] c1-c10;
   array f[10] f1-f10;
   array s[10] s1-s10;
   array v[10] hb -- grade;
   by id;
   do i = 1 to 10;
      if first.id then f[i]=v[i];
      else s[i] = v[i];
      if f[i]=s[i] then c[i]=1;
      else c[i]=0;
   end;
   drop f1-f10 s1-s10;
   if last.id then output;
run;

data temp2 (keep= id c1 c2 c3 c4 c5 c6 c7 c8 c9 c10);
   set temp1;
run;

data temp3;
   set temp2;
   rename c1 = hb c2 = age c3 = race c4 = dm c5 = ht c6 = stroke c7 = cancer
          c8 = date_1trt c9 = date_lastcontact c10 = grade;
run;

proc print data=temp3;
   format hb -- grade cf.;
run;

Combining Datasets
Introduction to Data Relationships
Relationships among multiple sources of input data exist when the sources each contain common data. Once the data relationship exists, it falls into one of four groups:

A. One-to-one

In a one-to-one relationship, a single observation in data set A is related to a single observation in another data set B based on the values of one or more selected variables (such as ID in the following example).

B. One-to-many and Many-to-one

A one-to-many or many-to-one relationship between two input data sets implies that one data set has at most one observation with a specific value of the selected variable, but the other input data set may have more than one occurrence of each value. In the following tabular display, the observations in Data set A, Data set B, and Data set C are related by common values of the variable ID. A one-to-many relationship exists between data set A and data set B, whereas a many-to-one relationship exists between data set B and data set C.
Data set A
ID   SS
1    123-34-4567
2    234-45-5654
3    592-14-6524
4    813-90-5089

Data set B
ID   Visit   Chol
1    1       215
2    1       243
2    2       222
3    1       220
3    2       212
3    3       180
4    1       250
5    1       280

Data set C
ID   Dx_age
1    45
2    52
3    56
4    61
5    40

C. Many-to-many

In a many-to-many relationship, multiple observations from each input data set may be related based on values of one or more common variables.

An Overview of Methods for Combining SAS Data Sets


Data management routinely requires combining multiple SAS data sets, adding new observations to an existing data set, and many other tasks that involve combining different SAS data sets. The following methods can be used to combine SAS data sets:
   Concatenating
   Interleaving
   One-to-one reading
   One-to-one merging
   Match merging
   Updating

Concatenation:
Concatenating combines two or more SAS data sets, one after the other, into a single data set. The number of observations in the new data set is the sum of the numbers of observations in the original data sets. You can concatenate SAS data sets by using the SET statement in a DATA step or the APPEND procedure. If the data sets that you concatenate contain the same variables, and each variable has the same attributes in all data sets, then the results of the SET statement and PROC APPEND are the same.

The following SAS statements concatenate the two data sets using the SET statement:

data new;
   set dataset1 dataset2;
run;

PROC APPEND is a SAS procedure that also concatenates two data sets. The syntax is

PROC APPEND BASE=SAS-data-set <DATA=SAS-data-set>;

where the BASE= data set identifies the original data set and the DATA= data set identifies the data set to be concatenated to the original. Angle brackets mark an optional argument.

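Interleaving:
Interleaving combines data sets that are each sorted by the common variable(s) into a single data set that is also sorted by those variable(s). The result contains the same observations as the concatenated data set, arranged in BY-variable order. Each input data set must already be sorted by the BY variable(s).

data new;
   set dataset1 dataset2;
   by id;
run;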
data dataset1;
   input ID HB AGE GENDER RACE EMPLOYED;
datalines;
1 14.9 40 0 2 1
2 9.9 47 0 2 0
11 8.7 47 0 2 1
15 16.7 37 0 1 0
5 12.5 45 0 1 1
25 8.4 40 1 2 1
8 7.4 24 1 2 1
7 9.3 60 0 1 1
9 11.1 38 1 2 0
23 10.3 24 1 2 0
;
run;

data dataset2;
   input ID HB AGE GENDER RACE EMPLOYED;
datalines;
4 11.4 46 0 2 1
22 9.1 33 0 1 1
13 9.1 42 1 2 1
18 11 39 1 3 0
3 9.9 59 1 2 1
14 10.4 59 1 2 1
17 12.7 38 0 3 1
16 10.5 45 1 1 1
19 13.6 42 1 2 1
10 10.3 57 1 2 1
21 11.7 51 0 3 1
27 9.8 56 0 2 0
28 18.2 33 1 1 1
29 11.6 47 0 2 0
30 12.1 47 1 3 1
;
run;

data new1; * CONCATENATION;
   set dataset1 dataset2;
run;
***********************************************************;
proc sort data=dataset1;
   by id;
run;
proc sort data=dataset2;
   by id;
run;
***********************************************************;
data new2; * INTERLEAVING;
   set dataset1 dataset2;
   by id;
run;
***********************************************************;

Concatenate the two data sets, DATASET1 and DATASET2, using:
   the SET statement
   PROC APPEND without the FORCE option
   PROC APPEND with the FORCE option

options nodate nonumber;
data dataset1;
   input id gender $ score;
datalines;
1 female 45
2 male 65
3 male 23
6 female 85
7 male 67
;
run;

data dataset2;
   input id age score;
datalines;
4 45 72
5 55 82
8 23 56
9 29 92
10 71 88
;
run;

************************************** RUN this piece FIRST;
* Create a dataset called NEW;
data new;
   set dataset1 dataset2;
run;
proc print data=new;
run;

*************************************** RUN this SECOND;
* Concatenation using PROC APPEND without FORCE option;
proc append base=dataset1 data=dataset2;
run;
* These statements give an error because AGE is not in the BASE data set;

*************************************** RUN this last;
* Concatenation using PROC APPEND with FORCE option;
proc append base=dataset1 data=dataset2 force;
run;
proc print data=dataset1;
run;
* Printing the BASE data set;

* These statements do concatenate because of the FORCE option but AGE is not going to be added to the BASE dataset;


One-to-one READING:
One-to-one reading combines observations from two or more data sets by creating observations that contain all of the variables from each contributing data set. Observations are combined based on their relative position in each data set, that is, the first observation in one data set with the first observation in the other, and so on. The DATA step stops after it has read the last observation from the smallest data set. Note that this uses multiple SET statements.

The SAS statements that will accomplish this task are:

data new;
   set dataset1;
   set dataset2;
run;

One-to-one MERGING
This is the same as one-to-one READING with two exceptions:
   We use the MERGE statement instead of multiple SET statements.
   The DATA step reads all observations from all data sets.

The SAS program to do one-to-one MERGING is

data new;
   merge dataset1 dataset2;
run;
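The difference between one-to-one reading and one-to-one merging is easiest to see when the inputs have different numbers of observations. The following is a minimal sketch with made-up values; the data set names first_ds, second_ds, read_1to1, and merge_1to1 are hypothetical and do not come from the handout.

```sas
/* Hypothetical input with three observations */
data first_ds;
   input id x;
datalines;
1 10
2 20
3 30
;
run;

/* Hypothetical input with two observations */
data second_ds;
   input y;
datalines;
100
200
;
run;

/* One-to-one reading: the step ends when the shorter data set runs out */
data read_1to1;
   set first_ds;
   set second_ds;
run;

/* One-to-one merging: every observation from both inputs is read */
data merge_1to1;
   merge first_ds second_ds;
run;

proc print data=read_1to1;
run;
proc print data=merge_1to1;
run;
```

Under these assumptions READ_1TO1 should contain two observations, while MERGE_1TO1 should contain three, with Y missing on the third.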

Match Merging
Match merging combines observations from two or more data sets into a single observation in a new data set based on the values of one or more common variables. The input data sets must be sorted by the BY variable(s).

data new;
   merge dataset1 dataset2;
   by id;
run;
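As an illustration of match merging with a one-to-many relationship, the sketch below re-enters Data set A and Data set B from the tabular display in the data-relationships section and merges them by ID. The data set names a, b, and ab are hypothetical, and the informat used for SS is an assumption.

```sas
/* Data set A: one observation per ID */
data a;
   input id ss :$11.;
datalines;
1 123-34-4567
2 234-45-5654
3 592-14-6524
4 813-90-5089
;
run;

/* Data set B: one observation per visit, so an ID can repeat */
data b;
   input id visit chol;
datalines;
1 1 215
2 1 243
2 2 222
3 1 220
3 2 212
3 3 180
4 1 250
5 1 280
;
run;

/* Both inputs are already in ID order, as the BY statement requires */
data ab;
   merge a b;
   by id;
run;

proc print data=ab;
run;
```

Each visit in B should pick up the SS value of the matching ID in A. ID 5 appears only in B, so its SS value is expected to be blank.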

Updating
Updating uses information from observations in a transaction data set to delete, add, or alter information in observations in a master data set. Refer to the example below. Note that MASTER and TRANSACTION are both sorted by the ID variable (NUM in the example). Updating a data set requires that the data be sorted on the common variable. We can update a master table by using the UPDATE statement. Note that UPDATE does not replace non-missing values in a MASTER data set with missing values in a TRANSACTION data set.
[Tables: the TRANSACTION data set (12 observations) and the MASTER DATABASE data set (10 observations), each with the variables num, hb, age, sex, race, db, hyp, chd, strk, and can.]

data master;
   input num hb age sex race db hyp chd strk can;
datalines;
1 14.9 40 0 0 0 0
2 47 0 2 0 0
3 8.7 47 0 2 0 1 0
4 16.7 37 0 1 0 0 0
5 12.5 45 1 0 0
6 8.4 40 1 0 0 0
7 24 1 2 0 0
8 9.3 60 0 1 0 0 0
9 11.1 38 1 2 0 1
10 10.3 24 1 2 0 0 0
;
run;
[The MASTER data lines above are incomplete: missing-value markers were lost and part of the trailing columns was displaced. Displaced values: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0]

data transaction;
   input num hb age sex race db hyp chd strk can;
datalines;
1 14.9 40 0 2 0 0
2 9.9 47 0 2 0 1 0
3 8.7 47 0 2 0 1 0
4 37 0 1 0 0
5 12.5 45 0 1 0 0 0
6 8.4 40 1 2 0 0 0
7 7.4 24 1 2 0 0 0
8 9.3 60 0 1 0 0 0
9 38 1 2 0 1 1
10 10.3 24 1 2 0 0 0
11 11.4 46 0 2 1 1 0
12 9.1 33 0 1 0 0 0
;
run;
[The TRANSACTION data lines above are incomplete in the same way. Displaced values: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0]

data updating;
   update master transaction;
   by num;
run;
proc print data=UPDATING;
   title 'Tabulation of data from UPDATING which is obtained by';
   title2 'UPDATING the MASTER data set using TRANSACTION data set';
run;
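The rule that UPDATE never overwrites a non-missing master value with a missing transaction value can be checked on a very small case. The sketch below uses made-up values and hypothetical data set names (master_demo, trans_demo, updated_demo); only the variable names num, hb, and age are borrowed from the example.

```sas
/* Hypothetical master: hb is missing for num 2 */
data master_demo;
   input num hb age;
datalines;
1 14.9 40
2 . 47
;
run;

/* Hypothetical transactions: a missing hb for num 1, a missing age for num 2,
   and a new subject, num 3 */
data trans_demo;
   input num hb age;
datalines;
1 . 41
2 9.9 .
3 11.4 46
;
run;

/* Both inputs are sorted by num, as UPDATE requires */
data updated_demo;
   update master_demo trans_demo;
   by num;
run;

proc print data=updated_demo;
run;
```

Under these assumptions, num 1 should keep hb = 14.9 and take the new age 41, num 2 should gain hb = 9.9 and keep age 47, and num 3 should be added as a new observation.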

Consider the following two datasets:

DATASET1
id   hb     age   gender   race   dm
1    14.9   40    F        1      0
2    9.9    47    F        1      0
3    8.7    47    F        3      0
4    16.7   37    F        2      0
5    12.5   45    F        1      0
6    8.4    40    M        1      0

DATASET2
id   ht   stroke   cancer   grade
1    0    0        0        1
2    1    0        0        1
6    0    0        0        1
8    0    0        1        0
3    1    0        0        1
4    0    0        0        1
10   0    0        0        1

A. Run the following SAS statements:

data d1d2_one;
   merge dataset1 dataset2;
run;

B. Run the following SAS statements:

data d1d2_one;
   merge dataset1 dataset2;
   by id;
run;

C. Run the following SAS statements:

data d1d2_one;
   merge dataset1 (in=a) dataset2;
   by id;
   if a;
run;

Note: merge dataset1 (in=a) dataset2; by id; if a; merges the data from the two data sets, DATASET1 and DATASET2, only for those values of ID that are in DATASET1.

D. Run the following SAS statements:

data d1d2_one;
   merge dataset1 dataset2 (in=a);
   by id;
   if a;
run;

Note: merge dataset1 dataset2 (in=a); by id; if a; merges the data from the two data sets, DATASET1 and DATASET2, only for those values of ID that are in DATASET2.

E. Run the following SAS statements:

data d1d2_one;
   merge dataset1 (in=a) dataset2 (in=b);
   by id;
   if a and b;
run;

Note: merge dataset1 (in=a) dataset2 (in=b); by id; if a and b; merges the data from the two data sets, DATASET1 and DATASET2, only for those values of ID that are in both DATASET1 and DATASET2.
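The same IN= flags can also select the IDs that appear in one data set but not in the other. This case is not covered by exercises A to E, so the following is only a sketch with made-up values; the data set names d1, d2, and only_d1 are hypothetical. Both inputs are sorted first, because a MERGE with a BY statement expects sorted data.

```sas
/* Hypothetical inputs: IDs 2 and 3 are common to both */
data d1;
   input id hb;
datalines;
3 8.7
1 14.9
2 9.9
;
run;

data d2;
   input id ht;
datalines;
2 1
4 0
3 1
;
run;

proc sort data=d1;
   by id;
run;
proc sort data=d2;
   by id;
run;

/* Keep only the IDs found in d1 and not in d2 */
data only_d1;
   merge d1 (in=a) d2 (in=b);
   by id;
   if a and not b;
run;

proc print data=only_d1;
run;
```

With these values ONLY_D1 should contain the single observation for ID 1, with HT missing.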


Introduction to Macros
Macro Variable
The simplest way to define a macro variable is to use the %LET statement to assign the macro variable a name and a value.

Example:

%let a = 3;
%let b = 4;
%let c = -2;
%let d = a quadratic equation;

data temp;
   input x @@;
   y = &a*x*x + &b*x + &c;
datalines;
3 2 1 5 3 6 2 3 4 3
;
proc print data=temp;
   title "Y is the value of &d";
run;

Note:
   Use double quotes in the TITLE statement so that the macro variable reference is resolved.
   Use & (ampersand) to let the system know that we are referring to a macro variable.

This is a simple example of replacing text strings and of computation using macro variables.
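The note about double quotes can be verified directly. The sketch below is an illustration only: it assumes the SASHELP.CLASS sample data set that ships with most SAS installations is available, and it uses %PUT to write the resolved value to the log.

```sas
%let d = a quadratic equation;

/* Write the resolved value of the macro variable to the SAS log */
%put The value of d is: &d;

title  "Y is the value of &d";   /* double quotes: &d is resolved */
title2 'Y is the value of &d';   /* single quotes: &d stays as literal text */

/* SASHELP.CLASS is assumed to exist; any small data set would do */
proc print data=sashelp.class (obs=3);
run;

title;   /* clear the titles */
```

The first title line should show the substituted text, while the second should print the characters &d unchanged.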

Basic Components of a Macro

Each macro starts with

%MACRO macro-name;

where macro-name is any name subject to the standard SAS naming convention. Each macro must end with

%MEND macro-name;

Note that the macro-name specified in the %MEND statement must match the macro-name specified in the %MACRO statement. The SAS statements that go between %MACRO and %MEND are referred to as a macro definition. To call a macro, we need the following statement:

%macro-name;
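Putting the three pieces together (%MACRO, %MEND, and the call), a minimal macro with two positional parameters might look like the sketch below. The macro name print_first and its parameters ds and n are hypothetical, and the call assumes the SASHELP.CLASS sample data set is available.

```sas
/* Macro definition: everything between %MACRO and %MEND */
%macro print_first(ds, n);
   proc print data=&ds (obs=&n);
      title "First &n observations of &ds";
   run;
%mend print_first;

/* Macro call: the parameter values are substituted as text */
%print_first(sashelp.class, 5);
```

Inside the definition, &ds and &n are replaced by the text supplied in the call before the generated PROC PRINT step runs.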
A study was conducted to examine gender and racial differences among individuals 65 years of age and older who suffered a hip fracture. The data are given below:

Age Group       65-74     75-84     85-94     95+
White Men       36,473    62,513    40,975    4088
Black Men       2295      2902      1659      208
White Women     103,105   233,047   189,459   18,247
Black Women     3425      6819      5968      934

Draw a plot of number of hip fractures versus age group for each combination of RACE and GENDER. Following is the SAS program:

options nodate nonumber;
proc format;
   value agegrpf 1='65 - 74'
                 2='75 - 84'
                 3='85 - 94'
                 4='95+';
   value $racegrp 'wm'='WHITE MAN'
                  'bm'='BLACK MAN'
                  'ww'='WHITE WOMAN'
                  'bw'='BLACK WOMAN';

data macro2;
   label agegroup="Age group of patients"
         group="Combination of Race and Gender"
         number="No. of hip fractures";
   input agegroup group $ number @@;
datalines;
1 wm 36473 1 bm 2295 1 ww 103105 1 bw 3425
2 wm 62513 2 bm 2902 2 ww 233047 2 bw 6819
3 wm 40975 3 bm 1659 3 ww 189459 3 bw 5968
4 wm 4088 4 bm 208 4 ww 18247 4 bw 934
;
proc sort;
   by group;
run;

%macro plot (x, y, z);
   proc plot data=macro2;
      title "Plot of Number of Hip Fractures Vs. AGEGROUP";
      title2 "for the";
      plot &x*&y=&x;
      by &z;
      format agegroup agegrpf. group $racegrp.;
   run;
%mend plot;

%plot(number, agegroup, group);

Replace the macro definition and the macro call above with the following PROC PLOT step. We get the same answer.

proc plot data=macro2;
   title 'Plot of Number of Hip Fractures Vs. AGEGROUP';
   title2 'for the';
   plot number*agegroup = number;
   by group;
   format agegroup agegrpf. group $racegrp.;
run;

Regression Example using macros:

proc format;
   value toxemiaf 0='Absence of Toxemia during pregnancy'
                  1='Presence of Toxemia during pregnancy';
data temp;
   input subj_id headcirc length gestage birthwt momage toxemia;
   label headcirc="Head circumference"
         length="Length of the infant"
         gestage="Gestational age"
         birthwt="Birth Weight of the infant"
         momage="Mothers age"
         toxemia="Presence(1) or Absence(0) of Toxemia";
datalines;
1 27 41 29 1360 37 0
2 29 40 31 1490 34 0
3 30 38 33 1490 32 0
4 28 38 31 1180 37 0
5 29 38 30 1200 29 1
6 23 32 25 680 19 0
7 22 33 27 620 20 1
8 26 38 29 1060 25 0
9 27 30 28 1320 27 0
10 25 34 29 830 32 1
11 23 32 26 880 26 0
12 26 39 30 1130 29 0
13 27 38 29 1140 24 0
14 27 39 29 1350 26 0
15 26 37 29 950 25 0
16 27 39 29 1220 25 0
17 26 38 29 980 28 0
18 29 42 33 1480 30 0
19 28 39 33 1250 31 1
20 27 38 29 1250 37 0
21 25 36 28 920 31 0
22 25 38 30 1020 21 0
23 24 34 27 750 21 0
24 31 42 33 1480 30 0
25 28 37 32 1140 23 0
26 23 32 28 670 33 1
27 26 36 29 1150 18 0
28 25 34 28 1030 20 0
29 23 33 29 560 29 0
30 28 37 30 1260 26 0
31 35 36 31 900 23 0
32 24 31 30 620 33 0
33 33 39 31 1440 26 0
34 28 39 29 1350 38 0
35 26 37 27 1170 23 0
36 26 36 27 1170 23 0
37 26 37 27 1170 26 0
38 28 40 32 1420 27 0
39 29 43 31 1475 27 0
40 28 37 28 1200 19 0
41 27 36 30 970 30 1
42 27 38 29 1190 27 0
43 28 39 28 1350 31 0
44 28 41 31 1460 29 0
45 22 32 27 570 23 1
46 22 32 25 620 35 0
47 28 40 30 1200 27 0
48 24 32 28 660 35 0
49 28 40 28 1280 28 0
50 23 33 25 830 21 0
51 22 31 23 680 26 0
52 23 32 27 640 21 0
53 28 39 28 1440 29 0
54 25 38 27 1150 32 0
55 27 35 27 850 32 0
56 24 29 26 760 24 0
57 23 20 25 620 19 0
58 21 32 23 660 41 0
59 25 35 26 985 33 0
60 22 33 24 690 17 0
61 27 38 29 1200 26 1
62 28 40 29 1370 20 0
63 26 35 27 1170 25 0
64 28 38 30 1200 30 0
65 27 39 30 1100 32 0
66 31 43 32 1480 33 0
67 30 38 33 1350 34 1
68 26 38 27 1160 23 0
69 27 38 31 1330 30 0
70 24 36 26 960 22 0
71 25 39 27 1080 35 0
72 25 36 27 1000 26 0
73 29 41 35 1490 34 1
74 25 35 28 880 18 1
75 29 41 30 1370 23 0
76 31 40 31 1325 18 1
77 29 39 30 1400 20 0
78 26 37 27 1240 29 0
79 23 31 25 660 18 0
80 23 34 25 780 25 0
81 25 37 26 950 14 0
82 25 36 29 850 39 0
83 28 40 29 1200 32 0
84 30 42 34 1440 35 1
85 26 38 30 1100 23 1
86 26 38 29 1250 18 0
87 29 42 33 1420 28 1
88 28 38 30 1400 19 0
89 27 36 29 1420 36 0
90 24 34 24 900 29 0
91 29 38 33 1160 33 1
92 23 34 25 820 39 0
93 28 41 32 1410 29 1
94 27 39 31 1300 33 1
95 26 38 31 1110 30 0
96 26 37 31 820 30 1
97 27 40 29 1150 28 0
98 28 35 32 880 35 1
99 28 41 33 1320 36 1
100 26 38 28 1080 36 0
;
run;

data temp1;
   set temp;
run;

********************************************************;
* Macro to perform a simple linear regression (with one explanatory variable);
%macro sl_reg (a, b);
   proc reg data=temp1;
      title "Simple Linear regression of &a on &b";
      model &a = &b;
      format toxemia toxemiaf.;
   run;
%mend sl_reg;
***************************;
%sl_reg(length, gestage);
%sl_reg(length, momage);
%sl_reg(birthwt, gestage);
%sl_reg(birthwt, momage);
%sl_reg(headcirc, gestage);

****************************************************************;
* Macro to perform a multiple regression (with two explanatory variables);
%macro reg_2var (a, b, c);
   proc reg data=temp1;
      title "Multiple Regression of &a on &b and &c";
      model &a = &b &c;
      format toxemia toxemiaf.;
   run;
%mend reg_2var;
*************************************;
%reg_2var(length, gestage, momage);
%reg_2var(birthwt, gestage, momage);
%reg_2var(headcirc, gestage, momage);
****************************************************************;