
CS135
Designing Functional Programs

Instructor:
Name: Sandra (Sandy) Graham
Email: sandy.graham@uwaterloo.ca
Office: MC 6423
Office hours: Tuesdays in MC 2062/2063, Thursdays in MC 4065, 1:15-2:15 PM

ISA = instructional support assistant. Drop by the Tutorial Center (MC 4065) during the scheduled hours for assistance; no appointments needed.

i-clicker
Do this before every class:
1. Hold On/Off until the power button blinks.
2. There is an i-clicker sticker on the wall that says DA. Press D and then A.
3. The Vote Status light should flash green.

Programming Language Design


Two important branches of language design:
* Imperative: frequent changes to data - Java, C++
* Functional: computation of new values rather than changing old ones - LISP, ML, Haskell, Erlang, F# - closely connected to math, easier to reason about/design programs

Scheme
Member of the LISP family of languages.
* Usually no side effects - operations do not affect other ones
* Functional language
Basic Scheme forms:
    ;; block comment
    5 ; inline comment
    6 ; atom number
    "abc" ; atom string

Stylistically, single line comments should use two semicolons; however, this is not required by the syntax.
Primary aspects of course:
* Design
* Abstraction
* Refinement of old ideas
* Syntax, expressiveness, semantics
* Communication with human and computer

Functions

In math, functions generalize similar expressions:


    f(x) = x^2 + 4*x + 2
    g(x, y) = x + y

A function consists of:
* Function name
* Parameters
* Algebraic expression of parameters
Application of function:
    f(3)
    g(5, 6)

Application supplies arguments (the values) that correspond to the parameters. In math, application is evaluated by substitution:
    f(g(5, 6)) = f(5 + 6) = f(11) = 11^2 + 4*11 + 2 = 167

Evaluation can be done in any order:


    g(g(1, 3), f(2)) = g(1 + 3, f(2)) or g(1, 3) + f(2)

The Scheme interpreter (program that evaluates Scheme code) uses a left to right, depth-first evaluation order - inside-out, left to right. Math is written in infix notation - the operator is placed between its operands. There are also notations known as prefix and postfix notation - operator before operands, and operator after operands, respectively. Scheme uses prefix notation. Prefix notation needs no order of operations because there is no ambiguity. Convert infix to prefix:
    (6 - 4) / (5 + 7)
What is the last operator to be applied?
    / (6 - 4) (5 + 7)
Repeat process.
    / - 6 4 + 5 7
This is valid prefix notation, but not valid Scheme. Since Scheme supports arbitrary numbers of operands, we need to add brackets to make it explicit.
    (/ (- 6 4) (+ 5 7))

Conversion is done by moving the last operator to be applied to the beginning of the subexpression until no infix operators remain. Operand order remains the same.
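As a quick check of the conversion above, the converted expression can be typed into DrRacket's interactions pane. This is a minimal sketch assuming the Beginning Student language; the second expression is a made-up illustration and is not from the lecture.

```racket
;; (6 - 4) / (5 + 7) in prefix form
(/ (- 6 4) (+ 5 7))   ; the exact rational 1/6

;; hypothetical second example: 2 * 3 + 4 * 5
;; the last operator to be applied is +, so it moves to the front first
(+ (* 2 3) (* 4 5))   ; 26
```

Note that the operand order is unchanged in both cases; only the operators move.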

Prefix Notation
If we treat infix operators as functions, we don't need to use parentheses to specify order of operations:
    3 - 2 ; infix notation
    -(3, 2) ; prefix notation

Convert to prefix notation:


    (((3 + 8) - (7 + 9)) / 12) ; infix notation
    / ((3 + 8) - (7 + 9)) 12
    / - (3 + 8) (7 + 9) 12
    / - + 3 8 + 7 9 12 ; prefix expression
    (/ (- (+ 3 8) (+ 7 9)) 12) ; scheme code

Scheme code needs the brackets in order to support arbitrary numbers of parameters.

DrRacket
Racket (Scheme) development environment.
DrRacket has interactions and definitions panes. Definitions are persistent and are saved on permanent storage. The interactions pane is where users interact with programs in real time, but its contents are not saved. The interactions pane is a REPL (read-eval-print loop), a way to write some code, execute it, and get results immediately.
Integers in Scheme are unbounded - they can be arbitrarily large without fear of overflow. Rational numbers in Scheme are represented and computed exactly, without any loss in precision. Scheme tries to use exact numbers whenever possible. When an exact value is not possible, such as with irrational numbers, the value is marked as inexact. An inexact value taints every computation it is used in with inexactness.
    (sqrt 2) evaluates to #i1.4142135623730951 ; #iX represents a literal inexact value
    (expt 2 100) evaluates to 1267650600228229401496703205376 ; exact
    (/ 5 12) evaluates to 5/12 ; exact
    #i1.23 ; inexact
    1.2e12 ; exact
    1.234567 ; exact
    12345 ; exact
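Exactness can also be checked from code. The following is a small sketch, assuming the Beginning Student language, that uses the built-in predicates exact? and inexact? to show how an inexact value taints a computation.

```racket
;; exact? and inexact? are built-in predicates on numbers
(exact? (/ 5 12))          ; true: rationals are stored exactly
(exact? (expt 2 100))      ; true: integers are unbounded
(inexact? (sqrt 2))        ; true: the square root of 2 is irrational
(inexact? (* 2 (sqrt 2)))  ; true: the inexact operand taints the product
```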

Common errors:
* Mismatched brackets: (+ 1 2
* Infix operators: (1 + 2)
* Runtime errors: (/ 3 (- 2 2)) (division by 0)
The stepper tool is useful for tracing execution one step at a time.
Scheme is a dynamically typed language - types do not need to be declared. Types are associated with values, but not with identifiers such as parameters or constants. This contrasts with statically typed languages such as Java, where the type is associated with identifiers and only certain values are allowed to be stored in them. Contracts are not enforced by the language since they are just comments. However, we can explicitly check for types to catch errors.

Definitions
Defining functions in math:
    f(x) = x^2

This follows the general pattern of name(formal_parameters) = body. In Scheme, this is written (define (name formal_parameters) body). For example:
    (define (sum x y) (+ x y)) is equivalent to sum(x, y) = x + y

This is called with something like the following:


    (sum 5 6) ; 5 and 6 are the arguments
define is a special form. It looks like a Scheme function, but its arguments are not necessarily evaluated, and this form may do something special that normal functions cannot. define binds a name to an expression.

A name can only be defined once - define cannot be used twice on the same identifier. However, redefinition is possible in the full Scheme language. All operators in Scheme are actually just functions: +, -, sqrt are predefined in the environment when the program starts. This means that they can be redefined, too. Evaluate (* (- 6 4) (+ 3 2)):
    (* (- 6 4) (+ 3 2))
    (* 2 (+ 3 2))
    (* 2 5)
    10

On paper:
    (* (- 6 4) (+ 3 2))
    (* 2 (+ 3 2))
    (* 2 5)
    10

Functions are applied via substitution, as in math. Every expression has exactly one value - there is no ambiguity. Functions can only return one value.
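To connect this back to the Functions section, the math functions f and g defined there can be written as Scheme definitions and applied by substitution. This is a sketch assuming the Beginning Student language, where sqr is built in; the commented steps follow the substitution idea and may not match DrRacket's stepper line for line.

```racket
;; f(x) = x^2 + 4*x + 2
(define (f x) (+ (sqr x) (* 4 x) 2))

;; g(x, y) = x + y
(define (g x y) (+ x y))

;; (f (g 5 6))
;; => (f (+ 5 6))
;; => (f 11)
;; => (+ (sqr 11) (* 4 11) 2)
;; => (+ 121 (* 4 11) 2)
;; => (+ 121 44 2)
;; => 167
(f (g 5 6))
```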

Constants
Constants do not accept parameters, and simply have a constant value:
    (define pi 3.1415926535)
    (define density (/ mass volume))

The order of definitions is not important at this point; definitions can be written in any order. Constants are a special case of definitions. Constants are only evaluated once, and are not evaluated again upon substitution.

Scope
Inner scopes override outer scopes:
    (define x 3)
    (define (f x) (* x x))
    (f 4) ; in the body of f, x is 4, since the parameter is in the inner scope and overrides x = 3 in the outer scope

Every function has its own scope. Scopes are environments where bindings exist.
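The difference between the inner and outer scope shows up when one function has a parameter named x and another does not. The sketch below assumes the Beginning Student language; the names square and add-x are hypothetical and chosen only for illustration.

```racket
(define x 3)

;; the parameter x shadows the top-level x inside the body
(define (square x) (* x x))

;; add-x has no parameter named x, so x refers to the top-level constant
(define (add-x y) (+ x y))

(square 4)  ; 16, uses the parameter
(add-x 4)   ; 7, uses the top-level x
x           ; 3, the top-level binding is unchanged
```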

17/9/13
Constants have various advantages:
* Gives meaningful names to magic numbers.
* Reduces typing and errors if values need to be changed.
* Makes programs easier to understand.
Constants are sometimes called variables, but are generally not changed.
Unevaluated code is highlighted in black. Tests try to evaluate all possible code paths and all the highlighting should disappear.
Scheme programs are sequences of definitions and expressions. Expressions are evaluated using substitution to produce values. Expressions may use special forms such as define, which may not necessarily behave in the same way as normal expressions.

The Design Recipe


Programs are acts of communication: between person and computer, between person and same person in the future, and between person and others.
    ; comments start with a semicolon and go on until the end of the line

Block comments are comments that generally go on for multiple lines. These are, by convention, written with two semicolons:
    ;; block comments
    ;; generally appear at the beginning of files
    ;; and before functions

Every function must follow the design recipe - a development process that leaves behind a written explanation of development. Design recipes result in robust and reliable functions that are easy to understand.
The five parts of the design recipe are, in order of submission:
* Contract: information for the user - function signature - argument types and descriptions, return types and descriptions.
* Purpose: description of what the function is designed to compute - what it produces or returns.
* Examples: clarification of the general use of the function and what usage of it looks like. Should represent each part of the data definition.
* Definition: the Scheme header and body of the function.
* Tests: a representative set of inputs and expected outputs showing that the function works - expected outputs must be calculated by hand or some other source.
Examples are similar to tests, but tests generally only show the function works while examples show people how to use it. There are usually more tests than examples.
Recommended order of execution:
1. Write contract.
2. Write purpose.
3. Write examples.
4. Write definition body.
5. Write tests.
Write a function that sums the squares of two numbers:
Contract:
    ;; sum-of-squares: Num Num -> Num
Purpose:
    ;; Purpose: produces the sum of squares of arg1 and arg2
Examples:
    ;; Examples:
    (check-expect (sum-of-squares 3 4) 25)
    (check-expect (sum-of-squares 0 2.5) 6.25)
Body:
    (define (sum-of-squares arg1 arg2)
      (+ (sqr arg1) (sqr arg2)))
Tests:
    (check-expect (sum-of-squares 1 2) 5)
    (check-expect (sum-of-squares 0.01 1000) 1000000.0001)
    (check-expect (sum-of-squares 50 28) 3284)
    (check-expect (sum-of-squares 1/25 65) 4225.0016)

Types used in contract (case sensitive):
* Num: any Scheme numeric value
* Int: any integer
* Nat: natural numbers
* Boolean: Boolean value
* Symbol: symbolic value
* String: string value
* Char: character value
* Any: any type of value
Tests should be written after the code body. They should be small and focused with a clear purpose.
    (check-expect (+ 1 2) 3) ; checks that a value is exactly equal to another
    (check-within (sqrt 2) 1.414 0.001) ; checks that a value is equal to another within a tolerance
    (check-error (/ 1 0) "/: division by zero") ; checks that a certain error occurs

These are special forms and are evaluated at the end. A summary of the test results is shown in the interactions window. Write a function that rounds to a given number of decimal places:
    ;; round-to: Num Int -> Num
    ;; Purpose: produces the value given rounded to a given number of decimal places
    ;; Examples:
    (check-expect (round-to 1.25 1) 1.2)
    (check-expect (round-to 23.45 -1) 20)

    (define (round-to value decimal-places)
      (/ (round (* value (expt 10 decimal-places)))
         (expt 10 decimal-places)))

    ;; Tests
    (check-expect (round-to 1.25 1) 1.2) ; round down towards even number
    (check-expect (round-to 1.35 1) 1.4) ; round up towards even number
    (check-expect (round-to 12.3 5) 12.3) ; fewer decimal places than requested
    (check-expect (round-to 12 0) 12) ; boundary condition

We can put ... as a placeholder for the function body before actually writing the body. If the contract is violated, the result may be undefined. For example, (round-to 3 0.5).

Starting with the Intermediate Student teaching language, helper functions that supplement a wrapper function need only a contract and purpose if the wrapper function obeys all of the following:
* One line of code in the body.
* Includes a function application of the helper function with modified or additional parameters.
Mutually recursive functions should be directly adjacent. They only need one set of examples and tests for all of them, but each one still needs a contract and purpose.
The tests for the wrapper function, however, must fully test the helper function as well.
Templates are useful, but are not required unless specifically requested, or for custom data types.
See "Generative Recursion and the Design Recipe" for more concerns when using the design recipe with generative recursion.

19/9/13
Boolean Values
Scheme represents Boolean values with the literals #t and #f (true and false are also usable in the Scheme teaching languages), representing true and false respectively.
The equality function (= x y) ((= Num Num) -> Boolean) tests whether two numbers are equal and results in a Boolean value. (< x y) and (>= x y) behave similarly.
Predicates are expressions that result in Boolean values. They are, by convention, given names that end with a question mark. For example, (even? x) is clearly a predicate.
The most common Boolean operators are (and x y ...), (or x y ...), and (not x). They represent $x \land y$, $x \lor y$, and $\lnot x$, respectively.
Scheme has no inequality (not-equals) operator. However, it can be implemented as follows: (not (= x y)).
Scheme uses short circuit evaluation. For the and and or forms, if the result of the expression is known before the evaluation is complete, the rest is not evaluated:
* If or has a true argument, it knows that the result must be true regardless of other arguments - (or #t (/ 1 0)) will not give an error, since the division is never evaluated.
* If and has a false argument, it knows that the result must be false regardless of other arguments - (and #f (/ 1 0)) will not give an error, since the division is never evaluated.
This is made possible by and and or being special forms.
Many types have an equality predicate, like symbol=? and string=?, which should be used whenever possible. However, if the types of the operands are not known beforehand, (equal? x y) can be used to check that they are compatible types and that they have the same value. This does not work with inexact numbers.
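Short circuit evaluation is most useful as a guard in front of an operation that could fail. The sketch below assumes the Beginning Student language; not-equal? and divides? are hypothetical helper names, not built-in functions.

```racket
;; not-equal?: Num Num -> Boolean
;; Purpose: produces true when x and y are different numbers
(define (not-equal? x y)
  (not (= x y)))

;; divides?: Int Int -> Boolean
;; Purpose: produces true when d is a non-zero divisor of n
;; The zero check comes first, so remainder is never applied with d = 0.
(define (divides? d n)
  (and (not (= d 0))
       (= (remainder n d) 0)))

(check-expect (not-equal? 1 2) true)
(check-expect (not-equal? 3 3) false)
(check-expect (divides? 3 12) true)
(check-expect (divides? 5 12) false)
(check-expect (divides? 0 12) false) ; no division-by-zero error
```

If the two arguments of and in divides? were swapped, (divides? 0 12) would raise an error instead of producing false.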

Strings
Strings are denoted by double quotes: "CS135", "abc", "".
The length of a string is determined with (string-length x).
We determine if a value is a string with the predicate function (string? x).
We concatenate strings using (string-append x y ...).
String comparisons are done based on ASCII values.
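A few interactions illustrate the string functions above. This is a sketch assuming the Beginning Student language; string<? is the built-in ordering predicate and is used here only to show the ASCII-based comparison.

```racket
(string-length "CS135")     ; 5
(string-length "")          ; 0
(string? "abc")             ; true
(string? 135)               ; false
(string-append "CS" "135")  ; "CS135"
(string<? "Apple" "apple")  ; true: uppercase letters have smaller ASCII codes
```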

Symbols
Symbols are denoted by a single quote: 'symbol. A symbol represents a particular idea. Symbols are used to define a finite set of values, each one with a name. Symbols can only be compared, not manipulated as strings can. Write a predicate function that checks if the input is a valid multiple choice answer:

    ;; valid-choice?: Any -> Boolean
    ;; Purpose: produces true when the answer is one of "A", "B", "C", "D", false otherwise.
    ;; Examples:
    (check-expect (valid-choice? 123) #f)
    (check-expect (valid-choice? "C") #t)

    (define (valid-choice? value)
      (and (string? value)
           (or (string=? value "A")
               (string=? value "B")
               (string=? value "C")
               (string=? value "D"))))

    ;; Tests
    (check-expect (valid-choice? "A") true)
    (check-expect (valid-choice? "B") true)
    (check-expect (valid-choice? "C") true)
    (check-expect (valid-choice? "D") true)
    (check-expect (valid-choice? "potato") false)
    (check-expect (valid-choice? 123) false)

Conditional Expressions
The special form cond is used to write conditional expressions in Scheme. Each argument is a question/answer pair, where the question is a Boolean expression:
    (cond [(< x 0) (- x)]
          [(>= x 0) x])

The above results in the absolute value of x. Square brackets are used by convention. Square brackets are equivalent to parentheses in the teaching languages.
cond evaluates the question in each pair from top to bottom. As soon as one is true, its associated answer is evaluated and returned. If no pair matches, a runtime error is generated.
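Wrapped in a function, the absolute value expression above can be tested on each branch. This is a sketch assuming the Beginning Student language; the name my-abs is hypothetical and is used to avoid clashing with the built-in abs.

```racket
;; my-abs: Num -> Num
;; Purpose: produces the absolute value of x
(define (my-abs x)
  (cond [(< x 0) (- x)]
        [(>= x 0) x]))

(check-expect (my-abs -3) 3)    ; first question is true
(check-expect (my-abs 0) 0)     ; boundary: second question is true
(check-expect (my-abs 2.5) 2.5) ; second question is true
```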

The last pair can use the question else to always match:
    (cond [(= 1 2) 3]
          [(= 4 5) 6]
          [else 7])

Write a program that converts a numeric grade to a letter grade:


    (define (convert-grade percentage advanced?)
      (string-append
        (cond [(>= percentage 80) "A"]
              [(>= percentage 70) "B"]
              [(>= percentage 60) "C"]
              [(>= percentage 50) "D"]
              [else "F"])
        (cond [advanced? "+"]
              [else ""])))

When testing cond expressions, test values on boundaries, and test values for each case. A cond with 4 cases might need 7 tests.

24/9/13
Simplifying Conditionals
If a question is asked, we know that all the questions before it are false. For example, we can simplify the following:

    (cond [(< grade 50) 'fail]
          [(and (< grade 60) (>= grade 50)) 'poor]
          [(>= grade 60) 'acceptable])

Into the following:


    (cond [(< grade 50) 'fail]
          [(< grade 60) 'poor]
          [else 'acceptable])

For conditional expressions, each question and answer should have one corresponding test. The tests should be simple and directly test a particular answer. More tests are appropriate at boundary points as well. In the above case, good test values would be 40, 50, 55, 60, and 70.
Every way each argument could be false needs to be exercised, and each one needs a test.
Some tests are based on the problem description - these are black-box tests. They are not based on anything in the code, such as implementation details.
Some tests are based on the code itself - these are white-box tests. They may check things like specific conditionals or Boolean expressions.
Both types of testing are important.
Helper functions generalize similar expressions, and help avoid overly complex expressions. Helper functions should use meaningful names and must follow the design recipe.
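The suggested test values can be written out as check-expect tests against the simplified conditional. This is a sketch assuming the Beginning Student language; the function name grade-status is hypothetical, since the notes show only the cond expression itself.

```racket
;; grade-status: Num -> Symbol
;; Purpose: produces 'fail, 'poor, or 'acceptable for the given grade
(define (grade-status grade)
  (cond [(< grade 50) 'fail]
        [(< grade 60) 'poor]
        [else 'acceptable]))

(check-expect (grade-status 40) 'fail)        ; inside the first case
(check-expect (grade-status 50) 'poor)        ; boundary between fail and poor
(check-expect (grade-status 55) 'poor)        ; inside the second case
(check-expect (grade-status 60) 'acceptable)  ; boundary between poor and acceptable
(check-expect (grade-status 70) 'acceptable)  ; inside the else case
```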

Syntax/Semantics
Syntax is the way we're allowed to say things. Semantics is the meaning of what we say. Ambiguity is the property of a sentence having multiple meanings. Scheme programs must have correct syntax, meaningful semantics, and be unambiguous.

Syntax
Grammars enforce syntax and avoid ambiguity. For example, an English sentence might be described as follows:
    <sentence> = <subject> <verb> <object>

The grammar is the syntactic model of the Scheme language.

Semantics
A semantic model provides a way to predict the result of running any program.
Ellipses (...) can represent omissions, indicate patterns, and more. Pattern ellipses often represent multiple arguments or parameters.
A semantic model for Scheme is based on substitution, where we step through the program one substitution at a time:
1. Find the leftmost (from beginning) expression that can have a rule applied to it. A rule can only be applied if the expression depends only on simple values. Otherwise, the non-simple values need to be simplified first.
2. Rewrite it according to the substitution rules:
   * Built-in function applications become their values.
       (f ...) => (result of evaluating f(...))
   * User defined function applications become their bodies, with arguments inserted.
       when (define (f ...) e) occurs to the left, (f ...) => (e with substitution of arguments for parameters)
   * Constants become their values.
       when (define x ...) occurs to the left, x => ...
   * Conditional expressions become an answer if a question is true, or lose a question/answer pair otherwise.
       (cond [true e] ...) => e
       (cond [false e] ...) => (cond ...)
       (cond [else e]) => e
   * And and or become short circuiting arguments, and lose non-short-circuiting arguments.
       (and false ...) => false
       (and true ...) => (and ...)
       (and) => true
       (or true ...) => true
       (or false ...) => (or ...)
       (or) => false
   * Structure constructors stay as-is, though arguments are simplified.
       (make-posn ...) => (make-posn ...)
       (make-posn 8 1) => (make-posn 8 1)
   * Structure selectors become the value of their corresponding field.
       (posn-x (make-posn 4 2)) => 4
   * Structure predicates become a Boolean representing whether the argument is an instance of the structure.
       (posn? (make-posn 1 2)) => true
       (posn? 5) => false
   * Lists stay as-is, though arguments are simplified.
       (cons 1 (cons 2 empty)) => (cons 1 (cons 2 empty))
       (list 1 2 3 4 5) => (list 1 2 3 4 5) in "Beginner Student with List Abbreviations" and above.
   * Local definitions are renamed, rebound, and hoisted. See "Local Definitions and Lexical Scope" for more details.
       (local [(define x ...) ...] ...) => (define (new name for x) ...) (body of local with x substituted with the new name for x)
       (local [(define (f ...) ...) ...] ...) => (define ((new name for f) ...) ...) (body of local with f substituted with the new name for f)
   * Anonymous functions become their bodies, with arguments inserted.
       ((lambda (x) (* x 2)) 5) => (* 5 2)
3. This is one evaluation step. Return to step 1 until the entire expression is in the simplest possible form, or results in an error.
Note that constant and function definitions are already in their simplest form. These rules may differ from those in DrRacket's stepper feature.
Evaluating a program by stepping through is called tracing. In more complex programs, condensed traces are used - traces that can skip multiple steps to show only important parts.
Trace (term (- 3 1) (+ 1 2)) given (define (term x y) (* x (sqr y))):
    (term (- 3 1) (+ 1 2))
    => (term 2 (+ 1 2))
    => (term 2 3)
    => (* 2 (sqr 3))
    => (* 2 9)
    => 18
    => (simplest form)

Trace (cond [(> 3 4) x]):
    (cond [(> 3 4) x])
    => (cond [false x])
    => (cond)
    => (error: no questions answered)
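A trace that combines the function application rule with the cond rules may help tie the substitution rules together. This is a sketch assuming the Beginning Student language; double-or-zero is a hypothetical function made up for the trace, and the steps follow the rules listed in these notes rather than DrRacket's stepper.

```racket
;; double-or-zero: Num -> Num
;; Purpose: produces twice n when n is positive, 0 otherwise
(define (double-or-zero n)
  (cond [(> n 0) (* 2 n)]
        [else 0]))

;; (double-or-zero (- 5 2))
;; => (double-or-zero 3)
;; => (cond [(> 3 0) (* 2 3)] [else 0])
;; => (cond [true (* 2 3)] [else 0])
;; => (* 2 3)
;; => 6
(double-or-zero (- 5 2))
```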

Templates
The form of a program should mirror the form of the data. A template is a general outline of code that consumes some type of data, that we can fill in to create a program. Templates must appear after data definitions and before function definitions. We start by making the template of a function, and then flesh out the template to create the finished function. For every form of data, we create a template and use it to write functions that work with that type of data. Templates should be commented out in Scheme code due to issues with MarkUs. For example, a template for a list of a datatype called X might appear as follows:

    ;; my-listof-x-fn: (listof X) -> Any
    ;; (define (my-listof-x-fn lox)
    ;;   (cond
    ;;     [(empty? lox) ...]
    ;;     [else (... (first lox) ...
    ;;                (my-listof-x-fn (rest lox)) ...)]))

The template must always produce Any since we don't know what type of data it will give. Templates only require the contract, but functions written using a template still require the full design recipe.

Structures
Structures are a bundling of several values into one. They are complex values. They work only with finite sets of values, and have a fixed size and field count.
For example, a structure might represent a product in an online store. It would store, for example, the name (String), product ID (Nat), price (Num), and availability (Boolean).
The two parts of a structure definition are the code and the data definition:
    ;; this is the code part
    (define-struct product (name product-id price availability))
    ;; this is the data definition part
    ;; A Product = (make-product String Nat Num Boolean) ; use CamelCase in data definitions
define-struct is a special form that defines a structure and a set of corresponding helper functions.

Here, Racket has made a number of functions automatically:
* make-product - the constructor creates an instance of the structure, and is named make-{x}, where {x} is the structure name.
* product-name, product-product-id, product-price, product-availability - the selector functions obtain a particular field in the structure, and are named {x}-{y}, where {x} is the structure name and {y} is a field name.
* product? - the type predicate checks if a particular value is an instance of the structure, and is named {x}?, where {x} is the structure name.

We can now work with this structure:


    (define item (make-product "Television" 412 899.99 false))
    (product? item) => true
    (product-name item) => "Television"

Structures are immutable - they cannot be changed. Once created, they remain the same. Structures can contain structures. In contracts, product structures can now be referenced as Product. For example: fake-product: String Boolean -> Product. In the Scheme teaching languages, the Posn structure is defined, and is designed to represent a 2D coordinate.
    ;; distance: Posn Posn -> Num
    ;; Purpose: produces the Euclidean distance between `p1` and `p2`
    ;; Examples:
    (check-expect (distance (make-posn 1 1) (make-posn 4 5)) 5)

    (define (distance p1 p2)
      (sqrt (+ (sqr (- (posn-x p2) (posn-x p1)))
               (sqr (- (posn-y p2) (posn-y p1))))))

In code, the structure name is lowercase. In contracts, data definitions, and a few other places, the name is written in CamelCase - each word is capitalized, and dashes are removed.

Templates
The template is written right after the data definition. A template for a function that consumes a structure selects every field in the structure, even if the function itself doesn't use all of them. When we want to write a function, we write it based on the template:

    ;; product-fn: Product -> Any
    (define (product-fn prod)
      (... (product-name prod) ...
       ... (product-product-id prod) ...
       ... (product-price prod) ...
       ... (product-availability prod)))

We use Any since we don't know what it returns yet. This needs to be reviewed later when actually writing the function. We then fill in the placeholders, ..., to create the finished function:
    (define (change-price prod price)
      (make-product (product-name prod)
                    (product-product-id prod)
                    price
                    (product-availability prod)))
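Because structures are immutable, change-price builds a new Product and leaves the original untouched. The sketch below assumes the Beginning Student language and repeats the product definition so that it runs on its own; the selector for the product-id field is written product-product-id, following the {x}-{y} naming rule, and the constant discounted and the price 799.99 are made up for illustration.

```racket
(define-struct product (name product-id price availability))
;; A Product = (make-product String Nat Num Boolean)

(define item (make-product "Television" 412 899.99 false))

;; change-price: Product Num -> Product
;; Purpose: produces a copy of prod whose price is price
(define (change-price prod price)
  (make-product (product-name prod)
                (product-product-id prod)
                price
                (product-availability prod)))

(define discounted (change-price item 799.99))

(check-expect (product-price discounted) 799.99)
(check-expect (product-name discounted) "Television")
(check-expect (product-price item) 899.99) ; the original is unchanged
```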

1/10/13
For each new structure type, we need:
* data analysis: looking at the problem, we need to determine if there is a need for a compound data type.
* data definition: describe the compound data type - what each field is, what they are used for.
* template: describe the basic structure of functions that consume this type, after the data definition.
In contracts, we can use atomic data types as well as data definition names (capitalized).
It is best to define constants for tests and examples to represent structures, in order to shorten the code.

Data definitions
Unions
    (define-struct movie-info (name director))
    ;; A MovieInfo = (make-movie-info String String)
    (define-struct mp3-info (title length))
    ;; An Mp3Info = (make-mp3-info String Num)

    ;; THIS IS A UNION TYPE
    ;; A MultimediaInfo is one of:
    ;; * a MovieInfo
    ;; * an Mp3Info

    ;; THIS IS THE TEMPLATE FOR A FUNCTION THAT CONSUMES THE UNION TYPE
    ;; my-multimedia-info-fn: MultimediaInfo -> Any
    (define (my-multimedia-info-fn info)
      (cond [(movie-info? info)
             (... (movie-info-name info) ...
              ... (movie-info-director info) ...)]
            [(mp3-info? info)
             (... (mp3-info-title info) ...
              ... (mp3-info-length info) ...)]))

Now when we write a function, we use the template as a basis:


    ;; multimedia-info-identifier: MultimediaInfo -> String
    ;; WE CAN ALSO WRITE THE CONTRACT AS
    ;; multimedia-info-identifier: (union MovieInfo Mp3Info) -> String
    (define (multimedia-info-identifier info)
      (cond [(movie-info? info) (movie-info-name info)]
            [(mp3-info? info) (mp3-info-title info)]))

In the above code, the union data type MultimediaInfo (also known as (union MovieInfo Mp3Info)) represents either a MovieInfo or an Mp3Info. Data definitions do not necessarily need to correspond to any structures in the code:

    ;; A Nat is an integer greater than or equal to zero

Above we defined the natural number, but there is no data type in Scheme that corresponds to this. It is intended for the human readers.
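Returning to the MultimediaInfo union, the following sketch shows a function on the union being applied to one value of each kind. It assumes the Beginning Student language and assumes the structure names are spelled movie-info and mp3-info; the sample titles, names, and length are made-up values.

```racket
(define-struct movie-info (name director))
;; A MovieInfo = (make-movie-info String String)
(define-struct mp3-info (title length))
;; An Mp3Info = (make-mp3-info String Num)

;; multimedia-info-identifier: MultimediaInfo -> String
;; Purpose: produces the name of a movie or the title of an mp3
(define (multimedia-info-identifier info)
  (cond [(movie-info? info) (movie-info-name info)]
        [(mp3-info? info) (mp3-info-title info)]))

;; made-up sample values, defined as constants to shorten the tests
(define sample-movie (make-movie-info "Some Movie" "Some Director"))
(define sample-mp3 (make-mp3-info "Some Song" 215))

(check-expect (multimedia-info-identifier sample-movie) "Some Movie")
(check-expect (multimedia-info-identifier sample-mp3) "Some Song")
```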

Error Checking
    (define (safe-make-posn x y)
      (cond [(and (number? x) (number? y)) (make-posn x y)]
            [else (error "numerical arguments required")]))

    ;; Tests
    (check-expect (safe-make-posn 2 3) (make-posn 2 3))
    (check-error (safe-make-posn 2 'abc) "numerical arguments required")

We generally assume inputs are valid unless explicitly required to do error checking.

Lists
A recursive definition defines something in terms of itself.
A list is a compound data type. It is recursively defined. Lists are known as "cons" types.
* A list of 5 numbers is a number followed by a list of 4 numbers.
* A list of 4 numbers is a number followed by a list of 3 numbers.
* A list of 3 numbers is a number followed by a list of 2 numbers.
* A list of 2 numbers is a number followed by a list of 1 number.
* A list of 1 number is a number followed by a list of 0 numbers.
* A list of 0 numbers is the base case and handled specially.
Lists in Scheme are similar to singly linked lists. We have access only to the first element and the rest of the list.

Basic list constructs


* empty - list of 0 elements.
* (cons element rest) (construct) - creates a list with element followed by rest.
* (first list) - obtains the first element of non-empty list list.
* (rest list) - obtains the (possibly empty) list of all the elements of non-empty list list, excluding the first.
* (empty? list) - determines whether list list is empty.
* (cons? value) - determines whether value value is a cons type (except for empty).
* (member? element list) - determines whether element is contained in list.
* (length list) - obtains the number of elements in list.
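The list constructs above can be tried on a small list. This is a sketch assuming the Beginning Student language, where member? is built in; the constant name letters is hypothetical.

```racket
(define letters (cons 'a (cons 'b (cons 'c empty))))

(first letters)       ; 'a
(rest letters)        ; (cons 'b (cons 'c empty))
(empty? letters)      ; false
(empty? empty)        ; true
(cons? letters)       ; true
(cons? empty)         ; false
(member? 'b letters)  ; true
(length letters)      ; 3
```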

List operations
    (cons 'a (cons 'b (cons 'c empty)))

This is a list of `'a`, `'b`, and `'c`, in that order.

To append lists, we cannot use `(cons list1 list2)`. This would simply create a list with the first element being `list1`, and the rest being `list2`. For list appending, we can use the built-in function `append`.
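The difference between `cons` and `append` can be checked directly. This is a small sketch assuming the Beginning Student language; `list1` and `list2` are example values chosen here.

```racket
(define list1 (cons 'a (cons 'b empty)))
(define list2 (cons 'c (cons 'd empty)))

;; cons makes list1 a single (nested) first element of the result
(check-expect (cons list1 list2)
              (cons (cons 'a (cons 'b empty)) (cons 'c (cons 'd empty))))
(check-expect (length (cons list1 list2)) 3)

;; append joins the elements of both lists
(check-expect (append list1 list2)
              (cons 'a (cons 'b (cons 'c (cons 'd empty)))))
(check-expect (length (append list1 list2)) 4)
```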

3/10/13
A list is one of:

* `empty` - the empty list.
* `(cons element list)` - a recursive list definition.

Data Definitions and Templates

For each new list type, we need:

* data analysis: looking at the problem, we need to determine if there is a need for a recursive data type.
* data definition: describe the recursive data type - what each element is, what the base cases are.
* template: describe the basic structure of functions that consume this type.

The template is written right after the data definition. It is based on the data definition and so generally appears as a `cond` expression with one question/answer pair for each possibility. Self-referential data definition clauses lead to recursion in the template, while base cases do not.

Example of a list of strings:

    ;; A ListOfStrings is either
    ;; * empty or
    ;; * (cons String ListOfStrings)

    ;; Template for ListOfStrings
    ;; my-los-fn: ListOfStrings -> Any
    (define (my-los-fn los)
      (cond [(empty? los) ...] ; base case
            [else (... (first los) ...
                   ... (my-los-fn (rest los)) ...)]))

We can write `ListOfStrings` (or alternatively, `(listof String)`) in data definitions. The `(listof X)` notation is shorter and does not require any other definitions. Here, `X` represents any type, even a list or structure. The implicit template when using `(listof X)` is as follows:

    ;; my-listof-X-fn: (listof X) -> Any
    (define (my-listof-X-fn lst)
      (cond [(empty? lst) ...]
            [else (... (first lst) ...
                   ... (my-listof-X-fn (rest lst)) ...)]))

Sometimes we need non-empty lists. A data definition could be written as `(ne-listof X)`, or using a definition like the following:

    ;; A NeListOfStrings is either
    ;; * (cons String empty) or
    ;; * (cons String NeListOfStrings)

    ;; Template for NeListOfStrings
    ;; my-ne-los-fn: NeListOfStrings -> Any
    (define (my-ne-los-fn nelos)
      (cond [(empty? (rest nelos)) ; base case
             (... (first nelos) ...)]
            [else (... (first nelos) ...
                   ... (my-ne-los-fn (rest nelos)) ...)]))

Function that makes an acronym from a list of strings:


    ;; make-acronym: ListOfStrings -> String
    ;; Purpose: produces an acronym formed by the first letter of each of the elements of `strings`.
    ;; Examples:
    (check-expect (make-acronym (cons "Kentucky" (cons "Fried" (cons "Chicken" empty)))) "KFC")

    (define (make-acronym strings)
      (cond [(empty? strings) ""]
            [else (string-append (substring (first strings) 0 1)
                                 (make-acronym (rest strings)))]))

Recursion
Recursive definitions should have a base case. This allows the recursion to eventually terminate. It should also always be possible to get closer to the base case upon each step. This may not have to happen on every call, but the recursion must eventually reach the base case. If either of these is not true, the result may be infinite recursion, where the function calls itself indefinitely.

Structural recursion, as opposed to generative recursion, is recursion guided by the data definition - the form of the code matches the form of the data definition. In other words, our functions should follow the template closely and work with the first element of the list and recurse only with the rest of the list.

Pure structural recursion requires that at every call of the recursive function, all parameters are either unchanged or one step closer to the base case. The parameters should be driving the recursion, while everything else stays unchanged.

Mutual recursion is recursion involving two or more functions that call each other recursively. It occurs when we have data definitions that refer to each other. Care must be taken to ensure that the base case is eventually reached.

Data definitions can be mutually recursive:

    ;; A NestedThing is one of:
    ;; * empty
    ;; * (listof OtherThing)

    ;; An OtherThing is one of:
    ;; * Symbol
    ;; * (list Symbol NestedThing)
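To illustrate how mutually recursive data definitions lead to mutually recursive functions, here is a minimal sketch that counts the symbols in a NestedThing. The function names `count-symbols-nt` and `count-symbols-ot` are hypothetical, and the code assumes the Beginning Student language.

```racket
;; count-symbols-nt: NestedThing -> Nat
;; Counts every symbol in `nt`, including those in nested OtherThings
(define (count-symbols-nt nt)
  (cond [(empty? nt) 0]
        [else (+ (count-symbols-ot (first nt))
                 (count-symbols-nt (rest nt)))]))

;; count-symbols-ot: OtherThing -> Nat
;; An OtherThing is either a lone symbol or (list Symbol NestedThing)
(define (count-symbols-ot ot)
  (cond [(symbol? ot) 1]
        [else (+ 1 (count-symbols-nt (second ot)))]))

(check-expect (count-symbols-nt empty) 0)
(check-expect (count-symbols-nt (list 'a (list 'b (list 'c 'd)))) 4)
```

Each function handles one data definition and calls the other exactly where that definition refers to the other type.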

Condensed Traces
A condensed trace is a way of writing traces that skips the excessive detail that would result from a full trace. Here, we skip steps to show only the most important information. It is always important to specify whether a trace is condensed or full. For example, we might do a condensed trace of a function as follows:
    (make-acronym (cons "Kentucky" (cons "Fried" (cons "Chicken" empty))))
    => (string-append "K" (make-acronym (cons "Fried" (cons "Chicken" empty))))
    => (string-append "K" (string-append "F" (make-acronym (cons "Chicken" empty))))
    => (string-append "K" (string-append "F" (string-append "C" (make-acronym empty))))
    => (string-append "K" (string-append "F" (string-append "C" "")))
    => "KFC"

This better shows the way the application of the recursive function leads to the application of that function to a smaller list, until the base case is reached. There aren't strict rules for condensed traces, since everyone might have a different idea of what is an important step. It is possible to condense more or less depending on whether it makes the trace more clear.

8/10/13
Strings are used to represent text. In Scheme, strings are actually sequences of characters.
    (string->list "abc ") -> (cons #\a (cons #\b (cons #\c (cons #\space empty))))
    (list->string (cons #\a (cons #\b (cons #\c (cons #\space empty))))) -> "abc "

Characters are denoted by `#\a`, where `a` represents the character value - in this case, a lowercase A.

    ;; replace-space: String -> String
    ;; Purpose: produces a copy of `str` where all spaces are replaced by underscores.
    ;; Examples:
    (check-expect (replace-space "") "")
    (check-expect (replace-space "CS 135") "CS_135")

    ;; THIS IS A WRAPPER FUNCTION
    ;; IT MAINLY CALLS ANOTHER FUNCTION TO DO THE ACTUAL WORK
    (define (replace-space str)
      (list->string (replace-space-list (string->list str))))

    ;; Tests:
    ;; NOT INCLUDED FOR BREVITY

    ;; replace-space-list: (listof Char) -> (listof Char)
    ;; Purpose: produces a copy of `loc` where every #\space is replaced by #\_
    ;; Examples:
    (check-expect (replace-space-list empty) empty)
    (check-expect (replace-space-list (cons #\C (cons #\S (cons #\space (cons #\1 (cons #\3 (cons #\5 empty)))))))
                  (cons #\C (cons #\S (cons #\_ (cons #\1 (cons #\3 (cons #\5 empty)))))))

    (define (replace-space-list loc)
      (cond [(empty? loc) empty]
            [else (cons (cond [(char=? (first loc) #\space) #\_]
                              [else (first loc)])
                        (replace-space-list (rest loc)))]))

    ;; Tests:
    ;; NOT INCLUDED FOR BREVITY

Nested Templates
Template for a Polygon:
    ;; A Polygon is one of:
    ;; * empty
    ;; * (cons Posn Polygon)

    (define (my-polygon-fn poly)
      (cond [(empty? poly) ...]
            [else (... (first poly) ...
                   ... (my-polygon-fn (rest poly)) ...)]))

However, we know that `(first poly)` is a Posn. So we should refer to its template:

    (define (my-polygon-fn poly)
      (cond [(empty? poly) ...]
            [else (... (my-posn-fn (first poly)) ...
                   ... (my-polygon-fn (rest poly)) ...)]))

    (define (my-posn-fn p)
      (... (posn-x p) ...
       ... (posn-y p) ...))

Alternatively, it is possible to combine the two templates:

    (define (my-polygon-fn poly)
      (cond [(empty? poly) ...]
            [else (... (... (posn-x (first poly)) ...
                        ... (posn-y (first poly)) ...) ...
                   ... (my-polygon-fn (rest poly)) ...)]))

A data definition for Nat:


    ;; A Nat is one of:
    ;; * 0
    ;; * (add1 Nat)
    ;; NATURAL NUMBERS START AT 0 IN COMPUTER SCIENCE AND LOGIC

    ;; TEMPLATE FOR NATURAL NUMBERS
    (define (my-nat-fn n)
      (cond [(<= n 0) ...] ; WE USE <= HERE INSTEAD OF zero? IN ORDER TO CATCH NEGATIVE OR FRACTIONAL INPUTS. THIS IS DEFENSIVE PROGRAMMING
            [else (... (my-nat-fn (sub1 n)) ...)]))
    ;; WE USE THE INVERSE OF THE `add1` FUNCTION TO GET THE NUMBER `x` SUCH THAT `(add1 x)` IS `n`

A natural number like 5 would therefore be representable as something like `(add1 (add1 (add1 (add1 (add1 0)))))`. This is similar to the recursive formulation of a list.

Since in each call we need to get closer to the base case, we need to invert the function, so we use `sub1` to get closer to the base case. This isn't the usual way we'd think of numbers, but writing it in the form of a data definition allows us to make good templates that consume this type of data.

Countdown example:

    ;; countdown-to: Int Int -> (listof Int)
    ;; Purpose: produces a list of integers from `start` to `end`
    ;; Examples:
    ;; NOT INCLUDED FOR BREVITY
    (define (countdown-to start end)
      (cond [(<= start end) (cons end empty)]
            [else (cons start (countdown-to (sub1 start) end))]))
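Since the examples were omitted, here is a small sketch of what they might look like, assuming the Beginning Student language. The definition is repeated so that the block runs on its own; the chosen inputs are illustrative only.

```racket
(define (countdown-to start end)
  (cond [(<= start end) (cons end empty)]
        [else (cons start (countdown-to (sub1 start) end))]))

;; Condensed trace:
;; (countdown-to 3 1)
;; => (cons 3 (countdown-to 2 1))
;; => (cons 3 (cons 2 (countdown-to 1 1)))
;; => (cons 3 (cons 2 (cons 1 empty)))
(check-expect (countdown-to 3 1) (cons 3 (cons 2 (cons 1 empty))))
(check-expect (countdown-to 5 5) (cons 5 empty))
;; When start is already below end, the base case applies immediately
(check-expect (countdown-to 0 2) (cons 2 empty))
```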

10/10/13
We can denote subsets of certain sets using subscript notation:

* Natural numbers: $\mathbb{Z}_{\geq 0}$
* Negative integers: $\mathbb{Z}_{<0}$
* Real numbers greater than 100: $\mathbb{R}_{>100}$

In data definitions, we can represent this as follows:

    ;; ascii->listof-char: Nat[<256] Nat[<256] -> (listof Char)

Here, `Nat[<256]` is equivalent to $\mathbb{N}_{<256}$, or natural numbers less than 256. Other possible uses of the square bracket notation are `Int[>=20]` and `String[<"abc"]`.

Primality test:

    ;; prime?: Nat[>0] -> Boolean
    ;; Purpose: produces true if `n` is prime and false otherwise
    ;; Examples:
    (check-expect (prime? 1) false)
    (check-expect (prime? 2) true)
    (check-expect (prime? 4) false)

    (define (prime? n)
      (not (or (= n 1)
               (has-factors? 2 n))))

    ;; Tests:
    ;; OMITTED FOR BREVITY

    ;; has-factors?: Nat[>1] Nat[>0] -> Boolean
    ;; Purpose: produces true if any numbers between `factor` and one less than `n` divide `n`, and false otherwise
    ;; Examples:
    ;; OMITTED FOR BREVITY
    (define (has-factors? factor n)
      (cond [(>= factor n) #f]
            [(zero? (remainder n factor)) #t]
            [else (has-factors? (add1 factor) n)]))

    ;; Tests:
    ;; OMITTED FOR BREVITY

Consider a basic list sorting function template:


    (define (sort list)
      (cond [(empty? list) ...]
            [else (... (first list) ... (sort (rest list)) ...)]))

Now we need to do something with the first element of the list and the (assumed sorted) rest of the list. What we can do is use `insert`, a helper function that inserts an element into a list in sorted order:

    (define (sort list)
      (cond [(empty? list) empty]
            [else (insert (first list) (sort (rest list)))]))

We simply need to assume that when we call `sort`, we will get a sorted list, and that `insert` will correctly insert the element in sorted order. We start with a template for the insertion function:

    ;; insert: Any (listof Any) -> (listof Any)
    (define (insert element list)
      (cond [(empty? list) ...]
            [else (... (first list) ... (insert element (rest list)) ...)]))

We can assume the list is in sorted order, since `insert` will only ever be called on the result of `sort`. So we just need to put the element at the beginning if that is already its proper place, or recurse to put it in the correct place in the rest of the list:

    ;; insert: Num (listof Num) -> (listof Num)
    ;; Purpose: produces a list equal to `list` except with `element` inserted in sorted order
    ;; Examples:
    (check-expect (insert 1 (list 2 3)) (list 1 2 3))
    (check-expect (insert 2 (list 1 3)) (list 1 2 3))

    (define (insert element list)
      (cond [(empty? list) (cons element empty)]
            [(<= element (first list)) (cons element list)]
            [else (cons (first list) (insert element (rest list)))]))

    ;; Tests:
    (check-expect (insert 1 (list 2 3)) (list 1 2 3))
    (check-expect (insert 2 (list 1 3)) (list 1 2 3))
    (check-expect (insert 2 empty) (list 2))

Together, this forms a sorting function based on the insertion sort algorithm.
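As a quick check that the two pieces work together, here is a self-contained sketch assuming the Beginning Student language. The functions are renamed to the hypothetical `insertion-sort` and `insert-sorted`, and the parameter to `lst`, only to avoid clashing with any built-in names.

```racket
;; insert-sorted: Num (listof Num) -> (listof Num)
;; Assumes `lst` is already sorted in non-decreasing order
(define (insert-sorted element lst)
  (cond [(empty? lst) (cons element empty)]
        [(<= element (first lst)) (cons element lst)]
        [else (cons (first lst) (insert-sorted element (rest lst)))]))

;; insertion-sort: (listof Num) -> (listof Num)
(define (insertion-sort lst)
  (cond [(empty? lst) empty]
        [else (insert-sorted (first lst) (insertion-sort (rest lst)))]))

;; Condensed trace:
;; (insertion-sort (list 3 1 2))
;; => (insert-sorted 3 (insert-sorted 1 (insert-sorted 2 empty)))
;; => (insert-sorted 3 (insert-sorted 1 (list 2)))
;; => (insert-sorted 3 (list 1 2))
;; => (list 1 2 3)
(check-expect (insertion-sort (list 3 1 2)) (list 1 2 3))
(check-expect (insertion-sort empty) empty)
(check-expect (insertion-sort (list 2 2 1)) (list 1 2 2))
```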

List Abbreviations
Lists can be written in a few ways. All of the following are equivalent:
* `(cons 3 (cons (cons 'a (cons 'b empty)) (cons "test" (cons #\a empty))))`
* `(list 3 (list 'a 'b) "test" #\a)`
* `'(3 (a b) "test" #\a)` - only available starting in "Beginning Student with List Abbreviations". In the quoted part, only symbols, numbers, strings, characters, and lists are allowed. No quotes are used inside the outer brackets. `'()` is the same as `empty`.

We use `list` for lists of fixed size, where the length of the list is known beforehand. We still need `cons` for constructing a list of variable length.

In data definitions, we can use notation like `(list String Num)` to represent a list with the first element being a string, and the second a number.

We can simulate structures with lists - each element could hold a field, and the list itself would be a collection of fields, just like a structure. This could be useful for things like type unions, where instead of writing very similar code for two different types of structures, we simply use lists for both and assume the needed fields are at the same place in both types of lists.

Beginning Student with List Abbreviations also has extra functions for working with lists:

* `(second list)` is equivalent to `(first (rest list))`. It obtains the second element of a list.
* `(third list)` is equivalent to `(first (rest (rest list)))`. It obtains the third element of a list.
* ...
* `(eighth list)` is equivalent to `(first (rest (rest (rest (rest (rest (rest (rest list))))))))`. It obtains the eighth element of a list.
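The idea of simulating a structure with a fixed-size list can be sketched as follows. The "student" record, its fields, and the accessor names are all hypothetical, and the code assumes Beginning Student with List Abbreviations.

```racket
;; A Student (hypothetical) is a (list String Num): a name followed by a grade
(define sample-student (list "Ada" 92))

;; student-name: (list String Num) -> String
(define (student-name s) (first s))

;; student-grade: (list String Num) -> Num
(define (student-grade s) (second s))

(check-expect (student-name sample-student) "Ada")
(check-expect (student-grade sample-student) 92)
;; second is shorthand for (first (rest ...))
(check-expect (second sample-student) (first (rest sample-student)))
```

Unlike a real structure, nothing stops a caller from passing a list of the wrong shape, so the contract carries the whole burden here.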

15/10/13
Dictionaries

Dictionaries are abstract data types (not a primitive type, but a commonly used pattern). They are associations of keys with values. A telephone directory is a dictionary - the names are the keys, which we use to look up phone numbers, which are the values.

Keys must be unique in a dictionary - there can be no duplicates. However, values do not need to be unique.

The most important operations on dictionaries are:

* Lookup - given a key, produce the value associated with it.
* Add - add a new key and its associated value.
* Remove - given a key, remove it and the value associated with it.

The actual implementation of the dictionary is dependent on what we want from it. For example, some implementations might have faster lookup but slower add and remove.

Association Lists
This is simply a list of key/value pairs:
    ;; An AL is one of:
    ;; * empty
    ;; * (cons (list Num String) AL)

    ;; Template for AL:
    ;; my-al-fn: AL -> Any
    (define (my-al-fn al)
      (cond [(empty? al) ...]
            [else (... (first (first al)) ...
                   ... (second (first al)) ...
                   ... (my-al-fn (rest al)) ...)]))
    ;; OR USE (listof (list Num String))

Now we can implement a few operations on this data type:

    ;; al-lookup: AL Num -> (union String false)
    ;; Purpose: produces the value associated with `key` in `al` if it exists, and false otherwise
    ;; Examples:
    ;; OMITTED FOR BREVITY
    (define (al-lookup al key)
      (cond [(empty? al) false] ; false represents the element not being found
            [(= key (first (first al))) (second (first al))]
            [else (al-lookup (rest al) key)]))

    ;; Tests:
    ;; OMITTED FOR BREVITY

    ;; al-remove: AL Num -> AL
    ;; Purpose: produces an AL equal to `al` without the key `key`
    ;; Examples:
    ;; OMITTED FOR BREVITY
    (define (al-remove al key)
      (cond [(empty? al) empty]
            [(= key (first (first al))) (al-remove (rest al) key)]
            [else (cons (first al) (al-remove (rest al) key))]))

    ;; Tests:
    ;; OMITTED FOR BREVITY
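The notes list Add as a core dictionary operation but do not show it for association lists. One possible sketch, assuming Beginning Student with List Abbreviations, is given below; the name `al-add` and the choice to replace an existing entry (so keys stay unique) are assumptions, not the course's definition.

```racket
;; al-add: AL Num String -> AL
;; Purpose: produces an AL like `al` in which `key` is associated with `value`,
;; replacing any existing entry for `key` so that keys stay unique
(define (al-add al key value)
  (cond [(empty? al) (list (list key value))]
        [(= key (first (first al))) (cons (list key value) (rest al))]
        [else (cons (first al) (al-add (rest al) key value))]))

(check-expect (al-add empty 1 "one") (list (list 1 "one")))
(check-expect (al-add (list (list 1 "one")) 2 "two")
              (list (list 1 "one") (list 2 "two")))
(check-expect (al-add (list (list 1 "one") (list 2 "two")) 1 "uno")
              (list (list 1 "uno") (list 2 "two")))
```

Like lookup and remove, this may walk the whole list before it finds the key or reaches the end.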

Processing Multiple Lists


Processing one list
Consider an appending function:
    ;; my-append: (listof Any) (listof Any) -> (listof Any)
    (define (my-append list1 list2)
      (cond [(empty? list1) list2]
            [else (cons (first list1) (my-append (rest list1) list2))]))

This uses only structural recursion - `list2` does not change between calls. Note that the run time of this function depends on the length of `list1`. If `list1` is very large, the function may need a significant amount of time to run.

Processing in lockstep
If both lists are of the same length, we can assume that the first list will be empty if and only if the second is.

Consider a dot product function:


    ;; dot-product: (listof Num) (listof Num) -> Num
    (define (dot-product vec1 vec2)
      (cond [(empty? vec1) 0]
            [else (+ (* (first vec1) (first vec2))
                     (dot-product (rest vec1) (rest vec2)))]))

Processing at different rates


There are four possible cases to consider if the two lists are of differing lengths:

* Both are empty.
* The first is empty, but the second isn't.
* The second is empty, but the first isn't.
* Both are non-empty.

This is reflected in the template:

    ;; my-double-list-fn: (listof Any) (listof Any) -> Any
    (define (my-double-list-fn list1 list2)
      (cond [(and (empty? list1) (empty? list2)) ...]
            [(and (empty? list1) (cons? list2)) ...]
            [(and (cons? list1) (empty? list2)) ...]
            [else ...]))
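As an illustration of the four-case template, here is a minimal sketch that merges two sorted lists of numbers, assuming Beginning Student with List Abbreviations. The function `merge-sorted` is a hypothetical example and does not appear in the notes; the fourth case is split in two depending on which first element is smaller.

```racket
;; merge-sorted: (listof Num) (listof Num) -> (listof Num)
;; Purpose: produces one sorted list containing all elements of the
;; sorted lists `lon1` and `lon2`
(define (merge-sorted lon1 lon2)
  (cond [(and (empty? lon1) (empty? lon2)) empty]
        [(and (empty? lon1) (cons? lon2)) lon2]
        [(and (cons? lon1) (empty? lon2)) lon1]
        [(< (first lon1) (first lon2))
         (cons (first lon1) (merge-sorted (rest lon1) lon2))]
        [else (cons (first lon2) (merge-sorted lon1 (rest lon2)))]))

(check-expect (merge-sorted empty empty) empty)
(check-expect (merge-sorted empty (list 1 2)) (list 1 2))
(check-expect (merge-sorted (list 1 2) empty) (list 1 2))
(check-expect (merge-sorted (list 1 4 6) (list 2 3 7)) (list 1 2 3 4 6 7))
```

Only one of the two lists shrinks on each recursive call, which is what "processing at different rates" means here.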

Consider an element count test function:


    ;; minimum-occurrences?: (listof Any) Any Nat -> Boolean
    ;; Purpose: produces true if `value` appears in `list` at least `count` times, and false otherwise
    ;; Examples:
    ;; OMITTED FOR BREVITY
    (define (minimum-occurrences? list value count)
      (cond [(<= count 0) true]
            [(empty? list) false]
            [(equal? value (first list))
             (minimum-occurrences? (rest list) value (sub1 count))]
            [else (minimum-occurrences? (rest list) value count)]))

    ;; Tests:
    ;; OMITTED FOR BREVITY

Consider a list comparison function:


    ;; list=?: (listof Any) (listof Any) -> Boolean
    ;; Purpose: produces true if `list1` and `list2` are equal, and false otherwise
    ;; Examples:
    ;; OMITTED FOR BREVITY
    (define (list=? list1 list2)
      (cond [(and (empty? list1) (empty? list2)) #t]
            [else (and (cons? list1) (cons? list2)
                       (equal? (first list1) (first list2))
                       (list=? (rest list1) (rest list2)))]))

    ;; Tests:
    ;; OMITTED FOR BREVITY

17/10/13
Types of Recursion
Pure structural recursion
Structural recursion is based on a recursive data definition - it is driven by and follows the form of the data definition. On each call, all parameters must be either unchanged, or one step closer to a base case according to the data definition. However this can have disadvantages. Consider a function finding the maximum element of a list, written in pure structural recursion style:

    ;; list-max: (ne-listof Num) -> Num
    (define (list-max list)
      (cond [(empty? (rest list)) (first list)]
            [(>= (first list) (list-max (rest list))) (first list)]
            [else (list-max (rest list))]))

In the worst case - a strictly increasing list - the function will call itself twice for each step, which means it takes exponential time based on the length of the list.
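The blow-up comes from evaluating the same recursive call twice. As a minimal sketch (not from the lecture), one way to stay purely structural while recursing only once is to hand the single recursive result to the built-in `max`. The name `list-max-once` is hypothetical, and the code assumes Beginning Student with List Abbreviations.

```racket
;; list-max-once: (ne-listof Num) -> Num
;; Recurses exactly once per call, so the number of calls grows
;; linearly with the length of the list
(define (list-max-once lst)
  (cond [(empty? (rest lst)) (first lst)]
        [else (max (first lst) (list-max-once (rest lst)))]))

(check-expect (list-max-once (list 7)) 7)
(check-expect (list-max-once (list 1 2 3 4)) 4)
(check-expect (list-max-once (list 3 9 2)) 9)
```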

Accumulative recursion/Structural recursion with an accumulator


This is similar to pure structural recursion, but it can also have parameters with partial answers. Consider a function finding the maximum element of a list, written in accumulative recursion style:
    ;; list-max-helper: (listof Num) Num -> Num
    (define (list-max-helper list partial-max)
      (cond [(empty? list) partial-max]
            [(>= (first list) partial-max)
             (list-max-helper (rest list) (first list))]
            [else
             (list-max-helper (rest list) partial-max)]))

    ;; list-max: (ne-listof Num) -> Num
    (define (list-max list)
      (list-max-helper (rest list) (first list)))

Here, we recurse at most once per call. The extra parameter allows us to move extra data downwards through the calls so we don't need to restructure it to move data upwards only.

We can use as many extra parameters as needed. The key is that we make extra data available to the callee. Generally, the accumulatively recursive function needs a wrapper function to start it off with initial values for the extra parameters.

Consider a function that reverses a list:

    ;; list-reverse: (listof Any) (listof Any) -> (listof Any)
    (define (list-reverse list current)
      (cond [(empty? list) current]
            [else (list-reverse (rest list) (cons (first list) current))]))

Note that `reverse` is actually a built-in function that has the same functionality, though it doesn't require the second parameter. We use the function as follows:

    (list-reverse '(a b c) empty) -> '(c b a)
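A wrapper can hide the accumulator from callers. The sketch below assumes Beginning Student with List Abbreviations; the names `list-reverse-acc` and `my-reverse` are hypothetical and chosen to avoid clashing with the built-in `reverse`.

```racket
;; list-reverse-acc: (listof Any) (listof Any) -> (listof Any)
;; `current` accumulates the elements seen so far, in reverse order
(define (list-reverse-acc lst current)
  (cond [(empty? lst) current]
        [else (list-reverse-acc (rest lst) (cons (first lst) current))]))

;; my-reverse: (listof Any) -> (listof Any)
;; Wrapper that supplies the initial (empty) accumulator
(define (my-reverse lst)
  (list-reverse-acc lst empty))

(check-expect (my-reverse empty) empty)
(check-expect (my-reverse (list 'a 'b 'c)) (list 'c 'b 'a))
```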

Generative/general recursion
Generative/general recursion allows us to get closer to a base case in any way we want - we can calculate the parameters freely. If there is even just one generated parameter, it is generative recursion.

Consider the GCD:

$$\gcd(n, 0) = n$$
$$\gcd(n, m) = \gcd(m, n \bmod m) \quad \text{for } m > 0$$

We do not have a data definition. Here, we use generative recursion to create a function to compute the GCD of two numbers:

    (define (gcd n m)
      (cond [(zero? m) n]
            [else (gcd m (remainder n m))]))

This is written in generatively recursive style because the arguments are generated by computation on n and m. Generative recursion is easier to get wrong, harder to debug, and harder to reason about.
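A condensed trace shows how the generated arguments shrink toward the base case. This sketch assumes the Beginning Student language and renames the function to the hypothetical `my-gcd`, since `gcd` is already a built-in there.

```racket
;; my-gcd: Nat Nat -> Nat
(define (my-gcd n m)
  (cond [(zero? m) n]
        [else (my-gcd m (remainder n m))]))

;; Condensed trace:
;; (my-gcd 48 18)
;; => (my-gcd 18 12)
;; => (my-gcd 12 6)
;; => (my-gcd 6 0)
;; => 6
(check-expect (my-gcd 48 18) 6)
(check-expect (my-gcd 7 0) 7)
```

Neither argument is "one step closer" in the sense of a data definition; termination relies on the remainder being strictly smaller than `m` each time.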

22/10/13
Trees
A tree is an abstract data type, like a dictionary. It is a recursive data structure made up of nodes:

* internal nodes refer to one or more other nodes.
* leaf nodes do not refer to any other nodes.

Nodes can also store their own value. This value is known as a label.

    ;; A Tree is one of:
    ;; * (Leaf-constructor Value)
    ;; * (Node-constructor Tree Tree)

Every node is also a tree in itself. If we look at a node and its descendants as a tree, we call it a subtree in this context.

For example, we can represent arithmetic expressions as trees. Consider (4 + 1) * (7 - (6 / 2)):

            *
           / \
          +   -
         / \ / \
        4  1 7  /
               / \
              6   2

If node A refers to node B, and node B refers to node C, and so on, until node Z, then nodes B to Z are descendants of A, and nodes A to Y are ancestors of Z. A node is its own ancestor and descendant.

If node A refers to node B, A is the parent/direct ancestor of B, and B is the child/direct descendant of A. If two nodes have the same parent, then they are siblings.

Additional constraints for trees are:

* A node cannot have a descendant that is its ancestor.
* A node can have only one parent.

The very top node is known as the root node.

Trees have various classifying properties:

* Number of children each internal node has: two or fewer (binary tree), exactly two (variant of binary tree), or any number (general tree).
* Whether all nodes have labels, or just leaf nodes.
* Whether the order of children of an internal node matters.
* Actual structure of the tree in the implementation.

So for the binary arithmetic expression above:

* Each internal node has exactly two children.
* Leaf nodes have number labels, and internal nodes have symbol labels.
* The order of children is significant.

We can use the following data definition for a binary arithmetic expression tree:

    (define-struct bae (operation arg1 arg2))
    ;; A BinExp is one of:
    ;; * Num
    ;; * (make-bae Symbol BinExp BinExp)

So the expression above would be representable as `(make-bae '* (make-bae '+ 4 1) (make-bae '- 7 (make-bae '/ 6 2)))`. Now we can write a template for this:

    ;; binexp-fn: BinExp -> Any
    (define (binexp-fn tree)
      (cond [(number? tree) ...]
            [(bae? tree) (... (bae-operation tree) ...
                              (bae-arg1 tree) ...
                              (bae-arg2 tree) ...)]))

Since we know that `(bae-arg1 tree)` and `(bae-arg2 tree)` are both of type BinExp, we can apply the BinExp processing function to them:

    ;; binexp-fn: BinExp -> Any
    (define (binexp-fn tree)
      (cond [(number? tree) ...]
            [(bae? tree) (... (bae-operation tree) ...
                              (binexp-fn (bae-arg1 tree)) ...
                              (binexp-fn (bae-arg2 tree)) ...)]))

Now we can make functions consuming BinExp values, such as an evaluator:

    (define (eval ex)
      (cond [(number? ex) ex]
            [(bae? ex)
             (cond [(symbol=? (bae-operation ex) '*)
                    (* (eval (bae-arg1 ex)) (eval (bae-arg2 ex)))]
                   [(symbol=? (bae-operation ex) '+)
                    (+ (eval (bae-arg1 ex)) (eval (bae-arg2 ex)))]
                   [(symbol=? (bae-operation ex) '/)
                    (/ (eval (bae-arg1 ex)) (eval (bae-arg2 ex)))]
                   [(symbol=? (bae-operation ex) '-)
                    (- (eval (bae-arg1 ex)) (eval (bae-arg2 ex)))])]))

Traversal
Traversal simply means going through every node of a tree. There are two broad types of traversal:

* breadth-first traversal deals with one nesting level at a time - it deals with all of an internal node's children before dealing with their children.
* depth-first traversal deals with one path at a time - it deals with a node, its children, and so on, until that node's entire subtree is processed, before moving on to the next child.

Depth-first traversal is quite natural to implement recursively. As a result, it is used quite often in this course.

We can represent traversal as a flat list of the nodes in the tree, in the order that they were traversed.

When we do traversal, there is also a question of the order in which we deal with children of an internal node and the node itself. For example, we can process the tree `(+ 1 2)` in the following ways:

1. process the +, then 1 and 2 - this is called pre-order traversal. The result would be `'+ 1 2`.
2. process 1, then +, and then 2 - this is called in-order traversal. The result would be `1 '+ 2`.
3. process 1, 2, and then + - this is called post-order traversal. The result would be `1 2 '+`.

We can implement pre-order traversal pretty simply:

    ;; traverse-binexp: BinExp -> (listof (union Symbol Num))
    (define (traverse-binexp tree)
      (cond [(number? tree) (list tree)] ; leaf node
            [(bae? tree)
             (append (list (bae-operation tree)) ; wrap the label in a list so it can be appended
                     (traverse-binexp (bae-arg1 tree))
                     (traverse-binexp (bae-arg2 tree)))]))

In a similar way, in-order and post-order traversal can be done by switching the order of the arguments to `append`.
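The two other orders might look like the following sketch, assuming Beginning Student with List Abbreviations. The names `traverse-in-order`, `traverse-post-order`, and `sample-tree` are hypothetical; the `bae` structure is repeated so the block runs on its own.

```racket
(define-struct bae (operation arg1 arg2))

;; traverse-in-order: BinExp -> (listof (union Symbol Num))
;; The operation goes between the traversals of the two arguments
(define (traverse-in-order tree)
  (cond [(number? tree) (list tree)]
        [(bae? tree)
         (append (traverse-in-order (bae-arg1 tree))
                 (list (bae-operation tree))
                 (traverse-in-order (bae-arg2 tree)))]))

;; traverse-post-order: BinExp -> (listof (union Symbol Num))
;; The operation goes after the traversals of both arguments
(define (traverse-post-order tree)
  (cond [(number? tree) (list tree)]
        [(bae? tree)
         (append (traverse-post-order (bae-arg1 tree))
                 (traverse-post-order (bae-arg2 tree))
                 (list (bae-operation tree)))]))

(define sample-tree
  (make-bae '* (make-bae '+ 4 1) (make-bae '- 7 (make-bae '/ 6 2))))

(check-expect (traverse-in-order (make-bae '+ 1 2)) (list 1 '+ 2))
(check-expect (traverse-post-order (make-bae '+ 1 2)) (list 1 2 '+))
(check-expect (traverse-in-order sample-tree) (list 4 '+ 1 '* 7 '- 6 '/ 2))
(check-expect (traverse-post-order sample-tree) (list 4 1 '+ 7 6 2 '/ '- '*))
```

Note that the in-order result loses the grouping of the original expression, since the flat list carries no brackets.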

24/10/13
Binary Search

Dictionaries were previously implemented using an association list of two-element lists. However, this had the problem that it could potentially require us to search through thte entire list to lookup a value. We could instead put the key-value pairs into a binary tree:
( d e f i n e s t r u c tn o d e( k e yv a ll e f tr i g h t ) ) ; ;Ab i n a r yt r e e( B T )i so n eo f : ; ;*e m p t y ; ;*( m a k e n o d eN u mS t r i n gB TB T )

Here, if a node has e m p t y as its left and right branches, it is a leaf node. Otherwise, it refers to other values and is an internal node. Template for a binary tree:
; ;m y b t f n :B T>A n y ( d e f i n e( m y b t f nt r e e ) ( c o n d[ ( e m p t y ?t r e e ). . . ] [ e l s e( . . .( n o d e k e yt r e e ). . . ( n o d e v a lt r e e ). . . ( m y b t f n( n o d e l e f tt r e e ) ). . . ( m y b t f n( n o d e r i g h tt r e e ) ). . . ) ] ) )

Consider a function that counts the number of nodes equal to a certain value in a tree:
; ;c o u n t b t e q u a l :B TA n y>A n y ; ;P u r p o s e :r e t u r n st h en u m b e ro fn o d e si n` t r e e `e q u a lt o` v a l u e ` ; ;E x a m p l e s : ; ;O M I T T E DF O RB R E V I T Y ( d e f i n e( c o u n t b t e q u a lt r e ev a l u e ) ( c o n d[ ( e m p t y ?t r e e ). . . ] [ e l s e( +( c o n d[ ( e q u a l ?( n o d e v a lt r e e )v a l u e )1 ] [ e l s e0 ] ) ( c o u n t b t e q u a l( n o d e l e f tt r e e ) ) ( c o u n t b t e q u a l( n o d e r i g h tt r e e ) ) ) ] ) ) ; ;T e s t s : ; ;O M I T T E DF O RB R E V I T Y

We can search through this type of tree pretty easily - if not found or empty, search through the left and right subtrees recursively. However, this is no more efficient than an association list - we could still potentially search through the whole thing in order to lookup a value. Draw the tree ( m a k e n o d e5' a( m a k e n o d e1' be m p t ye m p t y )( m a k e n o d e6' ce m p t y( m a k e n o d e1 4' de m p t ye m p t y ) ) ):
  5
 / \
1   6
     \
      14

The diagram does not show the `val` field - only keys matter here.

Ordering property
We can add a constraint that makes this much more efficient:
(define-struct node (key val left right))
;; A binary search tree (BST) is one of:
;; * empty
;; * (make-node Num String BST BST)
;; And satisfies the ordering property:
;; * every key in `left` is less than `key`
;; * every key in `right` is greater than `key`

The ordering property allows us to make the following assumptions:
* If a key is less than a given node's key, it is not in the right subtree.
* If a key is greater than a given node's key, it is not in the left subtree.

This is very useful for operations like searching and insertion.

Searching

Searching is made more efficient because we can use these assumptions to get a faster algorithm:
* If the tree is empty, the search key does not exist.
* Otherwise, we know we have a node. If the search key is equal to the node's key, we found the node.
* If the search key is less than the node's key, then we only need to search in the left subtree.
* If the search key is greater than the node's key, then we only need to search in the right subtree.

Basically, we avoid one of the two recursive calls each time - so we only need to make as many recursive calls as the tree is deep. If a tree is nicely balanced (internal nodes have both subtrees non-empty as much as possible), we can do a search in only about log_2(n) calls, where n is the number of nodes. Otherwise, degenerate trees, such as one where every internal node has an empty left or right subtree, are no more efficient than an association list. This can be implemented as follows:
;; search-bst: Num BST -> (union Any false)
;; Purpose: produces the value associated with `key` in `tree`
;; Examples:
;; OMITTED FOR BREVITY
(define (search-bst key tree)
  (cond [(empty? tree) false]
        [(= key (node-key tree)) (node-val tree)]
        [(< key (node-key tree)) (search-bst key (node-left tree))]
        [(> key (node-key tree)) (search-bst key (node-right tree))]))
;; Tests:
;; OMITTED FOR BREVITY
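As a quick check of the search above, the following sketch builds the example tree drawn earlier (which happens to satisfy the ordering property) and looks up a few keys. It is written for `#lang racket` rather than a teaching language so that it can run standalone, which is an assumption of this sketch; `sample-tree` is a name introduced here for illustration only.

```racket
#lang racket
;; Minimal sketch. #:transparent is an assumption that mimics how
;; teaching-language structures print and compare.
(define-struct node (key val left right) #:transparent)

;; search-bst: Num BST -> (union Any false)
(define (search-bst key tree)
  (cond [(empty? tree) false]
        [(= key (node-key tree)) (node-val tree)]
        [(< key (node-key tree)) (search-bst key (node-left tree))]
        [else (search-bst key (node-right tree))]))

;; sample-tree (hypothetical name): the tree with keys 5, 1, 6, 14
(define sample-tree
  (make-node 5 'a
             (make-node 1 'b empty empty)
             (make-node 6 'c empty (make-node 14 'd empty empty))))

(search-bst 14 sample-tree) ; follows 5 -> 6 -> 14 and produces 'd
(search-bst 7 sample-tree)  ; 7 is absent, so this produces #f
```

Only the nodes on one root-to-leaf path are visited, which is where the savings over an association list come from.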

Adding
We can add an element to a binary search tree in a similar fashion:
* If the tree is empty, we can simply produce the new node.
* If the new node's key is equal to the node's key, we have a duplicate. We often handle this by replacing the node's value with the new node's value.
* If the new node's key is less than the node's key, we try to insert recursively in the left subtree of the node.
* If the new node's key is greater than the node's key, we try to insert recursively in the right subtree of the node.

This can be implemented as follows:

;; insert-bst: Num Any BST -> BST
;; Purpose: produces `tree` with `key` associated with `value`
;; Examples:
;; OMITTED FOR BREVITY
(define (insert-bst key value tree)
  (cond [(empty? tree) (make-node key value empty empty)]
        [(= key (node-key tree))
         (make-node key value (node-left tree) (node-right tree))]
        [(< key (node-key tree))
         (make-node (node-key tree) (node-val tree)
                    (insert-bst key value (node-left tree))
                    (node-right tree))]
        [(> key (node-key tree))
         (make-node (node-key tree) (node-val tree)
                    (node-left tree)
                    (insert-bst key value (node-right tree)))]))
;; Tests:
;; OMITTED FOR BREVITY
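The shape of the resulting tree depends on the order in which keys are inserted, which ties back to the balanced versus degenerate trees discussed under searching. The sketch below, written for `#lang racket` as an assumption so it runs standalone, inserts the same seven keys in two different orders and compares the heights. The helpers `height` and `insert-all` are hypothetical names introduced for this illustration, and the empty string is used as a dummy value.

```racket
#lang racket
(define-struct node (key val left right) #:transparent)

;; insert-bst: Num Any BST -> BST
(define (insert-bst key value tree)
  (cond [(empty? tree) (make-node key value empty empty)]
        [(= key (node-key tree))
         (make-node key value (node-left tree) (node-right tree))]
        [(< key (node-key tree))
         (make-node (node-key tree) (node-val tree)
                    (insert-bst key value (node-left tree))
                    (node-right tree))]
        [else
         (make-node (node-key tree) (node-val tree)
                    (node-left tree)
                    (insert-bst key value (node-right tree)))]))

;; height (hypothetical helper): number of nodes on the longest root-to-leaf path
(define (height tree)
  (cond [(empty? tree) 0]
        [else (add1 (max (height (node-left tree))
                         (height (node-right tree))))]))

;; insert-all (hypothetical helper): inserts keys from left to right into an empty tree
(define (insert-all keys)
  (foldl (lambda (k tree) (insert-bst k "" tree)) empty keys))

(height (insert-all '(4 2 6 1 3 5 7))) ; balanced tree: 3
(height (insert-all '(1 2 3 4 5 6 7))) ; degenerate chain of right children: 7
```

Inserting keys in sorted order produces the degenerate case, so a search in that tree can take as many steps as a search in an association list.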

Removing
Removing is a bit more complex. There are three cases to consider:
* No subtrees - leaf node. We can remove this directly.
* One subtree - internal node. We can remove the node and promote its child to the node's original place without violating the ordering property.
* Two subtrees - internal node. We can remove the node and promote the rightmost node of the left subtree of the node being removed, or the leftmost node of the right subtree of the node being removed, to the node's original place without violating the ordering property.

Here, the rightmost node is the node we get if we keep taking the right subtree, never taking any left subtrees, until we get to a node without a right subtree. Likewise, the leftmost node is the node we get if we keep taking the left subtree, never taking any right subtrees, until we get to a node without a left subtree. If the promoted node has a child of its own, that child takes the promoted node's old place.

This works because:
* The rightmost element of the left subtree is the largest element of the left subtree, yet is still smaller than every element of the right subtree.
* The leftmost element of the right subtree is the smallest element of the right subtree, yet is still larger than every element of the left subtree.

This can be implemented as follows:

;; remove-min-bst: Node -> BST
;; DESIGN RECIPE OMITTED FOR BREVITY
(define (remove-min-bst tree)
  (cond [(empty? (node-left tree)) (node-right tree)] ; the minimum may still have a right subtree
        [else (make-node (node-key tree) (node-val tree)
                         (remove-min-bst (node-left tree))
                         (node-right tree))]))

;; min-bst: Node -> Node
;; DESIGN RECIPE OMITTED FOR BREVITY
(define (min-bst tree)
  (cond [(empty? (node-left tree)) tree]
        [else (min-bst (node-left tree))]))

;; remove-bst: Num BST -> BST
;; Purpose: produces `tree` without the node with key `key`
;; Examples:
;; OMITTED FOR BREVITY
(define (remove-bst key tree)
  (cond [(empty? tree) empty]
        [(= key (node-key tree))
         (cond [(and (empty? (node-left tree)) ; leaf node
                     (empty? (node-right tree)))
                empty]
               [(empty? (node-left tree)) (node-right tree)] ; right child only
               [(empty? (node-right tree)) (node-left tree)] ; left child only
               [else ; two children
                (make-node (node-key (min-bst (node-right tree)))
                           (node-val (min-bst (node-right tree)))
                           (node-left tree)
                           (remove-min-bst (node-right tree)))])]
        [(< key (node-key tree))
         (make-node (node-key tree) (node-val tree)
                    (remove-bst key (node-left tree))
                    (node-right tree))]
        [(> key (node-key tree))
         (make-node (node-key tree) (node-val tree)
                    (node-left tree)
                    (remove-bst key (node-right tree)))]))
;; Tests:
;; OMITTED FOR BREVITY
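The two-children case is the easiest one to get wrong, so here is a small worked sketch of it, written for `#lang racket` as an assumption so it runs standalone. In this sketch, removing the minimum of a subtree produces that minimum's right subtree rather than `empty`, so that any nodes below the promoted node are kept. The names `keys-in-order` and `sample` are hypothetical and exist only for this illustration.

```racket
#lang racket
(define-struct node (key val left right) #:transparent)

;; remove-min-bst: Node -> BST
;; the minimum has no left child, so its right subtree takes its place
(define (remove-min-bst tree)
  (cond [(empty? (node-left tree)) (node-right tree)]
        [else (make-node (node-key tree) (node-val tree)
                         (remove-min-bst (node-left tree))
                         (node-right tree))]))

;; min-bst: Node -> Node
(define (min-bst tree)
  (cond [(empty? (node-left tree)) tree]
        [else (min-bst (node-left tree))]))

;; remove-bst: Num BST -> BST
(define (remove-bst key tree)
  (cond [(empty? tree) empty]
        [(< key (node-key tree))
         (make-node (node-key tree) (node-val tree)
                    (remove-bst key (node-left tree))
                    (node-right tree))]
        [(> key (node-key tree))
         (make-node (node-key tree) (node-val tree)
                    (node-left tree)
                    (remove-bst key (node-right tree)))]
        [(empty? (node-left tree)) (node-right tree)] ; leaf, or right child only
        [(empty? (node-right tree)) (node-left tree)] ; left child only
        [else                                         ; two children
         (make-node (node-key (min-bst (node-right tree)))
                    (node-val (min-bst (node-right tree)))
                    (node-left tree)
                    (remove-min-bst (node-right tree)))]))

;; keys-in-order (hypothetical helper): keys from an in-order traversal
(define (keys-in-order tree)
  (cond [(empty? tree) empty]
        [else (append (keys-in-order (node-left tree))
                      (list (node-key tree))
                      (keys-in-order (node-right tree)))]))

;; sample (hypothetical): the root 5 has two children, and the leftmost
;; node of its right subtree (6) has a right child (7)
(define sample
  (make-node 5 "five"
             (make-node 2 "two" empty empty)
             (make-node 8 "eight"
                        (make-node 6 "six" empty (make-node 7 "seven" empty empty))
                        empty)))

(node-key (remove-bst 5 sample))      ; 6 is promoted to the root
(keys-in-order (remove-bst 5 sample)) ; '(2 6 7 8), still in sorted order
```

An in-order traversal of a binary search tree lists the keys in increasing order, so checking it after a removal is a convenient way to confirm the ordering property still holds.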

General Trees
Binary trees are useful, but it is occasionally useful to allow a larger, fixed number of children. For example, each node of a ternary tree has at most 3 children. Here, we would modify our implementation to use a different definition for a node structure with additional fields. However, if there could be any number of children, we should represent a node's subtrees as a list.

Scheme expressions
Scheme expressions could be represented using one of these general trees:
(define-struct ae (operation args))
;; An arithmetic expression (AE) is one of:
;; * Num
;; * (make-ae Symbol (listof AE))

;; Template for AE:
;; my-ae-fn: AE -> Any
(define (my-ae-fn ae)
  (cond [(number? ae) ...]
        [else (... (ae-operation ae) ...
                   (my-ae-args-fn (ae-args ae)) ...)]))

;; my-ae-args-fn: (listof AE) -> Any
(define (my-ae-args-fn args)
  (cond [(empty? args) ...]
        [else (... (my-ae-fn (first args)) ...
                   (my-ae-args-fn (rest args)) ...)]))

Note the mutually recursive data definition results in a mutually recursive set of functions. Now we can write an evaluator for arithmetic expressions:

;; eval: AE -> Num
;; DESIGN RECIPE OMITTED FOR BREVITY
(define (eval ae)
  (cond [(number? ae) ae]
        [else (apply (ae-operation ae) (ae-args ae))]))

;; apply: Symbol (listof AE) -> Num
;; DESIGN RECIPE OMITTED FOR BREVITY
(define (apply operation args)
  (cond [(empty? args) (cond [(symbol=? operation '*) 1]
                             [(symbol=? operation '+) 0])]
        [(symbol=? operation '*)
         (* (eval (first args)) (apply operation (rest args)))]
        [(symbol=? operation '+)
         (+ (eval (first args)) (apply operation (rest args)))]))

However, we could also write the expression with just lists: `'(+ 1 2 (* 4 5 6) 3)`. The data definition would look something like this:
;; An arithmetic expression (AE) is one of:
;; * Num
;; * (cons Symbol (listof AE))

;; Template for AE:
;; my-ae-fn: AE -> Any
;; DESIGN RECIPE OMITTED FOR BREVITY
(define (my-ae-fn ae)
  (cond [(number? ae) ...]
        [else (... (first ae) ...
                   (my-ae-args-fn (rest ae)) ...)]))
;; SEE DEFINITION OF my-ae-args-fn ABOVE

The evaluator function for this representation would look something like this:
;; eval: AE -> Num
;; DESIGN RECIPE OMITTED FOR BREVITY
(define (eval ae)
  (cond [(number? ae) ae]
        [else (apply (first ae) (rest ae))]))
;; SEE DEFINITION OF apply ABOVE

Note that `apply` did not change even though the data definition changed. This is the beginning of a full Scheme interpreter.
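To make the list-based evaluator concrete, here is a runnable sketch for `#lang racket` (an assumption, so it can run outside the teaching languages). The functions are renamed `eval-ae` and `apply-op`, hypothetical names chosen only to avoid clashing with Racket's built-in `eval` and `apply`.

```racket
#lang racket
;; eval-ae: AE -> Num   (called eval in the notes)
(define (eval-ae ae)
  (cond [(number? ae) ae]
        [else (apply-op (first ae) (rest ae))]))

;; apply-op: Symbol (listof AE) -> Num   (called apply in the notes)
(define (apply-op operation args)
  (cond [(empty? args) (cond [(symbol=? operation '*) 1]   ; empty product
                             [(symbol=? operation '+) 0])] ; empty sum
        [(symbol=? operation '*)
         (* (eval-ae (first args)) (apply-op operation (rest args)))]
        [(symbol=? operation '+)
         (+ (eval-ae (first args)) (apply-op operation (rest args)))]))

(eval-ae '(+ 1 2 (* 4 5 6) 3)) ; 1 + 2 + 120 + 3 = 126
(eval-ae '(*))                 ; no arguments: 1
(eval-ae 7)                    ; a number evaluates to itself: 7
```

The base cases 1 and 0 are the identities for `*` and `+`, which is why an operator applied to no arguments still produces a sensible value.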

Nested lists
Nested lists can also be represented as leaf-labelled trees. Leaves correspond to list elements, and internal nodes correspond to nesting:
'(1 (2 3) 4)
    *
  / | \
 1  *  4
   / \
  2   3

Note that the empty list is simply a single node:


'()
(nothing here)

Also, a tree containing empty has an empty tree as its value:


'(1 empty 2)
   *
 / | \
1     2

The data definition looks like this:

;; A NestedList is one of:
;; * empty
;; * (cons Num NestedList)
;; * (cons NestedList NestedList)

;; Template for NestedList
;; my-nestedlist-fn: NestedList -> Any
(define (my-nestedlist-fn list)
  (cond [(empty? list) ...]
        [(number? (first list))
         (... (first list) ...
              (my-nestedlist-fn (rest list)) ...)]
        [else (... (my-nestedlist-fn (first list)) ...
                   (my-nestedlist-fn (rest list)) ...)]))

Consider a list flattening function:


;; flatten: NestedList -> (listof Num)
(define (flatten list)
  (cond [(empty? list) empty]
        [(number? (first list))
         (cons (first list) (flatten (rest list)))]
        [else (append (flatten (first list))
                      (flatten (rest list)))]))
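A short runnable sketch of the flattening function follows, written for `#lang racket` as an assumption so it runs standalone. It is renamed `flatten-nested`, and its parameter `nl`, to avoid shadowing Racket's built-in `flatten` and `list`; both names are hypothetical.

```racket
#lang racket
;; flatten-nested: NestedList -> (listof Num)   (called flatten in the notes)
(define (flatten-nested nl)
  (cond [(empty? nl) empty]
        [(number? (first nl))
         (cons (first nl) (flatten-nested (rest nl)))]
        [else (append (flatten-nested (first nl))
                      (flatten-nested (rest nl)))]))

(flatten-nested '(1 (2 3) 4))    ; '(1 2 3 4)
(flatten-nested '(1 () ((2)) 3)) ; empty sublists contribute nothing: '(1 2 3)
```

Each of the three `cond` clauses corresponds to one clause of the data definition, which is what following the template guarantees.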

29/10/13
Consider now a representation for algebraic expressions. These are simply the expressions we saw earlier, except now with support for variables. For example, `'(+ 4 #\x (* 5 3 #\x))`:
;; An AlgExp is one of:
;; * Num
;; * Char ; we use Char here because an operator is a Symbol, and it would be confusing to have symbols mean two different things
;; * (cons Symbol (listof AlgExp))

;; my-listof-algexp-fn: (listof AlgExp) -> Any
(define (my-listof-algexp-fn alglist)
  (cond [(empty? alglist) ...]
        [else (... (my-algexp-fn (first alglist)) ...
                   (my-listof-algexp-fn (rest alglist)))]))

;; my-algexp-fn: AlgExp -> Any
(define (my-algexp-fn alg)
  (cond [(number? alg) ...]
        [(char? alg) ...]
        [else (... (first alg) ...
                   (my-listof-algexp-fn (rest alg)) ...)]))

Now we can write a substitution function:


;; substitute-list: (listof AlgExp) Char Num -> (listof AlgExp)
;; Purpose: produces `alglist` where `var` is replaced by `value`
;; Examples:
;; NOT REQUIRED DUE TO MUTUAL RECURSION
(define (substitute-list alglist var value)
  (cond [(empty? alglist) empty]
        [else (cons (substitute (first alglist) var value)
                    (substitute-list (rest alglist) var value))]))
;; Tests:
;; NOT REQUIRED DUE TO MUTUAL RECURSION

;; substitute: AlgExp Char Num -> AlgExp
;; Purpose: produces `alg` where `var` is replaced by `value`
;; Examples:
(check-expect (substitute '(+ 1 #\x 2 #\y #\x) #\x 5) '(+ 1 5 2 #\y 5))
(check-expect (substitute #\x #\x 5) 5)
(define (substitute alg var value)
  (cond [(number? alg) alg]
        [(char? alg) (cond [(char=? alg var) value]
                           [else alg])]
        [else (cons (first alg)
                    (substitute-list (rest alg) var value))])) ; pass `var` and `value` along
;; Tests:
;; OMITTED FOR BREVITY

General trees are useful for representing any sort of nested data. For example, a book might be represented as follows:
'(chapter (section (paragraph "First sentence."
                              "Second sentence.")
                   (paragraph "Continued."))
          (section ...)
          ...)

Local Definitions and Lexical Scope


This feature is only available beginning with Intermediate Student. It is not part of standard Scheme, but there are similar constructs available there which are simpler, though not as general. Definitions have to this point been made at the "top level", outside of any expressions. However, there is also a special form `local`, which allows us to make definitions inside an expression and use them only inside that expression:
(local [(define a x) (define b y) (define c z) ...] ; we use square brackets by convention to improve readability
  ...) ; do something with those definitions

In a local definition, definitions behave like those at the top level. We can even define functions. Consider Heron's formula, used for calculating the area of a triangle with side lengths a, b, and c: $A = \sqrt{s(s-a)(s-b)(s-c)}$, where $s = \frac{a+b+c}{2}$:
(define (t-area a b c)
  (sqrt (* (/ (+ a b c) 2)
           (- (/ (+ a b c) 2) a)
           (- (/ (+ a b c) 2) b)
           (- (/ (+ a b c) 2) c))))

The repeated calculation of `(/ (+ a b c) 2)` is messy. We can instead use `local`:


(define (t-area a b c)
  (local [(define s (/ (+ a b c) 2))]
    (sqrt (* s (- s a) (- s b) (- s c)))))

This is significantly more readable and more efficient. Note that we can also refer to earlier definitions:
(define (t-area a b c)
  (local [(define sum (+ a b c))
          (define s (/ sum 2))]
    (sqrt (* s (- s a) (- s b) (- s c)))))

Here, the definition of `s` refers to `sum`, which is defined just before it. Note that the order is significant - a name must be defined before its value is used.

31/10/13
Lexical scope
A binding occurrence of a name is an occurrence where the name is introduced, either by a definition or as a formal parameter of a function. The bound occurrences associated with a binding occurrence are the occurrences of the name that refer to that binding.

The scope of a binding is where the binding takes effect. This is generally the area where it can be referenced, and excludes the "holes" (nested scopes) where the binding is shadowed.

Names are resolved from the innermost scope to the outermost scope. A definition is said to shadow a definition in the parent scope if it uses the same name. In this case, the inner one takes precedence and the parent one is shadowed.

Lexical scoping means that binding resolution is based on where the scope is textually located in the code. So the parent scope of a given scope is the scope that textually surrounds it. For example, the scope of the names defined in a `local` is exactly the area within the brackets surrounding the `local`. This contrasts with dynamic scoping, where the parent scope can change depending on use. When we define something in a local scope that has the same name as something in the parent scope (this is not recommended), references to that name in the local scope refer to the local definition, while references outside are unchanged. The global/top-level scope is the scope of top-level definitions. All programs initially start off in the global scope.
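A minimal sketch of shadowing under lexical scope, written for `#lang racket` as an assumption so it runs standalone. The function name `add-local` is hypothetical and introduced only for this illustration.

```racket
#lang racket
(define x 10)               ; top-level (global) binding of x

;; add-local (hypothetical name): the local x shadows the top-level x
(define (add-local y)
  (local [(define x 1)]
    (+ x y)))               ; this x refers to the inner definition

(add-local 5)               ; 6, computed with the local x
(+ x 5)                     ; 15, the top-level x is unchanged
```

Which `x` a reference means is decided purely by where the reference sits in the text of the program, not by which function happens to be running when it is evaluated.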

Stepping
The stepping rules for `local` are the most complex we have seen so far:
1. Create new, unique names for all of the local definitions.
2. Bind the new names to the values of the definitions.
3. Substitute the new names for the old names everywhere inside the local scope.
4. Move all the definitions outside of the `local`, into the top scope, making sure to preserve the order. We can do this because the names are all unique.
5. Replace the `local` with its body expression.

This all happens in one step. Consider the following:
(define s 'blah)
(local [(define sum (+ a b c))
        (define s (/ sum 2))]
  (sqrt (* s (- s a) (- s b) (- s c))))

;; ONE STEP BEGINS
;; create names, bind values, and substitute the new names
(define s 'blah)
(local [(define sum_0 (+ a b c))
        (define s_0 (/ sum_0 2))]
  (sqrt (* s_0 (- s_0 a) (- s_0 b) (- s_0 c))))

;; move definitions outside of the local
(define s 'blah)
(define sum_0 (+ a b c))
(define s_0 (/ sum_0 2))
(local []
  (sqrt (* s_0 (- s_0 a) (- s_0 b) (- s_0 c))))

;; replace local with its body
(define s 'blah)
(define sum_0 (+ a b c))
(define s_0 (/ sum_0 2))
(sqrt (* s_0 (- s_0 a) (- s_0 b) (- s_0 c)))
;; ONE STEP ENDS

Purpose
We use `local` to make code more readable, by factoring out common subexpressions. This is also useful for efficiency purposes. Recall the exponential-time list maximum function:
;; list-max: (ne-listof Num) -> Num
(define (list-max list)
  (cond [(empty? (rest list)) (first list)]
        [(>= (first list) (list-max (rest list))) (first list)]
        [else (list-max (rest list))]))

We can now use `local` to make it much more efficient:


;; list-max: (ne-listof Num) -> Num
(define (list-max list)
  (cond [(empty? (rest list)) (first list)]
        [else (local [(define m (list-max (rest list)))]
                (cond [(>= (first list) m) (first list)]
                      [else m]))]))

Now each call to `list-max` makes only one recursive call, so the function runs in linear time.
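To see where the exponential and linear running times come from, one can count calls to `list-max` in the worst case for the first version: a strictly increasing list of length n, where the comparison fails and the `else` branch repeats the recursive call. The following is a sketch of the two recurrences; the names $T_{\text{slow}}$ and $T_{\text{local}}$ are introduced here for illustration and count the total number of calls made.

```latex
\begin{align*}
T_{\text{slow}}(1) &= 1, & T_{\text{slow}}(n) &= 2\,T_{\text{slow}}(n-1) + 1
  &&\Rightarrow\quad T_{\text{slow}}(n) = 2^{n} - 1, \\
T_{\text{local}}(1) &= 1, & T_{\text{local}}(n) &= T_{\text{local}}(n-1) + 1
  &&\Rightarrow\quad T_{\text{local}}(n) = n.
\end{align*}
```

For a strictly increasing list of 20 numbers, that is 1,048,575 calls for the first version against 20 for the version using `local`.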

Encapsulation

Encapsulation is the process of grouping things together into a capsule or a black box. We choose to hide the irrelevant details to make things simpler. Behavior encapsulation is the encapsulation of functions. Since we can define functions locally, we use this to encapsulate related functions. For example, helper functions that are only used by one function can and should be moved inside that function as a local definition. This makes them invisible outside the function and avoids cluttering the top-level namespace.
;; sum-list: (listof Num) -> Num
;; Purpose: produces the sum of every element in `lon`
;; Examples:
;; OMITTED FOR BREVITY
(define (sum-list lon)
  (local [;; sum-acc: (listof Num) Num -> Num
          ;; Purpose: produces the sum of every element in `lst` plus `acc`
          (define (sum-acc lst acc)
            (cond [(empty? lst) acc]
                  [else (sum-acc (rest lst)
                                 (+ (first lst) acc))]))]
    (sum-acc lon 0)))
;; Tests:
;; OMITTED FOR BREVITY

Note that the locally defined function does not require examples or tests. However, the tests for the enclosing function must fully exercise the locally defined function. It's useful that `sum-acc` can access any of the bindings available in the scope of `sum-list`. For example, this can remove the need for parameters that "go along for the ride":
(define (count-to upper)
  (local [(define (count-from lower)
            (cond [(> lower upper) empty]
                  [else (cons lower (count-from (add1 lower)))]))]
    (count-from 0)))

Each time we evaluate a `local`, we are lifting out another set of definitions - defining a different function. If we evaluate `(count-to 1)`, a function gets created with a body equal to that of `count-from`, except with `upper` replaced by 1. If we evaluate `(count-to 2)`, another function gets created with a body equal to that of `count-from`, except with `upper` replaced by 2. This allows us to create different functions as needed. Now we can fully encapsulate the sort function defined earlier:
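The lifting described above can be sketched concretely. The code below, written for `#lang racket` as an assumption so it runs standalone, shows `count-to` next to a hand-written version of the function that evaluating `(count-to 1)` behaves as if it had created. The name `count-from_0` is hypothetical and follows the renaming style of the stepping rules.

```racket
#lang racket
(define (count-to upper)
  (local [(define (count-from lower)
            (cond [(> lower upper) empty]
                  [else (cons lower (count-from (add1 lower)))]))]
    (count-from 0)))

(count-to 1) ; '(0 1)
(count-to 3) ; '(0 1 2 3)

;; What (count-to 1) behaves like after the local is lifted out:
;; a fresh top-level function with `upper` replaced by 1.
(define (count-from_0 lower)
  (cond [(> lower 1) empty]
        [else (cons lower (count-from_0 (add1 lower)))]))

(count-from_0 0) ; '(0 1), the same result as (count-to 1)
```

Because `upper` is visible inside `count-from`, it does not need to be passed along as an extra parameter on every recursive call.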
;; sort: (listof Num) -> (listof Num)
;; Purpose: produces `list` sorted in ascending order
;; Examples:
;; OMITTED FOR BREVITY
(define (sort list)
  (local [;; insert: Num (listof Num) -> (listof Num)
          ;; Purpose: produces `list` with `element` inserted in sorted order
          (define (insert element list)
            (cond [(empty? list) (cons element empty)]
                  [(<= element (first list)) (cons element list)]
                  [else (cons (first list) (insert element (rest list)))]))]
    (cond [(empty? list) empty]
          [else (insert (first list) (sort (rest list)))])))
;; Tests:
;; OMITTED FOR BREVITY

5/11/13
Functions are first-class values. This means functions can be passed as arguments to other functions, returned as values, and otherwise treated just like other values such as numbers or strings. Consider the sorting function example shown previously. What would we need to change to make it work with strings rather than numbers?

(define (sort list)
  (local [;; insert: Str (listof Str) -> (listof Str)
          ;; Purpose: produces `list` with `element` inserted in sorted order
          (define (insert element list)
            (cond [(empty? list) (cons element empty)]
                  [(string<=? element (first list)) (cons element list)]
                  [else (cons (first list) (insert element (rest list)))]))]
    (cond [(empty? list) empty]
          [else (insert (first list) (sort (rest list)))])))

This is not very elegant - for every type, we need to define a new function.

Abstract List Functions


Sort
Instead, we can pass a comparison function as an argument to the sort, which will abstract away the details of comparison:
;; sort: (listof X) (X X -> Boolean) -> (listof X)
(define (sort list less-equal?)
  (local [;; insert: X (listof X) -> (listof X)
          ;; Purpose: produces `list` with `element` inserted in sorted order
          (define (insert element list)
            (cond [(empty? list) (cons element empty)]
                  [(less-equal? element (first list)) (cons element list)]
                  [else (cons (first list) (insert element (rest list)))]))]
    (cond [(empty? list) empty]
          [else (insert (first list) (sort (rest list) less-equal?))])))

Note the use of X to represent a particular type (that is possibly a union), in order to show that the input types and output types are the same. This is known as a type variable. We can also use ones like W, Y, or Z, as long as the meaning is clear. We use type variables whenever two or more places within a contract need to have the same type. The function works with many different types of data. This makes it generic or polymorphic, a positive quality. We also used `(X X -> Boolean)` to represent a function type. The type of a function is its contract. Now we can call the function like this:
(sort '("b" "d" "a" "c") string<=?)
(sort '(5 2 9 1 3) <=)

We can also define custom comparators:


(define (posn-less-equal? p1 p2)
  (<= (+ (sqr (posn-x p1)) (sqr (posn-y p1)))
      (+ (sqr (posn-x p2)) (sqr (posn-y p2)))))
(sort (list (make-posn 1 2) (make-posn 4 3) (make-posn 0 0)) posn-less-equal?)
-> (list (make-posn 0 0) (make-posn 1 2) (make-posn 4 3))

The built-in function `(quicksort list less-equal?)` does the same thing. These are abstract list functions - they work on a whole class of lists.
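The following runnable sketch shows one sorting function reused with several comparison functions. It is written for `#lang racket` as an assumption so it runs standalone, and it is renamed `insertion-sort` (a hypothetical name) because Racket's own built-in `sort` has a different definition. The parameter order, list first and comparator second, follows the calls shown above.

```racket
#lang racket
;; insertion-sort: (listof X) (X X -> Boolean) -> (listof X)
;; (called sort in the notes)
(define (insertion-sort lst less-equal?)
  (local [;; insert: X (listof X) -> (listof X)
          (define (insert element sorted)
            (cond [(empty? sorted) (cons element empty)]
                  [(less-equal? element (first sorted)) (cons element sorted)]
                  [else (cons (first sorted) (insert element (rest sorted)))]))]
    (cond [(empty? lst) empty]
          [else (insert (first lst) (insertion-sort (rest lst) less-equal?))])))

(insertion-sort '(5 2 9 1 3) <=)              ; ascending: '(1 2 3 5 9)
(insertion-sort '(5 2 9 1 3) >=)              ; descending: '(9 5 3 2 1)
(insertion-sort '("b" "d" "a" "c") string<=?) ; '("a" "b" "c" "d")

;; a custom comparator: order strings by length
(insertion-sort '("ccc" "a" "bb")
                (lambda (s1 s2) (<= (string-length s1) (string-length s2))))
; '("a" "bb" "ccc")
```

Only the comparison changes between these calls, so the sorting logic is written once and the ordering is supplied by the caller.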

Map
Using this technique, we find that there are a lot of different abstract list operations that we often do. For example, applying a function to every element in a list:
;; map: (X -> Y) (listof X) -> (listof Y)
(define (map f list)
  (cond [(empty? list) empty]
        [else (cons (f (first list)) (map f (rest list)))]))

Note that `map` is also a built-in function that does the same thing. How do we use this?
(map sqr '(1 2 3 4 5)) -> '(1 4 9 16 25)
(map even? '(1 2 3 4 5)) -> '(#f #t #f #t #f)

Filter
Another example is removing elements that do not fit a certain criterion:
;; filter: (X -> Boolean) (listof X) -> (listof X)
(define (filter keep? list)
  (cond [(empty? list) empty]
        [(keep? (first list)) (cons (first list) (filter keep? (rest list)))] ; keep this element
        [else (filter keep? (rest list))])) ; drop this element

Note that `filter` is also a built-in function that does the same thing. How do we use this?
(filter negative? '(1 -5 -7 3 0)) -> '(-5 -7)
(filter number? '(1 2 3 4 5)) -> '(1 2 3 4 5)
(list->string (filter char-alphabetic? (string->list "a89erha ae 23*%$%44 yusdh"))) -> "aerhaaeyusdh"

Consider the original `elements-more-than` in assignment 4, question 2a. Now we can write it much more simply using the abstract list functions:
;; elements-more-than: (listof Num) Num -> (listof Num)
;; Purpose: produces the elements of `lon` strictly greater than `n`
(define (elements-more-than lon n)
  (local [;; keep?: Num -> Boolean
          ;; Purpose: produces `true` if `number` is greater than `n`, the parameter of the enclosing function
          (define (keep? number)
            (> number n))]
    (filter keep? lon)))

Fold Right
How do we add up a list of numbers?
;; total: (listof Num) -> Num
(define (total lon)
  (cond [(empty? lon) 0]
        [else (+ (first lon) (total (rest lon)))]))

This basic form is also used in `make-acronym`, as well as many other places. How do we abstract this? An abstract list function could apply a function to the first element of a list and the result of applying it to the rest of the list:
;; foldr: (X Y -> Y) Y (listof X) -> Y
(define (foldr f base-case list)
  (cond [(empty? list) base-case]
        [else (f (first list) (foldr f base-case (rest list)))]))

Note that `foldr` is also a built-in function that does the same thing. The function `f` should accept an element and the "folded" rest of the list. How do we use this?
(foldr + 0 '(5 2 3 7)) -> 17
(define (glue-first word acronym)
  (string-append (substring word 0 1) acronym))
(foldr glue-first "" '("Kentucky" "Fried" "Chicken")) -> "KFC"

`foldr` abstracts the list template using pure structural recursion.

Intuitively, `(foldr f base '(a b c ...))` is equivalent to `(f a (f b (f c ...)))`
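The expansion can be traced on small inputs. The sketch below uses Racket's built-in `foldr` under `#lang racket` (an assumption, so it runs standalone), and `my-map` is a hypothetical name for a version of `map` written in terms of `foldr`.

```racket
#lang racket
;; (foldr f base '(a b c)) expands to (f a (f b (f c base)))
(foldr + 0 '(5 2 3 7))      ; (+ 5 (+ 2 (+ 3 (+ 7 0)))) = 17
(foldr cons empty '(1 2 3)) ; (cons 1 (cons 2 (cons 3 empty))) = '(1 2 3)
(foldr - 0 '(1 2 3 4))      ; (- 1 (- 2 (- 3 (- 4 0)))) = -2

;; my-map (hypothetical name): map expressed with foldr
(define (my-map f lst)
  (foldr (lambda (x rest-mapped) (cons (f x) rest-mapped)) empty lst))

(my-map sqr '(1 2 3))       ; '(1 4 9)
```

The subtraction example shows that the grouping matters: the innermost application involves the last element and the base case.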

Fold Left
This is less commonly used. It does the same thing as `foldr`, but processes the elements in the opposite order, starting from the left. We can implement it as follows:

;; foldl: (X Y -> Y) Y (listof X) -> Y
(define (foldl f base-case list)
  (local [;; fold-from-left: (X Y -> Y) Y (listof X) -> Y
          (define (fold-from-left f previous list)
            (cond [(empty? list) previous]
                  ;; the element comes first and the accumulated result second, as in the built-in foldl
                  [else (fold-from-left f (f (first list) previous) (rest list))]))]
    (fold-from-left f base-case list)))

Note that `foldl` is also a built-in function that does the same thing. `foldl` abstracts the list template using structural recursion with one accumulator.

Intuitively, `(foldl f base '(... x y z))` is equivalent to `(f z (f y (f x ...)))`
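The difference between the two folds only shows up when the combining function is sensitive to order. This sketch uses Racket's built-in `foldr` and `foldl` under `#lang racket` (an assumption, so it runs standalone); the built-in `foldl` calls its function with the element first and the accumulated result second.

```racket
#lang racket
(foldr string-append "" '("a" "b" "c")) ; "abc"
(foldl string-append "" '("a" "b" "c")) ; "cba": later elements end up in front

(foldr cons empty '(1 2 3))             ; '(1 2 3)
(foldl cons empty '(1 2 3))             ; '(3 2 1), i.e. the list reversed

(foldl + 0 '(5 2 3 7))                  ; order does not matter for +: 17
```

For functions like `+` and `*`, where order and grouping do not affect the result, either fold can be used.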

Build List
How do we apply a function to each of the numbers from 0 to n - 1?
;; even-numbers: Nat -> (listof Nat)
;; Purpose: produces a list of the first `n` even numbers, starting from 0
(define (even-numbers n)
  (local [(define (even-numbers-from start)
            (cond [(>= start n) empty]
                  [else (cons (* start 2) (even-numbers-from (add1 start)))]))]
    (even-numbers-from 0)))

How can we abstract this? An abstract list function could apply a function to every number from 0 up to, but not including, the target value:
;; build-list: Nat (Nat -> X) -> (listof X)
(define (build-list n f)
  (local [(define (build-list-from start)
            (cond [(>= start n) empty]
                  [else (cons (f start) (build-list-from (add1 start)))]))]
    (build-list-from 0)))

Note that `build-list` is also a built-in function that does the same thing. The function `f` should accept a natural number and produce an element of the resulting list. `build-list` abstracts the count-up pattern.

`(string-ref String Nat) -> Char` obtains the character at a given index in a given string. The first character is at index 0.

We can use this to implement `string->list` ourselves using `build-list`:


;; string->list: String -> (listof Char)
;; Purpose: produces a list of characters for each character in `s`
(define (string->list s)
  (build-list (string-length s) (lambda (i) (string-ref s i))))
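A short runnable sketch of `build-list` in use, relying on Racket's built-in `build-list` and `string-ref` under `#lang racket` (an assumption, so it runs standalone). The name `my-string->list` is hypothetical and chosen to avoid shadowing the built-in `string->list`.

```racket
#lang racket
;; the function receives the indices 0, 1, ..., n - 1
(build-list 5 (lambda (i) (* 2 i))) ; '(0 2 4 6 8)
(build-list 4 sqr)                  ; '(0 1 4 9)

;; my-string->list (hypothetical name): String -> (listof Char)
(define (my-string->list s)
  (build-list (string-length s) (lambda (i) (string-ref s i))))

(my-string->list "abc")             ; '(#\a #\b #\c)
```

The first call produces the same list as counting up with an explicit helper and doubling each index.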

From now on, we should use the abstract list functions whenever possible, rather than dealing with `first` and `rest`. The opposite of using abstract list functions is explicit recursion.
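To compare the two styles side by side, here is a sketch that sums the squares of the even numbers in a list, first with explicit recursion and then with abstract list functions. It is written for `#lang racket` as an assumption so it runs standalone, and both function names are hypothetical, chosen only for this illustration.

```racket
#lang racket
;; sum-even-squares/explicit (hypothetical): explicit recursion
(define (sum-even-squares/explicit lon)
  (cond [(empty? lon) 0]
        [(even? (first lon))
         (+ (sqr (first lon)) (sum-even-squares/explicit (rest lon)))]
        [else (sum-even-squares/explicit (rest lon))]))

;; sum-even-squares (hypothetical): the same computation with
;; filter, map, and foldr
(define (sum-even-squares lon)
  (foldr + 0 (map sqr (filter even? lon))))

(sum-even-squares/explicit '(1 2 3 4)) ; 4 + 16 = 20
(sum-even-squares '(1 2 3 4))          ; 20
```

The second version reads as a description of the steps (keep the evens, square them, add them up) and has no `first`, `rest`, or base case to get wrong.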

7/11/13
Create a function that when given a list of numbers, produces the list of those numbers greater than the average:
(define (above-average lon)
  (local [(define average (/ (foldr + 0 lon) (length lon)))
          (define (higher? n)
            (> n average))]
    (filter higher? lon)))

Create a function that checks if a given list of strings is a word chain - where the last letter of each word is the first letter of the next word:

(define (word-chain? los)
  (local [(define (check-letter word1 word2-or-bool)
            (local [(define word1-length (string-length word1))]
              (cond [(boolean? word2-or-bool)
                     (cond [word2-or-bool word1] ; ignore the starting case
                           [else false])]        ; already failed test
                    [(string=? (substring word1 (sub1 word1-length) word1-length)
                               (substring word2-or-bool 0 1))
                     word1]
                    [else false])))]
    (string? (foldr check-letter true los))))

We can have lists and structures that contain functions. We can also have functions that produce functions:
;; generate-line: Posn Posn -> (Num -> Num)
;; Purpose: produces a function that represents a line passing through `p1` and `p2`
;; Examples:
;; OMITTED FOR BREVITY
(define (generate-line p1 p2)
  (local [(define slope (/ (- (posn-y p2) (posn-y p1))
                           (- (posn-x p2) (posn-x p1))))
          (define intercept (- (posn-y p1) (* slope (posn-x p1))))]
    (lambda (x) (+ (* slope x) intercept))))
;; Tests:
;; OMITTED FOR BREVITY

Note that due to the halting problem, we cannot compare two functions for equality. Therefore, we can't directly compare the function that generate-line produces against an expected function. However, we can apply the produced function to sample arguments and test the results. We can use it like this:
((generate-line (make-posn 0 0) (make-posn 1 2)) 5) => 10

We can test it like this:


(check-expect ((generate-line (make-posn 0 0) (make-posn 1 2)) 5) 10)
(check-expect ((generate-line (make-posn 0 0) (make-posn 1 2)) 0) 0)
(check-expect ((generate-line (make-posn 0 0) (make-posn 1 2)) 1) 2)

Lambda
(lambda (arg1 arg2 ...) body)

lambda creates an anonymous/unnamed function - a function that is not bound to a name. This is roughly equivalent to the following:
(local [(define (temporary-function arg1 arg2 ...)
          body)]
  temporary-function)

This is simply a function like any other, except that no name refers to it. This is very useful for the abstract list functions. Where we previously made small helper functions in local definitions, now we can simply use lambda. Anonymous functions do not need any parts of the design recipe.

(define (f ...) ...) is actually a short form for (define f (lambda (...) ...)).
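The relationship between the two forms of define, and the use of lambda in place of a one-off local helper, can be sketched as follows. The names `double-a`, `double-b`, and `keep-above` are hypothetical, and the block assumes `#lang racket` rather than a teaching language.

```racket
#lang racket
;; Two equivalent ways to bind a name to a function.
(define (double-a x) (* 2 x))
(define double-b (lambda (x) (* 2 x)))

;; keep-above: Num (listof Num) -> (listof Num)
;; An anonymous function passed directly to filter, where a local
;; helper definition would otherwise be needed.
(define (keep-above threshold lon)
  (filter (lambda (n) (> n threshold)) lon))

(double-a 21)                ; => 42
(double-b 21)                ; => 42
(keep-above 3 '(1 5 2 8 3))  ; => '(5 8)
```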

Stepping
Lambdas by themselves are values and are in their simplest form. When applied, a lambda is replaced by its body, with the arguments substituted for the parameters, just like with normal functions.

In Intermediate Student, function applications and definitions with zero arguments are allowed. Note that (+) is 0 and (*) is 1.

Functional abstraction is the process of creating abstract functions like filter. When we abstract the details into an abstract function, we reduce code size and make it easier to fix bugs.
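A small stepping example may help here. The comments trace the substitution by hand, and the definitions let the final values be checked; the names `result`, `empty-sum`, and `empty-product` are made up for this sketch, which assumes `#lang racket`.

```racket
#lang racket
;; Stepping an application of a lambda by substitution:
;;    ((lambda (x y) (+ x (* 2 y))) 3 4)
;; => (+ 3 (* 2 4))
;; => (+ 3 8)
;; => 11
(define result ((lambda (x y) (+ x (* 2 y))) 3 4))

;; Zero-argument applications of + and * produce their identity elements.
(define empty-sum (+))       ; 0
(define empty-product (*))   ; 1
```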

Scope
Consider the following function:
(define (make-adder n)
  (lambda (x) (+ x n)))

We use it as follows:
(define add5 (make-adder 5))
(add5 6) => 11

The binding occurrence of n is outside of the lambda. (make-adder 5) creates a new function that is equivalent to (lambda (x) (+ x 5)). Note that add5 still has access to n inside make-adder, even though we are no longer inside of make-adder when we are calling add5. This is because the function body itself is still inside make-adder, and so still follows the rules of lexical scoping. Functions that consume or produce functions are sometimes known as higher-order functions.
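To see that each produced function keeps its own binding of n, here is a brief sketch (assuming `#lang racket`; `add10` is a name introduced only for this example):

```racket
#lang racket
(define (make-adder n)
  (lambda (x) (+ x n)))

;; Each call to make-adder produces a function with its own n.
(define add5 (make-adder 5))
(define add10 (make-adder 10))

(add5 6)                          ; => 11
(add10 6)                         ; => 16

;; A produced function can be handed straight to another
;; higher-order function without ever being named.
(map (make-adder 100) '(1 2 3))   ; => '(101 102 103)
```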

12/11/13
We are actually not as behind as we thought. So today we will go through module 10 again, but slower this time. We can actually implement both map and filter using foldr:
(define (my-map f l)
  (foldr (lambda (x y) (cons (f x) y)) empty l))
(define (my-filter f l)
  (foldr (lambda (x y) (cond [(f x) (cons x y)] [else y])) empty l))

14/11/13
Everything that can be done with the list template can be done via foldr, unless it terminates the recursion before the base case, like insert. Abstract list functions should be used in addition to the list template, when it makes for more understandable code.
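The distinction can be sketched with insertion sort. The version of `insert` below is an assumed, typical definition (the notes do not show one): it stops as soon as it finds the right position, so it is written with explicit recursion, while the sort built on top of it follows the list template and fits foldr. The block assumes `#lang racket`.

```racket
#lang racket
;; insert: Num (listof Num) -> (listof Num)
;; requires: slon is sorted in ascending order
(define (insert n slon)
  (cond [(empty? slon) (list n)]
        ;; Found the position: stop here without recursing on the rest.
        [(<= n (first slon)) (cons n slon)]
        [else (cons (first slon) (insert n (rest slon)))]))

;; insertion-sort: (listof Num) -> (listof Num)
;; Follows the list template exactly, so foldr expresses it directly.
(define (insertion-sort lon)
  (foldr insert empty lon))

(insert 3 '(1 2 4 5))      ; => '(1 2 3 4 5)
(insertion-sort '(3 1 2))  ; => '(1 2 3)
```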

Generative Recursion
Structural recursion is a way of writing code that results in the code following the form of the data definition. In contrast, generative recursion has the recursive cases and base cases generated based on the problem to be solved. Consider the GCD function using the Euclidean algorithm:
(define (gcd n m)
  (cond [(zero? m) n]
        [else (gcd m (remainder n m))]))

We know that this gives the correct result because we have proven it in MATH135 - see the proof of GCD-WR.
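To watch the Euclidean algorithm at work, the sketch below records the argument pair of every call. `gcd-trace` is a hypothetical helper, not part of the course material, and the block assumes `#lang racket` (where gcd is already built in, hence the different name).

```racket
#lang racket
;; gcd-trace: Nat Nat -> (listof (list Nat Nat))
;; produces the (n m) argument pairs of each call the Euclidean
;; algorithm would make, ending with the base case where m is 0
(define (gcd-trace n m)
  (cond [(zero? m) (list (list n m))]
        [else (cons (list n m) (gcd-trace m (remainder n m)))]))

(gcd-trace 48 18)  ; => '((48 18) (18 12) (12 6) (6 0))
```

The recursive calls are generated from the problem (the remainder), not from the structure of the data, and the second component shrinks on every step.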

Termination
We want to know if the function terminates - if an application of the function results in a simplest form in finite time. For structurally recursive functions, this is easy because we know that each recursive case recurses on a value closer to the base case, and so it must eventually terminate. Therefore, we can always bound the depth of recursion based on certain characteristics of the input.

For generatively recursive functions, we must be able to make a similar proof of termination. This will depend on the function itself. For gcd, we know that 0 <= (remainder n m) < m whenever m is positive, so the second argument strictly decreases on every recursive call. Since it can never shoot past 0 into the negatives, it must eventually reach 0, the base case. Therefore, for any input, the depth of recursion is bounded by the argument m.

Because of the halting problem, there is no general procedure that can analyze an arbitrary function and decide whether it terminates. Consider the Collatz conjecture, which states that the hailstone sequence,
$$x_n = \begin{cases} x_{n-1} / 2 & \text{if } x_{n-1} \text{ is even} \\ 3 x_{n-1} + 1 & \text{if } x_{n-1} \text{ is odd} \end{cases}$$

must eventually result in the sequence 1, 4, 2, 1, 4, 2, ...

This is, as of 2013, an unsolved problem in mathematics. We do not know if an arbitrary starting value will eventually result in 1. As a result, whether the following function terminates is also an unsolved problem in mathematics:
(define (collatz n)
  (cond [(= n 1) 1]
        [(even? n) (collatz (/ n 2))]
        [else (collatz (add1 (* n 3)))]))
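One way to experiment with the hailstone sequence without relying on an unproven termination claim is to add a step budget. In the sketch below, `hailstone` and its `fuel` parameter are hypothetical additions: since `fuel` decreases by one on every recursive call, this version provably terminates regardless of the conjecture. The block assumes `#lang racket`.

```racket
#lang racket
;; hailstone: Nat Nat -> (union (listof Nat) false)
;; requires: n >= 1
;; produces the hailstone sequence from n down to 1, or false if 1 is
;; not reached within `fuel` steps
(define (hailstone n fuel)
  (cond [(= n 1) (list 1)]
        [(zero? fuel) false]   ; budget exhausted: give up
        [else
         (local [(define next (cond [(even? n) (/ n 2)]
                                    [else (add1 (* n 3))]))
                 (define tail (hailstone next (sub1 fuel)))]
           (cond [(false? tail) false]
                 [else (cons n tail)]))]))

(hailstone 6 100)  ; => '(6 3 10 5 16 8 4 2 1)
(hailstone 27 10)  ; => #f (27 needs far more than 10 steps)
```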

Quicksort
Consider a more practical example of generative recursion. Quicksort is a sorting algorithm used very widely due to its performance in real-world situations. This is generative recursion because we are not following the data definition for a list.

It is a divide and conquer algorithm - we divide the problem into smaller subproblems, then recursively solve each one. Afterwards, we combine the results together to obtain the final result. Quicksort works by picking a pivot, then recursively sorting all the elements lower than the pivot, and all the elements higher than the pivot. Afterwards, the two sorted sublists and the pivot are simply put back together again.

Now we will implement it for a list l. We can simply select the first element of the list as the pivot. This is done with (first l). Now we need to obtain those elements less than the pivot, excluding the pivot itself: (filter (lambda (x) (< x (first l))) (rest l)). Now we need to obtain those elements greater than or equal to the pivot, excluding the pivot itself: (filter (lambda (x) (>= x (first l))) (rest l)). After sorting both sublists recursively, we can combine the results as follows: (append sorted-less (list pivot) sorted-greater). We can now implement the function as follows:
(define (quicksort l)
  (cond [(empty? l) empty]
        [else (local [(define pivot (first l))
                      (define less (filter (lambda (x) (< x pivot)) (rest l)))
                      (define greater (filter (lambda (x) (>= x pivot)) (rest l)))]
                (append (quicksort less) (list pivot) (quicksort greater)))]))

We know that this function terminates because each recursive call is given a smaller list, which is closer to the base case. Therefore, the depth of recursion is bounded by the size of the list.

Note that if the list is already sorted, our choice of pivot causes the less list to be empty and the greater list to have every element except the pivot. This would cause it to have the worst-case behavior - quadratic time based on the size of the list. Likewise with lists sorted in descending order. This is caused by our choice of pivot. Choosing a better pivot in this case would probably help, but it wouldn't work in every case. Consider a list where all elements are equal. This would exhibit worst-case behavior regardless of the pivot choice.

Quicksort has similarities to constructing a binary search tree out of the elements in the list, and then flattening the tree into a list using in-order traversal. Quicksort can be seen as doing all of this, except without explicitly using trees. To continue the metaphor, we would simply choose the first element in the list as the root node of the current tree, and then add the rest of the elements of the list to the tree recursively according to their value. The tree sorting technique would use structural recursion, with accumulators. Quicksort uses generative recursion to do the same task but more efficiently.
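The tree-based view described above can be sketched directly. Everything here is an assumption made for illustration: the `node` structure, the helper names, and the use of foldl as the accumulator. The block assumes `#lang racket`.

```racket
#lang racket
;; A BST is either empty or (make-node Num BST BST).
;; Hypothetical structure for this sketch only.
(define-struct node (key left right))

;; bst-insert: Num BST -> BST
;; keys smaller than the root go left; all others go right
(define (bst-insert n t)
  (cond [(empty? t) (make-node n empty empty)]
        [(< n (node-key t))
         (make-node (node-key t) (bst-insert n (node-left t)) (node-right t))]
        [else
         (make-node (node-key t) (node-left t) (bst-insert n (node-right t)))]))

;; bst->list: BST -> (listof Num)
;; flattens the tree with an in-order traversal
(define (bst->list t)
  (cond [(empty? t) empty]
        [else (append (bst->list (node-left t))
                      (list (node-key t))
                      (bst->list (node-right t)))]))

;; tree-sort: (listof Num) -> (listof Num)
;; foldl inserts the first element first, so it becomes the root,
;; mirroring quicksort's choice of the first element as the pivot
(define (tree-sort lon)
  (bst->list (foldl bst-insert empty lon)))

(tree-sort '(3 1 4 1 5 9 2 6))  ; => '(1 1 2 3 4 5 6 9)
```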

Generative Recursion and the Design Recipe


When doing generative recursion, the following must also be taken into consideration when writing the design recipe:
Purpose statements should also describe how the function works in addition to what it does.
Examples must reflect the algorithm - more white-box testing.
Templates are not used as often.

19/11/13
Mergesort
Mergesort recursively merges sorted sublists together, and eventually merges two lists into one sorted list. We want to implement mergesort. This function is generatively recursive because we need to split the list into two halves.
;; mergesort: (listof Num) -> (listof Num)
(define (mergesort values)
  (local [;; merge: (listof Num) (listof Num) -> (listof Num)
          (define (merge list1 list2)
            (cond [(empty? list1) list2]
                  [(empty? list2) list1]
                  [(<= (first list1) (first list2))
                   (cons (first list1) (merge (rest list1) list2))]
                  [else
                   (cons (first list2) (merge list1 (rest list2)))]))]
    (cond [(empty? values) empty]
          [(empty? (rest values)) values] ; one element is already sorted
          [else (merge (mergesort (left-half-of values))
                       (mergesort (right-half-of values)))])))

However, it is difficult and computationally expensive to split a list in two - we can't implement left-half-of and right-half-of in a simple way. Instead, we take a different approach - we work from the bottom, and convert a list of lists into a smaller list of lists. Eventually, we merge two lists into one sorted list.
;; mergesort: (listof Num) -> (listof Num)
(define (mergesort values)
  (cond [(empty? values) empty]
        [else (merge-lists (map list values))])) ; start with one-element lists

;; merge-lists: (listof (listof Num)) -> (listof Num)
;; requires: `lists` is non-empty
(define (merge-lists lists)
  (cond [(empty? (rest lists)) (first lists)] ; one element
        [else (merge-lists (merge-pairs lists))]))

;; merge-pairs: (listof (listof Num)) -> (listof (listof Num))
(define (merge-pairs lists)
  (cond [(empty? lists) empty] ; no elements
        [(empty? (rest lists)) lists] ; one element
        [else (cons (merge (first lists) (second lists)) ; merge two adjacent elements together
                    (merge-pairs (rest (rest lists))))]))

;; merge: (listof Num) (listof Num) -> (listof Num)
(define (merge list1 list2)
  (cond [(empty? list1) list2]
        [(empty? list2) list1]
        [(<= (first list1) (first list2))
         (cons (first list1) (merge (rest list1) list2))]
        [else
         (cons (first list2) (merge list1 (rest list2)))]))

We can also simplify m e r g e s o r t by using a local definition:


(define (mergesort values)
  (local [(define (mergesort-ne lists)
            (cond [(empty? (rest lists)) (first lists)] ; one element
                  [else (mergesort-ne (merge-pairs lists))]))]
    (cond [(empty? values) empty]
          [else (mergesort-ne (map list values))])))

We know this terminates because merge-pairs, when given two or more lists, always produces a list that is around half as long as the given list. As a result, if we call it enough times, we will get a list of length 1, the base case.
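The halving behind this termination argument can be observed directly. In the sketch below, `merge-passes` is a hypothetical helper that records the list of lists before each pass; `merge` and `merge-pairs` are written out again (with `merge-pairs` recursing over every adjacent pair) so the block is self-contained under `#lang racket`.

```racket
#lang racket
;; merge: (listof Num) (listof Num) -> (listof Num)
(define (merge list1 list2)
  (cond [(empty? list1) list2]
        [(empty? list2) list1]
        [(<= (first list1) (first list2))
         (cons (first list1) (merge (rest list1) list2))]
        [else (cons (first list2) (merge list1 (rest list2)))]))

;; merge-pairs: merges each adjacent pair of sorted lists
(define (merge-pairs lists)
  (cond [(empty? lists) empty]
        [(empty? (rest lists)) lists]
        [else (cons (merge (first lists) (second lists))
                    (merge-pairs (rest (rest lists))))]))

;; merge-passes: records the state before every pass, ending with the
;; state in which at most one list remains
(define (merge-passes lists)
  (cond [(or (empty? lists) (empty? (rest lists))) (list lists)]
        [else (cons lists (merge-passes (merge-pairs lists)))]))

(define passes (merge-passes (map list '(3 1 2 5 4))))
passes
;; => '(((3) (1) (2) (5) (4))
;;      ((1 3) (2 5) (4))
;;      ((1 2 3 5) (4))
;;      ((1 2 3 4 5)))
(map length passes)  ; => '(5 3 2 1)
```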

21/11/13
Graphs
A graph is simply a collection of nodes where each node can refer to zero or more nodes, including itself. A directed graph is a collection of nodes together with a collection of edges. In a directed graph, edges have direction - the edge (A, B) is different from (B, A). The first points from A to B, while the second points from B to A. There are also undirected graphs where edges have no direction.

Trees are always graphs, but graphs are not always trees. Trees are graphs that obey some additional constraints.

We can draw graphs as diagrams. Nodes can be represented as dots with or without labels, and edges can be represented as arrows leading from one node to another.

A graph is useful for solving a lot of different types of problems. For example, internet routing, solving sliding puzzles, and finding road directions.

Definitions
In a graph, a vertex is a node. An edge connects two nodes together. An edge is an ordered pair of nodes like (A, B), where A and B are nodes. (A, B) is an edge connecting A and B. A is an in-neighbor of B (A points inward to B), and B is an out-neighbor of A (A points outward towards B).

A sequence of nodes $v_1, \ldots, v_k$ is a path or route with length $k - 1$ if $(v_1, v_2), (v_2, v_3), \ldots, (v_{k-1}, v_k)$ are edges in the graph. The length of a path is the number of edges that make it up. A cycle is a path where $v_1 = v_k$ - the path starts and ends at the same node.

A directed acyclic graph (DAG) is a directed graph that contains no cycles.

Representations
We can represent a graph as a list of nodes, each of which has a list of the nodes it points to. This is called the adjacency list representation. We will be using the adjacency list representation unless otherwise specified. This is basically an association list with nodes as keys and a list of their out-neighbors as values. In Scheme, we can represent a graph with (listof (list Symbol (listof Symbol))):
;; A Node is a Symbol. It is a vertex in a graph.
;; A NodeEntry is a (list Node (listof Node)). It stores a vertex and a list of vertices that the vertex points to.
;; A Graph is a (listof NodeEntry). It is a collection of vertices and their out-neighbors.

An example graph in this representation would be:


'((P (Q))
  (Q (Z))
  (W (X Y))
  (X (Q Z))
  (Y ())
  (Z ()))

Note that the order of the nodes in each list is completely arbitrary and does not matter.
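As a quick check of the definitions against this representation, the sketch below tests whether an edge exists and whether a sequence of nodes forms a path. The names `example-graph`, `out-neighbors-of`, `edge?`, and `valid-path?` are invented for this example, and the block assumes `#lang racket` (so the built-in `member` is used for membership).

```racket
#lang racket
;; The example graph from above.
(define example-graph
  '((P (Q)) (Q (Z)) (W (X Y)) (X (Q Z)) (Y ()) (Z ())))

;; out-neighbors-of: Node Graph -> (listof Node)
;; produces empty if `node` has no entry in `graph`
(define (out-neighbors-of node graph)
  (cond [(empty? graph) empty]
        [(symbol=? node (first (first graph))) (second (first graph))]
        [else (out-neighbors-of node (rest graph))]))

;; edge?: Node Node Graph -> Bool
;; true if (from, to) is an edge in `graph`
(define (edge? from to graph)
  (cons? (member to (out-neighbors-of from graph))))

;; valid-path?: (listof Node) Graph -> Bool
;; true if every consecutive pair of nodes is joined by an edge;
;; a single node counts as a path of length 0
(define (valid-path? nodes graph)
  (cond [(empty? nodes) false]
        [(empty? (rest nodes)) true]
        [else (and (edge? (first nodes) (second nodes) graph)
                   (valid-path? (rest nodes) graph))]))

(edge? 'W 'Y example-graph)              ; => #t
(edge? 'Y 'W example-graph)              ; => #f (edges are directed)
(valid-path? '(W X Q Z) example-graph)   ; => #t (length 3)
(valid-path? '(P Z) example-graph)       ; => #f
```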

Since in this representation, a graph is a list, we can use a list template:


;; my-graph-fn: Graph -> Any
(define (my-graph-fn graph)
  (cond [(empty? graph) ...]
        [else (... (first (first graph)) ...  ; node
                   (second (first graph)) ... ; node out-neighbors
                   (my-graph-fn (rest graph)) ...)]))

Working with Graphs


Backtracking algorithms try to find a route from an origin to a destination. They try a possibility, and if it doesn't work out, go back and try another, until either there are no possibilities left or a route is found. Suppose we wanted to write find-route, a function that finds a path from one node to another in a DAG. First we write a neighbors function:
;; neighbors: Node Graph -> (listof Node)
;; Purpose: looks up the list of out-neighbors of `node` in `graph`
(define (neighbors node graph)
  ;; we do not use a base case because the node is known to be in the graph
  (cond [(symbol=? node (first (first graph)))
         (second (first graph))]
        [else (neighbors node (rest graph))]))

If there is a path, either the starting location is equal to the target location (base case), or the path continues through one of the starting node's out-neighbors.
;; find-route: Node Node Graph -> (union (listof Node) false)
;; Purpose: produces a path leading from `start` to `end` in `graph`, or false if there is none
(define (find-route start end graph)
  (cond [(symbol=? start end) (list end)]
        [else
         (local [(define found-paths
                   (filter cons?
                           (map (lambda (node) (find-route node end graph))
                                (neighbors start graph))))]
           (cond [(empty? found-paths) false]
                 [else (cons start (first found-paths))]))]))

This works, but it isn't very efficient since we only really care about one possible path, and it would be more efficient to stop searching when we've found a path already:
;; find-route: Node Node Graph -> (union (listof Node) false)
;; Purpose: produces a path leading from `start` to `end` in `graph` or false if not possible
(define (find-route start end graph)
  (cond [(symbol=? start end) (list end)]
        [else (local [(define route (find-route-list (neighbors start graph) end graph))]
                (cond [(false? route) false]
                      [else (cons start route)]))]))

;; find-route-list: (listof Node) Node Graph -> (union (listof Node) false)
;; Purpose: produces a path leading from one of `nodes` to `end` or false if not possible
(define (find-route-list nodes end graph)
  (cond [(empty? nodes) false]
        [else (local [(define route (find-route (first nodes) end graph))]
                (cond [(cons? route) route]
                      [else (find-route-list (rest nodes) end graph)]))]))

We could trace this, but in this case it is more useful to do a trace tree. This is a tree drawn with the recursive call as each node.
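As an illustration of a trace tree, the sketch below repeats the mutually recursive search in a self-contained form (assuming `#lang racket`; the graph name `g` is made up) and lists, as comments, the calls made when searching from W to Z in the earlier example graph, indented by depth.

```racket
#lang racket
(define g '((P (Q)) (Q (Z)) (W (X Y)) (X (Q Z)) (Y ()) (Z ())))

;; neighbors: Node Graph -> (listof Node); `node` must be in `graph`
(define (neighbors node graph)
  (cond [(symbol=? node (first (first graph))) (second (first graph))]
        [else (neighbors node (rest graph))]))

(define (find-route start end graph)
  (cond [(symbol=? start end) (list end)]
        [else (local [(define route
                        (find-route-list (neighbors start graph) end graph))]
                (cond [(false? route) false]
                      [else (cons start route)]))]))

(define (find-route-list nodes end graph)
  (cond [(empty? nodes) false]
        [else (local [(define route (find-route (first nodes) end graph))]
                (cond [(cons? route) route]
                      [else (find-route-list (rest nodes) end graph)]))]))

;; Trace tree for (find-route 'W 'Z g), with the value of each call:
;; (find-route 'W 'Z g)                     => '(W X Q Z)
;;   (find-route-list '(X Y) 'Z g)          => '(X Q Z)
;;     (find-route 'X 'Z g)                 => '(X Q Z)
;;       (find-route-list '(Q Z) 'Z g)      => '(Q Z)
;;         (find-route 'Q 'Z g)             => '(Q Z)
;;           (find-route-list '(Z) 'Z g)    => '(Z)
;;             (find-route 'Z 'Z g)         => '(Z)
(find-route 'W 'Z g)
```

Because the first candidate succeeds at every level, Y and the direct edge from X to Z are never explored, which is the early stopping this version was written for.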

26/11/13
find-route is designed to work with acyclic graphs. If there is a cycle, it could potentially loop through the cycle infinitely.

For example, the graph '((A (B)) (B (C)) (C (A)) (D ())) would result in infinite recursion if we tried to find a route from A to D. For directed acyclic graphs, any route must have no more nodes in it than the number of nodes in the graph. So find-route has an upper bound on the number of times it recurses - the number of routes to any destination in the graph. So the function always terminates if there are no cycles.

What if there are cycles? One possible solution is to pass down a list of visited nodes to avoid visiting them again, since the visited nodes are still being processed:
;; find-route: Node Node Graph (listof Node) -> (union (listof Node) false)
;; Purpose: produces a path leading from `start` to `end` in `graph` having visited `visited` or false if not possible
(define (find-route start end graph visited)
  (cond [(symbol=? start end) (list end)]
        [else (local [(define route (find-route-list (neighbors start graph) end
                                                     graph (cons start visited)))]
                (cond [(false? route) false]
                      [else (cons start route)]))]))

;; find-route-list: (listof Node) Node Graph (listof Node) -> (union (listof Node) false)
;; Purpose: produces a path leading from one of `nodes` to `end` having visited `visited` or false if not possible
(define (find-route-list nodes end graph visited)
  (cond [(empty? nodes) false]
        [(member? (first nodes) visited)
         (find-route-list (rest nodes) end graph visited)]
        [else (local [(define route (find-route (first nodes) end graph visited))]
                (cond [(cons? route) route]
                      [else (find-route-list (rest nodes) end graph visited)]))]))
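A self-contained sketch of this idea, run on the cyclic graph mentioned earlier, might look as follows. It assumes `#lang racket`, so the built-in `member` stands in for the teaching-language `member?`; the names `cyclic` and `find-route-acc` are hypothetical, with `find-route` acting as a wrapper that supplies the initial empty accumulator.

```racket
#lang racket
(define cyclic '((A (B)) (B (C)) (C (A)) (D ())))

;; neighbors: Node Graph -> (listof Node); `node` must be in `graph`
(define (neighbors node graph)
  (cond [(symbol=? node (first (first graph))) (second (first graph))]
        [else (neighbors node (rest graph))]))

;; find-route: wrapper so callers never pass `visited` themselves
(define (find-route start end graph)
  (find-route-acc start end graph empty))

(define (find-route-acc start end graph visited)
  (cond [(symbol=? start end) (list end)]
        [else (local [(define route
                        (find-route-list (neighbors start graph) end
                                         graph (cons start visited)))]
                (cond [(false? route) false]
                      [else (cons start route)]))]))

(define (find-route-list nodes end graph visited)
  (cond [(empty? nodes) false]
        ;; skip nodes that are already on the current path
        [(member (first nodes) visited)
         (find-route-list (rest nodes) end graph visited)]
        [else (local [(define route
                        (find-route-acc (first nodes) end graph visited))]
                (cond [(cons? route) route]
                      [else (find-route-list (rest nodes) end graph visited)]))]))

(find-route 'A 'C cyclic)  ; => '(A B C)
(find-route 'A 'D cyclic)  ; => #f, and the search terminates
```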

Note that the value of visited is always the reverse of the path taken so far from the original starting node. In practice, we would usually write a wrapper function to avoid having to pass the extra visited parameter. The accumulator makes sure we never recurse deeper than there are nodes in the graph, so the function always terminates, even if there are cycles. However, this is still not very efficient. Consider the following graph:
'((A (B1 B2)) ; diamond 1
  (B1 (C))
  (B2 (C))
  (C (D1 D2)) ; diamond 2
  (D1 (E))
  (D2 (E))
  (E (F1 F2)) ; diamond 3
  (F1 (G))
  (F2 (G))
  (G (H1 H2)) ; diamond 4
  (H1 (I))
  (H2 (I))
  (I ()) ; end of diamond 4
  (Z ()))

If we tried to search for a path from A to Z, the backtracking search would check every possible path variation. For example, there are 2 ways to get from A to C, but C doesn't lead to Z in the first place, so searching from C is a waste of time the second time we get to it. In fact, with this diamond-shaped pattern of graph, the number of paths checked is an exponential function of the number of diamonds - $2^n$ for $n$ diamonds, so $2^4 = 16$ in our case. We can make this more efficient by having a failed route search in find-route-list return the list of nodes that were visited, so that we can avoid visiting them again in the remaining candidates in the list of neighbors:

;; find-route: Node Node Graph -> (union (listof Node) false)
;; Purpose: produces a path leading from `start` to `end` in `graph` or false if not possible
(define (find-route start end graph)
  (local [(define route (find-route-fast start end graph empty))]
    (cond [(empty? (first route)) false] ; no route found
          [else route])))

;; find-route-fast: Node Node Graph (listof Node) -> (union (listof Node) (listof (union empty Node)))
;; Purpose: produces a path leading from `start` to `end` in `graph` having visited `visited` or the list of visited nodes with empty prepended
(define (find-route-fast start end graph visited)
  (cond [(symbol=? start end) (list end)]
        [else (local [(define route (find-route-list (neighbors start graph) end
                                                     graph (cons start visited)))]
                (cond [(empty? (first route)) route]
                      [else (cons start route)]))]))

;; find-route-list: (listof Node) Node Graph (listof Node) -> (union (listof Node) (listof (union empty Node)))
;; Purpose: produces a path leading from one of `nodes` to `end` having visited `visited` or the list of visited nodes with empty prepended
(define (find-route-list nodes end graph visited)
  (cond [(empty? nodes) (cons empty visited)]
        [(member? (first nodes) visited)
         (find-route-list (rest nodes) end graph visited)]
        [else (local [(define route (find-route-fast (first nodes) end graph visited))]
                (cond [(empty? (first route)) ; route not found
                       (find-route-list (rest nodes) end graph (rest route))]
                      [else route]))]))

Now each node is visited at most once, since once a node has been visited, it stays in visited for the rest of the search. Therefore, the number of nodes we check is a linear function of the number of nodes in the graph. Since each node check calls neighbors, which takes linear time proportional to the number of nodes in the graph, the runtime of find-route is a quadratic function of the number of nodes in the graph.

Implicit Graphs
Sometimes it is impractical or inefficient to build the entire graph explicitly. An implicit graph is one that isn't explicitly represented as a graph. For example, the possible legal moves in a chess game could be an implicit graph, with nodes representing possible game states and edges representing legal moves between these states. It is not practical to represent all the possible moves explicitly, so we do not.

What if we wanted to use find-route to search for a path through an implicit graph? Note that the only part of the function that actually does anything with the graph is neighbors. In fact, all we have to do is implement a different neighbors function. A neighbors function that returned the list of legal moves in chess given the current game state would allow us to use a backtracking search to implement a simple chess solver, though its performance would be impractically poor.

In artificial intelligence applications, implementations usually add heuristics to determine which neighbors to explore first or which to skip entirely, in order to save time.
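Chess is far too large for a short example, so the sketch below uses a made-up number puzzle instead: from a number n, the legal moves are to 2n or n + 1, as long as the result stays at or below a limit. The puzzle, the `limit` of 20, and the names `moves`, `find-route-by`, and `find-route-list-by` are all assumptions for illustration. The graph is never built; the search only receives a function that produces out-neighbors on demand. The block assumes `#lang racket`.

```racket
#lang racket
;; Hypothetical puzzle: nodes are natural numbers >= 1.
(define limit 20)

;; moves: Nat -> (listof Nat)
;; the out-neighbors of n, computed on demand
(define (moves n)
  (filter (lambda (m) (<= m limit))
          (list (* 2 n) (add1 n))))

;; find-route-by: X X (X -> (listof X)) -> (union (listof X) false)
;; the same backtracking search as before, with the graph replaced by
;; a neighbors function
(define (find-route-by start end neighbors-of)
  (cond [(equal? start end) (list end)]
        [else (local [(define route
                        (find-route-list-by (neighbors-of start) end neighbors-of))]
                (cond [(false? route) false]
                      [else (cons start route)]))]))

(define (find-route-list-by nodes end neighbors-of)
  (cond [(empty? nodes) false]
        [else (local [(define route
                        (find-route-by (first nodes) end neighbors-of))]
                (cond [(cons? route) route]
                      [else (find-route-list-by (rest nodes) end neighbors-of)]))]))

(find-route-by 1 10 moves)  ; => '(1 2 4 8 9 10)
(find-route-by 5 3 moves)   ; => #f (moves only ever increase n)
```

Every move strictly increases n and the limit caps it, so this implicit graph is finite and acyclic and the plain search terminates; an implicit graph with cycles would need the visited accumulator as well.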

28/11/13
History
Computer Science is not a well defined field, having only existed for about 75 years. Early computations were done by humans, who were called "computers".

Charles Babbage (1791-1871) was a mathematician who invented but never built the difference engine and analytical engine, mechanical computing machines where the specification of the operations to be executed was separated from their actual execution. This was the first autonomous computing machine.

Augusta Ada Byron, Countess of Lovelace (1815-1852), assisted Babbage in the design of the machines and could be considered the first computer scientist or programmer. She wrote about the actual operation and use of the engines rather than just the design.

David Hilbert (1862-1943) was a mathematician who formalized the axiomatic treatment of Euclidean geometry. He is famous for posing 23 unsolved problems, some of which have since been solved. Most famously, Hilbert posed the question of whether mathematics is consistent (a statement cannot be proven both true and false), or complete (all true statements are provable).

Kurt Gödel (1906-1978) finally proved that any consistent, non-trivial system of axioms (a system of axioms capable of describing integer arithmetic) is not complete, and that its consistency cannot be proved within the system. The proof is based on "this statement cannot be proved" written in mathematical notation. If the statement were false, it would be provable, which would give a proof of a false statement and make the system inconsistent, so it must be true but not provable.

Another one of Hilbert's questions asked if there was a procedure that, given a formula, proves it true, shows it false, or shows that it is unprovable. This requires a precise definition of a procedure - a formal model of computation.

Alonzo Church (1903-1995) showed that such a procedure is not possible. He invented lambda calculus with his student Kleene. The notation is as simple as possible: ;wip

Alan Turing (1912-1954) defined a different model of computation, which resulted in a simpler and more influential proof. His Turing machine was a theoretical device with unlimited memory and a finite number of states. It could be thought of as a machine with an infinite tape and a finite state machine. One could represent a Turing machine using characters stored in the memory of another machine. Suppose there was a machine that could process one of these descriptions and determine whether the described machine would eventually halt or not. Then we could build another machine that halts if and only if the first machine says it wouldn't, and running that machine on its own description gives a contradiction. So it is impossible for such a machine to exist. This is called the halting/undecidability problem and was a very significant result. This was later adapted into lambda calculus, and an equivalence was established between the two models. Turing is also known for his contributions to code breaking in World War 2, and to the first electronic computer, Colossus.

John von Neumann (1903-1957) is known for his von Neumann computer architecture, with programs stored in the same memory as data. This is still the standard model for computation today, along with the Harvard architecture. However, it doesn't take advantage of parallel processing. ;wip: von neumann bottleneck

Grace Murray Hopper (1906-1992) was the author of the first compiler, defining an English-like data processing language which later became COBOL.

John Backus designed FORTRAN, which became the dominant language of numerical and scientific computation. The Backus-Naur notation is also attributable to him. He also proposed a functional programming language, and his work led to the development of LISP.

John McCarthy (1927-2011) was an AI researcher at MIT known for designing and implementing LISP, based on ideas from lambda calculus and recursive function theory. It was described with only the functions atom ((lambda (x) (not (cons? x)))), eq (equal?), car (first), cdr (rest), and cons. It also had the special forms lambda, quote, cond, and label (define). Many other functions could be implemented using just these functions. LISP (LISt Processing) eventually evolved into a general purpose programming language with many other useful functions. It became the dominant language in artificial intelligence due to its encouragement of modifying the language itself. It split into two main standards after the 1980s - Common Lisp and Scheme.

Gerald Sussman invented Scheme (together with Guy Steele) after actors and lexical scoping were added to an idea for a Lisp-like programming language (as opposed to the dynamic scoping of earlier Lisps). It became popular in the study of programming languages. He also co-wrote the textbook "Structure and Interpretation of Computer Programs", based on Scheme. The textbook "How to Design Programs" was written to remedy some issues with that book, such as a steep learning curve and a lack of methodology. ;wip: page 696

APPLAUSE

Copyright 2013 Anthony Zhang

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
