
A beginner's experience with Haskell

Sergei Winitzki
February 16, 2014

1 Introduction
I wanted to implement a simple, purely functional algorithm involving a lot of
small memory allocations, to see how different functional languages handle
that kind of task. I intentionally kept the programming style as naive
as possible: pure functional programming without any kind of optimization by
hand.
I have studied Haskell theoretically (just reading papers and tutorials), but I
have very little actual experience writing Haskell code. I wanted to get my feet
wet with an actual, although very simple, programming task. Here I'd like to
write down a little bit about my experience.

2 The programming task


The program must be a command-line executable. The program will read two
integers as command-line arguments. These will be the values M and N.
The program must implement a module for a "sorted bag" data structure.
This is a purely functional binary tree structure containing a sortable value type,
and the tree is always kept sorted. Elementary operations are "create an empty
bag" and "add one value to the bag".
The program must implement a value type Junk that is a sum type whose values
contain an integer and a string.

data Junk = A Int String | B Int String


In this way, it is guaranteed that any operation with a value of type Junk will
necessarily involve allocations of heap memory. Values of type Junk must be
sorted on the integer value. The program will implement a function for creating
a random value of type Junk, where at least the integer value is random. The
string value must be non-empty but does not have to be random.
The program will create N sorted bags, each bag containing M random
values of type Junk. The algorithm must do this by adding the random values
one by one to the bag.
The program will then be run, with progressively larger values of M and N,
until the runtime finds itself out of memory.
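The requirement that Junk values be sorted on the integer can be sketched as follows. This is only an illustration, not code from the original program: the helper junkKey and the hand-written instances are assumptions. A derived Ord instance would compare the constructors A and B first and the integer only afterwards, which is why the instances are written by hand here.

```haskell
import Data.Ord (comparing)

-- The sum type from the task statement.
data Junk = A Int String | B Int String
  deriving Show

-- Hypothetical helper (not in the original): the integer stored in
-- either constructor.
junkKey :: Junk -> Int
junkKey (A k _) = k
junkKey (B k _) = k

-- Equality and ordering look only at the integer, ignoring both the
-- constructor and the string.
instance Eq Junk where
  x == y = junkKey x == junkKey y

instance Ord Junk where
  compare = comparing junkKey
```

With these instances, A 1 "x" < B 2 "a" holds, and two values with the same integer compare as equal regardless of their strings.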

3 Implementation
3.1 Sorted bag
This was the easiest part. The code is a pleasure to write and to read.

module SBag where

data SB v = Empty | Branch (SB v) v (SB v)

empty_sbag = Empty

incl :: (Ord v) => SB v -> v -> SB v
incl Empty a = Branch Empty a Empty
incl (Branch l c r) a = if a < c
    then Branch (incl l a) c r
    else Branch l c (incl r a)
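To check that the bag really stays sorted, one could add two small helpers. The sketch below repeats the data type and the insertion function so that it stands alone; fromList and toList are hypothetical names that do not appear in the original module.

```haskell
import Data.List (foldl')

data SB v = Empty | Branch (SB v) v (SB v)

-- Same insertion as in the article: smaller values go left, all others right.
incl :: Ord v => SB v -> v -> SB v
incl Empty a = Branch Empty a Empty
incl (Branch l c r) a
  | a < c     = Branch (incl l a) c r
  | otherwise = Branch l c (incl r a)

-- Hypothetical helper: build a bag by adding the list elements one by one.
fromList :: Ord v => [v] -> SB v
fromList = foldl' incl Empty

-- Hypothetical helper: an in-order traversal returns the values in
-- ascending order, duplicates included.
toList :: SB v -> [v]
toList Empty          = []
toList (Branch l c r) = toList l ++ [c] ++ toList r
```

For instance, toList (fromList [3, 1, 2, 3, 0]) evaluates to [0, 1, 2, 3, 3]. Note that the tree is not balanced, so inserting already sorted input degenerates into a linked list.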

3.2 Reading command-line arguments


Here the fun started.
Looking up in Google: how to read the args. Right away I see the answer:
do args <- getArgs. However, this does not compile because it cannot find
getArgs. Stackoverflow says I need to import Environment for this to work.
Nope: module Environment not found. Eventually I saw that I have to do
import System.Environment(getArgs) for this to work.
Observation: Stackoverflow has lots of Haskell info. Almost all of my Haskell
questions were already discussed on SO (although not necessarily with satisfac-
tory answers). This means Haskell is industry-ready.
All right, so getArgs gives me a list of strings, and I need to fetch two
integers m and n from that list.
Another quick search on SO: how do I parse integers from strings. Oh, of
course. It's called read.

import System.Environment(getArgs)
main = do
  args <- getArgs
  let m = read $ head args
  let n = read $ head $ tail args
Let's see if this even compiles. Since I almost never write anything in Haskell,
I don't trust my syntax; perhaps I've made some mistakes already.
It doesn't compile, but not because of syntax errors:

The last statement in a 'do' block must be an expression


let n = read $ head $ tail args
Well, yes... it's a do-block defining main, and so it should evaluate to an IO ().
So let's just write a return () at the end of the program, and try compiling
again...

No instance for (Read a0) arising from a use of `read'
The type variable `a0' is ambiguous
Possible fix: add a type signature that fixes these type variable(s)
Note: there are several potential instances:
instance Read System.Random.StdGen
-- Defined in `System.Random'
instance Read Day
-- Defined in `time-1.4.0.1:Data.Time.Format.Parse'
instance Read LocalTime
-- Defined in `time-1.4.0.1:Data.Time.Format.Parse'
...plus 32 others
In the expression: read
In the expression: read $ head args
In an equation for `m': m = read $ head args
You are kidding me, dear Haskell! You say there are thirty-five instances of
Read a0, and I don't even know what a0 means in this context!
Let's try another approach. I recall that Haskell is lazy and won't actually
read m and n unless I do something with these values. Maybe if I print them,
things will work out better? So I add these lines before return ():
print $ show m
print $ show n
Compile...? Nope: now I get four times the same error message about no
instance for read and show. What is wrong?
Suddenly I realize what could be the problem. The functions read and show
are both polymorphic; I never told Haskell that m and n must be integers. Of
course it cannot figure out how to read or to print values of unknown types.
Maybe that's what it means when saying "the type variable a0 is ambiguous".
So let me tell Haskell that m and n are integers:
...
let (m :: Int) = read $ head args
let (n :: Int) = read $ head $ tail args
...
Save, compile...
Illegal type signature: `Int'
Perhaps you intended to use -XScopedTypeVariables
In a pattern type-signature
Now I really have no idea what this is saying! I can't even put a type signature
on an integer?? Why is Int an illegal type signature???
Well, certainly m::Int is a legal type signature, but maybe I'm using it at the
wrong place...
So I kept writing m::Int at various places in the program (in the head,
within main, etc.) until finally this worked:

print $ show (m :: Int)
print $ show (n :: Int)
The program finally compiled to a 1.3 MB executable that prints its
command-line arguments.

$ ghc --make program.hs


$ ./program 2 3
"2"
"3"
I spent half an hour implementing command-line arguments... Hopefully, this
time is not lost!
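In hindsight, the argument parsing can be written so that the types are fixed in one place. The sketch below is an assumption about how one might do it, not the code of the original program: parseArgs and run are hypothetical names, and readMaybe from Text.Read is used instead of read so that malformed input gives Nothing rather than a runtime exception. The signature Maybe (Int, Int) is what resolves the ambiguous type variable. An annotation on an expression, such as read (head args) :: Int, would also work without any language extension; it is only the annotation inside a pattern, as in let (m :: Int) = ..., that needs -XScopedTypeVariables.

```haskell
import System.Environment (getArgs)
import Text.Read (readMaybe)

-- Hypothetical helper: accept exactly two integer arguments.
-- The result type tells readMaybe which Read instance to use.
parseArgs :: [String] -> Maybe (Int, Int)
parseArgs [a, b] = do
  m <- readMaybe a
  n <- readMaybe b
  return (m, n)
parseArgs _ = Nothing

-- Hypothetical entry point; rename it to main in a real program.
run :: IO ()
run = do
  args <- getArgs
  case parseArgs args of
    Just (m, n) -> do
      print m   -- print already calls show; print (show m) would add quotes
      print n
    Nothing -> putStrLn "usage: program M N"
```

The pattern [a, b] also avoids head and tail, which fail at run time when fewer than two arguments are given.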

3.3 Generating random Junk


Next step: I need to generate random values of type Junk. It would be enough
to do something like this,

make_random_junk () = B random_int "abc"


if I only could define random_int correctly. I look into the documentation and
on SO: how do I generate a random integer, say, between 0 and 10000. Well,
there are lots of options for random generators, but I don't want that; I just
want a simple function. Looking at Haskell documentation (oh... by the way,
where is it? it took some time to find the official docs for the standard library)...
Found it. So, I see a function that looks very promising: generate a random
integer within a given range.

randomRIO :: (Int, Int) -> IO Int


Excellent! So I'm writing

import System.Random
and...
Oh no: Module System.Random not found. The fun continues!
Another SO search: there is a whole separate SO question about why
System.Random cannot be found, and what to do on Ubuntu! Well, the module is
not found because I have to install it by hand (even though it's standard).

$ sudo apt-get install cabal-install


$ cabal update
$ cabal install random
Good. Now we can go on.
I want to generate N random numbers and put them into Junk values. It
seems I will have to remain within the IO monad. So my next step is to implement
two functions:

-- make_random_junk will return a single, random Junk value
make_random_junk :: IO Junk
type JBag = SBag.SB Junk -- a sorted bag of junk
-- make_random_sbag n will return a bag with n random Junk values
make_random_sbag :: Int -> IO JBag

The first function is easy:

make_random_junk = do
  r <- randomRIO (0, 10000)
  return B r "abc"

Compile... Error: oh my! What's wrong this time?

Couldn't match type `String -> Junk' with `IO Junk'


Expected type: t0 -> [Char] -> IO Junk
Actual type: t0 -> Int -> String -> Junk
The function `return' is applied to three arguments,
but its type `(Int -> String -> Junk)
-> t0 -> Int -> String -> Junk'
has only four
In a stmt of a 'do' block: return B r "abc"
In the expression: do { r <- randomRIO (0, 10000); return B r "abc" }

I am loving this... The function is applied to three arguments, but its type has
only four ...
OK, admittedly Haskell gave a hint as to where my mistake might be. I
forgot parentheses around a constructed value...

return (B r "abc")
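The reason for the strange message is that function application binds tighter than anything else and associates to the left, so return B r "abc" is read as ((return B) r) "abc": return receives the constructor B alone, followed by two more arguments. The sketch below shows three equivalent ways of writing the corrected function. To keep it dependent only on base, the source of integers is passed in as a parameter; this is an assumption made for the illustration, and with the random package one would pass randomRIO (0, 10000) instead. The names makeJunkWith, makeJunkWith' and makeJunkWith'' are hypothetical.

```haskell
data Junk = A Int String | B Int String
  deriving Show

-- Hypothetical variant of make_random_junk that takes the integer
-- source as an argument.
makeJunkWith :: IO Int -> IO Junk
makeJunkWith gen = do
  r <- gen
  return (B r "abc")   -- return takes ONE argument: the whole Junk value

-- The same with ($) instead of parentheses.
makeJunkWith' :: IO Int -> IO Junk
makeJunkWith' gen = do
  r <- gen
  return $ B r "abc"

-- The same without do-notation: map a pure function over the result.
makeJunkWith'' :: IO Int -> IO Junk
makeJunkWith'' gen = fmap (\r -> B r "abc") gen
```

All three run the integer source exactly once and wrap its result in the B constructor.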

Continue to make_random_sbag.

3.4 Monadic fun, part 1


How would I go about making a bag of random junk?
Here's my plan: I will first create a list of junk items and then put these
items one after another into an empty sorted bag.
I already have a function that creates a single value; can I make a list of
these values? The standard library has the function replicate,

replicate :: Int -> a -> [a]

So, for example, replicate 10 has type a -> [a] and produces a list of 10
elements.
I already have a value of type IO Junk, but I need a list, i.e. a value of type
[Junk]. Now, I can't escape the monad, so I should aim for a value of type IO
[Junk]. So, let's see... maybe I can use some clever library function that lifts
functions into the IO monad.
Indeed: this is fmap.

fmap (replicate 10) :: IO a -> IO [a]

It remains to apply this function to make_random_junk, and we get our list.

io_list_of_junk :: Int -> IO [Junk]


io_list_of_junk m = fmap (replicate m) make_random_junk

The types are correct, so the program must work correctly, right...?
Oh woe! Wrong, I was! Mistake, I did make!
While debugging the completed code, I printed the result of computing
io_list_of_junk; all the random integers were the same!

3.5 Monadic fun, part 2


Digging a little deeper, I found that there is another function called replicateM.
This is defined in Control.Monad and is a special version of replicate adapted
to work with monads.

replicateM :: (Monad m) => Int -> m a -> m [a]

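And replicateM n is not equal to fmap (replicate n)! Specialized to IO, both
have type IO a -> IO [a], but the computational effects are quite different:
fmap (replicate n) runs the action once and copies its result n times, while
replicateM n runs the action n times. Replacing replicate by replicateM cured
the problem of repeating random integers.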
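The difference can be made visible without random numbers. In the sketch below, which is an illustration and not part of the original program, a counter stored in an IORef stands in for the random generator: every execution of the action returns a new number. The names tick, copies, repeats and demo are hypothetical.

```haskell
import Control.Monad (replicateM)
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)

-- Hypothetical stand-in for make_random_junk: increments a counter and
-- returns the new value, so each execution of the action is observable.
tick :: IORef Int -> IO Int
tick ref = do
  modifyIORef' ref (+ 1)
  readIORef ref

-- Runs the action ONCE, then copies its result n times.
copies :: Int -> IO a -> IO [a]
copies n action = fmap (replicate n) action

-- Runs the action n times and collects the n results.
repeats :: Int -> IO a -> IO [a]
repeats n action = replicateM n action

-- Applies both to a fresh counter each.
demo :: IO ([Int], [Int])
demo = do
  ref1 <- newIORef 0
  xs   <- copies 3 (tick ref1)
  ref2 <- newIORef 0
  ys   <- repeats 3 (tick ref2)
  return (xs, ys)
```

Running demo gives ([1,1,1],[1,2,3]): the first list contains one result three times, the second contains three separate results. The type checker cannot tell the two apart, because both functions have the same type.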
Finally, I was on my way to completing the task. To make the code self-
documenting, I split the computation into intermediate steps and wrote explicit
type signatures.

make_random_sbag n = io_fold_sbag io_list_of_junk
  where
    io_list_of_junk = replicateM n make_random_junk :: IO [Junk]
    fold_sbag = foldl SBag.incl SBag.empty_sbag :: [Junk] -> JBag
    io_fold_sbag = liftM fold_sbag :: IO [Junk] -> IO JBag

The types of the functions match; the program works correctly... this time.
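The remaining step of the task, creating N bags of M values each, follows the same pattern. The sketch below is an assumption about how it might look, not the code that was actually written: the generator action is passed in as a parameter so that the example depends only on base, and makeBag, makeBags and size are hypothetical names.

```haskell
import Control.Monad (replicateM)
import Data.List (foldl')

data SB v = Empty | Branch (SB v) v (SB v)

incl :: Ord v => SB v -> v -> SB v
incl Empty a = Branch Empty a Empty
incl (Branch l c r) a
  | a < c     = Branch (incl l a) c r
  | otherwise = Branch l c (incl r a)

-- Hypothetical helper: number of values stored in a bag.
size :: SB v -> Int
size Empty          = 0
size (Branch l _ r) = size l + 1 + size r

-- One bag of m values; the generator action is executed m times.
makeBag :: Ord v => Int -> IO v -> IO (SB v)
makeBag m gen = fmap (foldl' incl Empty) (replicateM m gen)

-- n bags of m values each; the generator is executed n * m times in total.
makeBags :: Ord v => Int -> Int -> IO v -> IO [SB v]
makeBags n m gen = replicateM n (makeBag m gen)
```

Here fmap is the right tool for the fold, because the fold is a pure function applied to the result of a single action, while replicateM is needed wherever an action must be repeated.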

4 Conclusion
I omit further steps in the implementation; creating a list of n random bags
and timing each operation was another matter of juggling the monads. After
the initial blunders documented above, I had no more surprises; the program
worked correctly without debugging.
What are my impressions from this brief experience?

Haskell is industry-ready, in the sense that it can be installed
straightforwardly, is free of bugs, has lots of libraries, and all questions
are already on Stackoverflow.

Haskell forces you to organize your computation in a very specific way.
You need to know your way very well around the Prelude, monadic
combinators, zipWithM, liftM, and such. There is just no other way to,
say, do something n times in Haskell, except if you apply a monadic
replicateM to the corresponding IO function.

The code is beautiful, and once the types match, the program runs
correctly. (Except if you mess up with monads, like I did...)
