New IO library
Simon Marlow
The 100-mile view
• the API changes:
– Unicode
• putStr "A légpárnás hajóm tele van angolnákkal" works! (if your editor is set up right…)
• locale encoding by default, except for Handles in binary mode (openBinaryFile, hSetBinaryMode)
• changing the encoding on the fly:
hSetEncoding :: Handle -> TextEncoding -> IO ()
hGetEncoding :: Handle -> IO (Maybe TextEncoding)
data TextEncoding
latin1, utf8, utf16, utf32, … :: TextEncoding
mkTextEncoding :: String -> IO TextEncoding
localeEncoding :: TextEncoding
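A minimal sketch of the encoding API listed above, using only functions exported by System.IO. The helper names writeWith and readWith and the scratch file name are assumptions made for illustration; they are not part of the library.

```haskell
import System.IO

-- Write a String to a file using an explicit encoding (hypothetical helper).
writeWith :: TextEncoding -> FilePath -> String -> IO ()
writeWith enc path s = do
  h <- openFile path WriteMode
  hSetEncoding h enc
  hPutStr h s
  hClose h

-- Read a whole file back using an explicit encoding (hypothetical helper).
readWith :: TextEncoding -> FilePath -> IO String
readWith enc path = do
  h <- openFile path ReadMode
  hSetEncoding h enc
  s <- hGetContents h
  length s `seq` hClose h   -- force the lazy contents before closing
  return s

main :: IO ()
main = do
  let path = "encoding-demo.txt"   -- scratch file, name is an assumption
      msg  = "A légpárnás hajóm tele van angolnákkal"

  -- Round trip through UTF-8.
  writeWith utf8 path msg
  back <- readWith utf8 path
  print (back == msg)

  -- A Handle opened in binary mode reports no encoding.
  hb  <- openBinaryFile path ReadMode
  enc <- hGetEncoding hb
  putStrLn (maybe "binary mode: no encoding" show enc)
  hClose hb

  -- Encodings can also be looked up by name at run time.
  utf16le <- mkTextEncoding "UTF-16LE"
  writeWith utf16le path msg
  back16 <- readWith utf16le path
  print (back16 == msg)
```

Reading the file with a different encoding than it was written with (for example latin1 after writing utf8) yields a different String, which is a quick way to see that decoding happens inside the Handle.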
The 100-mile view (cont.)
• Better newline support
– teletypes needed both CR+LF to start a new line, and we’ve been paying for it ever since.
hSetNewlineMode :: Handle -> NewlineMode -> IO ()
… etc …
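A short sketch of hSetNewlineMode in use, assuming the NewlineMode record and the predefined modes exported by System.IO. The helpers writeWithNL and readWithNL and the scratch file name are illustrative assumptions.

```haskell
import System.IO

-- Write a String with a given newline translation (hypothetical helper).
writeWithNL :: NewlineMode -> FilePath -> String -> IO ()
writeWithNL nl path s = do
  h <- openFile path WriteMode
  hSetNewlineMode h nl
  hPutStr h s
  hClose h

-- Read a whole file with a given newline translation (hypothetical helper).
readWithNL :: NewlineMode -> FilePath -> IO String
readWithNL nl path = do
  h <- openFile path ReadMode
  hSetNewlineMode h nl
  s <- hGetContents h
  length s `seq` hClose h   -- force the lazy contents before closing
  return s

main :: IO ()
main = do
  let path = "newline-demo.txt"   -- scratch file, name is an assumption

  -- On output, translate '\n' to CR+LF.
  writeWithNL NewlineMode { inputNL = LF, outputNL = CRLF } path "one\ntwo\n"

  -- No translation on input: the CR characters are visible.
  raw <- readWithNL noNewlineTranslation path
  print raw      -- "one\r\ntwo\r\n"

  -- Universal mode on input: CR+LF is read back as '\n'.
  cooked <- readWithNL universalNewlineMode path
  print cooked   -- "one\ntwo\n"
```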
BufferedIO
class BufferedIO dev where
  newBuffer :: dev -> BufferState -> IO (Buffer Word8)

data Buffer e
  = Buffer {
      bufRaw   :: !(ForeignPtr e),
      bufState :: BufferState,   -- ReadBuffer | WriteBuffer
      bufSize  :: !Int,          -- in elements, not bytes
      bufL     :: !Int,          -- offset of first item in the buffer
      bufR     :: !Int           -- offset of last item + 1
    }
[Figure: buffer layout, showing bufRaw at the start of the memory, the data between bufL and bufR, and bufSize at the end]
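To make the role of the offsets concrete, here is a simplified model of the buffer that tracks only bufSize, bufL and bufR and has no ForeignPtr storage. BufModel and its helper functions are illustrative names for this sketch, not GHC's actual API.

```haskell
-- Offsets-only model of the Buffer record; no real storage behind it.
data BufModel = BufModel
  { bufSize :: Int   -- capacity in elements
  , bufL    :: Int   -- offset of first item in the buffer
  , bufR    :: Int   -- offset of last item + 1
  } deriving (Show, Eq)

emptyModel :: Int -> BufModel
emptyModel n = BufModel { bufSize = n, bufL = 0, bufR = 0 }

-- Number of elements currently held: the data lives in [bufL, bufR).
elems :: BufModel -> Int
elems b = bufR b - bufL b

-- Free space left at the end of the buffer.
available :: BufModel -> Int
available b = bufSize b - bufR b

-- A producer appended n elements: advance bufR.
produce :: Int -> BufModel -> BufModel
produce n b = b { bufR = bufR b + n }

-- A consumer removed n elements: advance bufL, and rewind both
-- offsets to 0 once the buffer becomes empty.
consume :: Int -> BufModel -> BufModel
consume n b
  | bufL b + n == bufR b = b { bufL = 0, bufR = 0 }
  | otherwise            = b { bufL = bufL b + n }

main :: IO ()
main = do
  let b0 = emptyModel 8
      b1 = produce 5 b0
      b2 = consume 2 b1
      b3 = consume 3 b2
  print (elems b1, available b1)   -- (5,3)
  print (elems b2, available b2)   -- (3,3)
  print b3                         -- empty again, offsets rewound to 0
```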
Example: memory-mapped files
2. (a) make it an instance of BufferedIO
instance BufferedIO MemoryMappedFile where
  newBuffer m state = do
    fp <- newForeignPtr_ (mmap_addr m)
    return (emptyBuffer fp (mmap_length m) state)

  -- fillReadBuffer returns the entire file!
  fillReadBuffer m buf = do
    p <- readIORef (mmap_ptr m)
    let l = mmap_length m
    if (p >= l)
      then do return (0, buf{ bufL=p, bufR=p })
      else do writeIORef (mmap_ptr m) l
              return (l-p, buf{ bufL=p, bufR=l })

  -- flush is a no-op: just remember where to read from next
  flushWriteBuffer m buf = do
    writeIORef (mmap_ptr m) (bufR buf)
    return buf{ bufL = bufR buf }
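The pointer bookkeeping in fillReadBuffer and flushWriteBuffer can be modelled as pure functions, which makes the two cases easy to check by hand. This is only a sketch: fillModel and flushModel are hypothetical names, and the model ignores the actual mapped memory.

```haskell
-- fillModel len p: given the file length and the current read pointer,
-- return (elements made available, (bufL, bufR), new read pointer).
fillModel :: Int -> Int -> (Int, (Int, Int), Int)
fillModel len p
  | p >= len  = (0, (p, p), p)              -- already at end of file
  | otherwise = (len - p, (p, len), len)    -- expose everything up to the end

-- flushModel bufR: nothing is copied; the next read starts at bufR,
-- and the buffer becomes empty at that offset.
flushModel :: Int -> (Int, (Int, Int))
flushModel r = (r, (r, r))

main :: IO ()
main = do
  print (fillModel 100 0)     -- (100,(0,100),100)
  print (fillModel 100 40)    -- (60,(40,100),100)
  print (fillModel 100 100)   -- (0,(100,100),100)
  print (flushModel 25)       -- (25,(25,25))
```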
Example: memory-mapped files (cont.)
2. (b) make it an instance of IODevice
instance IODevice MemoryMappedFile where
  close = IODevice.close . mmap_fd
  … etc …
Example: memory-mapped files (cont.)
3. provide a way to create instances
mmapFile :: FilePath -> IOMode -> Bool -> IO Handle
mmapFile filepath iomode binary = do
  -- Open the file and mmap() it (not shown here)
  let m = MemoryMappedFile {
            mmap_fd     = fd,
            mmap_addr   = castPtr addr,
            mmap_length = fromIntegral sz,
            mmap_ptr    = ptr }
  let (encoding, newline)
        | binary    = (Nothing, noNewlineTranslation)
        | otherwise = (Just localeEncoding, nativeNewlineMode)
  -- Call mkFileHandle to build the Handle (not shown here)
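The binary/text choice above can be tried on an ordinary file Handle with the public System.IO API. In this sketch handleSettings mirrors the guard from the slide, while applySettings and the scratch file name are assumptions introduced for illustration.

```haskell
import System.IO

-- Same decision as the guarded binding above.
handleSettings :: Bool -> (Maybe TextEncoding, NewlineMode)
handleSettings binary
  | binary    = (Nothing, noNewlineTranslation)
  | otherwise = (Just localeEncoding, nativeNewlineMode)

-- Apply the chosen settings to an already open Handle (hypothetical helper).
applySettings :: Handle -> (Maybe TextEncoding, NewlineMode) -> IO ()
applySettings h (menc, nl) = do
  case menc of
    Nothing  -> hSetBinaryMode h True   -- no encoding: raw bytes
    Just enc -> hSetEncoding h enc
  hSetNewlineMode h nl

main :: IO ()
main = do
  let path = "settings-demo.txt"   -- scratch file, name is an assumption
  writeFile path "hello\n"

  hBin <- openFile path ReadMode
  applySettings hBin (handleSettings True)
  hGetEncoding hBin >>= print . fmap show   -- Nothing
  hClose hBin

  hTxt <- openFile path ReadMode
  applySettings hTxt (handleSettings False)
  hGetEncoding hTxt >>= print . fmap show   -- Just the locale encoding's name
  hClose hTxt
```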
main = do
  [file,test] <- getArgs
  h <- if test == "mmap" then mmapFile file ReadWriteMode True
                         else openBinaryFile file ReadWriteMode
  hClose h
  putStrLn "done"
$ ghc test.hs --make
[1 of 1] Compiling Main ( test.hs, test.o )
Linking test ...
Timings…
data Handle__
  = forall dev enc_state dec_state . (IODevice dev, BufferedIO dev, Typeable dev) =>
    Handle__ {
      haDevice     :: !dev,
      haType       :: HandleType,   -- read/write/append etc.
      -- Two buffers: one for bytes, one for Chars.
      haByteBuffer :: !(IORef (Buffer Word8)),
      haCharBuffer :: !(IORef (Buffer CharBufElem)),
      haEncoder    :: Maybe (TextEncoder enc_state),
      haDecoder    :: Maybe (TextDecoder dec_state),
      haCodec      :: Maybe TextEncoding,
      haInputNL    :: Newline,
      haOutputNL   :: Newline,
      .. some other things ..
    }
  deriving Typeable
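The split between a byte buffer and a Char buffer is visible from the outside: the same bytes on disk come back as different Strings depending on whether a decoder sits between the two buffers. A small sketch using only System.IO follows; readCodes and the scratch file name are assumptions for illustration.

```haskell
import System.IO

-- Read a file and return the code points of the resulting Chars.
-- With Nothing the Handle stays in binary mode (one Char per byte);
-- with Just enc the bytes are decoded before reaching the caller.
readCodes :: Maybe TextEncoding -> FilePath -> IO [Int]
readCodes menc path = do
  h <- openBinaryFile path ReadMode
  maybe (return ()) (hSetEncoding h) menc
  s <- hGetContents h
  length s `seq` hClose h   -- force the lazy contents before closing
  return (map fromEnum s)

main :: IO ()
main = do
  let path = "two-buffers-demo.bin"   -- scratch file, name is an assumption

  -- In binary mode each Char is written as a single byte: 0xC3, 0xA9.
  hb <- openBinaryFile path WriteMode
  hPutStr hb "\xC3\xA9"
  hClose hb

  readCodes Nothing     path >>= print   -- [195,169]: two raw bytes
  readCodes (Just utf8) path >>= print   -- [233]: one decoded Char, 'é'
```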
Where to go from here
• This is a step in the right direction, but there is still some obvious ugliness
– We haven’t changed the external API, only added to it
– There should be a binary I/O layer
• hPutBuf working on Handles is wrong: binary Handles should have a different type
• in a sense, BufferedIO is a binary I/O layer: it is efficient, but inconvenient
– FilePath should be an abstract type.
• On Windows, FilePath = String, but on Unix, FilePath = [Word8].
– Should we rethink Handles entirely?
• OO-style layers: binary IO, buffering, encoding
• Separate read Handles from write Handles?
– read/write Handles are a pain
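One way to picture the "separate read Handles from write Handles" idea is a pair of newtype wrappers over the existing Handle, so that the type checker rejects a write to a read Handle. This is purely a speculative sketch: ReadHandle, WriteHandle and every function below are hypothetical names, not a proposed or existing API.

```haskell
import System.IO

-- Hypothetical wrappers: the direction is part of the type.
newtype ReadHandle  = ReadHandle  Handle
newtype WriteHandle = WriteHandle Handle

openForRead :: FilePath -> IO ReadHandle
openForRead path = ReadHandle <$> openFile path ReadMode

openForWrite :: FilePath -> IO WriteHandle
openForWrite path = WriteHandle <$> openFile path WriteMode

-- Only read operations accept a ReadHandle ...
getLineR :: ReadHandle -> IO String
getLineR (ReadHandle h) = hGetLine h

closeR :: ReadHandle -> IO ()
closeR (ReadHandle h) = hClose h

-- ... and only write operations accept a WriteHandle.
putStrLnW :: WriteHandle -> String -> IO ()
putStrLnW (WriteHandle h) = hPutStrLn h

closeW :: WriteHandle -> IO ()
closeW (WriteHandle h) = hClose h

main :: IO ()
main = do
  let path = "split-handles-demo.txt"   -- scratch file, name is an assumption
  w <- openForWrite path
  putStrLnW w "hello"
  closeW w

  r <- openForRead path
  getLineR r >>= putStrLn   -- prints "hello"
  closeR r
  -- putStrLnW r "oops" would be rejected at compile time.
```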