Exercise: Write transfer function to move money between accounts
import Control.Concurrent
import Control.Monad
type Account = MVar Double
transfer :: Double -> Account -> Account -> IO ()
transfer amount from to = ???
Example:
*Main> :load "transfer.hs"
Ok, modules loaded: Main.
*Main> main
9.0
1.0
type Account = MVar Double
transfer :: Double -> Account -> Account -> IO ()
transfer amount from to =
modifyMVar_ from $ \bf -> do
when (bf < amount) $ fail "not enough money"
modifyMVar_ to $ \bt -> return $! bt + amount
return $! bf - amount
type Account = MVar Double
transfer :: Double -> Account -> Account -> IO ()
transfer amount from to =
modifyMVar_ from $ \bf -> do
when (bf < amount) $ fail "not enough money"
modifyMVar_ to $ \bt -> return $! bt + amount
return $! bf - amount
Can deadlock when simultaneously transferring money in both directions
forkIO $ transfer 1 ac1 ac2
forkIO $ transfer 1 ac2 ac1
Throwing an exception when not enough money is ugly… what if we just waited for enough money to show up before completing the transfer?
tryTakeMVar
for second MVar
safetransfer :: Double -> Account -> Account -> IO ()
safetransfer amount from to = do
let tryTransfer = modifyMVar from $ \ bf -> do
when (bf < amount) $ fail "not enough money"
mbt <- tryTakeMVar to
case mbt of
Just bt -> do putMVar to $! bt + amount
return (bf - amount, True)
Nothing -> return (bf, False)
ok <- tryTransfer
unless ok $ safetransfer (- amount) to from
from
But in Haskell, the IO
type (or lack thereof) can control side effects
Slides inspired by good write-up in [Peyton Jones]
TVar a
IORef a
(like an MVar w/o lock)Control.Concurrent.TVar
gives younewTVarIO :: a -> IO (TVar a)
readTVarIO :: TVar a -> IO a
readTVar :: TVar a -> STM a
writeTVar :: TVar a -> a -> STM ()
modifyTVar :: TVar a -> (a -> a) -> STM () -- lazy
modifyTVar' :: TVar a -> (a -> a) -> STM () -- strict
New STM
monad allows TVar
access but no irreversible side effects
atomically :: STM a -> IO a
atomically
lets you run STM
computations from IO
IO
actionstype Account = TVar Double
transfer :: Double -> Account -> Account -> STM ()
transfer amount from to = do
modifyTVar' from (subtract amount)
modifyTVar' to (+ amount)
main :: IO ()
main = do
ac1 <- newTVarIO 10
ac2 <- newTVarIO 0
atomically $ transfer 1 ac1 ac2
subtract a b = b - a
(- a)
because that’s unary negation (i.e., 0-a
)retry :: STM a
orElse :: STM a -> STM a -> STM a
retry
aborts the transaction
STM
knows what TVar
s code read to detect conflicts…TVar
code read changes w/o explicit condition variablestransfer :: Double -> Account -> Account -> STM ()
transfer amount from to = do
bf <- readTVar from
when (amount > bf) retry
modifyTVar' from (subtract amount)
modifyTVar' to (+ amount)
orElse
tries second action if first one aborts (sleeps if both abort)
transfer2 :: Double -> Account -> Account -> Account -> STM ()
transfer2 amount from1 from2 to =
atomically $ transferSTM amount from1 to
`orElse` transferSTM amount from2 to
alwaysSucceeds :: STM a -> STM ()
alwaysSucceeds
registers invariant to check after every transaction
(Either the invariant throws an exception or its return value ignored)
Example: say you are paranoid about negative account balances
newAccount :: Double -> STM Account
newAccount balance = do
tv <- newTVar balance
alwaysSucceeds $ do balance <- readTVar tv
when (balance < 0) $ fail "negative balance"
return tv
bogus :: IO ()
bogus = do
ac <- atomically $ newAccount 10
atomically $ modifyTVar ac (subtract 15)
*Main> bogus
*** Exception: negative balance
Let’s get back to pure functional code
How does the compiler represent data in memory?
A value requires a constructor, plus arguments
struct Val {
unsigned long constrno; /* constructor # */
struct Val *args[]; /* flexible array */
};
[Int]
, constrno
might be 0 for []
and 1 for (:)
, where []
has 0-sized args
and (:)
has 2-element args
Int
, constrno
can be the actual integer, with no args
Point
) constrno
not usedargs
Int
s are Val *
s, requiring chasing a pointertypedef struct Val {
const struct ValInfo *info;
struct Val *args[];
} Val;
/* Statically allocated at compile time. Only one per
* constructor (or closure-creating expression, etc.) */
struct ValInfo {
struct GCInfo gcInfo; /* for garbage collector */
enum { CONSTRNO, FUNC, THUNK, IND } tag;
union {
unsigned int constrno;
Val *(*func) (const Val *closure, const Val *arg);
Exception *(*thunk) (Val *closure);
};
};
gcInfo
says how many Val *
s are in args
and where they aretag == CONSTRNO
means constrno
valid, used as on last slidetag == IND
means args[0]
is an indirect forwarding pointer to another Val
and union is unused; useful if size of args
growsA Val
whose ValInfo
has tag == FUNC
uses the func
field
Val *(*func) (const Val *closure, const Val *arg);
To apply function f
to argument a
(where both are type Val *
):
f->info->func (f, a);
func
’s first argument (closure
) is the function Val
itself
ValInfo
/func
can be re-usedfunc
’s second argument (arg
) is the argument a
on which the function is being evaluatedTop-level bindings don’t need the closure
argument to func
addOne :: Int -> Int
addOne x = x + 1
Val
for function addOne
can have zero-length args
Local bindings may need environment values in closure
add :: Int -> (Int -> Int)
add n = \m -> addn m
where addn m = n + m
addn
onceaddn
function (with a different n
) for each invocation of add
addn
instance is a different Val
, but all share the same ValInfo
args[0]
in each Val
to specify the value of n
A Val
with tag == THUNK
uses the thunk
field in ValInfo
Exception *(*thunk) (Val *closure);
v
(turns it into non-thunk) or returns a non-NULL
Exception *
To evaluate a thunk:
v->info->thunk (v);
args
?
IND
ValInfo
tag—Allocate new Val
, place indirect forwarding pointer in old Val
A possible implementation of forcing that walks IND
pointers:
Exception *force (Val **vp)
{
for (;;) {
if ((*vp)->info->tag == IND)
*vp = (*vp)->arg[0];
else if ((*vp)->info->tag == THUNK) {
Exception *e = (*vp)->info->thunk (*vp);
if (e)
return e;
}
else
return NULL;
}
}
Set closure->args
to head of list of previously curried args
const3 :: a -> b -> c -> a
const3 a b c = a
ValInfo
s and 3 functions for const3
ValInfo
has func = const3_1
const3_1
creates Val v1
where arg[0]
is first argument (a
) and info->func = const3_2
const3_2
creates a Val v2
where arg[0]
is the second argument (b
), arg[1]
is v1
, and info->func
is const3_3
const3_3
has access to all arguments and actually implements const3
Shared arguments have common arg tails, only evaluated once
let f = const3 (superExpensive 5) -- v1, evaluated once
in (f 1 2, f 3 4)
const3 :: a -> b -> c -> a
const3 a b c = a
Val *const3_1 (Val *ignored, Val *a)
{
v = (Val *) gc_malloc (offsetof (Val, args[1]));
v->info = &const3_2_info; /* func = const3_2 */
v->args[0] = a;
return v;
}
Val *const3_2 (Val *closure, Val *b)
{
v = (Val *) gc_malloc (offsetof (Val, args[2]));
v->info = &const3_3_info; /* func = const3_3 */
v->args[0] = b;
v->args[1] = closure;
return v;
}
Val *const3_3 (Val *v, Val *c)
{
return v->args[1]->args[0];
}
Int
has even more overhead
i->info->tag
then access i->info->constr
ValInfo
structure (but ValInfo
s statically allocated—how do you know what numbers the program will need)Idea: Have special unboxed types that don’t use struct Val
union Arg {
struct Val *boxed; /* most values are boxed */
unsigned long unboxed; /* "primitive" values */
};
typedef struct Val {
const struct ValInfo *info;
union Arg args[]; /* args can be boxed or unboxed */
} Val;
ValInfo
)Val *
argGCInfo
to identify which args are and are not boxed#
character–must enable with -XMagicHash
optionInt#
) and primitive operations on them (+#
):browse GHC.Prim
” in GHCI2#
, 'a'#
, 2##
(unsigned), 2.0##
Int
really?
Prelude> :set -XMagicHash
Prelude> :m +GHC.Types GHC.Prim
Prelude GHC.Types GHC.Prim> :i Int
data Int = I# Int# -- Defined in GHC.Types
...
Prelude GHC.Types GHC.Prim> case 1 of I# u -> I# (u +# 2#)
3
Val *
always fits in general-purpose register, unboxed types might need FP reg{-# LANGUAGE MagicHash #-}
import GHC.Prim
data FastPoint = FastPoint Double# Double# -- ok
fp = FastPoint 2.0## 2.0## -- ok
-- Error: can't pass unboxed type to polymorphic function
fp' = FastPoint 2.0## (id 2.0##)
-- Error: can't use unboxed type as type parameter
noInt :: Maybe Int#
noInt = Nothing
Enforced by making unboxed types a different kind of type
Prelude GHC.Types GHC.Prim> :kind Int#
Int# :: #
Recall type variables have kinds with stars (∗, ∗ → ∗, etc.), never #
Polymorphism works because all types of kind ∗ represented by our Val *
seq
revisitedseq :: a -> b -> b
seq a b
is forced, then first a
is forced, then b
is forced and returnedConsider the following code:
infiniteLoop = infiniteLoop :: Char -- loops forever
seqTest1 = infiniteLoop `seq` "Hello" -- loops forever
seqTest2 = str `seq` length str -- returns 6
where str = infiniteLoop:"Hello"
seqTest1
hangs forever, while seqTest2
happily returns 6seq
only forces a Val
, not the arg
fields of the Val
seqTest2
’s seq
forces str
’s constructor (:)
, but not the head or tailstr
in Weak Head Normal Form (WHNF)seq
implementationconst struct ValInfo seq_info = {
some_gcinfo, THUNK, .thunk = &seq_thunk
};
Val *seq_2 (Val *closure, Val *b)
{ /* assume seq_1 put first arg of (seq a b) in closure */
c = (Val *) gc_malloc (offsetof (Val, args[2]));
c->info = &seq_info;
c->args[0].boxed = closure->args[0];
c->args[1].boxed = b;
return c;
}
Exception *seq_thunk (Void *c)
{
Exception *e = force (&c->args[0].boxed);
if (!e) {
c->info = &ind_info; /* ValInfo with tag = IND */
c->args[0].boxed = c->args[1]; /* forward to b */
}
return e;
}
Recall strictness flag on fields in data declarations
data IntWrapper = IntWrapper !Int
Int
has !
before it, meaning it must be strictInt
’s ValInfo
cannot have tag
THUNK
or IND
Int
touches only one cache line
data Int = I# Int#
has only one constructortag == CONSTRNO
, so know what’s in ValInfo
Int#
is unboxedThus, once IntWrapper
forced, immediately safe to access Int
as
myIntWrapper.arg[0].boxed->arg[0].unboxed
Int
is not just a number
Int
are {0, 1}64 ∪ {⊥}Note 2: !Int
not a first-class type, only valid for data
fields
data SMaybe a = SJust !a | SNothing -- ok, data field
strictAdd :: !Int -> !Int -> !Int -- error
type StrictMaybeInt = Maybe !Int -- error
case
statements revisitedcase
statement pattern matching can force thunks
_
is irrefutablecase
undefined :: a
is Prelude
symbol with value ⊥f ('a':'b':rest) = rest
f _ = "ok"
test1 = f (undefined:[]) -- error
test2 = f ('a':undefined) -- error
test3 = f ('x':undefined) -- "ok" (didn't force tail)
Adding ~
before a pattern makes it irrefutable
three = (\ ~(h:t) -> 3) undefined -- evaluates to 3
newtype
declarationsdata
– creates a new (boxed) type, adding overhead of a Val
wrappertype
– creates an alias for an existing type, with no overheadnewtype
– so we have have been vague about itMeters
, Seconds
, Grams
, all implemented by Double
type
would make them all synonymous, facilitating errorsShow
for each, impossible with type
data Meters = Meters Double
– but will add overheadnewtype
keyword introduces new type with no overhead
data
, but limited to one constructor and one fieldnewtype
semanticsWhat’s the semantic difference between these two declarations?
newtype NTInt = NTInt Int deriving (Show) -- def 1
data NTInt = NTInt !Int deriving (Show) -- def 2
Exercise: Suppose you have
uNTInt = NTInt undefined
Write code that behaves differently for two definitions of NTInt
newtype
semanticsWhat’s the semantic difference between these two declarations?
newtype NTInt = NTInt Int deriving (Show) -- def 1
data NTInt = NTInt !Int deriving (Show) -- def 2
Solution:
uNTInt = NTInt undefined
testNT = case uNTInt of NTInt _ -> True
The NTInt
constructor is a “fake” compile-time-only construct
newtype
compiles to nothingtestNT
evaluates to True
under definition 1 (newtype
)Conversely, forcing a value (by matching constructor) forces strict fields
data
)UNPACK
pragmanewtype
almost always better than data
when it appliesWhat about a multi-field data type?
data TwoInts = TwoInts !Int !Int
CONSTRNO
ValInfo
Int#
s directly into the args
of a TwoInts
Val
?GHC provides an UNPACK
pragma to do just this
data TwoInts = TwoInts {-# UNPACK #-} !Int {-# UNPACK #-} !Int
Works for any strict field with a single-constructor datatype (so omitted ValInfo
unambiguous)
newtype
, UNPACK
is not always a win
-funbox-strict-fields
flag unpacks all strict fields
Ptr a
represents pointers to type a
Pointers are not typesafe–allow pointer arithmetic and casting
nullPtr :: Ptr a
plusPtr :: Ptr a -> Int -> Ptr b
minusPtr :: Ptr a -> Ptr b -> Int
castPtr :: Ptr a -> Ptr b
Class Storable
provides raw access to memory using Ptr
s
class Storable a where
sizeOf :: a -> Int
alignment :: a -> Int
peek :: Ptr a -> IO a
poke :: Ptr a -> a -> IO ()
...
Bool
, Int
, Char
, Ptr a
, etc.) are Storable
alloca
Easiest way to get a valid Ptr
is alloca
:
alloca :: Storable a => (Ptr a -> IO b) -> IO b
a
Ptr
to the spacealloca
)allocaBytes :: Int -> (Ptr a -> IO b) -> IO b
Foreign
module provides handy with
utility
with :: Storable a => a -> (Ptr a -> IO b) -> IO b
with val f =
alloca $ \ptr -> do
poke ptr val
res <- f ptr
return res
Storable
typesForeign.C
contains wrappers for C types
CInt
, CUInt
, CChar
, CDouble
, CIntPtr
etc.Data.Int
and Data.Word
have all sizes of machine integer
Int8
, Int16
, Int32
, Int64
– signed integersWord8
, Word16
, Word32
, Word64
– unsigned integersExample: extract all the bytes from a Storable
object
toBytes :: (Storable a) => a -> [Word8]
toBytes a = unsafePerformIO $
with a $ \pa -> go (castPtr pa) (pa `plusPtr` sizeOf a)
where go p e | p < e = do b <- peek p
bs <- go (p `plusPtr` 1) e
return (b:bs)
| otherwise = return []
unsafePerformIO
might be okay here since toBytes
pureplusPtr
lets us change from Ptr a
to Ptr Word8
malloc
and mallocForeignPtr
Can also allocate longer-lived memory with malloc
malloc :: Storable a => IO (Ptr a)
mallocBytes :: Int -> IO (Ptr a)
free :: Ptr a -> IO ()
realloc :: Storable b => Ptr a -> IO (Ptr b)
reallocBytes :: Ptr a -> Int -> IO (Ptr a)
ForeignPtr
lets you delegate deallocation to garbage collector
mallocForeignPtr :: Storable a => IO (ForeignPtr a)
mallocForeignPtrBytes :: Int -> IO (ForeignPtr a)
ForeignPtr
sForeignPtr
, must convert it to Ptr
ForeignPtr
in scope when using Ptr
?Ptr
within function that keeps reference to ForeignPtr
withForeignPtr :: ForeignPtr a -> (Ptr a -> IO b) -> IO b
Can also convert Ptr
s to ForeignPtr
s
type FinalizerPtr a = FunPtr (Ptr a -> IO ())
newForeignPtr :: FinalizerPtr a -> Ptr a
-> IO (ForeignPtr a)
newForeignPtr_ :: Ptr a -> IO (ForeignPtr a)
addForeignPtrFinalizer :: FinalizerPtr a -> ForeignPtr a
-> IO ()
FunPtr
– this is type wrapper for C function pointer
finalizerFree
symbol conveniently provides function pointer for free
Can import foreign functions like this:
foreign import ccall unsafe "stdlib.h malloc"
c_malloc :: CSize -> IO (Ptr a)
foreign import ccall unsafe "stdlib.h free"
c_free :: Ptr a -> IO ()
ccall
says use C calling convention (also cplusplus
and few others)unsafe
promises the C function will not call back into Haskellunafe
faster than safe
, but gives undefined results if call triggers GC"
[static
] [c-header] [&
][c-name]"
static
required only if c-name is dynamic
or wrapper
.h
file with the declaration (ignored by GHC)FunPtr
s)foreign import ccall unsafe "foo.h foo"
foo :: Int -- foo must be function: int foo(void);
foreign import ccall unsafe "foo.h &bar"
bar :: Ptr Int -- here bar can be int: int bar;
Char
, Int
, Double
, Float
, Bool
, Int8
, Int16
, Int32
, Int64
, Word8
, Word16
, Word32
, Word64
, Ptr
a
, FunPtr a
, and StablePtr a
type
or newtype
wrappers for basic types (CInt
, CChar
, etc.)()
(for functions returning void
)IO a
where a
is either of the above twoIO
if function has side effects or non-determinism
Okay to omit if it is a pure C function:
foreign import ccall unsafe "arpa/inet.h ntohl"
ntohl :: Word32 -> Word32
Haskell can’t check C purity, so omitting IO
can cause problems
hsc2hs
How to access C data structures?
struct mystruct {
char *name;
int value;
};
data MyStruct -- no constructors, just a placeholder
getValue :: Ptr MyStruct -> IO CInt
getValue ptr = peek $ ptr `plusPtr` 8 -- assumes char * 8 bytes
hsc2hs
is pre-processor that lets you compute C values
#include "myheader.h"
getValue ptr = peek $ ptr `plusPtr`
#{offset struct mystruct, value}
printf
template-hsc.h
on your system to see defs of #
commands#let
(like #define
w/o parens)ByteString
sHaskell String
s obviously not very efficient
In Lab2, you saw faster ByteStrings
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8
S.head
, S.tail
, S.length
, S.foldl
, S.cons
(like :
), S.empty
(like []
), S.hPut
(like hPutStr
), S.readFile
S.pack
and S.unpack
translate to/from [Word8]
S8
has same functions as S
, but uses Char
instead of Word8
—means you lose upper bits of Char
toString
from utf8-string to avoid loss, or use Bryan’s text instead of ByteString
for any serious text needsNow let’s look at implementation
data ByteString = PS {-# UNPACK #-} !(ForeignPtr Word8)
{-# UNPACK #-} !Int -- offset
{-# UNPACK #-} !Int -- length
ByteString
sSame package implements lazy ByteString
s
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as L8
ByteString
modulesS.ByteString
and S8.ByteString
are the same type (re-exported), and similarly for L.ByteString
and L8.ByteString
S.ByteString
and L.ByteString
not same type, but can convert:fromChunks :: [S.ByteString] -> L.ByteString
toChunks :: L.ByteString -> [S.ByteString]
ByteString
implementationLazy ByteString
s are implemented in terms of strict ones
data ByteString = Empty
| Chunk {-# UNPACK #-} !S.ByteString ByteString
Chunk
’s first argument (S.ByteString
) never null
ByteString
sByteString
s?
S.ByteString
s, but not copy the data they contain)ByteString
s is cheap, reverse is not (so if a library can work efficiently on lazy ByteString
s, good to expose that functionality)