Exercise: Write transfer function to move money between accounts
import Control.Concurrent
import Control.Monad
type Account = MVar Double
transfer :: Double -> Account -> Account -> IO ()
transfer amount from to = ???
Example:
*Main> :load "transfer.hs"
Ok, modules loaded: Main.
*Main> main
9.0
1.0
type Account = MVar Double
transfer :: Double -> Account -> Account -> IO ()
transfer amount from to =
modifyMVar_ from $ \bf -> do
when (bf < amount) $ fail "not enough money"
modifyMVar_ to $ \bt -> return $! bt + amount
return $! bf - amount
type Account = MVar Double
transfer :: Double -> Account -> Account -> IO ()
transfer amount from to =
modifyMVar_ from $ \bf -> do
when (bf < amount) $ fail "not enough money"
modifyMVar_ to $ \bt -> return $! bt + amount
return $! bf - amount
Can deadlock when simultaneously transferring money in both directions
forkIO $ transfer 1 ac1 ac2
forkIO $ transfer 1 ac2 ac1
Throwing an exception when not enough money is ugly... what if we just waited for enough money to show up before completing the transfer?
tryTakeMVar
for second MVar
transfer :: Double -> Account -> Account -> IO ()
transfer amount from to = do
let tryTransfer = modifyMVar from $ \ bf -> do
when (bf < amount) $ fail "not enough money"
mbt <- tryTakeMVar to
case mbt of
Just bt -> do putMVar to $! bt + amount
return (bf - amount, True)
Nothing -> return (bf, False)
ok <- tryTransfer
unless ok $ safetransfer (- amount) to from
from
But in Haskell, the IO
type (or lack thereof) can control side effects
Slides inspired by good write-up in [Peyton Jones]
TVar a
(kind of like an IORef a
)
Control.Concurrent.TVar
gives younewTVarIO :: a -> IO (TVar a)
readTVarIO :: TVar a -> IO a
readTVar :: TVar a -> STM a
writeTVar :: TVar a -> a -> STM ()
modifyTVar :: TVar a -> (a -> a) -> STM () -- lazy
modifyTVar' :: TVar a -> (a -> a) -> STM () -- strict
New STM
monad allows TVar
access but no irreversible side effects
atomically :: STM a -> IO a
atomically
lets you run STM
computations from IO
IO
actionstype Account = TVar Double
transfer :: Double -> Account -> Account -> STM ()
transfer amount from to = do
modifyTVar' from (subtract amount)
modifyTVar' to (+ amount)
main :: IO ()
main = do
ac1 <- newTVarIO 10
ac2 <- newTVarIO 0
atomically $ transfer 1 ac1 ac2
subtract a b = b - a
(- a)
because that's unary negation (i.e., 0-a
)retry :: STM a
orElse :: STM a -> STM a -> STM a
retry
aborts the transaction
STM
knows what TVar
s code read to detect conflicts...TVar
code read changes w/o explicit condition variablestransfer :: Double -> Account -> Account -> STM ()
transfer amount from to = do
bf <- readTVar from
when (amount > bf) retry
modifyTVar' from (subtract amount)
modifyTVar' to (+ amount)
orElse
tries second action if first one aborts (sleeps if both abort)
transfer2 :: Double -> Account -> Account -> Account -> STM ()
transfer2 amount from1 from2 to =
atomically $ transferSTM amount from1 to
`orElse` transferSTM amount from2 to
alwaysSucceeds :: STM a -> STM ()
alwaysSucceeds
adds invariant to check after every transaction
(Either the invariant throws an exception or its return value ignored)
Example: say you are paranoid about negative account balances
newAccount :: Double -> STM Account
newAccount balance = do
tv <- newTVar balance
alwaysSucceeds $ do balance <- readTVar tv
when (balance < 0) $ fail "negative balance"
return tv
bogus :: IO ()
bogus = do
ac <- atomically $ newAccount 10
atomically $ modifyTVar ac (subtract 15)
*Main> bogus
*** Exception: negative balance
Let's get back to pure functional code
How does the compiler represent data in memory?
A value requires a constructor, plus arguments
struct Val {
unsigned long constrno; /* constructor # */
struct Val *args[]; /* flexible array */
};
[Int]
, constrno
might be 0 for []
and 1 for (:)
, where []
has 0-sized args
and (:)
has 2-element args
Int
, constrno
can be the actual integer, with no args
Point
) constrno
not usedargs
Int
s always require chasing a pointertypedef struct Val {
const struct ValInfo *info;
struct Val *args[];
} Val;
/* Statically allocated at compile time. Only one per
* constructor (or closure-creating expression, etc.) */
struct ValInfo {
struct GCInfo gcInfo; /* for garbage collector */
enum { CONSTRNO, FUNC, THUNK, IND } tag;
union {
unsigned int constrno;
Val *(*func) (const Val *closure, const Val *arg);
Exception *(*thunk) (Val *closure);
};
};
gcInfo
says how many Val *
s are in args
and where they aretag == CONSTRNO
means constrno
valid, used as on last slidetag == IND
means args[0]
is an indirect forwarding pointer to another Val
and union is unused; useful if size of args
growsA Val
whose ValInfo
has tag == FUNC
uses the func
field
Val *(*func) (const Val *closure, const Val *arg);
To apply function f
to argument a
(where both are type Val *
):
f->info->func (f, a);
func
's first argument (closure
) is the function Val
itself
ValInfo
/func
can be re-usedfunc
's second argument (arg
) is the argument a
on which the function is being evaluatedTop-level bindings don't need the closure
argument to func
addOne :: Int -> Int
addOne x = x + 1
Val
for function addOne
can have zero-length args
Local bindings may need environment values in closure
add :: Int -> (Int -> Int)
add n = \m -> addn m
where addn m = n + m
addn
onceaddn
function (with a different n
) for each invocation of add
addn
instance is a different Val
, but all share the same ValInfo
args[0]
in each Val
to specify the value of n
A Val
with tag == THUNK
uses the thunk
field in ValInfo
Exception *(*thunk) (Val *closure);
v
(turns it into non-thunk) or returns a non-NULL
Exception *
To evaluate a thunk:
v->info->thunk (v);
args
?
IND
ValInfo
tag--Allocate new Val
, place indirect forwarding pointer in old Val
A possible implementation of forcing that walks IND
pointers:
Exception *force (Val **vp)
{
for (;;) {
if ((*vp)->info->tag == IND)
*vp = (*vp)->arg[0];
else if ((*vp)->info->tag == THUNK) {
Exception *e = (*vp)->info->thunk (*vp);
if (e)
return e;
}
else
return NULL;
}
}
Set closure->args
to head of list of previously curried args
const3 :: a -> b -> c -> a
const3 a b c = a
ValInfo
s and 3 functions for const3
ValInfo
has func = const3_1
const3_1
creates Val v1
where arg[0]
is first argument (a
) and info->func = const3_2
const3_2
creates a Val v2
where arg[0]
is the second argument (b
), arg[1]
is v1
, and info->func
is const3_3
const3_3
has access to all arguments and actually implements const3
Shared arguments have common arg tails, only evaluated once
let f = const3 (superExpensive 5) -- v1, evaluated once
in (f 1 2, f 3 4)
const3 :: a -> b -> c -> a
const3 a b c = a
Val *const3_1 (Val *ignored, Val *a)
{
v = (Val *) gc_malloc (offsetof (Val, args[1]));
v->info = &const3_2_info; /* func = const3_2 */
v->args[0] = a;
return v;
}
Val *const3_2 (Val *closure, Val *b)
{
v = (Val *) gc_malloc (offsetof (Val, args[2]));
v->info = &const3_3_info; /* func = const3_3 */
v->args[0] = b;
v->args[1] = closure;
return v;
}
Val *const3_3 (Val *v, Val *c)
{
return v->args[1]->args[0];
}
Int
has even more overhead
i->info->tag
then access i->info->constr
ValInfo
structure (but ValInfo
s statically allocated--how do you know what numbers the program will need)Idea: Have special unboxed types that don't use struct Val
union Arg {
struct Val *boxed; /* most values are boxed */
unsigned long unboxed; /* "primitive" values */
};
typedef struct Val {
const struct ValInfo *info;
union Arg args[]; /* args can be boxed or unboxed */
} Val;
ValInfo
)Val *
argGCInfo
to identify which args are and are not boxed#
character--must enable with -XMagicHash
optionInt#
) and primitive operations on them (+#
):browse GHC.Prim
" in GHCI2#
, 'a'#
, 2##
(unsigned), 2.0##
Int
really?
Prelude> :set -XMagicHash
Prelude> :m +GHC.Types GHC.Prim
Prelude GHC.Types GHC.Prim> :i Int
data Int = I# Int# -- Defined in GHC.Types
...
Prelude GHC.Types GHC.Prim> case 1 of I# u -> I# (u +# 2#)
3
Int
contain thunk, but avoids pointer dereference once evaluatedCannot instantiate type variables with unboxed types
{-# LANGUAGE MagicHash #-}
import GHC.Prim
data FastPoint = FastPoint Double# Double# -- ok
fp = FastPoint 2.0## 2.0## -- ok
-- Error: can't pass unboxed type to polymorphic function
fp' = FastPoint 2.0## (id 2.0##)
-- Error: can't use unboxed type as type parameter
noInt :: Maybe Int#
noInt = Nothing
Enforced by making unboxed types a different kind of type
Prelude GHC.Types GHC.Prim> :kind Int#
Int# :: #
Recall type variables have kinds with stars (∗, ∗ → ∗, etc.), never #
Polymorphism works because all types of kind ∗ represented as Val *
seq
revisitedseq :: a -> b -> b
seq a b
is forced, then first a
is forced, then b
is forced and returnedConsider the following code (similar to concurrency lecture):
infiniteLoop = infiniteLoop :: Char -- loops forever
seqTest1 = infiniteLoop `seq` "Hello" -- loops forever
seqTest2 = str `seq` length str -- returns 6
where str = infiniteLoop:"Hello"
seqTest1
hangs forever, while seqTest2
happily returns 6seq
only forces a Val
, not the arg
fields of the Val
seqTest2
's seq
forces str
's constructor (:)
, but not the head or tailstr
in Weak Head Normal Form (WHNF)seq
implementationconst struct ValInfo seq_info = {
some_gcinfo, THUNK, .thunk = &seq_thunk
};
Val *seq_2 (Val *closure, Val *b)
{ /* assume seq_1 put first arg of (seq a b) in closure */
c = (Val *) gc_malloc (offsetof (Val, args[2]));
c->info = &seq_info;
c->args[0] = closure->args[0];
c->args[1] = b;
return c;
}
Exception *seq_thunk (Void *c)
{
Exception *e = force (&c->args[0]);
if (!e) {
c->info = &ind_info; /* ValInfo with tag = IND */
c->args[0] = c->args[1]; /* forward to b */
}
return e;
}
Recall strictness flag on fields in data declarations
data IntWrapper = IntWrapper !Int
Int
has !
before it, meaning it must be strictInt
's ValInfo
cannot have tag
THUNK
or IND
Int
touches only one cache line
data Int = I# Int#
has only one constructortag == CONSTRNO
, so know what's in ValInfo
Int#
is unboxedThus, once IntWrapper
forced, immediately safe to access Int
as
myIntWrapper.arg[0].boxed->arg[0].unboxed
Int
is not just a number
Int
are {0, 1}64 ∪ {⊥}Note 2: !Int
not a first-class type, only valid for data
fields
data SMaybe a = SJust !a | SNothing -- ok, data field
strictAdd :: !Int -> !Int -> !Int -- error
type StrictMaybeInt = Maybe !Int -- error
case
statements revisitedcase
statement pattern matching can force thunks
_
is irrefutablecase
undefined :: a
is Prelude
symbol with value ⊥f ('a':'b':rest) = rest
f _ = "ok"
test1 = f (undefined:[]) -- error
test2 = f ('a':undefined) -- error
test3 = f ('x':undefined) -- "ok" (didn't force tail)
Adding ~
before a pattern makes it irrefutable
three = (\ ~(h:t) -> 3) undefined -- evaluates to 3
newtype
declarationsdata
-- creates a new (boxed) type, adding overhead of a Val
wrappertype
-- creates an alias for an existing type, with no overheadMeters
, Seconds
, Grams
, all implemented by Double
type
would make them all synonymous, facilitating errorsShow
for each, impossible with type
data Meters = Meters Double
-- but will add overheadnewtype
keyword introduces new type with no overhead
data
, but limited to one constructor and one fieldnewtype
semanticsWhat's the semantic difference between these two declarations?
newtype NTInt = NTInt Int deriving (Show)
data SInt = SInt !Int deriving (Show)
Exercise: Suppose you have
uNTInt = NTInt undefined
uSInt = SInt undefined
Write code that behaves differently for uNTInt
vs. uSInt
newtype
semanticsWhat's the semantic difference between these two declarations?
newtype NTInt = NTInt Int deriving (Show)
data SInt = SInt !Int deriving (Show)
NTInt
constructor is a "fake" compile-time-only construct
newtype
compiles to nothingnewtype NTInt = NTInt Int deriving (Show)
uNTInt = NTInt undefined
testNT = case uNTInt of NTInt _ -> True -- returns True
data SInt = SInt !Int deriving (Show)
uSInt = SInt undefined
testS = case uSInt of SInt _ -> True -- undefined
UNPACK
pragmanewtype
almost always better than data
when it appliesWhat about a multi-field data type?
data TwoInts = TwoInts !Int !Int
CONSTRNO
ValInfo
Int#
s directly into the args
of a TwoInts
Val
?GHC provides an UNPACK
pragma to do just this
data TwoInts = TwoInts {-# UNPACK #-} !Int {-# UNPACK #-} !Int
newtype
, UNPACK
is not always a win
-funbox-strict-fields
flag unpacks all strict fields
ByteString
sString
s obviously not very efficientStrict ByteString
s efficiently manipulate raw bytes
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8
S.head
, S.tail
, S.length
, S.foldl
, S.cons
(like :
), S.empty
(like []
), S.hPut
(like hPutStr
), S.readFile
S.pack
and S.unpack
translate to/from [Word8]
S8
has same functions as S
, but uses Char
instead of Word8
--means you lose upper bits of Char
(use toString
from utf8-string to avoid loss)Implementation
data ByteString = PS {-# UNPACK #-} !(ForeignPtr Word8)
{-# UNPACK #-} !Int -- offset
{-# UNPACK #-} !Int -- length
ByteString
sSame package implements lazy ByteString
s
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as L8
ByteString
modulesS.ByteString
and S8.ByteString
are the same type (re-exported), and similarly for L.ByteString
and L8.ByteString
S.ByteString
and L.ByteString
not same type, but can convert:fromChunks :: [S.ByteString] -> L.ByteString
toChunks :: L.ByteString -> [S.ByteString]
ByteString
implementationLazy ByteString
s are implemented in terms of strict ones
data ByteString = Empty
| Chunk {-# UNPACK #-} !S.ByteString ByteString
Chunk
's first argument (S.ByteString
) never null
ByteString
sByteString
s?
S.ByteString
s, but not copy the data they contain)ByteString
s is cheap, reverse is not (so if a library can work efficiently on lazy ByteString
s, good to expose that functionality)