Exercise: Write transfer function to move money between accounts
import Control.Concurrent
import Control.Monad
type Account = MVar Double
transfer :: Double -> Account -> Account -> IO ()
transfer amount from to = ???Example:
*Main> :load "transfer.hs"
Ok, modules loaded: Main.
*Main> main
9.0
1.0type Account = MVar Double
transfer :: Double -> Account -> Account -> IO ()
transfer amount from to =
modifyMVar_ from $ \bf -> do
when (bf < amount) $ fail "not enough money"
modifyMVar_ to $ \bt -> return $! bt + amount
return $! bf - amounttype Account = MVar Double
transfer :: Double -> Account -> Account -> IO ()
transfer amount from to =
modifyMVar_ from $ \bf -> do
when (bf < amount) $ fail "not enough money"
modifyMVar_ to $ \bt -> return $! bt + amount
return $! bf - amountCan deadlock when simultaneously transferring money in both directions
forkIO $ transfer 1 ac1 ac2
forkIO $ transfer 1 ac2 ac1Throwing an exception when not enough money is ugly… what if we just waited for enough money to show up before completing the transfer?
tryTakeMVar for second MVar
safetransfer :: Double -> Account -> Account -> IO ()
safetransfer amount from to = do
let tryTransfer = modifyMVar from $ \ bf -> do
when (bf < amount) $ fail "not enough money"
mbt <- tryTakeMVar to
case mbt of
Just bt -> do putMVar to $! bt + amount
return (bf - amount, True)
Nothing -> return (bf, False)
ok <- tryTransfer
unless ok $ safetransfer (- amount) to fromfromBut in Haskell, the IO type (or lack thereof) can control side effects
Slides inspired by good write-up in [Peyton Jones]
TVar a
IORef a (like an MVar w/o lock)Control.Concurrent.TVar gives younewTVarIO :: a -> IO (TVar a)
readTVarIO :: TVar a -> IO a
readTVar :: TVar a -> STM a
writeTVar :: TVar a -> a -> STM ()
modifyTVar :: TVar a -> (a -> a) -> STM () -- lazy
modifyTVar' :: TVar a -> (a -> a) -> STM () -- strictNew STM monad allows TVar access but no irreversible side effects
atomically :: STM a -> IO aatomically lets you run STM computations from IOIO actionstype Account = TVar Double
transfer :: Double -> Account -> Account -> STM ()
transfer amount from to = do
modifyTVar' from (subtract amount)
modifyTVar' to (+ amount)
main :: IO ()
main = do
ac1 <- newTVarIO 10
ac2 <- newTVarIO 0
atomically $ transfer 1 ac1 ac2subtract a b = b - a
(- a) because that’s unary negation (i.e., 0-a)retry :: STM a
orElse :: STM a -> STM a -> STM aretry aborts the transaction
STM knows what TVars code read to detect conflicts…TVar code read changes w/o explicit condition variablestransfer :: Double -> Account -> Account -> STM ()
transfer amount from to = do
bf <- readTVar from
when (amount > bf) retry
modifyTVar' from (subtract amount)
modifyTVar' to (+ amount)orElse tries second action if first one aborts (sleeps if both abort)
transfer2 :: Double -> Account -> Account -> Account -> STM ()
transfer2 amount from1 from2 to =
atomically $ transferSTM amount from1 to
`orElse` transferSTM amount from2 toalwaysSucceeds :: STM a -> STM ()alwaysSucceeds registers invariant to check after every transaction
(Either the invariant throws an exception or its return value ignored)
Example: say you are paranoid about negative account balances
newAccount :: Double -> STM Account
newAccount balance = do
tv <- newTVar balance
alwaysSucceeds $ do balance <- readTVar tv
when (balance < 0) $ fail "negative balance"
return tv
bogus :: IO ()
bogus = do
ac <- atomically $ newAccount 10
atomically $ modifyTVar ac (subtract 15)*Main> bogus
*** Exception: negative balance
Let’s get back to pure functional code
How does the compiler represent data in memory?
A value requires a constructor, plus arguments
struct Val {
unsigned long constrno; /* constructor # */
struct Val *args[]; /* flexible array */
};[Int], constrno might be 0 for [] and 1 for (:), where [] has 0-sized args and (:) has 2-element argsInt, constrno can be the actual integer, with no argsPoint) constrno not usedargsInts are Val *s, requiring chasing a pointertypedef struct Val {
const struct ValInfo *info;
struct Val *args[];
} Val;
/* Statically allocated at compile time. Only one per
* constructor (or closure-creating expression, etc.) */
struct ValInfo {
struct GCInfo gcInfo; /* for garbage collector */
enum { CONSTRNO, FUNC, THUNK, IND } tag;
union {
unsigned int constrno;
Val *(*func) (const Val *closure, const Val *arg);
Exception *(*thunk) (Val *closure);
};
};gcInfo says how many Val *s are in args and where they aretag == CONSTRNO means constrno valid, used as on last slidetag == IND means args[0] is an indirect forwarding pointer to another Val and union is unused; useful if size of args growsA Val whose ValInfo has tag == FUNC uses the func field
Val *(*func) (const Val *closure, const Val *arg);To apply function f to argument a (where both are type Val *):
f->info->func (f, a);func’s first argument (closure) is the function Val itself
ValInfo/func can be re-usedfunc’s second argument (arg) is the argument a on which the function is being evaluatedTop-level bindings don’t need the closure argument to func
addOne :: Int -> Int
addOne x = x + 1Val for function addOne can have zero-length argsLocal bindings may need environment values in closure
add :: Int -> (Int -> Int)
add n = \m -> addn m
where addn m = n + maddn onceaddn function (with a different n) for each invocation of addaddn instance is a different Val, but all share the same ValInfoargs[0] in each Val to specify the value of nA Val with tag == THUNK uses the thunk field in ValInfo
Exception *(*thunk) (Val *closure);v (turns it into non-thunk) or returns a non-NULL Exception *To evaluate a thunk:
v->info->thunk (v);args?
IND ValInfo tag—Allocate new Val, place indirect forwarding pointer in old ValA possible implementation of forcing that walks IND pointers:
Exception *force (Val **vp)
{
for (;;) {
if ((*vp)->info->tag == IND)
*vp = (*vp)->arg[0];
else if ((*vp)->info->tag == THUNK) {
Exception *e = (*vp)->info->thunk (*vp);
if (e)
return e;
}
else
return NULL;
}
}Set closure->args to head of list of previously curried args
const3 :: a -> b -> c -> a
const3 a b c = aValInfos and 3 functions for const3ValInfo has func = const3_1const3_1 creates Val v1 where arg[0] is first argument (a) and info->func = const3_2const3_2 creates a Val v2 where arg[0] is the second argument (b), arg[1] is v1, and info->func is const3_3const3_3 has access to all arguments and actually implements const3Shared arguments have common arg tails, only evaluated once
let f = const3 (superExpensive 5) -- v1, evaluated once
in (f 1 2, f 3 4)const3 :: a -> b -> c -> a
const3 a b c = aVal *const3_1 (Val *ignored, Val *a)
{
v = (Val *) gc_malloc (offsetof (Val, args[1]));
v->info = &const3_2_info; /* func = const3_2 */
v->args[0] = a;
return v;
}
Val *const3_2 (Val *closure, Val *b)
{
v = (Val *) gc_malloc (offsetof (Val, args[2]));
v->info = &const3_3_info; /* func = const3_3 */
v->args[0] = b;
v->args[1] = closure;
return v;
}
Val *const3_3 (Val *v, Val *c)
{
return v->args[1]->args[0];
}Int has even more overhead
i->info->tag then access i->info->constrValInfo structure (but ValInfos statically allocated—how do you know what numbers the program will need)Idea: Have special unboxed types that don’t use struct Val
union Arg {
struct Val *boxed; /* most values are boxed */
unsigned long unboxed; /* "primitive" values */
};
typedef struct Val {
const struct ValInfo *info;
union Arg args[]; /* args can be boxed or unboxed */
} Val;ValInfo)Val * argGCInfo to identify which args are and are not boxed# character–must enable with -XMagicHash optionInt#) and primitive operations on them (+#):browse GHC.Prim” in GHCI2#, 'a'#, 2## (unsigned), 2.0##Int really?
Prelude> :set -XMagicHash
Prelude> :m +GHC.Types GHC.Prim
Prelude GHC.Types GHC.Prim> :i Int
data Int = I# Int# -- Defined in GHC.Types
...
Prelude GHC.Types GHC.Prim> case 1 of I# u -> I# (u +# 2#)
3
Val * always fits in general-purpose register, unboxed types might need FP reg{-# LANGUAGE MagicHash #-}
import GHC.Prim
data FastPoint = FastPoint Double# Double# -- ok
fp = FastPoint 2.0## 2.0## -- ok
-- Error: can't pass unboxed type to polymorphic function
fp' = FastPoint 2.0## (id 2.0##)
-- Error: can't use unboxed type as type parameter
noInt :: Maybe Int#
noInt = NothingEnforced by making unboxed types a different kind of type
Prelude GHC.Types GHC.Prim> :kind Int#
Int# :: #
Recall type variables have kinds with stars (∗, ∗ → ∗, etc.), never #
Polymorphism works because all types of kind ∗ represented by our Val *
seq revisitedseq :: a -> b -> b
seq a b is forced, then first a is forced, then b is forced and returnedConsider the following code:
infiniteLoop = infiniteLoop :: Char -- loops forever
seqTest1 = infiniteLoop `seq` "Hello" -- loops forever
seqTest2 = str `seq` length str -- returns 6
where str = infiniteLoop:"Hello"seqTest1 hangs forever, while seqTest2 happily returns 6seq only forces a Val, not the arg fields of the Val
seqTest2’s seq forces str’s constructor (:), but not the head or tailstr in Weak Head Normal Form (WHNF)seq implementationconst struct ValInfo seq_info = {
some_gcinfo, THUNK, .thunk = &seq_thunk
};
Val *seq_2 (Val *closure, Val *b)
{ /* assume seq_1 put first arg of (seq a b) in closure */
c = (Val *) gc_malloc (offsetof (Val, args[2]));
c->info = &seq_info;
c->args[0].boxed = closure->args[0];
c->args[1].boxed = b;
return c;
}
Exception *seq_thunk (Void *c)
{
Exception *e = force (&c->args[0].boxed);
if (!e) {
c->info = &ind_info; /* ValInfo with tag = IND */
c->args[0].boxed = c->args[1]; /* forward to b */
}
return e;
}Recall strictness flag on fields in data declarations
data IntWrapper = IntWrapper !IntInt has ! before it, meaning it must be strictInt’s ValInfo cannot have tag THUNK or INDInt touches only one cache line
data Int = I# Int# has only one constructortag == CONSTRNO, so know what’s in ValInfoInt# is unboxedThus, once IntWrapper forced, immediately safe to access Int as
myIntWrapper.arg[0].boxed->arg[0].unboxedInt is not just a number
Int are {0, 1}64 ∪ {⊥}Note 2: !Int not a first-class type, only valid for data fields
data SMaybe a = SJust !a | SNothing -- ok, data field
strictAdd :: !Int -> !Int -> !Int -- error
type StrictMaybeInt = Maybe !Int -- errorcase statements revisitedcase statement pattern matching can force thunks
_ is irrefutablecase
undefined :: a is Prelude symbol with value ⊥f ('a':'b':rest) = rest
f _ = "ok"
test1 = f (undefined:[]) -- error
test2 = f ('a':undefined) -- error
test3 = f ('x':undefined) -- "ok" (didn't force tail)Adding ~ before a pattern makes it irrefutable
three = (\ ~(h:t) -> 3) undefined -- evaluates to 3newtype declarationsdata – creates a new (boxed) type, adding overhead of a Val wrappertype – creates an alias for an existing type, with no overheadnewtype – so we have have been vague about itMeters, Seconds, Grams, all implemented by Doubletype would make them all synonymous, facilitating errorsShow for each, impossible with typedata Meters = Meters Double – but will add overheadnewtype keyword introduces new type with no overhead
data, but limited to one constructor and one fieldnewtype semanticsWhat’s the semantic difference between these two declarations?
newtype NTInt = NTInt Int deriving (Show) -- def 1data NTInt = NTInt !Int deriving (Show) -- def 2Exercise: Suppose you have
uNTInt = NTInt undefinedWrite code that behaves differently for two definitions of NTInt
newtype semanticsWhat’s the semantic difference between these two declarations?
newtype NTInt = NTInt Int deriving (Show) -- def 1data NTInt = NTInt !Int deriving (Show) -- def 2Solution:
uNTInt = NTInt undefined
testNT = case uNTInt of NTInt _ -> TrueThe NTInt constructor is a “fake” compile-time-only construct
newtype compiles to nothingtestNT evaluates to True under definition 1 (newtype)Conversely, forcing a value (by matching constructor) forces strict fields
data)UNPACK pragmanewtype almost always better than data when it appliesWhat about a multi-field data type?
data TwoInts = TwoInts !Int !IntCONSTRNO ValInfoInt#s directly into the args of a TwoInts Val?GHC provides an UNPACK pragma to do just this
data TwoInts = TwoInts {-# UNPACK #-} !Int {-# UNPACK #-} !IntWorks for any strict field with a single-constructor datatype (so omitted ValInfo unambiguous)
newtype, UNPACK is not always a win
-funbox-strict-fields flag unpacks all strict fields
Ptr a represents pointers to type a
Pointers are not typesafe–allow pointer arithmetic and casting
nullPtr :: Ptr a
plusPtr :: Ptr a -> Int -> Ptr b
minusPtr :: Ptr a -> Ptr b -> Int
castPtr :: Ptr a -> Ptr bClass Storable provides raw access to memory using Ptrs
class Storable a where
sizeOf :: a -> Int
alignment :: a -> Int
peek :: Ptr a -> IO a
poke :: Ptr a -> a -> IO ()
...Bool, Int, Char, Ptr a, etc.) are StorableallocaEasiest way to get a valid Ptr is alloca:
alloca :: Storable a => (Ptr a -> IO b) -> IO baPtr to the spacealloca)allocaBytes :: Int -> (Ptr a -> IO b) -> IO bForeign module provides handy with utility
with :: Storable a => a -> (Ptr a -> IO b) -> IO b
with val f =
alloca $ \ptr -> do
poke ptr val
res <- f ptr
return resStorable typesForeign.C contains wrappers for C types
CInt, CUInt, CChar, CDouble, CIntPtr etc.Data.Int and Data.Word have all sizes of machine integer
Int8, Int16, Int32, Int64 – signed integersWord8, Word16, Word32, Word64 – unsigned integersExample: extract all the bytes from a Storable object
toBytes :: (Storable a) => a -> [Word8]
toBytes a = unsafePerformIO $
with a $ \pa -> go (castPtr pa) (pa `plusPtr` sizeOf a)
where go p e | p < e = do b <- peek p
bs <- go (p `plusPtr` 1) e
return (b:bs)
| otherwise = return []unsafePerformIO might be okay here since toBytes pureplusPtr lets us change from Ptr a to Ptr Word8malloc and mallocForeignPtrCan also allocate longer-lived memory with malloc
malloc :: Storable a => IO (Ptr a)
mallocBytes :: Int -> IO (Ptr a)
free :: Ptr a -> IO ()
realloc :: Storable b => Ptr a -> IO (Ptr b)
reallocBytes :: Ptr a -> Int -> IO (Ptr a)ForeignPtr lets you delegate deallocation to garbage collector
mallocForeignPtr :: Storable a => IO (ForeignPtr a)
mallocForeignPtrBytes :: Int -> IO (ForeignPtr a)ForeignPtrsForeignPtr, must convert it to Ptr
ForeignPtr in scope when using Ptr?Ptr within function that keeps reference to ForeignPtrwithForeignPtr :: ForeignPtr a -> (Ptr a -> IO b) -> IO bCan also convert Ptrs to ForeignPtrs
type FinalizerPtr a = FunPtr (Ptr a -> IO ())
newForeignPtr :: FinalizerPtr a -> Ptr a
-> IO (ForeignPtr a)
newForeignPtr_ :: Ptr a -> IO (ForeignPtr a)
addForeignPtrFinalizer :: FinalizerPtr a -> ForeignPtr a
-> IO ()FunPtr – this is type wrapper for C function pointer
finalizerFree symbol conveniently provides function pointer for freeCan import foreign functions like this:
foreign import ccall unsafe "stdlib.h malloc"
c_malloc :: CSize -> IO (Ptr a)
foreign import ccall unsafe "stdlib.h free"
c_free :: Ptr a -> IO ()ccall says use C calling convention (also cplusplus and few others)unsafe promises the C function will not call back into Haskellunafe faster than safe, but gives undefined results if call triggers GC"[static] [c-header] [&][c-name]"
static required only if c-name is dynamic or wrapper.h file with the declaration (ignored by GHC)FunPtrs)foreign import ccall unsafe "foo.h foo"
foo :: Int -- foo must be function: int foo(void);
foreign import ccall unsafe "foo.h &bar"
bar :: Ptr Int -- here bar can be int: int bar;Char, Int, Double, Float, Bool, Int8, Int16, Int32, Int64, Word8, Word16, Word32, Word64, Ptr a, FunPtr a, and StablePtr atype or newtype wrappers for basic types (CInt, CChar, etc.)() (for functions returning void)IO a where a is either of the above twoIO if function has side effects or non-determinism
Okay to omit if it is a pure C function:
foreign import ccall unsafe "arpa/inet.h ntohl"
ntohl :: Word32 -> Word32Haskell can’t check C purity, so omitting IO can cause problems
hsc2hsHow to access C data structures?
struct mystruct {
char *name;
int value;
};data MyStruct -- no constructors, just a placeholder
getValue :: Ptr MyStruct -> IO CInt
getValue ptr = peek $ ptr `plusPtr` 8 -- assumes char * 8 byteshsc2hs is pre-processor that lets you compute C values
#include "myheader.h"
getValue ptr = peek $ ptr `plusPtr`
#{offset struct mystruct, value}printftemplate-hsc.h on your system to see defs of # commands#let (like #define w/o parens)ByteStringsHaskell Strings obviously not very efficient
In Lab2, you saw faster ByteStrings
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8S.head, S.tail, S.length, S.foldl, S.cons (like :), S.empty (like []), S.hPut (like hPutStr), S.readFileS.pack and S.unpack translate to/from [Word8]S8 has same functions as S, but uses Char instead of Word8—means you lose upper bits of Char
toString from utf8-string to avoid loss, or use Bryan’s text instead of ByteString for any serious text needsNow let’s look at implementation
data ByteString = PS {-# UNPACK #-} !(ForeignPtr Word8)
{-# UNPACK #-} !Int -- offset
{-# UNPACK #-} !Int -- lengthByteStringsSame package implements lazy ByteStrings
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as L8ByteString modulesS.ByteString and S8.ByteString are the same type (re-exported), and similarly for L.ByteString and L8.ByteStringS.ByteString and L.ByteString not same type, but can convert:fromChunks :: [S.ByteString] -> L.ByteString
toChunks :: L.ByteString -> [S.ByteString]ByteString implementationLazy ByteStrings are implemented in terms of strict ones
data ByteString = Empty
| Chunk {-# UNPACK #-} !S.ByteString ByteStringChunk’s first argument (S.ByteString) never nullByteStringsByteStrings?
S.ByteStrings, but not copy the data they contain)ByteStrings is cheap, reverse is not (so if a library can work efficiently on lazy ByteStrings, good to expose that functionality)