Zippers and lenses

Let’s talk about well-behaved Haskell programs for a bit.

That rules out well-typed constructs that fail to terminate or that crash, such as the following:

loop :: Bool
loop = loop

wtf :: Bool
wtf = undefined

crash :: Bool
crash = error "fnord"

Back to basics

How many values can we construct from the following type?

data Bool = False | True

Ordering

Another well-known type:

data Ordering = LT | EQ | GT

Clearly we can construct three different values of this type.

A zero-valued type

In Haskell 2010, we can create types from which no values can be constructed:

data Empty

This type has no value constructors (and we can’t use deriving syntax on it).

“Why?” you may ask. For programming with types while compiling.
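
One common use, sketched here under our own assumptions: empty types can serve as compile-time tags on another type. The names Unvalidated, Validated, Input, rawInput, validate and inputLength are hypothetical and not part of any library.

```haskell
-- Hypothetical example: empty types used purely as compile-time tags.
data Unvalidated
data Validated

-- The 'tag' parameter never appears on the right-hand side (a phantom type).
newtype Input tag = Input String

-- Fresh input always starts out tagged as unvalidated.
rawInput :: String -> Input Unvalidated
rawInput = Input

-- The only function here that produces an 'Input Validated'.
validate :: Input Unvalidated -> Maybe (Input Validated)
validate (Input s)
  | null s    = Nothing
  | otherwise = Just (Input s)

-- Accepts only validated input; 'inputLength (rawInput "x")' is a type error.
inputLength :: Input Validated -> Int
inputLength (Input s) = length s
```

No value of type Unvalidated or Validated ever exists at runtime; the tags only steer the type checker.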

Zero, one, two…

So big deal, we can create types with zero or more constructors:

data Empty
data One = One
data Bool = False | True

Adding some parameters

Given these:

data Ordering = LT | EQ | GT

data Bool = False | True

Here’s another type to ponder.

data A = A Bool
       | B Ordering

Spend a minute working out how many values this can have. We’ll do a quick poll.

Abstracting I

Now how many values can this familiar type have?

(a,b)

Abstracting II

Now how many values can this familiar type have?

data Either a b = Left a | Right b

Algebra I

Why do we refer to these as product types?

(a,b,c)

data Product a b c = Product a b c

They can hold a number of values equal to:

a × b × c

Algebra II

The same holds for the naming of sum types:

data Sum a b c = A a
               | B b
               | C c

They can hold a number of values equal to:

a + b + c
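
The arithmetic can be checked by brute force for small types. This is a minimal sketch; universe, pairs and eithers are our own helper names, and the approach only works for types with Bounded and Enum instances.

```haskell
-- Enumerate every value of a small type via its Bounded and Enum instances.
universe :: (Bounded a, Enum a) => [a]
universe = [minBound .. maxBound]

-- Product: every Bool paired with every Ordering.
pairs :: [(Bool, Ordering)]
pairs = [(b, o) | b <- universe, o <- universe]

-- Sum: every Bool tagged Left, followed by every Ordering tagged Right.
eithers :: [Either Bool Ordering]
eithers = map Left universe ++ map Right universe
```

In ghci, length pairs gives 2 × 3 = 6 and length eithers gives 2 + 3 = 5.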

Working with nested data

Suppose we’re writing a benchmarking tool. We’ll take criterion as an example.

Measurements produce noisy samples.

The effect of outliers

We want to understand how outliers in our sample data affect the sample mean and standard deviation.

data OutlierEffect
    = Unaffected -- ^ Less than 1% effect.
    | Slight     -- ^ Between 1% and 10%.
    | Moderate   -- ^ Between 10% and 50%.
    | Severe     -- ^ Above 50% (i.e. measurements
                 -- are useless).

Our OutlierEffect type is embedded in another type that carries extra information.

data OutlierVariance = OutlierVariance {
      ovEffect      :: OutlierEffect
    , ovDescription :: String
    , ovFraction    :: Double
    }

More nesting

And OutlierVariance is buried in another type.

data SampleAnalysis = SampleAnalysis {
      anMean       :: [Double]
    , anStdDev     :: [Double]
    , anOutlierVar :: OutlierVariance
    }

Which is nested in yet another type.

data Payload = Payload {
      sample         :: [Double]
    , sampleAnalysis :: SampleAnalysis
    , outliers       :: Outliers
    }

Accessing data is easy

Even with three levels of nesting, it’s easy to access an OutlierEffect given a Payload.

effect :: Payload -> OutlierEffect
effect = ovEffect . anOutlierVar . sampleAnalysis

These record accessor functions are handy!

Updates, not so much

OK, so suppose we want to “modify” an OutlierEffect buried in a Payload.

editEffect :: (OutlierEffect -> OutlierEffect)
           -> Payload -> Payload
editEffect eff payload =
    payload {
      sampleAnalysis = analysis {
        anOutlierVar = variance {
          ovEffect = eff effect
        }
      }
    }
  where analysis = sampleAnalysis payload
        variance = anOutlierVar analysis
        effect   = ovEffect variance

This is hideous! It hardly even looks like Haskell.

What was this?

We just saw Haskell’s record update syntax in action.

setAddrZip :: Zip -> Address -> Address
setAddrZip zip addr = addr { addrZip = zip }

This notation means: make a copy of the record addr and, while copying, set its addrZip field to zip.

It’s a way of “editing” a value that leaves the original unchanged, but doesn’t require us to specify every field to copy.

It’s also a very non-composable hack, as we saw.
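
A self-contained sketch of the same update, assuming a two-field Address record. The addrStreet field, the Zip synonym, and the sample values are made up for illustration.

```haskell
-- Assumed definitions; the slides do not show Zip or Address themselves.
type Zip = String

data Address = Address
  { addrStreet :: String
  , addrZip    :: Zip
  } deriving (Eq, Show)

-- Record update: copy addr, replacing only the addrZip field.
setAddrZip :: Zip -> Address -> Address
setAddrZip zip' addr = addr { addrZip = zip' }

home :: Address
home = Address { addrStreet = "1 Main St", addrZip = "94305" }

-- 'home' itself is unchanged; 'moved' is a new value.
moved :: Address
moved = setAddrZip "10001" home
```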

What we actually want

Our demands:

  1. Access fields within records.

  2. Compose accesses, so that we can inspect fields within nested records.

  3. Update fields within records.

  4. Compose updates, so that we can modify fields within nested records.

With Haskell’s record syntax, we get #1 and #2, sort of #3 (if we squint), and #4 is hideous.

What to do?

Suppose we have a pair.

(a,b)

We’d like to edit its second element.

editSnd :: (b -> c) -> (a,b) -> (a,c)
editSnd f (a,b) = (a, f b)

Let’s refer to the fact that we’re interested in the second element as focusing on it.

It’s equally easy to edit the first element.

editFst :: (a -> c) -> (a,b) -> (c,b)
editFst f (a,b) = (f a, b)

Holes

Let’s refer to the slot we want to fill when editing a tuple as a hole.

Here, the hole is in the second position.

editSnd :: (b -> c) -> (a,b) -> (a,c)
editSnd f (a,b) = (a, f b)

And here, it’s in the first.

editFst :: (a -> c) -> (a,b) -> (c,b)
editFst f (a,b) = (f a, b)

Counting holes

If we drop the b from (a,b), how many values does the resulting pseudo-type have?

What if we drop a from (a,b)?

If we want to drop some arbitrary field from (a,b,c), we can represent this via a type.

data Hole3 a b c = AHole b c
                 | BHole a c
                 | CHole a b

Counting holes

We can write the number of values of (x,x,x) as x × x × x, or x³.

If we substitute x for a, b, and c below, how many different values of type Hole3 can there be?

data Hole3 x x x = AHole x x
                 | BHole x x
                 | CHole x x

Hmm, that’s 3x².

Does this remind you of symbolic differentiation?
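
Written out, the analogy looks like this. Treating a type's value count as a polynomial is the only assumption here; the derivative with respect to a variable then counts the ways of punching one hole of that type.

```latex
\frac{d}{dx}\,x^{3} = 3x^{2}
\qquad
\frac{\partial}{\partial a}\,(a \cdot b) = b
\qquad
\frac{\partial}{\partial b}\,(a \cdot b) = a
```

For a pair, dropping the a leaves b possibilities and dropping the b leaves a, so a pair with one hole somewhere has a + b possible values.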

Back to pairs

Here’s a hole type for pairs.

data PairHole a b = HoleFst b
                  | HoleSnd a

If we pull a value out of the hole, we need to store it somewhere so we can work with it.

data PairZipper a b c = PZ c (PairHole a b)

Why do we have an extra type parameter c?

Quick exercise

Please provide bodies for the two undefined functions below.

You have one minute.

data PairHole a b = HoleFst b
                  | HoleSnd a

data PairZipper a b c = PZ c (PairHole a b)

focusFst :: (a,b) -> PairZipper a b a
focusFst = undefined

focusSnd :: (a,b) -> PairZipper a b b
focusSnd = undefined

Skeleton: http://cs240h.scs.stanford.edu/Hole1.hs

My solution

data PairHole a b = HoleFst b
                  | HoleSnd a

data PairZipper a b c = PZ c (PairHole a b)

focusFst :: (a,b) -> PairZipper a b a
focusFst (a,b) = PZ a (HoleFst b)

focusSnd :: (a,b) -> PairZipper a b b
focusSnd (a,b) = PZ b (HoleSnd a)

A nice thing about this?

The inverse conversion

We obviously also need to be able to convert from a zipper back to a pair.

unfocusFst :: PairZipper a b a -> (a,b)
unfocusFst (PZ a (HoleFst b)) = (a,b)

unfocusSnd :: PairZipper a b b -> (a,b)
unfocusSnd (PZ b (HoleSnd a)) = (a,b)

Accessing the focused value

Now that we have focus functions to get the first or second element of a pair, we can write a generic accessor function for our zipper type.

view :: PairZipper a b c -> c
view (PZ c _) = c

Try in ghci:

>>> view (focusFst ("hello",1))
"hello"
>>> view (focusSnd ("hello",1))
1

Editing the focused value

This is the more fun part.

over :: (c -> c)
     -> PairZipper a b c
     -> PairZipper a b c
over f (PZ c l) = PZ (f c) l

Once again in ghci:

>>> unfocusSnd . over succ . focusSnd $ ("hello",1::Int)
("hello",2)

Editing part deux

What will this print in ghci?

>>> unfocusFst . over length . focusFst $ ("hello",1::Int)

It’s a type error! over is not polymorphic enough.

Bad version:

over :: (c -> c)
     -> PairZipper a b c
     -> PairZipper a b c
over f (PZ c l) = PZ (f c) l

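The good version allows editing to change the type of the field being edited (for the pipeline above to typecheck, unfocusFst must also be generalised to PairZipper a b c -> (c,b)):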

over :: (c -> d)
     -> PairZipper a b c
     -> PairZipper a b d
over f (PZ c l) = PZ (f c) l
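
A self-contained sketch of the type-changing pipeline. It assumes a generalised unfocusing function, called unfocusFst' here (our own name), because the signature PairZipper a b a -> (a,b) forces the focused type to equal a and so rejects the result of over length.

```haskell
data PairHole a b = HoleFst b
                  | HoleSnd a

data PairZipper a b c = PZ c (PairHole a b)

focusFst :: (a,b) -> PairZipper a b a
focusFst (a,b) = PZ a (HoleFst b)

-- Polymorphic editor: the focused value may change type from c to d.
over :: (c -> d) -> PairZipper a b c -> PairZipper a b d
over f (PZ c l) = PZ (f c) l

-- Assumption: a more general signature than unfocusFst in the slides.
-- Partial: only meaningful when the hole is in the first position.
unfocusFst' :: PairZipper a b c -> (c,b)
unfocusFst' (PZ c (HoleFst b)) = (c,b)
unfocusFst' (PZ _ (HoleSnd _)) =
  error "unfocusFst': hole is in the second position"

-- Replaces the String in the first slot by its length.
example :: (Int, Int)
example = unfocusFst' . over length . focusFst $ ("hello", 1 :: Int)
```

In ghci, example evaluates to (5,1).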

Hmm

This approach has problems.

We have to specify what field we’re focusing at both ends of the “pipeline”.

Can we compose these so that we can ‘focusFst’ then ‘focusSnd’ to get another zipper?

Gluing things together

Instead of keeping focusFst and unfocusFst separate and wiring them together by hand, let’s manage them automatically.

data Focused t a b = Focused {
    focused :: a
  , rebuild :: b -> t
  }

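A Focused is a pair consisting of the focused element and a function that knows how to rebuild the whole value from a (possibly edited) element.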

type Focuser s t a b = s -> Focused t a b

A Focuser is a function that takes a value and gives us a Focused.

Why so polymorphic?

Recall that our original definition of over wasn’t polymorphic enough.

We could not change the type of the first element while editing a pair.

>>> unfocusFst . over length . focusFst $ ("hello",1::Int)

Well, Focused and Focuser have so many type parameters to give exactly this generality.

Another look

data Focused t a b = Focused {
    focused :: a
  , rebuild :: b -> t
  }

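Focused is in effect saying: I am focusing on an a, I might change its type to b, and when I am done focusing I will give you back a t.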

Another look

type Focuser s t a b = s -> Focused t a b

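The “meaning” of Focuser is: give me an s and I will focus on an a inside it; I might change that a to a b, in which case what you get back is a t rather than an s.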

Some machinery

Functions for working with these types:

unfocus :: Focused s a a -> s
unfocus (Focused focused rebuild) = rebuild focused

view :: Focuser s t a b -> s -> a
view l s = focused (l s)

over :: Focuser s t a b -> (a -> b) -> s -> t
over l f s = let Focused focused rebuild = l s
             in rebuild (f focused)

Our friends focusFst and focusSnd recast in this framework:

_1 :: Focuser (a,b) (c,b) a c
_1 (a,b) = Focused a (\c -> (c,b))

_2 :: Focuser (a,b) (a,c) b c
_2 (a,b) = Focused b (\c -> (a,c))
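
Put together as one loadable file, the machinery can be exercised like this. The local names x and k and the two example bindings lengthFirst and secondOf are ours; the rest follows the definitions above.

```haskell
data Focused t a b = Focused
  { focused :: a
  , rebuild :: b -> t
  }

type Focuser s t a b = s -> Focused t a b

view :: Focuser s t a b -> s -> a
view l s = focused (l s)

-- Focus, edit the focused element, then rebuild the whole value.
over :: Focuser s t a b -> (a -> b) -> s -> t
over l f s = let Focused x k = l s
             in k (f x)

_1 :: Focuser (a,b) (c,b) a c
_1 (a,b) = Focused a (\c -> (c,b))

_2 :: Focuser (a,b) (a,c) b c
_2 (a,b) = Focused b (\c -> (a,c))

-- The type-changing edit that the zipper version rejected.
lengthFirst :: (Int, Int)
lengthFirst = over _1 length ("hello", 1 :: Int)

secondOf :: Int
secondOf = view _2 ("hello", 1 :: Int)
```

Here lengthFirst is (5,1) and secondOf is 1; the caller names the field once instead of at both ends of a pipeline.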

Your turn

Here’s your scaffolding:

data Focused t a b = Focused {
    focused :: a
  , rebuild :: b -> t
  }

type Focuser s t a b = s -> Focused t a b

Take two minutes to implement this:

focusHead :: Focuser [a] [a] a a
focusHead = undefined

It should focus on the head of a list, such that we can run this in ghci:

>>> over focusHead toUpper "anita"
"Anita"

Skeleton: http://cs240h.scs.stanford.edu/Focus.hs
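
One possible solution, as a sketch rather than the official answer. It assumes that failing with error on an empty list is acceptable, since there is no head to focus on; the binding capitalised is our own.

```haskell
import Data.Char (toUpper)

data Focused t a b = Focused
  { focused :: a
  , rebuild :: b -> t
  }

type Focuser s t a b = s -> Focused t a b

over :: Focuser s t a b -> (a -> b) -> s -> t
over l f s = let Focused x k = l s
             in k (f x)

-- Focus on the head; rebuilding conses the new head onto the old tail.
focusHead :: Focuser [a] [a] a a
focusHead (x:xs) = Focused x (:xs)
focusHead []     = error "focusHead: empty list"

capitalised :: String
capitalised = over focusHead toUpper "anita"
```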

Abstracting again

Our two most interesting functions have a lot in common.

over :: Focuser s t a b -> (a -> b) -> s -> t
view :: Focuser s t a b             -> s -> a

How could we unify these types?

wat :: Focuser s t a b -> (a -> f b) -> s -> f t

Type-level fun

Here, f is a type-level function.

wat :: Focuser s t a b -> (a -> f b) -> s -> f t

If we supply the type-level identity function, f disappears and we get out the type of over:

wat  :: Focuser s t a b -> (a -> f b) -> s -> f t
over :: Focuser s t a b -> (a ->   b) -> s ->   t

With the type-level const a function, we get the type of view:

wat  :: Focuser s t a b -> (a -> f b) -> s -> f t
view :: Focuser s t a b {- ignored -} -> s -> a

Type-level identity

Defined in Data.Functor.Identity:

newtype Identity a = Identity { runIdentity :: a }

instance Functor Identity where
    fmap f (Identity a) = Identity (f a)

Type-level const

Exported from Control.Applicative (in recent versions of base it is defined in Data.Functor.Const):

newtype Const a b = Const { getConst :: a }

instance Functor (Const a) where
    fmap _ (Const v) = Const v
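
With Identity and Const in hand, the unification can be sketched concretely. This assumes f must be a Functor (a constraint the signature of wat above leaves implicit); overVia, viewVia and the two example bindings are our own names.

```haskell
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

data Focused t a b = Focused
  { focused :: a
  , rebuild :: b -> t
  }

type Focuser s t a b = s -> Focused t a b

_1 :: Focuser (a,b) (c,b) a c
_1 (a,b) = Focused a (\c -> (c,b))

-- Run the effectful edit on the focused part, then rebuild inside f.
wat :: Functor f => Focuser s t a b -> (a -> f b) -> s -> f t
wat l f s = let Focused x k = l s
            in k <$> f x

-- Choosing Identity makes f "disappear": this is over.
overVia :: Focuser s t a b -> (a -> b) -> s -> t
overVia l f = runIdentity . wat l (Identity . f)

-- Choosing Const a stores the focused value and skips rebuilding: this is view.
viewVia :: Focuser s t a b -> s -> a
viewVia l = getConst . wat l Const

edited :: (Int, Int)
edited = overVia _1 length ("hello", 1 :: Int)

viewed :: String
viewed = viewVia _1 ("hello", 1 :: Int)
```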

Our final type

{-# LANGUAGE RankNTypes #-}

type Lens s t a b = forall f. Functor f =>
                    (a -> f b) -> s -> f t

From our perspective as lens library writers:

We use forall here to make it clear that we control the Functor we use, not our caller.

We choose Identity or Const a to get the right types for over and view.

We have to explain this type to users.

New machinery

{-# LANGUAGE RankNTypes #-}

import Control.Applicative
import Data.Functor.Identity

type Lens s t a b = forall f. Functor f =>
                    (a -> f b) -> s -> f t

over :: Lens s t a b -> (a -> b) -> s -> t
over l f s = runIdentity (l (Identity . f) s)

view :: Lens s t a b -> s -> a
view l s = getConst (l Const s)

Tuple sections

If we turn on this:

{-# LANGUAGE TupleSections #-}

And write this:

(a,)

It’s equivalent to this:

\b -> (a,b)

More machinery

{-# LANGUAGE TupleSections #-}

_1 :: Lens (a,b) (c,b) a c
_1 f (a,b) = (,b) <$> f a

_2 :: Lens (a,b) (a,c) b c
_2 f (a,b) = (a,) <$> f b

_head :: Lens [a] [a] a a
_head f (a:as) = (:as) <$> f a

Composing access

In ghci:

>>> view (_1 . _head) ("foo",True)
'f'

Why is this different from the traditional order of composition?

>>> (head . fst) ("foo",True)
'f'

Composition of lenses

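What is a lens even for?

A lens turns an action on a part of a structure into an action on the whole structure.

Thus:

_1 and _2 are not just getters: they take an entire pair and focus on its first or second element. It is view and over that determine the nature of the action.

What does it then mean to compose lenses?

If you write _1 . _head, you are taking an entire pair, focusing on its first element, and then focusing on the head of the list inside that element.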

Composing modifications

Let’s work out how we would use the lens machinery to give us a pair with an uppercased first name.

("anita", True)
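
One way to work it out, as a self-contained sketch. The error clause for the empty list is our addition so that _head is total in its pattern match; capitalised and firstLetter are our own names.

```haskell
{-# LANGUAGE RankNTypes, TupleSections #-}

import Data.Char (toUpper)
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

type Lens s t a b = forall f. Functor f =>
                    (a -> f b) -> s -> f t

over :: Lens s t a b -> (a -> b) -> s -> t
over l f s = runIdentity (l (Identity . f) s)

view :: Lens s t a b -> s -> a
view l s = getConst (l Const s)

_1 :: Lens (a,b) (c,b) a c
_1 f (a,b) = (,b) <$> f a

_head :: Lens [a] [a] a a
_head f (x:xs) = (:xs) <$> f x
_head _ []     = error "_head: empty list"  -- assumption: fail loudly

-- Focus on the first element of the pair, then on the head of that list.
capitalised :: (String, Bool)
capitalised = over (_1 . _head) toUpper ("anita", True)

-- The same composed lens also works for reading.
firstLetter :: Char
firstLetter = view (_1 . _head) ("anita", True)
```

Here capitalised is ("Anita",True) and firstLetter is 'a'.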

1: Why are lenses composable?

At first glance, it’s hard to tell why _1 . _head even typechecks:

_1    :: Functor f => (a -> f c) -> (a, b) -> f (c, b)
_head :: Functor f => (a -> f a) -> [a] -> f [a]

And especially—why can we compose using . for function composition?

2: Why are lenses composable?

The key is to remember that a function of two arguments is really a function of one argument that returns a function.

_1 :: Functor f =>
      (a -> f c) ->
      ((a, b) -> f (c, b))

_head :: Functor f =>
         (a -> f a) ->
         ([a] -> f [a])

_1._head :: Functor f =>
            (a -> f a) ->
            ([a], b) -> f ([a], b)
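
The same composition solves the nested-record update from earlier. This sketch uses trimmed-down versions of the benchmarking records and hand-written field lenses; the L-suffixed names and payload0 are our own, not criterion's.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

type Lens s t a b = forall f. Functor f =>
                    (a -> f b) -> s -> f t

over :: Lens s t a b -> (a -> b) -> s -> t
over l f s = runIdentity (l (Identity . f) s)

view :: Lens s t a b -> s -> a
view l s = getConst (l Const s)

-- Simplified records: some fields from the slides are omitted.
data OutlierEffect = Unaffected | Slight | Moderate | Severe
  deriving (Eq, Show)

data OutlierVariance = OutlierVariance
  { ovEffect :: OutlierEffect, ovFraction :: Double }

data SampleAnalysis = SampleAnalysis
  { anMean :: [Double], anOutlierVar :: OutlierVariance }

data Payload = Payload
  { sample :: [Double], sampleAnalysis :: SampleAnalysis }

-- Hypothetical field lenses: get the field, and put an edited copy back.
ovEffectL :: Lens OutlierVariance OutlierVariance OutlierEffect OutlierEffect
ovEffectL f v = (\e -> v { ovEffect = e }) <$> f (ovEffect v)

anOutlierVarL :: Lens SampleAnalysis SampleAnalysis OutlierVariance OutlierVariance
anOutlierVarL f a = (\v -> a { anOutlierVar = v }) <$> f (anOutlierVar a)

sampleAnalysisL :: Lens Payload Payload SampleAnalysis SampleAnalysis
sampleAnalysisL f p = (\a -> p { sampleAnalysis = a }) <$> f (sampleAnalysis p)

-- Three levels of nesting, composed with plain (.).
effectL :: Lens Payload Payload OutlierEffect OutlierEffect
effectL = sampleAnalysisL . anOutlierVarL . ovEffectL

editEffect :: (OutlierEffect -> OutlierEffect) -> Payload -> Payload
editEffect = over effectL

payload0 :: Payload
payload0 = Payload [1, 2] (SampleAnalysis [1.5] (OutlierVariance Slight 0.05))
```

For example, view effectL (editEffect (const Severe) payload0) gives Severe, while payload0 itself still reads Slight.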

What next?

The best place to start is with the gateway drug:

The full monty:

Becoming more widely used in practice:

Spotter’s guide to lens operators

^. is view (think “getter”)

%~ is over (think “editor”)

.~ is set: like over, but it accepts a value instead of a function (think “setter”)

& is just $ with arguments flipped

Used as follows:

foo & someField %~ ('a':)
    & otherField .~ 'b'

(“Thing being modified, followed by modifiers in a chain.”)
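
For intuition, the operators can be defined on top of view and over. This is a simplified sketch: the real lens libraries give these operators more general types, and the fixities below are assumptions chosen so that the chain parses as shown.

```haskell
{-# LANGUAGE RankNTypes, TupleSections #-}

import Data.Function ((&))
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

type Lens s t a b = forall f. Functor f =>
                    (a -> f b) -> s -> f t

over :: Lens s t a b -> (a -> b) -> s -> t
over l f s = runIdentity (l (Identity . f) s)

view :: Lens s t a b -> s -> a
view l s = getConst (l Const s)

_1 :: Lens (a,b) (c,b) a c
_1 f (a,b) = (,b) <$> f a

_2 :: Lens (a,b) (a,c) b c
_2 f (a,b) = (a,) <$> f b

-- Getter: view with its arguments flipped.
infixl 8 ^.
(^.) :: s -> Lens s t a b -> a
s ^. l = view l s

-- Editor: over as an infix operator.
infixr 4 %~
(%~) :: Lens s t a b -> (a -> b) -> s -> t
l %~ f = over l f

-- Setter: over with a constant function.
infixr 4 .~
(.~) :: Lens s t a b -> b -> s -> t
l .~ b = over l (const b)

-- Thing being modified first, then a chain of modifiers.
example :: (String, Char)
example = ("bc", 'x') & _1 %~ ('a':)
                      & _2 .~ 'b'
```

Here example is ("abc",'b'), and example ^. _1 is "abc".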