Understanding bind
03 April 2016In Understanding map we learned that implementing
a map
function is what we call a Functor. In Applicative Functors
we extended that idea with the return
and apply
function. The next important function in our toolset is
the bind
function.
Monads
The combination of return
and bind
is what we call a Monad. But currently
I will not consider this as an introduction to Monads at all. If you heard the Monad term and
search for an introduction to understand what a Monad is you will not find an answer her. If
you already have some basic understanding about the term than this and my two previous blogs
can help to understand the concept. Otherwise if you just try to understand what a Monad is
I recommend the following link to understand the problem:
The what are Monads Fallacy
The Problem
I think it is always good to start with a problem. If we understand a problem first, we usually
have it easier to understand why we are doing something. Currently we have map
to upgrade functions with
one argument, with return
and apply
we could upgrade functions with multiple arguments. So
what is bind
supposed to do?
Up to this point we only upgraded functions that had normal unboxed input and output types. We always
faced functions like 'a -> 'b
, but never functions like 'a -> option<'b>
, 'a -> Async<'b>
or 'a -> list<'b>
. But in practice, the latter are quite common.
A simple example is a function that tries to parse a string
to a float
. Because parsing of
a string to a float could fail we usually expect a return type like option<float>
. Usually we
create a Double
extension for this.
1: 2: 3: 4: 5: 6: 7: |
|
We now have a function Double.tryParse
with the signature.
1:
|
|
I will call such functions Monadic functions from now on. All Monadic functions expect
normal input arguments, but return a boxed type, like option<'a>
, list<'a>
, Async<'a>
and so on.
The problem with such functions is that we cannot easily upgrade them like other functions. For example,
let's assume we have a option<string>
, and now we want to pass this value to Double.tryParse
.
As tryParse
only expects string
we could Option.map
tryParse
so it could work with
a option<string>
.
But map
not only adds a option
layer to the input, it also adds it to the output. When we
use Option.map
on our Double.tryParse
function, we get a function that looks like this:
1:
|
|
The problem is that our output is wrapped two times in the same layer. Now we have a
option containing an option containing a float. But what we really want is just an option<float>
.
This is where bind
comes into the play. The purpose of bind
is to only upgrade the input of a
function because the output of a function already returns an upgraded type. A bind
function thus
always have the type-signature
1: 2: 3: |
|
return
once again
The bind
function don't stands on it's own. We also need a return
function. But we
already covered this function in Applicative Functors.
Implementing bind
We can implement bind
in two different ways. It is good to know both as depending on which type we
have, sometimes the one or the other can be easier.
-
The obvious way. You directly write a
bind
function that is similar tomap
, but instead of wrapping the output, you just return the output of the function as-is. -
You first write a
join
,concat
orflatten
function (The exact name of such a function usually depends on the type you have). The idea of such a function is to resolve two boxed types just into a single box. After this you justmap
and thenjoin
the result to createbind
.
The option
type has the advantage that both implementations are easy, so let's look at how we could
implement bind
for Option
in both ways.
The direct way
The direct way can sometimes be nearly identical to map
. Let's look at the map
and bind
implementation side-by-side.
1: 2: 3: 4: 5: 6: 7: 8: 9: |
|
As you can see, both functions are nearly identical. The only difference is that we just do f x
instead
of Some (f x)
. We don't need to wrap the output in a Some
because our function f
already returns
an option
. So we just return it's output directly.
The join
way
The other way is to first implement a new function that can turn a option<option<'a>>
just into a
option<'a>
. That's also quite easy. We first check our outer-most option
. If it is None
we just return None
. In the Some
case we have another option that we directly return.
1: 2: 3: 4: |
|
Now we create bind
by just using map
and join
the result.
1:
|
|
Simple usage
Let's test both functions and compare it with a map
call.
1: 2: 3: |
|
As we can see from the signature. input1
and input2
are just option<float>
instead of option<option<float>>
that a map
will return us.
The Option
module already contains Option.map
and Option.bind
, so we don't have to
rewrite those ourselves. As another exercise, let's look at a bind
implementation for list
.
bind
for list
Creating a bind
for a list
is a case where the first-approach is usually really hard. Let's look
at a map
implementation for list
first.
1: 2: 3: 4: |
|
Option.bind
was really easy as we could directly return what the call to f
returned. But for a list
this is not possible. Because in a list we call f
multiple times for the input list, and the output
of those are collected into a new list.
Because f
is a Monadic function in bind
it means every call to f
will return a list. If
we add a list to another list, we get a list of list as expected list<list<'a>>
. If we try to
return a single list
instead, it means we have to loop over the result of f
and add its element
to another list.
Solving that problem inside of bind
is hard, because list
is an immutable data-structure. With a
mutable list (ResizeArray
) this operation would be quite easy, as we just could call f x
that
returns a list
and loop through it and add it to some other list, but with an immutable list we
cannot just add elements to an existing element.
When we really want to solve it in one-go we could use a mutable list like ResizeArray
, otherwise
we have to use two nested fold
or foldBack
calls. Instead of nesting it and turning it in a complex
function it is usually better to just extract those operation into it's own function. So we create
a concat
operation first, that can turn a list<list<'a>>
just into a single list.
I'm not showing how to implementing concat
for list
, as the focus is bind
not how immutable list
processing works. So for list
we usually would prefer just a map
and concat
implementation for
bind
.
1:
|
|
As you can see. f
in our example now can be a 'a -> 'b list
function. So it now produces a whole new list
for every input of our starting list, but we still get a single list, not a list of list back.
F# also provides an implementation for this function. But it is named List.collect
instead of List.bind
.
An operator for bind
In Applicative Functors we used <!>
for the map
function.
And <*>
for the apply
function. We use >>=
as an operator for the bind
function. But on top of it.
If we write it as an operator we swap the arguments. We expects our type option
, list
, async
on the
left-side and the function on the right-side.
1:
|
|
Continuation-passing Style
The reason for this change is that we think of bind
as some kind of
Continuation-passing Style programming.
To understand the change, we have to go back at the signature. Up until now i often described map
and apply
by the idea to just pass in the first argument. So when we have
1:
|
|
we see it as a function that just upgrades the input of a function. But we still have a two argument
function here, and the two argument form is how bind
is used most often. If we threat it as
a two-argument function we have something like this:
We have a option<'a>
as an input. And we provide a function 'a -> option<'b>
. As we can see, the input
of f
is just 'a
. So what we get as the input is the unwrapped 'a
that is inside option<'a>
.
It can help here if we think with piping |>
. The idea of piping is that we can write the next argument
of a function on the left side. So instead of f x
we also can write x |> f
. When we use bind
with
piping we have something like x |> Option.bind f
. We also can rearange the type-signature to reflect
this style of writing
1:
|
|
When we use piping with bind, we get something similar to the above. And probably the order becomes
clearer. We start with a boxed value like option<'a>
, then our bind
function somehow extract the
'a
from our option<'a>
, this 'a
is then passed to the function ('a -> option<'b>)
. This function
returns an option<'b>
what is also what bind
will then return!
But it is important to understand that there is no guarantee that our function will be called at all!
Look again at the implementation of bind
to understand this. bind
checks whether we have None
or Some
.
In the None
case it will just return None
only in the Some
case it will call f x
and execute
our function that we passed to bind
!
Not only that, the unwrapping of the option
is already handled for us by the bind
function. So we
can pass a function f
to bind
that only will be executed if we have Some value
.
Let's create an example to understand this idea in more depth. At first we create a function that
prints some text to screen and expect the user to enter a float. We try to parse the input as
float
with our Double.tryParse
function that returns an option<float>
.
1: 2: 3: |
|
Now we sure could write
1:
|
|
and someInput
would contain an option<float>
. We now could use that option<float>
with other
functions. We just could map
or apply
all other functions that are not compatible with option
.
But instead of doing that, let's pass the resulting option<float>
directly to bind
. We then
provide a continuation function to bind
that only will be executed if we have Some value
.
The advantage is that our f
function only sees a float
, not a option<float>
. We now
can do something with that float
.
Let's write an example where the user inputs the radius of a circle, and we calculate the area of that circle.
1: 2: 3: 4: 5: 6: 7: 8: |
|
Let's go through the example step-by-step.
-
At first we just create a function
circleArea
that calculates the area from a given radius. For such a function we just expectfloat
as input. We usually don't expectoption<float>
orlist<float>
as the input. -
Then we call
getUserInput "Enter radius"
. The user will see "Enter radius: " and he must enter something. The input will be parsed as afloat
. We will either getSome x
back if the user input was afloat
orNone
if the input was not valid. -
This option is then directly passed to
Option.bind
as the second argument. We use the Pipe|>
here to bring theoption
to the left-side. -
The right-side is now a continuation function. If the
option
passed tobind
containsSome x
, that means a validfloat
, our continuation function is called andbind
returns the result of our continuation function. If the input tobind
wasNone
,bind
will immediately returnNone
without executing the continuation function. -
Look at the type of
userInput
. It is afloat
not anoption<float>
. We have a continuation function that only will be execute if we have a validfloat
. And we can directly work with afloat
. -
In our Continuation function we use the
float
to calculate the area of a circle. As we only havefloat
not anoption<float>
we don't have tomap
circleArea
. -
As you now can see
let area
inside our continuation function is now a normalfloat
. But now we want to returnarea
as the result of our calculation. Butbind
must return anoption
value. So how do we do that? We use ourretn
(return) function to convert a normalfloat
into anoption<float>
-
Our outer
area
is now aoption<float>
that either isSome
and contains the calculated area for a circle. Or it isNone
, because the user input could not be parsed.
Currently we don't print the result. So let's print area
. As area
(outside of the
continuation function) is now a option<float>
we have to Pattern Match it to see if our computation
was successful or not.
1: 2: 3: |
|
If the user input was 10
for example, we will see The area of a circle is 314.159265
, but if we provide
an invalid input, we just see User Input was not a valid number
. In our example we first had a
option
value and passed it to Option.bind
with |>
. This happens often, that is why we created
>>=
previously.
Let's extend that example. We now ask the user for three inputs. And we will calculate the volume of a cube.
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: |
|
As we can see now. We ask the user three times to input a number X, Y and Z. If all inputs were valid. We
just calculate the volume with let volume = x * y * z
. The important aspect is that all of our values
are always float
never option<float>
, because the bind
operation >>=
already did the
unwrapping for us.
And probably it now becomes clear why we named our constructor return
(retn). Inside of our
continuation functions we never have lifted values. But at the end of our continuation functions we
always must return a lifted value. So lifting and returning is always the last statement we do.
Let's inspect the syntax a little bit deeper. Look at the syntax of a normal let
definition in F#. Usually a let
definition contains a name, a equal "=" and a expression that
will be executed. Actually just look at the following two lines and just compare them.
1: 2: |
|
Do you spot the similarities?
- Both definition have an expression
getUserInput "Length X"
this expression will be executed. - In the first example: We only have
=
for assignment, and we assign the result tolet x
. - In the second example: We have
>>= (fun x
as we assign the result of the expression tox
.
So what is the difference between both?
The first difference is that the statements are just flipped. With let
we have something like
1:
|
|
But with our bind
operation we just have
1:
|
|
But the more important difference is the result (our variable). In a normal let
definition we
will get option<float>
. But with bind
, we just get float
. bind
decides whether our
continuation function should be called or not.
Computation Expressions
The idea of this kind of continuation-passing style is actually really powerful. So powerful that F#
provides a language construct to let it look like normal code. At first, we just create a class
that contains a Bind
and Return
method that we want to use.
1: 2: 3: 4: 5: |
|
As you can see. The Bind
and Return
methods are not special. They are just the functions you already
know! After you created a class you must create an object of this class. That is our maybe
. Now you
can use the following special syntax.
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: |
|
So, what happens exactly here? Whenever you use let!
Bind
is just called. That means,
if you have option<float>
on the right side. But you use let! x
. Then you just get a float
.
Every code after let!
is automatically converted into a continuation function that is passed to
Bind
. The return
statement (that is only available inside a computation expression) turns
a normal value into a lifted value. In this example it wraps it inside a option
.
You now can write code as option
doesn't exists at all. Whenever you have a
function that returns an option
, you just must use let!
instead of let
. The let!
call uses
Bind
under the hood. You never need to upgrade functions with map
or apply
as you don't
work with lifted values. You can use all your functions directly.
But it doesn't mean that we just erased option
. option
is still present, but the handling
of it is done by the bind
function. Whenever we have an expression on the right side that for example
returns a None
then the computation stops at this point. Why? Because our bind
function only calls
the passed in f
function (the continuation) in the Some
case.
And it overall also means that the result of a maybe { ... }
is always an option
! Because it
is an option
you easily can use functions defined with a maybe { ... }
construct in other
maybe { ... }
constructs.
On top of it you still get the safety that option
provides you, that means at some point you must
check the value. But it is up to you if you just use a generic check that you implemented in bind
, or
write your own handling.
What you see here is a basic implementation of the Maybe Monad. And it is the implementation of the second solution I showed in the null is Evil post.
Defining map
and apply
through bind
The combination of return
and bind
is really powerful. In
Applicative Functors we already saw that we can
implement map
through return
and apply
. But with return
and bind
we can easily implement
map
and apply
.
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: |
|
Because of this we always have an Applicative Functor when we have a Monad.
Kleisli Composition
Function composition is the idea to create a new function out of two smaller functions. It usually works as long we have two function with a matching output and input type.
1:
|
|
Because 'b
is the output of anothers function input, we can directly create a new composed
function that goes from 'a
to 'c
'a -> 'c
. But this doesn't work for Monadic functions
as they don't have matching input/output.
1: 2: |
|
These functions cannot be composed because option<'b>
is not the same as 'b
. But with our
bind operator >>=
we can easily pass boxed values into function that don't expect them. Because
of that we also can create a compose function that directly compose two Monadic functions.
We use the operator >=>
for this kind of composition. This kind of composition is also named
Kleisli composition.
1: 2: |
|
Now we can compose two Monadic functions directly.
1:
|
|
the result is a new Monadic function.
1:
|
|
Laws
We already saw Laws for Functors and Applicative Functors. The combination of return
and bind
(a Monad) also must satisfy three laws. In the following description I use
1: 2: 3: 4: |
|
But sure, all laws have to work with any function or value combination. But seeing some actual values makes it easier to understand the laws.
1. Law: Left identity
When we return
(box) a value and then use bind
(that unbox the value) and pass it to a function.
It is the same as directly passing the value to a function.
1:
|
|
2. Law: Right identity
Binding a boxed value and returning it, is the same as the boxed value
1:
|
|
3. Law: Associative
Order of composing don't play a role. We can pass a value to f
and the result to g
and
it has to be the same as if we compose f
and g
first, and pass our value to the composed
function.
1: 2: 3: 4: |
|
Summary
With map
, retn
, apply
and bind
we have four general functions that simplifies working
with boxed types like option
, list
, Async
and so on. Whenever you create a new type you
should consider implementing those functions too. Here is a quick overview of those
functions and when to use them.
map
1:
|
|
When we interpret it as a "one-argument" function we can add our boxed type M
to the input and
output of a function.
Interpreted as a "two-argument" function we can use a boxed value M<'a>
directly with a function
that can work with the wrapped type 'a
.
apply
1:
|
|
With apply we can work with boxed functions. We get those as a result if we try to map
a function
that has more than one argument. Or we just lift a function with return
. We can view apply
as
Partial Application for boxed function. With every call we can provide the next value to a
function that also is a boxed value. In this way we can turn every argument of a function
to a boxed value. A function like int -> string -> float -> int
can thus be turned into
1:
|
|
return
or retn
1:
|
|
It just boxes a 'a
bind
1:
|
|
Interpreted as a one-argument function, we can upgrade a function like map
. The difference is
that we only upgrade the input, because the function we have already return a boxed value.
Interpreted as a two-argument function, we see it as a form of Continuation passing style. We
often use piping with |>
to get the value to the left-side and the continuation function on
the right-side.
1:
|
|
On top, we give |> M.bind
it's own operator >>=
1:
|
|
This way we have a boxed value M<'a>
, but our function f
only receives an unboxed 'a
. In this way
we can work with unboxed values and also use any function without explicitly box them. Because
we must return boxed values we usually use return
to return/box an unboxed value inside of f
.
The syntax of this kind of continuation-passing style can be improved with a Computation Expression.
Implementations
map
can be implemented throughreturn
andapply
map
can be implemented throughreturn
andbind
apply
can be implemented throughreturn
andbind
bind
can be implemented throughmap
and some kind ofconcat
operation
struct
member CompareTo : value:obj -> int + 1 overload
member Equals : obj:obj -> bool + 1 overload
member GetHashCode : unit -> int
member GetTypeCode : unit -> TypeCode
member ToString : unit -> string + 3 overloads
static val MinValue : float
static val MaxValue : float
static val Epsilon : float
static val NegativeInfinity : float
static val PositiveInfinity : float
...
end
Full name: System.Double
Full name: Main.tryParse
Double.TryParse(s: string, style: Globalization.NumberStyles, provider: IFormatProvider, result: byref<float>) : bool
val string : value:'T -> string
Full name: Microsoft.FSharp.Core.Operators.string
--------------------
type string = System.String
Full name: Microsoft.FSharp.Core.string
Full name: Microsoft.FSharp.Core.option<_>
val float : value:'T -> float (requires member op_Explicit)
Full name: Microsoft.FSharp.Core.Operators.float
--------------------
type float = System.Double
Full name: Microsoft.FSharp.Core.float
--------------------
type float<'Measure> = float
Full name: Microsoft.FSharp.Core.float<_>
Full name: Microsoft.FSharp.Collections.list<_>
type Async
static member AsBeginEnd : computation:('Arg -> Async<'T>) -> ('Arg * AsyncCallback * obj -> IAsyncResult) * (IAsyncResult -> 'T) * (IAsyncResult -> unit)
static member AwaitEvent : event:IEvent<'Del,'T> * ?cancelAction:(unit -> unit) -> Async<'T> (requires delegate and 'Del :> Delegate)
static member AwaitIAsyncResult : iar:IAsyncResult * ?millisecondsTimeout:int -> Async<bool>
static member AwaitTask : task:Task -> Async<unit>
static member AwaitTask : task:Task<'T> -> Async<'T>
static member AwaitWaitHandle : waitHandle:WaitHandle * ?millisecondsTimeout:int -> Async<bool>
static member CancelDefaultToken : unit -> unit
static member Catch : computation:Async<'T> -> Async<Choice<'T,exn>>
static member FromBeginEnd : beginAction:(AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromBeginEnd : arg:'Arg1 * beginAction:('Arg1 * AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromBeginEnd : arg1:'Arg1 * arg2:'Arg2 * beginAction:('Arg1 * 'Arg2 * AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromBeginEnd : arg1:'Arg1 * arg2:'Arg2 * arg3:'Arg3 * beginAction:('Arg1 * 'Arg2 * 'Arg3 * AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromContinuations : callback:(('T -> unit) * (exn -> unit) * (OperationCanceledException -> unit) -> unit) -> Async<'T>
static member Ignore : computation:Async<'T> -> Async<unit>
static member OnCancel : interruption:(unit -> unit) -> Async<IDisposable>
static member Parallel : computations:seq<Async<'T>> -> Async<'T []>
static member RunSynchronously : computation:Async<'T> * ?timeout:int * ?cancellationToken:CancellationToken -> 'T
static member Sleep : millisecondsDueTime:int -> Async<unit>
static member Start : computation:Async<unit> * ?cancellationToken:CancellationToken -> unit
static member StartAsTask : computation:Async<'T> * ?taskCreationOptions:TaskCreationOptions * ?cancellationToken:CancellationToken -> Task<'T>
static member StartChild : computation:Async<'T> * ?millisecondsTimeout:int -> Async<Async<'T>>
static member StartChildAsTask : computation:Async<'T> * ?taskCreationOptions:TaskCreationOptions -> Async<Task<'T>>
static member StartImmediate : computation:Async<unit> * ?cancellationToken:CancellationToken -> unit
static member StartWithContinuations : computation:Async<'T> * continuation:('T -> unit) * exceptionContinuation:(exn -> unit) * cancellationContinuation:(OperationCanceledException -> unit) * ?cancellationToken:CancellationToken -> unit
static member SwitchToContext : syncContext:SynchronizationContext -> Async<unit>
static member SwitchToNewThread : unit -> Async<unit>
static member SwitchToThreadPool : unit -> Async<unit>
static member TryCancelled : computation:Async<'T> * compensation:(OperationCanceledException -> unit) -> Async<'T>
static member CancellationToken : Async<CancellationToken>
static member DefaultCancellationToken : CancellationToken
Full name: Microsoft.FSharp.Control.Async
--------------------
type Async<'T>
Full name: Microsoft.FSharp.Control.Async<_>
Full name: Main.mapOption
Full name: Main.bindOption
Full name: Main.joinOption
Full name: Main.bindOption2
Full name: Main.input1
Full name: Main.input2
Full name: Main.input3
Full name: Main.mapList
module List
from Microsoft.FSharp.Collections
--------------------
type List<'T> =
| ( [] )
| ( :: ) of Head: 'T * Tail: 'T list
interface IEnumerable
interface IEnumerable<'T>
member GetSlice : startIndex:int option * endIndex:int option -> 'T list
member Head : 'T
member IsEmpty : bool
member Item : index:int -> 'T with get
member Length : int
member Tail : 'T list
static member Cons : head:'T * tail:'T list -> 'T list
static member Empty : 'T list
Full name: Microsoft.FSharp.Collections.List<_>
Full name: Microsoft.FSharp.Collections.List.foldBack
Full name: Main.bindList
Full name: Microsoft.FSharp.Collections.List.map
Full name: Microsoft.FSharp.Collections.List.concat
from Microsoft.FSharp.Core
Full name: Microsoft.FSharp.Core.Option.bind
Full name: Main.getUserInput
Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.printfn
static member BackgroundColor : ConsoleColor with get, set
static member Beep : unit -> unit + 1 overload
static member BufferHeight : int with get, set
static member BufferWidth : int with get, set
static member CapsLock : bool
static member Clear : unit -> unit
static member CursorLeft : int with get, set
static member CursorSize : int with get, set
static member CursorTop : int with get, set
static member CursorVisible : bool with get, set
...
Full name: System.Console
Full name: Main.someInput
Full name: Main.retn
Full name: Main.circleArea
static val PI : float
static val E : float
static member Abs : value:sbyte -> sbyte + 6 overloads
static member Acos : d:float -> float
static member Asin : d:float -> float
static member Atan : d:float -> float
static member Atan2 : y:float * x:float -> float
static member BigMul : a:int * b:int -> int64
static member Ceiling : d:decimal -> decimal + 1 overload
static member Cos : d:float -> float
...
Full name: System.Math
Full name: Main.area
Full name: Main.cubeVolume
type MaybeBuilder =
new : unit -> MaybeBuilder
member Bind : m:'b option * f:('b -> 'c option) -> 'c option
member Return : x:'a -> 'a option
Full name: Main.MaybeBuilder
--------------------
new : unit -> MaybeBuilder
Full name: Main.MaybeBuilder.Bind
Full name: Main.MaybeBuilder.Return
Full name: Main.maybe
Full name: Main.cubeVolume2
Full name: Main.map
Full name: Main.apply
Full name: Main.f
Full name: Main.g
Full name: Main.x
Full name: Main.m
Full name: Main.ax
Full name: Main.ay
val int : value:'T -> int (requires member op_Explicit)
Full name: Microsoft.FSharp.Core.Operators.int
--------------------
type int = int32
Full name: Microsoft.FSharp.Core.int
--------------------
type int<'Measure> = int
Full name: Microsoft.FSharp.Core.int<_>