Sequence and Traverse · David Raab

Sequence and Traverse

One problem that appears from time to time is that we we have some kind of collection (I use list here) and we want to map every element with a monadic function 'a -> M<'b>. This then returns a list<M<'a>>. But often we want a M<list<'a>>.

To be more concrete. Let's assume we want to turn a list of strings into integers. We could write a tryParseInt function that does string -> option<int>. But if we map our function with a list<string> we get a list<option<int>> back.

Sometimes that is what we want, but very often it is not. Usually what we want is a option<list<int>> instead. So we expect the types to be switched.

The idea behind it in this example is that when we use map over a list of strings it is either successful as a whole and all elements are integers or as soon a single element is not an integer we get None back as a whole.

We could just write a function that somehow does that kind of transformation just for tryParseInt, but instead of doing that again and again for every function we try to generalize that problem, so we can turn every list<option<'a>> into option<list<'a>>. Not only that, we want to generalize the problem that it works for any type, not just option.

This generalization is what we think of the sequence and traverse functions.

Sequence

We first start with the monadic tryParseInt function we already mentioned.

1: 
2: 
3: 
4: 
let tryParseInt str =
    match System.Int32.TryParse(str) with
    | false,_ -> None
    | true,x  -> Some x

Next, we have some kind of input from a file, user or somewhere else.

1: 
2: 
let validInput   = ["1";"100";"12";"5789"]
let invalidInput = ["1";"100";"12";"foo"]

In our example we want to do the following:

  1. Parse every string to an int
  2. If all inputs are valid, we want to sum the results
  3. If one input is invalid, we want to print an error message

The first step is easy, as we could just map the list.

1: 
let validInts = List.map tryParseInt validInput

We now have a list containing:

1: 
[Some 1; Some 100; Some 12; Some 5789]

The problem starts in how we determine that every element is valid. We sure could use fold to loop through our list. Starting with a bool set to true and as soon we encounter a None we set the bool to false.

But we already have Option for this kind of purpose. With Option we still can return the idea of true (Some) and false (None), but additional we also can return a value.

Instead of just getting a boolean flag like true and false we just return Some with a new list that has all the option values stripped instead. For the valid input case we just expect:

1: 
Some [1; 100; 12; 5789]

for the invalid input list we just expect:

1: 
None

As we need to loop through every element and build a new list, this is just a task for List.foldBack.

  1. Because we want a option<list<'a>> as a result, we start with Some []
  2. Then we check if acc and x are both Some
  3. If that is the case we add x to acc
  4. Otherwise we return None

We name this operation sequence.

1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
let sequenceFold listOfOptions =
    let folder x acc =
        match x,acc with
        | Some x, Some list -> Some (x :: list)
        | Some _, None _    -> None
        | None _, Some _    -> None
        | None _, None _    -> None
    List.foldBack folder listOfOptions (Some [])

When we test our function it returns the right result

1: 
2: 
3: 
4: 
5: 
List.map tryParseInt validInput |> sequenceFold
// Some [1; 100; 12; 5789]

List.map tryParseInt invalidInput |> sequenceFold
// None

Nice, it works! But this implementation still has a problem. Our folder function is basically a duplicate of the apply function! The whole idea to check two option and only execute some code if both are Some is exactly what apply does. Let's look again at apply.

1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
let apply fo xo =
    match fo,xo with
    | Some f, Some x -> Some (f x)
    | Some _, None _ -> None
    | None _, Some _ -> None
    | None _, None _ -> None

let (<*>) = apply

There is also another problem here. It doesn't matter which type we use. We always have to lift an empty list. In this case we did Some [] for the accumulator. But in a Async case we just want an empty list inside an Async. So we always just want to return a list for the context. It just means: We always can create a sequence function as long our type provides a return and apply function. Or in other words, our type is an Applicative Functor.

Let's think about how we can implement sequence with return and apply.

  1. The code we executed when we had two Some values was x :: list
  2. So we just create a function for this operation
  3. And apply this function
  4. Then we use this function as our folder function
1: 
2: 
3: 
4: 
5: 
let retn x = Some x

let sequence listOfOptions =
    let folder x xs = retn (fun x xs -> x :: xs) <*> x <*> xs
    List.foldBack folder listOfOptions (retn [])

We still get our expected results

1: 
2: 
3: 
4: 
5: 
List.map tryParseInt validInput |> sequence
// Some [1; 100; 12; 5789]

List.map tryParseInt invalidInput |> sequence
// None

Nice, everything works. So why is rewriting sequence in such a way better? What we basically have here is a Design Pattern (or what every Design Pattern is -- A Copy & Paste Pattern).

The sequence operation for a list is always the same. It just depends solely that a type supports a retn and a apply function. It probably opens up the question that when the implementation is always the same if we cannot just have a single implementation?

Yes and no. Currently in F# retn and <*> are not Polymorphic, they are specific functions we define ourself. We could fix it with Higher-Kinded Polymorphism but F# don't support this nicely, but there are ways around it. But i will not cover this topic here.

Traverse

So far we discussed sequence but in practice you will less likely implement sequence at all. Instead we will implement traverse. So how is traverse different from sequence?

As you have seen so far. Even with sequence there is one pattern that is always the same. You first map a list, then you use sequence on it. traverse is just the idea to combine both operations into a single operation.

If that sounds complicated, it isn't at all! Just think for a moment. map just means we apply a function to every element before we use sequence. So the only thing we need to implement traverse is to make sure we call a function that transforms every element before we pass it to our lifted function.

1: 
2: 
3: 
let traverse f list =
    let folder x xs = retn (fun x xs -> x :: xs) <*> f x <*> xs
    List.foldBack folder list (retn [])

The difference is so minimal that it can even be overlooked easily. We added the function call between the <*> operators: ... <*> f x <*> xs. Instead of map and then calling sequence, we now just can use traverse instead.

1: 
2: 
3: 
4: 
5: 
traverse tryParseInt validInput
// Some [1; 100; 12; 5789]

traverse tryParseInt invalidInput
// None

If the logic seems still hard to follow. We just can think of traverse as a map function for monadic functions that additionally swaps the layer when it finishes. When we use

1: 
List.map tryParseInt xs

We get a list<option<'b>>. But when we use

1: 
traverse tryParseInt xs

we get a option<list<'b>>.

Sequence defined through traverse

The primary reason why you less likely implement sequence is because traverse is basically the same implementation. You can enhance a sequence implementation easily by just adding a function call to the element before you apply it.

Once you have a traverse function, you can very easily create sequence by just using the id function with traverse.

1: 
let sequence xs = traverse id xs

You could even come to the conclusion to not provide a sequence implementation at all.

Finishing the Example

With traverse we now can easily finish our example that we started.

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
let sum input =
    input
    |> traverse tryParseInt
    |> Option.map List.sum

let printSum opt =
    match opt with
    | None     -> printfn "Error: Some inputs were not numbers!"
    | Some sum -> printfn "Sum: %d" sum

printSum (sum validInput)
printSum (sum invalidInput)

This code now produces:

1: 
2: 
Sum: 5902
Error: Some inputs were not numbers!

Not limited to Option

It is in general important to understand that this concept is not limited to list and option. The only thing we need is a data-structure that has a fold function, and another type that is an Applicative Functor (has return and a apply function).

After we have those the general idea is to swap both layers. For example when we have a monadic function download that has the signature Uri -> Async<string> we expect that we can use this function on a list<Uri>. With List.map we would get a

1: 
list<Async<string>>

But when we use traverse we get a

1: 
Async<list<string>>

We don't need to write such a function for the Async type as we can use Async.Parallel. Async.Parallel is basically the sequence function. It takes a seq<Async<'a>> and turns it into an Async containing an array Async<'b []>. But anyway to see how it works we could extend the Async module with the needed functions.

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
20: 
module Async =
    let retn x = async { return x }

    let apply af ax = async {
        // We start both async task in Parallel
        let! pf = Async.StartChild af
        let! px = Async.StartChild ax
        // We then wait that both async operations complete
        let! f = pf
        let! x = px
        // Finally we execute (f x)
        return f x
    }

    let (<*>)   = apply
    let map f x = retn f <*> x

    let traverse f list =
        let folder x xs = retn (fun x xs -> x :: xs) <*> f x <*> xs
        List.foldBack folder list (retn [])

We now can use Async.traverse in the following way

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
let uris = [Uri("http://www.google.com"); Uri("https://fsharpforfunandprofit.com")]
let download uri =
    use wc = new System.Net.WebClient()
    wc.AsyncDownloadString(uri)

let sizes =
    uris
    |> Async.traverse download
    |> Async.map (List.map (fun str -> str.Length))
    |> Async.RunSynchronously

for size in sizes do
    printfn "Content Length: %d" size

With Async.traverse we get a single Async that only completes once all Uris are downloaded. As you can see from the implementation. Async.traverse is identical to the version that we wrote for the option type. We just have to set retn and <*> to functions that work with Async.

Further Reading

module Main
type Uri = System.Uri

Full name: Main.Uri
namespace System
Multiple items
type Uri =
  new : uriString:string -> Uri + 5 overloads
  member AbsolutePath : string
  member AbsoluteUri : string
  member Authority : string
  member DnsSafeHost : string
  member Equals : comparand:obj -> bool
  member Fragment : string
  member GetComponents : components:UriComponents * format:UriFormat -> string
  member GetHashCode : unit -> int
  member GetLeftPart : part:UriPartial -> string
  ...

Full name: System.Uri

--------------------
System.Uri(uriString: string) : unit
System.Uri(uriString: string, uriKind: System.UriKind) : unit
System.Uri(baseUri: System.Uri, relativeUri: string) : unit
System.Uri(baseUri: System.Uri, relativeUri: System.Uri) : unit
val tryParseInt : str:string -> int option

Full name: Main.tryParseInt
val str : string
type Int32 =
  struct
    member CompareTo : value:obj -> int + 1 overload
    member Equals : obj:obj -> bool + 1 overload
    member GetHashCode : unit -> int
    member GetTypeCode : unit -> TypeCode
    member ToString : unit -> string + 3 overloads
    static val MaxValue : int
    static val MinValue : int
    static member Parse : s:string -> int + 3 overloads
    static member TryParse : s:string * result:int -> bool + 1 overload
  end

Full name: System.Int32
System.Int32.TryParse(s: string, result: byref<int>) : bool
System.Int32.TryParse(s: string, style: System.Globalization.NumberStyles, provider: System.IFormatProvider, result: byref<int>) : bool
union case Option.None: Option<'T>
val x : int
union case Option.Some: Value: 'T -> Option<'T>
val validInput : string list

Full name: Main.validInput
val invalidInput : string list

Full name: Main.invalidInput
val validInts : int option list

Full name: Main.validInts
Multiple items
module List

from Microsoft.FSharp.Collections

--------------------
type List<'T> =
  | ( [] )
  | ( :: ) of Head: 'T * Tail: 'T list
  interface IEnumerable
  interface IEnumerable<'T>
  member GetSlice : startIndex:int option * endIndex:int option -> 'T list
  member Head : 'T
  member IsEmpty : bool
  member Item : index:int -> 'T with get
  member Length : int
  member Tail : 'T list
  static member Cons : head:'T * tail:'T list -> 'T list
  static member Empty : 'T list

Full name: Microsoft.FSharp.Collections.List<_>
val map : mapping:('T -> 'U) -> list:'T list -> 'U list

Full name: Microsoft.FSharp.Collections.List.map
val sequenceFold : listOfOptions:'a option list -> 'a list option

Full name: Main.sequenceFold
val listOfOptions : 'a option list
val folder : ('b option -> 'b list option -> 'b list option)
val x : 'b option
val acc : 'b list option
val x : 'b
Multiple items
val list : 'b list

--------------------
type 'T list = List<'T>

Full name: Microsoft.FSharp.Collections.list<_>
val foldBack : folder:('T -> 'State -> 'State) -> list:'T list -> state:'State -> 'State

Full name: Microsoft.FSharp.Collections.List.foldBack
val apply : fo:('a -> 'b) option -> xo:'a option -> 'b option

Full name: Main.apply
val fo : ('a -> 'b) option
val xo : 'a option
val f : ('a -> 'b)
val x : 'a
val retn : x:'a -> 'a option

Full name: Main.retn
val sequence : listOfOptions:'a option list -> 'a list option

Full name: Main.sequence
val xs : 'b list option
val xs : 'b list
val traverse : f:('a -> 'b option) -> list:'a list -> 'b list option

Full name: Main.traverse
val f : ('a -> 'b option)
Multiple items
val list : 'a list

--------------------
type 'T list = List<'T>

Full name: Microsoft.FSharp.Collections.list<_>
val folder : ('a -> 'b list option -> 'b list option)
val sequence : xs:'a option list -> 'a list option

Full name: Main.sequence
val xs : 'a option list
val id : x:'T -> 'T

Full name: Microsoft.FSharp.Core.Operators.id
val sum : input:string list -> int option

Full name: Main.sum
val input : string list
module Option

from Microsoft.FSharp.Core
val map : mapping:('T -> 'U) -> option:'T option -> 'U option

Full name: Microsoft.FSharp.Core.Option.map
val sum : list:'T list -> 'T (requires member ( + ) and member get_Zero)

Full name: Microsoft.FSharp.Collections.List.sum
val printSum : opt:int option -> unit

Full name: Main.printSum
val opt : int option
val printfn : format:Printf.TextWriterFormat<'T> -> 'T

Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.printfn
val sum : int
type 'T list = List<'T>

Full name: Microsoft.FSharp.Collections.list<_>
Multiple items
type Async
static member AsBeginEnd : computation:('Arg -> Async<'T>) -> ('Arg * AsyncCallback * obj -> IAsyncResult) * (IAsyncResult -> 'T) * (IAsyncResult -> unit)
static member AwaitEvent : event:IEvent<'Del,'T> * ?cancelAction:(unit -> unit) -> Async<'T> (requires delegate and 'Del :> Delegate)
static member AwaitIAsyncResult : iar:IAsyncResult * ?millisecondsTimeout:int -> Async<bool>
static member AwaitTask : task:Task -> Async<unit>
static member AwaitTask : task:Task<'T> -> Async<'T>
static member AwaitWaitHandle : waitHandle:WaitHandle * ?millisecondsTimeout:int -> Async<bool>
static member CancelDefaultToken : unit -> unit
static member Catch : computation:Async<'T> -> Async<Choice<'T,exn>>
static member FromBeginEnd : beginAction:(AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromBeginEnd : arg:'Arg1 * beginAction:('Arg1 * AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromBeginEnd : arg1:'Arg1 * arg2:'Arg2 * beginAction:('Arg1 * 'Arg2 * AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromBeginEnd : arg1:'Arg1 * arg2:'Arg2 * arg3:'Arg3 * beginAction:('Arg1 * 'Arg2 * 'Arg3 * AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromContinuations : callback:(('T -> unit) * (exn -> unit) * (OperationCanceledException -> unit) -> unit) -> Async<'T>
static member Ignore : computation:Async<'T> -> Async<unit>
static member OnCancel : interruption:(unit -> unit) -> Async<IDisposable>
static member Parallel : computations:seq<Async<'T>> -> Async<'T []>
static member RunSynchronously : computation:Async<'T> * ?timeout:int * ?cancellationToken:CancellationToken -> 'T
static member Sleep : millisecondsDueTime:int -> Async<unit>
static member Start : computation:Async<unit> * ?cancellationToken:CancellationToken -> unit
static member StartAsTask : computation:Async<'T> * ?taskCreationOptions:TaskCreationOptions * ?cancellationToken:CancellationToken -> Task<'T>
static member StartChild : computation:Async<'T> * ?millisecondsTimeout:int -> Async<Async<'T>>
static member StartChildAsTask : computation:Async<'T> * ?taskCreationOptions:TaskCreationOptions -> Async<Task<'T>>
static member StartImmediate : computation:Async<unit> * ?cancellationToken:CancellationToken -> unit
static member StartWithContinuations : computation:Async<'T> * continuation:('T -> unit) * exceptionContinuation:(exn -> unit) * cancellationContinuation:(OperationCanceledException -> unit) * ?cancellationToken:CancellationToken -> unit
static member SwitchToContext : syncContext:SynchronizationContext -> Async<unit>
static member SwitchToNewThread : unit -> Async<unit>
static member SwitchToThreadPool : unit -> Async<unit>
static member TryCancelled : computation:Async<'T> * compensation:(OperationCanceledException -> unit) -> Async<'T>
static member CancellationToken : Async<CancellationToken>
static member DefaultCancellationToken : CancellationToken

Full name: Microsoft.FSharp.Control.Async

--------------------
type Async<'T>

Full name: Microsoft.FSharp.Control.Async<_>
Multiple items
val string : value:'T -> string

Full name: Microsoft.FSharp.Core.Operators.string

--------------------
type string = System.String

Full name: Microsoft.FSharp.Core.string
val retn : x:'a -> Async<'a>

Full name: Main.Async.retn
val async : AsyncBuilder

Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.async
val apply : af:Async<('a -> 'b)> -> ax:Async<'a> -> Async<'b>

Full name: Main.Async.apply
val af : Async<('a -> 'b)>
val ax : Async<'a>
val pf : Async<('a -> 'b)>
static member Async.StartChild : computation:Async<'T> * ?millisecondsTimeout:int -> Async<Async<'T>>
val px : Async<'a>
val map : f:('a -> 'b) -> x:Async<'a> -> Async<'b>

Full name: Main.Async.map
val x : Async<'a>
val traverse : f:('a -> Async<'b>) -> list:'a list -> Async<'b list>

Full name: Main.Async.traverse
val f : ('a -> Async<'b>)
val folder : ('a -> Async<'b list> -> Async<'b list>)
val xs : Async<'b list>
val uris : System.Uri list

Full name: Main.uris
val download : uri:System.Uri -> Async<string>

Full name: Main.download
val uri : System.Uri
val wc : System.Net.WebClient
namespace System.Net
Multiple items
type WebClient =
  inherit Component
  new : unit -> WebClient
  member BaseAddress : string with get, set
  member CachePolicy : RequestCachePolicy with get, set
  member CancelAsync : unit -> unit
  member Credentials : ICredentials with get, set
  member DownloadData : address:string -> byte[] + 1 overload
  member DownloadDataAsync : address:Uri -> unit + 1 overload
  member DownloadFile : address:string * fileName:string -> unit + 1 overload
  member DownloadFileAsync : address:Uri * fileName:string -> unit + 1 overload
  member DownloadString : address:string -> string + 1 overload
  ...

Full name: System.Net.WebClient

--------------------
System.Net.WebClient() : unit
member System.Net.WebClient.AsyncDownloadString : address:System.Uri -> Async<string>
val sizes : int list

Full name: Main.sizes
Multiple items
module Async

from Main

--------------------
type Async
static member AsBeginEnd : computation:('Arg -> Async<'T>) -> ('Arg * AsyncCallback * obj -> IAsyncResult) * (IAsyncResult -> 'T) * (IAsyncResult -> unit)
static member AwaitEvent : event:IEvent<'Del,'T> * ?cancelAction:(unit -> unit) -> Async<'T> (requires delegate and 'Del :> Delegate)
static member AwaitIAsyncResult : iar:IAsyncResult * ?millisecondsTimeout:int -> Async<bool>
static member AwaitTask : task:Task -> Async<unit>
static member AwaitTask : task:Task<'T> -> Async<'T>
static member AwaitWaitHandle : waitHandle:WaitHandle * ?millisecondsTimeout:int -> Async<bool>
static member CancelDefaultToken : unit -> unit
static member Catch : computation:Async<'T> -> Async<Choice<'T,exn>>
static member FromBeginEnd : beginAction:(AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromBeginEnd : arg:'Arg1 * beginAction:('Arg1 * AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromBeginEnd : arg1:'Arg1 * arg2:'Arg2 * beginAction:('Arg1 * 'Arg2 * AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromBeginEnd : arg1:'Arg1 * arg2:'Arg2 * arg3:'Arg3 * beginAction:('Arg1 * 'Arg2 * 'Arg3 * AsyncCallback * obj -> IAsyncResult) * endAction:(IAsyncResult -> 'T) * ?cancelAction:(unit -> unit) -> Async<'T>
static member FromContinuations : callback:(('T -> unit) * (exn -> unit) * (OperationCanceledException -> unit) -> unit) -> Async<'T>
static member Ignore : computation:Async<'T> -> Async<unit>
static member OnCancel : interruption:(unit -> unit) -> Async<IDisposable>
static member Parallel : computations:seq<Async<'T>> -> Async<'T []>
static member RunSynchronously : computation:Async<'T> * ?timeout:int * ?cancellationToken:CancellationToken -> 'T
static member Sleep : millisecondsDueTime:int -> Async<unit>
static member Start : computation:Async<unit> * ?cancellationToken:CancellationToken -> unit
static member StartAsTask : computation:Async<'T> * ?taskCreationOptions:TaskCreationOptions * ?cancellationToken:CancellationToken -> Task<'T>
static member StartChild : computation:Async<'T> * ?millisecondsTimeout:int -> Async<Async<'T>>
static member StartChildAsTask : computation:Async<'T> * ?taskCreationOptions:TaskCreationOptions -> Async<Task<'T>>
static member StartImmediate : computation:Async<unit> * ?cancellationToken:CancellationToken -> unit
static member StartWithContinuations : computation:Async<'T> * continuation:('T -> unit) * exceptionContinuation:(exn -> unit) * cancellationContinuation:(OperationCanceledException -> unit) * ?cancellationToken:CancellationToken -> unit
static member SwitchToContext : syncContext:SynchronizationContext -> Async<unit>
static member SwitchToNewThread : unit -> Async<unit>
static member SwitchToThreadPool : unit -> Async<unit>
static member TryCancelled : computation:Async<'T> * compensation:(OperationCanceledException -> unit) -> Async<'T>
static member CancellationToken : Async<CancellationToken>
static member DefaultCancellationToken : CancellationToken

Full name: Microsoft.FSharp.Control.Async

--------------------
type Async<'T>

Full name: Microsoft.FSharp.Control.Async<_>
property System.String.Length: int
static member Async.RunSynchronously : computation:Async<'T> * ?timeout:int * ?cancellationToken:System.Threading.CancellationToken -> 'T
val size : int