Happy Tuesday, Fantasists! Sorry for the wait; Iāve been chasing around an issue to change this entry! No movement on that yet, so letās soldier on! Weāve seen that chain
allows us to sequence our actions, which means we can now do pretty much anything we want! As fellow JavaScripters, this is of course the time we get cynical; it canāt be that simple, right? Absolutely not, and letās take a look at a convoluted example to show us why!
Have a gander at this little number, which emulates the behaviour of the UNIX seq
command, with the help of our old friend Pair
:
// Click the above link for a full
// explanation of this type!
const Writer = Pair(Array)
//+ seq :: Int -> Writer [Int] Int
const seq = upper => {
//+ seq_ :: Int -> Writer [Int] Int
const seq_ = init =>
init >= upper
// If we're done, stop here!
? Writer([init], upper)
// If we're not...
: Writer([init], init + 1)
.chain(seq_) // ...chain(seq_)!
// Kick everything off
return seq_(1)
}
seq(5)._1 // [1, 2, 3, 4, 5]
Everything looks fine so far: seq_
is just a recursive function. Until init
exceeds upper
, we log the current value and call seq_
on the next integer. Pick any number and see that it works: 10
, 100
, 1000
, ā¦ oh.
> seq(10000)._1
RangeError: Maximum call stack size exceeded
Now, weāre supposedly using computers. They take people to the moon, control nuclear reactors, and get us pictures of cats; surely itās not too big an ask to count to 10000
without exploding?
Our problem here is that, every time we chain(seq_)
, we store the calling functionās local state on the stack until the called function returns; as you can imagine, itās actually pretty expensive to save ten thousand of these states. When we fill up the stack and try to carry on, we cause a stack overflow error. If only there were some website to help usā¦
Typically, in our non-functional JavaScript, we get round this problem with a loop. Itās no secret that we could write seq
as:
const seq = upper => {
const acc = []
for (let i = 1; i <= upper; i++)
acc.push(i)
return acc
}
seq(10000) // Bam, no trouble
See? No stack overflow; we use acc
to store our state at each point explicitly, without recursion, and everything just worksā¦ so, have we been defeated? Do we really have to choose between prettiness - purity, composition, etc. - and performance?
Never! Weāre functional programmers; we solve these problems with abstraction, and ChainRec
turns out to be exactly what we need. Weāre going to start off with a slightly different definition of this specās function, and work towards the real one. First of all, weāll introduce a new type, Step
:
// type Step b a = Done b | Loop a
const { Loop, Done } = daggy.taggedSum({
Done: ['b'], Loop: ['a']
})
Weāre using these two constructors to mimic the two possible states of our imperative solution: are we Done
, or do we still need to Loop
?
We could make this a
Functor
overa
, but we donāt need to; still, itās an exercise if youāre looking!
With that in mind, letās define chainRec
. Remember here that all ChainRec
types are also Chain
types:
chainRec :: ChainRec m
=> (a -> m (Step b a), a)
-> m b
Well, well, well. We start off with a function of type a -> m (Step b a)
, and a value of type a
. We end up with m b
. I reckon the first step is probably to call that function, and end up with m (Step b a)
. When weāve done that, weāll be in one of two situations:
-
That
Step b a
will be aDone b
, and we can pull out theb
(by mapping over them
) and return the answer! -
That
Step b a
will be aLoop a
, and we still donāt have ab
to make the answer. What do we do then? We loop again!
We chain
our starter function to our m a
, and repeat this whole process until we finally get a Done
value, and we can finish up!
This might be a little heavy, so letās implement chainRec
for our Writer
type to clear it up:
// We need a TypeRep for the left bit!
const Pair = T => {
const Pair_ = daggy.tagged('_1', '_2')
// ...
//+ chainRec
//+ :: Monoid m
//+ => (a -> Pair m (Step b a), a)
//+ -> (m, b)
Pair_.chainRec = function (f, init) {
// Start off "empty"
let acc = T.empty()
// We have to loop at least once
, step = Loop(init)
do {
// Call `f` on `Loop`'s value
const result = f(step.a)
// Update the accumulator,
// as well as the current value
acc = acc.concat(result._1)
step = result._2
} while (step instanceof Loop)
// Pull out the `Done` value.
return Pair_(acc, step.b)
}
}
We store our acc
, starting with the āemptyā value for that Monoid
, and update it with every loop. This continues until we hit a Done
, when we make a pair with the final acc
and the value inside the Done
!
It might take a couple read-throughs, but see whatās happened: instead of doing recursion, weāve used Step
to let a function tell us when it wants to recurse. With that extra bit of abstraction, we can hide a while
loop under the hood, and no one will be any the wiser!
So, with all this talk of accumulated results, letās get back to that seq
example, and see what we can do:
//+ seqRec :: Int -> ([Int], Int)
const seqRec = upper => Writer.chainRec(
init => Writer([init],
init >= upper ? Done(init)
: Loop(init + 1)),
0)
Look at the body: thereās no recursion! Weāve abstracted all the recursion into chainRec
, which can do its sneaky optimisation. All we say in our chainRec
-using function is whether weāre Done
or still need to Loop
- isnāt that clearer? Whatās more, itās now totally stack-safe!
> seqRec(100000)._1
[1, 2, 3, 4, 5, ...]
Just a tiny little extra abstraction and we scoff at numbers like 10000
. Why stop the momentum? Next stop: cat pictures on the moon.
Now, we havenāt quite matched the spec, but youāll see why. The actual spec gives us no Step
type, which means we end up with the following:
chainRec :: ChainRec m
=> ((a -> c, b -> c, a) -> m c, a)
-> m b
Now, I have a few problems with this definition, most of which are outlined in the aforementioned GitHub issue, so we wonāt go into them too much. What weāll do instead is define a function to turn our Step
-based functions into spec-compliant functions. Itās a little number I like to call trainwreck
:
//+ trainwreck
//+ :: Functor m
//+ => (a -> m (Step b a))
//+ -> ((a -> c, b -> c, a) -> m c)
const trainwreck = f =>
(next, done, init) => f(init).map(
step => step.cata({
Loop: next, Done: done
})
)
// Or, with ES6 and polymorphic names...
const trainwreck = f => (Loop, Done, x) =>
f(x).map(s => s.cata({ Loop, Done }))
With this cheeky thing, we can get the best of both worlds; we just wrap our definition of Pair_.prototype.chainRec
in the trainwreck
function and itās good to go! Whether you choose to implement chainRec
on your types with the trainwreck-Step
approach or with the specās approach is up to you, but I think itās pretty clear that I have a favourite!
Interestingly, the specās (more formal) encoding is based on the Boehm-Berarducci encoding of the
Step
type; hat-tip to Brian McKenna for this one, as Iād never heard of this!
Last and certainly not least: the laws. Wait, you didnāt think Iād forgotten, did you? Ha! Weāll define these with Step
to make them a bit more expressive, but the original definitions are of course available on the spec.
// For some ChainRec m, predicate p,
// and some M-returning d and n...
M.chainRec(
trainwreck(
v => p(v) ? d(v).map(Done)
: n(v).map(Next)),
i)
// Must behave the same as...
(function step(v) {
return p(v) ? d(v)
: n(v).chain(step)
}(i))
By now, weāve seen the actual difference: the second one will explode before we get anywhere close to the moon. However, in theory (assuming we had an infinite stack), these two expressions would always end up at the same point. Again, weāre just asserting that Done
and Next
provide an abstraction for our recursion.
The other law, which is one of my favourites, is that chainRec(f)
must produce a stack no larger than n
times f
ās stack usage, for some constant n
. In other words, whether Iām looping once or a million times, the stack must not keep growing. With this wordy little law, we ensure stack safety.
The ChainRec
spec is best-suited towards ideas like Free structures and asynchronous looping: when we could end up with a very complicated structure, chainRec
makes sure we donāt hit an unexpected ceiling.
In your day-to-day coding, itās probably not something youāll see an awful lot. In fact, it usually tends to be hidden in implementation detail, rather than being exposed to the user. Stick to the simple, golden rule: when Chain
blows your stack, ChainRec
is there to solve the problem. Every Chain
type can implement ChainRec
, so this will always be an available option!
Whether we use it regularly or not, we now have a practical, stack-safe, and functional answer to the danger of chain
ing huge computations. We can sequence instructions safely, build up large computations, and then let āem rip.
Next time, weāll look at the last piece of the puzzle, and charge through some examples. Brace yourself for the terrifying, incomprehensible, and downright impolite Monad
! Trust me: itās actually none of those things. In fact, itās nothing new at all - rest easy, Fantasists!
ā„
As usual, feel free to play with the postās code gist to experiment further!