What follows is a brief discussion of Rich Hickey’s transducers as I chose to implement them in Python. In particular, I try to provide some insight into how they work that might be more intuitive to someone who has a more imperative mental model of the mechanics of reduce.
Python and Clojure Comparison
We can compare the
map function that returns a transducer in
def map(f): def _map_xducer(step): def _map_step(r=Missing, x=Missing): if r is Missing: return step() return step(r) if x is Missing else step(r, f(x)) return _map_step return _map_xducer
map (single arity) that returns a transducer in
(defn map ([f] (fn [rf] (fn ( (rf)) ([result] (rf result)) ([result input] (rf result (f input))) ([result input & inputs] (rf result (apply f input inputs))))))
fn steps have verbose names in the Python implementation. You can
mentally swap any instance of
def x … return x in Python with
(fn … ) in Clojure. Handling dispatch on variable numbers of args is
cleaner in Clojure as well, so the Python implementation makes use of
the somewhat hackish
Missing value as a stand in for “argument not
present.” We can’t make use of the standard Python
for missing args in this case as
None values should pass through the
reducing function intact as would any other value.
Let’s unpack what’s going on here a bit:
mapfunction takes the function argument
fand returns a function. The function returned by
mapis the map transducer.
The transducer takes a function argument as well, and expects it to be a reducing function.
We use our knowledge of the shape of a reducing function to return a function that takes the same arguments but applies some transformation on the inputs first before it applies the reducing function it wraps.
Composing transducers actually results in a reversed traversal over the resulting composition of returned inner functions. That is,
transducer(reducer)results in the
transducertransformation being applied before the reducing function.
The access to the outer scope by the inner function here is important to note; also: the innermost function transforms the reducing function it is composed with by invoking
f(in the outer lexical scope of the closure) on each item the reducing function folds over.
Handling state in said outer lexical scope in Python closures requires some tricks.
Below is a deeper dive.
Say we want to map a very basic reducing function,
rsum (the following
two methods of defining
rsum are essentially equivalent):
def rsum(x, y): return x + y
rsum = lambda x, y: x + y
Normally we would invoke this in the context of
reduce(rsum, [1, 2, 3]) # 6
It proceeds as follows (naive imperative implementation with some hand-waving):
1 and 2 are added together, the result 3 is assigned to
r- the value of the call to reduce thus far.
The next time
ris the first positional parameter; we add the value 3 to it and again temporarily assign the outcome, 6, to
You could also define this starting from 0 (the identity of
this is simply addition), proceeding as:
0 + 1 yields
r = 1
1 + 2 yields
r = 3
3 + 3 yields
r = 6
Wrapping the Reducer with a Transducer
So what does the
map function do? It returns a transducer that
composes with a reducing function like
rsum. Let’s consider the
x are both passed, thereby ignoring the
return step(r, f(x))
So in this case when we start reducing over elements from the list like
1, 2, etc., we would first call our function
f on them. Let’s consider
this example, rewriting
rsum to be friendly with all our reducing
function input args possibilities:
def rsum(r=Missing, x=Missing): if r is Missing: return 0 if x is Missing: return r return r + x
rsum with no arguments returns the identity of addition, 0. With
r, it returns that argument unchanged. Otherwise this
function is the same as the previous one.
Using this form of
T.transduce(T.map(lambda x: x*x), rsum, [1, 2, 3])
We now are composing the
map transducer with
rsum as we reduce over
the collection (though note I could transduce over a generator or
anything we could iterate over, it’s not locked to collections
return step(r, f(x))
f in this case is a squaring function, so instead of
r in each reducing step, we now add
f(x), which in
this case is x^2^. Using a
map transducer and our
function in this manner gives us a sum of squares. This produces:
r = 1
r = 5(1 + 2^2^)
r = 14(5 + 3^2^)
Python has coroutines, generators, etc. Who needs transducers? Additionally, CPython pays an obnoxious overhead for function invocation that leads perf seekers to stick to for loops, generator expressions, etc. Isn’t this walking down the opposite path?
Well, you probably don’t need transducers. You could rewrite the entire thing in generators and not really care in CPython. That said: