Skip to content

fosskers/transducers.fnl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

50 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Transducers: Ergonomic, efficient data processing

I think Transducers are a fundamental primitive that decouples critical logic from list/sequence processing, and if I had to do Clojure all over I would put them at the bottom.

– Rich Hickey

Transducers are an ergonomic and extremely memory-efficient way to process a data source. Here “data source” could mean an ordinary Table, but also potentially large files or generators of infinite data.

Transducers…

  • allow the chaining of operations like map and filter without allocating memory between each step.
  • aren’t tied to any specific data type; they need only be implemented once.
  • vastly simplify “data transformation code”.
  • are a joy to use!

Looking for Transducers in other Lisps? Check out the Emacs Lisp and Common Lisp implementations!

History and Motivation

Originally invented in Clojure and later adapted to other Lisps, Transducers are an excellent way to think about - and efficiently operate on - collections or streams of data. Transduction operations are strict and don’t involve “laziness” or “thunking” in any way, yet only process the exact amount of data you ask them to.

Installation

This library consists of only a single module, so it’s simple to vendor into your own projects.

Usage

Importing

(local t (require :transducers))

(t.transduce (t.take 3) t.add [1 2 3 4 5])

Transducers, Reducers, and Sources

;; The fundamental pattern.
(t.transduce <transducer-chain> <reducer> <source>)

Data processing largely has three concerns:

  1. Where is my data coming from? (sources)
  2. What do I want to do to each element? (transducers)
  3. How do I want to collect the results? (reducers)

Each full “transduction” requires all three. We pass one of each to the transduce function, which drives the process. It knows how to pull values from the source, feed them through the transducer chain, and wrap everything together via the reducer.

  • Typical transducers are map, filter, and take.
  • Typical reducers are add, count, and fold.
  • Typical sources are tables and files.

Generators are a special kind of source that yield infinite data. Typical generators are repeat and cycle.

Let’s sum the squares of the first 1000 even integers:

(t.transduce
 (t.comp (t.filter #(= 0 (% $1 2))) ;; (2) Keep only even numbers.
         (t.take 1000)              ;; (3) Keep the first 1000 filtered evens.
         (t.map (fn [n] (* n n))))  ;; (4) Square those 1000.
 t.add       ;; (5) Reducer: Add up all the squares.
 (t.ints 1)) ;; (1) Source: Generate all positive integers.

Two things of note here:

  1. comp is used here to chain together different transducer steps. Notice that the order appears “backwards” from usual function composition. It may help to imagine that comp is acting like the ->> macro here.
  2. The reduction via add is listed as Step 5, but really it’s occuring throughout the transduction process. Each value that makes it through the composed transducer chain is immediately added to an internal accumulator.

Explore the other transducers and reducers to see what’s possible!

Processing CSV Data

As a convenience, this library also exposes a simple interface for reading and writing streams of CSV data.

To sum the values of a particular field:

(local t (require :transducers))

(t.transduce (t.comp (t.filter-map #(. $1 :Age))
                     (t.filter-map tonumber))
             t.add (t.csv-read "foo.csv"))

To reduce the file to certain fields and write the data back out:

(local t (require :transducers))

(t.transduce t.pass
             (t.csv-write "out.csv" ["Name" "Age"])
             (t.csv-read "in.csv"))

API

See here (SourceHut)

Performance

Summing a numeric field in a 45mb CSV file.

RuntimeAverage Time (sec)
LuaJIT1.38
Lua 5.42.56
Lua 5.23.03

The associated code can be found in the examples folder, alongside a hand-written version using only Fennel primitives. Interestingly, this hand-written version performs slightly worse, implying that the overhead from Transducers themselves is minimal.

About

Ergonomic, efficient data processing for Fennel.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages