Skip to content

This project provides a way to benchmark various Sequence operations in Typescript and Javascript, such as find, every and zip.

License

Notifications You must be signed in to change notification settings

tinyield/ts-sequences-benchmarks

Repository files navigation

Glossary

  1. Overview
  2. Usage
  3. Benchmarks Overview
    1. All Match
    2. Every
    3. Find
    4. First
    5. Zip
    6. Distinct Top Artist and Top Track by Country Benchmark
    7. Artists who are in a Country's top ten who also have Tracks in the same Country's top ten Benchmark
    8. Flatmap and Reduce
    9. Temperature Transitions
  4. Performance Comparison
  5. More

Overview

In this document you'll find information on how to use the project in terms of how to run it in your local machine, a brief description of each benchmark's pipeline.

Usage

To run these benchmarks on you local machine just run:

npm run build

And then:

node ./build/lib/index.js

For more information run:

node ./build/lib/index.js -h

Benchmarks Overview

All Match - allMatch(sequence, predicate)

Benchmarks the allMatch() operation in the different sequence types.

Collection Sizes: [1000, 100 000]

Pipeline:

sequenceOfEvenNumbers -> allMatch(isEven)

Every - every(sequence1, sequence2, predicate)

Every is an operation that, based on a user defined predicate, tests if all the elements of a sequence match between corresponding positions. So for instance, if we have:

const seq1 = ["505", "Amsterdam", "Mural"];
const seq2 = ["505", "Amsterdam", "Mural"];
const seq3 = ["Mural", "Amsterdam", "505"];
const pred = (s1, s2) => s1.equals(s2);

Then:

every(seq1, seq2, pred); // returns true
every(seq1, seq3, pred); // returns false

For every to return true, every element of each sequence must match in the same index. The every feature can be achieved through a pipeline of zip-allMatch operations, such as:

seq1.zip(seq2, pred).allMatch(isTrue);

Benchmarked for: [Class, Number, String]
Collection Sizes: [1000, 100 000]

Pipeline:

sourceSequence + copyOfSourceSequence -> zip(equals) -> allMatch(isTrue)

Find - find(sequence1, sequence2, predicate)

The find between two sequences is an operation that, based on a user defined predicate, finds two elements that match, returning one of them in the process. So if we have:

const seq1 = ["505", "Amsterdam", "Mural"];
const seq2 = ["400", "Amsterdam", "Stricken"];
const seq3 = ["Amsterdam", "505", "Stricken"];
const predicate = (s1, s2) => s1.equals(s2);

Then:

find(seq1, seq2, predicate); // returns "Amsterdam"
find(seq1, seq3, predicate); // returns null
find(seq2, seq3, predicate); // returns "Stricken"

For find to return an element, two elements of each sequence must match in the same index. Here's what an implementation of find would look like with the support of a zip:

zip(seq1, seq2, (t1, t2) => predicate(t1, t2) ? t1 : null)
    .filter(isNotNull)
    .first();
}

Benchmarked for: [Class, Number, String]
Collection Sizes: [1000, 100 000]
This pipeline has two ways of determining which element will be the matching element:

  1. For String sequences, the matching element will be on a fixed index, using the following criteria:
    • For collections with more than 100 elements, the matching index will correspond to COLLECTION_SIZE / 100.
      (e.g: for a COLLECTION_SIZE of 100K the pipeline will match with the 1000th element)

    • Otherwise, the matching index will correspond to COLLECTION_SIZE / 10.
      (e.g: for a COLLECTION_SIZE of 100 elements the pipeline will match with the 10th element)

  2. For all sequences (including String), the match index increments with each iteration to make sure there are matches in every index of the sequence.
    (e.g: On the 1st iteration the match index will be 0, on the 2nd it will be 1, etc... )

Pipeline:

sourceSequence1 + sourceSequence2 -> 
-> zip((e1, e2) => predicate(e1, e2) ? t1 : null) -> 
-> filter(isNotNull) ->
-> first() 

First - first(sequence, predicate)

Benchmarks the usage of the findFirst() operator. This benchmark was run with three types of Sequences:

  1. One where the match would be found in the first element.
  2. One where the match would be found in the middle.
  3. One where the match would be found in the last element.

Collection Sizes: [1000, 100 000]

Pipeline:

sequenceOfAllButOneEvenNumbers -> filter(isOdd) -> findFirst()

Zip Primes with Values - zip(primes, values)

Benchmarks zipping a sequence of prime Numbers with instances of the class Value.

Collection Sizes: [1000, 100 000]

Pipeline:

(sequenceOfNumbers -> filter(isPrime)) + sequenceOfValues -> zip(Pair.with) -> forEach(blackhole)

Distinct Top Artist and Top Track by Country Benchmark - zip(artists, tracks)

Benchmarks creating two different sequences, one consisting of the top 50 Artists (provided by Last.fm) of each non english speaking country (provided by REST Countries) and the other the exact same thing but for the top 50 Tracks. Then zipping both sequences into a Trio of Country, First Artist and First Track and retrieving the distinct elements by Artist.

Pipelines:

  • Sequence of Artists:
sequenceOfCountries -> filter(isNonEnglishSpeaking) -> filter(hasArtists) -> map(Pair.of(country, artists));
  • Sequence of Tracks:
sequenceOfCountries -> filter(isNonEnglishSpeaking) -> filter(hasTracks) -> map(Pair.of(country, tracks));
  • Pipeline
sequenceOfArtists + sequenceOfTracks -> 
-> zip(Trio.of(country, topArtist, topTrack)) -> 
-> distinctBy(artist) -> 
-> forEach(blackhole)

Artists who are in a Country's top ten who also have Tracks in the same Country's top ten Benchmark - zip(artists, tracks)

Benchmarks creating two different sequences, one consisting of the top 50 Artists (provided by Last.fm) of each non english speaking country (provided by REST Countries) and the other the exact same thing but for the top 50 Tracks. Then, for each Country, we take the top ten Artists and top ten Track artist's names and zip them into a Trio. After, the top ten artists are filtered by their presence in the top ten Track artist's name, returning a Pair with the Country and the resulting Sequence.

Pipelines:

  • Sequence of Artists:
sequenceOfCountries -> filter(isNonEnglishSpeaking) -> filter(hasArtists) -> map(Pair.of(country, artists));
  • Sequence of Tracks:
sequenceOfCountries -> filter(isNonEnglishSpeaking) -> filter(hasTracks) -> map(Pair.of(country, tracks));
  • Pipeline
sequenceOfArtists + sequenceOfTracks -> 
-> zip(Trio.of(country, artists, tracks)) -> 
-> map(Pair.of(country, artists)) -> 
-> forEach(blackhole)

Flatmap and Reduce - flatmapAndReduce(nestedSequence)

Benchmarks the usage of the flatmap operation terminating with a reduce operation. In this case in particular we use a nested sequence of numbers flatmap it to a regular sequence of numbers and then sum them.

Pipeline:

nestedSequence -> flatmap() -> reduce()

Temperature Transitions - collapse(sequence)

This benchmarks the usage of a user defined operation, for this benchmark we used real data gathered from the World Weather Online API. The data is in .csv which means it needs to be prepared, in order to do so we have two user defined operations "oddLines" which filters any line in an odd index, and "collapse" that filters equal elements in sequence to each other (e.g. input: 3, 5, 5, 3, 5 -> output: 3,5,3,5 )

Pipeline:

lineSequence -> 
-> filter("#") ->
-> skip(No Data Lines) -> 
-> oddLines -> 
-> map(toTemperature) -> 
-> collapse() -> 
-> count()

Performance Comparison

The results presented here were based on the results attained from Github Actions, they are presented in relation to IxJs's IterableX performance, due to it being the closest to Javascript's out of the box lazy sequence which is the Iterable. For the actual results check the Github Actions Section.

Notes:

  • IxJs's IterableX performance is equivalent to 1, all results are presented in relation to it
  • For Pipelines where collection sizes vary, only 1k results and 100k results will be displayed, separated in a pipe format, like so:
    • 1K results | 100K results

Benchmarks with one Sequence

Benchmark Time Complexity Underscore Tinyield Sequency Lodash Lazy.js
All match Linear 6.51 | 3.57 4.11 | 2.70 2.42 | 2.23 6.23 | 3.60 6.96 | 3.57
First in Beginning Constant 0.41 | 1.51 1.71 | 1.98 1.43 | 1.98 0.31 | 1.36 1.54 | 1.96
First in Middle Linear 2.19 | 2.40 4.41 | 2.43 2.98 | 2.25 1.51 | 2.11 6.95 | 3.05
First in End Linear 3.39 | 2.93 4.42 | 2.72 2.55 | 2.27 2.54 | 2.60 8.35 | 3.40
Flatmap and Reduce Linear 1.52 | 0.66 2.47 | 3.62 2.98 | 2.92 2.22 | 0.60 5.03 | 6.17
Temperature Transitions Linear 5.24 5.74 2.61 4.48 5.86 1.00

Benchmarks with two Sequences

Benchmark Time Complexity Underscore Tinyield Sequency Lodash Lazy.js
Every String Linear 1.60 | 0.49 4.47 | 2.05 3.17 | 1.83 3.17 | 0.70 3.44 | 2.09
Every Number Linear 1.48 | 0.40 4.40 | 3.24 3.21 | 2.64 3.53 | 0.67 3.87 | 3.17
Every Class Linear 1.44 | 0.41 4.15 | 3.07 3.25 | 2.55 3.47 | 0.69 3.65 | 3.05
Zip Primes with Values Linear 0.64 | 0.27 3.27 | 1.89 2.68 | 1.75 0.95 | 0.29 3.00 | 2.06
Zip Top Artist & Track(1) Linear 0.69 1.59 2.13 0.42 2.13
Zip Artists in Top10(2) Linear 1.12 1.90 1.55 0.67 2.70

(1) Corresponds to "Distinct Top Artist and Top Track by Country Benchmark"
(2) Corresponds to Artists who are in a Country's top ten who also have Tracks" in the same Country's top ten Benchmark

More

If you're interested in a similar comparison but in the Java world head over to our other Benchmarks Github Repository. You can also check a more in depth analysis in an Article we wrote over at DZONE.