Flow is a PHP-based, strongly typed ETL (Extract, Transform, Load) library for asynchronous data processing with constant memory consumption.
Supported PHP versions
This package is a monorepo that should not be installed directly in your project. Please check the packages below and select only those you are going to use:
- ETL
- Adapters
- Libraries
- array-dot - auto included
- doctrine-dbal-bulk
To run tests locally, please make sure Docker is up and running. You also need PHP 8.1 and Composer available from your CLI.
For the code coverage, please install pcov.
cp docker-compose.yml.dist docker-compose.yml
composer install
docker compose up -d
composer test
composer static:analyze
This command will execute exactly the same tests that we run on GitHub Actions before a PR can be merged. If it passes locally, you are good to open a pull request.
composer build
To understand how Flow works, please read the documentation.
- low and constant memory consumption
- asynchronous data processing
- reading from any data source
- writing to any data source
- rich collection of data transformation functions
- direct access to remote filesystems
- partitioning
- grouping & aggregating
- remote file processing
- joins
- sorting
- displaying datasets as an ASCII table
- validation against schema
- caching
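The "low and constant memory consumption" item above can be illustrated with plain PHP generators. The sketch below is not Flow's API: `inBatches()` and `numbers()` are hypothetical helpers, written only to show how lazily yielded batches keep a single batch in memory at a time.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of Flow): lazily yields fixed-size batches
 * from any iterable source, so only one batch is held in memory at a time.
 *
 * @param iterable<mixed> $source
 * @return \Generator<int, list<mixed>>
 */
function inBatches(iterable $source, int $batchSize): \Generator
{
    if ($batchSize < 1) {
        throw new \InvalidArgumentException('Batch size must be at least 1.');
    }

    $batch = [];

    foreach ($source as $item) {
        $batch[] = $item;

        if (\count($batch) === $batchSize) {
            yield $batch;
            $batch = [];
        }
    }

    // Emit the final, possibly smaller, batch.
    if ($batch !== []) {
        yield $batch;
    }
}

/**
 * Hypothetical data source: produces rows one by one instead of building
 * the whole dataset as an array up front.
 */
function numbers(int $from, int $to): \Generator
{
    for ($i = $from; $i <= $to; $i++) {
        yield ['id' => $i];
    }
}

$total = 0;
$batches = 0;

// Aggregate over the dataset while only ever holding one batch of 4 rows.
foreach (inBatches(numbers(1, 10), 4) as $batch) {
    $batches++;
    $total += \array_sum(\array_column($batch, 'id'));
}
```

Because both functions are generators, memory use depends on the batch size rather than on the size of the dataset.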
- DataFrame - Lazy data processing frame.
- Rows - Immutable collection of Row objects.
- Row - Immutable, strongly typed collection of Entry objects.
- Entry - Immutable, strongly typed object representing a cell in a row.
- Extractor (Reader) - Memory-safe data source returning a \Generator that yields Rows to the Pipeline.
- Transformer - Data transformer receiving and returning Rows (in most cases transformed), one instance of Rows at a time.
- Loader (Writer) - Memory-safe representation of a data sink; the Loader is responsible for writing Rows into destination storage, one at a time.
- Pipeline - Interface representing the ETL process; each received Rows instance is passed through all Pipes. Also responsible for error handling.
- Pipe - Loader or Transformer instance existing in the Pipes collection.
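How these building blocks fit together can be sketched in a few lines of plain PHP. Everything below is a hypothetical stand-in (the `Sketch*` classes, `sketchExtractor()`, and the two closures are invented for illustration) and should not be read as Flow's actual classes or signatures. It only assumes what the list states: immutable entries and rows, an extractor that yields batches from a \Generator, and pipes applied to each batch in order.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins for the concepts above; these are NOT Flow's classes.

final class SketchEntry
{
    public function __construct(
        public readonly string $name,
        public readonly int|string $value,
    ) {
    }
}

final class SketchRow
{
    /** @var array<string, SketchEntry> */
    private readonly array $entries;

    public function __construct(SketchEntry ...$entries)
    {
        $indexed = [];
        foreach ($entries as $entry) {
            $indexed[$entry->name] = $entry;
        }
        $this->entries = $indexed;
    }

    public function valueOf(string $name): int|string
    {
        return $this->entries[$name]->value;
    }

    // Returns a new row instead of mutating the current one.
    public function with(SketchEntry $entry): self
    {
        return new self(...[...\array_values($this->entries), $entry]);
    }
}

final class SketchRows
{
    /** @var list<SketchRow> */
    public readonly array $rows;

    public function __construct(SketchRow ...$rows)
    {
        $this->rows = \array_values($rows);
    }

    public function map(callable $fn): self
    {
        return new self(...\array_map($fn, $this->rows));
    }
}

// Extractor: yields one batch of rows at a time.
function sketchExtractor(): \Generator
{
    yield new SketchRows(
        new SketchRow(new SketchEntry('id', 1), new SketchEntry('name', 'alice')),
        new SketchRow(new SketchEntry('id', 2), new SketchEntry('name', 'bob')),
    );
    yield new SketchRows(
        new SketchRow(new SketchEntry('id', 3), new SketchEntry('name', 'carol')),
    );
}

// Transformer: receives rows and returns new, transformed rows.
$upperCaseNames = static fn (SketchRows $rows): SketchRows => $rows->map(
    static fn (SketchRow $row): SketchRow => $row->with(
        new SketchEntry('name', \strtoupper((string) $row->valueOf('name')))
    )
);

// Loader: writes each batch to a destination (an in-memory array here).
$written = [];
$collect = static function (SketchRows $rows) use (&$written): SketchRows {
    foreach ($rows->rows as $row) {
        $written[] = $row->valueOf('id') . ':' . $row->valueOf('name');
    }

    return $rows;
};

// Pipeline: every batch passes through all pipes in order.
$pipes = [$upperCaseNames, $collect];

foreach (sketchExtractor() as $rows) {
    foreach ($pipes as $pipe) {
        $rows = $pipe($rows);
    }
}
```

In this sketch the loader returns the rows unchanged so that transformers and loaders share one callable shape; that is a simplification for the example, not a statement about how Flow implements pipes.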
- PHP 8.1 - ✅
Flow PHP is sponsored by:
- Blackfire - the best PHP profiling and monitoring tool!