Better than Rails Batching. Can you batch backwards for destructive tasks? each_in_batches can.
Project | EachInBatches |
---|---|
gem name | each_in_batches |
license | MIT |
homepage | on Github.com |
documentation | on Rdoc.info |
(Originally BolingForBatches)
I am resurrecting this code because I still have this recurring need, and Rails' native batching doesn't cut the mustard.
It is some of my most ancient code, and it isn't pretty, but I hope to improve it over time.
I often need to execute really large computations on really large data sets. I usually end up writing a rake task to do it, which calls methods in my models. But something about the process bugged me. Each time I had to re-implement my 'batching code' that allowed me to not chew up GB after GB of memory due to klass.find(:all, :include => [:everything_under_the_sun]). Re-implementation of the same logic over and over across many projects is not very DRY, so I got out my blow torch and lit it up. The difficulty was that the part that was different each time I batched was at the center of the code, right in the middle of the batch loop. But I didn't let that stop me!
- I've been doing batching in Rails a lot longer than Rails has.
- Metrics. I measure stuff.
- I can batch from the top down (a.k.a. backwards), making it possible to DELETE things in batches. If you've never tried using the built-in Rails batching for deleting millions of records... don't start now. Use this gem instead.
- Merged in the EachInBatches fork (from Brian Kidd): I needed to iterate over the results and perform more actions than a single method would provide. I didn't want to write a method in my app that performed the needed functionality, as I felt the plugin should support this directly. I modified the original plugin so that it takes a block instead of a method. It will pass the object instance to the block. It works pretty much the same as Class.find(:all).each {|x| do something}, except in batches of the size you specify with :batch_size.
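To make the "same as each, but in batches" idea concrete, here is a minimal plain-Ruby sketch. It is not the gem's implementation: ROWS, fetch_batch and each_in_batches_sketch are hypothetical names, and the sketch assumes batches are fetched by offset and limit so that only one batch is held at a time.

```ruby
# ROWS stands in for a database table (hypothetical data).
ROWS = (1..7).to_a.freeze

# Hypothetical loader: returns at most `limit` rows starting at `offset`.
def fetch_batch(offset, limit)
  ROWS[offset, limit] || []
end

# Yields every row to the block, loading one batch at a time.
def each_in_batches_sketch(batch_size:)
  offset = 0
  loop do
    batch = fetch_batch(offset, batch_size)
    break if batch.empty?

    batch.each { |row| yield row }
    offset += batch_size
  end
end

visited = []
each_in_batches_sketch(batch_size: 3) { |row| visited << row }
puts visited.inspect
```

The block sees each row exactly once, in order, just as it would with a plain each over the whole set.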
Add this line to your application's Gemfile:
gem 'each_in_batches'
And then execute:
$ bundle
Or install it yourself as:
$ gem install each_in_batches
To create a new Batch, call Batch.new and pass it the :arel relation to batch plus any additional arguments (all as a hash).
batch = EachInBatches::Batch.new(:arel => Payment.canceled.order("transaction_id ASC"), :batch_size => 50)
To process the batched data, pass a block to Batch#run the same way you would to an object in a block like Klass.all.each {|x| x.do_something }. Batch#run will pass the data to your block, one at a time, in batches set by the :batch_size argument.
batch.run {|x| puts x.id; puts x.transaction_id}
Print the results!
batch.print_results
Or...
Consolidate your code if you prefer
EachInBatches::Batch.new(:arel => Payment.canceled.order("transaction_id ASC"), :batch_size => 50, :show_results => true).run{|x| puts x.id; puts x.transaction_id}
Arguments for the initializer (Batch.new) method are:
Required:
:arel - Usage: :arel => Payment.canceled.order("transaction_id ASC")
Required, as this is the class that will be batched
Optional:
:verbose - Usage: :verbose => true or false
Sets verbosity of output
Default: false (if not provided)
:backwards - Usage: :backwards => true or false
Using the order supplied for :arel (if any) figures out what the batch set looks like,
and begins working from the end rather than the front.
Default: false (if not provided)
Example: EachInBatches::Batch.new(:arel => Report.where(company_id: nil).order("id ASC"), :backwards => true, :batch_size => 50).run{|x| x.destroy }
:batch_size - Usage: :batch_size => x
Where x is some number.
How many AR Objects should be processed at once?
Default: 50 (if not provided)
:last_batch - Usage: :last_batch => x
Where x is some number.
Only process up to and including batch #x.
Batch numbers start at 0 for the first batch.
Default: won't be used (no limit if not provided)
:first_batch - Usage: :first_batch => x
Where x is some number.
Begin processing batches beginning at batch #x.
Batch numbers start at 0 for the first batch.
Default: won't be used (no offset if not provided)
:show_results - Usage: :show_results => true or false
Prints statistics about the results of Batch#run.
Default: true if verbose is set to true and :show_results is not provided, otherwise false
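Why :backwards matters for destructive work can be seen in a small simulation. This is a sketch under a stated assumption, namely that batches are located by offset and limit; delete_in_batches is a hypothetical helper, not part of the gem. Deleting rows shifts the offsets of everything after them, so walking forwards skips rows, while walking from the last batch down does not.

```ruby
# Simulates offset/limit batching over a table whose rows are deleted
# as each batch is processed. Hypothetical helper for illustration only.
def delete_in_batches(table, batch_size:, backwards:)
  total_batches = (table.size.to_f / batch_size).ceil
  order = (0...total_batches).to_a
  order.reverse! if backwards

  order.each do |batch_number|
    offset = batch_number * batch_size
    batch = table[offset, batch_size] || []
    batch.each { |row| table.delete(row) }
  end
  table # whatever survived
end

forward  = delete_in_batches((1..10).to_a, batch_size: 3, backwards: false)
backward = delete_in_batches((1..10).to_a, batch_size: 3, backwards: true)

puts forward.inspect  # => [4, 5, 6, 10]
puts backward.inspect # => []
```

Going forwards, the second batch starts at offset 3 of an already shortened table, so rows 4 to 6 are never visited. Going backwards, the rows below the current batch never move.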
Interpreting the output:
'[O]' means the batch was skipped due to an offset.
'[L]' means the batch was skipped due to a limit.
'[P]' means the batch is processing.
'[C]' means the batch is complete.
and yes... it was a coincidence. This class is not affiliated with 'one laptop per child'
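The way :first_batch and :last_batch map onto the skip markers can be sketched as follows. batch_status is a hypothetical helper written from the option descriptions above (batch numbers start at 0, :last_batch is inclusive); it is not taken from the gem's source, and it leaves out '[C]', which is only reached after a batch finishes.

```ruby
# Returns the marker a batch would get before processing starts,
# given an optional first_batch offset and last_batch limit.
def batch_status(batch_number, first_batch: nil, last_batch: nil)
  return '[O]' if first_batch && batch_number < first_batch # skipped: offset
  return '[L]' if last_batch && batch_number > last_batch   # skipped: limit

  '[P]' # would be processed
end

statuses = (0..5).map { |n| batch_status(n, first_batch: 2, last_batch: 4) }
puts statuses.join(' ') # => [O] [O] [P] [P] [P] [L]
```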
Copyright ©2008-2015 Peter H. Boling, Brian Kidd, released under the MIT license
After checking out the repo, run bin/setup to install dependencies. Then, run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release to create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
This library aims to adhere to Semantic Versioning 2.0.0. Violations of this scheme should be reported as bugs. Specifically, if a minor or patch version is released that breaks backward compatibility, a new version should be immediately released that restores compatibility. Breaking changes to the public API will only be introduced with new major versions.
As a result of this policy, you can (and should) specify a dependency on this gem using the Pessimistic Version Constraint with two digits of precision.
For example:
spec.add_dependency 'each_in_batches', '~> 0.0'
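What that constraint accepts can be checked with RubyGems' own Gem::Requirement and Gem::Version classes. The version numbers below are illustrative only and are not claimed to be actual releases of this gem.

```ruby
require 'rubygems'

# '~> 0.0' means: at least 0.0, but below 1.0.
requirement = Gem::Requirement.new('~> 0.0')

results = %w[0.0.1 0.9.3 1.0.0].map do |version|
  [version, requirement.satisfied_by?(Gem::Version.new(version))]
end.to_h

results.each { |version, ok| puts "#{version}: #{ok}" }
# 0.0.1: true
# 0.9.3: true
# 1.0.0: false
```

Minor and patch releases are picked up automatically, while a new major version, the only place breaking changes are allowed under the policy above, is held back until you opt in.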
- Fork it ( https://github.com/[my-github-username]/each_in_batches/fork )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Make sure to add tests!
- Create a new Pull Request
See the Network View