How to bulk load? #127
Replies: 1 comment
Indeed, for large datasets it's easiest just to chunk the data into blocks and upload them batch by batch. As a rule of thumb, while indexing, lnx should be using the threads allocated to the indexer at 90-100% CPU usage, i.e. if I allocate 8 threads to the indexer on an 8-core machine, I would aim to see 90-100% usage across those 8 cores while indexing for maximum throughput. If it's using less than that, you're probably not uploading documents to it as fast as it can ingest them.
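A minimal sketch of what that can look like on the client side, assuming a Python loader using requests, and assuming the /documents and /commit routes mentioned below live under /indexes/<index-name>/ (the base URL, index name, batch size, and worker count are illustrative assumptions, not values from this thread):

```python
# Hypothetical bulk loader: chunk the documents and upload the chunks in
# parallel so the server's indexer threads stay close to fully utilised.
import json
from concurrent.futures import ThreadPoolExecutor

import requests

LNX_URL = "http://localhost:8000"  # assumed lnx address
INDEX = "my-index"                 # assumed index name
BATCH_SIZE = 10_000                # tune while watching server CPU usage
WORKERS = 8                        # roughly match the indexer's thread count

def batches(docs, size):
    """Yield successive chunks of `size` documents."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]

def upload(batch):
    resp = requests.post(f"{LNX_URL}/indexes/{INDEX}/documents", json=batch)
    resp.raise_for_status()

def bulk_load(path):
    with open(path, "r", encoding="utf-8") as f:
        docs = json.load(f)  # for multi-GB files, stream instead (see below)

    # Upload chunks in parallel. If the server's CPU sits well below 90-100%,
    # the client is the bottleneck: try more workers or larger batches.
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        list(pool.map(upload, batches(docs, BATCH_SIZE)))

    # A single commit once everything is uploaded.
    requests.post(f"{LNX_URL}/indexes/{INDEX}/commit").raise_for_status()

if __name__ == "__main__":
    bulk_load("data.json")
```

The single commit at the end is only one choice; committing every N batches is the other obvious option, and how often to do it is exactly what the question below asks.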
I have a 4.5 GB JSON file with about 34 million records that I would like to index with lnx. The README describes a 27 million record dataset that is 18 GB indexed. How did you load this data into lnx? Lnx currently doesn't have a bulk loading feature: #35. I'm guessing you chunked the input file and then did multiple POST requests to /documents. My main question is how often you made a POST to /commit, whether that has an impact on speed, and what you recommend.
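For a file this size, one way to produce those chunks without holding all 34 million records in memory is to stream the input. A sketch, assuming the file can be read as newline-delimited JSON (one object per line); a single 4.5 GB JSON array would need an incremental parser such as ijson instead:

```python
# Hypothetical chunker: stream a newline-delimited JSON file and yield
# fixed-size batches suitable for posting to /documents.
import json
from typing import Iterator, List

def iter_batches(path: str, batch_size: int = 10_000) -> Iterator[List[dict]]:
    batch: List[dict] = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            batch.append(json.loads(line))
            if len(batch) >= batch_size:
                yield batch
                batch = []
    if batch:
        yield batch  # trailing partial batch
```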