Applies To
Revolution Analytics

You can use the same RevoScaleR functions to process huge data sets stored on disk that you use to analyze in-memory data frames, because RevoScaleR functions use 'chunking' algorithms. A chunking algorithm follows this general process (a sketch of the loop appears after the list):

  1. Initialization: intermediate results needed for computation of final statistics are initialized

  2. Read data: read a chunk (set of observations of variables) of data

  3. Transform data: perform transformations and row selections for the chunk of data as needed; write out the data if you are only performing an import or a data step

  4. Process data: compute intermediate results for the chunk of data

  5. Update results: combine the results from the chunk of data with those of previous chunks

  6. Repeat steps (2) - (5) (perhaps in parallel) until all data has been processed

  7. Process results: when the results from all chunks have been combined, do the final computations and return the results
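
The following is a minimal sketch of that loop in plain R, not RevoScaleR's own implementation. It assumes a CSV file named measurements.csv with a numeric column named value (both names are placeholders) and computes a mean without ever holding the whole file in memory. RevoScaleR's compiled implementation can also process chunks in parallel; the shape of the computation is the same.

    # Minimal chunking sketch (assumed file/column names; sequential, for illustration only)
    chunked_mean <- function(path, chunk_size = 10000) {
      con <- file(path, open = "r")
      on.exit(close(con))

      # Consume the header line and keep the column names
      header <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])

      running_sum <- 0   # 1. Initialization: intermediate results for the final statistic
      running_n   <- 0

      repeat {
        # 2. Read data: one chunk of observations
        chunk <- tryCatch(
          read.csv(con, header = FALSE, nrows = chunk_size, col.names = header),
          error = function(e) NULL
        )
        if (is.null(chunk) || nrow(chunk) == 0) break

        # 3. Transform data: row selection (drop missing values) for this chunk
        chunk <- chunk[!is.na(chunk$value), , drop = FALSE]

        # 4. Process data: intermediate results for this chunk
        # 5. Update results: combine them with the results of previous chunks
        running_sum <- running_sum + sum(chunk$value)
        running_n   <- running_n + nrow(chunk)
      }   # 6. Repeat until all data has been processed

      # 7. Process results: final computation on the combined intermediate results
      running_sum / running_n
    }

    chunked_mean("measurements.csv")

Because these steps are wrapped inside every RevoScaleR analysis function, the call itself looks the same whether the data source is an in-memory data frame or an .xdf file on disk. As an illustration (the variable and file names are placeholders), a summary computed with rxSummary can be pointed at either source:

    library(RevoScaleR)
    rxSummary(~ value, data = someDataFrame)        # in-memory data frame
    rxSummary(~ value, data = "measurements.xdf")   # chunked processing of an .xdf file on disk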
