Applies ToRevolution Analytics

You can use an R 'transform' function to transform the data and pass that function to the RevoScaleR 'rxDataStepXdf()' function. You can then use the newly created, subset .xdf file with other RevoScaleR functions. Below is a sample R script that creates a new .xdf file by randomly sampling a larger .xdf file using the hidden row selection variable available in 'transformFunc'. 

# Create a transformFunc that selects 25% of the data at random set.seed(13) xform <- function(data) { data$.rxRowSelection<-as.logical(rbinom(length(data[[1]]),1,.25)) return(data) rxDataStepXdf(inFile=inFile, outFile="sampledData.xdf", transformFunc=xform, overwrite=TRUE) # check that subsetting was done and the row selection variable is not kept in the data set. rxGetInfoXdf(inFile) rxGetInfoXdf("sampledData.xdf") 

Need more help?

Want more options?

Explore subscription benefits, browse training courses, learn how to secure your device, and more.

Communities help you ask and answer questions, give feedback, and hear from experts with rich knowledge.