
QA: How can I randomly select data from an .xdf file?

You can write an R transform function and pass it to the RevoScaleR 'rxDataStepXdf()' function through its 'transformFunc' argument. The resulting subset .xdf file can then be used with other RevoScaleR functions. The sample R script below creates a new .xdf file by randomly sampling a larger .xdf file. It does this by setting the hidden row selection variable, '.rxRowSelection', which is available inside a 'transformFunc'. Because each row is kept independently with probability 0.25, the output contains approximately, not exactly, 25% of the rows.

library(RevoScaleR)

# inFile must already hold the path to the source .xdf file.

# Create a transformFunc that selects 25% of the data at random.
set.seed(13)
xform <- function(data) {
  # One Bernoulli(0.25) draw per row in the current chunk; TRUE keeps the row.
  data$.rxRowSelection <- as.logical(rbinom(length(data[[1]]), 1, 0.25))
  return(data)
}

rxDataStepXdf(inFile = inFile, outFile = "sampledData.xdf",
              transformFunc = xform, overwrite = TRUE)

# Check that subsetting was done and that the row selection variable
# is not kept in the data set.
rxGetInfoXdf(inFile)
rxGetInfoXdf("sampledData.xdf")

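To see what the transform function produces without needing an .xdf file, the same row-selection logic can be tried on an in-memory list in base R. This is only a sketch: 'chunk' is a hypothetical stand-in for the list of column vectors that a transform function receives, and the 'lapply()' step merely imitates the dropping of unselected rows. It is not how RevoScaleR does this internally.

```r
# Base-R sketch: apply the row-selection logic to an in-memory chunk.
# 'chunk' is a hypothetical stand-in for the data passed to a transformFunc.
set.seed(13)

xform <- function(data) {
  n <- length(data[[1]])  # number of rows in this chunk
  data$.rxRowSelection <- as.logical(rbinom(n, 1, 0.25))
  data
}

chunk <- list(id = 1:1000, value = rnorm(1000))
result <- xform(chunk)

# Logical vector with one entry per row; TRUE marks a row to keep.
keep <- result$.rxRowSelection

# Imitate the subsetting step by keeping only the selected rows.
sampled <- lapply(chunk, function(col) col[keep])

cat("rows in chunk:", length(chunk$id), "\n")
cat("rows selected:", sum(keep), "\n")
cat("fraction kept:", mean(keep), "\n")
```

The fraction kept will be close to 0.25 but will usually not equal it exactly, because each row is drawn independently. If an exact sample size is required, a different sampling scheme is needed.
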
Note: This is a "FAST PUBLISH" article created directly from within the Microsoft support organization. The information contained herein is provided as-is in response to emerging issues. As a result of the speed in making it available, the materials may include typographical errors and may be revised at any time without notice. See Terms of Use for other considerations.

Article ID: 3104278 - Last Review: 10/29/2015 08:59:00 - Revision: 1.0

Revolution Analytics
