Help with RxTextData/rxImport delimiters


How do I tell the RxTextData function to use ‘|’ (or some other character) as the delimiter?


If your text data is not separated by commas or tabs, you must specify the delimiter using the columnDelimiters argument. (This is not actually an argument to rxImport, but to the underlying RxTextData data source object.) In normal usage, this argument is a single character, such as columnDelimiters="\t" for tab-delimited data or columnDelimiters="," for comma-delimited data. However, each column may be delimited by a different character; all the delimiters must be concatenated together into a single character string. For example, if you have one column delimited by a comma, a second by a plus sign, and a third by a new line, you would use the argument columnDelimiters=",+\n".



So, for the above data, how do I fix the code below so that it treats ‘|’ as the delimiter?

hdfsFS <- RxHdfsFileSystem(hostName="dummy", port="dummy")
txtSource <- RxTextData("directory value/ file_name in hdfs", fileSystem=hdfsFS) 
airData <- rxImport(inData=txtSource, outFile = "/tmp/test.xdf",stringsAsFactors = TRUE, missingValueString = "M", rowsPerRead = 200000, overwrite=TRUE) 
rxSummary(~ id+val, data = airData)

To read pipe-delimited data, set the delimiter = "|" option in your RxTextData() call:

txtSource <- RxTextData("directory value/ file_name in hdfs", fileSystem=hdfsFS, delimiter = "|")
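
To try the delimiter setting without an HDFS cluster, a minimal local sketch along the following lines may help. It assumes RevoScaleR is installed. The sample file, its contents, and the names sampleFile and sampleData are hypothetical and only serve as illustration; the id and val columns mirror the rxSummary call in the question.

```r
library(RevoScaleR)  # assumption: Microsoft R / Machine Learning Server is installed

# Hypothetical sample file: two pipe-delimited columns, with "M" marking a missing value.
sampleFile <- tempfile(fileext = ".txt")
writeLines(c("id|val", "1|10.5", "2|M", "3|7.25"), sampleFile)

# delimiter belongs to the RxTextData data source, not to rxImport itself.
txtSource <- RxTextData(sampleFile, delimiter = "|", missingValueString = "M")

# With no outFile given, rxImport returns a data frame instead of writing an .xdf file.
sampleData <- rxImport(inData = txtSource)

rxSummary(~ id + val, data = sampleData)
```

If this works locally, the same delimiter = "|" argument should carry over to the HDFS data source by adding fileSystem=hdfsFS to the RxTextData() call, as in the line above.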
