Help with RxTextData/rxImport delimiters

How do I tell the RxTextData function to use ‘|’ or another character as the delimiter?
If your text data is not separated by commas or tabs, you must specify the delimiter using the columnDelimiters argument. (This is not actually an argument to rxImport, but to the underlying RxTextData data source object.) In normal usage, this argument is a single character, such as columnDelimiters="\t" for tab-delimited data or columnDelimiters="," for comma-delimited data. However, each column may be delimited by a different character; all the delimiters must be concatenated together into a single character string. For example, if you have one column delimited by a comma, a second by a plus sign, and a third by a new line, you would use the argument columnDelimiters=",+\n".



So, for the above data, how do I fix the code below to treat ‘|’ as the delimiter?

hdfsFS <- RxHdfsFileSystem(hostName = "dummy", port = "dummy")
txtSource <- RxTextData("directory value/ file_name in hdfs", fileSystem = hdfsFS)
airData <- rxImport(inData = txtSource, outFile = "/tmp/test.xdf",
                    stringsAsFactors = TRUE, missingValueString = "M",
                    rowsPerRead = 200000, overwrite = TRUE)
rxSummary(~ id + val, data = airData)

To read pipe-delimited data, set the option delimiter = "|" in your RxTextData() call:

txtSource <- RxTextData("directory value/ file_name in hdfs", fileSystem = hdfsFS, delimiter = "|")
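
For reference, the script from the question with this change applied might look like the sketch below. The host name, port, and file path are still the placeholders used in the question and must be replaced with real values; the id and val columns are assumed to exist in the data, as in the original rxSummary call. This has not been run against a cluster and is meant only to show where the delimiter argument goes.

```r
# Sketch only: requires the RevoScaleR package and access to an HDFS cluster.
library(RevoScaleR)

# Placeholder connection details from the question; substitute real values.
hdfsFS <- RxHdfsFileSystem(hostName = "dummy", port = "dummy")

# The delimiter is set on the text data source, not on rxImport.
txtSource <- RxTextData("directory value/ file_name in hdfs",
                        fileSystem = hdfsFS,
                        delimiter = "|")

# Import the pipe-delimited text into an .xdf file, as in the question.
airData <- rxImport(inData = txtSource, outFile = "/tmp/test.xdf",
                    stringsAsFactors = TRUE, missingValueString = "M",
                    rowsPerRead = 200000, overwrite = TRUE)

# Summarize the (assumed) id and val columns.
rxSummary(~ id + val, data = airData)
```

If the summary still shows a single combined column, the delimiter was most likely not picked up by the data source, so check that the argument was passed to RxTextData() rather than to rxImport().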
Note: This is a "FAST PUBLISH" article created directly from within the Microsoft support organization. The information contained herein is provided as-is in response to emerging issues. As a result of the speed in making it available, the materials may include typographical errors and may be revised at any time without notice. See Terms of Use for other considerations.

Article ID: 3103847 - Last Review: 11/01/2015 15:48:00 - Revision: 1.0

Revolution Analytics
