In the local compute context, all of RevoScaleR’s supported data sources are available to you. In a distributed compute context, however, your choice of data sources may be severely limited.
The most extreme case is the RxInTeradata compute context, which supports only the RxTeradata data source—this makes sense, as the computations are being performed on data inside the Teradata database. The following table shows the available combinations of compute contexts and data sources (x indicates available):
| Data Source ↓ / Compute Context → | RxLocalSeq/Parallel | RxHpcServer | RxLsfCluster | RxHadoopMR | RxInTeradata |
|---|---|---|---|---|---|
| Delimited Text (RxTextData) | x | x | x | x | |
| Fixed-Format Text (RxTextData) | x | x | x | | |
| .xdf data files (RxXdfData) | x | x | x | x | |
| SAS data files (RxSasData) | x | x | x | | |
| SPSS data files (RxSpssData) | x | x | x | | |
| ODBC data (RxOdbcData) | x | x | x | | |
| Teradata database (RxTeradata) | x | x | x | | x |
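
As a rough illustration, the table can be encoded as a logical matrix in base R and queried before you pick a compute context. This is only a sketch: the matrix is a hand transcription of the table, the Teradata row assumes the blank cell falls under RxHadoopMR (consistent with RxInTeradata supporting RxTeradata), and `isSupported` and `supportedSources` are hypothetical helpers, not part of RevoScaleR.

```r
# Availability matrix transcribed by hand from the table (TRUE = "x").
# Assumption: in the RxTeradata row the blank cell is RxHadoopMR.
contexts <- c("RxLocalSeq/Parallel", "RxHpcServer", "RxLsfCluster",
              "RxHadoopMR", "RxInTeradata")
sources <- c("RxTextData (delimited)", "RxTextData (fixed-format)",
             "RxXdfData", "RxSasData", "RxSpssData", "RxOdbcData",
             "RxTeradata")

availability <- matrix(
  c(TRUE, TRUE, TRUE, TRUE,  FALSE,   # delimited text
    TRUE, TRUE, TRUE, FALSE, FALSE,   # fixed-format text
    TRUE, TRUE, TRUE, TRUE,  FALSE,   # .xdf
    TRUE, TRUE, TRUE, FALSE, FALSE,   # SAS
    TRUE, TRUE, TRUE, FALSE, FALSE,   # SPSS
    TRUE, TRUE, TRUE, FALSE, FALSE,   # ODBC
    TRUE, TRUE, TRUE, FALSE, TRUE),   # Teradata
  nrow = length(sources), byrow = TRUE,
  dimnames = list(sources, contexts)
)

# Hypothetical helper: is one data source usable in one compute context?
isSupported <- function(dataSource, computeContext) {
  stopifnot(dataSource %in% rownames(availability),
            computeContext %in% colnames(availability))
  availability[dataSource, computeContext]
}

# Hypothetical helper: list every data source usable in a compute context.
supportedSources <- function(computeContext) {
  stopifnot(computeContext %in% colnames(availability))
  rownames(availability)[availability[, computeContext]]
}
```

For example, `supportedSources("RxInTeradata")` returns only `"RxTeradata"`, which matches the extreme case described above, while `isSupported("RxSasData", "RxHadoopMR")` is `FALSE`.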
For more information, see the RevoScaleR Distributed Computing Guide.