Forest and Tree Modeling Accuracy
Tune rxDForest parameters (speed trade-off)  (*: OSR and RRE defaults)
–     Increase nTree, e.g. to 20 or more  (OSR=500, RRE=10)*
–     Increase maxDepth, e.g. to 20 or more  (OSR=N/A, RRE=10)*
–     Decrease minSplit, e.g. to 2  (OSR=5, RRE=sqrt(N))*
–     Increase mTry, e.g. to 40 or more  (OSR/RRE=sqrt(p) or p/3)*
–     Increase maxNumBins, e.g. to 1e5 or 1e6
–     Accuracy of 81.4% on the KDD dataset with the settings below, rising to 82.3% when nTree=200:
      nTree=20, mTry=40, minSplit=2, maxDepth=20, maxNumBins=1e6
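As a rough illustration, the settings listed above could be passed to rxDForest as in the sketch below. It assumes RevoScaleR is loaded; `trainData`, the response column `label`, and `predictorNames` are hypothetical names that do not come from the slides.

```r
library(RevoScaleR)

# Hypothetical inputs: `trainData` is a data frame or .xdf data source
# with a factor response column named `label` and at least 40 predictors
# (mTry cannot usefully exceed the number of predictors).
predictorNames <- setdiff(rxGetVarNames(trainData), "label")

# Build the formula explicitly from the column names.
forestFormula <- as.formula(
  paste("label ~", paste(predictorNames, collapse = " + "))
)

forestModel <- rxDForest(
  forestFormula,
  data       = trainData,
  nTree      = 20,   # more trees: usually better accuracy, longer run time
  mTry       = 40,   # variables sampled as split candidates at each node
  minSplit   = 2,    # allow splits on very small nodes
  maxDepth   = 20,   # deeper trees
  maxNumBins = 1e6   # finer binning of numeric predictors
)
```

Raising nTree to 200 in this call corresponds to the second accuracy figure quoted above, at the cost of a longer run.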
Alternatively, run the open source randomForest routine across the Hadoop cluster using rxExec
–     See randomShrubbery in Section 6.5 of our Distributed Computing Guide
–     Adjust MR memory limits if needed since data must fit within memory on each node.
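The rxExec approach can be sketched roughly as follows. This is not the randomShrubbery code from the guide, only a minimal illustration of the same idea: each task grows a small forest with the open source randomForest package, and the results are merged. `growForest`, `trainData`, `label`, and the tree counts are hypothetical, and the compute context is assumed to be set up already.

```r
library(RevoScaleR)

# Hypothetical worker: grows a small forest on an in-memory data frame.
# `trainData` must fit in memory on every node that runs a task.
growForest <- function(trainData, treesPerTask) {
  randomForest::randomForest(label ~ ., data = trainData,
                             ntree = treesPerTask)
}

# Run the worker 10 times in the current compute context,
# loading randomForest on each worker first.
forests <- rxExec(growForest,
                  trainData      = trainData,
                  treesPerTask   = 20,
                  timesToRun     = 10,
                  packagesToLoad = "randomForest")

# Merge the per-task forests into a single forest of 10 * 20 trees.
combinedForest <- do.call(randomForest::combine, unname(forests))
```

Because every task receives the full data frame, memory use per node grows with the size of the data rather than with the number of trees.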