
Forest and Tree Modeling Accuracy

  • Tune rxDForest parameters, trading speed for accuracy   (*: defaults in open-source R (OSR) and Revolution R Enterprise (RRE))

–      Increase nTree, e.g. to 20 or more   (OSR=500, RRE=10)*

–      Increase maxDepth, e.g. to 20 or more   (OSR=N/A, RRE=10)*

–      Decrease minSplit, e.g. to 2   (OSR=5, RRE=sqrt(N))*

–      Increase mTry, e.g. to 40 or more   (OSR/RRE=sqrt(p) for classification or p/3 for regression)*

–      Increase maxNumBins, e.g. to 1e5 or 1e6

–      With the KDD dataset, the following settings gave an accuracy of 81.4%, rising to 82.3% when nTree was increased to 200:

nTree=20, mTry=40, minSplit=2, maxDepth=20, maxNumBins=1e6
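
The settings above could be passed to rxDForest roughly as follows. This is a minimal sketch, not a definitive recipe: the wrapper name `fitTunedForest`, the data source `kddTrain`, and the response column `label` are hypothetical, and mTry=40 only makes sense when the formula has at least 40 predictors.

```r
# Hypothetical wrapper: fit rxDForest with the accuracy-oriented settings
# from the list above as defaults. Larger values generally mean slower fits.
fitTunedForest <- function(formula, data,
                           nTree = 20, mTry = 40, minSplit = 2,
                           maxDepth = 20, maxNumBins = 1e6) {
  # RevoScaleR ships with Revolution R Enterprise / Microsoft R, not CRAN.
  if (!requireNamespace("RevoScaleR", quietly = TRUE)) {
    stop("RevoScaleR is required for rxDForest")
  }
  RevoScaleR::rxDForest(formula = formula, data = data,
                        nTree = nTree, mTry = mTry, minSplit = minSplit,
                        maxDepth = maxDepth, maxNumBins = maxNumBins)
}

# Usage (assumed names; build the formula from the predictor columns):
# predictors <- setdiff(names(kddTrain), "label")
# form  <- as.formula(paste("label ~", paste(predictors, collapse = " + ")))
# model <- fitTunedForest(form, data = kddTrain)
# model200 <- fitTunedForest(form, data = kddTrain, nTree = 200)
```

Keeping the tuned values as function defaults makes it easy to vary one parameter at a time (for example nTree) and compare accuracy against run time.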

  • Alternatively, run the open source randomForest routine across the Hadoop cluster using rxExec

–      See randomShrubbery in Section 6.5 of our Distributed Computing Guide

–      Adjust the MapReduce (MR) memory limits if needed, since the data must fit in memory on each node.
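
The rxExec alternative can be sketched as below. This is an illustration under stated assumptions, not the randomShrubbery code from the guide: the function name `growDistributedForest` and its arguments `treesPerTask` and `nTasks` are hypothetical, and the sketch assumes a compute context (for example Hadoop) has already been set. Each task grows a small forest on the full data, and the pieces are merged with `combine` from the randomForest package.

```r
# Hypothetical helper: grow several small forests in parallel and merge them.
growDistributedForest <- function(formula, data, treesPerTask = 50, nTasks = 4) {
  if (!requireNamespace("randomForest", quietly = TRUE)) {
    stop("the randomForest package is required")
  }
  if (requireNamespace("RevoScaleR", quietly = TRUE)) {
    # Run randomForest nTasks times in the current compute context;
    # each worker needs the full data set in memory.
    forests <- RevoScaleR::rxExec(randomForest::randomForest, formula,
                                  data = data, ntree = treesPerTask,
                                  timesToRun = nTasks,
                                  packagesToLoad = "randomForest")
  } else {
    # Sequential fallback so the sketch also runs without RevoScaleR.
    forests <- lapply(seq_len(nTasks), function(i) {
      randomForest::randomForest(formula, data = data, ntree = treesPerTask)
    })
  }
  # Merge the per-task forests into one ensemble of treesPerTask * nTasks trees.
  do.call(randomForest::combine, forests)
}

# Usage (assumed data frame `trainDF` with a factor response `label`):
# forest <- growDistributedForest(label ~ ., data = trainDF,
#                                 treesPerTask = 50, nTasks = 4)
```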
