python - How to obtain information gain from a scikit-learn DecisionTreeClassifier?


I see that DecisionTreeClassifier accepts criterion='entropy', which means it must be using information gain as the criterion for splitting the decision tree. What I need is the information gain for each feature at the root level, i.e. when it is about to split the root node.

You can only access the information gain (or Gini impurity) for the feature that has actually been used to split a node. The attribute DecisionTreeClassifier.tree_.best_error[i] holds the entropy of the i-th node when splitting on feature DecisionTreeClassifier.tree_.feature[i]. If you want the entropy of all the examples that reach the i-th node, look at DecisionTreeClassifier.tree_.init_error[i].
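Note that best_error and init_error are internal attributes of older scikit-learn releases; in current versions the per-node impurities (entropy, when criterion='entropy') are exposed as tree_.impurity instead. A minimal sketch of computing the root-level information gain under that assumption, using the public tree_ attributes (impurity, feature, children_left, children_right, weighted_n_node_samples):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

t = clf.tree_
root = 0
left, right = t.children_left[root], t.children_right[root]
n = t.weighted_n_node_samples

# Information gain of the root split = parent entropy minus the
# sample-weighted average of the two child entropies.
gain = (t.impurity[root]
        - (n[left] / n[root]) * t.impurity[left]
        - (n[right] / n[root]) * t.impurity[right])

print("root splits on feature:", t.feature[root])
print("information gain at root:", gain)
```

This only gives the gain for the feature the tree actually chose at the root; to get the gain every candidate feature would have achieved, you would still need to recompute the split search yourself, as described below.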

For more information see the source documentation here: https://github.com/scikit-learn/scikit-learn/blob/dacfd8bd5d943cb899ed8cd423aaf11b4f27c186/sklearn/tree/_tree.pyx#L64

If you want to access the entropy for each candidate feature (at a split node), you would need to modify the function find_best_split in https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_tree.pyx#L713
