Random forests, or random decision forests, are an ensemble learning method for classification, regression and other tasks. They operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the individual trees' classes (classification) or the mean of their predictions (regression). The first algorithm for random decision forests was created by Tin Kam Ho using the random subspace method, which, in Ho's formulation, is a way to implement the "stochastic discrimination" approach to classification proposed by Eugene Kleinberg. Ho established that forests of trees splitting with oblique hyperplanes can gain accuracy as they grow without suffering from overtraining, as long as the forests are randomly restricted to be sensitive to only selected feature dimensions. The idea of random subspace selection from Ho was also influential in the later design of random forests by Leo Breiman, whose paper offers the first theoretical result for random forests, in the form of a bound on the generalization error that depends on the strength of the trees in the forest and their correlation.
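The two ideas described above, training each tree on a random subset of feature dimensions and taking the mode of the trees' votes, can be sketched in a few lines. This is only an illustrative sketch, not Ho's or Breiman's exact algorithm: the helper names `fit_subspace_forest` and `predict_majority` are hypothetical, the trees are scikit-learn's ordinary axis-aligned `DecisionTreeClassifier` rather than oblique-hyperplane trees, no bootstrap sampling is done, and class labels are assumed to be the integers 0..k-1.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_subspace_forest(x, y, n_trees=25, n_features=2, seed=0):
    """Train each tree on its own randomly chosen subset of feature columns."""
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        # Pick distinct feature columns for this tree (the "random subspace").
        cols = rng.choice(x.shape[1], size=n_features, replace=False)
        tree = DecisionTreeClassifier(random_state=0).fit(x[:, cols], y)
        forest.append((cols, tree))
    return forest


def predict_majority(forest, x):
    """Return, for each sample, the mode of the per-tree class predictions."""
    # votes has shape (n_trees, n_samples); labels assumed to be 0..k-1.
    votes = np.stack([tree.predict(x[:, cols]) for cols, tree in forest])
    n_classes = len(forest[0][1].classes_)
    # Count the votes per class for every sample, then take the most common.
    counts = np.apply_along_axis(np.bincount, 0, votes, minlength=n_classes)
    return counts.argmax(axis=0)


iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris["data"], iris["target"], test_size=0.3, random_state=1
)
forest = fit_subspace_forest(x_train, y_train)
predictions = predict_majority(forest, x_test)
accuracy = float(np.mean(predictions == y_test))
print(f"Majority-vote accuracy: {accuracy:.2f}")
```

For regression, the same structure would apply with the vote replaced by the mean of the trees' predictions. In practice `RandomForestClassifier` handles feature subsampling, bootstrapping and aggregation internally, so a hand-rolled ensemble like this is useful mainly for seeing the mechanism.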
# Random Forest Classifier Example
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import ConfusionMatrixDisplay
from sklearn.model_selection import train_test_split


def main():
    """
    Random Forest Classifier example using sklearn.
    The Iris dataset is used to demonstrate the algorithm.
    """

    # Load Iris dataset
    iris = load_iris()

    # Split dataset into train and test data
    X = iris["data"]  # features
    Y = iris["target"]  # class labels
    x_train, x_test, y_train, y_test = train_test_split(
        X, Y, test_size=0.3, random_state=1
    )

    # Random Forest Classifier
    rand_for = RandomForestClassifier(random_state=42, n_estimators=100)
    rand_for.fit(x_train, y_train)

    # Display the confusion matrix of the classifier, normalized per true class.
    # ConfusionMatrixDisplay.from_estimator requires scikit-learn >= 1.0.
    ConfusionMatrixDisplay.from_estimator(
        rand_for,
        x_test,
        y_test,
        display_labels=iris["target_names"],
        cmap="Blues",
        normalize="true",
    )
    plt.title("Normalized Confusion Matrix - IRIS Dataset")
    plt.show()


if __name__ == "__main__":
    main()