Random Forest Classifier Algorithm

Random forests, or random decision forests, are an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees. The first algorithm for random decision forests was created by Tin Kam Ho using the random subspace method, which, in Ho's formulation, is a way to implement the "stochastic discrimination" approach to classification proposed by Eugene Kleinberg.
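The aggregation step described above can be sketched in a few lines. This is only an illustration of "mode for classification, mean for regression", not the code of any particular library; the function names `majority_vote` and `mean_prediction` and the small prediction arrays are invented for the example, and class labels are assumed to be integers 0..k-1.

```python
import numpy as np


def majority_vote(tree_predictions):
    """Return the most frequent class label for each sample.

    tree_predictions has shape (n_trees, n_samples) and holds integer
    class labels 0..k-1 (an assumption of this sketch).
    """
    preds = np.asarray(tree_predictions)
    n_classes = preds.max() + 1
    # Count the votes per class for each sample (one column per sample).
    votes = np.apply_along_axis(np.bincount, 0, preds, minlength=n_classes)
    # argmax returns the lowest label when two classes tie.
    return votes.argmax(axis=0)


def mean_prediction(tree_predictions):
    """Return the average of the per-tree regression outputs."""
    return np.asarray(tree_predictions, dtype=float).mean(axis=0)


# Three hypothetical trees, three samples each.
class_preds = [[0, 1, 2], [0, 1, 1], [1, 1, 2]]
print(majority_vote(class_preds))  # [0 1 2]

# Two hypothetical regression trees, two samples each.
print(mean_prediction([[1.0, 2.0], [3.0, 4.0]]))  # [2. 3.]
```

Note that scikit-learn's `RandomForestClassifier` averages the trees' predicted class probabilities rather than counting hard votes, so its output can differ from a strict majority vote on borderline samples.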

Leo Breiman's later paper on random forests offers the first theoretical result for random forests, in the form of a bound on the generalization error that depends on the strength of the trees in the forest and their correlation. Ho's idea of random subspace selection was also influential in the design of random forests. Ho established that forests of trees splitting with oblique hyperplanes can gain accuracy as they grow without suffering from overtraining, as long as the forests are randomly restricted to be sensitive to only selected feature dimensions.
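The random subspace idea can be illustrated with a small hand-rolled ensemble in which every tree is trained on its own random subset of the feature columns. This is a rough sketch under stated assumptions, not Ho's or Breiman's exact algorithm: it uses scikit-learn's axis-aligned `DecisionTreeClassifier` rather than oblique hyperplanes, it does no bootstrap sampling, and the helper names `fit_random_subspace_forest` and `predict_forest` as well as the defaults `n_trees=25` and `n_features=2` are hypothetical.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_random_subspace_forest(X, y, n_trees=25, n_features=2, seed=0):
    """Train n_trees trees, each on a random subset of feature columns."""
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        # Each tree is sensitive only to its own selected feature dimensions.
        cols = rng.choice(X.shape[1], size=n_features, replace=False)
        tree = DecisionTreeClassifier(random_state=0).fit(X[:, cols], y)
        forest.append((cols, tree))
    return forest


def predict_forest(forest, X):
    """Combine the trees by majority vote (labels assumed to be 0..k-1)."""
    # Every tree predicts from the same columns it was trained on.
    preds = np.array([tree.predict(X[:, cols]) for cols, tree in forest])
    n_classes = preds.max() + 1
    votes = np.apply_along_axis(np.bincount, 0, preds, minlength=n_classes)
    return votes.argmax(axis=0)


X, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)
forest = fit_random_subspace_forest(x_train, y_train)
predictions = predict_forest(forest, x_test)
accuracy = (predictions == y_test).mean()
print(f"Test accuracy of the sketch: {accuracy:.2f}")
```

Because the trees see different feature subsets, their errors are less correlated than those of identical trees trained on all features, which is the property the generalization bound above rewards.
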
# Random Forest Classifier Example

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import ConfusionMatrixDisplay
import matplotlib.pyplot as plt


def main():
    """
    Random Forest Classifier example using scikit-learn.
    The Iris dataset is used to demonstrate the algorithm.
    """

    # Load Iris dataset
    iris = load_iris()

    # Split dataset into train and test data
    X = iris["data"]  # features
    Y = iris["target"]  # class labels
    x_train, x_test, y_train, y_test = train_test_split(
        X, Y, test_size=0.3, random_state=1
    )

    # Random Forest Classifier with 100 trees
    rand_for = RandomForestClassifier(random_state=42, n_estimators=100)
    rand_for.fit(x_train, y_train)

    # Display the confusion matrix of the classifier, normalized per true class
    ConfusionMatrixDisplay.from_estimator(
        rand_for,
        x_test,
        y_test,
        display_labels=iris["target_names"],
        cmap="Blues",
        normalize="true",
    )
    plt.title("Normalized Confusion Matrix - IRIS Dataset")
    plt.show()


if __name__ == "__main__":
    main()
