Gaussian Naive Bayes Algorithm

The Gaussian Naive Bayes algorithm is a probabilistic machine learning algorithm based on Bayes' theorem, used primarily for classification. It is called "naive" because it assumes the features in the dataset are independent of each other: the presence of one feature does not affect the presence of another. This simplification lets the algorithm perform well on many tasks, even though the independence assumption rarely holds exactly in real-world data. Gaussian Naive Bayes specifically handles continuous features, assuming that the values associated with each class follow a Gaussian (normal) distribution.

The algorithm first computes the prior probability of each class, i.e. how often that class occurs in the training data. It then models the likelihood of observing a particular feature value given a class, using a Gaussian fitted to that class's training values. To classify a new data point, it computes the posterior probability of each class given the point's feature values and assigns the class with the highest posterior.

Because Gaussian Naive Bayes is computationally efficient and handles large datasets well, it is often used as a baseline for text classification, spam filtering, and other applications where the features can reasonably be assumed conditionally independent.
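The steps above (class priors, per-class Gaussian likelihoods, highest log-posterior wins) can be sketched from scratch with NumPy. This is an illustrative sketch, not a library API; the function and variable names are made up for this example, and a small variance-smoothing constant is added to avoid division by zero:

```python
import numpy as np


def gaussian_nb_predict(x_train, y_train, x_test):
    """Classify rows of x_test by comparing per-class log-posteriors."""
    classes = np.unique(y_train)
    log_posteriors = []
    for c in classes:
        members = x_train[y_train == c]
        prior = len(members) / len(x_train)       # P(class)
        mean = members.mean(axis=0)               # per-feature Gaussian mean
        var = members.var(axis=0) + 1e-9          # smoothing avoids zero variance
        # Log of the Gaussian pdf, summed over the (assumed independent) features
        log_likelihood = -0.5 * np.sum(
            np.log(2 * np.pi * var) + (x_test - mean) ** 2 / var, axis=1
        )
        log_posteriors.append(np.log(prior) + log_likelihood)
    # Pick the class whose log-posterior is largest for each test row
    return classes[np.argmax(log_posteriors, axis=0)]
```

Working in log space keeps the products of many small probabilities numerically stable; the argmax is unchanged because the logarithm is monotonic.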
# Gaussian Naive Bayes Example

from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import ConfusionMatrixDisplay
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt


def main():

    """
    Gaussian Naive Bayes example using the sklearn GaussianNB estimator,
    demonstrated on the Iris dataset.
    """

    # Load Iris dataset
    iris = load_iris()

    # Split dataset into train and test data
    X = iris["data"]  # features
    Y = iris["target"]  # class labels
    x_train, x_test, y_train, y_test = train_test_split(
        X, Y, test_size=0.3, random_state=1
    )

    # Fit a Gaussian Naive Bayes classifier
    NB_model = GaussianNB()
    NB_model.fit(x_train, y_train)

    # Display the confusion matrix, normalized over the true labels.
    # (plot_confusion_matrix was removed in scikit-learn 1.2;
    # ConfusionMatrixDisplay.from_estimator is its replacement.)
    ConfusionMatrixDisplay.from_estimator(
        NB_model,
        x_test,
        y_test,
        display_labels=iris["target_names"],
        cmap="Blues",
        normalize="true",
    )
    plt.title("Normalized Confusion Matrix - IRIS Dataset")
    plt.show()


if __name__ == "__main__":
    main()
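Since the description above emphasizes posterior probabilities, it may help to inspect them directly: a fitted `GaussianNB` exposes them through `predict_proba`. A minimal sketch (training on the full dataset for brevity; variable names are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

iris = load_iris()
model = GaussianNB().fit(iris["data"], iris["target"])

# Posterior P(class | x) for the first sample; columns follow model.classes_
probs = model.predict_proba(iris["data"][:1])
print(probs.round(3))  # one probability per class, summing to 1
```

The predicted class from `predict` is simply the column of `predict_proba` with the largest posterior, matching the decision rule described earlier.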
