Average Mode Algorithm
The average mode algorithm, a form of mode imputation that is sometimes grouped under mean/mode imputation, is a simple statistical technique for handling missing data in a dataset. It replaces each missing value in an attribute with that attribute's mode, the most frequently occurring observed value. It is particularly useful for categorical data, where a mean or median is undefined, and it keeps the most common category the most common, although it increases that category's share. Because the mode is defined for any collection of values, the method can also be applied to numerical attributes, though it is less informative for continuous data, where exact repeats are rare.
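As a rough illustration of the replacement step, the sketch below fills one column of values. The function name `impute_mode`, the use of `None` as the missing-value marker, and the sample data are assumptions made for this example only.

```python
from collections import Counter


def impute_mode(values, missing=None):
    """Return a copy of `values` with each `missing` marker replaced by the mode.

    `impute_mode` and the `missing` marker are illustrative, not a library API.
    """
    observed = [v for v in values if v is not missing]
    if not observed:
        raise ValueError("no observed values to compute a mode from")
    # most_common(1) returns [(value, count)]; ties go to the value seen first.
    fill_value = Counter(observed).most_common(1)[0][0]
    return [fill_value if v is missing else v for v in values]


colors = ["red", "blue", None, "red", "green", None]  # hypothetical column
print(impute_mode(colors))
# ['red', 'blue', 'red', 'red', 'green', 'red']
```

The same function works for a numerical column, for example `impute_mode([1, None, 1, 2])`, since counting only requires that the values be hashable.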
In practice, the average mode algorithm first identifies, for each attribute with missing data, the mode of the observed values, and then replaces every missing value in that attribute with it. The method is easy to implement and never introduces a value that was not already observed. On the other hand, it may introduce bias: it assumes that the mode is an appropriate replacement for every missing value, so it inflates the frequency of the mode and understates the variability of the data. Despite these limitations, the average mode algorithm is widely used for handling missing data, especially when more advanced methods are not feasible or necessary.
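The per-attribute procedure, and the bias it can introduce, can be sketched as follows. This is a minimal example under stated assumptions: the dataset is a list of dicts that all share the same keys, `None` marks a missing value, and `impute_rows` and the sample rows are hypothetical.

```python
from collections import Counter


def impute_rows(rows, missing=None):
    """Fill missing entries in each column with that column's mode.

    Assumes every row is a dict with the same keys; returns new rows.
    """
    columns = list(rows[0])
    fill = {}
    for col in columns:
        observed = [row[col] for row in rows if row[col] is not missing]
        if observed:  # a column with no observed values is left as-is
            fill[col] = Counter(observed).most_common(1)[0][0]
    return [
        {
            col: fill.get(col, missing) if row[col] is missing else row[col]
            for col in columns
        }
        for row in rows
    ]


rows = [  # hypothetical data
    {"city": "Oslo", "rooms": 2},
    {"city": None, "rooms": 3},
    {"city": "Oslo", "rooms": None},
    {"city": "Bergen", "rooms": 3},
    {"city": None, "rooms": 3},
]
filled = impute_rows(rows)

# Share of "Oslo" among observed values before, and among all rows after.
observed_cities = [r["city"] for r in rows if r["city"] is not None]
before = observed_cities.count("Oslo") / len(observed_cities)
after = sum(r["city"] == "Oslo" for r in filled) / len(filled)
print(round(before, 2), round(after, 2))
# 0.67 0.8
```

In this toy dataset the modal city goes from two thirds of the observed values to four fifths of all rows, which is the kind of distortion the bias caveat refers to.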
import statistics


def mode(input_list):
    """Return the mode (as in the measures of central tendency) of the input data.

    The input list may contain any data type that supports equality
    comparison, including unhashable values such as lists. If several
    values tie for the highest count, the one that appears first is
    returned. An empty list raises ValueError.

    >>> input_list = [2, 3, 4, 5, 3, 4, 2, 5, 2, 2, 4, 2, 2, 2]
    >>> mode(input_list)
    2
    >>> mode(input_list) == statistics.mode(input_list)
    True
    >>> mode([1, 1, 2, 2, 2])
    2
    """
    # Count the occurrences of every element; the input list is left unmodified.
    counts = [input_list.count(x) for x in input_list]
    # The first position holding the highest count identifies the mode.
    return input_list[counts.index(max(counts))]


if __name__ == "__main__":
    data = [2, 3, 4, 5, 3, 4, 2, 5, 2, 2, 4, 2, 2, 2]
    print(mode(data))
    print(statistics.mode(data))