Bug: NaiveBayes algorithm implementation


Problem: From the attached image (19.png): to calculate the likelihood probability of attribute value '5.4' for each label, you calculate the probability of '5.4' in A1 only for the specified label (e.g. Iris-setosa). However, SharpClassifier calculates the probability of '5.4' across all attributes A1 to A4 for the specified label.
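To make the difference concrete, here is a minimal Python sketch of the two ways of counting. It is not SharpClassifier's code: the toy rows, the function names, and the choice of denominator in the pooled version are all assumptions made for illustration, and no smoothing is applied.

```python
# Hypothetical toy rows: (A1, A2, A3, A4, label). Values are illustrative only.
ROWS = [
    ("5.4", "3.9", "1.7", "0.4", "Iris-setosa"),
    ("5.1", "3.5", "1.4", "0.2", "Iris-setosa"),
    ("5.4", "3.0", "4.5", "1.5", "Iris-versicolor"),
    ("7.0", "3.2", "5.4", "1.4", "Iris-versicolor"),
]


def per_attribute_likelihood(rows, attr_index, value, label):
    """Expected behaviour: count `value` only in the attribute it belongs to."""
    labelled = [r for r in rows if r[-1] == label]
    matches = sum(1 for r in labelled if r[attr_index] == value)
    return matches / len(labelled)


def pooled_likelihood(rows, value, label):
    """Assumed reading of the reported behaviour: count `value` in any attribute.

    The denominator (number of rows with the label) is an assumption; the
    report does not say what SharpClassifier divides by.
    """
    labelled = [r for r in rows if r[-1] == label]
    matches = sum(1 for r in labelled for cell in r[:-1] if cell == value)
    return matches / len(labelled)


if __name__ == "__main__":
    for label in ("Iris-setosa", "Iris-versicolor"):
        expected = per_attribute_likelihood(ROWS, 0, "5.4", label)
        pooled = pooled_likelihood(ROWS, "5.4", label)
        print(label, "A1 only:", expected, "all attributes:", pooled)
```

In this toy data the two counts agree for Iris-setosa, where '5.4' occurs only in A1, and diverge for Iris-versicolor, where '5.4' also occurs in A3.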

You can confirm this by running a dataset in which each attribute has a different data type (i.e. a dataset containing 1 attribute of categorical type, 1 attribute of text type, and 1 attribute of integer type), thereby limiting the occurrence of a value to its own attribute. In that case, SharpClassifier's test set classifications will be the same as those of other implementations of the NaiveBayes algorithm.

If we run a dataset in which an attribute value occurs in more than one attribute, then SharpClassifier will produce more misclassifications.
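When choosing a dataset to reproduce this, it may help to check whether any value appears in more than one attribute. The following Python sketch does that. The helper name and both sample datasets are hypothetical, and rows are assumed to be tuples with the label in the last position.

```python
from collections import defaultdict


def values_shared_across_attributes(rows):
    """Map each value that occurs in more than one attribute to those attribute indexes."""
    seen = defaultdict(set)
    for row in rows:
        # The last element is assumed to be the label, so it is skipped.
        for index, cell in enumerate(row[:-1]):
            seen[cell].add(index)
    return {value: sorted(idx) for value, idx in seen.items() if len(idx) > 1}


# Hypothetical dataset with one categorical, one text and one integer attribute:
# no value can occur in more than one attribute.
DISJOINT = [
    ("red", "short note", 3, "yes"),
    ("blue", "long note", 7, "no"),
]

# Hypothetical dataset where '5.4' occurs in both A1 and A3.
OVERLAPPING = [
    ("5.4", "3.9", "1.7", "0.4", "Iris-setosa"),
    ("7.0", "3.2", "5.4", "1.4", "Iris-versicolor"),
]

if __name__ == "__main__":
    print(values_shared_across_attributes(DISJOINT))
    print(values_shared_across_attributes(OVERLAPPING))
```

An empty result corresponds to the first case described above, where results should match other implementations. A non-empty result marks a dataset on which the extra misclassifications would be expected.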
