Naive Bayes Classification

Naive Bayes classification is based on Bayes' theorem. It assumes that the features contribute independently to the classification of the outcome, i.e. that they are conditionally independent given the class.
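As a sketch of what this assumption buys, the posterior for a class can be written as a product of per-feature terms. The symbols here (C for the class, x_1, ..., x_n for the feature values) are notation introduced for illustration only:

```latex
P(C \mid x_1, \dots, x_n) \;\propto\; P(C) \prod_{i=1}^{n} P(x_i \mid C),
\qquad
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(x_i \mid C)
```

The predicted class is the one with the largest value of this product, so each conditional probability P(x_i | C) can be estimated separately from the training data.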

Naive Bayes is a particularly handy tool for classification problems. In R, it can be implemented using the naiveBayes() function in the e1071 package or naive.bayes() in the bnlearn package.
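For the e1071 route, a minimal sketch might look like the following. It assumes the e1071 package is installed and uses R's built-in iris data purely for illustration. The train/test split and the variable names (idx, train, test, model) are choices made here, not part of any prescribed workflow.

```r
library(e1071)  # assumption: the e1071 package is installed

set.seed(42)                                    # reproducible split
idx   <- sample(nrow(iris), 100)                # 100 rows for training, 50 held out
train <- iris[idx, ]
test  <- iris[-idx, ]

model <- naiveBayes(Species ~ ., data = train)  # numeric features get Gaussian likelihoods
pred  <- predict(model, test)                   # predicted class labels (a factor)
table(pred, test$Species)                       # rows = predicted, columns = actual
```

Unlike naive.bayes() in bnlearn, which expects categorical (factor) variables, naiveBayes() in e1071 also accepts numeric features.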

How to Implement Naive Bayes?

Dealing with Missing Values

In classification problems, if you choose not to remove observations with missing values altogether, it is good practice to convert the missing values into an additional category before applying the algorithm.

# Optional - Convert NA to an additional factor level in your categorical data.
for (Var in names (inputData)) {
    if (any (is.na (inputData[, Var]))) {
        levels (inputData[, Var]) <- c (levels (inputData[, Var]), "UNKNOWN") # add 'UNKNOWN' level for missing
        inputData[is.na (inputData[, Var]), Var] <- "UNKNOWN" # replace missing with 'UNKNOWN'
    }
}

Apply the Naive Bayes Classification and Predict

library (bnlearn)
bn <- naive.bayes (inputData, "responseVarName") # make Bayes model
fitted <- bn.fit (bn, inputData) # fit parameters of Bayes model
pred <- predict (fitted, newData) # predict on new data
table (pred, newData[, "responseVarName"]) # confusion matrix

# sample confusion matrix: rows = predicted, columns = actual
pred   a   b   c
   a 253  55  32
   b  44 197  41
   c  42  73 263
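To read a confusion matrix like this one, it can help to reduce it to a few summary numbers. The sketch below rebuilds the sample matrix by hand, assuming rows are predicted classes and columns are actual classes, as produced by table (pred, ...). The names cm, accuracy, precision and recall are introduced here for illustration.

```r
# Rebuild the sample confusion matrix (rows = predicted, columns = actual).
cm <- matrix(c(253,  55,  32,
                44, 197,  41,
                42,  73, 263),
             nrow = 3, byrow = TRUE,
             dimnames = list(pred = c("a", "b", "c"), actual = c("a", "b", "c")))

accuracy  <- sum(diag(cm)) / sum(cm)  # correct predictions / all predictions = 713 / 1000
precision <- diag(cm) / rowSums(cm)   # per class: correct / all predicted as that class
recall    <- diag(cm) / colSums(cm)   # per class: correct / all actually in that class

accuracy  # 0.713
```

The diagonal holds the correctly classified observations, so about 71% of the predictions in this sample are correct.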

