This function applies optimal Weight of Evidence (WoE) values to an original categorical feature based on the results from an optimal binning algorithm. It assigns each category in the feature to its corresponding optimal bin and maps the associated WoE value.

Usage

OBApplyWoECat(obresults, feature, bin_separator = "%;%")

Arguments

obresults

A list containing the output from an optimal binning algorithm for categorical variables. It must include at least the following elements:

  • bin: Character vector of merged categories for each optimal bin

  • woe: Numeric vector of WoE values for each bin

  • id: Numeric vector of bin IDs representing the optimal order

feature

A character vector containing the original categorical feature data to which WoE values will be applied.

bin_separator

A string giving the separator that delimits individual categories within a merged bin label (default: "%;%"). It must match the separator used when the bins in obresults were created.

Value

A data frame with four columns:

  • feature: Original feature values.

  • bin: Optimal merged bins to which each feature value belongs.

  • woe: Optimal WoE values corresponding to each feature value.

  • idbin: ID of the bin to which each feature value belongs.

Details

The function splits each entry of the bin element of obresults into its individual categories using bin_separator. It then builds a mapping from each category to its corresponding bin index, WoE value, and bin ID.

For each value in feature, the function assigns the appropriate bin, WoE value, and bin ID based on the category-to-bin mapping. If a category in feature is not found in any bin, NA is assigned to bin, woe, and idbin.

The function handles missing values (NA) in feature by assigning NA to bin, woe, and idbin for those entries.
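
The lookup described above can be sketched in a few lines of base R. This is an illustrative approximation only, not the package's actual implementation: the helper name apply_woe_sketch and the toy values in the usage lines are hypothetical, and the sketch assumes that each category appears in exactly one bin.

```r
# Hypothetical sketch of the category-to-bin lookup (not the package's code).
apply_woe_sketch <- function(obresults, feature, bin_separator = "%;%") {
  # Split each merged bin label into its individual categories
  # (fixed = TRUE treats the separator literally, not as a regex).
  categories <- strsplit(obresults$bin, bin_separator, fixed = TRUE)

  # For every individual category, record the index of the bin it came from.
  bin_index <- rep(seq_along(categories), lengths(categories))

  # Look up each feature value; unseen categories and NA both yield NA.
  idx <- bin_index[match(feature, unlist(categories))]

  data.frame(
    feature = feature,
    bin     = obresults$bin[idx],
    woe     = obresults$woe[idx],
    idbin   = obresults$id[idx],
    stringsAsFactors = FALSE
  )
}

# Toy input (made-up values) using the default separator.
toy <- list(bin = c("a%;%b", "c"), woe = c(-0.5, 0.3), id = c(1, 2))
res <- apply_woe_sketch(toy, c("b", "c", "z", NA))
print(res)
```

In this sketch, match() returns NA both for categories that appear in no bin ("z") and for missing values, so indexing with the result propagates NA to bin, woe, and idbin, which mirrors the behaviour documented above.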

Examples

if (FALSE) { # \dontrun{
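# Example usage with a hypothetical obresults list and feature vector.
# Bins here are joined with ";" rather than the default "%;%", so
# bin_separator is set explicitly. "unknown_category" matches no bin,
# so its bin, woe, and idbin are NA.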
obresults <- list(
  bin = c("business;repairs;car (used);retraining",
          "car (new);furniture/equipment;domestic appliances;education;others",
          "radio/television"),
  woe = c(-0.2000211, 0.2892885, -0.4100628),
  id = c(1, 2, 3)
)
feature <- c("business", "education", "radio/television", "unknown_category")
result <- OBApplyWoECat(obresults, feature, bin_separator = ";")
print(result)
} # }