Optimal Binning for Numerical Features using Monotonic Optimal Binning (MOB)
optimal_binning_numerical_mob.Rd
Implements the Monotonic Optimal Binning (MOB) algorithm for discretizing numerical features while maintaining monotonicity in the Weight of Evidence (WoE) values. This is particularly useful for credit scoring and risk modeling applications where monotonicity is often a desirable property for interpretability and regulatory compliance.
Usage
optimal_binning_numerical_mob(
target,
feature,
min_bins = 3L,
max_bins = 5L,
bin_cutoff = 0.05,
max_n_prebins = 20L,
convergence_threshold = 1e-06,
max_iterations = 1000L,
laplace_smoothing = 0.5
)
Arguments
- target
An integer vector of binary target values (0 or 1)
- feature
A numeric vector of feature values to be binned
- min_bins
Minimum number of bins to create (default: 3)
- max_bins
Maximum number of bins to create (default: 5)
- bin_cutoff
Minimum proportion of total observations required in a bin (default: 0.05)
- max_n_prebins
Maximum number of pre-bins created in the initial equal-frequency split (default: 20)
- convergence_threshold
Threshold for convergence in the iterative process (default: 1e-6)
- max_iterations
Maximum number of iterations for the binning process (default: 1000)
- laplace_smoothing
Smoothing parameter for WoE calculation (default: 0.5)
Value
A list containing the following elements:
- id
Bin identifiers (1-based)
- bin
A character vector of bin labels showing the intervals
- woe
A numeric vector of Weight of Evidence values for each bin
- iv
A numeric vector of Information Value for each bin
- count
An integer vector of total count of observations in each bin
- count_pos
An integer vector of count of positive class observations in each bin
- count_neg
An integer vector of count of negative class observations in each bin
- event_rate
A numeric vector with the proportion of positive cases in each bin
- cutpoints
A numeric vector of cutpoints used to create the bins
- total_iv
The total Information Value of all bins combined
- converged
A logical value indicating whether the algorithm converged
- iterations
An integer value indicating the number of iterations run
Details
Mathematical Framework:
Weight of Evidence (WoE): For a bin \(i\) with Laplace smoothing parameter \(\alpha\):
$$WoE_i = \ln\left(\frac{n_{1i} + \alpha}{n_{1} + m\alpha} \cdot \frac{n_{0} + m\alpha}{n_{0i} + \alpha}\right)$$
Where:
\(n_{1i}\) is the count of positive cases in bin \(i\)
\(n_{0i}\) is the count of negative cases in bin \(i\)
\(n_{1}\) is the total count of positive cases
\(n_{0}\) is the total count of negative cases
\(m\) is the number of bins
\(\alpha\) is the Laplace smoothing parameter
Information Value (IV): Summarizes predictive power across all bins, using the same notation as above: $$IV = \sum_{i=1}^{m} \left(\frac{n_{1i}}{n_{1}} - \frac{n_{0i}}{n_{0}}\right) \times WoE_i$$
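For concreteness, the two formulas above can be reproduced by hand in R for a small set of made-up bin counts. The counts and variable names below are purely illustrative; the package computes these quantities internally.
# Hypothetical counts for m = 3 bins (illustrative numbers only)
n1i <- c(10, 40, 80)    # positives per bin
n0i <- c(90, 60, 20)    # negatives per bin
alpha <- 0.5            # Laplace smoothing parameter
m  <- length(n1i)
n1 <- sum(n1i)
n0 <- sum(n0i)
# Smoothed Weight of Evidence per bin
woe <- log(((n1i + alpha) / (n1 + m * alpha)) /
           ((n0i + alpha) / (n0 + m * alpha)))
# Information Value: difference in class distributions weighted by WoE
iv_bin   <- (n1i / n1 - n0i / n0) * woe
total_iv <- sum(iv_bin)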
Monotonicity: The algorithm ensures that WoE values either consistently increase or decrease as the feature value increases, which aligns with business expectations that risk should change monotonically with certain features.
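A minimal check of this property on a returned WoE vector (a helper written here for illustration, not part of the package API) could look like:
# Returns TRUE if the WoE values are non-decreasing or non-increasing
is_monotone <- function(woe) {
  d <- diff(woe)
  all(d >= 0) || all(d <= 0)
}
is_monotone(c(-0.8, -0.2, 0.3, 0.9))  # TRUE: strictly increasing
is_monotone(c(-0.5, 0.4, 0.1))        # FALSE: violates monotonicity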
Algorithm Steps (a simplified sketch in plain R follows this list):
Create Initial Pre-bins: Divide the feature into equal-frequency bins
Merge Rare Bins: Combine bins with frequency below the threshold
Enforce Monotonicity: Identify and merge bins that violate monotonicity
Optimize Bin Count: Adjust number of bins to stay within min/max constraints
Calculate Metrics: Compute final WoE and IV values with Laplace smoothing
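The sketch below re-implements these five steps in plain R under simplifying assumptions (no min_bins enforcement, no convergence threshold, naive handling of ties and empty bins). It is meant only to illustrate the flow, not to reproduce the package's implementation.
mob_sketch <- function(feature, target, max_bins = 5, bin_cutoff = 0.05,
                       max_n_prebins = 20, alpha = 0.5) {
  # Step 1: equal-frequency pre-bins from quantile cutpoints
  cuts <- unique(quantile(feature,
                          probs = seq(0, 1, length.out = max_n_prebins + 1),
                          na.rm = TRUE))
  # Bin the feature on the current cutpoints and compute counts and smoothed WoE
  summarise_bins <- function(cuts) {
    b   <- cut(feature, breaks = cuts, include.lowest = TRUE)
    n1i <- tapply(target == 1, b, sum)
    n0i <- tapply(target == 0, b, sum)
    m   <- length(n1i)
    woe <- log(((n1i + alpha) / (sum(n1i) + m * alpha)) /
               ((n0i + alpha) / (sum(n0i) + m * alpha)))
    list(count = as.integer(n1i + n0i), woe = as.numeric(woe), cutpoints = cuts)
  }
  # Merging bin i with bin i + 1 means dropping the cutpoint between them
  drop_cut <- function(cuts, i) cuts[-(i + 1)]
  repeat {
    s <- summarise_bins(cuts)
    nbins <- length(s$count)
    if (nbins <= 2) break
    # Step 2: merge the smallest bin while it falls below the frequency cutoff
    if (min(s$count) / length(feature) < bin_cutoff) {
      i <- which.min(s$count)
      cuts <- drop_cut(cuts, min(i, nbins - 1))
      next
    }
    # Step 3: merge the first adjacent pair that breaks the dominant WoE trend
    d     <- diff(s$woe)
    trend <- sign(sum(d))
    viol  <- which(sign(d) == -trend)
    if (trend != 0 && length(viol) > 0) {
      cuts <- drop_cut(cuts, viol[1])
      next
    }
    # Step 4: if still above max_bins, merge the pair with the most similar WoE
    if (nbins > max_bins) {
      cuts <- drop_cut(cuts, which.min(abs(d)))
      next
    }
    break
  }
  # Step 5: final counts and WoE on the merged bins
  summarise_bins(cuts)
}
# Quick demonstration on simulated data
set.seed(123)
x <- rnorm(500)
y <- rbinom(500, 1, plogis(x))
str(mob_sketch(x, y))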
References
Bellotti, T. & Crook, J. (2009). "Credit Scoring with Macroeconomic Variables Using Survival Analysis." Journal of the Operational Research Society, 60(12), 1699-1707.
Hand, D.J. & Adams, N.M. (2000). "Defining attributes for scorecard construction in credit scoring." Journal of Applied Statistics, 27(5), 527-540.
Thomas, L.C. (2009). "Consumer Credit Models: Pricing, Profit, and Portfolios." Oxford University Press.
Good, I.J. (1952). "Rational Decisions." Journal of the Royal Statistical Society, Series B, 14, 107-114. (Origin of Laplace smoothing/additive smoothing)
Examples
if (FALSE) { # \dontrun{
# Basic usage
set.seed(42)
feature <- rnorm(1000)
target <- rbinom(1000, 1, plogis(0.5 * feature))
result <- optimal_binning_numerical_mob(target, feature)
print(result)
# Advanced usage with custom parameters
result2 <- optimal_binning_numerical_mob(
target = target,
feature = feature,
min_bins = 2,
max_bins = 10,
bin_cutoff = 0.03,
laplace_smoothing = 0.1
)
# Plot Weight of Evidence by bin
plot(result2$woe, type = "b", xlab = "Bin", ylab = "WoE",
main = "Weight of Evidence by Bin")
abline(h = 0, lty = 2)
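# Assemble the per-bin vectors from the result into one table for inspection
# (each element listed under 'Value' above holds one entry per bin)
bin_table <- data.frame(
  id = result2$id,
  bin = result2$bin,
  count = result2$count,
  event_rate = result2$event_rate,
  woe = result2$woe,
  iv = result2$iv
)
print(bin_table)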
} # }