Optimal Binning for Numerical Features using Monotonic Optimal Binning (MOB)
optimal_binning_numerical_mob.Rd
Implements the Monotonic Optimal Binning (MOB) algorithm for discretizing numerical features while maintaining monotonicity in the Weight of Evidence (WoE) values. This is particularly useful for credit scoring and risk modeling applications where monotonicity is often a desirable property for interpretability and regulatory compliance.
Usage
optimal_binning_numerical_mob(
target,
feature,
min_bins = 3L,
max_bins = 5L,
bin_cutoff = 0.05,
max_n_prebins = 20L,
convergence_threshold = 1e-06,
max_iterations = 1000L,
laplace_smoothing = 0.5
)
Arguments
- target
An integer vector of binary target values (0 or 1)
- feature
A numeric vector of feature values to be binned
- min_bins
Minimum number of bins to create (default: 3)
- max_bins
Maximum number of bins to create (default: 5)
- bin_cutoff
Minimum proportion of total observations required in a bin (default: 0.05)
- max_n_prebins
Maximum number of pre-bins created in the initial equal-frequency split (default: 20)
- convergence_threshold
Threshold for convergence in the iterative process (default: 1e-6)
- max_iterations
Maximum number of iterations for the binning process (default: 1000)
- laplace_smoothing
Smoothing parameter for WoE calculation (default: 0.5)
Value
A list containing the following elements:
- id
Bin identifiers (1-based)
- bin
A character vector of bin labels showing the intervals
- woe
A numeric vector of Weight of Evidence values for each bin
- iv
A numeric vector of Information Value for each bin
- count
An integer vector of total count of observations in each bin
- count_pos
An integer vector of count of positive class observations in each bin
- count_neg
An integer vector of count of negative class observations in each bin
- event_rate
A numeric vector with the proportion of positive cases in each bin
- cutpoints
A numeric vector of cutpoints used to create the bins
- total_iv
The total Information Value of all bins combined
- converged
A logical value indicating whether the algorithm converged
- iterations
An integer value indicating the number of iterations run
Details
Mathematical Framework:
Weight of Evidence (WoE): For a bin \(i\) with Laplace smoothing parameter \(\alpha\):
$$WoE_i = \ln\left(\frac{n_{1i} + \alpha}{n_{1} + m\alpha} \cdot \frac{n_{0} + m\alpha}{n_{0i} + \alpha}\right)$$
Where:
\(n_{1i}\) is the count of positive cases in bin \(i\)
\(n_{0i}\) is the count of negative cases in bin \(i\)
\(n_{1}\) is the total count of positive cases
\(n_{0}\) is the total count of negative cases
\(m\) is the number of bins
\(\alpha\) is the Laplace smoothing parameter
Information Value (IV): Summarizes predictive power across all bins, using the same notation as above: $$IV = \sum_{i=1}^{m} \left(\frac{n_{1i}}{n_{1}} - \frac{n_{0i}}{n_{0}}\right) \times WoE_i$$
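For concreteness, the two formulas above can be reproduced by hand in R for a small set of made-up bin counts. The counts and variable names below are purely illustrative; the package computes these quantities internally.
# Hypothetical counts for m = 3 bins (illustrative numbers only)
n1i <- c(10, 40, 80)    # positives per bin
n0i <- c(90, 60, 20)    # negatives per bin
alpha <- 0.5            # Laplace smoothing parameter
m  <- length(n1i)
n1 <- sum(n1i)
n0 <- sum(n0i)
# Smoothed Weight of Evidence per bin
woe <- log(((n1i + alpha) / (n1 + m * alpha)) /
           ((n0i + alpha) / (n0 + m * alpha)))
# Information Value: difference in class distributions weighted by WoE
iv_bin   <- (n1i / n1 - n0i / n0) * woe
total_iv <- sum(iv_bin)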
Monotonicity: The algorithm ensures that WoE values either consistently increase or decrease as the feature value increases, which aligns with business expectations that risk should change monotonically with certain features.
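A minimal check of this property on a returned WoE vector (a helper written here for illustration, not part of the package API) could look like:
# Returns TRUE if the WoE values are non-decreasing or non-increasing
is_monotone <- function(woe) {
  d <- diff(woe)
  all(d >= 0) || all(d <= 0)
}
is_monotone(c(-0.8, -0.2, 0.3, 0.9))  # TRUE: strictly increasing
is_monotone(c(-0.5, 0.4, 0.1))        # FALSE: violates monotonicity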
Algorithm Steps (a simplified sketch in plain R follows this list):
Create Initial Pre-bins: Divide the feature into equal-frequency bins
Merge Rare Bins: Combine bins with frequency below the threshold
Enforce Monotonicity: Identify and merge bins that violate monotonicity
Optimize Bin Count: Adjust number of bins to stay within min/max constraints
Calculate Metrics: Compute final WoE and IV values with Laplace smoothing
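The sketch below re-implements these five steps in plain R under simplifying assumptions (no min_bins enforcement, no convergence threshold, naive handling of ties and empty bins). It is meant only to illustrate the flow, not to reproduce the package's implementation.
mob_sketch <- function(feature, target, max_bins = 5, bin_cutoff = 0.05,
                       max_n_prebins = 20, alpha = 0.5) {
  # Step 1: equal-frequency pre-bins from quantile cutpoints
  cuts <- unique(quantile(feature,
                          probs = seq(0, 1, length.out = max_n_prebins + 1),
                          na.rm = TRUE))
  # Bin the feature on the current cutpoints and compute counts and smoothed WoE
  summarise_bins <- function(cuts) {
    b   <- cut(feature, breaks = cuts, include.lowest = TRUE)
    n1i <- tapply(target == 1, b, sum)
    n0i <- tapply(target == 0, b, sum)
    m   <- length(n1i)
    woe <- log(((n1i + alpha) / (sum(n1i) + m * alpha)) /
               ((n0i + alpha) / (sum(n0i) + m * alpha)))
    list(count = as.integer(n1i + n0i), woe = as.numeric(woe), cutpoints = cuts)
  }
  # Merging bin i with bin i + 1 means dropping the cutpoint between them
  drop_cut <- function(cuts, i) cuts[-(i + 1)]
  repeat {
    s <- summarise_bins(cuts)
    nbins <- length(s$count)
    if (nbins <= 2) break
    # Step 2: merge the smallest bin while it falls below the frequency cutoff
    if (min(s$count) / length(feature) < bin_cutoff) {
      i <- which.min(s$count)
      cuts <- drop_cut(cuts, min(i, nbins - 1))
      next
    }
    # Step 3: merge the first adjacent pair that breaks the dominant WoE trend
    d     <- diff(s$woe)
    trend <- sign(sum(d))
    viol  <- which(sign(d) == -trend)
    if (trend != 0 && length(viol) > 0) {
      cuts <- drop_cut(cuts, viol[1])
      next
    }
    # Step 4: if still above max_bins, merge the pair with the most similar WoE
    if (nbins > max_bins) {
      cuts <- drop_cut(cuts, which.min(abs(d)))
      next
    }
    break
  }
  # Step 5: final counts and WoE on the merged bins
  summarise_bins(cuts)
}
# Quick demonstration on simulated data
set.seed(123)
x <- rnorm(500)
y <- rbinom(500, 1, plogis(x))
str(mob_sketch(x, y))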
References
Bellotti, T. & Crook, J. (2009). "Credit Scoring with Macroeconomic Variables Using Survival Analysis." Journal of the Operational Research Society, 60(12), 1699-1707.
Hand, D.J. & Adams, N.M. (2000). "Defining attributes for scorecard construction in credit scoring." Journal of Applied Statistics, 27(5), 527-540.
Thomas, L.C. (2009). "Consumer Credit Models: Pricing, Profit, and Portfolios." Oxford University Press.
Good, I.J. (1952). "Rational Decisions." Journal of the Royal Statistical Society, Series B, 14, 107-114. (Origin of Laplace smoothing/additive smoothing)
Examples
if (FALSE) { # \dontrun{
# Basic usage
set.seed(42)
feature <- rnorm(1000)
target <- rbinom(1000, 1, plogis(0.5 * feature))
result <- optimal_binning_numerical_mob(target, feature)
print(result)
# Advanced usage with custom parameters
result2 <- optimal_binning_numerical_mob(
target = target,
feature = feature,
min_bins = 2,
max_bins = 10,
bin_cutoff = 0.03,
laplace_smoothing = 0.1
)
# Plot Weight of Evidence by bin
plot(result2$woe, type = "b", xlab = "Bin", ylab = "WoE",
main = "Weight of Evidence by Bin")
abline(h = 0, lty = 2)
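# Assemble the per-bin vectors from the result into one table for inspection
# (each element listed under 'Value' above holds one entry per bin)
bin_table <- data.frame(
  id = result2$id,
  bin = result2$bin,
  count = result2$count,
  event_rate = result2$event_rate,
  woe = result2$woe,
  iv = result2$iv
)
print(bin_table)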
} # }