Optimal Binning for Numerical Variables using OSLP
optimal_binning_numerical_oslp.Rd
Performs optimal binning for numerical variables using the Optimal Supervised Learning Partitioning (OSLP) approach. This advanced binning algorithm creates bins that maximize predictive power while preserving interpretability through monotonic Weight of Evidence (WoE) values.
Usage
optimal_binning_numerical_oslp(
target,
feature,
min_bins = 3L,
max_bins = 5L,
bin_cutoff = 0.05,
max_n_prebins = 20L,
convergence_threshold = 1e-06,
max_iterations = 1000L,
laplace_smoothing = 0.5
)
Arguments
- target
A numeric vector of binary target values (0 or 1).
- feature
A numeric vector of feature values.
- min_bins
Minimum number of bins (default: 3, must be >= 2).
- max_bins
Maximum number of bins (default: 5, must be > min_bins).
- bin_cutoff
Minimum proportion of total observations for a bin to avoid being merged (default: 0.05, must be in (0, 1)).
- max_n_prebins
Maximum number of pre-bins before optimization (default: 20).
- convergence_threshold
Threshold for convergence (default: 1e-6).
- max_iterations
Maximum number of iterations (default: 1000).
- laplace_smoothing
Smoothing parameter for WoE calculation (default: 0.5).
Value
A list containing:
- id
Numeric vector of bin identifiers (1-based).
- bin
Character vector of bin labels.
- woe
Numeric vector of Weight of Evidence (WoE) values for each bin.
- iv
Numeric vector of Information Value (IV) for each bin.
- count
Integer vector of total count of observations in each bin.
- count_pos
Integer vector of positive class count in each bin.
- count_neg
Integer vector of negative class count in each bin.
- event_rate
Numeric vector of positive class rate in each bin.
- cutpoints
Numeric vector of cutpoints used to create the bins.
- total_iv
Numeric value of total Information Value across all bins.
- converged
Logical value indicating whether the algorithm converged.
- iterations
Integer value indicating the number of iterations run.
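The returned cutpoints and woe vectors are what you need to score new data. A minimal sketch (the cutpoint and WoE values below are made up for illustration, not output of the function):

```r
# Hypothetical fitted binning: 2 cutpoints define 3 bins
cutpoints <- c(-0.5, 0.5)          # as in result$cutpoints
woe       <- c(-0.8, 0.1, 0.7)     # as in result$woe, one value per bin

new_x <- c(-2, 0, 1)

# findInterval() returns 0 for values below the first cutpoint,
# so adding 1 gives a 1-based bin id matching the woe vector
bin_id  <- findInterval(new_x, cutpoints) + 1L
new_woe <- woe[bin_id]             # WoE-encoded feature: -0.8, 0.1, 0.7
```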
Details
Mathematical Framework:
Weight of Evidence (WoE): For a bin \(i\) with Laplace smoothing parameter \(\alpha\):
$$WoE_i = \ln\left(\frac{n_{1i} + \alpha}{n_{1} + m\alpha} \cdot \frac{n_{0} + m\alpha}{n_{0i} + \alpha}\right)$$
Where:
- \(n_{1i}\) is the count of positive cases in bin \(i\)
- \(n_{0i}\) is the count of negative cases in bin \(i\)
- \(n_{1}\) is the total count of positive cases
- \(n_{0}\) is the total count of negative cases
- \(m\) is the number of bins
- \(\alpha\) is the Laplace smoothing parameter
Information Value (IV): Summarizes predictive power across all bins:
$$IV = \sum_{i} \left(P(X|Y=1) - P(X|Y=0)\right) \times WoE_i$$
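The two formulas above can be sketched directly in base R. This is not the package's internal code; the bin counts are hypothetical, and the IV term uses the same Laplace-smoothed distributions as the WoE:

```r
# Hypothetical per-bin counts for a 3-bin solution
count_pos <- c(40, 30, 30)   # n_1i: positives per bin
count_neg <- c(10, 30, 60)   # n_0i: negatives per bin
alpha <- 0.5                 # laplace_smoothing
m <- length(count_pos)       # number of bins

# Smoothed class distributions: (n_ki + alpha) / (n_k + m * alpha)
dist_pos <- (count_pos + alpha) / (sum(count_pos) + m * alpha)
dist_neg <- (count_neg + alpha) / (sum(count_neg) + m * alpha)

woe <- log(dist_pos / dist_neg)          # WoE_i per bin
iv  <- sum((dist_pos - dist_neg) * woe)  # total IV
```

Note how the smoothing keeps woe finite even when a bin has zero positives or zero negatives, which is the point of the laplace_smoothing argument.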
Algorithm Steps:
1. Pre-binning: initial bins are created using a quantile-based approach.
2. Merge Small Bins: bins whose proportion of observations falls below bin_cutoff are merged with a neighbour.
3. Enforce Monotonicity: adjacent bins that violate WoE monotonicity are merged.
4. Optimize Bin Count: bins are merged while their number exceeds max_bins.
5. Calculate Metrics: final WoE, IV, and event rates are computed.
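Steps 1 and 2 can be illustrated with a rough sketch (illustrative only; the package's actual merging strategy may differ, e.g. in which neighbour it picks):

```r
set.seed(1)
feature <- rnorm(500)
max_n_prebins <- 20
bin_cutoff <- 0.05

# Step 1: quantile-based pre-bins
probs <- seq(0, 1, length.out = max_n_prebins + 1)
cuts  <- unique(quantile(feature, probs))
bins  <- cut(feature, breaks = cuts, include.lowest = TRUE)

# Step 2: merge any bin holding less than bin_cutoff of the observations
# by dropping one of its interior cutpoints, until all bins are big enough
prop <- table(bins) / length(feature)
while (any(prop < bin_cutoff) && length(cuts) > 3) {
  i    <- which.min(prop)        # smallest bin
  cuts <- cuts[-max(i, 2)]       # drop an interior cutpoint next to it
  bins <- cut(feature, breaks = cuts, include.lowest = TRUE)
  prop <- table(bins) / length(feature)
}
```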
References
Belcastro, L., Marozzo, F., Talia, D., & Trunfio, P. (2020). "Big Data Analytics." Handbook of Big Data Technologies. Springer.
Mironchyk, P., & Tchistiakov, V. (2017). "Monotone Optimal Binning Algorithm for Credit Risk Modeling." SSRN 2987720.
Good, I.J. (1952). "Rational Decisions." Journal of the Royal Statistical Society, Series B, 14, 107-114. (Origin of Laplace smoothing)
Thomas, L.C. (2009). "Consumer Credit Models: Pricing, Profit, and Portfolios." Oxford University Press.
Examples
if (FALSE) { # \dontrun{
# Sample data
set.seed(123)
n <- 1000
target <- sample(0:1, n, replace = TRUE)
feature <- rnorm(n)
# Perform optimal binning
result <- optimal_binning_numerical_oslp(target, feature,
min_bins = 2, max_bins = 4)
# Print results
print(result)
# Visualize WoE against bins
barplot(result$woe, names.arg = result$bin, las = 2,
main = "Weight of Evidence by Bin",
ylab = "WoE")
abline(h = 0, lty = 2)
} # }