
Optimal Binning for Numerical Features using Monotonic Risk Binning with Likelihood Ratio Pre-binning
Source:R/obn_mrblp.R
ob_numerical_mrblp.RdImplements a greedy binning algorithm with monotonicity enforcement and majority-vote direction detection. Important Note: Despite the "Likelihood Ratio Pre-binning" designation in the name, the current implementation uses equal-frequency pre-binning without likelihood ratio statistics. The algorithm is functionally a variant of Monotonic Optimal Binning (MOB) with minor differences in merge strategies.
This method is suitable for credit scoring applications requiring monotonic WoE patterns, but users should be aware that it does not employ the statistical rigor implied by "Likelihood Ratio" in the name.
Usage
ob_numerical_mrblp(
feature,
target,
min_bins = 3,
max_bins = 5,
bin_cutoff = 0.05,
max_n_prebins = 20,
convergence_threshold = 1e-06,
max_iterations = 1000,
laplace_smoothing = 0.5
)Arguments
- feature
Numeric vector of feature values to be binned. Missing values (NA) and infinite values are not permitted and will trigger an error (unlike other binning methods that issue warnings).
- target
Integer vector of binary target values (must contain only 0 and 1). Must have the same length as
feature.- min_bins
Minimum number of bins to generate (default: 3). Must be at least 1. Acts as a hard constraint during monotonicity enforcement.
- max_bins
Maximum number of bins to generate (default: 5). Must be greater than or equal to
min_bins.- bin_cutoff
Minimum fraction of total observations required in each bin (default: 0.05). Bins with frequency below this threshold are merged. Must be in the range (0, 1).
- max_n_prebins
Maximum number of pre-bins before optimization (default: 20). Must be at least equal to
min_bins.- convergence_threshold
Convergence threshold (default: 1e-6). Currently used to check if WoE range is below threshold; primary stopping criterion is
max_iterations.- max_iterations
Maximum number of iterations for bin merging and monotonicity enforcement (default: 1000). Prevents infinite loops.
- laplace_smoothing
Laplace smoothing parameter for WoE calculation (default: 0.5). Must be non-negative.
Value
A list containing:
- id
Integer vector of bin identifiers (1-based indexing).
- bin
Character vector of bin intervals in the format
"[lower;upper)".- woe
Numeric vector of Weight of Evidence values. Guaranteed to be monotonic.
- iv
Numeric vector of Information Value contributions per bin.
- count
Integer vector of total observations per bin.
- count_pos
Integer vector of positive class counts per bin.
- count_neg
Integer vector of negative class counts per bin.
- event_rate
Numeric vector of event rates per bin.
- cutpoints
Numeric vector of bin boundaries (excluding -Inf and +Inf).
- total_iv
Total Information Value (sum of bin IVs).
- converged
Logical flag indicating convergence within
max_iterations.- iterations
Integer count of iterations performed.
Details
Algorithm Overview
The MRBLP algorithm executes in five phases:
Phase 1: Equal-Frequency Pre-binning
Initial bins are created by dividing sorted data into approximately equal-sized groups:
$$n_{\text{bin}} = \max\left(1, \left\lfloor \frac{N}{\text{max\_n\_prebins}} \right\rfloor\right)$$
Note: Despite "Likelihood Ratio Pre-binning" in the name, no likelihood ratio statistics are computed. A true likelihood ratio approach would compute:
$$\text{LR}(c) = \prod_{x \le c} \frac{P(x|y=1)}{P(x|y=0)} \times \prod_{x > c} \frac{P(x|y=1)}{P(x|y=0)}$$
and select cutpoints \(c\) that maximize \(|\log \text{LR}(c)|\). This is not implemented in the current version.
Phase 2: Rare Bin Merging
Bins with total count below bin_cutoff \(\times N\) are merged. The
merge direction (left or right) is chosen to minimize IV loss:
$$\text{direction} = \arg\min_{d \in \{\text{left}, \text{right}\}} \left( \text{IV}_i + \text{IV}_{i+d} - \text{IV}_{\text{merged}} \right)$$
Phase 3: Initial WoE/IV Calculation
Weight of Evidence for bin \(i\):
$$\text{WoE}_i = \ln\left(\frac{n_i^{+} + \alpha}{n^{+} + k\alpha} \bigg/ \frac{n_i^{-} + \alpha}{n^{-} + k\alpha}\right)$$
where \(\alpha = \text{laplace\_smoothing}\) and \(k\) is the number of bins.
Phase 4: Monotonicity Enforcement
The algorithm determines the desired monotonicity direction via majority vote:
$$\text{increasing} = \begin{cases} \text{TRUE} & \text{if } \#\{\text{WoE}_i > \text{WoE}_{i-1}\} \ge \#\{\text{WoE}_i < \text{WoE}_{i-1}\} \\ \text{FALSE} & \text{otherwise} \end{cases}$$
This differs from:
MOB: Uses first two bins only (
WoE[1] >= WoE[0])MBLP: Uses Pearson correlation between bin indices and WoE
Violations are detected as: $$\text{violation} = \begin{cases} \text{WoE}_i < \text{WoE}_{i-1} & \text{if increasing} \\ \text{WoE}_i > \text{WoE}_{i-1} & \text{if decreasing} \end{cases}$$
Violating bins are merged iteratively until monotonicity is achieved or
min_bins is reached.
Phase 5: Bin Count Reduction
If the number of bins exceeds max_bins, the algorithm merges bins with
the smallest absolute IV difference:
$$\text{merge\_idx} = \arg\min_{i=0}^{k-2} |\text{IV}_i - \text{IV}_{i+1}|$$
Critique: This criterion assumes bins with similar IVs are redundant, which is not theoretically justified. A more rigorous approach (used in MBLP) minimizes IV loss after merge:
$$\Delta \text{IV} = \text{IV}_i + \text{IV}_{i+1} - \text{IV}_{\text{merged}}$$
Theoretical Foundations
Monotonicity Enforcement: Based on Zeng (2014), ensuring stability under data distribution shifts.
Likelihood Ratio (Theoretical): Neyman-Pearson lemma establishes likelihood ratio as the optimal test statistic for hypothesis testing. For binning, cutpoints maximizing LR would theoretically yield optimal class separation. However, this is not implemented.
Practical Equivalence: The algorithm is functionally equivalent to MOB with minor differences in direction detection and merge strategies.
Comparison with Related Methods
| Method | Pre-binning | Direction Detection | Merge Criterion |
| MRBLP | Equal-frequency | Majority vote | Min IV difference |
| MOB | Equal-frequency | First two bins | Min IV loss |
| MBLP | Quantile-based | Pearson correlation | Min IV loss |
| MDLP | Equal-frequency | N/A (optional) | MDL cost |
Computational Complexity
Identical to MOB: \(O(N \log N + k^2 \times \text{max\_iterations})\)
When to Use MRBLP vs Alternatives
Use MRBLP: If you specifically need majority-vote direction detection and can tolerate the non-standard merge criterion.
Use MOB: For simplicity and slightly faster direction detection.
Use MBLP: For more robust direction detection via correlation.
Use MDLP: For information-theoretic optimality without mandatory monotonicity.
References
Neyman, J., & Pearson, E. S. (1933). "On the Problem of the Most Efficient Tests of Statistical Hypotheses". Philosophical Transactions of the Royal Society A, 231(694-706), 289-337. [Theoretical foundation for likelihood ratio, not implemented in code]
Mironchyk, P., & Tchistiakov, V. (2017). "Monotone optimal binning algorithm for credit risk modeling". Frontiers in Applied Mathematics and Statistics, 3, 2.
Zeng, G. (2014). "A Necessary Condition for a Good Binning Algorithm in Credit Scoring". Applied Mathematical Sciences, 8(65), 3229-3242.
Siddiqi, N. (2006). Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring. Wiley.
Anderson, R. (2007). The Credit Scoring Toolkit: Theory and Practice for Retail Credit Risk Management and Decision Automation. Oxford University Press.
Hosmer, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied Logistic Regression (3rd ed.). Wiley.
See also
ob_numerical_mob for the base monotonic binning algorithm,
ob_numerical_mblp for correlation-based direction detection,
ob_numerical_mdlp for information-theoretic binning.
Examples
# \donttest{
# Simulate credit scoring data
set.seed(2024)
n <- 10000
feature <- c(
rnorm(4000, mean = 620, sd = 50),
rnorm(4000, mean = 690, sd = 45),
rnorm(2000, mean = 740, sd = 35)
)
target <- c(
rbinom(4000, 1, 0.20),
rbinom(4000, 1, 0.10),
rbinom(2000, 1, 0.04)
)
# Apply MRBLP
result <- ob_numerical_mrblp(
feature = feature,
target = target,
min_bins = 3,
max_bins = 5
)
# Compare with MOB (should be very similar)
result_mob <- ob_numerical_mob(
feature = feature,
target = target,
min_bins = 3,
max_bins = 5
)
# Compare results
data.frame(
Method = c("MRBLP", "MOB"),
N_Bins = c(length(result$woe), length(result_mob$woe)),
Total_IV = c(result$total_iv, result_mob$total_iv),
Iterations = c(result$iterations, result_mob$iterations)
)
#> Method N_Bins Total_IV Iterations
#> 1 MRBLP 5 0.1654134 16
#> 2 MOB 5 0.1744501 15
# }