
Implements a binning algorithm for numerical features that enforces a monotonic Weight of Evidence (WoE) profile while maximizing predictive power. The method formulates binning as a constrained optimization problem with monotonicity constraints and solves it through an iterative merging process that preserves Information Value.

Usage

optimal_binning_numerical_mblp(
  target,
  feature,
  min_bins = 3L,
  max_bins = 5L,
  bin_cutoff = 0.05,
  max_n_prebins = 20L,
  force_monotonic_direction = 0L,
  convergence_threshold = 1e-06,
  max_iterations = 1000L
)

Arguments

target

An integer binary vector (0 or 1) representing the target variable.

feature

A numeric vector representing the feature to bin.

min_bins

Minimum number of bins (default: 3).

max_bins

Maximum number of bins (default: 5).

bin_cutoff

Minimum frequency fraction for each bin (default: 0.05).

max_n_prebins

Maximum number of pre-bins before optimization (default: 20).

force_monotonic_direction

Force specific monotonicity direction: 0=auto, 1=increasing, -1=decreasing (default: 0).

convergence_threshold

Convergence threshold for optimization (default: 1e-6).

max_iterations

Maximum iterations allowed (default: 1000).

Value

A list containing:

id

Numeric identifiers for each bin (1-based).

bin

Character vector with bin intervals.

woe

Numeric vector with Weight of Evidence values for each bin.

iv

Numeric vector with Information Value contribution for each bin.

count

Integer vector with the total number of observations in each bin.

count_pos

Integer vector with the positive class count in each bin.

count_neg

Integer vector with the negative class count in each bin.

event_rate

Numeric vector with the event rate (proportion of positives) in each bin.

cutpoints

Numeric vector with the bin boundaries (excluding infinities).

converged

Logical indicating whether the algorithm converged.

iterations

Integer count of iterations performed.

total_iv

Numeric total Information Value of the binning solution.

monotonicity

Character indicating monotonicity direction ("increasing", "decreasing", or "none").
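All per-bin elements are parallel vectors of equal length, so they can be assembled into a single summary table. A minimal sketch using only the fields documented above (as_bin_table is an illustrative helper, not a package function):

# Combine the per-bin return vectors into one data frame
as_bin_table <- function(result) {
  data.frame(id = result$id, bin = result$bin,
             count = result$count, count_pos = result$count_pos,
             count_neg = result$count_neg, event_rate = result$event_rate,
             woe = result$woe, iv = result$iv)
}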

Details

Algorithm Overview

The Monotonic Binning via Linear Programming algorithm operates through several coordinated steps:

  1. Input Validation: Ensures proper formatting and constraints for data and parameters.

  2. Pre-Binning: Creates initial bins from feature quantiles, with special handling for features that have only a few unique values (see the sketch after this list).

  3. Statistical Optimization:

    • Merges bins with frequencies below bin_cutoff to ensure statistical stability

    • Enforces monotonicity in Weight of Evidence (WoE) values

    • Optimizes bin count to satisfy the min_bins/max_bins constraints

    • Iteratively improves binning to maximize Information Value (IV)

  4. Monotonicity Analysis: Automatically detects optimal monotonicity direction or applies a forced direction if specified.
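
As a rough illustration of steps 2 and 3, the sketch below builds quantile pre-bins and then merges bins that fall under the frequency cutoff. sketch_prebin_merge is an illustrative helper, not part of the package; the actual routine additionally enforces WoE monotonicity and chooses merges that minimize information loss.

sketch_prebin_merge <- function(feature, max_n_prebins = 20, bin_cutoff = 0.05) {
  # Step 2: quantile-based pre-bins (duplicate breaks removed for skewed data)
  probs <- seq(0, 1, length.out = max_n_prebins + 1)
  breaks <- unique(quantile(feature, probs, na.rm = TRUE))
  bins <- cut(feature, breaks, include.lowest = TRUE)

  # Step 3 (partial): repeatedly merge any bin below the frequency cutoff
  repeat {
    freq <- table(bins) / length(bins)
    small <- which(freq < bin_cutoff)
    if (length(small) == 0 || nlevels(bins) <= 2) break
    i <- small[1]
    j <- if (i == nlevels(bins)) i - 1 else i + 1  # merge with a neighbour
    lev <- levels(bins)
    lev[c(i, j)] <- paste(lev[i], lev[j], sep = " + ")
    levels(bins) <- lev  # duplicated level names collapse into one level
  }
  bins
}

table(sketch_prebin_merge(rnorm(500)))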

Mathematical Foundation

Linear Programming Connection

The binning optimization problem can be formulated as a constrained optimization problem:

$$\max_{b} \sum_{i=1}^{k} IV_i(b)$$

Subject to:

$$WoE_i \leq WoE_{i+1} \quad \text{or} \quad WoE_i \geq WoE_{i+1} \quad \forall i \in \{1, \ldots, k-1\}$$

$$min\_bins \leq k \leq max\_bins$$

$$count_i \geq bin\_cutoff \times N_{total} \quad \forall i \in \{1, \ldots, k\}$$

Where:

  • \(b\) is the set of bin boundaries

  • \(k\) is the number of bins

  • \(IV_i(b)\) is the Information Value of bin \(i\) given boundaries \(b\)

  • \(WoE_i\) is the Weight of Evidence of bin \(i\)

  • \(N_{total}\) is the total number of observations (distinct from \(N\), the count of negatives used in the WoE formula below)
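
A returned solution can be checked against these constraints directly from the documented output fields. A minimal sketch (check_constraints is an illustrative helper, not a package function):

# Verify WoE monotonicity and the minimum bin-size constraint
check_constraints <- function(result, n_obs, bin_cutoff = 0.05) {
  c(monotone_woe = all(diff(result$woe) >= 0) || all(diff(result$woe) <= 0),
    bin_size_ok  = all(result$count >= bin_cutoff * n_obs))
}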

Weight of Evidence (WoE)

For bin \(i\), with Laplace smoothing:

$$WoE_i = \ln\left(\frac{(p_i + \alpha) / (P + k\alpha)}{(n_i + \alpha) / (N + k\alpha)}\right)$$

Where:

  • \(p_i\): Number of positive cases in bin \(i\)

  • \(P\): Total number of positive cases

  • \(n_i\): Number of negative cases in bin \(i\)

  • \(N\): Total number of negative cases

  • \(\alpha\): Smoothing factor (0.5 in this implementation)

  • \(k\): Number of bins

Information Value (IV)

For bin \(i\):

$$IV_i = \left(\frac{p_i}{P} - \frac{n_i}{N}\right) \times WoE_i$$

Total Information Value:

$$IV_{total} = \sum_{i=1}^{k} IV_i$$
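
Both formulas translate directly into R. A self-contained sketch with alpha = 0.5, matching the smoothing factor stated above (woe_iv is illustrative, not part of the package):

# Laplace-smoothed WoE and per-bin IV from positive/negative counts
woe_iv <- function(pos, neg, alpha = 0.5) {
  k <- length(pos)              # number of bins
  P <- sum(pos); N <- sum(neg)  # total positives and negatives
  woe <- log(((pos + alpha) / (P + k * alpha)) /
             ((neg + alpha) / (N + k * alpha)))
  iv <- (pos / P - neg / N) * woe  # IV uses the unsmoothed proportions
  list(woe = woe, iv = iv, total_iv = sum(iv))
}

woe_iv(pos = c(10, 30, 60), neg = c(90, 70, 40))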

Advantages

  • Guaranteed Monotonicity: Ensures monotonic relationship between binned variable and target

  • Optimal Information Preservation: Merges bins in a way that minimizes information loss

  • Flexible Direction Control: Automatically detects optimal monotonicity direction or allows forcing a specific direction

  • Statistical Stability: Ensures sufficient observations in each bin

  • Efficient Implementation: Uses binary search and optimized merge strategies

References

Zeng, Y. (2018). Discretization of Continuous Features by Weight of Evidence with Isotonic Regression. arXiv preprint arXiv:1812.05089.

Barlow, R. E., & Brunk, H. D. (1972). The isotonic regression problem and its dual. Journal of the American Statistical Association, 67(337), 140-147.

Belkin, M., & Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6), 1373-1396.

Bertsimas, D., & Tsitsiklis, J. N. (1997). Introduction to Linear Optimization. Athena Scientific.

Siddiqi, N. (2006). Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring. John Wiley & Sons.

Thomas, L. C., Edelman, D. B., & Crook, J. N. (2002). Credit Scoring and Its Applications. Society for Industrial and Applied Mathematics.

Examples

# Generate synthetic data
set.seed(123)
feature <- rnorm(1000)
# Event probability increases with the feature so the binning has signal to find
target <- rbinom(1000, 1, plogis(0.75 * feature - 0.85))

# Basic usage
result <- optimal_binning_numerical_mblp(target, feature)
print(result)

# Custom parameters with forced increasing monotonicity
result_custom <- optimal_binning_numerical_mblp(
  target = target,
  feature = feature,
  min_bins = 3,
  max_bins = 6,
  force_monotonic_direction = 1  # Force increasing
)

# Access specific components
bins <- result$bin
woe_values <- result$woe
total_iv <- result$total_iv
monotonicity <- result$monotonicity
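
# Apply the fitted binning to new data (illustrative helper, not a package
# function): map each value to the WoE of the bin it falls into.
apply_woe <- function(res, x) {
  # findInterval() returns 0 for values below the first cutpoint, hence + 1
  res$woe[findInterval(x, res$cutpoints) + 1]
}
woe_new <- apply_woe(result, rnorm(10))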