
Implements a binning algorithm for numerical features that enforces a monotonic Weight of Evidence (WoE) profile while maximizing predictive power. The method formulates binning as a constrained optimization problem with monotonicity constraints and solves it through an iterative merging process that preserves Information Value.

Usage

optimal_binning_numerical_mblp(
  target,
  feature,
  min_bins = 3L,
  max_bins = 5L,
  bin_cutoff = 0.05,
  max_n_prebins = 20L,
  force_monotonic_direction = 0L,
  convergence_threshold = 1e-06,
  max_iterations = 1000L
)

Arguments

target

An integer binary vector (0 or 1) representing the target variable.

feature

A numeric vector representing the feature to bin.

min_bins

Minimum number of bins (default: 3).

max_bins

Maximum number of bins (default: 5).

bin_cutoff

Minimum frequency fraction for each bin (default: 0.05).

max_n_prebins

Maximum number of pre-bins before optimization (default: 20).

force_monotonic_direction

Force specific monotonicity direction: 0=auto, 1=increasing, -1=decreasing (default: 0).

convergence_threshold

Convergence threshold for optimization (default: 1e-6).

max_iterations

Maximum iterations allowed (default: 1000).

Value

A list containing:

id

Numeric identifiers for each bin (1-based).

bin

Character vector with bin intervals.

woe

Numeric vector with Weight of Evidence values for each bin.

iv

Numeric vector with Information Value contribution for each bin.

count

Integer vector with the total number of observations in each bin.

count_pos

Integer vector with the positive class count in each bin.

count_neg

Integer vector with the negative class count in each bin.

event_rate

Numeric vector with the event rate (proportion of positives) in each bin.

cutpoints

Numeric vector with the bin boundaries (excluding infinities).

converged

Logical indicating whether the algorithm converged.

iterations

Integer count of iterations performed.

total_iv

Numeric total Information Value of the binning solution.

monotonicity

Character indicating monotonicity direction ("increasing", "decreasing", or "none").
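All per-bin elements are parallel vectors of equal length, so they can be assembled into a single summary table. A minimal sketch using only the fields documented above (as_bin_table is an illustrative helper, not a package function):

# Combine the per-bin return vectors into one data frame
as_bin_table <- function(result) {
  data.frame(id = result$id, bin = result$bin,
             count = result$count, count_pos = result$count_pos,
             count_neg = result$count_neg, event_rate = result$event_rate,
             woe = result$woe, iv = result$iv)
}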

Details

Algorithm Overview

The Monotonic Binning via Linear Programming algorithm operates through several coordinated steps:

  1. Input Validation: Ensures proper formatting and constraints for data and parameters.

  2. Pre-Binning: Creates initial bins from feature quantiles, with special handling for features that have only a few unique values (see the sketch after this list).

  3. Statistical Optimization:

    • Merges bins with frequencies below bin_cutoff to ensure statistical stability

    • Enforces monotonicity in Weight of Evidence (WoE) values

    • Optimizes bin count to satisfy the min_bins/max_bins constraints

    • Iteratively improves binning to maximize Information Value (IV)

  4. Monotonicity Analysis: Automatically detects optimal monotonicity direction or applies a forced direction if specified.
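
As a rough illustration of steps 2 and 3, the sketch below builds quantile pre-bins and then merges bins that fall under the frequency cutoff. sketch_prebin_merge is an illustrative helper, not part of the package; the actual routine additionally enforces WoE monotonicity and chooses merges that minimize information loss.

sketch_prebin_merge <- function(feature, max_n_prebins = 20, bin_cutoff = 0.05) {
  # Step 2: quantile-based pre-bins (duplicate breaks removed for skewed data)
  probs <- seq(0, 1, length.out = max_n_prebins + 1)
  breaks <- unique(quantile(feature, probs, na.rm = TRUE))
  bins <- cut(feature, breaks, include.lowest = TRUE)

  # Step 3 (partial): repeatedly merge any bin below the frequency cutoff
  repeat {
    freq <- table(bins) / length(bins)
    small <- which(freq < bin_cutoff)
    if (length(small) == 0 || nlevels(bins) <= 2) break
    i <- small[1]
    j <- if (i == nlevels(bins)) i - 1 else i + 1  # merge with a neighbour
    lev <- levels(bins)
    lev[c(i, j)] <- paste(lev[i], lev[j], sep = " + ")
    levels(bins) <- lev  # duplicated level names collapse into one level
  }
  bins
}

table(sketch_prebin_merge(rnorm(500)))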

Mathematical Foundation

Linear Programming Connection

The binning optimization problem can be formulated as a constrained optimization problem:

$$\max_{b} \sum_{i=1}^{k} IV_i(b)$$

Subject to:

$$WoE_i \leq WoE_{i+1} \quad \text{or} \quad WoE_i \geq WoE_{i+1} \quad \forall i \in \{1, \ldots, k-1\}$$

$$min\_bins \leq k \leq max\_bins$$

$$count_i \geq bin\_cutoff \times N_{total} \quad \forall i \in \{1, \ldots, k\}$$

Where:

  • \(b\) is the set of bin boundaries

  • \(k\) is the number of bins

  • \(IV_i(b)\) is the Information Value of bin \(i\) given boundaries \(b\)

  • \(WoE_i\) is the Weight of Evidence of bin \(i\)

  • \(N_{total}\) is the total number of observations (distinct from \(N\), the count of negatives used in the WoE formula below)
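
A returned solution can be checked against these constraints directly from the documented output fields. A minimal sketch (check_constraints is an illustrative helper, not a package function):

# Verify WoE monotonicity and the minimum bin-size constraint
check_constraints <- function(result, n_obs, bin_cutoff = 0.05) {
  c(monotone_woe = all(diff(result$woe) >= 0) || all(diff(result$woe) <= 0),
    bin_size_ok  = all(result$count >= bin_cutoff * n_obs))
}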

Weight of Evidence (WoE)

For bin \(i\), with Laplace smoothing:

$$WoE_i = \ln\left(\frac{(p_i + \alpha) / (P + k\alpha)}{(n_i + \alpha) / (N + k\alpha)}\right)$$

Where:

  • \(p_i\): Number of positive cases in bin \(i\)

  • \(P\): Total number of positive cases

  • \(n_i\): Number of negative cases in bin \(i\)

  • \(N\): Total number of negative cases

  • \(\alpha\): Smoothing factor (0.5 in this implementation)

  • \(k\): Number of bins

Information Value (IV)

For bin \(i\):

$$IV_i = \left(\frac{p_i}{P} - \frac{n_i}{N}\right) \times WoE_i$$

Total Information Value:

$$IV_{total} = \sum_{i=1}^{k} IV_i$$
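
Both formulas translate directly into R. A self-contained sketch with alpha = 0.5, matching the smoothing factor stated above (woe_iv is illustrative, not part of the package):

# Laplace-smoothed WoE and per-bin IV from positive/negative counts
woe_iv <- function(pos, neg, alpha = 0.5) {
  k <- length(pos)              # number of bins
  P <- sum(pos); N <- sum(neg)  # total positives and negatives
  woe <- log(((pos + alpha) / (P + k * alpha)) /
             ((neg + alpha) / (N + k * alpha)))
  iv <- (pos / P - neg / N) * woe  # IV uses the unsmoothed proportions
  list(woe = woe, iv = iv, total_iv = sum(iv))
}

woe_iv(pos = c(10, 30, 60), neg = c(90, 70, 40))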

Advantages

  • Guaranteed Monotonicity: Ensures monotonic relationship between binned variable and target

  • Optimal Information Preservation: Merges bins in a way that minimizes information loss

  • Flexible Direction Control: Automatically detects optimal monotonicity direction or allows forcing a specific direction

  • Statistical Stability: Ensures sufficient observations in each bin

  • Efficient Implementation: Uses binary search and optimized merge strategies

References

Zeng, Y. (2018). Discretization of Continuous Features by Weight of Evidence with Isotonic Regression. arXiv preprint arXiv:1812.05089.

Barlow, R. E., & Brunk, H. D. (1972). The isotonic regression problem and its dual. Journal of the American Statistical Association, 67(337), 140-147.

Belkin, M., & Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6), 1373-1396.

Bertsimas, D., & Tsitsiklis, J. N. (1997). Introduction to Linear Optimization. Athena Scientific.

Siddiqi, N. (2006). Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring. John Wiley & Sons.

Thomas, L. C., Edelman, D. B., & Crook, J. N. (2002). Credit Scoring and Its Applications. Society for Industrial and Applied Mathematics.

Examples

# Generate synthetic data
set.seed(123)
feature <- rnorm(1000)
# Event probability increases with the feature so the binning has signal to find
target <- rbinom(1000, 1, plogis(0.75 * feature - 0.85))

# Basic usage
result <- optimal_binning_numerical_mblp(target, feature)
print(result)

# Custom parameters with forced increasing monotonicity
result_custom <- optimal_binning_numerical_mblp(
  target = target,
  feature = feature,
  min_bins = 3,
  max_bins = 6,
  force_monotonic_direction = 1  # Force increasing
)

# Access specific components
bins <- result$bin
woe_values <- result$woe
total_iv <- result$total_iv
monotonicity <- result$monotonicity
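
# Apply the fitted binning to new data (illustrative helper, not a package
# function): map each value to the WoE of the bin it falls into.
apply_woe <- function(res, x) {
  # findInterval() returns 0 for values below the first cutpoint, hence + 1
  res$woe[findInterval(x, res$cutpoints) + 1]
}
woe_new <- apply_woe(result, rnorm(10))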