
Implements an advanced binning algorithm for numerical variables using isotonic regression to ensure monotonicity in bin event rates. This method is particularly valuable for risk modeling, credit scoring, and other applications where monotonic relationships between features and target variables are expected or preferred.

Usage

optimal_binning_numerical_ir(
  target,
  feature,
  min_bins = 3L,
  max_bins = 5L,
  bin_cutoff = 0.05,
  max_n_prebins = 20L,
  auto_monotonicity = TRUE,
  convergence_threshold = 1e-06,
  max_iterations = 1000L
)

Arguments

target

Binary integer vector (0 or 1) representing the target variable.

feature

Numeric vector of values to be binned.

min_bins

Minimum number of bins to generate (default: 3).

max_bins

Maximum number of bins allowed (default: 5).

bin_cutoff

Minimum frequency fraction for each bin (default: 0.05).

max_n_prebins

Maximum number of pre-bins before optimization (default: 20).

auto_monotonicity

Whether to automatically determine the monotonicity direction from the data (default: TRUE).

convergence_threshold

Convergence threshold for optimization (default: 1e-6).

max_iterations

Maximum number of iterations allowed (default: 1000).

Value

A list containing:

id

Numeric identifiers for each bin (1-based).

bin

Character vector with the bin intervals.

woe

Numeric vector with Weight of Evidence values for each bin.

iv

Numeric vector with Information Value contribution for each bin.

count

Integer vector with the total number of observations in each bin.

count_pos

Integer vector with the positive class counts in each bin.

count_neg

Integer vector with the negative class counts in each bin.

cutpoints

Numeric vector with the bin cutpoints (excluding ±Inf).

converged

Logical value indicating whether the algorithm converged.

iterations

Integer with the number of optimization iterations performed.

total_iv

Total Information Value of the binning solution.

monotone_increasing

Logical indicating whether the enforced event-rate trend is monotonically increasing (TRUE) or decreasing (FALSE).
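
For instance, the returned cutpoints and woe can be combined to score new observations. The helper below is a hypothetical sketch, not part of the package: it assumes the bins partition the real line at the returned cutpoints with implicit -Inf and +Inf endpoints, and the boundary convention (left- or right-closed) is an assumption rather than documented behavior.

apply_woe <- function(x, cutpoints, woe) {
  # findInterval() counts cutpoints at or below x, so adding 1 yields
  # 1-based bin ids aligned with the `id` field of the result
  bin_id <- findInterval(x, cutpoints) + 1L
  woe[bin_id]
}
# new_woe <- apply_woe(new_feature, result$cutpoints, result$woe)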

Details

Algorithm Overview

The algorithm transforms a continuous feature into discrete bins that maximize the relationship with a binary target while enforcing monotonicity constraints. It operates through several phases:

  1. Pre-Binning: Initial segmentation based on quantiles or unique feature values (sketched after this list)

  2. Frequency Stabilization: Merging of low-frequency bins to ensure statistical reliability

  3. Monotonicity Enforcement: Application of isotonic regression via Pool Adjacent Violators (PAV)

  4. Bin Optimization: Adjustments to meet constraints on minimum and maximum bin count

  5. Information Value Calculation: Computation of WoE and IV metrics for each bin
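
As a rough illustration of step 1, quantile-based pre-binning can be written in a few lines of base R. This is a simplified stand-in for the internal implementation, with max_n_prebins playing the same role as the argument documented above.

prebin_edges <- function(feature, max_n_prebins = 20L) {
  probs <- seq(0, 1, length.out = max_n_prebins + 1L)
  # unique() collapses tied quantiles when the feature has few distinct values
  unique(quantile(feature, probs = probs, na.rm = TRUE))
}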

Mathematical Foundation

The core mathematical concepts employed in this algorithm are:

1. Isotonic Regression

Isotonic regression solves the following optimization problem:

$$\min_{\mu} \sum_{i=1}^{n} w_i (y_i - \mu_i)^2$$

Subject to: $$\mu_1 \leq \mu_2 \leq \ldots \leq \mu_n$$ (for increasing monotonicity)

Where:

  • \(y_i\) is the original event rate in bin \(i\)

  • \(w_i\) is the weight (observation count) of bin \(i\)

  • \(\mu_i\) is the isotonic (monotone) estimate for bin \(i\)
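
The monotonization step can be imitated with a minimal pool-adjacent-violators (PAV) routine. The sketch below is illustrative only, not the package internals, and it returns the pooled block means and weights rather than the expanded per-bin fit.

pava <- function(y, w) {
  mu <- y
  wt <- w
  i <- 1L
  while (i < length(mu)) {
    if (mu[i] > mu[i + 1L]) {
      # pool the violating pair into its weighted mean, then back-track
      pooled <- (wt[i] * mu[i] + wt[i + 1L] * mu[i + 1L]) / (wt[i] + wt[i + 1L])
      mu[i] <- pooled
      wt[i] <- wt[i] + wt[i + 1L]
      mu <- mu[-(i + 1L)]
      wt <- wt[-(i + 1L)]
      i <- max(1L, i - 1L)
    } else {
      i <- i + 1L
    }
  }
  list(mu = mu, w = wt)
}
# pava(y = c(0.10, 0.25, 0.18, 0.40), w = c(120, 80, 150, 90))

For the unweighted case, stats::isoreg() computes the same fit directly.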

2. Weight of Evidence (WoE)

For each bin \(i\), the Weight of Evidence is defined as:

$$WoE_i = \ln\left(\frac{p_i/P}{n_i/N}\right)$$

Where:

  • \(p_i\): Number of positive cases in bin \(i\)

  • \(P\): Total number of positive cases

  • \(n_i\): Number of negative cases in bin \(i\)

  • \(N\): Total number of negative cases

3. Information Value (IV)

For each bin \(i\), the Information Value contribution is:

$$IV_i = \left(\frac{p_i}{P} - \frac{n_i}{N}\right) \times WoE_i$$

The total Information Value is:

$$IV_{total} = \sum_{i=1}^{k} IV_i$$
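
The WoE and IV formulas above translate directly into vectorized R. The per-bin counts below are hypothetical, and no smoothing is applied yet (see the next subsection).

count_pos <- c(20, 35, 60, 85)      # p_i
count_neg <- c(180, 165, 140, 115)  # n_i
P <- sum(count_pos)
N <- sum(count_neg)
woe <- log((count_pos / P) / (count_neg / N))
iv  <- (count_pos / P - count_neg / N) * woe
total_iv <- sum(iv)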

4. Laplace Smoothing

To prevent undefined WoE values (a logarithm of zero) when a bin contains no positives or no negatives, Laplace smoothing replaces the raw proportions with:

$$\frac{p_i + \alpha}{P + k\alpha} \quad \text{and} \quad \frac{n_i + \alpha}{N + k\alpha}$$

Where:

  • \(\alpha\): Smoothing factor (0.5 in this implementation)

  • \(k\): Number of bins
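
Folding the smoothing into the WoE computation keeps zero-count bins finite. A minimal sketch with alpha = 0.5 (the value stated above) and hypothetical counts:

alpha <- 0.5
count_pos <- c(0, 35, 60, 85)       # note the zero-count bin
count_neg <- c(200, 165, 140, 115)
k <- length(count_pos)
P <- sum(count_pos)
N <- sum(count_neg)
woe_smooth <- log(((count_pos + alpha) / (P + k * alpha)) /
                  ((count_neg + alpha) / (N + k * alpha)))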

Key Features

  • Automatic Monotonicity Direction: Determines optimal monotonicity (increasing/decreasing) based on data (a heuristic sketch follows this list)

  • Robust Handling of Edge Cases: Special processing for few unique values, missing data, etc.

  • Optimal Information Preservation: Merges bins to minimize information loss while meeting constraints

  • Statistical Reliability: Ensures each bin has sufficient observations for stable estimates
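
How the direction is chosen internally is not documented here; one plausible heuristic, shown purely as an assumption, is to take the sign of the rank correlation between feature and target.

choose_direction <- function(feature, target) {
  # positive rank correlation suggests an increasing event-rate trend
  rho <- cor(feature, target, method = "spearman")
  isTRUE(rho >= 0)  # TRUE -> monotone increasing
}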

References

Barlow, R. E., & Brunk, H. D. (1972). The isotonic regression problem and its dual. Journal of the American Statistical Association, 67(337), 140-147.

Robertson, T., Wright, F. T., & Dykstra, R. L. (1988). Order restricted statistical inference. Wiley.

de Leeuw, J., Hornik, K., & Mair, P. (2009). Isotone optimization in R: pool-adjacent-violators algorithm (PAVA) and active set methods. Journal of Statistical Software, 32(5), 1-24.

Siddiqi, N. (2006). Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring. John Wiley & Sons.

Thomas, L. C., Edelman, D. B., & Crook, J. N. (2002). Credit Scoring and Its Applications. Society for Industrial and Applied Mathematics.

Belkin, M., Hsu, D., & Mitra, P. (2018). Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate. Advances in Neural Information Processing Systems.

Examples

if (FALSE) { # \dontrun{
# Generate synthetic data with a monotone feature-target relationship
set.seed(123)
n <- 1000
feature <- rnorm(n)
target <- rbinom(n, 1, plogis(0.7 * feature))

# Basic usage
result <- optimal_binning_numerical_ir(target, feature)
print(result)

# Custom settings
result_custom <- optimal_binning_numerical_ir(
  target = target,
  feature = feature,
  min_bins = 2,
  max_bins = 6,
  bin_cutoff = 0.03,
  auto_monotonicity = TRUE
)

# Access specific components
bins <- result$bin
woe_values <- result$woe
is_increasing <- result$monotone_increasing
} # }