
A sophisticated numerical binning algorithm that optimizes Information Value (IV) while enforcing a monotonic Weight of Evidence (WoE) relationship. It combines quantile-based pre-binning with adaptive merging strategies to deliver both statistical stability and optimal information retention.

Usage

optimal_binning_numerical_jedi(
  target,
  feature,
  min_bins = 3L,
  max_bins = 5L,
  bin_cutoff = 0.05,
  max_n_prebins = 20L,
  convergence_threshold = 1e-06,
  max_iterations = 1000L
)

Arguments

target

Integer binary vector (0 or 1) representing the target variable.

feature

Numeric vector representing the continuous predictor.

min_bins

Minimum number of bins to create (default: 3).

max_bins

Maximum number of bins allowed (default: 5).

bin_cutoff

Minimum relative frequency per bin (default: 0.05).

max_n_prebins

Maximum number of pre-bins before optimization (default: 20).

convergence_threshold

IV change threshold for convergence (default: 1e-6).

max_iterations

Maximum number of optimization iterations (default: 1000).

Value

A list containing the following elements (a scoring sketch follows the list):

  • bin: Character vector with the intervals of the bins.

  • woe: Numeric vector with Weight of Evidence values.

  • iv: Numeric vector with Information Value per bin.

  • count: Integer vector with the observation counts per bin.

  • count_pos: Integer vector with the positive class counts per bin.

  • count_neg: Integer vector with the negative class counts per bin.

  • cutpoints: Numeric vector with the cutpoints (excluding ±Inf).

  • converged: Logical indicating whether the algorithm converged.

  • iterations: Integer with the number of iterations performed.
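
The cutpoints and WoE values in this list can be used to score new data. Below is a minimal sketch, assuming a hypothetical fitted result with the structure above and a hypothetical new_feature vector; findInterval reproduces the left-open \((c_{i-1}, c_i]\) bin convention described in Details:

# Hypothetical fitted result (stand-in for the function's return value)
result <- list(cutpoints = c(-0.5, 0.5), woe = c(-0.8, 0.1, 0.9))
new_feature <- c(-1.2, 0.0, 2.3)

# Rebuild full bin boundaries; cutpoints exclude the outer -Inf/+Inf
breaks <- c(-Inf, result$cutpoints, Inf)
# Assign each new observation to a bin index in 1..K (left-open intervals)
bin_id <- findInterval(new_feature, breaks, left.open = TRUE)
# Replace raw values with the WoE of their bin
new_feature_woe <- result$woe[bin_id]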

Details

Mathematical Framework:

For a numerical variable \(X\) and a binary target \(Y \in \{0,1\}\), the algorithm creates \(K\) bins defined by \(K-1\) interior cutpoints, where each bin \(B_i = (c_{i-1}, c_i]\) optimizes the information content subject to the following constraints (a verification sketch follows the list):

  1. Monotonic WoE: \(WoE_i \le WoE_{i+1}\) (or \(\ge\) for decreasing trends).

  2. Minimum Bin Size: count\((B_i)/N \ge\) bin_cutoff.

  3. Bin Quantity Limits: min_bins \(\le K \le\) max_bins.
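
As referenced above, these constraints can be checked directly on a returned result. A minimal verification sketch (not the package's internal validation), assuming the list structure described under Value:

# Check the three constraints for a fitted result and sample size n_obs
check_constraints <- function(result, n_obs, min_bins = 3L, max_bins = 5L,
                              bin_cutoff = 0.05) {
  K <- length(result$woe)
  d <- diff(result$woe)
  list(
    monotonic_woe = all(d >= 0) || all(d <= 0),              # constraint 1
    min_bin_size  = all(result$count / n_obs >= bin_cutoff), # constraint 2
    bin_count     = K >= min_bins && K <= max_bins           # constraint 3
  )
}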

Weight of Evidence (WoE) for bin \(i\): $$WoE_i = \ln\left(\frac{\text{Pos}_i / \sum_{j=1}^{K} \text{Pos}_j}{\text{Neg}_i / \sum_{j=1}^{K} \text{Neg}_j}\right)$$

Information Value (IV) per bin: $$IV_i = \left(\frac{\text{Pos}_i}{\sum_{j=1}^{K} \text{Pos}_j} - \frac{\text{Neg}_i}{\sum_{j=1}^{K} \text{Neg}_j}\right) \times WoE_i$$

Total IV: $$IV_{\text{total}} = \sum_{i=1}^{K} IV_i$$
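
A minimal sketch of these three formulas applied to hypothetical per-bin counts; the epsilon term is an assumed smoothing constant (the package's exact value is not documented here):

count_pos <- c(10, 25, 40)  # hypothetical positives per bin
count_neg <- c(90, 75, 60)  # hypothetical negatives per bin
eps <- 1e-10                # assumed guard against log(0) and division by zero

dist_pos <- (count_pos + eps) / sum(count_pos + eps)
dist_neg <- (count_neg + eps) / sum(count_neg + eps)
woe      <- log(dist_pos / dist_neg)       # WoE_i
iv_bin   <- (dist_pos - dist_neg) * woe    # IV_i
iv_total <- sum(iv_bin)                    # IV_total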

Algorithm Phases:

  1. Quantile-based Pre-Binning: Initial segmentation with validation of minimum frequency.

  2. Rare Bin Merging: Merges bins whose relative frequency falls below bin_cutoff, ensuring statistical stability (phases 1 and 2 are sketched after this list).

  3. Monotonicity Enforcement: Adjusts bins to maintain monotonic WoE relationships.

  4. Bin Count Optimization: Ensures the number of bins respects min_bins and max_bins constraints.

  5. Convergence Monitoring: Tracks IV stability to identify convergence.
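
As noted above, here is a minimal sketch of phases 1 and 2, assuming plain quantile cuts and merging each rare bin into an adjacent neighbor; the package's adaptive strategy (merging to minimize IV loss) is more elaborate:

set.seed(1)
feature <- rexp(500)   # skewed example data
max_n_prebins <- 20
bin_cutoff <- 0.05

# Phase 1: quantile-based pre-binning
probs  <- seq(0, 1, length.out = max_n_prebins + 1)
breaks <- unique(quantile(feature, probs = probs))
breaks[1] <- -Inf
breaks[length(breaks)] <- Inf
bin_id <- findInterval(feature, breaks, left.open = TRUE)
counts <- tabulate(bin_id, nbins = length(breaks) - 1)

# Phase 2: merge bins below bin_cutoff into an adjacent bin
while (any(counts / length(feature) < bin_cutoff) && length(counts) > 2) {
  i <- which.min(counts)               # rarest bin
  j <- if (i == 1) 2L else i - 1L      # adjacent neighbor
  breaks <- breaks[-max(i, j)]         # drop the boundary between i and j
  bin_id <- findInterval(feature, breaks, left.open = TRUE)
  counts <- tabulate(bin_id, nbins = length(breaks) - 1)
}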

Key Features:

  • Numerical Stability: WoE calculations add a small epsilon term to avoid division by zero and undefined logarithms.

  • Adaptive Merging Strategy: Minimizes IV loss during bin merging.

  • Robust Handling of Edge Cases: Designed to handle extreme values and skewed distributions effectively.

  • Efficient Binary Search: Used for bin assignments during pre-binning.

  • Early Convergence Detection: Stops iterating once the change in total IV falls within convergence_threshold (sketched below).
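
A minimal sketch of the convergence check in the last bullet, using a toy IV trajectory in place of the per-iteration total IV produced by the real optimization passes (an assumption for illustration):

convergence_threshold <- 1e-6
max_iterations <- 1000L

# Toy trajectory: total IV rising toward a plateau (stand-in for real passes)
iv_trace <- 0.5 * (1 - exp(-(1:100) / 5))

prev_iv <- -Inf
converged <- FALSE
iterations <- 0L
for (iter in seq_len(min(max_iterations, length(iv_trace)))) {
  current_iv <- iv_trace[iter]   # in the real algorithm: IV after one pass
  iterations <- iter
  if (abs(current_iv - prev_iv) < convergence_threshold) {
    converged <- TRUE            # IV stabilized within the threshold
    break
  }
  prev_iv <- current_iv
}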

Parameters:

  • min_bins: Minimum number of bins to be created (default: 3, must be >= 2).

  • max_bins: Maximum number of bins allowed (default: 5, must be >= min_bins).

  • bin_cutoff: Minimum relative frequency required for a bin to remain standalone (default: 0.05).

  • max_n_prebins: Maximum number of pre-bins created before optimization (default: 20).

  • convergence_threshold: Threshold for IV change to determine convergence (default: 1e-6).

  • max_iterations: Maximum number of optimization iterations (default: 1000).

References

  • Elements of Information Theory, 2nd ed. (Cover & Thomas, 2006)

  • Monotone Optimal Binning Algorithm for Credit Risk Modeling (Mironchyk & Tchistiakov, 2017)

  • Monotonic Scoring and Binning (Beltrami & Bassani, 2021)

Examples

if (FALSE) { # \dontrun{
# Simulated data for illustration
set.seed(123)
feature <- rnorm(1000)
target <- rbinom(1000, 1, plogis(feature))

# Basic usage with default parameters
result <- optimal_binning_numerical_jedi(
  target = target,
  feature = feature
)

# Custom configuration for finer granularity
result <- optimal_binning_numerical_jedi(
  target = target,
  feature = feature,
  min_bins = 5,
  max_bins = 10,
  bin_cutoff = 0.03
)
} # }