Optimal Numerical Binning JEDI (Joint Entropy-Driven Interval Discretization)
optimal_binning_numerical_jedi.Rd
A numerical binning algorithm that maximizes Information Value (IV) while enforcing a monotonic Weight of Evidence (WoE) relationship. It combines quantile-based pre-binning with adaptive merging strategies to balance statistical stability and information retention.
Usage
optimal_binning_numerical_jedi(
  target,
  feature,
  min_bins = 3L,
  max_bins = 5L,
  bin_cutoff = 0.05,
  max_n_prebins = 20L,
  convergence_threshold = 1e-06,
  max_iterations = 1000L
)
Arguments
- target
Integer binary vector (0 or 1) representing the target variable.
- feature
Numeric vector representing the continuous predictor.
- min_bins
Minimum number of bins to create (default: 3).
- max_bins
Maximum number of bins allowed (default: 5).
- bin_cutoff
Minimum relative frequency per bin (default: 0.05).
- max_n_prebins
Maximum number of pre-bins before optimization (default: 20).
- convergence_threshold
IV change threshold for convergence (default: 1e-6).
- max_iterations
Maximum number of optimization iterations (default: 1000).
Value
A list containing the following elements:
bin
: Character vector with the intervals of the bins.

woe
: Numeric vector with Weight of Evidence values.

iv
: Numeric vector with Information Value per bin.

count
: Integer vector with the observation counts per bin.

count_pos
: Integer vector with the positive class counts per bin.

count_neg
: Integer vector with the negative class counts per bin.

cutpoints
: Numeric vector with the cutpoints (excluding ±Inf).

converged
: Logical indicating whether the algorithm converged.

iterations
: Integer with the number of iterations performed.
Details
Mathematical Framework:
For a numerical variable \(X\) and a binary target \(Y \in \{0,1\}\), the algorithm creates \(K\) bins defined by \(K-1\) cutpoints where each bin \(B_i = (c_{i-1}, c_i]\) optimizes the information content, satisfying the following constraints:
Monotonic WoE: \(WoE_i \le WoE_{i+1}\) (or \(\ge\) for decreasing trends).
Minimum Bin Size: count\((B_i)/N \ge\) bin_cutoff.
Bin Quantity Limits: min_bins \(\le K \le\) max_bins.
Weight of Evidence (WoE) for bin \(i\): $$WoE_i = \ln\left(\frac{\text{Pos}_i / \sum \text{Pos}_i}{\text{Neg}_i / \sum \text{Neg}_i}\right)$$
Information Value (IV) per bin: $$IV_i = \left(\frac{\text{Pos}_i}{\sum \text{Pos}_i} - \frac{\text{Neg}_i}{\sum \text{Neg}_i}\right) \times WoE_i$$
Total IV: $$IV_{total} = \sum_{i=1}^K IV_i$$
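The WoE and IV formulas above can be illustrated with a short, self-contained computation (written in Python here purely for illustration; the `eps` floor is an assumption standing in for the package's internal numerical-stability smoothing, not its exact value):

```python
import math

def woe_iv(count_pos, count_neg, eps=1e-12):
    """Per-bin WoE and IV from positive/negative counts.

    Sketch of the formulas in the Details section. The eps floor
    avoids log(0) and division by zero; it is an assumption here,
    not the package's exact smoothing.
    """
    total_pos = sum(count_pos)
    total_neg = sum(count_neg)
    woe, iv = [], []
    for p, n in zip(count_pos, count_neg):
        dist_pos = max(p / total_pos, eps)
        dist_neg = max(n / total_neg, eps)
        w = math.log(dist_pos / dist_neg)
        woe.append(w)
        iv.append((dist_pos - dist_neg) * w)
    return woe, iv

# Three bins with an increasing positive-class share:
woe, iv = woe_iv([30, 50, 20], [70, 50, 10])
total_iv = sum(iv)  # IV_total is the sum of the per-bin contributions
```

Note that each per-bin IV term is non-negative (the sign of the distribution difference matches the sign of its WoE), so merging bins can only keep or reduce the total IV, which is why the merging phases below try to minimize IV loss.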
Algorithm Phases:
Quantile-based Pre-Binning: Initial segmentation with validation of minimum frequency.
Rare Bin Merging: Combines bins below the bin_cutoff to ensure statistical stability.
Monotonicity Enforcement: Adjusts bins to maintain monotonic WoE relationships.
Bin Count Optimization: Ensures the number of bins respects the min_bins and max_bins constraints.
Convergence Monitoring: Tracks IV stability to identify convergence.
Key Features:
Numerical Stability: WoE calculation includes epsilon to avoid division by zero.
Adaptive Merging Strategy: Minimizes IV loss during bin merging.
Robust Handling of Edge Cases: Designed to handle extreme values and skewed distributions effectively.
Efficient Binary Search: Used for bin assignments during pre-binning.
Early Convergence Detection: Stops iterations when IV stabilizes within the threshold.
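The monotonicity-enforcement phase can be sketched as a pool-adjacent-violators-style pass that merges any adjacent pair of bins whose WoE ordering breaks the trend (again a Python illustration of assumed logic; the package selects merges by minimal IV loss):

```python
import math

def enforce_monotone_woe(pos, neg, increasing=True, eps=1e-12):
    """Merge adjacent bins until per-bin WoE is monotonic.

    pos/neg: per-bin positive/negative counts. Returns merged counts.
    Assumed sketch; the actual merge selection may differ.
    """
    pos, neg = list(pos), list(neg)

    def woe(p, n):
        tp, tn = sum(pos), sum(neg)
        return math.log(max(p / tp, eps) / max(n / tn, eps))

    i = 0
    while i < len(pos) - 1:
        w0, w1 = woe(pos[i], neg[i]), woe(pos[i + 1], neg[i + 1])
        violated = w0 > w1 if increasing else w0 < w1
        if violated:  # merge bin i+1 into bin i, recheck the previous pair
            pos[i] += pos.pop(i + 1)
            neg[i] += neg.pop(i + 1)
            i = max(i - 1, 0)
        else:
            i += 1
    return pos, neg
```

Merging a violating pair can itself create a new violation with the bin to its left, which is why the index steps back after each merge.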
Parameters:
min_bins: Minimum number of bins to be created (default: 3, must be >= 2).
max_bins: Maximum number of bins allowed (default: 5, must be >= min_bins).
bin_cutoff: Minimum relative frequency required for a bin to remain standalone (default: 0.05).
max_n_prebins: Maximum number of pre-bins created before optimization (default: 20).
convergence_threshold: Threshold for IV change to determine convergence (default: 1e-6).
max_iterations: Maximum number of optimization iterations (default: 1000).
References
Information Theory and Statistical Learning (Cover & Thomas, 2006)
Optimal Binning for Scoring Models (Mironchyk & Tchistiakov, 2017)
Monotonic Scoring and Binning (Beltrami & Bassani, 2021)
Examples
if (FALSE) { # \dontrun{
# Basic usage with default parameters
result <- optimal_binning_numerical_jedi(
  target = c(1, 0, 1, 0, 1),
  feature = c(1.2, 3.4, 2.1, 4.5, 2.8)
)

# Custom configuration for finer granularity
result <- optimal_binning_numerical_jedi(
  target = target_vector,
  feature = feature_vector,
  min_bins = 5,
  max_bins = 10,
  bin_cutoff = 0.03
)
} # }