
Implements the Local Density Binning (LDB) algorithm for optimal binning of numerical variables. This method adapts bin boundaries based on the local density structure of the data while maximizing the predictive relationship with a binary target variable. LDB is particularly effective for features with non-uniform distributions or multiple modes.

Usage

optimal_binning_numerical_ldb(
  target,
  feature,
  min_bins = 3L,
  max_bins = 5L,
  bin_cutoff = 0.05,
  max_n_prebins = 20L,
  enforce_monotonic = TRUE,
  convergence_threshold = 1e-06,
  max_iterations = 1000L
)

Arguments

target

A binary integer vector (0 or 1) representing the target variable.

feature

A numeric vector representing the feature to be binned.

min_bins

Minimum number of bins (default: 3).

max_bins

Maximum number of bins (default: 5).

bin_cutoff

Minimum frequency fraction for each bin (default: 0.05).

max_n_prebins

Maximum number of pre-bins before optimization (default: 20).

enforce_monotonic

Whether to enforce monotonic WoE across bins (default: TRUE).

convergence_threshold

Convergence threshold for optimization (default: 1e-6).

max_iterations

Maximum iterations allowed (default: 1000).

Value

A list containing:

id

Numeric identifiers for each bin (1-based).

bin

Character vector with bin intervals.

woe

Numeric vector with Weight of Evidence values for each bin.

iv

Numeric vector with Information Value contribution for each bin.

count

Integer vector with the total number of observations in each bin.

count_pos

Integer vector with the positive class count in each bin.

count_neg

Integer vector with the negative class count in each bin.

event_rate

Numeric vector with the event rate (proportion of positives) in each bin.

cutpoints

Numeric vector with the bin boundaries (excluding infinities).

converged

Logical indicating whether the algorithm converged.

iterations

Integer count of iterations performed.

total_iv

Numeric total Information Value of the binning solution.

monotonicity

Character indicating the monotonicity direction ("increasing", "decreasing", or "none").
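
All per-bin elements of the returned list are parallel vectors (one entry per bin), so they can be collected into a single table for inspection. A minimal sketch, assuming result holds the list returned by optimal_binning_numerical_ldb():

# Assemble the parallel per-bin vectors into one data frame
per_bin <- c("id", "bin", "count", "count_pos", "count_neg",
             "event_rate", "woe", "iv")
bin_table <- as.data.frame(result[per_bin])
bin_table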

Details

Algorithm Overview

The Local Density Binning algorithm operates in several phases:

  1. Density Analysis: Analyzes the local density structure of the feature to identify regions of high and low density, placing bin boundaries preferentially at density minima.

  2. Initial Binning: Creates initial bins based on density minima and/or quantiles (a rough sketch of these first two phases follows this list).

  3. Statistical Optimization:

    • Merges bins whose frequency falls below the bin_cutoff threshold, for stability

    • Enforces monotonicity in Weight of Evidence (optional)

    • Adjusts the number of bins to satisfy the min_bins and max_bins constraints

  4. Information Value Calculation: Computes the predictive metrics (WoE and IV) for each bin.
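
As a rough illustration of phases 1-2 (not the package's internal implementation), the interior minima of a kernel density estimate can serve as candidate cut points for the pre-bins:

# Bimodal toy feature: two well-separated modes
set.seed(42)
feature <- c(rnorm(500, mean = -2), rnorm(500, mean = 2))

# Gaussian KDE, then flag interior local minima of the density curve
d <- density(feature)
y <- d$y
n <- length(y)
is_min <- c(FALSE, y[2:(n - 1)] < y[1:(n - 2)] & y[2:(n - 1)] < y[3:n], FALSE)
cuts <- d$x[is_min]

# Pre-bin at the density minima, with infinite outer boundaries
pre_bins <- cut(feature, breaks = c(-Inf, cuts, Inf))
table(pre_bins)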

Mathematical Foundation

The algorithm employs several statistical concepts:

1. Kernel Density Estimation

To identify the local density structure:

$$f_h(x) = \frac{1}{nh}\sum_{i=1}^{n}K\left(\frac{x-x_i}{h}\right)$$

Where:

  • \(K\) is a kernel function (Gaussian kernel in this implementation)

  • \(h\) is the bandwidth parameter, selected using Silverman's rule of thumb (see the sketch below)

  • \(n\) is the number of observations
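
Silverman's rule of thumb sets \(h = 0.9 \min(\hat{\sigma}, \mathrm{IQR}/1.34)\, n^{-1/5}\); in base R this rule is implemented by bw.nrd0(), the default bandwidth for density(). A minimal sketch for reference (illustrative only, not the package internals):

# Silverman's rule-of-thumb bandwidth, by hand and via bw.nrd0()
set.seed(123)
x <- rnorm(1000)
h <- 0.9 * min(sd(x), IQR(x) / 1.34) * length(x)^(-1/5)
all.equal(h, bw.nrd0(x))     # TRUE: bw.nrd0() computes the same rule
d <- density(x, bw = h)      # Gaussian KDE with the Silverman bandwidth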

2. Weight of Evidence (WoE)

For assessing the predictive power of each bin:

$$WoE_i = \ln\left(\frac{(p_i + \alpha) / (P + k\alpha)}{(n_i + \alpha) / (N + k\alpha)}\right)$$

Where:

  • \(p_i\): Number of positive cases in bin \(i\)

  • \(P\): Total number of positive cases

  • \(n_i\): Number of negative cases in bin \(i\)

  • \(N\): Total number of negative cases

  • \(\alpha\): Smoothing factor (0.5 in this implementation)

  • \(k\): Number of bins

3. Information Value (IV)

For quantifying overall predictive power:

$$IV_i = \left(\frac{p_i}{P} - \frac{n_i}{N}\right) \times WoE_i$$

$$IV_{total} = \sum_{i=1}^{k} IV_i$$
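
A minimal sketch of both quantities, computed directly from per-bin counts (the counts below are invented for illustration; \(\alpha = 0.5\) as stated above):

# Hypothetical per-bin counts of positives (p_i) and negatives (n_i)
pos <- c(30, 60, 110)
neg <- c(270, 240, 190)
P <- sum(pos); N <- sum(neg)
k <- length(pos)     # number of bins
alpha <- 0.5         # smoothing factor

# Smoothed Weight of Evidence per bin
woe <- log(((pos + alpha) / (P + k * alpha)) /
           ((neg + alpha) / (N + k * alpha)))

# Per-bin IV uses the unsmoothed class distributions, as in the formula above
iv <- (pos / P - neg / N) * woe
total_iv <- sum(iv)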

Advantages of Local Density Binning

  • Respects Data Structure: Places bin boundaries at natural gaps in the distribution

  • Adapts to Multimodality: Handles features with multiple modes effectively

  • Maximizes Information: Optimizes binning for predictive power

  • Statistical Stability: Ensures sufficient observations in each bin

  • Interpretability: Produces monotonic WoE patterns when requested

References

Bin, Y., Liang, S., Chen, Z., Yang, S., & Zhang, L. (2019). Density-based supervised discretization for continuous feature. Knowledge-Based Systems, 166, 1-17.

Belkin, M., & Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6), 1373-1396.

Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall/CRC.

Dougherty, J., Kohavi, R., & Sahami, M. (1995). Supervised and unsupervised discretization of continuous features. Proceedings of the Twelfth International Conference on Machine Learning, 194-202.

Siddiqi, N. (2006). Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring. John Wiley & Sons.

Thomas, L. C. (2009). Consumer Credit Models: Pricing, Profit and Portfolios. Oxford University Press.

Examples

if (FALSE) { # \dontrun{
# Generate synthetic data
set.seed(123)
target <- sample(0:1, 1000, replace = TRUE)
feature <- rnorm(1000)

# Basic usage
result <- optimal_binning_numerical_ldb(target, feature)
print(result)

# Custom parameters
result_custom <- optimal_binning_numerical_ldb(
  target = target,
  feature = feature,
  min_bins = 2,
  max_bins = 8,
  bin_cutoff = 0.03,
  enforce_monotonic = TRUE
)

# Access specific components
bins <- result$bin
woe_values <- result$woe
total_iv <- result$total_iv
monotonicity <- result$monotonicity
} # }