Applies the binning and Weight of Evidence (WoE) transformations learned by
obwoe to new data. This is the scoring function for deploying
WoE-based models in production. For each feature, the function assigns
observations to bins and maps them to their corresponding WoE values.
Usage
obwoe_apply(
data,
obj,
suffix_bin = "_bin",
suffix_woe = "_woe",
keep_original = TRUE,
na_woe = 0
)Arguments
- data
A
data.framecontaining the features to transform. Must include all features present in theobjresults. The target column is optional; if present, it will be included in the output.- obj
An object of class
"obwoe"returned byobwoe.- suffix_bin
Character string suffix for bin columns. Default is
"_bin".- suffix_woe
Character string suffix for WoE columns. Default is
"_woe".- keep_original
Logical. If
TRUE(default), include the original feature columns in the output. IfFALSE, only bin and WoE columns are returned.- na_woe
Numeric value to assign when an observation cannot be mapped to a bin (e.g., new categories not seen during training). Default is 0.
Value
A data.frame containing:
targetThe target column (if present in
data)<feature>Original feature values (if
keep_original = TRUE)<feature>_binAssigned bin label for each observation
<feature>_woeWeight of Evidence value for the assigned bin
Details
Bin Assignment Logic
Numerical Features:
Observations are assigned to bins based on cutpoints stored in the
obwoe object. The cut() function is used with intervals
\((a_i, a_{i+1}]\) where \(a_0 = -\infty\) and \(a_k = +\infty\).
Categorical Features:
Categories are matched directly to bin labels. Categories not seen
during training are assigned NA for bin and na_woe for WoE.
Production Deployment
For production scoring, it is recommended to:
Train the binning model using
obwoe()on the training setSave the fitted object with
saveRDS()Load and apply using
obwoe_apply()on new data
The WoE-transformed features can be used directly as inputs to logistic regression or other linear models, enabling interpretable credit scorecards.
References
Siddiqi, N. (2006). Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring. John Wiley & Sons. doi:10.1002/9781119201731
See also
obwoe for fitting the binning model,
summary.obwoe for model diagnostics.
Examples
# \donttest{
# =============================================================================
# Example 1: Basic Usage - Train and Apply
# =============================================================================
set.seed(42)
n <- 1000
# Training data
train_df <- data.frame(
age = rnorm(n, 40, 15),
income = exp(rnorm(n, 10, 0.8)),
education = sample(c("HS", "BA", "MA", "PhD"), n, replace = TRUE),
target = rbinom(n, 1, 0.15)
)
# Fit binning model
model <- obwoe(train_df, target = "target")
# New data for scoring (could be validation/test set)
new_df <- data.frame(
age = c(25, 45, 65),
income = c(20000, 50000, 80000),
education = c("HS", "MA", "PhD")
)
# Apply transformations
scored <- obwoe_apply(new_df, model)
print(scored)
#> age age_bin age_woe income income_bin
#> 1 25 (-Inf;39.802484] -0.07802808 20000 (8069.329816;61783.844034]
#> 2 45 (41.347493;52.361109] 0.03006841 50000 (8069.329816;61783.844034]
#> 3 65 (52.361109;+Inf] 0.13769907 80000 (61783.844034;+Inf]
#> income_woe education education_bin education_woe
#> 1 -0.03653409 HS HS -0.43188518
#> 2 -0.03653409 MA MA -0.09570827
#> 3 -0.04630457 PhD PhD 0.19851891
# Use WoE features for downstream modeling
woe_cols <- grep("_woe$", names(scored), value = TRUE)
print(woe_cols)
#> [1] "age_woe" "income_woe" "education_woe"
# =============================================================================
# Example 2: Without Original Features
# =============================================================================
scored_compact <- obwoe_apply(new_df, model, keep_original = FALSE)
print(scored_compact)
#> age_bin age_woe income_bin income_woe
#> 1 (-Inf;39.802484] -0.07802808 (8069.329816;61783.844034] -0.03653409
#> 2 (41.347493;52.361109] 0.03006841 (8069.329816;61783.844034] -0.03653409
#> 3 (52.361109;+Inf] 0.13769907 (61783.844034;+Inf] -0.04630457
#> education_bin education_woe
#> 1 HS -0.43188518
#> 2 MA -0.09570827
#> 3 PhD 0.19851891
# }
