Takes a discrete (or continuous) response on the scale
\(0, 1, \ldots, K\) (where \(K =\) ncuts) and converts
it to a pair of interval endpoints on the open unit interval
\((0, 1)\). Each observation is classified into one of four
censoring types following the complete likelihood used in this package:
- \(\delta = 0\)
Uncensored (exact): the observation is a continuous value already in \((0, 1)\). The likelihood contribution is the density \(f(y_i | \theta)\). Endpoints: \(l_i = u_i = y_i\) (or \(y_i / K\) when on the scale).
- \(\delta = 1\)
Left-censored: the latent value is below some upper bound \(u_i\). The contribution is \(F(u_i | \theta)\). When the observation is at the scale minimum (\(y = 0\)), the upper bound is \(u_i = \mathrm{lim} / K\). When the user forces \(\delta = 1\) on a non-boundary observation (\(y \neq 0\)), the upper bound is \(u_i = (y + \mathrm{lim}) / K\), preserving observation-specific variation. In both cases \(l_i = \epsilon\).
- \(\delta = 2\)
Right-censored: the latent value is above some lower bound \(l_i\). The contribution is \(1 - F(l_i | \theta)\). When the observation is at the scale maximum (\(y = K\)), the lower bound is \(l_i = (K - \mathrm{lim}) / K\). When the user forces \(\delta = 2\) on a non-boundary observation (\(y \neq K\)), the lower bound is \(l_i = (y - \mathrm{lim}) / K\), preserving observation-specific variation. In both cases \(u_i = 1 - \epsilon\).
- \(\delta = 3\)
Interval-censored: the standard case for scale data. The contribution is \(F(u_i | \theta) - F(l_i | \theta)\) with midpoint interval endpoints \([(y - \mathrm{lim})/K,\; (y + \mathrm{lim})/K]\).
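The four contributions above can be sketched with base R's beta distribution functions. This is a minimal illustration, not the package's implementation: the helper name loglik_contrib and the plain shape1/shape2 parameterization are assumptions made here, since the parameterization of \(\theta\) is not stated in this section.

```r
# Hypothetical helper (not part of the package): log-likelihood contribution
# of a single observation under a Beta(shape1, shape2) model.
# The shape1/shape2 parameterization is an assumption for illustration.
loglik_contrib <- function(left, right, delta, shape1, shape2) {
  switch(as.character(delta),
    # delta = 0, exact: log f(y), with left == right == y
    "0" = dbeta(left, shape1, shape2, log = TRUE),
    # delta = 1, left-censored: log F(u)
    "1" = pbeta(right, shape1, shape2, log.p = TRUE),
    # delta = 2, right-censored: log(1 - F(l))
    "2" = pbeta(left, shape1, shape2, lower.tail = FALSE, log.p = TRUE),
    # delta = 3, interval-censored: log(F(u) - F(l))
    "3" = log(pbeta(right, shape1, shape2) - pbeta(left, shape1, shape2)),
    stop("delta must be 0, 1, 2, or 3")
  )
}

# Under the uniform Beta(1, 1), an interval of width 0.1 contributes log(0.1)
loglik_contrib(0.25, 0.35, delta = 3, shape1 = 1, shape2 = 1)
```

Working on the log scale with lower.tail and log.p avoids computing \(1 - F(l)\) by subtraction, which can lose precision when \(F(l)\) is close to 1.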
Arguments
- y
Numeric vector: the raw response. Can be either integer scores on the scale \(\{0, 1, \ldots, K\}\) or continuous values already in \((0, 1)\).
- ncuts
Integer: number of scale categories \(K\) (default 100). Must be \(\geq \max(y)\).
- lim
Numeric: half-width of the uncertainty region around each scale point, in scale units (default 0.5). A score \(y\) is mapped to the interval \([(y - \mathrm{lim})/K,\; (y + \mathrm{lim})/K]\).
- delta
Integer vector or NULL. If NULL (default), censoring types are inferred automatically from the boundary rules described above. If provided, it must have the same length as y, with every element in {0, 1, 2, 3}. The supplied values override the automatic classification on a per-observation basis, and the endpoint formulas adapt to non-boundary observations as described in the table under Details. This parameter is used internally by the simulation functions when the analyst forces a specific censoring type (e.g., brs_sim(..., delta = 2)).
Value
A numeric matrix with \(n\) rows and 5 columns:
- left
Lower endpoint \(l_i\) on \((0, 1)\), clamped to \([\epsilon, 1 - \epsilon]\).
- right
Upper endpoint \(u_i\) on \((0, 1)\), clamped to \([\epsilon, 1 - \epsilon]\).
- yt
Midpoint approximation \(y_t\) for starting-value computation (does not enter the likelihood).
- y
Original response value (preserved unchanged).
- delta
Censoring indicator: 0 = exact (density), 1 = left-censored \(F(u)\), 2 = right-censored \(1 - F(l)\), 3 = interval-censored \(F(u) - F(l)\).
Details
Automatic classification (delta = NULL):
If the entire input vector is already in \((0, 1)\) (i.e., all values satisfy \(0 < y < 1\)), all observations are treated as uncensored (\(\delta = 0\)).
Otherwise, for scale (integer) data:
- \(y = 0\): left-censored (\(\delta = 1\)).
- \(y = K\): right-censored (\(\delta = 2\)).
- \(0 < y < K\): interval-censored (\(\delta = 3\)).
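The automatic rule can be written in a few lines of R. The function below is a hypothetical re-implementation for illustration only (the name infer_delta is not part of the package), and it assumes the input has already been validated.

```r
# Hypothetical sketch of the automatic classification (delta = NULL).
# Not the package's code; input validation is omitted.
infer_delta <- function(y, ncuts = 100L) {
  # Whole vector strictly inside (0, 1): treat every value as exact
  if (all(y > 0 & y < 1)) {
    return(rep(0L, length(y)))
  }
  delta <- rep(3L, length(y))   # default for scale data: interval-censored
  delta[y == 0] <- 1L           # scale minimum: left-censored
  delta[y == ncuts] <- 2L       # scale maximum: right-censored
  delta
}

infer_delta(c(0, 3, 5, 7, 9, 10), ncuts = 10)
#> [1] 1 3 3 3 3 2
```

Note that the continuous case is decided for the vector as a whole, so a single value outside \((0, 1)\) switches every observation to the scale rules.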
User-supplied delta (delta vector):
When the delta argument is provided, the user-supplied
censoring indicators override the automatic boundary-based rules
on a per-observation basis. This is the mechanism used by
brs_sim when the analyst forces a
specific censoring type in Monte Carlo studies.
The endpoint formulas for each delta value are:
| \(\delta\) | Condition | \(l_i\) (left) | \(u_i\) (right) |
|---|---|---|---|
| 0 | \(y \in (0, 1)\) | \(y\) | \(y\) |
| 0 | \(y\) on scale | \(y / K\) | \(y / K\) |
| 1 | \(y = 0\) (boundary) | \(\epsilon\) | \(\mathrm{lim} / K\) |
| 1 | \(y \neq 0\) (forced) | \(\epsilon\) | \((y + \mathrm{lim}) / K\) |
| 2 | \(y = K\) (boundary) | \((K - \mathrm{lim}) / K\) | \(1 - \epsilon\) |
| 2 | \(y \neq K\) (forced) | \((y - \mathrm{lim}) / K\) | \(1 - \epsilon\) |
| 3 | midpoint interval | \((y - \mathrm{lim}) / K\) | \((y + \mathrm{lim}) / K\) |
All endpoints are clamped to \([\epsilon, 1 - \epsilon]\) with \(\epsilon = 10^{-5}\) to avoid boundary issues in the beta likelihood.
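For scale data, the table rows and the clamping step can be sketched as follows. The function name scale_endpoints is hypothetical, and the sketch covers only responses on the integer scale; it is an illustration of the formulas, not the package's implementation.

```r
# Hypothetical sketch of the endpoint formulas for scale data only.
# The boundary and forced rows of the table share one formula:
# (y + lim) / K reduces to lim / K at y = 0, and
# (y - lim) / K reduces to (K - lim) / K at y = K.
scale_endpoints <- function(y, delta, ncuts = 100L, lim = 0.5, eps = 1e-5) {
  left <- ifelse(delta == 0, y / ncuts,
          ifelse(delta == 1, eps, (y - lim) / ncuts))
  right <- ifelse(delta == 0, y / ncuts,
           ifelse(delta == 2, 1 - eps, (y + lim) / ncuts))
  # Clamp both endpoints to [eps, 1 - eps]
  clamp <- function(x) pmin(pmax(x, eps), 1 - eps)
  cbind(left = clamp(left), right = clamp(right))
}

# Boundary, interior, and boundary observations on a 0..10 scale
scale_endpoints(c(0, 3, 10), delta = c(1, 3, 2), ncuts = 10)
```

With these inputs the rows are (0.00001, 0.05), (0.25, 0.35), and (0.95, 0.99999), matching the corresponding rows of the first example under Examples.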
The midpoint approximation yt is computed as:
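- \(y_t = y\) when \(y \in (0, 1)\) (continuous data).
- \(y_t = y / K\) when \(y\) is on the integer scale. As the examples below show, boundary scores are clamped to \([\epsilon, 1 - \epsilon]\), so \(y = 0\) gives \(y_t = 0.00001\).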
This value is used exclusively as an initialization aid for starting-value computation and does not enter the likelihood.
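A minimal sketch of this computation is shown below. The name yt_sketch is hypothetical, and the clamping step is inferred from the example output under Examples rather than from the package source.

```r
# Hypothetical sketch of the yt column (not the package's code).
yt_sketch <- function(y, ncuts = 100L, eps = 1e-5) {
  # Continuous data are kept as-is; scale data are rescaled by K
  yt <- if (all(y > 0 & y < 1)) y else y / ncuts
  # Clamping is an assumption based on the documented example output
  pmin(pmax(yt, eps), 1 - eps)
}

yt_sketch(c(0, 3, 10), ncuts = 10)
```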
Interaction with the fitting pipeline:
This function is called internally by .extract_response() when the data does not carry the "is_prepared" attribute. If the data has already been processed by brs_prep, or generated by brs_sim with a forced delta (delta not NULL), the pre-computed columns are used directly and brs_check() is skipped.
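The dispatch described above amounts to an attribute test. The helper below is purely illustrative: the name needs_check is hypothetical, and it assumes the attribute is stored as a logical TRUE, which this page does not confirm.

```r
# Hypothetical sketch of the dispatch logic, not the internal function itself.
# Assumption: prepared data carry attr(data, "is_prepared") set to TRUE.
needs_check <- function(data) {
  !isTRUE(attr(data, "is_prepared"))
}

raw <- data.frame(y = c(0, 5, 10))
needs_check(raw)        # TRUE: raw data would go through brs_check()

prepared <- raw
attr(prepared, "is_prepared") <- TRUE
needs_check(prepared)   # FALSE: pre-computed columns would be used directly
```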
References
Lopes, J. E. (2023). Modelos de regressao beta para dados de escala. Master's dissertation, Universidade Federal do Parana, Curitiba. URI: https://hdl.handle.net/1884/86624.
Hawker, G. A., Mian, S., Kendzerska, T., and French, M. (2011). Measures of adult pain: Visual Analog Scale for Pain (VAS Pain), Numeric Rating Scale for Pain (NRS Pain), McGill Pain Questionnaire (MPQ), Short-Form McGill Pain Questionnaire (SF-MPQ), Chronic Pain Grade Scale (CPGS), Short Form-36 Bodily Pain Scale (SF-36 BPS), and Measure of Intermittent and Constant Osteoarthritis Pain (ICOAP). Arthritis Care and Research, 63(S11), S240-S252. doi:10.1002/acr.20543
Hjermstad, M. J., Fayers, P. M., Haugen, D. F., et al. (2011). Studies comparing Numerical Rating Scales, Verbal Rating Scales, and Visual Analogue Scales for assessment of pain intensity in adults: a systematic literature review. Journal of Pain and Symptom Management, 41(6), 1073-1093. doi:10.1016/j.jpainsymman.2010.08.016
Examples
# Scale data with boundary observations
y <- c(0, 3, 5, 7, 9, 10)
brs_check(y, ncuts = 10)
#> left right yt y delta
#> [1,] 0.00001 0.05000 0.00001 0 1
#> [2,] 0.25000 0.35000 0.30000 3 3
#> [3,] 0.45000 0.55000 0.50000 5 3
#> [4,] 0.65000 0.75000 0.70000 7 3
#> [5,] 0.85000 0.95000 0.90000 9 3
#> [6,] 0.95000 0.99999 0.99999 10 2
# Force all observations to be exact (delta = 0)
brs_check(y, ncuts = 10, delta = rep(0L, length(y)))
#> left right yt y delta
#> [1,] 0.00001 0.00001 0.00001 0 0
#> [2,] 0.30000 0.30000 0.30000 3 0
#> [3,] 0.50000 0.50000 0.50000 5 0
#> [4,] 0.70000 0.70000 0.70000 7 0
#> [5,] 0.90000 0.90000 0.90000 9 0
#> [6,] 0.99999 0.99999 0.99999 10 0
# Force delta = 1 on non-boundary observations:
# endpoints use actual y values, preserving variation
y2 <- c(30, 60)
brs_check(y2, ncuts = 100, delta = c(1L, 1L))
#> left right yt y delta
#> [1,] 1e-05 0.305 0.3 30 1
#> [2,] 1e-05 0.605 0.6 60 1
# left = (eps, eps), right = (30.5/100, 60.5/100)
