
This function fits a logistic regression model using the gradient-based L-BFGS optimization algorithm and computes the Hessian matrix for variance estimation. Both dense and sparse predictor matrices are supported as input.

Usage

fit_logistic_regression(X_r, y_r, maxit = 300L, eps_f = 1e-08, eps_g = 1e-05)

Arguments

X_r

A matrix of predictor variables: either a dense base R matrix (handled internally as an Eigen MatrixXd) or a sparse matrix from the Matrix package (dgCMatrix).

y_r

A numeric vector of binary target values (0 or 1).

maxit

Maximum number of iterations for the L-BFGS optimization algorithm (default: 300).

eps_f

Convergence tolerance for the function value (default: 1e-8).

eps_g

Convergence tolerance for the gradient (default: 1e-5).

Value

A list containing the following elements:

coefficients

A numeric vector of the estimated coefficients for each predictor variable.

se

A numeric vector of the standard errors of the coefficients, computed from the inverse Hessian (if applicable).

z_scores

Z-scores for each coefficient, calculated as the ratio of the coefficient to its standard error.

p_values

P-values corresponding to the Z-scores for each coefficient.

loglikelihood

The negative log-likelihood of the final model.

gradient

The gradient of the log-likelihood function at the final estimate.

hessian

The Hessian matrix of the log-likelihood function, used to compute standard errors.

convergence

A logical value indicating whether the optimization algorithm converged successfully.

iterations

The number of iterations performed by the optimization algorithm.

message

A character string indicating whether the model converged.

Details

The logistic regression model is fitted using the L-BFGS optimization algorithm. When a sparse matrix is supplied, it is detected automatically and processed in sparse form for efficiency.
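For example, a sparse predictor matrix built with the Matrix package can be passed in directly. This is a minimal sketch with made-up data; the commented call shows the intended usage:

```r
library(Matrix)

set.seed(42)
X_dense  <- matrix(rbinom(400, 1, 0.1), ncol = 4)  # 100 x 4, mostly zeros
X_sparse <- Matrix(X_dense, sparse = TRUE)         # coerced to a dgCMatrix
# result <- fit_logistic_regression(X_sparse, y)   # y: binary vector of length 100
```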

The log-likelihood function for logistic regression is maximized: $$\log(L(\beta)) = \sum_{i=1}^{n} \left( y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right)$$ where \(p_i\) is the predicted probability for observation \(i\).
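The log-likelihood above can be evaluated directly in R. This is an illustrative sketch, not the package's internal code; the data are made up:

```r
# Log-likelihood of a logistic regression at coefficients beta:
# p_i = plogis(x_i' beta) is the predicted probability for observation i.
loglik <- function(beta, X, y) {
  p <- plogis(as.vector(X %*% beta))
  sum(y * log(p) + (1 - y) * log(1 - p))
}

set.seed(1)
X <- cbind(1, rnorm(20))  # intercept plus one predictor
y <- rbinom(20, 1, 0.5)
loglik(c(0, 0), X, y)     # with beta = 0, every p_i = 0.5, so this equals 20 * log(0.5)
```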

The Hessian matrix is computed to estimate the variance of the coefficients, which is necessary for calculating the standard errors, Z-scores, and p-values.
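As a sketch of how the Hessian maps to the reported inference quantities (assuming the Hessian is that of the negative log-likelihood, so its inverse approximates the coefficient covariance), with made-up numbers for illustration:

```r
# Hypothetical Hessian of the negative log-likelihood at the optimum,
# with illustrative coefficient estimates (not real model output)
H <- matrix(c(10, 2,
               2, 8), nrow = 2, byrow = TRUE)
beta_hat <- c(0.5, -1.2)

se <- sqrt(diag(solve(H)))  # standard errors from the inverse Hessian
z  <- beta_hat / se         # Wald Z-scores
p  <- 2 * pnorm(-abs(z))    # two-sided p-values
```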

References

  • Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer Science & Business Media.

  • Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.

Author

José E. Lopes

Examples

if (FALSE) { # \dontrun{
# Create sample data
set.seed(123)
X <- matrix(rnorm(1000), ncol = 10)
y <- rbinom(100, 1, 0.5)

# Run logistic regression
result <- fit_logistic_regression(X, y)

# View results
print(result$coefficients)
print(result$p_values)
} # }