
This function fits a logistic regression model using the gradient-based L-BFGS optimization algorithm and computes the Hessian matrix for variance estimation. Both dense and sparse predictor matrices are supported as input.

Usage

fit_logistic_regression(X_r, y_r, maxit = 300L, eps_f = 1e-08, eps_g = 1e-05)

Arguments

X_r

A matrix of predictor variables: either a dense base R matrix (handled internally as an Eigen MatrixXd) or a sparse matrix from the Matrix package (dgCMatrix).

y_r

A numeric vector of binary target values (0 or 1).

maxit

Maximum number of iterations for the L-BFGS optimization algorithm (default: 300).

eps_f

Convergence tolerance for the function value (default: 1e-8).

eps_g

Convergence tolerance for the gradient (default: 1e-5).

Value

A list containing the following elements:

coefficients

A numeric vector of the estimated coefficients for each predictor variable.

se

A numeric vector of the standard errors of the coefficients, computed from the inverse Hessian (if applicable).

z_scores

Z-scores for each coefficient, calculated as the ratio of the coefficient to its standard error.

p_values

P-values corresponding to the Z-scores for each coefficient.

loglikelihood

The negative log-likelihood of the final model.

gradient

The gradient of the log-likelihood function at the final estimate.

hessian

The Hessian matrix of the log-likelihood function, used to compute standard errors.

convergence

A logical value indicating whether the optimization algorithm converged successfully.

iterations

The number of iterations performed by the optimization algorithm.

message

A character string indicating whether the model converged.

Details

The logistic regression model is fitted using the L-BFGS optimization algorithm. When a sparse matrix is supplied, it is detected automatically and processed in sparse form for efficiency.
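For example, a sparse predictor matrix built with the Matrix package can be passed in directly. This is a minimal sketch with made-up data; the commented call shows the intended usage:

```r
library(Matrix)

set.seed(42)
X_dense  <- matrix(rbinom(400, 1, 0.1), ncol = 4)  # 100 x 4, mostly zeros
X_sparse <- Matrix(X_dense, sparse = TRUE)         # coerced to a dgCMatrix
# result <- fit_logistic_regression(X_sparse, y)   # y: binary vector of length 100
```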

The log-likelihood function for logistic regression is maximized: $$\log(L(\beta)) = \sum_{i=1}^{n} \left( y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right)$$ where \(p_i\) is the predicted probability for observation \(i\).
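The log-likelihood above can be evaluated directly in R. This is an illustrative sketch, not the package's internal code; the data are made up:

```r
# Log-likelihood of a logistic regression at coefficients beta:
# p_i = plogis(x_i' beta) is the predicted probability for observation i.
loglik <- function(beta, X, y) {
  p <- plogis(as.vector(X %*% beta))
  sum(y * log(p) + (1 - y) * log(1 - p))
}

set.seed(1)
X <- cbind(1, rnorm(20))  # intercept plus one predictor
y <- rbinom(20, 1, 0.5)
loglik(c(0, 0), X, y)     # with beta = 0, every p_i = 0.5, so this equals 20 * log(0.5)
```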

The Hessian matrix is computed to estimate the variance of the coefficients, which is necessary for calculating the standard errors, Z-scores, and p-values.
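As a sketch of how the Hessian maps to the reported inference quantities (assuming the Hessian is that of the negative log-likelihood, so its inverse approximates the coefficient covariance), with made-up numbers for illustration:

```r
# Hypothetical Hessian of the negative log-likelihood at the optimum,
# with illustrative coefficient estimates (not real model output)
H <- matrix(c(10, 2,
               2, 8), nrow = 2, byrow = TRUE)
beta_hat <- c(0.5, -1.2)

se <- sqrt(diag(solve(H)))  # standard errors from the inverse Hessian
z  <- beta_hat / se         # Wald Z-scores
p  <- 2 * pnorm(-abs(z))    # two-sided p-values
```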

References

  • Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer Science & Business Media.

  • Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.

Author

José E. Lopes

Examples

if (FALSE) { # \dontrun{
# Create sample data
set.seed(123)
X <- matrix(rnorm(1000), ncol = 10)
y <- rbinom(100, 1, 0.5)

# Run logistic regression
result <- fit_logistic_regression(X, y)

# View results
print(result$coefficients)
print(result$p_values)
} # }