Publications and Software

Software

igate: R package for initial Guided Analytics for Parameter Testing and Controlband extraction (see Stein et al., 2021). This package automates initial data analysis and reporting of the results. It can be downloaded directly from CRAN or from GitHub.
A suite of code for covariance modelling in longitudinal data, including an implementation of the method in Zhang, Leng, and Tang (JRSSB, 2015), can be found here.
R package (version 2.0) and manual for MIP (multiple influential point detection in high-dimensional spaces, JRSSB, 2019); see https://arxiv.org/abs/1609.03320. After installing the package, try the R command "example(MIP)".
R Code for HOLP in screening variables, see Wang and Leng (JRSSB, 2016).
Matlab Code and an Example for estimating high-dimensional correlation matrices. See Cui, Leng, and Sun (CSDA, 2015, Sparse estimation of high-dimensional correlation matrices).
Matlab Code (zip file) for gradient-based kernel dimension reduction in Fukumizu and Leng (JASA, 2014).
Matlab Code (rar file) for Bayesian adaptive Lasso. See Leng, Tran, and Nott (AISM, 2014, Bayesian adaptive lasso).
R Code and an Example for sparse matrix graphical models in Leng and Tang (JASA, 2012).
R Code (rar file) and an Example for penalised empirical likelihood in Tang and Leng (Biometrika, 2010) and Leng and Tang (Biometrika, 2012).
Matlab Code (rar file) for predictive Lasso in Tran, Nott, and Leng (STCO, 2012, The predictive lasso).
R Code (rar file) for variable selection in heteroscedastic linear models. See Nott, Tran, and Leng (STCO, 2012, Variational approximation for heteroscedastic linear models and matching pursuit algorithms).
R Code and an Example for regularised rank regression in Leng (Statistica Sinica, 2010, Variable selection and coefficient estimation via regularized rank regression).
R Code and an Example for sparse PCA in Leng and Wang (JCGS, 2009, On general adaptive sparse principal component analysis).
R Code and an Example for variable selection via least squares approximation in Wang and Leng (JASA, 2007).

Selected Publications (More can be found on Google Scholar)

in journals

Stein, S. and Leng, C. (2022). Fallacy of data-selective inference in modelling networks. Stat, to appear.
Zhao, J., Liu, X., Wang, H., and Leng, C. (2021). Dimension reduction for covariates in network data. Biometrika, to appear.
Stein, S., Leng, C., Thornton, S., and Randrianandrasana, M. (2021). A guided analytics tool for feature selection in steel manufacturing with an application to blast furnace top gas efficiency. Computational Materials Science, 186, 110053.
Chen, M., Kato, K., and Leng, C. (2020). Analysis of networks via the sparse β-model. Journal of the Royal Statistical Society, Series B, to appear.
Li, R., Leng, C., and You, J. (2020). Semiparametric tail index regression. Journal of Business and Economic Statistics, to appear.
Wang, H., Peng, B., Li, D., and Leng, C. (2020). Nonparametric estimation of large covariance matrices with conditional sparsity. Journal of Econometrics, to appear.
Wang, Y., Xu, H., and Leng, C. (2019). Provable subspace clustering: When LRR meets SSC. IEEE Transactions on Information Theory, 65, 5406-5432.
Tang, C.Y., Zhang, W., and Leng, C. (2019). Discrete longitudinal data modeling with a mean-correlation regression approach. Statistica Sinica, 29, 853-876.
Zhao, J., Liu, C., Niu, L., and Leng, C. (2019). Multiple influential point detection in high-dimensional spaces. Journal of the Royal Statistical Society Series B, 81, 385-408. The journal version can be found here (open access!). Among the top 10% most downloaded papers published in JRSSB between January 2018 and December 2019.
Yan, T., Jiang, B., Fienberg, S. E., and Leng, C. (2019). Statistical inference in a directed network model with covariates. Journal of the American Statistical Association, 114, 857-868.
Jiang, B., Wang, X., and Leng, C. (2018). A direct approach for sparse quadratic discriminant analysis. Journal of Machine Learning Research, 19, 1-37.
Leng, C. and Pan, G. (2018). Covariance estimation via sparse Kronecker structures. Bernoulli, 24, 3833-3863.
Chen, Z. and Leng, C. (2016). Dynamic covariance models. Journal of the American Statistical Association, 111, 1196-1207. Supplementary materials.
Wang, X. and Leng, C. (2016). High-dimensional ordinary least-squares projection for screening variables (http://arxiv.org/abs/1506.01782). Journal of the Royal Statistical Society Series B, 78, 589-611.
Leng, C. and Yan, T. (2016). Discussion of "Statistical modelling of citation exchange between statistics journals" by Varin, Cattelan and Firth. Journal of the Royal Statistical Society Series A, 179, 54.
Zhao, J. and Leng, C. (2016). An analysis of penalised interaction models. Bernoulli, 22, 1937-1961.
Yan, T., Leng, C., and Zhu, J. (2016). Asymptotics in directed exponential random graph models with an increasing bi-degree sequence (http://arxiv.org/abs/1408.1156). The Annals of Statistics, 44, 31-57.
Yan, T. and Leng, C. (2015). A simulation study of the p1 model for directed random graphs. Statistics and Its Interface, 8, 255-266.
Zhang, W., Leng, C., and Tang, C. Y. (2015). A joint modeling approach for longitudinal studies (pdf). Journal of the Royal Statistical Society Series B, 77, 219-238. This video explains the geometric interpretation of the angles in a new variance-correlation decomposition. (Authors' note: The angles are denoted as β's in the video instead of φ's as in the paper for better visualization due to technical reasons.)
Fukumizu, K. and Leng, C. (2014). Gradient-based kernel dimension reduction (pdf). Journal of the American Statistical Association, 109, 359-370.
Zhao, J., Leng, C., Li, L., and Wang, H. (2013). High dimensional influence measure. The Annals of Statistics, 41, 2639-2667.
Leng, C. and Tong, X. (2013). A quantile regression estimator for censored data (http://arxiv.org/abs/1302.0181). Bernoulli, 19, 344-361.
Leng, C. and Tang, C. Y. (2012). Sparse matrix graphical models. Journal of the American Statistical Association, 107, 1187-1200.
Leng, C. and Tang, C. Y. (2012). Penalized empirical likelihood and growing dimensional general estimating equations. Biometrika, 99, 703-716.
Zhang, W. and Leng, C. (2012). A moving average Cholesky factor model in covariance modeling for longitudinal data. Biometrika, 99, 141-150.
Tang, C. Y. and Leng, C. (2011). Empirical likelihood and quantile regression in longitudinal data analysis. Biometrika, 98, 1001-1006.
Leng, C. and Li, B. (2011). Forward adaptive banding for estimating large covariance matrices. Biometrika, 98, 821-830.
Tang, C. Y. and Leng, C. (2010). Penalized high dimensional empirical likelihood. Biometrika, 97, 905-920.
Leng, C., Zhang, W., and Pan, J. (2010). Semiparametric mean-covariance regression analysis for longitudinal data. Journal of the American Statistical Association, 105, 181-193.
Wang, H., Li, B., and Leng, C. (2009). Shrinkage tuning parameter selection with a diverging number of parameters. Journal of the Royal Statistical Society, Series B, 71, 671-683.
Leng, C. and Wang, H. (2009). On general adaptive sparse principal component analysis. Journal of Computational and Graphical Statistics, 18, 201-215.
Wang, H. and Leng, C. (2007). Unified Lasso estimation via least squares approximation. Journal of the American Statistical Association, 102, 1039-1048.
Leng, C., Lin, Y., and Wahba, G. (2006). A note on the Lasso and related procedures in model selection. Statistica Sinica, 16, 1273-1284.

in conferences

Wang, X., Dunson, D., and Leng, C. (2016). No penalty no tears: Least squares in high-dimensional linear models. ICML.
Wang, X., Leng, C., and Dunson, D. (2015). On the consistency theory of high dimensional variable screening (http://arxiv.org/abs/1502.06895). NIPS.
Zhu, C., Xu, H., Leng, C., and Yan, S. (2014). Convex optimization procedure for clustering: Theoretical revisit. NIPS.
Wang, Y., Xu, H., and Leng, C. (2013). Provable subspace clustering: When LRR meets SSC. NIPS. Spotlight presentation. See also Yuxiang's website for code.
Fukumizu, K. and Leng, C. (2012). Gradient-based kernel method for feature extraction and variable selection. NIPS.
Xu, H. and Leng, C. (2012). Robust multi-task regression with grossly corrupted observations. AISTATS.