# Statistics and Machine Learning Reading Group

**IMPORTANT NOTICE: the new webpage for the reading group can be found here.**

### Suggested topics and papers:

- Lloyd et al. (2014) Automatic construction and natural-language description of nonparametric regression models. More about the Automatic Statistician can be found here.
- Wang et al. (2015) Regularised principal component analysis for spatial data.
- Singh et al. (2008) A unified view of matrix factorisation models.
- Dwork et al. (2015) The reusable holdout: preserving validity in adaptive data analysis.
- Particle filtering. Resources include: Murphy (2012) Machine Learning: A Probabilistic Perspective chapter 23.5 (available from the university library, or see Jim Skinner for other forms of access).
- Combrisson et al. (2015) Exceeding chance level by chance: the caveat of theoretical chance levels in brain signal classification and statistical assessment of decoding accuracy.
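Not part of the reading list itself, but for anyone previewing the particle filtering topic above, a minimal bootstrap particle filter (the variant covered in Murphy chapter 23.5) can be sketched in a few lines. The model here — a 1-D Gaussian random walk observed with Gaussian noise, and the particle count of 1000 — is an illustrative assumption, not taken from any of the listed papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a 1-D Gaussian random walk observed with noise (std 0.5).
T = 50
x_true = np.cumsum(rng.normal(0.0, 1.0, T))
y = x_true + rng.normal(0.0, 0.5, T)

N = 1000                              # number of particles
particles = rng.normal(0.0, 1.0, N)   # initial particle cloud
weights = np.full(N, 1.0 / N)
estimates = np.empty(T)

for t in range(T):
    # Propagate each particle through the transition model.
    particles += rng.normal(0.0, 1.0, N)
    # Reweight by the Gaussian observation likelihood.
    weights *= np.exp(-0.5 * ((y[t] - particles) / 0.5) ** 2)
    weights /= weights.sum()
    # Posterior mean estimate of the state at time t.
    estimates[t] = np.sum(weights * particles)
    # Multinomial resampling to avoid weight degeneracy.
    idx = rng.choice(N, size=N, p=weights)
    particles = particles[idx]
    weights.fill(1.0 / N)
```

The resampling step is what distinguishes the bootstrap filter from plain importance sampling: without it, after a few time steps almost all weight concentrates on a single particle.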

### Previously discussed topics and papers:

- Korb et al. (1999) Bayesian poker.
- Wilson et al. (2015) Kernel interpolation for scalable structured Gaussian processes (KISS-GP).
- Morey et al. (2015) The fallacy of placing confidence in confidence intervals.
- VC dimension. Resources include: Vapnik (1999) An overview of statistical learning theory.
- Dropout for neural networks. Resources include: Srivastava et al. (2014) Dropout: a simple way to prevent neural networks from overfitting.
- Heller et al. (2005) Randomized algorithms for fast Bayesian hierarchical clustering.
- Speech recognition. Resources include: Deng et al. (2013) Machine learning paradigms for speech recognition: an overview.
- Latent Dirichlet allocation. Resources include: Murphy (2012) Machine Learning: A Probabilistic Perspective chapter 27.3 (available from the university library, or see Jim Skinner for other forms of access).
- Gaussian process regression derivations. Resources include: Williams et al. (1996) Gaussian processes for regression, Zhu (1997) Gaussian regression and optimal finite dimensional linear models, and the Gaussian processes website tutorial section.
- GP-LVM. Resources include: Lawrence (2005) Probabilistic non-linear principal component analysis with Gaussian process latent variable models, and Lawrence (2003) Gaussian process latent variable models for visualisation of high dimensional data.
- Boosted trees. Further information can be found here.
- Faraway (2014) Does data splitting improve prediction?
- Recommendation engines. Resources include: Leskovec et al. (2014) Mining of Massive Datasets chapter 9 (available here).
- Lo et al. (2015) Why significant variables aren't automatically good predictors.
- Naive Bayes. Resources include: Murphy (2012) Machine Learning: A Probabilistic Perspective chapters 3.5 and 10.2.1 (available from the university library, or see Jim Skinner for other forms of access).
- Latent linear models. Resources include: Murphy (2012) Machine Learning: A Probabilistic Perspective chapters 12.1 and 12.3 (available from the university library, or see Jim Skinner for other forms of access).
- Kernels and SVMs. Resources include: Murphy (2012) Machine Learning: A Probabilistic Perspective chapter 14 (available from the university library, or see Jim Skinner for other forms of access).
- Guest speaker: Mauricio Álvarez. NB: different location - room B3.02.

  Gaussian processes in Applied Neuroscience: a case study in Deep Brain Stimulation.

  Deep brain stimulation (DBS) is a treatment for movement disorders, such as Parkinson's disease, dystonia, or essential tremor. It usually consists of the implantation of a stimulator in the infraclavicular region connected to an electrode lead that is placed in a target structure in the basal ganglia, particularly in the subthalamic nucleus, or the thalamus. The stimulator delivers electric pulses of a specific frequency, amplitude and pulse-width to the target via the electrode, which results in symptom improvement. In this talk, I will review how we have been using recent developments in Gaussian processes to tackle different inference problems that arise in the analysis of DBS. In particular, I will talk about generalized Wishart processes for diffusion tensor estimation, and latent force models for simulating electrical potential fields.
- Ciresan et al. (2012) Multi-column deep neural networks for image classification.
- Genton (2001) Classes of kernels for machine learning: a statistics perspective.
- Reinforcement learning. Resources include: Sutton et al. (2012) Reinforcement Learning: An Introduction chapter 1 (available here).
- Heller et al. (2005) Bayesian hierarchical clustering.
- Dirichlet processes. Resources include: Murphy (2012) Machine Learning: A Probabilistic Perspective chapter 25.2 (available from the university library, or see Jim Skinner for other forms of access).
- Sohl-Dickstein et al. (2015) Deep unsupervised learning using nonequilibrium thermodynamics.
- Recurrent neural networks. Resources include: Goodfellow et al. Deep Learning (unfinished) chapter 10 (available here) and Karpathy's blogpost on the topic (available here).
- Kulis (2012) Revisiting k-means: new algorithms via Bayesian nonparametrics.
- Ghahramani (2001) An introduction to hidden Markov models and Bayesian networks.
- Video lecture: The next generation of neural networks - Geoff Hinton.
- Stanley et al. (2005) Evolving neural network agents in the NERO video game.
- Gaussian mixture models. Resources include: Bishop (2006) Pattern Recognition and Machine Learning chapter 9.2 (available in both the Complexity Centre and university libraries).
- Domingos (2012) A few useful things to know about machine learning.
- Mikolov et al. (2013) Efficient estimation of word representations in vector space. Tool resources can be found here. A summarising presentation can be found here.
- Schmidt et al. (2009) Distilling free-form natural laws from experimental data.
- Multiple testing. Resources include: Bland et al. (1995) Multiple significance tests: the Bonferroni method, and Scott et al. (2003) An exploration of aspects of Bayesian multiple testing.
- Variational inference. Resources include: Bishop (2006) Pattern Recognition and Machine Learning chapter 10.1 (available in both the Complexity Centre and university libraries), and Murphy (2012) Machine Learning: A Probabilistic Perspective chapters 21.1, 21.2, and 21.5 (available from the university library, or see Jim Skinner for other forms of access).
- Previous topics and papers read during 2014/15.

Dropout neural network gif by Michael Pearce.
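For readers coming to the dropout discussion (Srivastava et al., above) fresh, the mechanism the gif animates can be sketched in a few lines of NumPy. This is a generic illustrative sketch of inverted dropout, not code from the paper; the layer shape and drop probability are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: during training, zero each unit with
    probability p and rescale the survivors by 1/(1-p) so the
    expected activation is unchanged; at test time, pass through."""
    if not training:
        return activations
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)

# A toy layer of activations: roughly half the units are zeroed
# on each forward pass, forcing the network not to co-adapt.
h = np.ones((4, 8))
h_train = dropout(h, p=0.5)
h_test = dropout(h, p=0.5, training=False)
```

The rescaling by 1/(1-p) is what makes test-time behaviour a no-op: the expected value of each unit matches its training-time expectation without any correction at inference.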