CS909 Data Mining
Academic Aims
The module aims to provide students with a broad understanding of the subject of data mining, the algorithms developed to address different data mining goals and the application of these algorithms to real-world problems. Foundational concepts underlying the learning process and algorithms commonly used in the domain will also be introduced.
Learning Outcomes
By the end of the module, the student should:
- Display a comprehensive understanding of different data mining tasks and the algorithms most appropriate for addressing them, enabling the student to carry out data mining projects independently
- Creatively deal with data-related issues that need to be addressed for successful data mining to be carried out
- Systematically evaluate models/algorithms with respect to their accuracy
- Critique emerging standards for data mining and apply them to practical scenarios
- Carry out a self-directed piece of practical work that requires the application of data mining techniques in a creative manner
- Critique the results of a data mining exercise, develop hypotheses based on the analysis of the results obtained, and test them
- Conceptualize a data mining solution to a practical problem
Content
- Data Pre-processing: Methods for Handling Missing Values and Outliers, Common Basic Data Transformations
- Basics of Learning: Instance and Hypothesis Spaces, Version Spaces, Learning as Search, Inductive Learning
- Supervised Learning: Decision Tree Induction, Rule Induction, Lazy Learning
- Unsupervised Learning: Clustering, Association Rule Discovery
- Temporal Data Mining: Sequence Pattern Discovery
- Bayesian Probability: Naïve Bayes, Density Estimation, Bayesian Belief Networks
- Statistical Evaluation Techniques: Cross Validation, ROC analysis
- Emerging Standards: Predictive Model Markup Language (PMML), Java Data Mining (JDM) and CRISP-DM
- Based on current industry practice, a selection of advanced topics will also be covered, drawn from Mining Data Streams, Graph Mining, Multi-Relational Mining and Text Mining
