CS909 Data Mining

Academic Aims

  • Understanding of the value of data mining in solving real-world problems.
  • Understanding of foundational concepts underlying data mining.
  • Understanding of algorithms commonly used in data mining tools.
  • Ability to apply data mining tools to real-world problems.

Learning Outcomes

By the end of the module, the student should

  • Display a comprehensive understanding of different data mining tasks and the algorithms most appropriate for addressing them.
  • Evaluate models/algorithms with respect to their accuracy.
  • Demonstrate the capacity to carry out a self-directed piece of practical work that requires the application of data mining techniques.
  • Critique the results of a data mining exercise.
  • Develop hypotheses based on the analysis of the results obtained and test them.
  • Conceptualise a data mining solution to a practical problem.

Content

  • Introduction, basic concepts and motivation.
  • Data pre-processing: handling missing values, basic data transformations.
  • Rule induction; decision trees; naïve Bayesian probability; neural networks.
  • Advanced topic 1: image processing.
  • Perceptron and support vector machines.
  • Ensemble methods: boosting, bagging & random forests.
  • Evaluation: cross-validation, ROC.
  • Lazy learning: clustering and rule mining; association rule mining.
  • Time series.
  • Advanced topic 2: text mining with feature engineering; vector space models.
  • Advanced topic 3: graph mining.
  • Advanced topic 4: TBC

 

15 CATS
Term 2

Organiser:
Maria Liakata

Lecturers:
Maria Liakata
Rob Procter
