Welcome to the homepage for the ECS COMP6237 Data Mining module.
The challenge of data mining is to transform raw data into useful information and actionable knowledge. Data mining is the computational process of discovering patterns in data sets, using methods at the intersection of artificial intelligence, machine learning, statistics, and data management.
This course will introduce key concepts in data mining, information extraction, and information indexing, including specific algorithms and techniques for feature extraction, clustering, outlier detection, topic modelling and prediction of complex unstructured data sets. By taking this course, you will gain a broad view of the general issues surrounding unstructured and semi-structured data and the application of algorithms to such data. At a practical level, you will have the chance to explore an assortment of data mining techniques, which you will apply to problems involving real-world data.
The lectures for this course will be given by Dr Markus Brede (email), Dr Zhiwu Huang (email) and Dr Shoaib Ehsan (email).
The lecture slots are as follows:
Day | Time | Room |
---|---|---|
Mondays | 9AM | B02 1039 (L/T K) |
Tuesdays | 9AM | B46 2003 (L/T B) |
Thursdays | 10AM | B02 1039 (L/T K) |
Fridays | 1PM | B02 1039 (L/T K) |
There will generally be three lectures each week, using the Monday, Tuesday and Thursday slots. In some weeks we will also use the Friday slot.
The current timetable is shown below. Be aware that it might change (especially if you ask us to add additional tutorial sessions):
Date | Semester Week | Lecturer(s) | Topic/Title |
---|---|---|---|
29-Jan | 1 | Zhiwu & Markus & Shoaib | Intro to data mining |
01-Feb | | Shoaib | Linear Regression |
02-Feb | | Shoaib | Maximum Likelihood Estimation |
06-Feb | 2 | Shoaib | Tutorial/seminar: linear regression and MLE; CW set |
08-Feb | | Shoaib | Logistic regression |
09-Feb | | Shoaib | Dealing with non-linear data |
12-Feb | 3 | Zhiwu & Shoaib | Group coursework Q & A |
13-Feb | | Zhiwu & Shoaib | Group coursework Q & A |
15-Feb | | Zhiwu & Shoaib | Group coursework Q & A |
19-Feb | 4 | Shoaib | Tutorial/seminar: logistic regression |
20-Feb | | Shoaib | Intro to information theory |
22-Feb | | Shoaib | Information theory II |
26-Feb | 5 | Zhiwu | Making Recommendations |
27-Feb | | Zhiwu | Finding Groups |
29-Feb | | Zhiwu | Covariance |
04-Mar | 6 | Zhiwu | Embedding Data |
05-Mar | | Zhiwu | Search |
07-Mar | | Zhiwu | Document filtering |
11-Mar | 7 | Zhiwu | Modelling with decision trees |
12-Mar | | Zhiwu | Modelling Prices & Nearest Neighbours |
14-Mar | | Zhiwu | Market Basket Analysis |
18-Mar | 8 | Zhiwu & Shoaib & Markus | Group coursework presentations |
19-Mar | | Zhiwu & Shoaib & Markus | Group coursework presentations |
21-Mar | | Zhiwu & Shoaib & Markus | Group coursework presentations |
22-Mar | | Zhiwu & Shoaib & Markus | Group coursework presentations |
Easter | | | |
22-Apr | 9 | Zhiwu | Semantic Spaces & Latent Semantics |
23-Apr | | Zhiwu | Topic Modelling |
25-Apr | | Zhiwu | Outlier Detection |
29-Apr | 10 | Markus | Link Prediction on Networks |
30-Apr | | Markus | Community Detection on Networks |
02-May | | Markus | Exploiting network structure for IR |
06-May | 11 | | Bank Holiday |
07-May | | Shoaib | Mining Data Streams |
09-May | | Shoaib | Exam Revision Q&A |
13-May | 12 | Markus | Exam Revision Q&A |
14-May | | Zhiwu | Exam Revision Q&A |
16-May | | | CW Due |
20-May | 13 | | Exams |
Materials to accompany the lectures can be found here:
The schedule for each coursework is shown below. Further details will become available below as each coursework is set:
Link for forming groups for the coursework: Link.
Here are a couple of documents to help you. The first is an overview of the course, so that you can see the structure:
Talk to us! As we said above, you are more than welcome to arrange to meet with us via Teams (or in the unused lecture slots / at other times if appropriate) to discuss issues related to the course. Zhiwu (email), Shoaib (email) and Markus (email) can all be reached by email.