Data Mining and Visualization (Spring 2011)

Course code : XDAT-U1
ECTS Credits : 5 Status : Optional
Revised : 18/06 2010 Written : 18/06 2010
Placement : 6. semester Hours per week : 4
Length : 1 semester Teaching Language :

Objective : In industry and in research areas such as business intelligence, computer science, genomics, e-science, production and development, huge amounts of data are produced on all sorts of materials, processes and products. Data Mining and Visualization offers tools to extract the relevant information through the use of modern software and computer technology. The course gives a theoretical introduction to handling and visualizing complex multivariate data sets, supported by examples from different industries and from the internet. Examples could be the amounts of different lipids in the blood, screening e-mails for “spam” by classification, exploring the Google Matrix, automated letter recognition, EEG data, and genetic information.
The course introduces basic methods used for data mining: Principal Component Analysis (PCA), Partial Least Squares regression (PLS), PLS Discriminant Analysis (PLS-DA) and Soft Independent Modeling of Class Analogy (SIMCA). All methods will be introduced using Latentix, a user-friendly software package developed for data mining purposes.
After completing the course the student should be able to:
• Describe the methods for multivariate data analysis (exploration, classification and regression)
• Describe techniques for data pre-processing
• Describe techniques for outlier detection
• Describe method validation principles
• Describe methods for variable selection
• Apply theory on real life data
• Apply commercial software for data analysis
Principal Content : Data mining methods for exploratory analysis (PCA), classification (PLS-DA and SIMCA) and multivariate calibration (PLS) are thoroughly described and applied. Methods for outlier detection, data pre-processing, and model validation are central parts of the course. Computer exercises and projects will be carried out using user-friendly software. A thorough introduction to the software will be given.
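As an illustration of the exploratory analysis described above, the following is a minimal sketch of PCA preceded by one common pre-processing step (autoscaling). It assumes Python with NumPy rather than the course software, uses synthetic data, and the function names autoscale and pca are hypothetical names chosen for this example only; it is not the course's own implementation.

```python
import numpy as np


def autoscale(X):
    """Mean-center each column and scale it to unit standard deviation."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns unscaled (avoids division by zero)
    return (X - mean) / std


def pca(X, n_components):
    """PCA via singular value decomposition of already pre-processed data."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates
    loadings = Vt[:n_components].T                    # variable contributions
    explained = (s ** 2) / np.sum(s ** 2)             # fraction of variance per component
    return scores, loadings, explained[:n_components]


# Synthetic example (an assumption for illustration): 50 samples, 6 variables
# generated from 2 underlying factors plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.05 * rng.normal(size=(50, 6))

X_scaled = autoscale(X)
scores, loadings, explained = pca(X_scaled, n_components=2)
print("Explained variance fractions:", explained)
```

The scores can be plotted against each other to look for groupings or unusual samples, and the loadings show which variables drive each component.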
Teaching method : Lectures (33%), group exercises (33%), and group reports (34%).
Required prerequisites : None.
Recommended prerequisites : -
Relations : -
Type of examination : Look under remarks
External examiner : Internal
Marking : Passed/Not passed
Remarks : Written and oral examination.
The students will hand in four group reports during the course, all of which must be passed in order to participate in the exam. At the individual oral examination the students will be examined on the reports as well as on the examination requirements.

The course is a collaboration between IHK and Quality & Technology, Department of Food Science, Faculty of Life Sciences, University of Copenhagen (KU-LIFE).
Teaching material : Compendium and hand-outs
Responsible teacher : Knud Holm Hansen, Knud.Holm.Hansen@fysik.dtu.dk