Lecturer(s)
|
-
Stoklasa Jan, Mgr. et Mgr. Ph.D.
|
Course content
|
a) introduction to data processing, basics of data mining
b) installation of RapidMiner Studio
c) basic usage of RapidMiner Studio
d) data import (xls, xlsx, csv, ...)
e) graphics: 2D and 3D plots
f) descriptive statistics of data
g) data preprocessing: missing values, outliers, normalization
h) splitting data into training and testing sets
i) regression analysis: linear regression, time series
j) classification: decision trees, naive Bayes classifier, support vector machines, neural networks
k) cluster analysis: the nearest neighbor method
l) cross-validation, ROC
m) data export (xls, xlsx, csv, ...)
|
Learning activities and teaching methods
|
Lecture, Dialogic Lecture (Discussion, Dialog, Brainstorming), Demonstration
- Homework for Teaching
- 30 hours per semester
- Attendance
- 25 hours per semester
- Semestral Work
- 20 hours per semester
|
Learning outcomes
|
Data mining is becoming more and more necessary in real-life applications. It is used in marketing, business, control of complex systems, pharmacy, quality control and internet services, which use data mining to predict the behavior of customers and accommodate their needs. For these purposes, data mining uses tools from several scientific disciplines (e.g. statistics, machine learning, neural networks). The aim of the course is to familiarize the student with the basics of data mining and the processing of economic data using RapidMiner Studio. The student will learn not only how to preprocess input data and apply various mathematical methods to it, but also how to interpret and present the obtained results.
The student will acquire the basics of data processing and data mining. He/she will be able to preprocess input data and evaluate it using regression analysis, cluster analysis or classification methods. He/she will be able to present the obtained results as graphical outputs and draw relevant conclusions in the context of economic practice.
|
Prerequisites
|
Own notebook (laptop) is required. One can be provided by the department.
|
Assessment methods and criteria
|
Student performance, Systematic Observation of Student, Seminar Work
Attendance: at most two absences are acceptable. Active work in seminars. Seminar paper: processing of economic data with the software discussed in the course, plus its presentation.
|
Recommended literature
|
-
RapidMiner resources - Getting Started Central.
-
RapidMiner Studio Manual.
-
M. A. North. (2012). Data Mining for the Masses.
-
M. Hofmann, R. Klinkenberg. (2016). RapidMiner: Data Mining Use Cases and Business Analytics Applications.
-
P. Berka. (2003). Dobývání znalostí z databází.
-
R. O. Duda, P. E. Hart, D. G. Stork. (2001). Pattern Classification, 2nd ed.
|