|
Lecturer(s)
|
-
Matlach Vladimír, Mgr. Ph.D.
|
|
Course content
|
1) Machine learning in general: meaning, use, model, parameters, goals, optimization.
2) Optimization techniques:
- brute-force optimization, grid search, random search,
- genetic and other algorithms,
- gradient descent, its variants and implementations,
- cost function design, differentiability, formalisms.
3) SVM models, LDA, k-NN, Naive Bayes, Decision Trees, Gradient Boosting:
- fundamentals of theory, implementation and use in Python.
4) Features suitable for machine learning:
- quantitative variables, feature engineering,
- selection, extraction, reduction, the curse of dimensionality, applications of SVD,
- models and text vectorization: bag-of-words, semantics, LSA,
- scaling, normalization, standardization.
5) Pragmatics of training:
- evaluating the success of models,
- overfitting and underfitting and their detection,
- training, validation and test sets and the training/test data problem.
6) Practical problem solving:
- creating a custom comment sentiment classifier, spam detector, ...
7) Creating and writing a report.
|
|
Learning activities and teaching methods
|
|
Monologic Lecture (Interpretation, Training), Dialogic Lecture (Discussion, Dialog, Brainstorming), Work with Text (with Book, Textbook)
|
|
Learning outcomes
|
The aim of the course is to introduce the mathematical modelling of text by means of machine learning, using the R and Python programming languages. The course presents the theory and practice of machine learning through a number of concrete applications, including a custom spam filter, sentiment detection in reviews, language detection, and latent semantic analysis.
|
|
Prerequisites
|
unspecified
|
|
Assessment methods and criteria
|
Student performance, Systematic Observation of Student, Seminar Work
(1) Elaboration and completion of assigned tasks. (2) Reading the assigned materials.
|
|
Recommended literature
|
-
Andres, J., Benešová, M., Kubáček, L., Vrbková, J. (2011). Methodological note on the fractal analysis of texts. Journal of Quantitative Linguistics, 18(4), 337-367.
-
Hřebíček, L. (2002). Vyprávění o lingvistických experimentech s textem. Praha.
-
Popescu, I. (2009). Word Frequency Studies.
-
Wimmer, G. et al. (2003). Úvod do analýzy textov. Bratislava.
|