Course: The statistics of language data and databases

Course title The statistics of language data and databases
Course code KBH/EJAZD
Organizational form of instruction Seminar
Level of course Master
Year of study not specified
Semester Winter and summer
Number of ECTS credits 4
Language of instruction Czech
Status of course unspecified
Form of instruction Face-to-face
Work placements This is not an internship
Recommended optional programme components None
Lecturer(s)
  • Pořízka Petr, PhDr. Ph.D.
Course content
unspecified

Learning activities and teaching methods
Lecture, Dialogic Lecture (Discussion, Dialog, Brainstorming), Work with Text (with Book, Textbook), Demonstration
Learning outcomes
The course introduces combined qualitative and quantitative methods of text analysis and leads students to apply them in comprehensive, exact analyses grounded in quantitative data. To this end, selected software tools (including query languages) are presented, together with corpus databases that serve as extensive, structured sources of authentic language data. During the course, students acquire the basic terminology, theories, and methods of the field needed for effective work with language data.
After completing the course, students will be able to use language databases (corpora), especially those available through the Czech National Corpus (CNC) portal, will have mastered the most important data mining tools and procedures, and will be able to build and evaluate their own small corpus of language data for special purposes.
Prerequisites
unspecified

Assessment methods and criteria
Analysis of Activities (Technical Works), Seminar Work

(1) Participation in tutorials (2) Completion of correspondence tasks (homework) (3) Seminar project
Recommended literature
  • Sketch Engine User Guide.
  • Baker, P. - Hardie, A. - McEnery, T. (2006). A Glossary of Corpus Linguistics. Edinburgh.
  • Machálek, T. (2018). KonText - rozhraní pro vyhledávání v korpusech. Praha.
  • Pořízka, P. (2014). Tvorba korpusů a vytěžování jazykových dat (metody, modely, nástroje). Olomouc.

