This course focuses on linguistic resources such as web corpora, language models, lexical and syntactic databases, and treebanks. It addresses the following questions:
-    How can suitable data sources be found?
-    What types of data sources are available?
-    In which formats are the data provided, and how can they be handled?
-    How and for what purposes can these data sets be used?
-    How can you create your own data sets?
-    How can the data be analyzed?
Each topic is first introduced theoretically and then applied in practice. Prior knowledge of Python is therefore desirable but not mandatory. The goal is to support students in selecting, creating, and processing suitable linguistic resources, and in choosing analysis methods for quantitative questions, for example in term papers, theses, or research papers.
In accordance with the module handbook, the course offers only a proof of active participation (BN); no exam is offered. To obtain the BN, students must complete regular homework assignments.