
A course is the basic teaching unit, designed as a medium through which a student acquires the comprehensive knowledge and skills indispensable in the given field. A course guarantor is responsible for the factual content of the course.
For each course, there is a department responsible for the course organisation. The person responsible for timetabling in that department sets the teaching schedule and assigns an instructor and/or an examiner to each class.
The expected time demand of a course is expressed by the course attribute extent of teaching. For example, extent = 2+2 indicates two teaching hours of lectures and two teaching hours of seminars (labs) per week.
At the end of each semester, the course instructor evaluates the extent to which a student has acquired the expected knowledge and skills. The type of this evaluation is indicated by the attribute completion: a course can be completed by an assessment only ('pouze zápočet'), by a graded assessment ('klasifikovaný zápočet'), by an examination only ('pouze zkouška'), or by an assessment and an examination ('zápočet a zkouška').
The difficulty of a given course is expressed by the number of ECTS credits.
A course is in session (i.e., teaching takes place) during a semester. Each course is offered in either the winter ('zimní') or the summer ('letní') semester of an academic year. Exceptionally, a course may be offered in both semesters.
The subject matter of a course is described in various texts.

BI-VZD Data Mining
Extent of teaching: 2P+2C
Instructor: Klouda K., Kovalenko A., Tichý O., Vašata D.
Completion: Z,ZK
Department: 18105
Credits: 4
Semester: L,Z

Annotation:
Students are introduced to the basic methods of discovering knowledge in data. In particular, they learn the basic techniques of data preprocessing, multidimensional data visualization, statistical techniques of data transformation, and fundamental principles of knowledge discovery methods. Students will be aware of the relationships between model bias and variance, and know the fundamentals of assessing model quality. Data mining software is extensively used in the module. Students will be able to apply basic data mining tools to common problems (classification, regression, clustering).

Lecture syllabus:
1. Introduction to the field and applications
2. Decision trees; training, validation, and test sets
3. Ensemble methods (random forest, AdaBoost)
4. Hierarchical clustering, k-means algorithm
5. kNN (k-nearest neighbours)
6. Naive Bayes
7. Linear regression
8. Logistic regression
9. Ridge regression and regularisation
10. Dimensionality reduction
11. Neural networks
12. Natural language processing

Seminar syllabus:
1. Jupyter notebooks and machine learning packages
2. Decision trees, hyperparameter tuning
3. Ensemble methods (random forest, AdaBoost)
4. Hierarchical clustering, k-means algorithm
5. kNN (k-nearest neighbours), cross-validation
6. Naive Bayes classifier
7. Linear regression
8. Logistic regression
9. Ridge regression
10. Dimensionality reduction
11. Neural networks
12. Natural language processing

Literature:
1. Data Mining: Practical Machine Learning Tools and Techniques, I. H. Witten, E. Frank, M. A. Hall, Elsevier, 2011, ISBN 978-0080890364.
2. Deep Learning, I. Goodfellow, Y. Bengio, A. Courville, MIT Press, 2016, ISBN 978-0262035613.
3. Machine Learning: A Probabilistic Perspective, K. P. Murphy, MIT Press, 2012, ISBN 978-0262018029.

Requirements:
Knowledge of calculus, linear algebra, and probability theory is assumed.

Course information and teaching materials are available at https://courses.fit.cvut.cz/BI-VZD/

The course is also part of the following Study plans:
Study Plan      Study Branch/Specialization                   Role  Recommended semester
BI-WSI-WI.2015  Web and Software Engineering                  V     5
BI-WSI-PG.2015  Web and Software Engineering                  V     5
BI-WSI-SI.2015  Web and Software Engineering                  V     5
BI-SPOL.2015    Unspecified Branch/Specialisation of Study    VO    5
BI-BIT.2015     Computer Security and Information technology  V     5
BI-TI.2015      Computer Science                              PO    5
BI-PI.2015      Computer engineering                          V     5
BI-ZI.2018      Knowledge Engineering                         PO    5
BI-ISM.2015     Information Systems and Management            V     5


Page updated 28. 3. 2024; semesters: Z/2023-4, L/2019-20, L/2022-3, Z/2019-20, Z/2022-3, L/2020-1, L/2023-4, Z/2020-1, Z,L/2021-2. Send comments on the content presented here to the Administrator of study plans. Design and implementation: J. Novák, I. Halaška