Instructions:

A course is the basic teaching unit, designed as a medium through which a student acquires the comprehensive knowledge and skills indispensable in a given field. A course guarantor is responsible for the factual content of the course.
Each course has a department responsible for its organisation. The person responsible for timetabling in that department sets the teaching schedule and assigns an instructor and/or an examiner to each class.
The expected time demand of a course is expressed by the course attribute extent of teaching. For example, extent = 2+2 indicates two teaching hours of lectures and two teaching hours of seminars (labs) per week.
At the end of each semester, the course instructor evaluates the extent to which a student has acquired the expected knowledge and skills. The type of this evaluation is indicated by the attribute completion: a course can be completed by an assessment only ('pouze zápočet'), by a graded assessment ('klasifikovaný zápočet'), by an examination only ('pouze zkouška'), or by an assessment and an examination ('zápočet a zkouška').
The difficulty of a given course is expressed by its number of ECTS credits.
A course is in session (i.e., teaching takes place) during a semester. Each course is offered in either the winter ('zimní') or the summer ('letní') semester of an academic year. Exceptionally, a course may be offered in both semesters.
The subject matter of a course is described in various texts.

NI-PDD Data Preprocessing
Extent of teaching: 2P+1C
Instructor: Jiřina M.
Completion: Z,ZK
Department: 18105
Credits: 5
Semester: Z
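
The extent and completion attributes in the header above follow the compact notation described in the instructions. The sketch below shows one way such strings could be decoded. The function names are hypothetical, and the meanings of the letter suffixes (P for lectures, C for seminars/labs) and of the code KZ for a graded assessment are assumptions, since this page does not define them.

```python
import re

# Assumed meanings; the page itself does not define these letters and codes.
EXTENT_KINDS = {"P": "lecture", "C": "seminar/lab"}
COMPLETION_CODES = {
    "Z": "assessment",
    "KZ": "graded assessment",  # assumed code for 'klasifikovaný zápočet'
    "ZK": "examination",
}


def parse_extent(extent):
    """Parse '2P+1C' or '2+2' into a list of (hours_per_week, kind) pairs."""
    # For the plain form '2+2' the instructions give a positional meaning:
    # first number = lectures, second number = seminars (labs).
    positional_kinds = ["lecture", "seminar/lab"]
    result = []
    for position, part in enumerate(extent.split("+")):
        match = re.fullmatch(r"(\d+)\s*([A-Za-z]?)", part.strip())
        if match is None:
            raise ValueError(f"unrecognised extent component: {part!r}")
        hours = int(match.group(1))
        letter = match.group(2).upper()
        if letter:
            kind = EXTENT_KINDS.get(letter, letter)
        elif position < len(positional_kinds):
            kind = positional_kinds[position]
        else:
            kind = "unknown"
        result.append((hours, kind))
    return result


def parse_completion(completion):
    """Parse 'Z,ZK' into a list of human-readable completion types."""
    codes = [code.strip() for code in completion.split(",")]
    return [COMPLETION_CODES.get(code, code) for code in codes]


print(parse_extent("2P+1C"))
print(parse_completion("Z,ZK"))
```

Unknown letters and codes are passed through unchanged rather than rejected, so the sketch does not silently misreport a notation it does not know.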

Annotation:
Students learn to prepare raw data for further processing and analysis. They learn which algorithms can be used to extract information from various data sources, such as images, texts, and time series, and acquire the skills to apply these theoretical concepts to specific problems in individual projects, e.g., extracting features from images or from web pages.

Lecture syllabus:
1. Introduction, KDDM standards, CRISP-DM, DM software.
2. Visualization and data exploration.
3. Methods for determining the significance of features.
4. Problems in data: preparation, representation, validation, cleaning, missing values, date format, conversion of non-numeric data.
5. Problems in data: discretization / binning, outliers, cluster analysis, false predictors, group balancing, transformation, sampling.
6. Data reduction: nearest neighbor rule, boundaries between groups, CNN, distance graphs, Wilson editing, multi-edit method.
7. Data reduction: class balancing, Tomek links, SMOTE method, extended nearest neighbor rule.
8. Design methods PCA, ICA, LDA.
9. Preprocessing of time series and extraction of features.
10. Text preprocessing and feature extraction.
11. Image preprocessing and feature extraction: image description, filtering, edge detection, Fourier transform.
12. Image preprocessing and feature extraction: edge and area segmentation, description of objects in the image, feature and structural methods.
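
As an illustration of the data reduction topics in lectures 6 and 7, the following sketch implements Wilson editing and Tomek link detection on a toy one-dimensional data set. It is a minimal standard-library version under stated assumptions, not the course's reference implementation: the function names, the toy data, the Euclidean distance, and the choice k = 3 are all assumptions made for this example.

```python
from collections import Counter
from math import dist


def nearest_indices(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: dist(points[i], points[j]))
    return others[:k]


def tomek_links(points, labels):
    """Pairs (i, j), i < j, that are mutual nearest neighbours with different labels."""
    nn = [nearest_indices(points, i, 1)[0] for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


def wilson_editing(points, labels, k=3):
    """Indices of points whose label matches the majority label of their k nearest neighbours."""
    kept = []
    for i in range(len(points)):
        votes = Counter(labels[j] for j in nearest_indices(points, i, k))
        if votes.most_common(1)[0][0] == labels[i]:
            kept.append(i)
    return kept


# Toy data (hypothetical): two clusters, one mislabelled point inside
# cluster "a" (index 3), and one close boundary pair (indices 7 and 8).
points = [(0.0,), (1.0,), (2.0,), (1.4,), (10.0,), (11.0,), (12.5,), (5.8,), (6.1,)]
labels = ["a", "a", "a", "b", "b", "b", "b", "a", "b"]

print(tomek_links(points, labels))     # [(1, 3), (7, 8)]
print(wilson_editing(points, labels))  # [0, 1, 2, 4, 5, 6]
```

On this data, Wilson editing discards the mislabelled point and both members of the boundary pair, while Tomek link detection only reports the cross-class pairs and leaves the decision of which member to remove to the caller.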

Seminar syllabus:
1. Assignment of course projects.
2. Consultations.
3. Presentation of course projects.

Literature:
1. Pyle, D.: Data Preparation for Data Mining. Morgan Kaufmann, 1999. ISBN 1558605290.
2. Guyon, I. - Gunn, S. - Nikravesh, M. - Zadeh, L. A.: Feature Extraction: Foundations and Applications (Studies in Fuzziness and Soft Computing). Springer, 2006. ISBN 3540354875.
3. García, S. - Luengo, J. - Herrera, F.: Data Preprocessing in Data Mining (Intelligent Systems Reference Library). Springer, 2015. ISBN 978-3319102467.
4. Blokdyk, G.: Data pre-processing (2nd Edition). CreateSpace Independent Publishing Platform, 2018. ISBN 978-1987493245.

Requirements:
Fundamentals of statistics, FCD course in data mining. The recommended prerequisite is BIE-VZD.

The course is equivalent to MI-PDD.16. Course information and teaching materials can be found at https://courses.fit.cvut.cz/MI-PDD/

The course is also part of the following Study plans:
Study Plan | Study Branch/Specialization | Role | Recommended semester
NI-PSS.2020 | Computer Systems and Networks | V | 1
BI-SPOL.2015 | Unspecified Branch/Specialisation of Study | V | None
BI-WSI-PG.2015 | Web and Software Engineering | V | None
BI-WSI-WI.2015 | Web and Software Engineering | V | None
BI-WSI-SI.2015 | Web and Software Engineering | V | None
BI-ISM.2015 | Information Systems and Management | V | None
BI-ZI.2018 | Knowledge Engineering | V | None
BI-PI.2015 | Computer Engineering | V | None
BI-TI.2015 | Computer Science | V | None
BI-BIT.2015 | Computer Security and Information Technology | V | None
NI-SP.2020 | System Programming | V | 1
NI-SP.2023 | System Programming | V | 1
NI-SPOL.2020 | Unspecified Branch/Specialisation of Study | VO | 1
BI-SPOL.21 | Unspecified Branch/Specialisation of Study | V | None
BI-PI.21 | Computer Engineering 2021 (in Czech) | V | None
BI-PG.21 | Computer Graphics 2021 (in Czech) | V | None
BI-MI.21 | Business Informatics 2021 (in Czech) | V | None
BI-IB.21 | Information Security 2021 (in Czech) | V | None
BI-PS.21 | Computer Networks and Internet 2021 (in Czech) | V | None
BI-PV.21 | Computer Systems and Virtualization 2021 (in Czech) | V | None
BI-SI.21 | Software Engineering 2021 (in Czech) | V | None
BI-TI.21 | Computer Science 2021 (in Czech) | V | None
BI-UI.21 | Artificial Intelligence 2021 (in Czech) | V | None
BI-WI.21 | Web Engineering 2021 (in Czech) | V | None
NI-MI.2020 | Managerial Informatics | V | 3
NI-TI.2023 | Computer Science | V | 1
NI-TI.2020 | Computer Science | V | 1
NI-NPVS.2020 | Design and Programming of Embedded Systems | V | 3
NIE-DBE.2023 | Digital Business Engineering | VO | 1
NI-PB.2020 | Computer Security | V | 1
NI-SI.2020 | Software Engineering (in Czech) | V | 1
NI-WI.2020 | Web Engineering | V | 1
NI-TI.2018 | Computer Science | V | 1,3
NI-ZI.2020 | Knowledge Engineering | PS | 1
NI-SPOL.2020 | Unspecified Branch/Specialisation of Study | V | 1


Page updated 28. 3. 2024. Semesters: Z/2023-4, L/2019-20, L/2022-3, Z/2019-20, Z/2022-3, L/2020-1, L/2023-4, Z/2020-1, Z,L/2021-2. Design and implementation: J. Novák, I. Halaška