Instructions

A course is the basic teaching unit; it is designed as a medium for a student to acquire the comprehensive knowledge and skills indispensable in the given field. A course guarantor is responsible for the factual content of the course.
For each course, there is a department responsible for the course organisation. The person responsible for timetabling in that department sets the teaching schedule and assigns an instructor and/or an examiner to each class.
The expected time demand of a course is expressed by the course attribute extent of teaching. For example, extent = 2+2 indicates two teaching hours of lectures and two teaching hours of seminars (labs) per week.
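The extent notation can be read mechanically. The following is a minimal Python sketch, not part of the study information system: the function name `parse_extent` is hypothetical, and the letter codes P (lecture) and C (seminar/lab) are an assumption, since this page does not define them. For the unlabelled form, the order (lectures first, then seminars) follows the 2+2 example above.

```python
import re

# Assumed letter codes (not defined on this page): P = lecture, C = seminar/lab.
KIND_BY_LETTER = {"P": "lecture", "C": "seminar"}
# Unlabelled form: order taken from the "2+2" example (lectures, then seminars).
POSITIONAL_KINDS = ["lecture", "seminar"]


def parse_extent(extent: str) -> dict[str, int]:
    """Return weekly teaching hours per class type for '2+2' or '2P+1C'."""
    hours: dict[str, int] = {}
    for position, part in enumerate(p.strip() for p in extent.split("+")):
        match = re.fullmatch(r"(\d+)([A-Za-z]?)", part)
        if match is None:
            raise ValueError(f"unrecognised extent component: {part!r}")
        count = int(match.group(1))
        letter = match.group(2).upper()
        if letter:
            if letter not in KIND_BY_LETTER:
                raise ValueError(f"unknown class-type letter: {letter!r}")
            kind = KIND_BY_LETTER[letter]
        else:
            if position >= len(POSITIONAL_KINDS):
                raise ValueError("too many unlabelled components")
            kind = POSITIONAL_KINDS[position]
        hours[kind] = hours.get(kind, 0) + count
    return hours
```

Under these assumptions, `parse_extent("2P+1C")` reads this course's extent as two lecture hours and one seminar hour per week.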
At the end of each semester, the course instructor evaluates the extent to which a student has acquired the expected knowledge and skills. The type of this evaluation is indicated by the attribute completion: a course can be completed by an assessment only ('pouze zápočet'), by a graded assessment ('klasifikovaný zápočet'), by an examination only ('pouze zkouška'), or by an assessment and an examination ('zápočet a zkouška').
The difficulty of a course is expressed by its number of ECTS credits.
A course is in session (i.e., teaching takes place) during a semester. Each course is offered in either the winter ('zimní') or the summer ('letní') semester of an academic year. Exceptionally, a course may be offered in both semesters.
The subject matter of a course is described in various texts.

MI-PDD.16 Data Preprocessing
Extent of teaching: 2P+1C
Instructor:
Completion: Z,ZK
Department: 18105
Credits: 5
Semester: Z

Annotation:
Students learn to prepare raw data for further processing and analysis. They learn which algorithms can be used to extract parameters from various data sources, such as images, texts, and time series, and acquire the skills to apply these theoretical concepts to a specific problem in individual projects, e.g., parameter extraction from image data or from the Internet.

Lecture syllabus:
1. Data exploration, exploratory analysis techniques, visualization of raw data.
2. Descriptive statistics.
3. Methods to determine the relevance of features.
4. Problems with data: dimensionality, noise, outliers, inconsistency, missing values, non-numeric data.
5. Data cleaning, transformation, imputing, discretization, binning.
6. Reduction of data dimension.
7. Reduction of data volume, class balancing.
8. Feature extraction from text.
9. Feature extraction from documents, web. Preprocessing of structured data.
10. Feature extraction from time series.
11. Feature extraction from images.
12. Data preparation case studies.
13. Automation of data preprocessing.
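
As a small illustration of the kind of step named in topic 5 (imputing and binning), here is a minimal Python sketch. It is not taken from the course materials: the function names are hypothetical, and median imputation and equal-width bins are assumed only as one simple choice among many.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


def equal_width_bins(values, n_bins):
    """Map each value to a bin index in 0..n_bins-1 using equal-width intervals."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into the first bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land on index n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


raw = [1.0, None, 3.0, 8.0]
cleaned = impute_median(raw)           # missing value replaced by the median, 3.0
bins = equal_width_bins(cleaned, 2)    # two bins covering [1.0, 8.0]
```

The median is used here rather than the mean because it is less sensitive to outliers, which the syllabus lists as a separate data problem.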

Seminar syllabus:
1. Assignment of course projects.
2. Consultations.
3. Presentation of course projects.

Literature:
1. Pyle, D. Data Preparation for Data Mining. Morgan Kaufmann, 1999. ISBN 1558605290.
2. Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L. A. Feature Extraction: Foundations and Applications (Studies in Fuzziness and Soft Computing). Springer, 2006. ISBN 3540354875.

Requirements:
Fundamentals of statistics, FCD course in data mining. The recommended prerequisite is BIE-VZD.

The course has been replaced by the equivalent NI-PDD. Course information and teaching materials can be found at https://courses.fit.cvut.cz/MI-PDD/

The course is also part of the following Study plans:
Study Plan       Study Branch/Specialization                 Role  Recommended semester
MI-SPOL.2016     Unspecified Branch/Specialisation of Study  VO    1
MI-ZI.2018       Knowledge Engineering                       PO    1
MI-WSI-WI.2016   Web and Software Engineering                V     1
MI-WSI-SI.2016   Web and Software Engineering                V     1
MI-PSS.2016      Computer Systems and Networks               V     1
MI-SP-SP.2016    System Programming                          V     1
MI-WSI-ISM.2016  Web and Software Engineering                V     1
MI-NPVS.2016     Design and Programming of Embedded Systems  V     1
MI-ZI.2016       Knowledge Engineering                       PO    1
MI-SP-TI.2016    System Programming                          V     1
MI-PB.2016       Computer Security                           V     1


Page updated 28. 3. 2024, semester: Z/2023-4, L/2019-20, L/2022-3, Z/2019-20, Z/2022-3, L/2020-1, L/2023-4, Z/2020-1, Z,L/2021-2.
Send comments on the content presented here to the Administrator of study plans.
Design and implementation: J. Novák, I. Halaška