
A course is the basic teaching unit. It is designed as a medium for a student to acquire the comprehensive knowledge and skills indispensable in the given field. A course guarantor is responsible for the factual content of the course.
For each course, there is a department responsible for the course organisation. The person responsible for timetabling in a given department sets the teaching schedule and assigns an instructor and/or an examiner to each class.
The expected time load of a course is expressed by the course attribute extent of teaching. For example, extent = 2+2 indicates two teaching hours of lectures and two teaching hours of seminars (labs) per week.
At the end of each semester, the course instructor has to evaluate the extent to which a student has acquired the expected knowledge and skills. The type of this evaluation is indicated by the attribute completion. A course can be completed by an assessment only ('pouze zápočet'), by a graded assessment ('klasifikovaný zápočet'), by an examination only ('pouze zkouška'), or by an assessment and an examination ('zápočet a zkouška').
The difficulty of a given course is expressed by the number of ECTS credits.
A course is in session (i.e. teaching is going on) during a semester. Each course is offered in either the winter ('zimní') or the summer ('letní') semester of an academic year. Exceptionally, a course might be offered in both semesters.
The subject matter of a course is described in various texts.

NI-DDW Web Data Mining
Extent of teaching: 2P+1C
Instructor: Kuchař J.
Completion: Z,ZK
Department: 18102
Credits: 5
Semester: L

Annotation:
Students will learn the latest methods and technologies for web data acquisition, analysis, and utilization of the discovered knowledge. Students will gain an overview of web mining techniques for web crawling, web structure analysis, web usage analysis, web content mining, and information extraction. Students will also gain an overview of the most recent developments in the field of the social web and recommendation systems.

Lecture syllabus:
1. Key web data mining principles.
2. Web content mining approaches (formats, restrictions, ethical aspects).
3. Web content mining tools.
4. Accessing and extracting specific web content (deep web).
5. Main text mining concepts.
6. Practical applications of text mining.
7. Social network structure and content analysis (2).
8. Web graph, web structure mining.
9. Web usage mining: data collecting.
10. Web usage mining: data analysis, web analytics.
11. Recommender systems and personalization.
12. Data stream mining: algorithms and applications.

Seminar syllabus:
1. Basics of data acquisition and processing
2. Text preprocessing, text mining applications
3. Acquisition and analysis of graph-based data
4. User data analysis
5. Basics of recommendation systems
6. Project presentation and assessment

Literature:
1. Liu, B. "Web Data Mining", Springer-Verlag Berlin Heidelberg, 2011. ISBN 978-3-642-19459-7.
2. Charu C. Aggarwal. "Machine Learning for Text", Springer, 2018. ISBN 9783319735313.
3. Easley, D., Kleinberg, J. "Networks, Crowds, and Markets: Reasoning About a Highly Connected World", Cambridge
4. Russell, M. A. "Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More (3rd Edition)", O'Reilly Media, 2019. ISBN 978-1491985045.
5. Charu C. Aggarwal. "Recommender Systems: The Textbook", Springer, 2016. ISBN 9783319296579.

Requirements:
Basic knowledge of web architecture (HTTP, HTML, URI), programming skills (e.g. Java, JavaScript), graph theory, and basic algorithms.

Course information and teaching materials can be found at https://courses.fit.cvut.cz/NI-DDW/

The course is also part of the following study plans:
Study Plan    Study Branch/Specialization                 Role  Recommended semester
NI-SPOL.2020  Unspecified Branch/Specialisation of Study  V     2
NI-TI.2023    Computer Science                            V     2
NI-TI.2020    Computer Science                            V     2
NI-PSS.2020   Computer Systems and Networks               V     2
NI-SI.2020    Software Engineering (in Czech)             V     2
NI-ZI.2020    Knowledge Engineering                       V     2
NIE-DBE.2023  Digital Business Engineering                VO    2
NI-WI.2020    Web Engineering                             PS    2
NI-NPVS.2020  Design and Programming of Embedded Systems  V     2
NI-SP.2020    System Programming                          V     2
NI-SP.2023    System Programming                          V     2
NI-MI.2020    Managerial Informatics                      V     2
NI-SPOL.2020  Unspecified Branch/Specialisation of Study  VO    2
NI-PB.2020    Computer Security                           V     2


Page updated 16. 4. 2024, semester: Z/2024-5, L/2021-2, Z,L/2022-3, Z/2019-20, L/2023-4, L/2019-20, L/2020-1, Z/2021-2, Z/2023-4, Z/2020-1.
Design and implementation: J. Novák, I. Halaška