Description
This module covers the technologies, principles and applications of data collection, preparation, and
storage for data science systems. Topics include sampling theory and practice, data collection,
database querying and processing, data processing, and feature engineering. Practical assignments in Python let students apply the underlying principles to problems in database querying and processing and in data wrangling.
Syllabus:
This module will cover the following topics:
• Introduction to Data & Data Acquisition
• Data Preparation
• Data Storage
• Feature Engineering
• Hardware & Methods for Efficient Data Processing
• Data Representation
• Data Modelling, Analysis and Processing
Learning Outcomes:
At the end of this module, participants should be able to:
• Understand the concepts, techniques & tools for data acquisition, processing & analysis, both in
theory and practice (e.g., data collection, sampling, processing, integration, feature engineering,
exploration, efficient data processing, ethical considerations, and ML techniques);
• Understand the hardware architectures associated with data processing and machine learning;
• Understand & implement tools for handling large-scale datasets in real-world applications;
• Acquire hands-on experience with popular ML libraries, well-known real-world datasets, databases, and
Python.
Module deliveries for 2024/25 academic year
Last updated
This module description was last updated on 19th August 2024.