Description
Aims:
The module introduces data analytics and provides some fundamental data science tools. Students will learn statistical tools to identify regularities and discover patterns and laws in complex datasets, together with instruments to analyse, characterize, validate, parameterize and model complex data. Practical issues in business data analysis and statistics will be covered through specific case studies.
Intended learning outcomes:
On successful completion of the module, a student will be able to:
- Analyse the main statistical features of complex datasets.
- Understand how to analyse and characterize complex data empirically.
- Understand how to compute relevant statistical quantities and quantify their confidence intervals.
- Understand how to build sensible models and how to parameterize and validate them.
- Understand how to quantify the inter-dependency and causality structure between different variables.
- Understand how to use the outcomes of data analysis to develop forecasting tools.
Indicative content:
The following are indicative of the topics the module will typically cover:
Empirical investigation of complex data:
- Essential practical familiarization with complex and big data, and with the software packages most commonly used to analyse them. Typical challenges with real data. Basics of data acquisition, manipulation, cleaning, filtering, representation and plotting.
Univariate and multivariate statistics:
- Marginal probability, joint probability and conditional probability. Empirical estimation of probability distributions. Measures of dependency. Cause and effect, Granger causality.
Information theoretic measures:
- Mutual information, transfer entropy. Spurious correlations and regularization. Forecasting and regressions. Hypothesis testing and validation.
Modelling and filtering through networks:
- Basics of complex networks: definitions and properties. Construction of networks of interactions from correlation matrices and causality measures. Information filtering through networks.
Probabilistic modelling:
- Constructing predictive probabilistic models from data. Testing and validating model performance. Selecting between alternative models.
Applications and case-studies:
- The material and methods covered in the module will be applied to practical cases and real data through case studies.
- Case studies will be discussed in class and used as demonstrations of the methodologies covered during the lectures. Other case studies will instead be given as assignments and will represent the core material for the assessment.
Requisites:
To be eligible to select this module as optional or elective, a student must: (1) be registered on a programme and year of study for which it is formally available; and (2) have a good knowledge of basic mathematics and statistics.
Module deliveries for 2024/25 academic year
Last updated
This module description was last updated on 19th August 2024.