Note: Franz Kiraly's primary affiliation changed to Shell UK (Shell.ai) in Jan 2020. He retains an honorary faculty position at UCL. Due to IT problems, this home page is frozen in its 2019 state – please note that it therefore contains outdated information. UCL ISD is working on resolving this problem. Dr Kiraly can still be contacted via his UCL email address.



Core interests

As a practical statistician and machine learner, I am interested in creating a data analytics workflow which is empirically solid, quantitative, and useful in the real world.

My research aims to provide the foundations, through:

(i) studying external assessment, comparison, and validation of white-box and black-box methodology: how to empirically test whether the (black/white-box) method does what is claimed? Is it better than simpler alternatives, or better than a random guess?

(ii) theoretical analysis and practical workflow building for complex modelling tasks, e.g., in the presence of structured/hierarchical observation mechanisms or non-standard/composite data types. For example, prediction in the context of time series, spatial observations, comparisons, multiple data sources.

(iii) design and implementation of automated modelling and model validation workflows: how to best do the above in a suitable software environment? How to use external checks to find the most suitable model, especially given trade-offs such as those between accuracy, computational cost, and interpretability?

These are especially relevant in applications where the data and the associated scientific questions, rather than a single method class, are the focus of interest; current project and collaboration domains include energy, finance, clinical health, sports and prevention.

Recent projects

Selected recent work on data scientific methodology:

Workflow design and theory for probabilistic supervised learning. Where supervised learning predicts a label, the probabilistic variant additionally aims to predict the uncertainty of that prediction. Our work is the first to formalize this task in an entirely model-agnostic way, and provides a number of theoretical insights as well as a formal workflow design implemented in a python package, , which is sklearn for the probabilistic case.
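To make the idea of probabilistic supervised learning concrete, here is a minimal sketch using only the Python standard library. It is not the package mentioned above; the function names (`fit_gaussian_linear`, `fit_baseline`, `gaussian_log_loss`), the Gaussian predictive form, and the synthetic data are all assumptions made for illustration. Each fitted model returns a predictive distribution (mean and standard deviation) rather than a single label, and is scored on held-out data by log-loss against an uninformed baseline.

```python
import math
import random
from statistics import fmean, pstdev


def fit_gaussian_linear(xs, ys):
    """Least-squares line plus one constant residual standard deviation."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sigma = math.sqrt(fmean([r * r for r in residuals]))
    # The fitted model maps a feature x to a predictive distribution (mean, sd).
    return lambda x: (intercept + slope * x, sigma)


def fit_baseline(xs, ys):
    """Uninformed baseline: the same Gaussian for every x."""
    mu, sigma = fmean(ys), pstdev(ys)
    return lambda x: (mu, sigma)


def gaussian_log_loss(predict, xs, ys):
    """Mean negative log density of the observed labels under the predictions."""
    total = 0.0
    for x, y in zip(xs, ys):
        mu, sigma = predict(x)
        total += 0.5 * math.log(2 * math.pi * sigma ** 2)
        total += (y - mu) ** 2 / (2 * sigma ** 2)
    return total / len(xs)


# Synthetic data (assumption): a noisy linear relationship.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(400)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1.5) for x in xs]
train_x, test_x = xs[:300], xs[300:]
train_y, test_y = ys[:300], ys[300:]

model_loss = gaussian_log_loss(fit_gaussian_linear(train_x, train_y), test_x, test_y)
baseline_loss = gaussian_log_loss(fit_baseline(train_x, train_y), test_x, test_y)
print(f"model log-loss: {model_loss:.3f}, baseline log-loss: {baseline_loss:.3f}")
```

A lower held-out log-loss than the baseline indicates that the predicted distributions carry information beyond the marginal distribution of the labels.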

Predictive independence testing and graphical modelling. Our work establishes a close theoretical connection between the task of testing whether variables are (conditionally) independent, and testing whether it is possible to predict one from the other (better than a certain baseline). This leads to a close link between the predictive modelling and the independence testing workflows, enabling easy multivariate independence testing.
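The link between prediction and independence testing can be sketched as follows. This is an illustrative simplification, not the procedure from our work: the helper names (`fit_line`, `predictive_dependence_p_value`), the linear model, the squared loss, and the sign-flip test on paired loss differences are all assumptions chosen to keep the example short. The idea is to check on held-out data whether predicting y from x beats a baseline that ignores x.

```python
import random
from statistics import fmean


def fit_line(xs, ys):
    """Least-squares line; returns a function mapping x to a predicted y."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def predictive_dependence_p_value(xs, ys, n_train, n_flips=2000, seed=0):
    """One-sided p-value for 'x helps to predict y better than the mean of y'."""
    rng = random.Random(seed)
    train_x, test_x = xs[:n_train], xs[n_train:]
    train_y, test_y = ys[:n_train], ys[n_train:]
    model = fit_line(train_x, train_y)
    baseline = fmean(train_y)
    # Per-sample improvement in squared loss of the model over the baseline.
    diffs = [(y - baseline) ** 2 - (y - model(x)) ** 2
             for x, y in zip(test_x, test_y)]
    observed = fmean(diffs)
    # Sign-flip test: without predictive information in x, the improvements
    # are treated as symmetric around zero (a simplifying assumption).
    hits = 0
    for _ in range(n_flips):
        flipped = fmean([d if rng.random() < 0.5 else -d for d in diffs])
        if flipped >= observed:
            hits += 1
    return (hits + 1) / (n_flips + 1)


# Synthetic data (assumption): one dependent and one independent pair.
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(400)]
dependent_ys = [1.0 * x + rng.gauss(0, 1) for x in xs]
independent_ys = [rng.gauss(0, 1) for _ in xs]

p_dependent = predictive_dependence_p_value(xs, dependent_ys, n_train=200)
p_independent = predictive_dependence_p_value(xs, independent_ys, n_train=200)
print(f"p (dependent pair): {p_dependent:.4f}, p (independent pair): {p_independent:.4f}")
```

A small p-value is evidence that x predicts y better than the baseline, and hence evidence against independence. Any predictive model could take the place of the linear fit in this sketch.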

Learning with complex data types. In cases where variables are not numbers, categories, or strings, but more complex objects such as series, images, or graphs, a more abstract data storage, processing and modelling infrastructure is needed. Model composition is the natural paradigm for the latter, leading to challenges in object-oriented software design. The package for python extends the joint functionality of pandas and sklearn transformers to this setting (work in progress).
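Model composition over a series-valued column can be illustrated with a small standard-library sketch. The class names (`SeriesSummary`, `NearestCentroid`, `ComposedModel`) and the toy data are hypothetical and are not the API of the package mentioned above; they only mimic the fit/transform/predict convention known from sklearn.

```python
from statistics import fmean, pstdev


class SeriesSummary:
    """Transformer: maps each series-valued cell to a fixed-length numeric tuple."""

    def fit(self, column, labels=None):
        return self

    def transform(self, column):
        return [(fmean(series), pstdev(series)) for series in column]


class NearestCentroid:
    """Estimator on numeric tuples: predicts the label of the closest class mean."""

    def fit(self, rows, labels):
        self.centroids_ = {}
        for label in set(labels):
            members = [row for row, lab in zip(rows, labels) if lab == label]
            self.centroids_[label] = tuple(fmean(col) for col in zip(*members))
        return self

    def predict(self, rows):
        def squared_distance(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: squared_distance(row, self.centroids_[c]))
            for row in rows
        ]


class ComposedModel:
    """Composite: a transformer followed by an estimator, usable as one estimator."""

    def __init__(self, transformer, estimator):
        self.transformer = transformer
        self.estimator = estimator

    def fit(self, column, labels):
        features = self.transformer.fit(column, labels).transform(column)
        self.estimator.fit(features, labels)
        return self

    def predict(self, column):
        return self.estimator.predict(self.transformer.transform(column))


# Each cell of the column is a whole series; series lengths may differ.
column = [
    [1.0, 1.1, 0.9, 1.0],
    [5.0, 5.2, 4.8],
    [0.8, 1.2, 1.0, 1.1, 0.9],
    [4.9, 5.1, 5.0, 5.3],
]
labels = ["low", "high", "low", "high"]

model = ComposedModel(SeriesSummary(), NearestCentroid()).fit(column, labels)
predictions = model.predict([[1.05, 0.95, 1.0], [5.1, 4.9]])
print(predictions)
```

The composite exposes the same fit/predict interface as its final estimator, so it can be nested or validated like any other model.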

Selected recent application work:

Prediction and Prevention of Falls in a Neurological In-Patient Population. Falling, and associated injuries such as hip fracture, are a major strain on health and health resources, especially in the elderly or hospitalized. We are able to predict, with high accuracy in a neurological population, whether a patient is likely to fall during their stay, using only a number connecting test (the Trail making test).

Quantification and Prediction in Running Sports. Characterizing the training state of running athletes, and making predictions for race planning and training. We can predict Marathon times with an error on the order of a few minutes, and we are able to accurately summarize an athlete by three characteristic numbers.

Professional roles

I currently hold the following professional roles, in which I act as a point of contact for the matters described below:

UCL Statistics: enterprise coordinator & MAPS enterprise board member
Internal enabling role, and point of contact for translational engagement with UCL statistics and the MAPS faculty, especially on data science topics - e.g., data scientific consulting, courses on data analytics, statistics, machine learning/AI, commissioned research projects. MAPS faculty PoC is Jawwad Darr (UCL MAPS faculty, Vice-Dean Enterprise)

UCL Statistics: diversity & equality board member
Internal point of contact for diversity & equality related matters.

UCL CoMPLEX: board member
Representing UCL statistics on the board of UCL CoMPLEX. Potential point of contact for academics and industry, especially on topics in the intersection of biotech/health and statistics/machine learning/AI.

Alan Turing Institute: Data Study Groups, scientific lead and coordination team member
Co-organisation of the outreach scheme, management of data scientific and technical aspects. Possible point of contact for translational engagement with the Alan Turing Institute. Main PoC are Sebastian Vollmer (Data Study Groups, Director) and Nicolas Guernion (Alan Turing Institute, Director of Partnerships).

PhD applications

I am currently accepting applications for PhD supervision, subject to the formal requirements for obtaining a graduate research degree, and to supervisory limits. Applications should include a CV, a short description of your research interests, a description of your background in mathematics/statistics, data analysis, and programming, as well as a motivation statement on what you are looking for in a PhD.

PhD stipends (through grants and projects) may be available - for these, kindly apply through the official channels, e.g., through the respective funding bodies (which may depend on your citizenship), or the UCL Human Resources portal.

Internship applications

I am accepting applications for short-term or summer internships from highly talented applicants. These last 1-3 months and cover subsistence plus world-wide travel expenses at the bursary rate. Unsolicited applications are possible, and should include an up-to-date CV, a motivation letter with a description of research interests, evidence of scientific writing skills (e.g., a thesis or paper), evidence of data analytics skills (e.g., an analytics report), and/or evidence of programming skills (e.g., a github account with public repositories).

Short Curriculum Vitae

At the , I obtained my Diplomae (equivalent to MSc or MD, as regards content) in Computer Science, Mathematics, Medicine and Physics in the years 2003, 2005, 2006 and 2011; in 2008, I received my PhD in Medicine.

From 2007 to 2010, I completed my PhD thesis in Mathematics on the topic of Arithmetic Geometry, under supervision of and in cooperation with in Ulm.

From 2010 to 2013, I worked as a postdoctoral researcher in 's , at the Technische Universität Berlin, and I was an associate member of 's , at the Freie Universität Berlin.

In 2012 I was appointed at the where I spent a total of six months, split over 2012, 2013 and 2014.

Since 2013, I have been working as a lecturer (comparable to a tenured assistant professor) at .

In 2015, I visited the as an AScI Visiting Fellow.

Since 2016, I have also been a faculty fellow at the newly founded whose vision is to bundle and catalyze the UK's efforts in modern data science, and I have recently been co-organizing its as a member of the DSG coordination team.

[Download Curriculum Vitae] (2018/04)


Király FJ. Wild quotient singularities of surfaces and their regular models. Doctoral dissertation, Ulm. 2010.

Király FJ. Vergleich verschiedener Postremissionsstrategien bei der akuten myeloischen Leukämie mit normalem Karyotyp. Doctoral dissertation, Ulm. 2008.

Data scientific software

Open source software in the sklearn ecosystem - contributions and collaborations are very welcome:

- extending pandas to data containers for structured, hierarchical and complex data types, and transformer interfaces compatible with the sklearn API

- machine learning toolbox for paradigm-agnostic probabilistic supervised learning, i.e., probabilistic label predictions, extends the sklearn API and provides interfaces for Bayesian toolboxes (see also section 8 of the )

- predictive conditional independence testing with a workflow interface to predictive models in sklearn (see also section 6 of the )

