Data science for science

Ensuring that research across science and the humanities can make effective use of state-of-the-art methods in artificial intelligence and data science.


It is becoming ever easier to collect large amounts of data across a broad range of research areas, and there is a growing need to understand how these data can best be exploited to make new discoveries. The use of modern computational methods has already revolutionised research in physics and biology, and the stage is set for this approach to become a standard methodology in many different fields. However, we know it is not sufficient just to collect data and hope that generic algorithms will help: the crucial step is to incorporate the deep knowledge that already exists about a system into the computational methods used.

For example, in particle physics, the search for the Higgs boson was guided by the well-developed theory of the Standard Model. In contrast, the search for particles of 'dark matter' has very little theory to guide it. In linguistics, there has been much progress in developing statistical models of language use, but it is not clear how to combine these models with what is understood theoretically about how humans read, write and speak. In other cases, a computational model may be needed to make sense of data: for example, an atomic-level model of a material may be needed when characterising its properties from high-resolution images.

Programme challenges

The main aim of the programme is to work with researchers from all disciplines across the Turing's university partner network, and with national research facilities, to make effective use of state-of-the-art methods in artificial intelligence and data science.

The social sciences and arts provide particularly interesting challenges within the programme. Understanding in these fields is often qualitative, and aligning it with what data sets are telling us can be difficult. There is also a considerable need to provide relevant training in these new methodologies, in a way that is acceptable and meaningful to researchers from non-numerate disciplines.

It is envisaged that, within a very few years, research across universities and national facilities will increasingly be based on the computational data science and AI methods being developed at The Alan Turing Institute and its partners. A key challenge is therefore to keep the programme's research at the vanguard of this movement nationally.