Highly sophisticated computers are mining vast amounts of data from the web, digital maps and satellite imagery to pick out trends in areas like demographics, transport and the environment.
Combining satellite data from the EU’s Sentinel missions with other data is helping scientists figure out how to improve shipping safety. Copyright 2016 EUMETSAT
‘We are working with all kinds of data,’ said Dr Jon Blower, at the University of Reading, UK. ‘Smart machines can then help us find the needle in the haystack or, for example, identify interesting trends in petabytes of satellite imagery.’
Among other things, this work has led to new ways to improve the efficiency of shipping by tracking ocean eddies, and to improve greenhouse gas emission estimates by tracking land-use change in the UK.
The technology taps into a growing trend to open up access to the vast amounts of data being collected every day by geographic information systems (GIS) and Earth observation programmes, including the EU’s Copernicus system, which observes the Earth using a family of so-called Sentinel satellites.
‘Satellite imagery forms a big part of it,’ said Dr Blower. ‘The Sentinel programme got going part-way through our project, so we were able to take advantage of that data. There’s also a lot of open geographic data that we can use, such as OpenStreetMap.’
His project is MELODIES, a pan-European initiative funded by the EU to develop ways of interpreting and analysing these vast data stores. It does so using a technology called linked data, which enables datasets to be connected and shared.
‘We want to show open data publishers that if they put the data out there, people will use it,’ he said.
In September, the European Commission proposed new copyright rules that will give universities, research institutes, and research-performing companies more legal rights to perform text and data mining. The proposed law would benefit those working with big data for public interest purposes.
Interoperability
Researchers at the coal face of open data say that linking information together and making vast datasets searchable is by far the biggest problem they encounter, whether they are start-ups, universities or research institutes.
That’s why a number of researchers are working specifically on linked data, creating tools that allow datasets of different sizes and formats to be connected and searched.
‘Linked data is a framework of tools and practices for exposing, sharing and connecting information,’ explained Jesús Estrada from the SmartOpenData project, which created a way of linking up a number of different European environmental datasets.
‘The key idea is that each data provider wants to publish information and this information is easily understandable by others.’
This is done by creating meaningful, or semantic, connections between datasets that can identify different representations of the same content. For example, a linked open data system would allow geographical data points from different datasets to be interpreted as ‘roads’ or ‘rivers’ and contain relationships, such as ‘road goes over river’, allowing people to obtain comparable or complementary information from different sources.
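To make the idea concrete, here is a minimal sketch in Python. It is not the SmartOpenData system: every identifier, dataset and value below is invented for illustration, and real linked data deployments typically use RDF with URIs and a vocabulary such as `owl:sameAs` rather than plain tuples. The sketch assumes two publishers describe the same river under different identifiers, and that a third source states the two identifiers refer to the same thing.

```python
# Each fact is a (subject, predicate, object) triple.
# All identifiers and values are hypothetical.
dataset_a = [
    ("a:feature/17", "type", "Road"),
    ("a:feature/17", "name", "Bridge Street"),
    ("a:feature/17", "goes_over", "a:feature/42"),
    ("a:feature/42", "type", "River"),
]
dataset_b = [
    ("b:water/9", "type", "River"),
    ("b:water/9", "nitrate_mg_per_l", 3.2),
]
# A link stating that both publishers are describing the same river.
links = [("a:feature/42", "same_as", "b:water/9")]


def canonical_ids(triples):
    """Return a function mapping each identifier to one representative,
    so that identifiers joined by 'same_as' share the same representative."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the chain as we walk it
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == "same_as":
            parent[find(o)] = find(s)
    return find


def merge(*datasets):
    """Combine datasets into one set of triples with unified identifiers."""
    triples = [t for d in datasets for t in d]
    find = canonical_ids(triples)
    merged = set()
    for s, p, o in triples:
        if p != "same_as":
            obj = find(o) if isinstance(o, str) else o
            merged.add((find(s), p, obj))
    return merged


def describe(graph, subject):
    """Collect everything the merged graph says about one subject."""
    return {p: o for s, p, o in graph if s == subject}


graph = merge(dataset_a, dataset_b, links)

# For every 'road goes over river' relationship, pair the road's name
# with everything known about the river, whichever dataset it came from.
crossings = [
    (describe(graph, s)["name"], describe(graph, o))
    for s, p, o in graph
    if p == "goes_over"
]
print(crossings)
```

The point of the design is that neither publisher had to change its own identifiers: the single `same_as` link is enough for a query about the road in the first dataset to reach the water-quality reading in the second.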
The SmartOpenData project set up five pilots in which its system is used for environmental management. For example, in Italy, datasets on algae, sewage, pollutants from rivers, chemicals, groundwater monitoring and wastewater treatment have been combined to help authorities monitor overall water quality in Sicily.
The next step is to allow small- and medium-sized enterprises to build on this and take advantage of a growing market. Open data is projected to create 25 000 new jobs in the EU between 2016 and 2020, with market size growing from EUR 55.3 billion in 2016 to EUR 75.7 billion by 2020.