Combining Wikidata and OpenStreetMap to improve Welsh language mapping services
Open data as a concept has developed rapidly in recent years, propelled further by the need for fast, collaborative solutions during the pandemic. In many ways platforms like Wikidata and OpenStreetMap (OSM), which have been growing steadily for a number of years, are leading this open data revolution.
OSM is a crowd-sourced mapping dataset, where the public works together to build a rich open-access global map, which can be reused and adapted for free by all.
Wikidata is a huge linked open dataset containing data on almost every subject. Again, anyone can contribute and reuse the data for free, but perhaps the biggest difference is that many organizations and data aggregators also contribute.
In both datasets the name of each entity can be given in multiple languages, including Welsh, and alternative names can be added in each language.
In Wales we have small but active communities of contributors to both projects, and both receive support from the Welsh Language Technology Unit at the Welsh Government. For a number of years the National Library of Wales has directly supported Wikidata by appointing a ‘National Wikimedian’.
The project currently underway is a partnership between the National Library of Wales and the Mapio Cymru team, funded by the Welsh Government.
Combining Wikidata with OSM allows us to build on the work of Mapio Cymru which has been developing a map of Wales using only Welsh language data held in the OSM database. By aligning and combining this with Wikidata the map can begin to grow further, offering more information to users through the medium of Welsh.
And this is important. Many places in Wales, be they towns, villages, hills or beaches have two names, or sometimes more. The names in Welsh are almost always the original place names, ancient in origin and steeped in history. These names are usually descriptive or refer to long lost saints, chieftains or fortresses. The English versions of place names are sometimes meaningless mutations of the Welsh originals or names imposed by medieval invaders or Victorian ‘modernisers’. Even today historic properties are renamed in English by their new owners and Welsh names are dropped from websites and maps in favour of English alternatives deemed to be ‘more easy to pronounce’.
This project aims to decolonise mapping in Wales, not by erasing English place names from the record but by giving users the option to view and explore a modern map of Wales solely through the medium of Welsh – a service that didn’t really exist until the launch of Mapio Cymru.
So the first challenge with this project is actually to encourage communities to contribute their local Welsh place names to OSM or Wikidata so that they can be included in the map, and this is done through a series of discussions, workshops and editing events.
The technical aspect of combining Wikidata with OSM begins with aligning the two datasets. OSM allows you to record the corresponding Wikidata ID against each place in its database. This shows us which places are missing from either dataset and, more particularly, where Welsh language data is missing. By looking at Welsh places already aligned to Wikidata, we were immediately able to add 5,000 additional Welsh place names to the Mapio Cymru map tiles using Welsh labels from Wikidata. This number should continue to rise as more places in OSM are aligned to Wikidata and more Welsh names are added.
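The alignment step described above can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: it only assumes that an OSM feature carries its Welsh name in the `name:cy` tag and its Wikidata ID in the `wikidata` tag, and that Welsh labels from Wikidata have already been fetched into a dictionary. The sample features, the Q-IDs and the function names are made up for illustration.

```python
from typing import Optional

# Hypothetical sample data: the Q-IDs are placeholders, not real Wikidata IDs.
OSM_FEATURES = [
    {"name": "Swansea", "name:cy": "Abertawe", "wikidata": "Q-EXAMPLE-1"},
    {"name": "Cardiff", "wikidata": "Q-EXAMPLE-2"},
    {"name": "Example Hill", "wikidata": "Q-EXAMPLE-3"},
    {"name": "Example Beach"},
]
# Welsh labels assumed to have been fetched from Wikidata beforehand.
WIKIDATA_LABELS_CY = {"Q-EXAMPLE-1": "Abertawe", "Q-EXAMPLE-2": "Caerdydd"}


def welsh_name(tags: dict, labels_cy: dict) -> Optional[str]:
    """Return a Welsh name for an OSM feature, or None if none is known."""
    # Prefer the Welsh name recorded directly in OSM.
    if "name:cy" in tags:
        return tags["name:cy"]
    # Otherwise fall back to the Welsh label of the linked Wikidata item.
    qid = tags.get("wikidata")
    return labels_cy.get(qid) if qid is not None else None


def classify(tags: dict, labels_cy: dict) -> str:
    """Say where the Welsh name comes from, or what is missing."""
    if "name:cy" in tags:
        return "osm"
    qid = tags.get("wikidata")
    if qid is None:
        return "unlinked"        # no Wikidata ID recorded in OSM yet
    if qid in labels_cy:
        return "wikidata"        # Welsh label can be borrowed from Wikidata
    return "no_welsh_label"      # linked, but Welsh is missing in both


for tags in OSM_FEATURES:
    print(tags["name"], "->", welsh_name(tags, WIKIDATA_LABELS_CY),
          "(" + classify(tags, WIKIDATA_LABELS_CY) + ")")
```

Features classified as "unlinked" or "no_welsh_label" are the gaps that community editing would need to fill, in OSM, in Wikidata, or both.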
Rendering Wikidata names directly onto the OSM map tiles is one way of adding value to the Welsh map, and a way of uniting two distinct communities behind the cause. However, we can also bring further value by adding a Wikidata layer on top of the OSM map. The additional layer allows us to render pins (or points) on the map for a number of different data types, such as transport hubs, medical services, beaches and historic buildings. It allows users to filter specific content types, and gives them the option to see many places that don’t yet have Welsh OSM data. The proof of concept map below shows how this might look.
Thinking further ahead, this type of interface could be easily adapted as a crowdsourcing tool, allowing the community to visualize gaps in the data and leading them to OSM or Wikidata to add the missing information.
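As a rough illustration of the pin layer and the gap-spotting idea, the sketch below turns a list of places into a GeoJSON FeatureCollection that a web map could draw, with an optional filter by category and a flag marking places that still lack a Welsh label. The place records, the category names and the `to_pin_layer` function are hypothetical; the prototype map may be built quite differently.

```python
import json
from typing import Optional

# Hypothetical places, as they might look after querying Wikidata.
PLACES = [
    {"label_cy": "Traeth Enghraifft", "category": "beach", "lat": 52.0, "lon": -4.5},
    {"label_cy": None, "category": "beach", "lat": 51.6, "lon": -4.0},
    {"label_cy": "Gorsaf Enghraifft", "category": "transport", "lat": 53.2, "lon": -3.1},
]


def to_pin_layer(places: list, category: Optional[str] = None) -> dict:
    """Build a GeoJSON FeatureCollection of pins, optionally for one category."""
    features = []
    for place in places:
        if category is not None and place["category"] != category:
            continue
        features.append({
            "type": "Feature",
            # GeoJSON stores coordinates as [longitude, latitude].
            "geometry": {"type": "Point",
                         "coordinates": [place["lon"], place["lat"]]},
            "properties": {
                "label_cy": place["label_cy"],
                "category": place["category"],
                # Lets the map style pins that still need a Welsh name.
                "has_welsh_label": place["label_cy"] is not None,
            },
        })
    return {"type": "FeatureCollection", "features": features}


beaches = to_pin_layer(PLACES, category="beach")
print(json.dumps(beaches, ensure_ascii=False, indent=2))
```

Pins where `has_welsh_label` is false could be drawn in a different colour, pointing contributors to the places where a Welsh name is still needed.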
Ultimately the map could also form the foundations of a Welsh language Sat-Nav system.
Making the connection with Wikidata also has plenty of extra potential, since it holds far more information than just coordinates and multilingual names, including images and links to Wikipedia articles. The concept below shows another way in which Wikidata could combine with OSM to connect users with relevant Welsh language Wikipedia articles.
A proof of concept for combining the Welsh OSM map with Welsh Wikipedia content. Credit: JB Robertson.
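One way such a link could be made, sketched here in Python under stated assumptions: Wikidata records an item's Wikipedia articles as "sitelinks", and the Welsh Wikipedia entry is stored under the key `cywiki`. Given that part of an item's data, a link to the Welsh article can be built as follows. The function name and the sample record are illustrative only, and this is not necessarily how the proof of concept works.

```python
from typing import Optional
from urllib.parse import quote


def welsh_wikipedia_url(sitelinks: dict) -> Optional[str]:
    """Return the Welsh Wikipedia URL for a Wikidata item, if it has one."""
    link = sitelinks.get("cywiki")
    if link is None:
        return None  # no Welsh Wikipedia article is linked to this item
    # Wikipedia URLs use underscores in place of spaces in article titles.
    title = link["title"].replace(" ", "_")
    return "https://cy.wikipedia.org/wiki/" + quote(title)


# Hypothetical sitelinks record for a single item.
sample_sitelinks = {
    "cywiki": {"site": "cywiki", "title": "Castell Caernarfon"},
    "enwiki": {"site": "enwiki", "title": "Caernarfon Castle"},
}
print(welsh_wikipedia_url(sample_sitelinks))
```

Items with no `cywiki` entry return `None`, which is itself useful: it marks places that do not yet have a Welsh Wikipedia article.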
For this project we have also worked with the Welsh Language Commissioner to add standardized Welsh language place names to Wikidata. In total, over 10,000 Welsh place names have been added to Wikidata using Welsh Government Open Data and other Open Data sources, and these can now be displayed on the prototype map.
We hope the development of the map can continue, and there is already interest from bilingual organizations in using the OSM Cymru map to enhance their Welsh language online services.
This is a map that belongs to the people of Wales. It is a living entity and its existence is a testament to the strength of the Welsh language and the resolve of those who volunteer their time in order to ensure its future. The map is more than simply a record of Welsh names; it’s the basis for a rich, modern Welsh language mapping service, which can be developed and used by all.
Open Data Manager