Improving open data in transport: the Bus Stop Checker
At Passenger, we build apps that make use of open transport data. A key dataset here is National Public Transport Access Nodes (NaPTAN), the open data resource for public transport terminals. Using NaPTAN we make technology that routes bus users from A to B as effectively and efficiently as possible.
However, while working on Passenger technology, we came up against a problem. We started to notice discrepancies in the NaPTAN dataset. Specifically, we found some bus stops were listed as facing the opposite direction of the roads they were on, or claiming to be positioned in locations that we knew to be incorrect.
Despite exploring the issue, we found it difficult to get in contact with the people at the local transport authorities responsible for the upkeep of the NaPTAN dataset. NaPTAN is open data, but it's not clear how to contribute improvements, or whether that option even exists.
Nevertheless, we wanted to drive a better dataset for the companies, local transport authorities, councils and technology companies who use NaPTAN and create better passenger experiences as a result. With this goal in mind, we decided to create a free online tool designed to highlight the inaccuracies in NaPTAN by visualising them and to encourage a discussion around a healthier open data ecosystem in the UK.
Why was it important to fix this information?
NaPTAN is a UK-based dataset that uniquely identifies all 400,000+ public transport points of access in Britain, including bus stops, coach stations, railway stations, taxi ranks, and ferry terminals. Many popular journey planning apps including Citymapper, Google Maps and Passenger pull data from NaPTAN to provide information about key public transport locations (or Access Nodes).
Such journey planning apps, and the routes they calculate, are affected by stop locations and bearings. If NaPTAN data is wrong, then the journey planners that use it as part of their foundation might also be wrong. As mentioned in the Bus Services Act 2017, without accurate, up-to-date bus information, the value of the data produced is diminished, resulting in a loss of passenger trust in the information supplied and, in the case of NaPTAN, potentially confusing passengers or sending them to the wrong stop.
We created Bus Stop Checker as a first step towards remedying this situation.
We knew that open data could help, but first it needed to be simplified
NaPTAN is exhaustive. While that's certainly a good thing, it's a double-edged sword in that the sheer size of the data also conceals the errors within its database. We wanted to make the problems in NaPTAN more readily visible by plotting them. Erroneous data in vast tables of information can be overlooked; it's less easy to do so when that data is plotted on a map.
Bus Stop Checker achieves this by verifying NaPTAN data against that held in OpenStreetMap (OSM). (We use OSM because it is the largest and most accurate open mapping database currently available.)
This process is as follows:
- Bus Stop Checker finds the nearest road in OSM with a similar name within a certain radius from the coordinates of the bus stop being checked
- It calculates the bearing of that road
- It finds the position of the stop being checked with respect to the road
- It checks if the stop bearing in NaPTAN is similar to the road bearing in OSM
- It assigns a confidence value based on this data
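The geometric steps above can be sketched in a few lines of Python. This is a minimal illustration, not Bus Stop Checker's actual code: it assumes the stop bearing is given as one of eight compass points, that the matched OSM road has already been reduced to a single start/end segment, and that a stop sits on the left of its direction of travel (as UK traffic drives on the left). All function names and the 45 degree tolerance are hypothetical, and the OSM name and radius lookup is left out.

```python
import math

# Assumption: stop bearings are eight compass points, mapped here to degrees.
COMPASS_DEGREES = {"N": 0, "NE": 45, "E": 90, "SE": 135,
                   "S": 180, "SW": 225, "W": 270, "NW": 315}


def road_bearing(start, end):
    """Initial bearing in degrees (0-360) from start to end, both (lat, lon)."""
    phi1, phi2 = math.radians(start[0]), math.radians(end[0])
    dlon = math.radians(end[1] - start[1])
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360


def angular_difference(a, b):
    """Smallest angle between two bearings, in degrees (0-180)."""
    diff = abs(a - b) % 360
    return 360 - diff if diff > 180 else diff


def side_of_road(stop, start, end):
    """Which side of the start -> end direction the stop lies on."""
    # 2D cross product with x = longitude, y = latitude; fine for short segments.
    cross = ((end[1] - start[1]) * (stop[0] - start[0])
             - (end[0] - start[0]) * (stop[1] - start[1]))
    return "left" if cross > 0 else "right"


def expected_travel_bearing(stop, start, end):
    """Bearing of the traffic the stop should serve, assuming left-hand driving."""
    bearing = road_bearing(start, end)
    if side_of_road(stop, start, end) == "left":
        return bearing
    return (bearing + 180) % 360


def bearings_agree(stop_compass, stop, start, end, tolerance=45):
    """True if the recorded stop bearing is within tolerance of the expected one."""
    expected = expected_travel_bearing(stop, start, end)
    return angular_difference(COMPASS_DEGREES[stop_compass], expected) <= tolerance
```

For a road running due east, a stop just north of it would be expected to carry a bearing of roughly E, and a stop just south of it roughly W; a recorded bearing that points the other way is the kind of discrepancy the tool is designed to surface.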
This confidence score assigned to each bus stop in the UK is based on a percentage, which is then translated into a simplified A-F grade. This information is aggregated and applied en masse to the local transport authorities in which each bus stop is located.
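As a rough illustration of that translation, the sketch below maps a percentage to a letter grade and rolls stop scores up to an authority. The grade boundaries and the use of a simple mean are assumptions made for the example; the thresholds and aggregation method actually used by Bus Stop Checker are not stated here.

```python
from statistics import mean

# Hypothetical grade boundaries: (minimum percentage, grade).
GRADE_FLOORS = [(90, "A"), (80, "B"), (70, "C"), (60, "D"), (50, "E")]


def grade(confidence_pct):
    """Translate a confidence percentage into a simplified A-F grade."""
    for floor, letter in GRADE_FLOORS:
        if confidence_pct >= floor:
            return letter
    return "F"


def authority_grade(stop_confidences):
    """Grade an authority from its stops' scores (assumed here to be the mean)."""
    return grade(mean(stop_confidences))
```

With these assumed boundaries, an authority whose stops score 95, 85 and 60 averages 80 and would receive a B.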
The score assigned to each bus stop and local authority in Bus Stop Checker denotes our confidence in the quality of a bus stop's data (how confident we are that the bearing and location in NaPTAN are correct, based on other information available within OSM), not the accuracy of the NaPTAN data itself.
If the above process does not find enough information to determine whether a stop bearing/location is correct, we assign a value of 50% confidence to the accuracy of that NaPTAN data. Other confidence values are based on how close the bearing match is. We do not assign 0% or 100% confidence because, without checking the stops themselves, we can't be 100% confident the information recorded about them is completely right or wrong. Bus Stop Checker presents an algorithmic confidence of how correct a stop is, not that a stop is 100% wrong.
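One way to express these rules in code is shown below. Only the 50% default and the avoidance of 0% and 100% come from the description above; the linear scaling of the bearing difference and the 5% and 95% bounds are assumptions chosen for the sketch, and the function name is hypothetical.

```python
def stop_confidence(bearing_diff_deg=None, floor=5.0, ceiling=95.0):
    """Confidence percentage for a stop, given the bearing mismatch in degrees.

    bearing_diff_deg is the angle (0-180) between the recorded stop bearing
    and the road bearing, or None when no usable road was found.
    """
    if bearing_diff_deg is None:
        # Not enough information to judge either way.
        return 50.0
    # Assumed linear scale: 0 degrees apart is best, 180 degrees apart is worst.
    raw = 100.0 * (1 - bearing_diff_deg / 180.0)
    # Never report absolute certainty in either direction.
    return max(floor, min(ceiling, raw))
```

Under these assumptions, a perfect bearing match yields 95% rather than 100%, and a stop facing exactly the wrong way yields 5% rather than 0%.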
At the time of writing, Bus Stop Checker was showing 4% of the UK's stops as having low confidence in the accuracy of their data. That may seem an insignificant amount, but it equates to 13,403 stops. That makes even more of an impression when visualised.
What does this mean?
By visualising confidence in NaPTAN through Bus Stop Checker, we aim to highlight any inaccuracies that could be accidentally overlooked in NaPTAN, give people the confidence to ask questions about the quality of the data they provide and encourage a step change towards a healthier public transport data ecosystem.
Open data has proven extremely complementary to the provision of an effective, economically beneficial public transport service. This is why we not only use open data in the creation of Passenger, but also support its growth and refinement wherever we can.
Although we cannot directly correct the data contained within NaPTAN, we would like to learn about the barriers faced by local authorities responsible for NaPTAN's upkeep and discuss what could be done to ensure a higher quality level in published data.
Since the launch of Bus Stop Checker
- Passenger has been invited to speak to Transport for the North to define how the transport body can benefit from the information Bus Stop Checker is able to provide
- Passenger has been invited to join the Department for Transport's Bus Open Data Implementation Group alongside leading names in the sector
- Passenger has been invited to review an alpha of the Department for Transport's upcoming digital transformation project and provide feedback on approach
- Transport Systems Catapult named Passenger its SME of the Week for its work on the Bus Stop Checker project
- Passenger partner, Nottingham City Council, has committed to improving the data it submits to NaPTAN over the coming weeks
- Multiple requests for deeper data have been submitted via the Bus Stop Checker website
What happens next?
We're excited to see so much activity taking place around Bus Stop Checker, and we're keen to continue this conversation with any other individuals, authorities or organisations seeking to improve the way they approach NaPTAN and open data in general.
We've made Bus Stop Checker free for everyone as we want to easily identify who can improve this data and find new ways to encourage fixes. NaPTAN is an important open dataset, which should not be forgotten about by any means.
There's more to come from Bus Stop Checker yet, so please get in touch if you would like to discuss our work in more depth.
CEO of Passenger