Abstract: Despite the growing volume of available transportation data and the efforts of many cities to increase cycling levels, there remains a lack of data on where people cycle. GPS trajectories have been used in cycling studies for several years, and more recently large, crowdsourced datasets of GPS-recorded cycle trips have become available and have attracted the interest of transportation and planning departments. However, there is limited research on how representative these new crowdsourced data sources are of the general cycling population.

This study uses GPS trip data from a crowdsourcing project called the Bike Data Project. It prepares a dataset of GPS trips for the city of Lund, Sweden, and matches the trips to a street network dataset. The study creates cycle counts based on the GPS trajectories for locations throughout the study area and compares these to manual counts made at the same locations. No correlation was found between the counts, suggesting that the GPS trajectory dataset is not a reliable representation of cycle trips within the study area, likely due to a lack of unique users within the dataset. The study also addresses a gap in the current literature by quantitatively testing the accuracy of different map-matching algorithms for matching cycling GPS data to the street network, comparing their output to ground-truth trips whose actual routes are known; this is the first such test for cycle data. It finds that in certain situations in dense networks the matching cannot be relied upon to give the correct link, which could have implications for studies seeking to quantify the percentage of cycling taking place on different road infrastructure types.
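
The two comparisons described above can be illustrated with a minimal sketch. It is not the study's own code: the abstract does not name the correlation measure or the accuracy metric used, so Pearson's r and a simple share of ground-truth links recovered are assumptions, and the function names `pearson_r` and `link_match_accuracy` are hypothetical.

```python
from math import sqrt


def pearson_r(gps_counts, manual_counts):
    """Pearson correlation between GPS-derived and manual counts.

    Assumption: one pair of counts per counting location, in the same order.
    """
    n = len(gps_counts)
    if n != len(manual_counts) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_g = sum(gps_counts) / n
    mean_m = sum(manual_counts) / n
    cov = sum((g - mean_g) * (m - mean_m) for g, m in zip(gps_counts, manual_counts))
    var_g = sum((g - mean_g) ** 2 for g in gps_counts)
    var_m = sum((m - mean_m) ** 2 for m in manual_counts)
    if var_g == 0 or var_m == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_g * var_m)


def link_match_accuracy(matched_links, true_links):
    """Share of ground-truth network links that the map matcher also returned.

    Assumption: links are identified by hashable IDs from the street network.
    """
    truth = set(true_links)
    if not truth:
        raise ValueError("ground-truth route must contain at least one link")
    return len(truth & set(matched_links)) / len(truth)
```

A value of `pearson_r` near zero across the counting locations would correspond to the "no correlation" outcome reported here, while a `link_match_accuracy` below 1 for a trip indicates that some links on the route actually taken were missed by the matcher.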

Source: Preparation and analysis of crowdsourced GPS bicycling data : a study of Skåne, Sweden