I am enjoying the end of the year 2021 in Cambridge, hosted as a
visiting fellow at the McDonald Institute for Archaeological Research.
For an article I am about to submit, I need a situation map showing
where the Karabel relief is located. A perfect occasion to use
OpenStreetMap data with R, although the article requires a static
map. In the end, I found myself mapping my last run instead
of working on the situation map. It was more compelling this way to
learn the how-tos and to go through the documentation of the different
packages. In short, this post is about how I (mis)used the
osmdata package combined with sf and ggplot2 to make a quick
map of dummy data. It starts by loading the running data and doing
some data wrangling, which gives us some stats to add to the plot,
such as the distance covered or the pace. It then imports the OSM
data and plots everything together via sf + ggplot2.
Set-up
Load the packages (“stopifnot installed”)
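One way to read "stopifnot installed" is to check that every package is available before attaching anything. This is a minimal sketch, not the original chunk: the helper name `missing_pkgs` is hypothetical, and the package list only contains the packages named in this post.

```r
# Return the names of the packages that cannot be loaded
# (hypothetical helper, not from the original post)
missing_pkgs <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

pkgs <- c("osmdata", "sf", "ggplot2")

# Stop early, with the missing names in the message, before attaching anything
not_installed <- missing_pkgs(pkgs)
if (length(not_installed) > 0) {
  stop("Please install: ", paste(not_installed, collapse = ", "))
}

# Attach the packages one by one
invisible(lapply(pkgs, library, character.only = TRUE))
```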
I import some colour keys for the plot from my file
def_colours.R. Colour keys are stored in
a separate file so they are easier to reuse. I tried to make the
colour names meaningful, so that I can remember them. This happens
to be a recurrent problem for me…
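For illustration, such a file could hold nothing more than a named vector. The names and hex values below are made up for this sketch and are not the content of my def_colours.R.

```r
# Hypothetical content for a colour-key file: the name says what the colour
# is used for, so the plotting code stays readable
map_colours <- c(
  water = "#9ecae1",
  green = "#b2df8a",
  road  = "#636363",
  ride  = "#e31a1c"
)

# In the plotting script, the keys would be loaded with
# source("def_colours.R") and then used by name:
map_colours[["ride"]]
```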
We will need some running data. My trace was originally recorded
with a GPS device, and I log my runs and biking trips with
GoldenCheetah. It is free software, and you can use it offline
(for me, this means owning your data). I used the JSON export
function, so that I can now import the file into the object result.
Now that we have imported the object ride into R, let us keep
only the main data.frame, called “SAMPLES”:
It has the following variables: SECS, KM, KPH, ALT, LAT, LON, SLOPE, TEMP.
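The import step can be sketched as follows. The use of jsonlite and the nesting `RIDE` > `SAMPLES` are assumptions about the export, and the two samples are invented so that the example runs without my file.

```r
library(jsonlite)

# Tiny stand-in for a GoldenCheetah JSON export
# (assumed layout: a RIDE object holding a SAMPLES array)
json_txt <- '{"RIDE": {"SAMPLES": [
  {"SECS": 0, "KM": 0.000, "KPH": 0.0, "ALT": 12.0, "LAT": 52.2000, "LON": 0.1200},
  {"SECS": 1, "KM": 0.003, "KPH": 9.8, "ALT": 12.1, "LAT": 52.2001, "LON": 0.1201}
]}}'
json_file <- tempfile(fileext = ".json")
writeLines(json_txt, json_file)

# fromJSON() turns an array of records into a data.frame
result <- fromJSON(json_file)
ride <- result$RIDE$SAMPLES
str(ride)
```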
For convenience, I drop some of the data and do some formatting.
Then I transform the ride data.frame into an sf object.
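A minimal sketch of the conversion with `st_as_sf()`, on three invented samples. I assume here that the coordinates are WGS 84 longitude/latitude (EPSG:4326), which is what a GPS device normally records.

```r
library(sf)

# Three invented samples with the column names of the SAMPLES table
ride <- data.frame(
  SECS = 0:2,
  KM   = c(0, 0.003, 0.006),
  LAT  = c(52.2000, 52.2001, 52.2002),
  LON  = c(0.1200, 0.1201, 0.1202)
)

# LON is x and LAT is y; the crs is an assumption (WGS 84)
ride_sf <- st_as_sf(ride, coords = c("LON", "LAT"), crs = 4326)
ride_sf
```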
In the first version of the map, I plotted the distance covered
every 10 minutes and created e10m. Then I changed my mind and
wanted to plot my pace for every kilometre. In both cases, I
directly create a label in the variable LABEL.
This gives labels such as “1st k pace 5.8min/km” (meaning the pace
for the 1st kilometre).
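The per-kilometre labels can be built in base R along these lines. It is a sketch on a synthetic ride at a constant 10 km/h, so every kilometre takes exactly 6 minutes; `ordinal` and `km_marks` are names I introduce here for illustration.

```r
# Synthetic ride: one sample per second at a constant 10 km/h for 20 minutes
ride <- data.frame(SECS = 0:1200)
ride$KM <- ride$SECS * 10 / 3600

# Ordinal suffix for 1st, 2nd, 3rd, 4th, ... (hypothetical helper)
ordinal <- function(n) {
  suffix <- ifelse((n %% 100) %in% 11:13, "th",
            ifelse(n %% 10 == 1, "st",
            ifelse(n %% 10 == 2, "nd",
            ifelse(n %% 10 == 3, "rd", "th"))))
  paste0(n, suffix)
}

# First sample at or beyond each full kilometre
full_km <- seq_len(floor(max(ride$KM)))
idx <- vapply(full_km, function(k) which(ride$KM >= k)[1], integer(1))
km_marks <- ride[idx, ]

# Minutes spent on each kilometre = time between consecutive marks
km_marks$PACE <- diff(c(ride$SECS[1], km_marks$SECS)) / 60
km_marks$LABEL <- paste0(
  ordinal(full_km), " k pace ",
  formatC(km_marks$PACE, format = "f", digits = 1), "min/km"
)
km_marks$LABEL
```

The same pattern works for the 10-minute version: pick the first sample at or beyond each multiple of 600 seconds and label it with the value of KM.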
Take it on line
The data are recorded as a series of points, one per second, but we
want to plot the ride as a continuous line, so we keep only the
coordinates.
With the line, it is easy to compute the length of the ride and,
from there, the average speed.
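Here is a sketch of both steps with sf: build a LINESTRING from the coordinate matrix, measure it with `st_length()`, and divide by the elapsed time. The ride is synthetic (six minutes due north), and the intermediate names `ride_km` and `ride_hours` are mine.

```r
library(sf)

# Synthetic ride: six minutes heading north along one meridian
ride <- data.frame(
  SECS = 0:360,
  LON  = 0.12,
  LAT  = seq(52.20, 52.21, length.out = 361)
)

# Keep only the coordinates and join them into a single line (WGS 84 assumed)
coords <- as.matrix(ride[, c("LON", "LAT")])
ride_line <- st_sfc(st_linestring(coords), crs = 4326)

# st_length() returns metres for longitude/latitude data
ride_km <- as.numeric(st_length(ride_line)) / 1000
ride_hours <- diff(range(ride$SECS)) / 3600
ride_average_speed <- ride_km / ride_hours
ride_average_speed
```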
ride_average_speed? 9.71 km/h? Not too bad. Now that the ride data
are ready, we need to get the geographical information.
Using osmdata
With osmdata, the most convenient way I found to access the OSM data was
first to download the data from the area of my ride and then to extract
features. I create different objects mimicking a layer paradigm for the
plot.
Where did I run? We compute the bounding box for the plot by expanding
the limits of the ride bounding box by 15%.
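One possible reading of "expanding by 15%" is to add 15% of the width and of the height on each side. The sketch below does this in base R; the function name `expand_bbox` and the example coordinates are illustrative.

```r
# Expand the extent of a set of coordinates by a fraction of its width and
# height on every side (hypothetical helper)
expand_bbox <- function(lon, lat, frac = 0.15) {
  dx <- diff(range(lon)) * frac
  dy <- diff(range(lat)) * frac
  c(xmin = min(lon) - dx, ymin = min(lat) - dy,
    xmax = max(lon) + dx, ymax = max(lat) + dy)
}

# Invented ride extent
box <- expand_bbox(lon = c(0.10, 0.14), lat = c(52.19, 52.21))
box
```

With an sf object, the same limits can be read from `st_bbox()` instead of the raw columns.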
Now I use the box to download the OSM data within it.
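With osmdata, a download is a query built with `opq()` and sent with `osmdata_sf()`. The sketch below restricts the query to the `highway` key to keep it small, which is an assumption and not necessarily what I did for the map. The box values are invented, and `fetch_osm` is a hypothetical wrapper so that nothing is downloaded until it is called.

```r
library(osmdata)

# Bounding box in the order xmin, ymin, xmax, ymax (invented values)
box <- c(0.094, 52.187, 0.146, 52.213)

# Build the Overpass query; no network access happens at this stage
q <- add_osm_feature(opq(bbox = box), key = "highway")

# Sending the query needs an internet connection, so it is wrapped in a
# function (hypothetical name) and only runs when called
fetch_osm <- function(query) {
  osmdata_sf(query)
}
```

Calling `osm <- fetch_osm(q)` returns an object whose elements `osm$osm_points`, `osm$osm_lines` and `osm$osm_polygons` are sf data frames.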
This object contains various geometries (points, lines,
polygons, and more). The aim of the next chunk of code
is to extract the different elements into different
objects. I create a convenience function, cull_osm, to
delete the empty columns (all NAs) left over when a specific
feature is subset (copied from an answer by mnel posted on
SO).
It does not add anything to the map, but I prefer to have neat
data.
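A function of this kind fits in a few lines. This is my own minimal version for illustration, not the code copied from the answer, and it is shown on a plain data.frame standing in for a subset of OSM features.

```r
# Drop every column that contains only NAs (illustrative version)
cull_osm <- function(x) {
  keep <- vapply(x, function(col) !all(is.na(col)), logical(1))
  x[, keep, drop = FALSE]
}

# Stand-in for a subset of OSM lines: the "waterway" tag is empty for roads
roads <- data.frame(
  osm_id   = c("1", "2"),
  highway  = c("residential", "footway"),
  waterway = c(NA, NA)
)

names(cull_osm(roads))
```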
From there, I do a dummy stack of the features with specific
colouring and sizing, adding a minimum of labels and some info.
Let’s have a preview of the data.
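The stacking itself is one `geom_sf()` per layer, drawn from bottom to top. The sketch below uses two hand-made layers instead of the downloaded features, and the colours are placeholders. Note that `linewidth` needs a recent ggplot2; older versions use `size` for lines.

```r
library(sf)
library(ggplot2)

# Hand-made stand-ins for two OSM layers (WGS 84 assumed)
parks <- st_sf(
  leisure = "park",
  geometry = st_sfc(st_polygon(list(rbind(
    c(0.11, 52.195), c(0.13, 52.195), c(0.13, 52.205),
    c(0.11, 52.205), c(0.11, 52.195)
  ))), crs = 4326)
)
roads <- st_sf(
  highway = "residential",
  geometry = st_sfc(st_linestring(rbind(
    c(0.10, 52.19), c(0.14, 52.21)
  )), crs = 4326)
)

# Polygons first, lines on top; each layer gets its own colour and size
base_map <- ggplot() +
  geom_sf(data = parks, fill = "#b2df8a", colour = NA) +
  geom_sf(data = roads, colour = "#636363", linewidth = 0.4) +
  labs(caption = "Map data © OpenStreetMap contributors")

base_map
```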
OK, it is time to add the layers with the ride trace
and the stats. I am also changing the theme and the general
layout, but this is only to practise my ggplot grammar.
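The ride layers follow the same logic: the line, the kilometre points, their labels, and the stats in the subtitle. In this sketch the trace and the second label are invented; only the first label and the average speed repeat values quoted in this post.

```r
library(sf)
library(ggplot2)

# Invented ride trace and kilometre marks (WGS 84 assumed)
ride_line <- st_sf(geometry = st_sfc(st_linestring(rbind(
  c(0.11, 52.195), c(0.12, 52.200), c(0.13, 52.205)
)), crs = 4326))

km_marks <- st_as_sf(
  data.frame(
    LABEL = c("1st k pace 5.8min/km", "2nd k pace 6.1min/km"),
    LON   = c(0.12, 0.13),
    LAT   = c(52.200, 52.205)
  ),
  coords = c("LON", "LAT"), crs = 4326
)

ride_map <- ggplot() +
  geom_sf(data = ride_line, colour = "#e31a1c", linewidth = 1) +
  geom_sf(data = km_marks, colour = "#e31a1c", size = 2) +
  # nudge_y moves the text slightly north of its point
  geom_sf_text(data = km_marks, aes(label = LABEL), size = 3, nudge_y = 0.001) +
  labs(title = "My last run", subtitle = "Average speed: 9.71 km/h") +
  theme_void() +
  theme(plot.title = element_text(face = "bold"))

ride_map
```

In the full map, these layers are simply added on top of the OSM layers within the same `ggplot()` call.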
I am not convinced by this plot, but I think it is a
decent start.
Testing replicability
While I was writing these lines, I thought I should go and
record another run to see how my code performs.
Originally, I wanted to run the same track, but I got
lost and ran a slightly different route. Well, it is
probably better for the replicability
(and not only the reproducibility) of the experiment.
This next map is the result if you run (ahem) the code
presented above (script here), only by swapping the data exported
from GoldenCheetah.
Well, a lot of overlays. The key takeaway from this exercise
is obviously that I should not run circuits! Or that I should be
faster at it… Otherwise, if the map has to be static (a requirement
for the situation map I want to do), I could reduce the amount of
text and put the kilometre numbers (1st, 2nd, 3rd) into the points.
Likewise, for circuits, I should change the orientation of
the labels (it is not too difficult to orient them according to
their location on the map: left goes left, top goes up, and so
on). At least, there is room for improvement in my running,
orientation, and coding skills!
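For the record, the label orientation could start from something like the following sketch, which I have not applied to the maps above. It derives `hjust` and `vjust` from the position of each point relative to the centre of the extent; the function name `label_just` is hypothetical.

```r
# Justification per label: points left of the centre get right-aligned text
# (the label extends to the left), points above the centre get bottom-aligned
# text (the label extends upwards)
label_just <- function(x, y) {
  cx <- mean(range(x))
  cy <- mean(range(y))
  data.frame(
    hjust = ifelse(x < cx, 1, 0),
    vjust = ifelse(y < cy, 1, 0)
  )
}

# Two opposite corners of an invented extent
label_just(x = c(0.10, 0.14), y = c(52.19, 52.21))
```

The two columns can then be mapped to the `hjust` and `vjust` aesthetics of the text layer.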