
Choosing between accessing data through an API and downloading it directly to store locally can be a tricky decision.
A direct download is a point-in-time snapshot of a data set. Unlike data accessed via an API, the data in a snapshot may not always be the most up-to-date version. For many users this either is not an issue, or the ease of access to snapshots (i.e. simply downloading a file) is a worthwhile trade-off.
Which option is best for you will vary from case to case.
APIs
Pros –
- Are always up-to-date with the latest data provided by data custodians.
- Can be simple to use in supported software. For example, many common GIS and web-based mapping tools integrate with geospatial data APIs, so you do not need to download, extract, and re-host the data yourself.
- Often support a range of advanced query and filter options, giving fast and easy access to larger data sets without the need to download the whole data set.
- Data sets which are completely open and public may be accessed without needing to register for an account.
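As a rough illustration of server-side filtering, the sketch below builds a request URL that asks an API for only the matching records. The base URL and the `region` and `limit` parameter names are hypothetical; real APIs define their own query syntax, so check the documentation of the one you are using.

```python
from urllib.parse import urlencode


def build_query_url(base_url, filters, limit=100):
    """Build a request URL for a hypothetical API that filters via query parameters."""
    params = dict(filters)      # copy so the caller's dict is not modified
    params["limit"] = limit     # assumed name for the "maximum records" parameter
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and filter, for illustration only.
url = build_query_url("https://example.org/api/records", {"region": "north"}, limit=5)
print(url)
```

Only the records matching the filter would be transferred, rather than the whole data set.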
Cons –
- Retrieving any moderately sized data set may require many consecutive HTTP requests, whose results must be merged into a single data set before being saved to your local system.
- Are more prone to failure and slower response times because of the technical complexity of receiving, parsing, querying, and translating data directly from the database.
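The two drawbacks above can be sketched in code: a loop that requests one page at a time, merges the pages, and retries when a request fails. This is a minimal sketch, not any particular API's client. It assumes a hypothetical endpoint that accepts `offset` and `limit` parameters and returns a JSON list of records; many real APIs page differently (for example with cursors or page numbers).

```python
import json
import time
import urllib.request
from urllib.parse import urlencode


def fetch_page(base_url, offset, limit):
    """Request one page of records (assumes the API returns a JSON list)."""
    url = f"{base_url}?{urlencode({'offset': offset, 'limit': limit})}"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def fetch_all(get_page, limit=1000, retries=3, delay=1.0):
    """Merge every page into one list, retrying each failed request."""
    records = []
    offset = 0
    while True:
        for attempt in range(retries):
            try:
                page = get_page(offset, limit)
                break
            except OSError:  # urllib's network errors are subclasses of OSError
                if attempt == retries - 1:
                    raise  # give up after the final attempt
                time.sleep(delay)
        records.extend(page)
        if len(page) < limit:  # a short page means there is no more data
            return records
        offset += limit
```

With a real endpoint, usage might look like `fetch_all(lambda offset, limit: fetch_page("https://example.org/api/records", offset, limit))`, where the URL is a placeholder. Passing the page function in as an argument keeps the paging loop independent of how a given API is actually called.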
Direct Download
Pros –
- Available in a variety of formats suitable for a wide range of desktop and web-based software.
- Are less prone to failure due to the lack of technical complexity – snapshots are simply hosted on a file-server with none of the technical overhead of accessing databases or providing APIs.
- Are stored as ZIP files to reduce their file sizes (where relevant).
Cons –
- May not always contain the latest data. Snapshots are recreated on a daily, weekly, or monthly basis, and data will often change several times between the creation of one snapshot and the next.
- It is not possible to filter snapshots before downloading them, so accessing larger data sets may involve downloading several gigabytes in order to extract only a few megabytes of data.
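Because a snapshot cannot be filtered before it is downloaded, any filtering happens locally once the file is on disk. The sketch below assumes the snapshot is a ZIP file containing a UTF-8 CSV with a header row. The function name, member name, and column name are hypothetical, and other snapshot formats would need different handling.

```python
import csv
import io
import zipfile


def filter_snapshot(source, member, column, value):
    """Return rows from one CSV inside a ZIP snapshot where row[column] == value.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        with archive.open(member) as raw:
            # Decode the compressed stream as text without extracting it to disk.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return [row for row in csv.DictReader(text) if row[column] == value]
```

After saving a snapshot locally (for example with `urllib.request.urlretrieve`), a call such as `filter_snapshot("snapshot.zip", "data.csv", "region", "north")` would read the whole file and keep only the matching rows.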