1.0 Introduction
This will be a bit technical, no biodiversity here, but I want to get this down in print. What is “this”? Well, the story and methodology behind Data Dwellers and my journey into data analytics. I will keep it short, because it will be text-only … let’s get cracking.
2.0 Personal Story
I recently started on this data analytics journey and have picked up, or improved, skills like Excel, SQL, Tableau visualisations and the R programming language using RStudio. These are the basics you need, and for future Data Dwellers posts I’ll probably use all of these tools. For example, I’ll use Tableau to create a dashboard with some nice-looking visuals that might get people more interested in what I have to say.
The data analytics basics were picked up through Google’s Data Analytics course. An awesome online learning experience. Hard work, but highly recommended if you are interested in this topic.
I’ve specialised in Sports Data Analytics through a course at the Johan Cruyff Institute. For those who don’t know, Johan Cruyff was a famous football/soccer player from the Netherlands who played for Ajax and Barcelona (where the JCI is located). It was a great experience, and the course instructor was inspirational; he really helped me out a lot and provided great feedback on the work I handed in. The main benefit was taking my general skills and applying them to a specific topic, football in this case.
Then, I also did an online Statistics course given by Stanford University. Hah, not for the faint of heart, and I’m super proud of myself for passing it.
I’ve always had an interest in data, statistics, and numbers. So, why not take my new skills and apply them to biodiversity!?! But how do I approach data, and what do I do with it to get it ready for you, the reader? …
3.0 The Methodology
Having acquired these skills and a passion for data analytics, I embarked on a journey to apply them to biodiversity. The following methodology outlines the steps I take to analyse and present the data effectively. For each species I highlight in this series, the end result will be slightly different.
- I’ll need to find a good source for data. Options include repositories like GBIF, Atlas Hymenoptera, and others.
- Then I’ll need to clean up the data. Most data files for a species will have thousands of observations from across the world. However, the focus of this blog is Cantabria, so I’ll have to narrow the data down to at least Spain.
- After that, the goal is to analyse what is there. Are there any issues with the data? What are the limitations? There will be a host of questions to answer, and I’ll make sure to touch on them in the posts.
- A story will need to be developed … what is the data telling us? Is it the data that is interesting, or is the more interesting aspect the data that is “missing”? “Missing” data can be interesting for rare species because there are so few observations of them. Could those gaps lead us to go out and look for the species in areas where no observations have been recorded yet?
- Make some visualisations using Tableau and RStudio.
- Write up the blog post.
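To make the clean-up step a bit more concrete, here is a minimal sketch of the kind of filtering involved. It is written in Python purely for illustration (the real work may well happen in R or SQL), and it assumes a tab-separated occurrence export with Darwin Core style column names such as countryCode, decimalLatitude and decimalLongitude, which is what GBIF downloads typically use. The sample records, the species names and the function names are made up for the example.

```python
import csv
import io

# Made-up placeholder records standing in for a real occurrence export.
# Column names follow Darwin Core conventions (an assumption about the file).
SAMPLE = (
    "species\tcountryCode\tdecimalLatitude\tdecimalLongitude\n"
    "Example species A\tES\t43.2\t-4.0\n"
    "Example species A\tFR\t44.8\t-0.6\n"
    "Example species A\tES\t\t\n"
    "Example species A\tES\t40.4\t-3.7\n"
)


def load_rows(text):
    """Parse tab-separated text into a list of dictionaries, one per record."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def filter_to_country(rows, country_code="ES"):
    """Keep records from one country that also have both coordinates filled in."""
    kept = []
    for row in rows:
        if row.get("countryCode") != country_code:
            continue  # observation from outside the country of interest
        if not row.get("decimalLatitude") or not row.get("decimalLongitude"):
            continue  # no usable location, so it cannot be mapped
        kept.append(row)
    return kept


rows = load_rows(SAMPLE)
spain = filter_to_country(rows)
print(f"{len(rows)} records in total, {len(spain)} usable records for Spain")
```

Dropping records without coordinates is a choice, not a rule. For rare species it may be worth keeping them and flagging them instead, since every observation counts.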
4.0 Next Steps …
So, the next step is to write up a number of posts. The goal is to link each Data Dwellers post to a specific species post found in Fly Facts etc. However, some of the more interesting Data Dwellers subjects probably won’t have a species post yet, because they are rare species that I’ve not yet seen.
I have a lot more to say, so if you are interested, then let’s get a conversation going in the comments section of this post.