...
For this, we’ll need to join the user, userFarm, and farm tables.
Looking at the ERD we can see:
...
There are innumerable ways to query information about farms and users, so please bring any specific problems you may have in mind to our training and we’ll work them through together.
Locations
Farms consist of many different location types, such as farm site boundaries, fields, and gates. Many attributes of these specific location types are general attributes of every location, so you’ll typically need to join the type and location to see all of what you’re looking for.
Additionally, there’s a separate idea regarding locations that describes the “physical” attributes of the location. Things like it’s location (using latitude and longitude or a collection of latitudes and longitudes - collectively known as ‘grid points’), it’s area, perimeter, etc. We link this idea to a location through the “figure” table. Furthermore, there are three types of figures: areas (things that have area), points, and lines.
You can learn much more about these concepts on this page: Location categories: Areas, Lines, & Points. For now, it’s just important to know you may need to do joins across each of these tables to get all the information for a particular location or set of locations.
...
Lets start with a very simple query to find out how many locations exist on a particular farm:
Code Block |
---|
SELECT COUNT(*)
FROM location
WHERE farm_id = 'abc' -- You would input the actual farm_id in the place of 'abc'
AND deleted = FALSE |
Digging a little deeper, we could list out the name and some of the attributes of all of the locations for a specific farm by executing the following:
Code Block | ||
---|---|---|
| ||
SELECT name, type, ROUND(a.total_area) AS "Total Area m^2",
ROUND(a.perimeter) AS "Total perimeter m", notes
FROM location l
JOIN figure f ON f.location_id = l.location_id
FULL JOIN area a ON a.figure_id = f.figure_id -- Note the full join here since not all figures are areas
JOIN farm fa ON l.farm_id = fa.farm_id
WHERE l.farm_id = 'abc' -- You would input the actual farm_id in the place of 'abc'
AND fa.deleted = FALSE
AND l.deleted = FALSE |
Here’s an example of what the above query returns when run on an actual demo farm:
...
A quick note: All areas are stored in m^2. All lengths are stored in m. If you need other units, you’ll need to convert them during your query or in post-processing. For example, to convert m^2 to Ha, you could do the following:
ROUND(a.total_area, 2)/10000 AS "Total Area (Ha)"
Obviously, this will be a case by case basis on the types of figures you’re dealing with.
Taking a step back. If we were interested to know about mapped locations across the entire dataset rather than a specific farm, we could extend this to learn how many locations all farms in the system have by executing the following:
Code Block |
---|
SELECT f.type, COUNT(f.type)
FROM location l
JOIN figure f ON f.location_id = l.location_id
FULL JOIN area a ON a.figure_id = f.figure_id
JOIN farm fa ON l.farm_id = fa.farm_id
WHERE fa.deleted = FALSE
AND l.deleted = FALSE
GROUP BY f.type |
Executing this query would tell us the number of each type of location across the entire system:
Doing so, we can see many more fields are created than any other type of location. No surprise there!
We might also be interested in learning more about specific types of locations on a farm. For example, it might be interesting to know how many fields different farms have.
Code Block |
---|
SELECT farm_name, COUNT(f.type) AS "Number of fields"
FROM location l
JOIN figure f ON f.location_id = l.location_id
FULL JOIN area a ON a.figure_id = f.figure_id
JOIN farm fa ON l.farm_id = fa.farm_id
WHERE f.type = 'field' -- Here we limit the query to only view fields
AND fa.deleted = FALSE
AND l.deleted = FALSE
GROUP BY farm_name, f.type
ORDER BY COUNT(f.type) DESC |
We can easily extend this query to do some descriptive analysis of those fields:
Code Block |
---|
SELECT farm_name, COUNT(f.type) AS "Number of fields",
SUM(a.total_area)/10000 AS "Total Area (Ha)",
ROUND(AVG(a.total_area)/10000, 4) AS "Avg. field size (Ha)",
MAX(a.total_area)/10000 AS "Largest field (Ha)",
MIN(a.total_area)/10000 AS "Smallest field (Ha)"
FROM location l
JOIN figure f ON f.location_id = l.location_id
FULL JOIN area a ON a.figure_id = f.figure_id
JOIN farm fa ON l.farm_id = fa.farm_id
WHERE f.type = 'field'
AND fa.deleted = FALSE
AND l.deleted = FALSE
GROUP BY farm_name, f.type
ORDER BY COUNT(f.type) DESC |
Leaving out the farm names for privacy, we can see a wide variety of field sizes here:
...
A brief aside on data quality
It might be good to pause here and briefly discuss data quality. LiteFarm is a publicly available app and anyone can enter whatever data they would like. We should make an effort to analyze data returned in our queries and attempt to make them more defensive in nature to exclude obviously bad data. For example, Using the last query in the locations section, sorted in DESC order by “Total area under production (Ha)” we see the following 2 farms:
...
According to Herdsy.com, the biggest farm in the world is 22.5m acres. The 5th largest is 4m acres. 4m acres is roughly 1.6m hectares. In fact, most of the largest farms in the world (by area) are cattle or dairy operations with the largest crop producing farms being significantly smaller (less than 200,000 Ha). Since LiteFarm doesn’t support dairy or cattle operations, and we don’t exactly target super large farms, we can probably safely set our upper bounds on area significantly lower, perhaps 50,000 Ha.
We might also consider excluding some obvious terms dabblers might put into their farm names to indicate they aren’t real data points, such as “Fake” or “Test”. As such, our WHERE clause might grow to look as follows:
Code Block |
---|
WHERE ...
AND farm.deleted = FALSE
AND location.deleted = FALSE
AND LOWER(farm_name) NOT LIKE '%test%'
AND LOWER(farm_name) NOT LIKE '%fake%'
AND (a.total_area/10000) < 50000 |