Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The above user table has 2 records. One for Jane Doe and another for Zues. Videos 1 - 10 in the reference videos section above give you background on many ways you can modify or display specific columns on individual tables - so please watch them and experiment. However, where things get really interesting is video 11. In this video, Nathan shows us how to connect different tables together using primary and foreign keys using an operation called a “JOIN”. This is a somewhat difficult concept, but is absolutely essential to getting the most of the LiteFarm database. We’ll get into that in just a moment, but first…

In LiteFarm, we almost always use “soft deletes”. This means that when a user “deletes” data in the app what’s actually happen happening is we’re setting a flag in the database that says that data is now deleted, e.g. deleted = true. A “hard delete” is when the row of data is actually just removed from the table altogether. In LiteFarm, the user interface then knows to not display those records that have deleted = true. As such, you’ll also need to make sure and exclude deleted records when you’re pulling information from the database or risk pulling incorrect information. In most cases this just means you’ll need to add the following to whatever table or tables you’re querying:

Code Block
AND deleted = FALSE

...

For this, we’ll need to join the user, userFarm, and farm tables.

Image RemovedImage Added

Looking at the ERD we can see:

...

There are innumerable ways to query information about farms and users, so please bring any specific problems you may have in mind to our training and we’ll work them through together.

Farms consist of many different location types, such as farm site boundaries, fields, and gates. Many attributes of these specific location types are general attributes of every location, so you’ll typically need to join the type and location to see all of what you’re looking for.

Additionally, there’s a separate idea regarding locations that describes the “physical” attributes of the location. Things like it’s location (using latitude and longitude or a collection of latitudes and longitudes - collectively known as ‘grid points’), it’s area, perimeter, etc. We link this idea to a location through the “figure” table. Furthermore, there are three types of figures: areas (things that have area), points, and lines.

You can learn much more about these concepts on this page: Location categories: Areas, Lines, & Points. For now, it’s just important to know you may need to do joins across each of these tables to get all the information for a particular location or set of locations.

...

Lets start with a very simple query to find out how many locations exist on a particular farm:

Code Block
SELECT COUNT(*) 
FROM location 
WHERE farm_id = 'abc' -- You would input the actual farm_id in the place of 'abc'
AND deleted = FALSE

Digging a little deeper, we could list out the name and some of the attributes of all of the locations for a specific farm by executing the following:

Code Block
breakoutModewide
SELECT name, type, ROUND(a.total_area) AS "Total Area m^2", 
  ROUND(a.perimeter) AS "Total perimeter m", notes
FROM location l
JOIN figure f ON f.location_id = l.location_id
FULL JOIN area a ON a.figure_id = f.figure_id -- Note the full join here since not all figures are areas
JOIN farm fa ON l.farm_id = fa.farm_id
WHERE l.farm_id = 'abc' -- You would input the actual farm_id in the place of 'abc'
AND fa.deleted = FALSE
AND l.deleted = FALSE

Here’s an example of what the above query returns when run on an actual demo farm:

...

A quick note: All areas are stored in m^2. All lengths are stored in m. If you need other units, you’ll need to convert them during your query or in post-processing. For example, to convert m^2 to Ha, you could do the following:

ROUND(a.total_area, 2)/10000 AS "Total Area (Ha)"

Obviously, this will be a case by case basis on the types of figures you’re dealing with.

Taking a step back. If we were interested to know about mapped locations across the entire dataset rather than a specific farm, we could extend this to learn how many locations all farms in the system have by executing the following:

Code Block
SELECT f.type, COUNT(f.type)
FROM location l
JOIN figure f ON f.location_id = l.location_id
FULL JOIN area a ON a.figure_id = f.figure_id
JOIN farm fa ON l.farm_id = fa.farm_id
WHERE fa.deleted = FALSE
AND l.deleted = FALSE
GROUP BY f.type

Executing this query would tell us the number of each type of location across the entire system:

Image Added

Doing so, we can see many more fields are created than any other type of location. No surprise there!

We might also be interested in learning more about specific types of locations on a farm. For example, it might be interesting to know how many fields different farms have.

Code Block
SELECT farm_name, COUNT(f.type) AS "Number of fields"
FROM location l
JOIN figure f ON f.location_id = l.location_id
FULL JOIN area a ON a.figure_id = f.figure_id
JOIN farm fa ON l.farm_id = fa.farm_id
WHERE f.type = 'field' -- Here we limit the query to only view fields
AND fa.deleted = FALSE
AND l.deleted = FALSE
GROUP BY farm_name, f.type
ORDER BY COUNT(f.type) DESC

We can easily extend this query to do some descriptive analysis of those fields:

Code Block
SELECT farm_name, COUNT(f.type) AS "Number of fields", 
  SUM(a.total_area)/10000 AS "Total Area (Ha)", 
  ROUND(AVG(a.total_area)/10000, 4) AS "Avg. field size (Ha)", 
  MAX(a.total_area)/10000 AS "Largest field (Ha)", 
  MIN(a.total_area)/10000 AS "Smallest field (Ha)"
FROM location l
JOIN figure f ON f.location_id = l.location_id
FULL JOIN area a ON a.figure_id = f.figure_id
JOIN farm fa ON l.farm_id = fa.farm_id
WHERE f.type = 'field' 
AND fa.deleted = FALSE
AND l.deleted = FALSE
GROUP BY farm_name, f.type
ORDER BY COUNT(f.type) DESC

Leaving out the farm names for privacy, we can see a wide variety of field sizes here:

...

It might be good to pause here and briefly discuss data quality. LiteFarm is a publicly available app and anyone can enter whatever data they would like. We should make an effort to analyze data returned in our queries and attempt to make them more defensive in nature to exclude obviously bad data. For example, Using the last query in the locations section, sorted in DESC order by “Total area under production (Ha)” we see the following 2 farms:

...

According to Herdsy.com, the biggest farm in the world is 22.5m acres. The 5th largest is 4m acres. 4m acres is roughly 1.6m hectares. In fact, most of the largest farms in the world (by area) are cattle or dairy operations with the largest crop producing farms being significantly smaller (less than 200,000 Ha). Since LiteFarm doesn’t support dairy or cattle operations, and we don’t exactly target super large farms, we can probably safely set our upper bounds on area significantly lower, perhaps 50,000 Ha.

We might also consider excluding some obvious terms dabblers might put into their farm names to indicate they aren’t real data points, such as “Fake” or “Test”. As such, our WHERE clause might grow to look as follows:

Code Block
WHERE ... 
AND farm.deleted = FALSE
AND location.deleted = FALSE
AND LOWER(farm_name) NOT LIKE '%test%'
AND LOWER(farm_name) NOT LIKE '%fake%'
AND (a.total_area/10000) < 50000