This Week’s Top 5 Tips to Solve Performance Bottlenecks in Fruit-Counting

May 20, 2023
12 min read

In this article, we will explore a practical use case of AI in agriculture: fruit counting. We will show two common data-quality failures and three model failures that can occur in object detectors used for fruit counting, along with solutions to address them.

By following this walk-through, you will gain an understanding of how AI can be leveraged in agriculture, and learn best practices for implementing and troubleshooting YOLO v8 for fruit counting.

Table of contents

  1. Agriculture challenges and how AI can help to overcome them
  2. Model failure vs Data-Quality failure
  3. Tips to deal with model & data-quality failures
  4. Summary

1. Agriculture challenges and how AI can help overcome them

There are several common challenges in agriculture, including:

  1. Crop monitoring: Monitoring crops for disease, pests, and overall health is a time-consuming and labor-intensive task.
  2. Yield prediction: Farmers need to predict yields to optimize production, but this can be difficult due to the many factors that influence crop growth.
  3. Irrigation management: Over- or under-irrigation can have a significant impact on crop yield and quality, but it can be challenging to determine the optimal amount of water to use.
  4. Harvesting: Picking fruits and vegetables at the right time is essential for maximizing yield, but it can be challenging to identify when crops are ready for harvest.

AI, and in particular, object detection, can help overcome some of these challenges:

  • Crop monitoring: By using object detection to identify disease, pests, and other crop issues, farmers can quickly pinpoint problem areas and take action to address them before they spread.
  • Yield prediction: Object detection can be used to monitor crop growth and health, allowing farmers to predict yields more accurately and adjust their production accordingly.

2. Model failure vs Data-Quality failure

In object detection, model failure and data-quality failure are two common types of failures that affect model performance.

Model failure takes place when the detector fails to accurately identify or locate objects in an image. This can happen due to a variety of factors, such as poor model architecture, insufficient training data, or overfitting. Model failure can be addressed by improving the model architecture, collecting more diverse training data, or regularizing the model to prevent overfitting.

Data-quality failure happens when the input data provided to the object detection model is of poor quality, making it difficult for the model to accurately detect objects. Examples of data-quality failures include low resolution images, images with poor lighting or shadows, or images with objects partially occluded or obscured. Data-quality failures can be addressed by improving the quality of input data by using better cameras, adjusting lighting conditions, augmenting the data, or manually curating the input data to remove low-quality images.

3. Top 5 suggestions on how to deal with data-quality and model failures in object detection

3.1 Our task

Our goal is to identify how object detectors fail. To do that, we first need a trained model capable of detecting apples given a set of examples. Detecting apples using object detection can help farmers assess fruit quality, estimate yields, and identify areas that need attention.

Dataset credits: Arnold Schmid.

Figure 1. Our task consists of detecting apples using a custom dataset of apple trees

3.2 Our Colab notebook to follow along

We prepared a Colab notebook that covers both how to train YOLO v8 on our custom dataset and some of the failures discussed below; we suggest following along in this notebook.
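If you want a quick preview before opening the notebook, the snippet below is a minimal sketch of the training and inference steps, assuming the Ultralytics package; the dataset config name `apples.yaml` and the sample image path are illustrative rather than the notebook's actual files.

```python
# Minimal YOLO v8 training/inference sketch (assumes `pip install ultralytics`).
from ultralytics import YOLO

# Start from COCO pre-trained weights.
model = YOLO("yolov8n.pt")

# Fine-tune on the custom apple dataset; epochs and image size are illustrative.
model.train(data="apples.yaml", epochs=50, imgsz=640)

# Run inference on a sample orchard image and inspect the predicted boxes.
results = model.predict("sample_orchard_image.jpg", conf=0.5)
print(results[0].boxes.xyxy)  # predicted bounding boxes in (x1, y1, x2, y2) format
```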

3.3 Dealing with failures

For each failure we follow a simple three-step process:

  1. Definition
  2. Failure
  3. Potential solution

To speed up the failure identification process, we will use Tenyks, the most advanced failure analysis platform for computer vision. The Tenyks platform significantly reduces the time it takes to find model and data-quality failures in a real-world agriculture use case.

Tip # 1. Data-quality failure: missing annotations

Definition: Object detection models learn to detect objects based on the examples and labels in the training data. Human annotators may have missed some objects or mislabeled them, resulting in incorrect ground truth labels.

Failure: Using the Tenyks platform we found two scenarios where we can observe a recurring pattern of missing annotations.

Figure 2. The Tenyks platform can be used to find missing annotations

  • Scenario 1: apples on the edge of the image often have missing annotations.
Figure 3. Missing annotations on the top-right edge of the image
Figure 4. Missing annotations on the top-right edge of the image
Figure 5. Missing annotations on the bottom edge of the image
Figure 6. Missing annotations on the top-right edge of the image
  • Scenario 2: apples located around dense groups of apples often lack annotations.
Figure 7. Missing annotations on the bottom-center of the image
Figure 8. Missing annotations on the top-center of the image

Potential solution: perform data-quality checks on the annotations to identify missing labels and have annotators fix them: this is probably the most pragmatic way to address this failure. Caveat: it may not scale well for very large datasets.

⭐ But, what are data-quality checks? These are procedures designed to evaluate and improve the quality of your dataset annotations. Some useful techniques for object detection may include:

  • Manually go through a sample of the annotations (e.g. 10–20% of images) to check for missing labels. This allows you to estimate the overall error rate in your dataset and re-annotate as needed.
  • Have multiple annotators label the same set of images independently, then compare their annotations for discrepancies.
  • Apply your trained object detection model to a held-out validation set to check which objects it’s able to detect and which ones it misses. This can indicate whether there are systematically missing annotations for some object classes or instances (see the sketch below).
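As a rough illustration of that last check, the sketch below flags confident predictions that overlap no ground-truth box; such boxes are candidates for missing annotations. It is a sketch under assumptions: the trained-weights path is illustrative, and however you load your annotations stands in for the `gt_boxes` argument.

```python
# Sketch: flag confident predictions that match no ground-truth box
# (possible missing annotations). Assumes Ultralytics YOLO v8 and torchvision.
import torch
from torchvision.ops import box_iou
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")  # illustrative weights path

def possible_missing_annotations(image_path, gt_boxes, iou_thr=0.3, conf=0.5):
    """Return predicted boxes that overlap no ground-truth box above iou_thr.

    gt_boxes: (N, 4) tensor of ground-truth boxes in xyxy format.
    """
    pred = model.predict(image_path, conf=conf, verbose=False)[0]
    pred_boxes = pred.boxes.xyxy.cpu()
    if len(pred_boxes) == 0 or len(gt_boxes) == 0:
        return pred_boxes
    ious = box_iou(pred_boxes, gt_boxes)           # (num_pred, num_gt)
    unmatched = ious.max(dim=1).values < iou_thr   # no ground-truth box overlaps enough
    return pred_boxes[unmatched]
```

Images with many unmatched boxes are good candidates to send back to annotators.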

🔥 At tenyks.ai we are currently working on data-quality checks! Stay tuned!

Tip # 2. Data-quality failure: occlusion

Definition: Occlusion refers to when one object obscures another object, hiding parts of it from view.

  • Objects may be overlapping or closely positioned in images, with one object partially obscuring the other. Without properly handling occlusion, the obscured objects will not be fully annotated or detected.
  • Occlusion disproportionately impacts smaller or thinner objects that have more of their surface area obstructed. These hard-to-see objects are more likely to have missing or inaccurate boxes/labels.

Failure: thanks to the Tenyks platform we can quickly navigate through the dataset to find occluded objects in several examples.

Figure 9. Finding occlusion failures using one of the Tenyks platform features

  • Scenario 1: Apples are often occluded by tree branches or leaves.
Figure 10. Center-located apples occluded by leaves
Figure 11. Center-located apples occluded by leaves
Figure 12. Bottom-right located apples occluded by leaves
  • Scenario 2: Apples are often occluded by other apples.
Figure 13. Apples around the center of the image tend to heavily occlude other apples
Figure 14. The apple at the top-center is occluded by another apple, which is in turn occluded by the main apple at the center of the image
Figure 15. The apple on the right-hand side almost completely covers another apple

Potential solution: apply augmentations to the dataset, including changes in color, lighting, and synthetic occlusion, to expose the model to more occluded examples. Albumentations is a great library to start with.
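As a rough starting point, here is a minimal Albumentations sketch, assuming bounding boxes in Pascal VOC (x_min, y_min, x_max, y_max) format; the CoarseDropout transform crudely simulates occlusion by blanking out rectangular patches, and all parameter values are illustrative.

```python
# Minimal augmentation sketch (assumes `pip install albumentations`).
import albumentations as A
import numpy as np

transform = A.Compose(
    [
        A.RandomBrightnessContrast(p=0.5),                # lighting variation
        A.HueSaturationValue(p=0.3),                      # color variation
        A.CoarseDropout(max_holes=8, max_height=32,
                        max_width=32, p=0.5),             # crude synthetic occlusion
        A.HorizontalFlip(p=0.5),
    ],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)

# Dummy example: a blank 640x640 image with one apple box, for illustration only.
image = np.zeros((640, 640, 3), dtype=np.uint8)
augmented = transform(image=image, bboxes=[[100, 100, 200, 200]], labels=["apple"])
```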

Other practical ways of dealing with this problem are:

  • Annotate currently unlabeled occluded objects (or add new images containing occlusion): higher-quality annotations translate into more performant models.
  • Provide bounding box-level occlusion labels to teach the model which parts of an object are obscured. The model can then factor out the occluded features when making a detection prediction.
  • Artificially occlude objects in the dataset using bounding box overlays, image segmentation masks, etc. This shows the model how objects appear when partly occluded in various ways.

Tip # 3. Model failure: undetected objects

Definition: this failure is related to objects that should have been detected but were missed.

A few reasons why models present this failure are:

  • The model is prioritizing precision (over recall) by choosing a confidence_threshold that is too high.
  • Some objects may be too small, too large or oddly shaped for the model to properly classify and localize them.
  • The model may simply not have enough examples of that particular object category in its training data.

Failure: the Tenyks platform has a built-in feature that helps you filter out all the images that contain undetected objects: you simply select the Ground Truth class & False Negatives to obtain the samples containing one or more undetected objects.

This feature alone can easily cut the time to spot undetected objects by 10x; otherwise you might need to put together a bunch of scripts in TensorFlow or PyTorch to separate images, annotations, ground-truth bounding boxes, undetected bounding boxes and predicted bounding boxes (a rough sketch of that scripting follows below)!
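The sketch below is the mirror image of the missing-annotation check from Tip # 1: ground-truth boxes with no sufficiently-overlapping prediction are the undetected objects. The IoU threshold is illustrative, and the boxes are assumed to be (N, 4) xyxy tensors.

```python
# Sketch: find ground-truth boxes the model failed to detect (false negatives).
# Assumes torchvision; gt_boxes and pred_boxes are (N, 4) xyxy tensors.
from torchvision.ops import box_iou

def undetected_boxes(gt_boxes, pred_boxes, iou_thr=0.5):
    """Return ground-truth boxes with no prediction overlapping above iou_thr."""
    if len(pred_boxes) == 0:
        return gt_boxes                       # nothing detected: every object is missed
    ious = box_iou(gt_boxes, pred_boxes)      # (num_gt, num_pred)
    missed = ious.max(dim=1).values < iou_thr
    return gt_boxes[missed]
```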

Figure 16. The Tenyks platform in action: finding undetected objects

  • Scenario 1: undetected apples in images taken under low light exposure or in dark surroundings
Figure 17. Undetected apple in a low light exposure image
Figure 18. Undetected apples (top-left and top-right) in a dark image
Figure 19. Undetected apple (bottom) surrounded by contrasting light exposure

  • Scenario 2: undetected apples in images where apples are located on the edge of an image
Figure 20. Undetected apples (center-left and center-right) on the edge of an image
Figure 21. Undetected apple (bottom-right) on the edge of an image
Figure 22. Undetected apple (bottom-center) on the edge of an image

Potential solution: first, be aware of the confidence_threshold you are using, and second, match this value with your use case.

In principle, for our use case, we can infer that a farmer’s goal is to detect as many apples as possible, since missed detections may skew the farmer’s yield forecast; hence we want to find a confidence_threshold that maximizes this goal. We can use the recall curve on our validation set as a guide to test different values.
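One lightweight way to produce such a curve is to re-run validation at several confidence thresholds and record the resulting recall, as in the sketch below (assuming the Ultralytics API and its DetMetrics attributes; the weights path and `apples.yaml` name are illustrative).

```python
# Sketch: sweep confidence thresholds and record recall/precision on the validation set.
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")  # illustrative weights path

for conf in (0.2, 0.35, 0.5, 0.65, 0.8):
    metrics = model.val(data="apples.yaml", conf=conf, verbose=False)
    # box.mr is mean recall and box.mp is mean precision across classes.
    print(f"conf={conf:.2f}  recall={metrics.box.mr:.3f}  precision={metrics.box.mp:.3f}")
```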

Figure 23. Recall curve for our validation set

The recall curve above provides a few interesting insights:

  • a) A low threshold, say 0.2, has a very high recall. The downside is that we risk having too many false positives.
  • b) A threshold of 0.8 will make our model very precise, however it will substantially increase the number of undetected apples.
  • c) A threshold of 0.5 seems the most reasonable choice to start with.

The following figure shows how a confidence_threshold of 0.8 misses several objects, while a confidence_threshold of 0.5 is considerably better at avoiding undetected apples.

Figure 24. A confidence threshold of 0.5 is better at avoiding undetected objects

⭐ For a quick refresher on Recall, Precision, Confidence Threshold and mAP, check our post on Mean Average Precision (mAP): Common Definition, Myths & Misconceptions.

Other approaches to deal with undetected objects include:

  • Fine-tune your model with a dataset from your use case. We already do this, but general pre-trained-only models found on model hubs (e.g. Hugging Face) often aren’t good at detecting all the desired objects. We show this in Tip # 5, Domain shift 🕵.
  • Tweaking the anchor boxes in anchor-based YOLO versions to better match the shapes and sizes of missed objects can also help. Adding anchors with different aspect ratios lets the model match wider or taller objects, not just roughly square ones, allowing it to detect objects it previously missed for lack of a suitable anchor reference.

Tip # 4. Model failure: false positives

Definition: False positives occur when the model detects objects that are not actually present in the image. The model mistakes something else for an object or detects an object where there is none.

Models fail due to false positives for some of the following reasons:

  • The model confidence_threshold is too low.
  • The dataset lacks enough samples containing the false positive object.
  • Samples containing dense groups of objects, reflections, or shadows can confuse models into predicting objects that aren’t present.
  • Anchors that are disproportionately large or have extreme aspect ratios may detect essentially any image content as some object, producing false positives.

Failure: we use the Tenyks platform to find false positive examples:

  • Scenario 1: false positives in images with apple bounding boxes on the edge
Figure 25. False positive prediction of an apple at the bottom, where there’s clearly no apple
Figure 26. False positive prediction of an apple (top-right) on the edge
Figure 27. False positive prediction of apple (top-left) on the edge
Figure 28. False positive prediction of apple (bottom-left) on the edge

Potential solution: adjust the confidence_threshold you are using. A higher confidence threshold will certainly reduce the number of false positives (but increase the number of missed objects).

Other approaches to reduce this failure may include:

  • Apply data augmentation techniques to increase the effective size of your training dataset.
  • Fine-tuning can help the model learn features that are more relevant to your use-case, which could lead to a decrease in false positives.
  • Combine the predictions of multiple YOLO models trained with different hyperparameters (see the sketch after this list). This can reduce false positives by considering only the detections that are consistent across multiple models.
  • If you have access to a large amount of unlabeled data, you can consider advanced techniques such as active learning or even self-supervised learning.
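A minimal sketch of the ensembling idea, assuming the ensemble-boxes package (Weighted Boxes Fusion) and two already-trained YOLO v8 models; the weight paths, thresholds, and image-size handling are illustrative.

```python
# Sketch: fuse predictions from two YOLO v8 models with Weighted Boxes Fusion.
# Assumes `pip install ultralytics ensemble-boxes`; paths are illustrative.
from ultralytics import YOLO
from ensemble_boxes import weighted_boxes_fusion

models = [YOLO("runs/detect/run_a/weights/best.pt"),
          YOLO("runs/detect/run_b/weights/best.pt")]

def ensemble_predict(image_path, img_w, img_h, iou_thr=0.55, skip_box_thr=0.3):
    boxes_list, scores_list, labels_list = [], [], []
    for m in models:
        res = m.predict(image_path, verbose=False)[0]
        xyxy = res.boxes.xyxy.cpu().numpy()
        xyxy[:, [0, 2]] /= img_w            # WBF expects boxes normalized to [0, 1]
        xyxy[:, [1, 3]] /= img_h
        boxes_list.append(xyxy.tolist())
        scores_list.append(res.boxes.conf.cpu().numpy().tolist())
        labels_list.append(res.boxes.cls.cpu().numpy().tolist())
    # Keep only fused boxes whose combined score clears skip_box_thr.
    return weighted_boxes_fusion(boxes_list, scores_list, labels_list,
                                 iou_thr=iou_thr, skip_box_thr=skip_box_thr)
```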

Tip # 5. Model failure: domain shift

Definition: Domain shift takes place when an ML algorithm exhibits suboptimal performance on a novel domain, distinct from the domain it was originally trained on.

  • This phenomenon arises when a model is trained on a particular dataset, known as the source domain, but is then tested on a different dataset, referred to as the target domain, which is derived from a dissimilar data distribution.

⚠️ Our YOLO v8 model is pre-trained on the COCO dataset, which contains a wide variety of object classes and is not specific to apple detection in a farm.

The following figure shows that if we use the pre-trained model only (i.e. no fine-tuning), the model fails at detecting apples: it has poor performance on the target task, i.e. domain shift.

Figure 29. The model fails at predicting apples without fine-tuning

Potential solution: Fine-tune the pre-trained model on our custom apple dataset using transfer learning. This involves freezing the early layers of the network and only updating the later layers to learn the specific apple detection task.
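In the Ultralytics API, the fine-tuning step can be sketched roughly as below; the freeze count, epoch count, and `apples.yaml` dataset config are illustrative assumptions rather than the exact settings used here.

```python
# Sketch: fine-tune COCO pre-trained YOLO v8 on the apple dataset,
# keeping the first backbone layers frozen (values are illustrative).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")      # COCO pre-trained weights (source domain)
model.train(
    data="apples.yaml",         # hypothetical dataset config for our orchard images
    epochs=50,
    imgsz=640,
    freeze=10,                  # freeze the first 10 layers, update the later ones
)
```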

Fine-tuning can help the model adapt to the target domain and improve detection performance, as shown in Figure 30.

Figure 30. Fine-tuning on our custom dataset helps alleviate the domain shift failure

4. Summary

Two issues arose from the dataset itself, the data-quality failures: missing annotations and occlusion made the training data incomplete, limiting the model’s ability to detect all apples present or to handle heavily occluded cases. Three model failures materialized: undetected objects, false positives, and domain shift revealed apples the model missed entirely, non-apples detected as apples, and a failure to generalize beyond the distribution of scenes in the training data.

We leveraged the Tenyks platform to quickly identify model and data-quality failures, and showed how its tools let us rapidly surface false positives, false negatives, and many other errors in object detection.

Before that, we trained a YOLO v8 model to detect apples in images, fine-tuning the pre-trained weights with a custom dataset of apple orchard scenes. While fine-tuning improved the model’s apple detections, we identified the key failures described above.

Stay tuned for future posts where we will continue showing you how to use the Tenyks platform to solve data-quality and model failures for agriculture companies leveraging AI.

Authors: Dmitry Kazhdan, Jose Gabriel Islas Montero

If you would like to know more about Tenyks, sign up for a sandbox account.
