Yield Mapping in Apple Orchards

A study of Yield Mapping in Apple Orchards.


In the paper [1], the researchers present a method to reconstruct an apple orchard and estimate its yield. This Medium post focuses mainly on their state-of-the-art yield mapping solution, with a short part dedicated to their fruit counting method. To give a quick idea of what this yield mapping is capable of, figure 1 shows an example of a reconstructed tree row.

Yield mapping is necessary in precision farming: tasks like seeding, crop inspection and harvesting need yield mapping information to work. A variety of commercial solutions is available on the market these days, but most of them are dedicated to row crops such as wheat, rice and maize. Fruits and vegetables are the most challenging crops, because orchards have complex shapes compared to row crops. The researchers came up with an impressive solution that uses an RGB-D camera (a camera with four channels: red, green, blue and depth).


Multi-view Fruit Tracking

Imagine a camera that records video while attached to a platform that drives through the orchard. Once the platform has collected all the video data, the same apple appears in multiple frames. This causes apples to be counted twice or more. The researchers fix this problem with a filtering technique named “Multi-view Fruit Tracking” (see figure 2). This double-counting filter is based on affine tracking [2] and incremental Structure from Motion (SfM) [3].

Affine tracking [2], in short: the idea is to identify points in the image that are easy to distinguish from their surroundings, which makes it possible to track them from frame to frame. By tracking enough points, we can get an idea of how the camera is moving through space.
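To make the "affine" part a bit more concrete, here is a minimal sketch of the motion model only. It is not the tracker from [2], which estimates the warp directly from image intensities. The sketch assumes three points have already been tracked between two frames and fits the affine map that moves them. The function names and the pixel coordinates are made up for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("points are collinear; the affine map is not unique")
    solution = []
    for col in range(3):
        # Replace one column of m by the right-hand side.
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(m)]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(src, dst):
    """Fit x' = a*x + b*y + c and y' = d*x + e*y + f from three point pairs."""
    m = [[x, y, 1.0] for x, y in src]
    a, b, c = solve3(m, [x for x, _ in dst])
    d, e, f = solve3(m, [y for _, y in dst])
    return (a, b, c), (d, e, f)


# Hypothetical feature positions (pixels) in two consecutive frames.
frame_t = [(100.0, 50.0), (220.0, 60.0), (150.0, 180.0)]
frame_t1 = [(112.0, 48.0), (232.0, 58.0), (162.0, 178.0)]

row_x, row_y = fit_affine(frame_t, frame_t1)
print(row_x, row_y)  # the points were shifted by (+12, -2) pixels
```

With more than three tracked points, one would normally use a least-squares fit instead, so that a single badly tracked point does not dominate the result.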

Incremental Structure from Motion [3], in short: this technique recovers the spatial and geometric structure of the scene from the movement of the camera, and is a common method of 3D reconstruction.
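The simplest case of recovering structure from camera movement is a camera that slides sideways by a known distance while looking at the same point. The sketch below shows only that textbook case (depth = focal length × baseline / disparity). Real incremental SfM, as in [3], also has to estimate the camera poses themselves and refine many points at once. The function name and all numbers are assumptions for illustration.

```python
def depth_from_sideways_motion(focal_px, baseline_m, x_first_px, x_second_px):
    """Depth of a point seen from two camera positions a known baseline apart.

    Assumes a pinhole camera that moved purely sideways (to the right)
    without rotating, so the point shifts to the left in the second image.
    """
    disparity_px = x_first_px - x_second_px
    if disparity_px <= 0:
        raise ValueError("expected the point to shift left in the second image")
    return focal_px * baseline_m / disparity_px


# Hypothetical values: 600 px focal length, camera moved 0.5 m,
# an apple seen at column 400 in the first frame and 250 in the second.
apple_depth_m = depth_from_sideways_motion(600.0, 0.5, 400.0, 250.0)
print(apple_depth_m)  # about 2 metres from the camera
```

Nearby points shift a lot between frames and far points shift little, which is why tracked image points carry information about 3D structure.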

Multi-view fruit merging and Yield Estimation

This technique takes the single-side tree row reconstructions and the multi-view fruit counts as inputs. It merges the reconstructions of the front and back sides of the tree row using semantic information [4], to avoid double counting. This way, fruits that are visible from both the front and the back of a tree are counted only once, and the method outputs the total fruit count for the tree row. In figure 3 the reconstructed tree row is shown.
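The counting idea behind the merge can be sketched as follows. This is not the authors' algorithm, only a simple stand-in: it assumes the front and back reconstructions are already aligned in one coordinate frame, and treats a front fruit and a back fruit as the same apple when they lie within a small radius of each other. The function name, the 5 cm radius and the coordinates are made up for illustration.

```python
import math


def merged_fruit_count(front, back, same_fruit_radius_m=0.05):
    """Count fruits from two aligned views, counting shared fruits once."""
    unmatched_back = list(back)
    shared = 0
    for fruit in front:
        # Closest back-side fruit that has not been matched yet.
        nearest = min(unmatched_back, key=lambda b: math.dist(fruit, b), default=None)
        if nearest is not None and math.dist(fruit, nearest) <= same_fruit_radius_m:
            unmatched_back.remove(nearest)
            shared += 1
    return len(front) + len(back) - shared


# Hypothetical fruit centres (x, y, z) in metres, in a common frame.
front_fruits = [(0.10, 1.20, 0.30), (0.40, 1.50, 0.35), (0.80, 1.10, 0.28)]
back_fruits = [(0.11, 1.21, 0.31), (0.82, 1.09, 0.29), (1.20, 1.60, 0.40)]

total = merged_fruit_count(front_fruits, back_fruits)
print(total)  # 4: two apples are seen from both sides
```

Simply adding the two single-side counts would give 6 here, which is exactly the over-counting the merge step is meant to prevent.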


The researchers used six datasets: three RGB-D datasets (red, green, blue and depth channels, see figure 4) and three RGB datasets. All six contain image data from the front and back sides of tree rows. All imagery (RGB and RGB-D) was captured with the Intel RealSense R200. The RGB-D and RGB data are used for testing the merging/reconstruction algorithm. The yield estimation uses the RGB data only (see section: Multi-view fruit merging and Yield Estimation).


Examples of the following three datasets are shown in figure 4.

  1. Dataset-I (968 images at 30 fps, 1920x1080) contains an apple-tree row of 21 trees with a lot of weeds, captured in a horizontal view.
  2. Dataset-II (2394 images at 60 fps, 640x480) contains 27 trees captured in a tilted view with a focus on tree trunks.
  3. Dataset-III (2020 images at 60 fps, 640x480) of 30 trees is collected by a camera attached to a stick in a tilted-top view of the tree canopies.


Examples of the following three datasets are shown in figure 5.

  1. Dataset-IV (873 images at 30 fps, 1920x1080) contains six trees that are mostly planar. Most of the apples on these trees (with 270 apples in total) are fully red and visible from two sides.
  2. Dataset-V (1065 images at 30 fps, 1920x1080) contains ten trees that have non-planar geometry. Apples (274 in total) in these trees are mostly red.
  3. Dataset-VI (831 images at 30 fps, 1920x1080) contains six trees that have non-planar geometry. Fruits (414 apples in total) in these trees are a mixture of red and green apples.

Yield Mapping Results

As shown in figure 5, the yield mapping approach is capable of building well-aligned global 3D models of tree rows. The merging algorithm needs only objects seen from both sides (front and back view). Duplicated poles and trunks are all merged. The merging is validated visually, by checking whether the misalignment of landmarks (e.g., poles and tree trunks) is eliminated. Its accuracy is tested by comparing the estimated trunk diameter and tree height with manual measurements. In figure 5 the yellow boxes mark misalignments.

How well does it work

Table 1 below shows how well the trunk diameter estimation algorithm works. The mean trunk error is around 0.49 cm. The algorithm is tested on Dataset-II, because this dataset focuses on the trunks.

Table 2 below shows how well the height estimation algorithm performs. The mean height error is around 0.038 m. The algorithm is tested on Dataset-III, because this dataset focuses on the overall tree.
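For readers who want to see how a mean error like the ones quoted above is typically computed, here is a small sketch. It assumes the error is the mean absolute difference between the estimate and the manual measurement, which this post does not state explicitly. The measurements in the example are invented and are not the values from the paper's tables.

```python
def mean_absolute_error(estimated, measured):
    """Average absolute difference between estimates and manual measurements."""
    if len(estimated) != len(measured) or not estimated:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(e - m) for e, m in zip(estimated, measured)) / len(estimated)


# Invented trunk diameters in cm, NOT the data from the paper.
estimated_cm = [5.7, 6.4, 4.9]
manual_cm = [5.2, 6.0, 5.5]

trunk_error_cm = mean_absolute_error(estimated_cm, manual_cm)
print(round(trunk_error_cm, 2))  # differences of 0.5, 0.4 and 0.6 cm
```

The same function works for the tree heights, as long as both lists use the same unit.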


This yield mapping technique is one of the most sophisticated solutions for yield estimation. Although this Medium post focuses on the merging technique only, the overall yield estimation system (detection, counting and merging) looks very promising. The researchers think this work will be helpful in further research: they plan to localize individual apples in 3D space so they can build a complete map, which can be used by an apple-picking robot as data for path planning. As a writer, I have great respect for the researchers for all the effort it took to accomplish this. I am sure that this technique will become one of the most important techniques in the agricultural revolution.


[1] Dong, W., Roy, P., and Isler, V., “Semantic Mapping for Orchard Environments by Merging Two-Sides Reconstructions of Tree Rows”, arXiv, 2018.

[2] Baker, S. and Matthews, I. (2004). Lucas-Kanade 20 Years On: A Unifying Framework. International Journal of Computer Vision, 56(3):221-255.

[3] Sinha, S. N., Steedly, D., and Szeliski, R. (2012). A Multi-stage Linear Approach to Structure from Motion. In Hutchison, D., Kanade, T., Kittler, J., Kleinberg, J. M., Mattern, F., Mitchell, J. C., Naor, M., Nierstrasz, O., Pandu Rangan, C., Steffen, B., Sudan, M., Terzopoulos, D., Tygar, D., Vardi, M. Y., Weikum, G., and Kutulakos, K. N., editors, Trends and Topics in Computer Vision, volume 6554, pages 267-281. Springer Berlin Heidelberg, Berlin, Heidelberg.

[4] Roy, P., Dong, W., and Isler, V. (2018a). Registering Reconstructions of the Two Sides of Fruit Tree Rows. In Intelligent Robots and Systems (IROS), 2018 IEEE International Conference on. IEEE.

An electronics student with an interest in deep learning.