Moving to See

Why it matters

Crop counting and monitoring are important tasks in the agricultural industry because they tell growers how well a crop is growing and whether any issues, such as pest or disease outbreaks, are developing. Growers are looking to technology for better ways of monitoring their crops. However, crops can be particularly difficult to inspect behind all the leaves and branches in a greenhouse environment. This project is investigating smarter ways for robots to see around all the clutter in order to better monitor crops in agricultural environments.

“In robotic harvesting, dealing with obstructing leaves and branches that stop a robot from getting a clear view of a crop is really tricky because the ‘occlusions’ are extremely difficult to model, or anticipate, in a technical way,” said Mr Zapotezny-Anderson, who is currently on sabbatical from Airbus to complete a Masters of Engineering (Electrical Engineering) at QUT.

“Basically, robots don’t cope well and often just give up!”

Project overview

The first proof of concept used a 3D-printed system with nine eyes (cameras) operating at different depths, enabling a robot to look around obstructing leaves much as a human would [1]. The system compared the views from the different cameras to pick the best direction to move in to see the fruit better, which is why we refer to it as ‘moving to see’.
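The idea of picking a direction from several camera views can be sketched in a few lines. This is a minimal illustration only, not the published method: it assumes each camera reports a single view score (for example, the fraction of the fruit that is visible), and it combines finite-difference slopes between a central camera and its neighbours into a rough direction estimate. The function name `best_direction`, the 3x3 camera layout, and the scores are all hypothetical.

```python
import math


def best_direction(center_score, offsets, scores):
    """Estimate a unit direction that should improve the view score.

    center_score: score from the reference (central) camera.
    offsets: (x, y, z) positions of the other cameras relative to it.
    scores: score from each offset camera, in the same order.
    """
    grad = [0.0, 0.0, 0.0]
    for offset, score in zip(offsets, scores):
        dist_sq = sum(c * c for c in offset)
        if dist_sq == 0.0:
            continue  # skip a camera that coincides with the reference
        # (score change / distance) along the unit vector to this camera.
        gain = (score - center_score) / dist_sq
        for axis in range(3):
            grad[axis] += gain * offset[axis]
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return (0.0, 0.0, 0.0)  # no camera sees an improvement
    return tuple(g / norm for g in grad)


# Hypothetical layout: eight cameras around a central one, set back in z.
offsets = [
    (x, y, -0.5)
    for x in (-1.0, 0.0, 1.0)
    for y in (-1.0, 0.0, 1.0)
    if (x, y) != (0.0, 0.0)
]
# Made-up scores: the fruit becomes more visible towards +x.
scores = [0.5 + 0.2 * x for x, _, _ in offsets]
direction = best_direction(0.5, offsets, scores)
```

With these made-up scores the estimated direction points along +x, the side on which the cameras see more of the fruit. A real system would recompute the scores after every small move.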

The second version of the system is a faster ‘one-eyed’ system that learns from data captured by multiple eyes. The real beauty of the system is that one eye has the power of nine, without the performance limitations of a multi-camera system, not least its slower data processing. The robot has been trained on data from all nine cameras to use monocular vision to guide its end effector (harvesting tool or gripper) on the fly around occluding leaves or branches to get an unblocked view of the crop, without prior knowledge of the environment it needs to navigate.

This new concept uses advances in deep learning to develop a single-camera version of the system, aptly named Deep 3D Move To See [2]. It is the result of a novel deep learning method in which one camera is trained off nine via a deep convolutional neural network. More information can be found in the paper [2], which was presented at the 6th IFAC Conference on Sensing, Control and Automation Technologies for Agriculture.
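The ‘one camera trained off nine’ idea is a form of supervised learning: the multi-camera system supplies the target direction, and the single-camera model learns to predict it. The sketch below shows only that data flow, under heavy assumptions. A small linear model trained by stochastic gradient descent stands in for the deep convolutional neural network, a two-number feature vector stands in for the camera image, and the teacher labels are synthetic. None of the names or values come from the paper.

```python
import random


def train_student(samples, lr=0.1, epochs=200, seed=0):
    """Fit a 3 x n weight matrix mapping features to a 3D direction.

    samples: (features, direction) pairs. `features` stands in for the
    single-camera image; `direction` is the move direction the
    multi-camera system produced for the same pose.
    """
    rng = random.Random(seed)
    n = len(samples[0][0])
    weights = [[0.0] * n for _ in range(3)]
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for features, target in data:
            for axis in range(3):
                pred = sum(w * f for w, f in zip(weights[axis], features))
                err = pred - target[axis]
                # Gradient step on the squared error for this output axis.
                for j in range(n):
                    weights[axis][j] -= lr * err * features[j]
    return weights


def predict(weights, features):
    """Predict a 3D direction from single-camera features."""
    return tuple(sum(w * f for w, f in zip(row, features)) for row in weights)


# Synthetic data: the hypothetical teacher copies the two features into x and y.
rng = random.Random(1)
samples = []
for _ in range(20):
    f = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    samples.append((f, (f[0], f[1], 0.0)))
weights = train_student(samples)
```

Once trained, `predict(weights, features)` needs only the single-camera input, so the multi-camera rig is required for data collection but not at run time.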


  1. Lehnert, C., Tsai, D., Eriksson, A., & McCool, C. (2019). 3D Move to See: Multi-perspective visual servoing towards the next best view within unstructured and occluded environments. In Proceedings of 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems: IROS 2019 (pp. 3890–3897). IEEE.
  2. Zapotezny-Anderson, P., & Lehnert, C. (2019). Towards Active Robotic Vision in Agriculture: A Deep Learning Approach to Visual Servoing in Occluded and Unstructured Protected Cropping Environments. IFAC-PapersOnLine, 52(30), 120–125.

Other Team Members