key: cord-0674021-yppbjs79 authors: Eiffert, Stuart; Wallace, Nathan D.; Kong, He; Pirmarzdashti, Navid; Sukkarieh, Salah title: A Hierarchical Framework for Long-term and Robust Deployment of Field Ground Robots in Large-Scale Farming date: 2020-06-24 journal: nan DOI: nan sha: a435998cca85e7e368f1a34f96ccfcc86dfd6eae doc_id: 674021 cord_uid: yppbjs79 Achieving long term autonomy of robots operating in dynamic environments such as farms remains a significant challenge. Arguably, the most demanding factors to achieve this are the on-board resource constraints such as energy, planning in the presence of moving individuals such as livestock and people, and handling unknown and undulating terrain. These considerations require a robot to be adaptive in its immediate actions in order to successfully achieve long-term, resource-efficient and robust autonomy. To achieve this, we propose a hierarchical framework that integrates a local dynamic path planner with a longer term objective based planner and advanced motion control methods, whilst taking into consideration the dynamic responses of moving individuals within the environment. The framework is motivated by and synthesizes our recent work on energy aware mission planning, path planning in dynamic environments, and receding horizon motion control. In this paper we detail the proposed framework and outline its integration on a robotic platform. We evaluate the strategy in extensive simulated trials, traversing between objective waypoints to complete tasks such as soil sampling, weeding and recharging across a dynamic environment, demonstrating its capability to robustly adapt long term mission plans in the presence of moving individuals and obstacles for real world applications such as large scale farming. This work is primarily motivated by recent developments in the automation of field operations for unstructured environments such as construction, mining, and agriculture [1] , where the application of field robotics has the potential to automate many essential or desirable tasks that are often labour-intensive, tedious or dangerous. To achieve this, however, a number of significant challenges must be addressed, including robust motion control across a variety of terrains [2] - [4] , long-term autonomy and efficiency under resource constraints [5] - [6] , and safe operation around moving obstacles [7] - [8] . In this paper we propose a principled operational framework capable of handling these challenges, building upon our recent research and development activities in agricultural robotics. That said, our proposed methodology are also applicable for other domains such as mining, infrastructure, and construction. In agriculture, there are a large variety of essential tasks-including soil sampling, weeding, crop observation, and recharging-that are often dispersed widely over large geographical areas. In order to complete these tasks a robot can be required to navigate off-road environments in the Fig. 1 : The hierarchical framework deployed in simulation, demonstrating the output of the local planner (blue) being used to update the long term plan (yellow) in order to navigate between mission waypoints in a dynamic environment. presence of dynamic agents such as livestock and humans. This requires the planning of longer term paths between a series of mission objective waypoints whilst ensuring the adaptability of these paths to changes in the environment. To allow the robot to efficiently travel between waypoints, any short term plan updates must be made in a way that understands how the dynamic agents are likely to move around the robot and respond to its motion. Whilst previous work has demonstrated how static and dynamic elements of an unknown environment can be learnt and mapped through explicit identification of the agents, a complete long term planner which accounts for how these agents will respond to a robot's motion as it moves through the static world has not yet been demonstrated. In this work we detail a hierarchical framework integrating a dynamic perception pipeline, which learns both static and dynamic elements of an unknown environment, with a multi-level path planner and tracker. This path planner consists of a local dynamic path planner which considers the responses of nearby agents, and a longerterm, objective and energy-aware global planner, as well as an advanced receding horizon motion controller. We demonstrate this integrated framework in extensive simulated trials 1 to verify its use for long term and robust autonomy in large scale farming. These trials are conducted in an environment which replicates terrain from real world field trial locations, and are deployed on a dynamic simulation of the University of Sydney's Swagbot agricultural robot platform. Our contributions include (i) a hierarchical framework for the deployment of field robots in unstructured and dynamic environments, integrating our prior work in [2] - [8] ; (ii) improvement of collision avoidance around dynamic agents, extending on our recent work [7] for better planning persistence; (iii) comprehensive and high-fidelity validation in simulation, including traversal between waypoints to accomplish an updated set of tasks, whilst ensuring safe and efficient planning around dynamic agents, and adhering to resource and recharging constraints; (iv) demonstration of how this framework can be used to adapt to changing resource constraints by using different local planner approaches to achieve varied resource use. While mobile robots have been applied to dynamic environments for decades [9] - [12] and see continuous improvement with regards to their ability to plan paths around moving agents [7] , [13] - [15] , the problem of extending these systems for long term deployment, with the ability to update their mission objectives in real time, remains an outstanding challenge. Existing solutions in agriculture tend to focus on more structured problems such as completing a single task in row crops or orchards [1] and do not take into consideration the difference between static and dynamic elements in their environments, limiting their use around livestock and people. Additionally, limited work has focused on long term applications that require consideration of resource management in path planning. To best utilise mobile robotic agents-particularly electric-powered wheeled mobile robots (WMRs)-in offroad and large scale environments, it is necessary to be aware of the levels of onboard resources, and the anticipated costs of performing tasks and actions in the environment. This is especially relevant for energy usage, which determines the robot's range and max operational time and has been the topic of our previous work on modelling the energy cost of WMR motion [5] . The development of energy-aware and efficient path planning methods has also received significant interest in recent years, utilising cost models for fuel and energy use for point-to-point and coverage path planning [17] - [20] and planning energy efficient multi-stage paths for WMRs in undulating off-road environments [6] . The problem of achieving longer term autonomy under resource constraints is often modelled as variants of the Orienteering Problem (OP) across numerous domains [21] - [22] with recent literature exploring approaches which allow for recharging of resources [23] - [24] . Our recent work on the OP with replenishment (OPR) [25] presents a formulation which can handle multiple revisits to a number of recharging stations distributed throughout the operating environment, while also optimising the amount of time spent recharging to ensure tasks are completed as efficiently as possibledriven by the consideration that it is desirable for agricultural tasks to be completed in time. The ability to operate safely around moving agents is critical for any robotic system intended for use around humans or livestock. There exists significant interest in motion prediction methods in unstructured environments for dynamic path planning [13] - [16] , [26] , and methods that consider the social responses in crowds and herds [7] - [8] . However, unlike applications in more structured environments such as autonomous driving [15] - [16] , where we see local maneuver-based path planning used to update higher level mission planners [27] - [28] -these works have not yet been applied to systems that require longer term autonomy, such as those addressed here. Existing dynamic path planning methods for long term field robot autonomy are limited to simple collision avoidance systems, such as velocity obstacles [10] and potential fields [29] , which we compare against as an alternative to the local path planning module developed here. Also, these methods are generally applied to semi-structured environments, such as indoor use, rather than large scale farming, where current implementations make use of fail safe (FS) methods which simply stop the robot, or control its speed along a predefined path in the presence of moving individuals. Whilst these methods allow certain operations around moving agents, they tend to greatly reduce the efficiency of task completion and can result in the robot becoming stuck with no feasible path forward. Our approach combines a long term mission planner with a local dynamic path planner to achieve long term autonomy of a robot travelling between updated objective waypoints in the presence of both static obstacles and dynamic agents. This framework: (1) utilises our prior work in energy efficient path planning [5] - [6] for generating long term plans; (2) Additionally, our framework generates a real-time updated map of static obstacles and traversable terrain in the environment, used both by the local dynamic planner and by a FS collision avoidance module. Fig. 2 outlines how each component integrates into the hierarchical framework, with the output of the local, long term and fail safe planners being passed to a high level controller. The long term mission plan is generated from a set of periodically updating, externally provided mission objective waypoints which must be visited for the completion of tasks such as weeding, soil sampling and recharging of the robot. These target locations are then connected into a Probabilistic Roadmap (PRM) which describes a set of kinematically feasible paths for the robot over the environment. The minimum energy paths between all pairwise combinations of locations-calculated using the energy cost of motion model developed in [5] -are used to generate a 'goal graph', over which an asymmetric travelling salesman problem is solved to yield the optimal ordering of visits and the corresponding energy-minimising global plan. The approach in this paper builds upon our work in [6] by constraining the robot to operate in Ackermann configuration, thereby forbidding motion in any directions without adequate perception coverage for the sake of safety. Each sample pose in the PRM was therefore connected by Clothoid curve segments, to generate a continuous curvature path that is easily trackable by such a vehicle. The differentiation of traversable terrain, static obstacles and dynamic agents is another essential element required for long term autonomy in unstructured environments. Whilst this problem has been addressed in more structured applications such as on road autonomous vehicles, it remains a significant challenge in unknown and unstructured terrain such as is often encountered on farms. In this work we apply a multimodal perception system that combines 3D LIDAR and 2D RGB vision to both detect dynamic agents and learn a static map of the robot's environment. This pipeline involves the identification of ground sets within the pointcloud and clustering of non-ground sets, based on work from [30] , and then association of these clustered 3D sets to 2D detected bounding boxes within the camera frame. This is achieved through the use of a Single Shot Multi-Box Detector (SSD) CNN [31] and accurate calibration of the 2D and 3D sensors [32] , which allows projection of the 3D sets into the 2D frame for nearest neighbour association with the 2D bounding boxes. Intersection over union is also computed between all 2D bounding boxes and the minimal fit bounding box for projected 3D sets, which is used to validate the association and provide a classwise probability score. The remaining non-associated 3D sets are passed to a probabilistic OctoMap framework [33] for the continuous updating of a map of static obstacles and traversable terrain. This map is used for fail safe collision avoidance and as an input to our MCTS dynamic path planner, constraining the search space. The local path planning module used in this work is adapted from our prior work using GRNNs and MCTS for planning in dynamic environments [7] . This planner uses a learnt model of social response to predict crowd dynamics during planning across the action space and has been modified in this paper for improved persistence of paths between planning iterations, and to better account for collisions between the robot and dynamic agents in continuous time between discrete planning timesteps. This section details the specific implementation of our GRNN and MCTS local path planner, however the hierarchical framework has been designed to be agnostic with regards to the specific local path planner used, as demonstrated by our comparison of various local path planners in Section IV. 1) Predictive Model: Our local planner uses a learnt model of social response, predicting the future motion of each agent based on an observation of its past motion and the known robot's motion from the same time period, as well as using the planned future action of the robot as input. This model is based on our prior work [7] , and similarly uses the robot's position at the next timestep to represent its current intended action. The predictive model consists of a recurrent Encoder-Decoder framework using long short-term memory (LSTM) layers, where the output of the Encoder becomes the state of the root node used by the MCTS planner, as detailed in Section III-C.2. The Decoder is then used by the MCTS at each simulation step as a state transition model. Training of this model is performed as per [7] , using generated trajectories of interacting agents modelled using the Optimal Reciprocal Collision Avoidance (ORCA) motion model [12] . 2) Sampling Based Planner: An adapted MCTS is used to search the robot's future action space to find the sequence of actions that minimises the cost shown in Eq. 1, where R t refers to the robot's position at time t, G is the local goal position, N is the total number of agents currently observed by the robot and U refers to the uncertainty of agent i's estimated position at the future timestep t within the search. This value is determined from the 2D Gaussian prediction for the agent's position at t, as detailed in [7] . α is a scaling parameter based on the distance between R t and the agent's position X t i , as shown in Eq. 2. A value of 0 is used for α when the agent is beyond a distance threshold of d, which is set to 3m for all experiments. This approach extends the Parallel Single Step Simulation MCTS detailed in Alg. 1 of [7] for improved persistence of plans between timesteps. Before each planning step, we check if the computed tree from the prior step can be reused to seed the current search. A comparison is made between the current observed state, and the child node from the last tree's root node which best matches the actual action taken by the robot over the last timestep. If the positions of all nearby agents within a distance threshold of 2d of the robot's position in the root node are within an error , equal to twice the expected sensor noise standard deviation (0.25m), we reuse the values of the node's tree to seed our current root. The reused values are scaled by a factor of γ, determined by Eq. 3, where M is the number of agents within the 2d distance threshold, and P t i is the current predicted position of agent i. Additionally, we extend the MCTS planner to better identify invalid actions with regards to collisions with both the static map, and dynamic agents between discrete timesteps. Similarly to [7] , our MCTS planner uses a static map, detailed in Section III-B, to constrain the action space of the robot, which has been dilated by the radius of the robot. Valid actions are determined as those that do not cause a collision between the robot and the dilated map, or the robot and the position of each agent dilated by the sum of the robot's radius and the average agent radius. We extend this by comparing the straight line path connecting the robot position in parent and child nodes of two subsequent timesteps, ensuring that this line does not intersect either the contour of a static obstacle, or any other line connecting the predicted positions of dynamic agents in the same two timesteps. As in our prior work [2] - [4] , a receding horizon estimator (RHE) is adopted to provide an estimate of the robot state and slip conditions of the terrain. Then, based on the proximity to detected dynamic agents and static obstacles, a hierarchical mode switching module-a crucial element in the integration of the global optimal planner and the local dynamic planner-determines whether to source the local reference trajectory from the dynamic planner, or to follow the online update of the global path provided by the long term planner. In the absence of obstacles, the default path tracking behaviour will compute a reference based on the robot's progress along the global path and the specified nominal speed. If a dynamic agent or static obstacle is detected within the planning area, the local dynamic planner will be engaged. The chosen path is then passed to the RHMC module, which solves the nonlinear optimisation problem to determine the set of control actions necessary to track the desired path over the forward horizon. This module also compensates for the impact of varying slip conditions in the underlying terrain which may otherwise adversely impact the robot's motion, and potentially risk a collision with an obstacle or dynamic agent. As the local planner operates with a planning timestep of approximately 300ms we also make use of a FS collision avoidance module, ensuring that the robot is able to reflexively react to rapid changes in its environment without having to wait for the local planner to complete planning. This section presents extensive and high-fidelity simulation studies to validate the performance of our proposed framework, conducted over a large set of generated problem scenarios. Each scenario involves the robot departing from and returning to a recharging station, while being externally provided with a set of mission objective waypoints. Between 5-12 waypoints are randomly selected with an average spacing of 25m. The robot is also provided with an elevation map of the terrain, provided externally from aerial LIDAR data. The robot must initially plan a long term path offline that visits every waypoint, whilst ensuring that the energy constraints of the robot are adhered to by returning to the recharging station as required. In each of these generated scenarios, we have compared our framework as shown in Fig. 2 to a version in which the local planner has been replaced with either a potential field (PF) based approach, as described in [7] , or with no planner, instead relying only on the FS Collision Avoidance module. We simulate dynamic agents in each scenario, interacting with the robot as it navigates to various waypoints, and have repeated all scenarios with changes to parameters such as agent density and required positional accuracy at waypoints, as described in the sequel. Furthermore, as outlined in Section III-A, the robot has been kinematically constrained to operate in Ackermann configuration for the purpose of ensuring that motion is only possible within the forward facing planning area dictated by the field of view of the robot's sensors. This is necessary to ensure that Swagbot-a platform with omnidirectional motion capabilities-does not move in Fig. 3 : Example generated scenario showing offline computed paths for 3 iterations, each using a different type of waypoint with required positional accuracies of 5, 2 and 1m. Each plan begins and ends at the recharging station, with required accuracy of 0.5m, to adhere to the energy budget. These plans are computed for real world terrain generated from the Sydney University farm at Bringelly. any directions not covered by the sensor footprint, where it would otherwise risk damaging or injuring property, livestock or personnel. We evaluate our approach using varied types of mission waypoints, each with different positional accuracy requirements for the robot. These include: • Observation Waypoint: 5m Accuracy • Soil Sampling Waypoint: 2m Accuracy • Weed Spraying Waypoint: 1m accuracy • Recharging Station: 0.5m accuracy Each generated scenario requires visiting a number of waypoints of a single type, whilst ensuring that the robot returns to a recharging station within its energy budget. Upon recharging, an updated set of waypoints is provided to the robot-consisting of either observing, sampling, or weeding tasks-and the robot begins again. This iteration is repeated 8 times for each scenario. We measure performance using the following metrics: 1) Deviation from the optimal path; 2) Average speed to reach each waypoint; 3) Number of near-collisions with dynamic agents. The optimal path connecting each node along the robot's longer term plan is determined by the offline planner as described in Section III-A. The average speed to reach a waypoint is determined by distance of the waypoint at the time that the robot receives it as its next goal, over the total duration required for the robot to achieve the required positional accuracy for that type of waypoint. These agents are simulated using the ORCA [12] model. They are spawned at random locations between waypoints for each scenario and are assigned similarly random goal points to travel towards at speeds ranging from 0.1-1.5 m/s. For all scenarios, agents are simulated with a radius of 0.5m and react to the robot assuming a radius of 1.5m. The robot's perception has been simulated to reflect the localisation and object detection of the real world agricultural robot Swagbot, described in [7] . Sensor noise has been applied to the simulated GPS and IMU sensors used for localisation, and to the simulated obstacle detection proportional to the distance from the robot, matching that observed in previous real world experiments. Simulated terrain has been included, using information from publicly available satellite data. For the purposes of these simulated experiments we have also assumed the availability of a static obstacle map for use in the local planner, rather than the creation of one from sensor input. We repeat each scenario for the compared methods, using either our MCTS approach, a PF planner, or only the FS method. Additionally, we repeat these experiments using two different crowd densities of 10m 2 (dense) and 50m 2 (sparse) per agent, giving on average distances of approximately 3m and 7m between each agent respectively. Simulations are run in real time, taking on average 2.5 hours to complete all iterations per scenario, depending on the type of local planner used. We additionally generate 3 separate scenarios for each testing variation, totalling 144 iterations across all experiments and 45 hours of testing. In terms of estimation and control, the RHMC module uses a horizon length of 3s, consisting of 15 sampling intervals of 0.2s, and the robot's steering angle is constrained to within ±40 • . The configuration of this module is otherwise the same as in [4] . The energy cost of motion model used for the global long-term planner is configured to use the 'Bringelly' site configuration as defined in [5] , which corresponds to the site from which the simulated operating environment was derived. Table I summarises the performance of the robot in reaching the required positional accuracy for each waypoint. The % of times that the robot is successfully able to navigate to the waypoint within timeouts of 1 minute and 5 minutes is shown for two different agent densities, 10m 2 /agent and 50m 2 /agent, simulating both a dense and sparse herd or crowd. It is clear that the choice of local planner impacts the robot's performance, with the failsafe only version failing to reach weeding waypoints (1m positional accuracy) in dense environments within the 1 min timeout 28% of the time. This number drops significantly to 8.6% in sparse environments, indicating the applicability of different planning methods to different environments. Versions of the framework using responsive local planners, MCTS and PF, perform well in both the sparse and dense environments, achieving only 2.8% failure at 1m accuracy for MCTS and 4.6% for PF for dense environments when restricted to 1 minute timeout, and less than 1% failure when allowed to continue for up to 5 minutes. Fig. 4 : Average speed to achieve a required positional accuracy of 5, 2 and 1m for % of waypoints tested. In both dense (top) and sparse (bottom) environments, the hierarchical framework is able to achieve the required accuracy in a shorter time using the more responsive local path planners, MCTS and PF, compared to the FS method. A key reason for these failures was due to the robot taking evasive action, which could occasionally perturb the robot pose to an extent that made subsequently reaching the goal region no longer kinematically feasible without taking a more circuitous route, which in turn increased the chance of further disruptions by the dynamic agents. This situation provides a clear motivation for the use of platforms with omnidirectional mobility-a direction we plan to pursue further with Swagbot once its sensing configuration supports it-as the increased mobility of such a platform would circumvent this situation almost entirely. Fig. 4 illustrates the average speed at which the robot was able to reach the required positional accuracy of each planned waypoint for each test (within 5m for observation waypoints, 2m for soil sampling and 1m for weeding). The MCTS local planner implementation was able to outperform both the PF and FS versions for all accuracies, in both sparse and dense agent densities. These results indicate that in the presence of dynamic agents, the time efficiency of the system is greatly dependent on the type of local planner used. The robot is able to complete the required mission objectives in significantly less time when a more responsive local planner is included in the framework. However, the results shown in Table II show that the choice of local planner also impacts the robot's energy efficiency. Deviation from the offline optimal path is used as a proxy for energy use, indicating that a less responsive local planner-in this case the FS method used alone-uses significantly less energy than a planner which adapts its path to account for the motion of nearby agents. Additionally, Table III shows that whilst all tested versions are usually able to maintain adequate distance from agents, with very similar median distances and standard deviations for both agent environment densities, the non-responsive FS method results in significantly more collisions with agents in the sparse environment. This result suggests that a number of collisions are due to the agents running into the robot, rather than the robot hitting the agents, and can be avoided by using a more responsive planner in sparse environments. Conversely, in the dense environment, the more responsive versions result in greater number of collisions, suggesting that when the environment becomes too complex these methods may in fact cause more collisions through their efforts to avoid an initial collision. Our results suggest that by using various combinations of planners, the framework is able to achieve different performances in varied environments, demonstrating not just the flexibility of the framework to adapt to changes in the environment, such as density of crowds or herds, but also to changing requirements, such as energy usage, or speed of mission completion. This work has proposed a hierarchical framework enabling the long-term operation of a field robot for navigation of mission waypoints in dynamic environments whilst adhering to resource budgets and other operational constraints. In particular, we have shown how this framework can handle changes in the environment and in current mission constraints, by using an adaptable local planning module to achieve varied performance as required. We have evaluated the strategy in extensive simulated trials and demonstrated its capability to robustly adapt long term mission plans in the presence of moving individuals and obstacles for real world applications. In future work, we will apply this framework to a robot to test its capability in the continuous completion of tasks such as weeding, soil sampling, and observing crops and livestock. Agricultural robots for field operations. Part 2: Operations and systems Receding horizon estimation and control with structured noise blocking for mobile robot slip compensation Structured noise blocking strategies for receding horizon estimation and control of mobile robots with slip Experimental validation of structured receding horizon estimation and control for mobile ground robot slip Compensation Motion cost characterisation of an omnidirectional WMR on uneven terrains Energy aware mission planning for WMRs on uneven terrains Path planning in dynamic environments using Generative RNNs and Monte Carlo tree search Predicting responses to a robot's future motion using generative recurrent neural networks The dynamic window approach to collision avoidance Motion planning in dynamic environments using velocity obstacles Learning to navigate through crowded environments Reciprocal n-body collision avoidance Motion planning among dynamic, decision-making agents with deep reinforcement learning Motion planning among dynamic, crowd-robot interaction: Crowd-aware robot navigation with attention-based deep reinforcement learning Generic tracking and probabilistic prediction framework and its application in autonomous driving Personalized vehicle trajectory prediction based on joint time series modeling for connected vehicles Energy management for four-wheel independent driving vehicle Deployment of mobile robots with energy and timing constraints Energy-optimal trajectory planning for car-like robots Coverage path planning under the energy constraint Correlated orienteering problem and its application to persistent monitoring tasks Dubins orienteering problem The orienteering problem with variable profits, Networks Optimal routing of energyaware vehicles in transportation networks with inhomogeneous charging nodes Orienteering problem with replenishment BRVO: Predicting pedestrian trajectories using velocityspace reasoning Real-time motion planning methods for autonomous on-road driving: stateof-the-art and future research directions Planning and decisionmaking for autonomous vehicles Dynamic motion planning for mobile robots using potential field method On the segmentation of 3d lidar point clouds SSD: Single shot multibox detector Automatic extrinsic calibration between a camera and a 3d lidar using 3d point and plane correspondences OctoMap: An efficient probabilistic 3D mapping framework based on Octrees