Introduction

Barchans are crescent-shaped dunes with horns pointing downstream, formed mainly under one-directional flows when the amount of available sand is limited1. These bedforms are frequently found on Earth (in both aquatic and eolian environments) and on Mars, sharing the same morphology but presenting different scales2: they are much larger and slower on Mars, where lengths reach up to one kilometer and turn-over times span millennia (although their average length has recently been found3 to be of the order of 200 m, and the turn-over time on the north pole of the order of a century4), than under water, where the scales are tens of centimeters and minutes5. In terrestrial deserts, the scales are up to hundreds of meters and years. However, it is not always easy to clearly identify barchans and measure their dimensions from images, since they are organized in barchan fields in which they migrate over long distances while interacting with each other6,7,8,9,10,11,12,13. In addition, the fluid flow often presents seasonal variations, affecting the morphology of dunes14,15,16. Therefore, it is common to observe barchans that touch each other (colliding dunes), that are highly asymmetric, and that shed small barchans, for instance. Despite these difficulties, barchan dunes remain bedforms of much interest, since they can be used to deduce information about the atmospheres of planets and moons from satellite images, such as the existence of an atmosphere that is or has been capable of mobilizing sediments (otherwise barchans would not exist), the mean direction of winds, and even the flow strength (from stability analyses)5. In the particular case of Martian barchans, these inferences represent in some cases mean winds that have blown over the last millennia (given the turn-over time of large Martian barchans), something that satellites and in situ sensors cannot measure.

Over the last decades, with higher-resolution satellites being launched into orbit around Earth and Mars, dunes on both planets have been monitored with reasonable accuracy17,18,19. In particular, the detection of barchans on Earth and Mars from satellite images has been widely employed. The first works detecting barchans used non-machine-learning detection17,20,21,22, meaning that the dunes were identified, classified and measured (morphology and, sometimes, displacement) with algorithms specially written for these purposes, rather than trained for identification; however, with the advance of machine and deep learning (ML and DL, respectively), automatic detection based on computer training has become more common23,24. As pointed out by Rubanenko et al.25, those works, based on Support Vector Machine26 or R-Vine classifiers, were accurate for automatic detection and classification, but not for segmenting and outlining individual barchans.

Recently, Rubanenko et al.25 made use of Mask R-CNN (Regional Convolutional Neural Network)27, which detects objects while simultaneously generating a segmentation mask, to automatically detect, classify and outline barchan dunes on Mars and Earth. To minimize false detections of barchans and derive the trends for wind and sand transport, they focused their training on isolated barchan dunes, which they carried out for 1076 images, surpassing 70% accuracy (mean average precision mAP, see section “Methods”). Afterward, they applied the Mask R-CNN to 137,111 images of the Martian surface extracted from a global mosaic of the Mars Reconnaissance Orbiter (MRO) Context Camera (CTX) dataset. With those images, they mapped large regions of the Martian surface and found that around 60% of dune fields in the northern hemisphere are covered with barchans, while only 30% of fields in the southern hemisphere are. Finally, they applied the same training to satellite images from Earth and obtained reasonable accuracy (which can be improved by inverting the image colors or performing new training). Later, Rubanenko et al.3 explored the barchans identified and outlined in Ref.25 to probe the assumption that the m-scale ripples found on Mars result from a hydrodynamic instability. Their measurements showed that the lengths of both small barchans and m-size ripples decrease with increasing atmospheric density, following a power law predicted by a hydrodynamic analysis, which corroborates the initial assumption.

Although recent works, especially Rubanenko et al.3,25, increased the accuracy of detecting and outlining individual barchans in satellite images, variations in the crescent shape (due to barchan-barchan interactions and seasonal winds) and the existence of other types of dunes that are also curved (parabolic dunes, for instance) hinder automatic detection and outlining in many cases. In particular, to the best of the authors' knowledge, currently trained networks do not successfully detect groups of interacting barchans, mainly when they touch or overlap each other (barchan-barchan collisions). In addition, because previous works were conducted on single images, it remains to be shown that CNNs (Convolutional Neural Networks) can track the detected barchans and update their outlines along a sequence of frames (or movie). The automatic detection of barchans undergoing complex interactions (such as barchan-barchan collisions) is relevant for many reasons. One of them is that it opens the possibility of updating the number of barchans on the surface of planets (Mars, for example), which was possibly underestimated in previous works (since they computed isolated barchans only), and of determining their location, orientation and concentration25. This information can be useful for estimating the direction and strength of local winds, determining the regimes of sand transport and accumulation, and estimating the effects of global changes on Earth based on the dynamics of dunes28. Another reason is the possibility of predicting the future of barchan fields from barchan-barchan interaction maps, such as those from Assis and Franklin11 (or, in the same way, deducing the ancient past of such fields), or the interaction of dunes with dune-size obstacles from the corresponding maps, such as shown in Assis et al.29: by training a CNN with interaction patterns measured in the laboratory, the trained CNN might predict the same kind of interaction in the field from satellite images. Here again, this information can be used for estimating desertification as an effect of climate change28 and predicting whether constructions are under imminent threat of being overtaken by sand30. Another application would be the yearly monitoring of dune motion for estimating the sand cover on Earth and testing climate models.

In this paper, we inquire into the automatic detection, classification, outlining, and tracking of barchans in different environments by carrying out experiments and exploring with DL the images of both individual and groups of barchans. We made use of the existing Python library YOLO (You Only Look Once) to train a neural network with images from experiments in which complex interactions took place between dunes, and afterward did the same for satellite images from Earth and Mars. We show, for the first time, that the trained network can identify, classify, outline, and track dunes interacting with each other in different environments, using different image types (contrasts, colors, points of view, resolutions, etc.), with confidence scores (estimated accuracy for each detected object) between 70 and 90% and a mean average precision that reaches 99%. The trained CNN opens new possibilities for updating the number of barchans on the surface of planets (by considering also those undergoing complex interactions) and predicting the future of barchan fields (based on barchan-barchan interaction maps11 and satellite images). Our results represent a step further toward automatically monitoring barchans and understanding their dynamics, with important applications for human activities, such as mitigating disasters on Earth and exploring Mars.

Results

Figure 1

Snapshots placed side by side of two barchans interacting in a pattern called exchange11. The images were taken from the open repository31 created by Assis and Franklin11, from which we selected the time instants shown on the top of each snapshot. The instants are numbered from \(t_1\) to \(t_7\) (shown in orange on the top), a length scale is shown on the bottom left, the boxes enclosing the identified objects are shown in white, and the outlines of the objects are shown in green. Each object has a label (Shape 1–Shape 3) which is kept until the last image, and the classes Barchan and Not a barchan are shown for each object with the corresponding confidence score. In the images, the water flow is from top to bottom and the grains forming one of the bedforms (that which was initially upstream) are red in order to track them along the images11.

After training the network with a certain number of images from the experiments (see section “Methods” for more details), we applied the trained network to identify, classify, outline and track dunes that interacted with each other in other experiments. One advantage of training the CNN with images from experiments with subaqueous barchans is that they contain the whole interaction sequence, allowing the accurate identification of bedforms during labeling. Besides, because the flow conditions, grains' properties, and terrain conditions are well controlled in the experiments, we can ascertain exactly which barchan-barchan interaction is taking place. We applied the trained object detection, for instance, to some of the experiments of Assis and Franklin11 (open dataset available31), in which the initially upstream and downstream dunes consisted of grains of different colors. For example, Fig. 1 shows snapshots placed side by side of two barchans as the smaller barchan (red dune, initially upstream) collides with the larger one (white, downstream). They merge and, after some time has elapsed, eject a small barchan whose size is similar to that of the barchan initially upstream, but which consists of different grains (this pattern is known as exchange11). Time instants, numbered from \(t_1\) to \(t_7\), are shown on the top. We observe that the trained CNN can detect the objects (in this case, bedforms), which are bounded by white boxes in the figure. Each object has a label assigned (Shape 1 to Shape 3), is classified as Barchan or Not a barchan with an accuracy (confidence score) higher than 0.9 (see the section “Methods” for a description of the computation of the accuracy shown in the images), and has its outline determined (green lines). The assigned label is kept constant for each object along the images, so that Shape 2 disappears after being absorbed by Shape 1, and the newly ejected barchan is identified by another label (Shape 3). The identification, classification, outlining and tracking work well independently of the color of the objects, evincing the ability of the trained CNN to follow each object along the frames. Interestingly, the trained CNN is able to correctly identify the merged bedform as one single barchan (t = 98 to 130 s), even though the red grains retain a barchan-like shape. This is a major result of this training, since object detection codes usually have problems in correctly identifying this kind of interaction11,12,13. It opens new possibilities for computing more accurately the number of barchans appearing in satellite images (since barchans undergoing complex interactions can now be detected and outlined), and also for predicting the future configurations of barchan fields11.
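Frame-by-frame tracking of this kind can be reproduced with the ultralytics implementation of YOLO. The following is a minimal sketch under assumed names: the weight file barchan_seg.pt and the movie file are hypothetical placeholders, not files from this work.

```python
from ultralytics import YOLO

model = YOLO("barchan_seg.pt")            # hypothetical trained weights
results = model.track(
    source="exchange_experiment.mp4",     # hypothetical movie of an experiment
    persist=True,                         # keep track IDs across frames
    stream=True,                          # yield results frame by frame
)
for frame_idx, r in enumerate(results):
    if r.boxes.id is None:                # no tracked object in this frame
        continue
    for tid, conf, cls in zip(r.boxes.id, r.boxes.conf, r.boxes.cls):
        # each object keeps its ID along the movie (analogous to Shape 1-3)
        print(f"frame {frame_idx}: Shape {int(tid)} -> "
              f"{r.names[int(cls)]} ({float(conf):.2f})")
```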

Figure 2

Time evolution of the morphology and position of interacting bedforms for the case shown in Fig. 1: (a) aspect ratio W/L; (b) mean horn length \(\overline{L}_h\); (c) longitudinal position \(P_y\) of the centroid of bedforms; and (d) area A of the horizontal projection of the bedform surface. Time instants corresponding to the snapshots of Fig. 1 are shown in all panels. The inset in panel (c) shows the dimensions of barchan dunes considered in this figure: their length L, width W, length of the right horn \(L_{hr}\), and that of the left horn \(L_{hl}\). The mean horn length is \(\overline{L}_h = (L_{hr} + L_{hl})/2\).

Figure 3

Comparison between automatic and manual detections: time evolution of the area A (horizontal projection) of bedforms for the case shown in Fig. 1 (orange circles) and the results computed manually by Assis and Franklin11 (blue triangles).

In order to evaluate whether positions and outlines are accurately detected and tracked, we computed the length L, width W, horn length \(L_h\) and surface area A of the barchans. The lengths, width and area were computed as defined in Assis and Franklin11,12, and the mean horn length \(\overline{L}_h\) was computed as the average over both horns. We note that A is the area bounded by the outlines generated by the trained CNN, thus corresponding to the surface area of the dune projected onto the horizontal plane. For the exchange case of Fig. 1, Fig. 2a–d shows the time evolution of the aspect ratio W/L, mean horn length \(\overline{L}_h\), longitudinal position \(P_y\), and projected area A, with the time instants corresponding to the snapshots of Fig. 1 indicated in the panels. We observe that these quantities are in good agreement with the results shown in Assis and Franklin11, in which a conventional (non-machine-learning) detection code was used. In particular, Fig. 2d can be compared directly with Fig. 3f of Ref.11, which we show in Fig. 3, and the agreement is excellent (deviations of less than 5%). We applied the same trained CNN to many other experiments, especially the other cases reported in Assis and Franklin11 (images and movies of which are available in an open repository31), and the results were as good as those of Figs. 1 and 2 (the results for other experiments are available in the Supplementary Information S1, as well as movies showing the tracking of individual dunes along frames). Obtaining results in good agreement with dedicated (non-machine-learning) codes implies that the latter are no longer necessary for future experiments with subaqueous dunes, and that AI (Artificial Intelligence) can be successfully used for processing image sequences taken from drones or satellites.
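For reference, the morphological quantities above can be extracted directly from the outline polygons produced by the segmentation. The sketch below is one possible implementation (the helper names are illustrative, not the analysis code of this work), assuming the flow is along the y direction, as in Fig. 1.

```python
import numpy as np

def polygon_area(poly):
    """Projected area A of an outline (N x 2 array, px) via the shoelace formula."""
    x, y = poly[:, 0], poly[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

def polygon_centroid(poly):
    """Centroid (P_x, P_y) of the outline polygon."""
    x, y = poly[:, 0], poly[:, 1]
    cross = x * np.roll(y, -1) - np.roll(x, -1) * y
    a = cross.sum() / 2.0
    cx = ((x + np.roll(x, -1)) * cross).sum() / (6.0 * a)
    cy = ((y + np.roll(y, -1)) * cross).sum() / (6.0 * a)
    return cx, cy

def length_and_width(poly):
    """L as the streamwise (y) extent and W as the transverse (x) extent."""
    W = poly[:, 0].max() - poly[:, 0].min()
    L = poly[:, 1].max() - poly[:, 1].min()
    return L, W

# With ultralytics, r.masks.xy yields one (N x 2) outline per detected dune:
# for poly in r.masks.xy:
#     A = polygon_area(poly); L, W = length_and_width(poly)
```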

Figure 4

Snapshots placed side by side of two barchans interacting in a pattern called fragmentation11. The images were taken from the open repository31 created by Assis and Franklin11, from which we selected some time instants shown on the top of each snapshot. The symbols and labels are as in Fig. 1. In the images, the water flow is from top to bottom and the grains forming one of the bedforms (that which was initially upstream) are red in order to track them along images11.

It is worth noting that in some experiments the camera was displaced (it was mounted on a traveling system) to keep the barchans in its field of view. In those cases, even with the spatial reference changing abruptly between two images, the trained CNN correctly tracked each barchan, as can be seen in some of the movies of the Supplementary Information S1. This can also be seen in Fig. 4, in which the camera was displaced between t = 195 and 300 s (in the tests, it was displaced abruptly at some point between those instants, i.e., the test was stopped, the camera was displaced, and the test re-started). Figure 4 shows snapshots side by side of two barchans that interact with each other and, at some point in time, one of them ejects a small barchan (the fragmentation pattern described in Assis and Franklin11). The dunes are correctly outlined and tracked, even though, after the camera displacement, the barchans (wrongly) appear in the frame at positions upstream of those in previous frames, showing the robustness of the CNN. The morphological parameters obtained in this case and others are available in the Supplementary Information S1, and are in agreement with Ref.11.

Figure 5

HiRISE image32 showing a field of individual barchans on the surface of Mars: \(23.190^\circ\) latitude (centered), \(339.585^\circ\) longitude (East), spacecraft altitude 287.3 km. Courtesy NASA/JPL-Caltech/UArizona. The detection boxes, class type, confidence score, and outline are superimposed on the image.

Figure 6

Medium-resolution image (4 m per px) from CTX33 showing a field of barchans undergoing complex interactions on the surface of Mars: \(-41.488^\circ\) latitude (centered), \(44.589^\circ\) longitude (East), spacecraft altitude 253.8 km. Courtesy NASA/JPL/MSSS/The Murray Lab. The detection boxes, class type, confidence score, and outline are superimposed on the image. The detections using the HiRISE image are available in the Supplementary Information S1.

Figure 7

Satellite image showing a barchan field on Earth. This image corresponds to \(24.836^\circ\) latitude (centered), \(51.311^\circ\) longitude (East), in Qatar, with an eye altitude of 5 km. Courtesy Google Earth. The detection boxes, class type, confidence score, and outline are superimposed on the image.

Table 1 Morphological properties of dunes detected, classified, labeled, and outlined in Fig. 5: label (shape number), class type, length L, width W, mean horn length \(\overline{L}_h\), projected area A, and longitudinal and transverse components of the centroid position, \(P_x\) and \(P_y\), respectively.
Table 2 Morphological properties of dunes detected, classified, labeled, and outlined in Fig. 6: label (shape number), class type, length L, width W, mean horn length \(\overline{L}_h\), projected area A, and longitudinal and transverse components of the centroid position, \(P_x\) and \(P_y\), respectively.
Table 3 Morphological properties of dunes detected, classified, labeled, and outlined in Fig. 7: label (shape number), class type, length L, width W, mean horn length \(\overline{L}_h\), projected area A, and longitudinal and transverse components of the centroid position, \(P_x\) and \(P_y\), respectively.

Having confirmed that the instance segmentation based on YOLOv8 successfully identifies, classifies, outlines, and tracks each dune appearing in images from experiments, we trained the same CNN using satellite images of eolian dunes on the Martian and Earth's surfaces, following the procedure described in the “Methods” section. After training on a given set of images, we used the trained CNN to identify dunes in other images. For instance, Fig. 5 shows a HiRISE image32 of a field of individual barchans on the surface of Mars (\(23.190^\circ\) latitude, \(339.585^\circ\) longitude), where we can observe single barchans migrating over irregular terrain containing craters, while Fig. 6 shows a medium-resolution image (4 m per px) from CTX33 of a field of barchans undergoing complex interactions on the surface of Mars (\(-41.488^\circ\) latitude, \(44.589^\circ\) longitude). As for the experiments, the trained CNN is able to correctly identify, classify and outline dunes in satellite images with confidence scores of the order of 0.9 in the case of single barchans, and with lower accuracy in the case of interacting barchans, using both high- and medium-definition images. We note that the detection is not perfect in Fig. 6: some dunes undergoing complex interactions are not detected while others are (all barchans are detected). However, we have shown with our experimental dunes (for which we have large datasets) that accurate detection of interacting barchans is possible. For satellite images, datasets of interacting barchans are relatively small (time sequences showing the complete outcome of each interaction are absent), since a single barchan-barchan interaction on Earth takes decades to complete (on Mars it can take millennia). This, added to the fact that the satellite images used are of lower quality than those from our experiments (in terms of spatial resolution and contrast with the background), decreases the detection accuracy for interacting dunes in Fig. 6.

Figure 7 shows an image of a barchan field on Earth (\(24.836^\circ\) latitude, \(51.311^\circ\) longitude, in Qatar), in which the color contrast between the dunes and the background is poor. Nevertheless, we observe that the CNN is able to detect and classify dunes with confidence scores of approximately 0.90 (the lowest confidence score is 0.88, corresponding to a highly asymmetric barchan that is probably shedding a small dune through one of its horns), and to successfully outline them. Based on the outlines generated by the trained CNN, we measured the main features of all barchans identified in Figs. 5, 6 and 7, which are listed in Tables 1, 2 and 3, respectively.
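Single-image detections such as those of Figs. 5, 6 and 7 require only a few lines once the CNN is trained. The sketch below is illustrative only (the weight and image file names are hypothetical), printing the class, confidence score, and bounding box of each detected dune.

```python
from ultralytics import YOLO

model = YOLO("barchan_satellite_seg.pt")                   # hypothetical weights
r = model.predict("qatar_barchan_field.png", conf=0.5)[0]  # hypothetical image

for box, conf, cls in zip(r.boxes.xyxy, r.boxes.conf, r.boxes.cls):
    print(f"{r.names[int(cls)]}: confidence {float(conf):.2f}, "
          f"box {[round(v) for v in box.tolist()]}")
# r.masks.xy holds the outline polygon of each detection, from which the
# morphological features listed in the tables can be measured.
```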

We used the same procedure for other satellite images of Mars and Earth, as well as aerial pictures of eolian dunes, with different backgrounds (terrains), colors, resolutions, and points of view (perspective views), and the results were as good as those shown in Figs. 5 and 7. In particular, we processed images of medium to low resolution (15 m per px to 30 m per px) in which groups of barchans were undergoing complex interactions, and the trained CNN was able to correctly identify each dune (examples of instance segmentation of other satellite images are available in the Supplementary Information S1). The mean average precision mAP (definition available in section “Methods”) reached in our CNN training was around 0.90 for the satellite images and 0.99 for the images from experiments (plots of the evolution of mAP over the epochs are available in the Supplementary Information S1). Finally, we carried out tracking (with detection and outlining) on medium- to low-quality satellite images of an eight-year sequence of barchans undergoing dune-dune interactions in the Sahara desert, which we show in the Supplementary Information S1. Although the sequence contains only a small portion of the barchan-barchan interactions (given the large timescales involved), the results are good, showing the great potential of the CNN for field tracking and measurements.

Discussion

We trained a single-stage object detection model, YOLOv8 (YOLO version 8), and used it together with scripts written in the course of this work to handle data and measure barchan dimensions, for the image segmentation and tracking of barchan dunes. Differently from previous works, we used a large database of time-resolved images of barchans of different sizes, colors, grain types, and formats (camera types), consisting of mono- or bidisperse grains (with more than one color) and undergoing different types of interaction11. A small part of these images was used for training, and we afterward employed the trained CNN to process images that were new to it. With that, we could, besides identifying, classifying, outlining, and tracking the dunes, measure the time evolution of morphological quantities and compare them with the results from our non-machine-learning detection code. For the experiments, the confidence scores were over 0.9, even when dunes of different colors underwent different types of interaction. Therefore, for the first time, we showed the ability of a trained CNN to correctly identify, classify, outline and track dunes that undergo complex interactions with each other in a dune field, while previous works relied only on static satellite images for identifying single barchans.

We used the same procedure with satellite and aerial images of barchan fields on Mars and on Earth, with different image types, colors and perspectives, and in those cases the trained CNN identified, classified and outlined dunes with confidence scores above 70%. However, in this case we did not systematically track dunes because of the small number of sequential images: good-quality images of Earth date back only about 30 years, while dunes take a decade to travel a considerable distance, and on Mars the timescales are much longer (centuries or even millennia). In the particular case of barchan-barchan interactions, there is no image sequence from satellites showing the entire process (only part of it), since the timescales are much longer than those of subaqueous barchans (it would take a century or more of satellite images of Earth, and even longer for Mars, for typical barchan-barchan interactions to finish11). We nevertheless carried out tracking (with detection and outlining) on medium- to low-quality satellite images of an eight-year sequence of barchans undergoing dune-dune interactions in the Sahara desert, and the result was good (the results are available in the Supplementary Information S1), showing the potential of the technique for field measurements.

Although considerable improvements have been achieved in the automatic detection of barchans undergoing complex interactions, the trained CNN still has an important limitation: when processing images in which barchans interact over a terrain (background) that has poor contrast with the dunes, some dunes are not detected, and those detected have lower accuracy (as happens in Fig. 6). When the contrast with the background is good and the image resolution is not poor, the barchans are correctly identified with high accuracy (as in Figs. 1 and 4). Even so, as can be seen in Figs. 5, 6 and 7 and in those of the Supplementary Information S1, the great majority of barchans are correctly identified, classified, and outlined.

The success in identifying, classifying, outlining, and tracking barchans undergoing complex interactions by using a CNN can be exploited for dune monitoring, which engenders positive impacts on human activities. For example, it can be used for monitoring the growth and migration of dunes that are burying (or on the verge of burying) human constructions, such as in Florianopolis (Brazil) and Silver Lake (USA)30,34, or for detecting complex barchan interactions on Mars. Besides, it can be explored further: the CNN can be trained on the history of certain barchan-barchan patterns (based on experimental data such as Refs.11,31), and the trained CNN can afterward be applied, for example, to deducing the past history of barchan fields on Mars (based on openly available satellite images). It can also be used for predicting the future of those barchan fields. If (or when) carried out, this would represent a considerable step toward understanding the ancient past of Mars, comprehending the ongoing climate change on Earth28, and predicting the far future of our planet. Our results represent, therefore, an important step in that direction.

Methods

Experimental setup

The CNN was trained with images from controlled experiments. The experimental setup consisted of a water tank, two centrifugal pumps, a flow straightener, a 5-m-long closed-conduit channel, a settling tank, and a return line, and we imposed a pressure-driven water flow in a closed loop following the aforementioned order. The channel had a rectangular cross section 160 mm wide by 2\(\delta\) = 50 mm high and was made of transparent material. It consisted of a 3-m-long entrance section (corresponding to 40 hydraulic diameters), a 1-m-long test section, and a 1-m-long section connecting the test section to the channel exit. With the channel completely filled with still water, controlled grains were poured inside, forming one or more conical heaps. Afterward, we imposed a specified water flow, which deformed each conical pile into a barchan dune and, in the case of multiple piles, the barchan dunes interacted with each other. We used tap water at temperatures between 22 and 30 °C and different populations of grains (sometimes mixed): round glass beads (\(\rho _s\) = 2500 kg/m\(^3\)) with 0.15 mm \(\le \,d\,\le\) 0.25 mm and 0.40 mm \(\le \,d\,\le\) 0.60 mm, angular glass beads with 0.21 mm \(\le \,d\,\le\) 0.30 mm, and zirconium beads (\(\rho _s\) = 4100 kg/m\(^3\)) with 0.40 mm \(\le \,d\,\le\) 0.60 mm, where \(\rho _s\) and d are, respectively, the density and diameter of the grains. We used grains of different colors in order to track them during barchan-barchan interactions. A layout and a photograph of the experimental setup are shown in Fig. 8, and are also available in Assis and Franklin11,12.

Figure 8

(a) Layout of the experimental setup. (b) Photograph of the test section.

Top-view images of the dunes were acquired with either a high-speed or a conventional camera mounted on a traveling system placed above the channel. The high-speed camera was of the complementary metal-oxide-semiconductor (CMOS) type with a maximum resolution of 2560 px \(\times\) 1600 px at 800 Hz, and the conventional camera, also of the CMOS type, had a maximum resolution of 1920 px \(\times\) 1080 px at 60 Hz. Both the camera and the traveling system were controlled by a computer, and we varied the field of view and the ROI (region of interest) according to the number of dunes in the test and the velocities of the dunes and grains. We mounted lenses of 60 mm focal length and F2.8 maximum aperture on the cameras and made use of LED (light-emitting diode) lamps connected to a continuous-current source to provide the necessary light while avoiding beating with the acquisition frequencies of the cameras. More details about the experimental setup can be found in Refs.11,12,13,35,36,37,38. Datasets with the images and results of the experiments are available in open repositories31,39,40,41.

Object detection model using convolutional neural network

We used the Python library YOLOv8 (YOLO, You Only Look Once, version 8) to carry out the instance segmentation of dunes (objects), in order to identify, classify, outline, and track each object along the images of a given time sequence42. YOLOv8 is a single-stage object detection model based on a CNN, whose architecture consists of a backbone, a neck and a head, and it is known for generating masks quickly while computing their coefficients in parallel43. The backbone (here the CSPDarknet-53) contains the CNN and generates feature maps at different levels of detail, which are passed to the neck. The neck then processes these feature maps and builds the feature maps used for prediction, which are passed to the head. Finally, the head predicts the classes of the objects, their bounding boxes, and their masks, which can be directly used to outline the objects. Figure 9 shows a simplified architecture of YOLOv8.

Figure 9

Flowchart showing a simplified architecture of YOLOv8. The C's represent the convolution layers (used in the backbone to extract relevant features from the images, such as edges and textures) and the P's represent the levels of the feature pyramid (used in the head to capture information at different scales and resolutions).

For the automatic dune detection, the following steps were taken sequentially. First, a large database of experimental data was constructed, taking into account binary interactions from previous work. Next, 7455 images were labeled manually using the CVAT platform (https://www.cvat.ai/), in which we decided when dunes merged or a new dune was ejected based on the continuity of areas covered with grains (examples of object labeling with CVAT are available in the Supplementary Information S1). A validation and training database was then created from the labeled images, in which we used 876 images for validation, and we trained for 300 epochs with a batch size equal to 1065. For the training, we made use of the pre-trained model yolov8n.pt and trained two specific layers: one to detect barchan dunes (Barchan) and the other to detect non-barchan objects (Not a barchan). Finally, a Python code was developed to run the trained model, detect, classify and outline the dunes, and analyze their morphology. The training was carried out on an NVIDIA RTX 2070 GPU using CUDA 12 and cuDNN 8 (CUDA Deep Neural Network library version 8), and we did not use data augmentation; however, the images had different resolutions, sharpness, and orientations.
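In ultralytics terms, the pipeline above reduces to a few calls. The sketch below is a hedged reconstruction, not the exact script used here: the dataset file barchans.yaml and the image size are illustrative placeholders, and for instance segmentation the yolov8n-seg.pt variant of the pre-trained weights would normally be used.

```python
from ultralytics import YOLO

# Minimal training sketch. "barchans.yaml" is a hypothetical dataset file
# listing the train/val image folders and the two classes used in this work:
#   names: {0: "Barchan", 1: "Not a barchan"}
model = YOLO("yolov8n.pt")       # pre-trained weights named in the text
model.train(
    data="barchans.yaml",
    epochs=300,                  # as in the text
    imgsz=640,                   # placeholder; images had various resolutions
    mosaic=0.0, fliplr=0.0,      # example of disabling built-in augmentations
)
metrics = model.val()            # mAP over the validation images
```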

The average accuracy of the trained CNN, for a given image dataset, is usually measured by the mean average precision mAP,

$$\begin{aligned} mAP= \frac{1}{T}\sum _{i=1}^{T} AP_i , \end{aligned}$$
(1)

where T is the number of categories (classes of segmented objects) and AP is the average precision of the segmentation42,44. For a given category i, the average precision is given by the integral of the precision P as a function of the recall R,

$$\begin{aligned} AP_i= \int _{0}^{1} PdR , \end{aligned}$$
(2)

where

$$\begin{aligned} P = \frac{TP}{TP+FP}, \qquad R = \frac{TP}{TP+FN} , \end{aligned}$$
(3)

TP being the true positives, FP the false positives, and FN the false negatives. Finally, the Intersection over Union, IoU, is a measure of how much the detection box overlaps the box containing the real object (ground truth)44:

$$\begin{aligned} IoU = \frac{\text {intersection area}}{\text {union area}} . \end{aligned}$$
(4)
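As a concrete illustration of Eqs. (1), (2) and (4), the sketch below computes the IoU of two axis-aligned boxes and approximates AP by numerically integrating the precision over the recall; it is a didactic example, not the evaluation code of the YOLO library.

```python
import numpy as np

def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2), Eq. (4)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def average_precision(precision, recall):
    """AP as the numerical integral of P over R, Eq. (2); recall ascending."""
    return float(np.trapz(precision, recall))

print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25/175, approximately 0.143
# mAP, Eq. (1), is then the mean of AP over the T categories, e.g.:
# mAP = np.mean([AP_barchan, AP_not_barchan])
```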

The estimated accuracy C plotted in Figs. 1, 4, 5, 6 and 7 for each detection box is usually called the confidence score, and corresponds to the precision P multiplied by the IoU and by the conditional class probability \(P_c\):

$$\begin{aligned} C = P_c \cdot P \cdot IoU , \end{aligned}$$
(5)

where \(P_c\) indicates if a given class is present in the box. The trained CNN is available in an open repository45.

Satellite images

We used a combination of satellite imagery platforms to collect images of the surfaces of Earth and Mars, which we used to train and apply the single-stage object detection model for identifying, classifying, and outlining dunes in satellite images. For Mars, high-resolution images were obtained from the HiRISE project32, and medium- and low-resolution images from the global CTX mosaic33. For the high-resolution images, we made use of the HiView code32 to convert the pixel scale to a physical unit (m). For terrestrial dunes, images ranging from low to high resolution were obtained from the Google Earth Pro and Copernicus46 platforms.
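Once the pixel scale of an image is known, converting the pixel-based measurements of the previous sections to physical units is immediate; in the minimal sketch below, the scale and the measured values are placeholders, not values from any specific image used here.

```python
scale = 0.25                                # m per px (placeholder value)
L_px, W_px, A_px = 180.0, 220.0, 31000.0    # hypothetical measurements in px

L_m, W_m = L_px * scale, W_px * scale       # lengths scale linearly with px size
A_m2 = A_px * scale**2                      # areas scale with the square of it

print(f"L = {L_m:.1f} m, W = {W_m:.1f} m, A = {A_m2:.0f} m^2")
```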

We trained the CNN following the same procedure as for the experiments (examples of object labeling with CVAT are available in the Supplementary Information S1). In this case, 12,395 images were labeled and trained for 300 epochs, and 2376 images were used for validation. The trained CNN is available in an open repository47.