Introduction
The Global Learning and Observations to Benefit the Environment (GLOBE) program is an international science and education program that provides people worldwide with the opportunity to participate in data collection and the scientific process and to contribute meaningfully to our understanding of the Earth system and global environment (GLOBE 2024a). Since 1995, hundreds of thousands of volunteers in more than 125 countries have contributed more than 250 million environmental observations, which are openly available through the GLOBE database (GLOBE 2024b).
Since its inception in 1995 as a classroom-based K–12 citizen science and education program, GLOBE has evolved to support multiple modalities of public participation in scientific research (Bonney et al. 2009; Shirk et al. 2012): collaborative research connecting students with research scientists, co-created citizen science projects supporting community-led investigations with volunteers, and contributory long-term citizen science through the program's GLOBE Observer (GO) mobile data collection app. The app's user-centered design, ease of data entry, availability in multiple languages, and convenient in-app training broaden GLOBE's reach and lower barriers to entry. With more than 280,000 participants worldwide, GLOBE relies on AI solutions to protect the privacy of users while maintaining the quality of participants' submitted data.
In this paper, the term AI refers to computer hardware, software, and processes that “enable machine learning or deep learning algorithms aiming to simulate the intellectual processes of humans,” including generalization, categorization, discovery, and learning from previous experience (Liu and Biljecki 2022). While citizen science harnesses human intelligence, intuition, and community engagement, AI contributes advanced computational capabilities, scalability, and efficiency. The combination of these approaches leads to novel, robust, and comprehensive research outcomes at scale, improves scientific productivity (Ceccaroni et al. 2019), and enables the scientific community to accelerate research so society is equipped to face today’s complex scientific challenges. AI and citizen science are increasingly intersecting fields, with AI enhancing data analysis capabilities and citizen science providing vast, diverse datasets (McClure et al. 2020).
Here, we discuss these intersections: how AI tools support the ingest of high-quality, relevant image data from the community, and how these same tools protect the privacy rights of citizens. Once data is acquired from participating volunteers, computer vision algorithms (Amazon 2024) support attribute recognition in the data as part of the quality control process. Not surprisingly, the velocity and volume of citizen science data obtained from participants feed back into the development of the automated classification algorithms used in those data quality processes. Concerns around participant interaction, specifically user privacy (Bowser et al. 2017), screening image submissions for appropriateness, and addressing the data quality issues commonly found in volunteer reporting, form the basis of GLOBE's AI ecosystem. These strategies and research have prepared the program to respond to emerging AI capabilities as processing moves from the cloud to edge computing on mobile devices (Zhang et al. 2018), where there are further opportunities to enhance the citizen science experience and satisfaction of our volunteers, especially in areas with low or no internet connectivity. Because we are working within a technical and rapidly changing computational environment, we stress here the ways in which GO's different independent AI initiatives come together to address the data challenges frequently encountered by citizen science projects.
Figure 1 describes GLOBE's AI citizen scientist ecosystem mediated through the GO data collection mobile app. The blue outlined boxes identify the structural components of GO, including the mobile app providing training and data collection, manual data review, and the database. The boxes outlined in red identify the pathways by which individual citizen scientists can engage in citizen science tasks, described as contributory (participants collect data), collaborative (participants are involved in data analysis and/or dissemination), and co-created (projects are designed and led by the community, and can be initiated by participants) (Shirk et al. 2012). Co-creation of projects is supported in-app through the geofencing tool, which enables targeted data collection by the community at a specific location and within a specific time frame, as sketched below. Within the GLOBE AI ecosystem, citizen scientists participate as collaborators in the manual labeling of images and review of AI classification outputs through hackathon events and Zooniverse. The GLOBE database is open and welcomes citizen scientists who wish to conduct their own research projects using GLOBE data. The AI-mediated data processes include (1) cloud database ingest of submitted data, (2) data extraction from submitted citizen scientist images, and (3) AI-driven data collector feedback and data enrichment through GeoAI integration.
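To make the geofencing mechanism concrete, the sketch below shows one way such a trigger can be evaluated on a device: a radius test around the requested location combined with a time window check. The `CollectionRequest` structure, haversine test, and thresholds are our illustrative assumptions, not the GO app's actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from math import radians, sin, cos, asin, sqrt

@dataclass
class CollectionRequest:
    """Hypothetical community data collection request (illustrative only)."""
    lat: float          # center of the target area, decimal degrees
    lon: float
    radius_km: float    # geofence radius
    start: datetime     # collection window opens
    end: datetime       # collection window closes

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def should_alert(req: CollectionRequest, lat: float, lon: float, now: datetime) -> bool:
    """True if the volunteer is inside the geofence during the request window."""
    inside = haversine_km(req.lat, req.lon, lat, lon) <= req.radius_km
    return inside and req.start <= now <= req.end
```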

Figure 1
Citizen scientist-AI data ecosystem mediated through the GLOBE Observer data collection mobile app. Human touch elements in the data procurement and analysis system are indicated in red. Blue arrows describe data pathways between user inputs in the app, AI routines, and data return. Dashed outlines indicate processes that are planned or in development. (a) Data collection app. (b) Manual image review. (c) GLOBE database. (d) Citizen scientist entry points to GLOBE science. (e) Geofence capability alerting volunteers to project-specific targeted data collection requests submitted by community users. (f) Citizen scientist tasks contributing to AI development and classification processes. (g) Data analysis outcomes. The AI-mediated data processes include (1) cloud database ingest of submitted data, (2) data extraction from submitted citizen scientist images, and (3) AI-driven data collector feedback and data enrichment through GeoAI integration.
Cloud Database Ingest
Artificial intelligence–based privacy and safety screening
Image data is one class of information uploaded by GLOBE observers (Figure 1). The images not only serve as validation data for the environmental text and numerical attributes submitted by participants, but also constitute a separate class of primary data supporting its own analysis and assessment of current environmental conditions and characteristics (de Mesquita et al. 2024; Flowers et al. 2023; Li and Hsu 2022; Yan and Ryu 2021; Zhao et al. 2021; Hoffmann et al. 2019; Perger et al. 2014).
Adherence to privacy and security standards is crucial when archiving citizen science data (Berti, Suman, and Albas 2023). To prevent the risk of exposing the public to inappropriate images, which may include violence, nudity, and offensive or harmful language, and to eliminate irrelevant photo submissions, GLOBE has screened every photo submitted via the mobile app since its launch in 2016. At the beginning of the project, a team of human reviewers manually checked submitted images periodically during the week (Figure 1b). A downside of this safety practice was the delay in image uploads, which caused concern among volunteers who did not immediately see their data in the database and also slowed participant feedback (Fischer, Cho, and Storksdieck 2021). A survey of GLOBE volunteers found that the two most significant reasons people stop participating in citizen science data collection are (1) experiencing technical barriers and (2) perceiving that their contribution was not meaningful or useful to scientists (Fischer, Cho, and Storksdieck 2021). The original manual photo review process likely contributed to user dissatisfaction and a decrease in continued participation.
In addition to removing inappropriate or irrelevant submissions, photo monitoring supports a stringent privacy policy. Some volunteers submit location-tagged photos that identify themselves or recognizable bystanders, including children, who have not consented to photography. To protect the privacy of participants and bystanders, the image validation team rejected all photos that included recognizable faces, as well as any text found on license plates or other sources. These actions are a common safety practice for projects that accept photos from mobile devices (Foody et al. 2024). Between 2016 and 2022, the manual photo review process rejected 14,739 photos, most for privacy issues. As the number of participants using the GO app grew, the volume of submitted images outpaced the team's capacity for manual review: by 2021, more than 429,500 photos had been submitted by citizen scientists (Kohl 2024). An automated solution was therefore sought to streamline the photo validation process, address the issue of data loss, and enable near-real-time upload of photos to the database.
Starting in 2022, GLOBE data is processed through the Amazon Rekognition software-as-a-service (SaaS) tool to automate the approval process (Amazon 2024). Rekognition is a pretrained computer vision machine learning service designed to analyze and label images, supplying an automated system of quality assurance checks on submissions (Singh and Arora 2024). The system screens for graphic image content, privacy, and categorical accuracy prior to data ingest into the public database. The pretrained algorithms used to screen contributed images include facial and text detection; based on the resulting AI tags, the system blurs face and text information (Figure 2). On a programmatic level, the GLOBE Data and Information System (DIS) team developed a machine learning algorithm that interprets the requested Rekognition tags and determines the likelihood that a submitted image is irrelevant or inappropriate for the long-term public database. To further improve confidence in the screening results, and to reduce the possibility that inappropriate content passes through the AI tools, each submitted image is also subdivided into smaller quadrants that are scanned by Rekognition individually. The tags from the overall photo and from the subdivided pieces are fed together into the programmatic algorithm, which determines whether an image passes, fails, or still requires human review. This process allows photos that the manual system previously rejected for privacy reasons to be altered and approved, increasing the volume of citizen science data accepted into the database. The AI photo review system has dramatically reduced the staff time needed to screen photos, even as the volume of submitted citizen science photos continues to increase.
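To illustrate the shape of such a pipeline, the sketch below combines the documented boto3 Rekognition calls (`detect_moderation_labels`, `detect_faces`, `detect_text`) with quadrant subdivision and region blurring. The thresholds, quadrant scheme, and blur logic are simplified assumptions and do not reproduce the GLOBE DIS production code.

```python
import io
import boto3
from PIL import Image, ImageFilter

rekognition = boto3.client("rekognition")

def _bytes(img: Image.Image) -> bytes:
    buf = io.BytesIO()
    img.save(buf, format="JPEG")
    return buf.getvalue()

def quadrants(img: Image.Image):
    """Yield the full image plus its four quadrants, so small features
    (a distant face, a license plate) are scanned at higher relative scale."""
    w, h = img.size
    yield img
    for box in [(0, 0, w // 2, h // 2), (w // 2, 0, w, h // 2),
                (0, h // 2, w // 2, h), (w // 2, h // 2, w, h)]:
        yield img.crop(box)

def is_inappropriate(img: Image.Image, min_conf: float = 80.0) -> bool:
    """Flag graphic or unsafe content found in any scanned piece."""
    for piece in quadrants(img):
        resp = rekognition.detect_moderation_labels(
            Image={"Bytes": _bytes(piece)}, MinConfidence=min_conf)
        if resp["ModerationLabels"]:
            return True
    return False

def blur_private_regions(img: Image.Image) -> Image.Image:
    """Blur detected faces and text for privacy screening."""
    boxes = [f["BoundingBox"] for f in
             rekognition.detect_faces(Image={"Bytes": _bytes(img)})["FaceDetails"]]
    boxes += [t["Geometry"]["BoundingBox"] for t in
              rekognition.detect_text(Image={"Bytes": _bytes(img)})["TextDetections"]]
    w, h = img.size
    for b in boxes:  # Rekognition boxes are fractions of the image dimensions
        x0, y0 = int(b["Left"] * w), int(b["Top"] * h)
        x1, y1 = x0 + int(b["Width"] * w), y0 + int(b["Height"] * h)
        region = img.crop((x0, y0, x1, y1)).filter(ImageFilter.GaussianBlur(12))
        img.paste(region, (x0, y0))
    return img
```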

Figure 2
Examples of photos flagged or blurred using AI screening: (a) license plate text; (b) face; (c) feet captured in a downward photo; (d) useful text data lost from AI blurring of features; (e) classification error where the AI system recognized mosquito larvae as text.
As of December 2023, 690,000 photos had been screened by the system. Nearly 430,000 images (~62%) were automatically approved, leaving 260,000 images for human review. Of those 260,000 photos, only 161 were rejected as inappropriate or irrelevant content. Many of the photos flagged by AI for human review were appropriately identified as including parts of a human figure or text; because downward-facing photos are requested by the GO cloud and land cover protocols, flagged photos frequently contain the volunteer's feet. To date, there have been no reported instances of an automatically approved photo containing inappropriate or irrelevant content, an effective 100% success rate in ensuring that only appropriate images are uploaded to the database (Holli Kohl, personal communication).
While the overall implementation of Rekognition for photo screening has been beneficial, it has also created new challenges. Rekognition occasionally makes classification errors; for example, it mistakenly interpreted a submitted photo of a mosquito larva as text and blurred it (Figure 2e), rendering the photo unsuitable for species classification. All original (pre-blurred) images are archived but not displayed in the database (Kohl 2024). Currently, rejection or modification of the data is not communicated to the submitter.
The automated blurring of text has affected the use of GLOBE app calibration stations by volunteers at visitor centers, museums, libraries, and similar locations. At a calibration site, volunteers are instructed to submit a photo of the calibration graphic to demonstrate that the compass directions and location in the app are accurate; in the current system, the calibration station graphics and text are blurred, eliminating their value. In other instances, GLOBE students occasionally include a paper data sheet in one of their downward photos to add information to the observation. In all these cases, the information is blurred by Rekognition (Figure 2d). Work is in progress to address these challenges and improve the quality of the automated processing routines.
Overall, however, AI photo screening has automated photo validation, ensured privacy, decreased data loss, and given users rapid access to their measurements. The result has been significantly fewer help desk requests about missing photos, anecdotally suggesting that the volunteer experience has improved. While AI has solved some of our problems, it has generated others, with implications for scientific use of the data, that still need addressing.
Data Extraction
Accommodating both casual users and dedicated citizen scientists (Schacher et al. 2023; Kottmann et al. 2018; Luna et al. 2018) was a design goal for the GO mobile application. All the data collection tools are built with the option for users to complete the entire research protocol or opt out after uploading photos, without manually entering descriptive attributes. In practical terms, this means that the app reduces time and complexity barriers for participation without discouraging users from submitting the images that serve as voucher data for the environmental phenomena reported in place and time. However, the flexibility that promotes participation and data collection also generates image data that is not classified and not readily discoverable for scientific use.
Data reported by citizen scientists using the GO app include location, time, images, and text attributes. Of these, the image data are the most problematic because they require computer vision classification to become discoverable. Identifying species and their environmental context is a common participatory science image task (cf. Sullivan et al. 2014; Campbell et al. 2023). Here, we present research on AI algorithms that classify image data differing in resolution, object scale, and content theme, drawn from two GO tools: Land Cover and Mosquito Habitat Mapper.
Application of artificial intelligence to classify citizen science mosquito images
Citizen scientists using the Mosquito Habitat Mapper tool are tasked with submitting images of mosquito larvae found in geolocated standing-water breeding habitats (Low et al. 2022).
GLOBE Observers can terminate data collection at several points in the protocol, based on their available time and interest. Approximately 65% of Mosquito Habitat Mapper observations are terminated at the first step, where the user uploads images of breeding sites observed on the landscape. Volunteers are encouraged to submit images of any larvae they find; of those who do, few use the built-in classification key to determine the genus of their specimen (Figure 3), and the accuracy of citizen scientist classifications is quite low. Fewer than 35% of the larvae images submitted to GLOBE are accompanied by a completed identification.

Figure 3
Screenshots of (a) the GLOBE Observer mobile app landing screen, (b) the drop-down menu listing selectable breeding habitats, and (c) the identification key built into the app to assist citizen scientists with identification of the medically important mosquito genera.
The identification of mosquito larvae is challenging, and expert validation is required. Recognizing that manual verification could become overwhelming, an image recognition prototype that includes image collection, training of image classifiers, specimen recognition, expert validation, and analytics was in development when the Mosquito Habitat Mapper was first released to the public (Muñoz et al. 2018, 2019), but availability of larval images for use in training the AI model constrained the outcomes of this initial research.
In recent years, citizen scientists have filled the data void through a variety of mosquito surveillance projects. The Global Mosquito Observation Dashboard (GMOD) was established in 2022 as a geographic information system (GIS)–powered dashboard providing global visualization of, and open cross-platform access to, interoperable citizen science mosquito data, including images from GO, Mosquito Alert, and iNaturalist (Carney et al. 2022; Uelmen et al. 2023). GMOD fosters collaboration between citizen science projects and provides a single-access portal to images and data for use in training AI classification algorithms as well as in mosquito vector disease research.
Researchers are progressing in the development of computer vision techniques to support community-based mosquito surveillance. The computer vision training datasets used in current research include the larvae photos submitted by citizen scientists. The diversity of smartphone hardware used to capture these images introduces a wide range of image qualities and characteristics that helps prevent overfitting of the AI model. This real-world variability in citizen-contributed image data cannot easily be reproduced by photographing laboratory specimens under controlled conditions.
Citizen science data is now used in public health decision-making and community-based interventions (Carney et al. 2023). Mosquito surveillance using the GO app is underway, supported by AI species recognition, to monitor for Anopheles stephensi, a dangerous malaria vector now invasive in parts of Africa (Sinka et al. 2020; Ahmed et al. 2022). The computer vision team at the University of South Florida (USF) gathered a dataset of 241 smartphone photos of verified specimens of An. stephensi and An. gambiae, taken at the insectaries at USF and the Centers for Disease Control and Prevention (CDC). The primary objective was to develop a localization model using Faster R-CNN (region-based convolutional neural network) to precisely identify and delineate the diagnostic areas of mosquito larvae: head, thorax, abdomen, and lower region. By delineating specific diagnostic areas of the mosquito, those areas alone can then be used to train identification algorithms, instead of training on whole-body images, hence removing unnecessary noise. The Faster R-CNN architecture follows the typical structure of a region proposal network (RPN) and a Fast R-CNN detector (Ren et al. 2017). A pretrained convolutional neural network, ResNet (He et al. 2018), was employed as the backbone for feature extraction, taking the input images and generating a feature map. The RPN, a fully convolutional network, operates on this feature map to generate region proposals (bounding boxes) along with class scores and bounding box regression coordinates for each proposal, using anchor boxes of varying scales and aspect ratios. These proposals are filtered and refined using non-maximum suppression (NMS). The refined proposals are used to extract features from the feature map, which are then fed into two fully connected layers for object classification and bounding box regression, predicting class probabilities and refining the bounding box coordinates, respectively. The model was trained end-to-end for 5,000 iterations on a subset of 164 photos to obtain the final object detections, classifications, and refined bounding boxes for the mosquito larvae body parts.
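For readers who want to experiment with this kind of architecture, torchvision packages the same building blocks; the sketch below fine-tunes a Faster R-CNN with a ResNet-50 FPN backbone for the four anatomical classes. The class names, dummy tensors, and optimizer settings are illustrative assumptions; the team's actual training code is linked in the Data Accessibility Statement.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Anatomical regions described above, plus the implicit background class.
CLASSES = ["background", "head", "thorax", "abdomen", "lower_region"]

# Faster R-CNN with a ResNet-50 FPN backbone pretrained on COCO.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box classification head with one sized to our classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, len(CLASSES))

# One illustrative training step: targets hold ground-truth boxes and labels.
model.train()
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
images = [torch.rand(3, 600, 800)]                       # one dummy photo
targets = [{"boxes": torch.tensor([[100., 80., 300., 220.]]),
            "labels": torch.tensor([1])}]                # class 1 = "head"
loss_dict = model(images, targets)                       # RPN + ROI-head losses
loss = sum(loss_dict.values())
optimizer.zero_grad()
loss.backward()
optimizer.step()
```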
A test dataset comprising two specimens, totaling 42 images, was fully withheld during training. On these test images, the model achieved 96.15% average precision at an intersection-over-union (IoU) threshold of at least 50%. Figure 4 shows the localization results for an An. stephensi specimen and the extracted body parts of the larva. In the future, our localization algorithms could be deployed through the app in real time to aid education in mosquito anatomy, as well as to improve taxonomic and sex classification, both manually by users and algorithmically through analysis of each anatomical region.
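For reference, the IoU criterion underlying this metric is the ratio of the overlap area to the union area of a predicted and a ground-truth box; a detection counts toward average precision here only when that ratio reaches 0.5. A minimal implementation, with hypothetical boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix1, iy1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A predicted box "hits" at the AP@0.5 threshold used above when:
hit = iou((100, 80, 300, 220), (110, 90, 310, 230)) >= 0.5   # IoU ≈ 0.79
```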

Figure 4
Localization test result of an Anopheles stephensi specimen: (a) full larva view; (b) boxes identifying segmentation of image for analysis; and (c) labeled data segments.
Transfer learning methodologies have been used to retrain the AI model, integrating a broader spectrum of mosquito species, gonotrophic stages, and image backgrounds present in photos contributed by citizen scientists (Minakshi et al. 2020; Azam et al. 2023; Carney et al. 2022). Such an approach broadens dataset diversity while strengthening the resilience of the resulting AI model, making it better adapted to real-world applications. This strategy enhances the scientific integrity of the AI-assisted image recognition work and highlights the indispensable role of community participation in advancing technological solutions against vector-borne diseases.
Hackathons have been employed to label mosquito images so that they can be used to train AI species recognition algorithms. For example, the Mosquito Alert AI Challenge brought together more than 1,000 participants and more than 104 unique teams from 70 countries, marking a vibrant convergence of AI and citizen science (AI Crowd and the Mosquito Alert Team 2023). Collaborative events like this crowdsourced challenge unleash the potential of citizen-driven data collection and propel the creation of AI tools for use in citizen science mosquito surveillance projects.
GeoAI Land Cover
High-resolution GO Land Cover photos support the interpretation and classification of features observed from space, and more accurate land cover classifications aided by examination of citizen science photos support informed decision-making about infrastructure, development, and conservation (Amos et al. 2020; Kohl et al. 2021). Volunteers are asked to capture standardized directional images to sample the patterns and processes taking place on the landscape around them at local scales. Using the camera on their mobile device, volunteers collect six directional (north, south, east, west, up, and down) images at their location (Figure 5).

Figure 5
GLOBE Observer land cover tool data collection protocol. (a) Examples of four directional views at each location. (b) Images obtained from the four cardinal directions for a series of land cover types (from Huang et al. 2023, reproduced with permission of the author).
Participants have the option of using the built-in classification tool to describe the land cover at their location. However, visually estimating the percentage cover of each land cover class is both time consuming and challenging, and only 35% of citizen scientists who submit land cover photos complete this task.
Huang et al. (2023) provide empirical evidence that land cover classification benefits from the crowdsourced multi-directional images collected in GO, compared with single-image observations. The study demonstrated that, when faced with complex multi-view pairs, an AI model could locate relevant discriminatory features and sidestep irrelevant ones, providing convincing evidence that crowdsourcing programs should adopt multi-directional observation protocols. This outcome affirms the scientific value of the GO Land Cover approach, with its unique requirement of six images from each sampling point (four cardinal directions, up, and down). Huang et al. (2023) concluded that deep classification models using the contributed multi-view data achieved the highest performance with the EfficientNet architecture, a late fusion strategy, and the maximum of four cardinal directions, attaining up to an 8.4% weighted F1-score improvement over single-view approaches.
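A schematic of the late-fusion idea: each directional photo is scored by a shared EfficientNet encoder, and the per-view predictions are combined at the decision level. Mean-fused logits, EfficientNet-B0, and nine classes are simplifying assumptions here; Huang et al.'s (2023) exact fusion operator and configuration may differ.

```python
import torch
import torch.nn as nn
import torchvision

class LateFusionLandCover(nn.Module):
    """Classify a site from multiple directional photos via late fusion."""
    def __init__(self, num_classes: int):
        super().__init__()
        # Shared per-view encoder; each view is scored independently.
        self.encoder = torchvision.models.efficientnet_b0(num_classes=num_classes)

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, n_views, 3, H, W), e.g. n_views = 4 cardinal photos
        b, v, c, h, w = views.shape
        logits = self.encoder(views.reshape(b * v, c, h, w))  # score each view
        logits = logits.reshape(b, v, -1)
        return logits.mean(dim=1)  # decision-level (late) fusion across views

model = LateFusionLandCover(num_classes=9)
fused = model(torch.rand(2, 4, 3, 224, 224))  # 2 sites x 4 cardinal views
```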
Huang et al. (2023) also developed tools for quantifying the quality of the labeled training data, supporting scientist confidence in the reuse of training data in other AI applications. The creation of training data needed for supervised machine learning requires significant time and energy, so it is expedient to reuse training data where possible. This research has implications for machine learning research at large, including contributions to new machine learning hubs that host data according to machine learning standards, as well as tools for data manipulation, and model output evaluation tools.
Machine learning–enabled classification allows for data discovery. Manzanarez, Manian, and Santos (2022) focused on the uncertainty created by mixed pixels at the edges and corners of neighboring classes to improve the prediction of land cover classes and the labeling of the citizen science image database. They developed an integrated architecture that combined U-Net and DeepLabV3 for initial segmentation, followed by a weighted fusion model that combines the segmentation labels. A total of 2,916 GLOBE images were labeled with land cover classes using this integrated model, with 90.97% label accuracy and minimal human involvement. The authors highlighted the variability and uncertainty produced when people worldwide, with different camera settings, lighting, and acquisition parameters, contribute to a common dataset. These sources of error, together with the small dataset, made it difficult to further discriminate between building, road, and sidewalk classes, and between grass and tree classes.
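The weighted-fusion step can be pictured as a convex combination of the per-pixel class probability maps produced by the two segmenters, followed by a per-pixel argmax. The fixed weight and random stand-in outputs below are illustrative; the published model's weighting scheme may differ.

```python
import torch

def weighted_fusion(probs_unet: torch.Tensor,
                    probs_deeplab: torch.Tensor,
                    w_unet: float = 0.5) -> torch.Tensor:
    """Fuse per-pixel class probabilities from two segmentation models.

    Both inputs are (batch, n_classes, H, W) softmax maps; the output is a
    (batch, H, W) map of fused class labels. A learned weight could replace
    the fixed scalar used here.
    """
    fused = w_unet * probs_unet + (1.0 - w_unet) * probs_deeplab
    return fused.argmax(dim=1)

# Illustrative stand-ins for the two models' outputs (8 land cover classes):
p_unet = torch.softmax(torch.rand(1, 8, 128, 128), dim=1)
p_deeplab = torch.softmax(torch.rand(1, 8, 128, 128), dim=1)
labels = weighted_fusion(p_unet, p_deeplab, w_unet=0.6)  # (1, 128, 128) map
```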
At the current stage of development, the enriched GO Land Cover data, including AI-generated land cover classifications, exist outside the GO AI ecosystem. A fully enabled land cover classification AI routine would not only exist in the cloud, but also be located in the mobile device app where it could provide detailed information and feedback during the data collection process. Full implementation of GO Land Cover classification AI would also provide flexibility for inclusion of a greater number of GeoAI datasets in the existing validation and comparison tasks. In doing so, citizen scientists could provide detailed feedback for scientists who are seeking ground-based understanding of spatially-explicit data models.
While not fully implemented in GO's citizen science–AI ecosystem, progress on the Land Cover AI classification tools is underway and benefits from burgeoning research (Song et al. 2023; Taskin, Aptoula, and Ertürk 2024). For example, the GeoWiki Project is a citizen science network dedicated to improving the overall quality of land use and land cover maps across the globe; imagery labeled by its users has informed model architectures and loss functions for recursive self-improvement of land cover maps (Laso Bayas et al. 2022). Volunteers have cleaned and structured crowdsourced data for better compatibility with computer vision expectations, limiting the preprocessing burden experienced in some human-AI collaborations (Aristeidou et al. 2021). Research projects interacting directly with public contributors have guided data collection toward high-value observational gaps; Fritz et al. (2019) similarly recommend engaging local communities in co-identifying places on the landscape that need supplementary sampling by volunteers, to maximize mutual benefits. The full power of AI in this context lies in the collaborative partnership of human intelligence with AI systems.
The GO Land Cover data classification challenge is an important way for the public to contribute to participatory science, whether through virtual land cover labeling projects at hackathons (Yang and Huan 2023) or on Zooniverse, a highly popular web-based citizen science platform (Simpson, Page, and De Roure 2014). Hosting events and working with student citizen scientist groups interested in contributing to this effort supports the inclusion of diverse perspectives in scientific discovery (de Sherbinin et al. 2021).
Data Collector Feedback
AI-based volunteered data feedback
Significant research has addressed the challenge of attracting and retaining volunteers in a participatory science program. In a survey of 4,000 GO app users, Fischer, Cho, and Storksdieck (2021) found that contributing to science, and especially to NASA research, was a strong motivator for volunteers. Feedback signaling the scientific importance of submitted data was identified as a positive motivating factor that reinforced participation and improved the user experience (Amos et al. 2020; Fischer, Cho, and Storksdieck 2021; Geoghegan et al. 2016).
Automated user feedback providing a satellite image comparison is built into the workflow of GO's Clouds tool. Volunteers who submit GO cloud observations within 15 minutes of a satellite cloud observation receive an email containing that satellite image to compare with their own ground-based observation (Amos et al. 2020). This communication between the program and participants has proved so positive that similar feedback is in a testing phase for volunteers who submit Land Cover photos (Kohl 2024).
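Conceptually, the match behind this feedback is a temporal join: a ground observation is paired with any satellite overpass that falls within the 15-minute window. A schematic sketch with hypothetical timestamps follows; the operational matching pipeline is more involved.

```python
from datetime import datetime, timedelta

MATCH_WINDOW = timedelta(minutes=15)

def match_satellite(obs_time: datetime,
                    overpasses: list[datetime]) -> datetime | None:
    """Return the closest satellite overpass within the match window, if any."""
    candidates = [t for t in overpasses if abs(t - obs_time) <= MATCH_WINDOW]
    return min(candidates, key=lambda t: abs(t - obs_time)) if candidates else None

# Hypothetical example: one cloud observation and two overpass times.
obs = datetime(2024, 5, 1, 14, 7)
passes = [datetime(2024, 5, 1, 13, 40), datetime(2024, 5, 1, 14, 12)]
print(match_satellite(obs, passes))  # 2024-05-01 14:12 (within 5 minutes)
```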
We envision future in-app capabilities to classify images for all four GO tools (Trees, Clouds, Land Cover, and Mosquito Habitat Mapper), so that information collected by volunteers can be processed and returned far more efficiently. Moving the AI tools from the cloud to the mobile device has the potential to improve data collection and quality as well as the user experience. Berenguer-Agullo et al. (2024) demonstrated the practicality of edge computing for bird classification, reducing the misclassifications stored in the project database. Velasco-Montero et al. (2024) improved classification by integrating AI continual learning into a smart device (a wildlife camera trap). Huang et al. (2018) and Minakshi et al. (2020) both describe edge computing solutions that automate surveillance of mosquito vectors in traps using computer vision techniques. Edge computing would greatly improve the GO user experience and motivate participants by providing near-real-time feedback. Automated data returns that confirm or clarify volunteered observations not only support longitudinal skill development but also promote self-efficacy, both of which improve data quality and participant retention in citizen science programs (Eveleigh et al. 2014).
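As a sketch of what moving a classifier onto the device could look like, the snippet below exports a trained model to TorchScript, one common route to on-device inference under PyTorch's mobile runtimes. The MobileNetV3 stand-in and file name are assumptions for illustration, not a committed GO implementation.

```python
import torch
import torchvision

# A trained classifier (pretrained ImageNet weights serve as a stand-in here).
model = torchvision.models.mobilenet_v3_small(weights="DEFAULT")
model.eval()

# Trace with a representative input and serialize for on-device use.
example = torch.rand(1, 3, 224, 224)
scripted = torch.jit.trace(model, example)
scripted.save("landcover_classifier.pt")

# On the device, the archive is loaded and run without the Python training stack.
loaded = torch.jit.load("landcover_classifier.pt")
with torch.no_grad():
    probs = torch.softmax(loaded(example), dim=1)
```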
Finally, GeoAI describes geographic research that incorporates computational science with principles of geography, remote sensing, human-environment interactions, and a growing volume of digital Earth observations (Janowicz et al. 2019). According to an editorial and review by Song et al. (2023), GeoAI applications can be grouped into four categories: (1) buildings and infrastructure, (2) land use analysis, (3) natural environment and hazards, and (4) social issues and human activities. With these categories as a guide, participatory science projects have the opportunity to identify new scientific datasets at higher spatial and temporal resolutions than were available before 2020 (Song et al. 2023). These locally and thematically relevant datasets could further enrich the collected imagery, geospatial attributes, and place-based environmental description of a participatory science project like The GLOBE Program.
Conclusion
The development of the GO AI ecosystem is still in its infancy. GO is used in 127 countries around the world and accepts data from an estimated 3,000 different devices, each with sensors that need support from the technical team. The labor required to release timely app updates and support citizen scientists in the field represents a significant part of the operating budget. However, the field of AI is producing rapid development and labor efficiency gains, particularly related to map outputs (Ma et al. 2019). Wang et al.'s (2024) systematic review of geospatial artificial intelligence (GeoAI) research related to human geography yielded an astounding 14,537 published articles; in these articles, image-based classification was among the top three modeling tasks. A dramatic surge in GeoAI publications began in 2017, but only recently has a standard method emerged for organizing and arranging datasets for AI model development. As GLOBE continues to strive to improve the quality and usefulness of its citizen science data, adopting machine-readable standards such as Croissant to describe structured data (Shinde and Koehl 2024; ML Commons 2024) will be critical.
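To give a feel for what such a standard looks like, below is a deliberately simplified, hypothetical Croissant-style record for a GLOBE photo dataset, expressed as a Python dict serialized to JSON-LD. Field names follow the MLCommons Croissant vocabulary only loosely; real records are considerably richer and should be authored against the published specification.

```python
import json

# Simplified, illustrative Croissant-style dataset card (not spec-complete).
dataset_card = {
    "@context": {"@vocab": "https://schema.org/",
                 "cr": "http://mlcommons.org/croissant/"},
    "@type": "Dataset",
    "name": "globe-observer-landcover-photos",   # hypothetical dataset name
    "description": "Directional land cover photos contributed by volunteers.",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "distribution": [{
        "@type": "cr:FileSet",
        "name": "photos",
        "encodingFormat": "image/jpeg",
        "includes": "photos/*.jpg",
    }],
}

print(json.dumps(dataset_card, indent=2))
```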
To date, the most mature AI components related to image classification and identification have been developed in the context of GO citizen science mosquito data. AI researchers have demonstrated an end-to-end workflow for identifying a limited number of mosquito vector species, and the team is adding capabilities to identify additional mosquito species that impact human health. This work is taking place in conjunction with partner citizen science programs through GMOD, the interoperable citizen science mosquito data portal.
Research on image recognition and classification using GO Land Cover images is at an earlier stage than the GO Mosquito Habitat Mapper work, and is now addressing the data challenges recognized within the wide variety of citizen scientist contributions. Researchers have developed several different approaches to land cover classification, and these need harmonization before use in a more foundational classification model; further research opportunities remain around oriented imagery classification. Finally, we recognize the transformational and transdisciplinary paradigm shift that AI and GeoAI bring to the participatory sciences, and recommend further development of multi-project integration, incorporation of spatially explicit GeoAI models, and adoption of open science practices in future studies based on environmental participatory projects.
Data Accessibility Statement
GMOD is available at mosquitodashboard.org. The dataset used in the Mosquito Alert AI Challenge is available after logging in to the AIcrowd platform; for more details, visit https://www.aicrowd.com/challenges/mosquitoalert-challenge-2023. The model and dataset used in the USF larvae localization are available at mosquitoID.org and https://github.com/FarhatBuet14/mosquitoAI/tree/main/larvaeNET/Larvae%20Localization/larvae_anatomy_localization. The Amazon Rekognition algorithm versions used in GLOBE Observer photo ingest screening were: TextModelVersion 3.0, LabelModelVersion 3.0, and ModerationModelVersion 7.0.
Acknowledgements
The authors thank the volunteers who participated in the GLOBE, Mosquito Alert, and iNaturalist citizen science projects that enabled this research. Thanks to Andrew Clark, Carla McAuliffe, and Cassie Soeffing (IGES) for editorial support, as well as Karlene Rivera, Pradeep Subramani, John Adams (USF), and Sarah Zohdy (CDC) for larval images used in AI training. The findings and conclusions in this manuscript are those of the author(s). Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funders. This manuscript is submitted for publication with the understanding that the United States Government is authorized to reproduce and distribute reprints for Governmental purposes.
Funding Information
The GLOBE Program is sponsored by the National Aeronautics and Space Administration, National Science Foundation, National Oceanic and Atmospheric Administration, and U.S. Department of State and managed by NASA. GLOBE Observer is supported by NASA Science Activation Award NNX16AE28A for the NASA Earth Science Education Collaborative (Theresa Schwerin, IGES, PI). Additional support for Peder Nelson is based upon work supported by the U.S. Geological Survey under Grant/Cooperative Agreement No. G23AP00683 (GY23-GY27).
The Global Mosquito Observation Dashboard was funded by the National Science Foundation under Grant No. IIS-2014547 to PI Ryan Carney (USF) and Co-PIs Sriram Chellappan (USF) and Russanne Low (IGES). Research by Xiao Huang and Di Yang was funded through NASA EPSCoR Grant #80NSSC21M0177. Initial AI classification models for GLOBE citizen science mosquito larvae were funded by NSF EAGER #1645154 to PI Rebecca Boger (Brooklyn College), Co-PIs Russanne Low (IGES) and Geoffrey Haines-Styles. Mosquito Alert is funded by (a) the European Commission, under Grants CA17108 (AIM-COST Action), 874735 (VEO), 853271 (H-MIP), and 2020/2094 (NextGenerationEU, through CSIC's Global Health Platform, PTI Salud Global); (b) the Dutch National Research Agenda (NWA), under Grant NWA/00686468; and (c) "la Caixa" Foundation, under Grant HR19-00336.
Competing Interests
The authors have no competing interests to declare.
