AI Takes the Plunge: Monitoring Inland Water Quality with an Intelligent Agent

Author: Denis Avetisyan


A new agentic AI system, NAIAD, is poised to transform how we understand and manage the health of our lakes, rivers, and streams.

The NAIAD system leverages an agentic AI approach to inland water monitoring, where user prompts initiate a workflow involving a large language model and an autonomous agent that collaboratively retrieve data from a vector database and utilize external analytical tools to generate comprehensive, user-adapted water quality assessments.

This paper details NAIAD, an agentic AI assistant leveraging remote sensing, large language models, and retrieval-augmented generation for adaptive inland water quality monitoring through natural language interaction.

Effective inland water monitoring requires holistic assessments, yet current approaches often address individual quality indicators in isolation. This limitation motivates the development of ‘Naiad: Novel Agentic Intelligent Autonomous System for Inland Water Monitoring’, which introduces an agentic AI assistant capable of synthesizing diverse Earth Observation data and analytical tools via natural language interaction. By leveraging Large Language Models and Retrieval-Augmented Generation, Naiad delivers tailored reports and demonstrates strong performance across varying user expertise levels. Could such an adaptable system represent a paradigm shift in accessible and proactive environmental management?


Unveiling the Complexity of Inland Water Systems

Conventional methods of tracking inland water quality demand considerable resources, primarily due to the logistical complexities of manual sampling. Historically, scientists have relied on physically collecting water samples from a limited number of locations, a process that is both time-consuming and expensive. This approach inherently provides only a snapshot of conditions at specific points and times, failing to capture the dynamic and often spatially variable nature of these ecosystems. The infrequent nature of these assessments further hinders proactive management, as changes in water quality, whether from agricultural runoff, industrial discharge, or natural events, may go undetected until they become critical issues. Consequently, a more comprehensive and efficient approach to monitoring is needed to effectively safeguard these vital freshwater resources.

Effective freshwater resource management is increasingly hampered by escalating environmental stressors. Climate change intensifies extreme weather events – droughts and floods – which concentrate pollutants and alter water temperatures, impacting aquatic ecosystems and human access to clean water. Simultaneously, shifts in land use, such as agricultural expansion and urbanization, introduce non-point source pollution – fertilizers, pesticides, and stormwater runoff – further degrading water quality. These combined pressures demand a move from reactive, crisis-driven water management to a proactive approach, yet traditional monitoring systems struggle to provide the timely, comprehensive data needed to anticipate and mitigate emerging threats, leaving water managers perpetually playing catch-up with rapidly changing conditions.

Current remote sensing techniques, while offering broad overviews of inland water bodies, frequently struggle with the nuanced complexity of water quality assessment. Limitations arise from the difficulty in discerning specific pollutants or biological factors – such as algal blooms or sediment loads – that dramatically affect water health. The spectral signatures detected by satellites or aerial sensors can be ambiguous, often reflecting a combination of variables and failing to pinpoint the precise cause of water degradation. Furthermore, these methods require substantial contextual data – including depth, flow rate, and surrounding land use – to accurately interpret the sensed signals. Without integrating this crucial information, remote sensing can produce misleading or incomplete assessments, hindering effective water resource management and proactive intervention strategies.

NAIAD effectively answers user queries about Greek lake ecosystems, including topics like chlorophyll-a, precipitation, and cyanobacteria, by retrieving, reasoning with, and summarizing scientific data using models such as Qwen 2.5 and Gemma 3.

NAIAD: An Adaptive Intelligence for Water Systems

NAIAD is an agentic AI assistant developed to automate and adaptively analyze data related to inland water monitoring. This system moves beyond traditional static analysis by employing artificial intelligence to dynamically adjust analytical workflows based on incoming data and user requests. The agentic framework allows NAIAD to independently execute complex tasks – including data retrieval, processing, and interpretation – without constant human intervention. This capability is intended to significantly improve the efficiency and scalability of inland water monitoring programs, facilitating more timely and comprehensive assessments of water quality, quantity, and ecosystem health.

NAIAD utilizes Large Language Models (LLMs), specifically Qwen 2.5 and Gemma 3, as the core of its functionality. These LLMs process user-submitted queries, translating natural language into structured analytical tasks. This capability allows NAIAD to dynamically build and execute complex workflows without requiring pre-defined scripts or extensive user specification. The LLMs are responsible for identifying the relevant data sources, selecting appropriate analytical methods, and interpreting the results to deliver actionable insights. This agentic approach allows for flexible and adaptive analysis of inland water monitoring data based on varied user requests.

Retrieval-Augmented Generation (RAG) is a core component of NAIAD’s architecture, enabling the Large Language Models (Qwen 2.5 and Gemma 3) to access and incorporate relevant data during query processing. Grounding responses in retrieved material improves contextual understanding and accuracy. In the reported evaluation, the overall correctness rate was 82.98% when RAG was used with both LLMs, a result consistent with effective information retrieval and fewer factually incorrect outputs. RAG also allows NAIAD to move beyond the limits of pre-trained knowledge and dynamically incorporate current, specific data related to inland water monitoring.
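The retrieval step described above can be illustrated with a minimal sketch. It is not NAIAD's implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list `DOCS` stands in for the vector database, and all function names and sample documents are hypothetical.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Prepend retrieved context to the question before it is sent to an LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge snippets standing in for the vector database contents.
DOCS = [
    "chlorophyll-a concentration indicates phytoplankton biomass in lakes",
    "precipitation records describe rainfall over the lake catchment",
    "cyanobacteria blooms can produce toxins in warm nutrient-rich water",
]

if __name__ == "__main__":
    print(build_prompt("what does chlorophyll-a concentration indicate", DOCS, k=1))
```

The generation step would pass the assembled prompt to the language model, so the answer is grounded in retrieved text rather than in pre-trained knowledge alone.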

Evaluations of NAIAD’s query translation capabilities measure how relevant the generated responses are to what users asked. The Qwen 2.5 model achieved a 78.72% relevance score, while the Gemma 3 model reached 68.09%. These metrics come from tests designed to assess the system’s ability to interpret user intent and formulate appropriate analytical responses for inland water monitoring applications. The gap of roughly ten percentage points between the two models suggests differing strengths in contextual understanding and query interpretation.

NAIAD leverages Retrieval-Augmented Generation (RAG) and a dynamically constructed Directed Acyclic Graph (DAG) to interpret user queries, orchestrate necessary tools, and synthesize a tailored report enhanced by reflection mechanisms ensuring relevant and accurate findings.

Orchestrating Insights: Workflows Defined by Directed Acyclic Graphs

NAIAD employs Directed Acyclic Graphs (DAGs) as a computational framework for defining and executing analytical workflows. A DAG consists of nodes representing individual data processing tasks, such as data ingestion, transformation, or analysis, and directed edges representing the dependencies between these tasks. This structure supports reproducibility by making the order of operations and the data lineage explicit. Scalability follows from the parallelism inherent in DAGs: tasks that do not depend on one another can be executed concurrently across distributed computing resources. Because the graph is acyclic, a valid execution order always exists and no task can wait, directly or indirectly, on its own output, which rules out circular dependencies and infinite loops in the workflow. Each node in the DAG can be independently versioned and tested, facilitating workflow maintenance and updates without affecting the overall system integrity.
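As a rough illustration of DAG-based orchestration, the sketch below runs a small workflow in dependency order using the standard-library `graphlib.TopologicalSorter` (Python 3.9+). The task names, the toy return values, and the `run_workflow` helper are hypothetical and are not taken from NAIAD; they only show how dependencies determine execution order.

```python
from graphlib import TopologicalSorter

def run_workflow(dependencies, tasks):
    """Run tasks in dependency order; each task receives its upstream results.

    dependencies maps a task name to the set of task names it depends on.
    A cyclic graph raises graphlib.CycleError before any task runs.
    """
    results = {}
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = tasks[name](upstream)
        order.append(name)
    return order, results

def compute_ndci(upstream):
    """Toy analysis node: a normalized difference of two reflectance values."""
    bands = upstream["fetch_imagery"]
    return (bands["red_edge"] - bands["red"]) / (bands["red_edge"] + bands["red"])

# Hypothetical workflow: two independent fetches, one analysis step, one report.
DEPENDENCIES = {
    "compute_ndci": {"fetch_imagery"},
    "report": {"compute_ndci", "fetch_weather"},
}

TASKS = {
    "fetch_imagery": lambda up: {"red": 0.04, "red_edge": 0.06},  # made-up values
    "fetch_weather": lambda up: {"precip_mm": 12.0},              # made-up values
    "compute_ndci": compute_ndci,
    "report": lambda up: (
        f"NDCI={up['compute_ndci']:.2f}, "
        f"precip={up['fetch_weather']['precip_mm']} mm"
    ),
}

if __name__ == "__main__":
    order, results = run_workflow(DEPENDENCIES, TASKS)
    print(order)
    print(results["report"])
```

In this example `fetch_imagery` and `fetch_weather` share no dependency, so a parallel scheduler could run them concurrently; the sequential loop here is kept only for brevity.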

NAIAD’s workflow orchestration relies on integrating heterogeneous data sources within its Directed Acyclic Graphs (DAGs). Specifically, data from Sentinel-2 and Landsat satellites provide multi-spectral imagery for land and water surface analysis. Complementing this optical data are meteorological inputs, including temperature, precipitation, and solar radiation, sourced from various weather models and observation networks. Furthermore, in-situ sensor measurements, such as water quality parameters (e.g., turbidity, chlorophyll-a) and hydrological data (e.g., water level, flow rate) collected from field deployments and monitoring stations, are incorporated to provide ground truth and localized validation. This multi-source data integration enables comprehensive environmental assessments and model calibration within the NAIAD system.

Analytical processing within NAIAD workflows incorporates the calculation of remote sensing indices, such as the Normalized Difference Chlorophyll Index (NDCI), used to estimate chlorophyll-a concentration as a proxy for phytoplankton biomass. These calculations utilize spectral reflectance data from satellite imagery. Simultaneously, the system models nutrient loads, quantifying the influx of nitrogen and phosphorus into aquatic ecosystems, and assesses land cover change by classifying and monitoring alterations in terrestrial surfaces using multi-temporal imagery. These modeled parameters are critical inputs for predictive analyses, enabling the assessment of environmental factors influencing water quality and ecosystem health.
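For reference, NDCI is commonly defined as the normalized difference between red-edge reflectance (around 708 nm) and red reflectance (around 665 nm): NDCI = (R_rededge - R_red) / (R_rededge + R_red). With Sentinel-2 imagery these are usually taken from bands B5 and B4. The sketch below applies that formula per pixel in plain Python; the function names and sample reflectance values are illustrative assumptions, and how NAIAD computes the index internally is not specified here.

```python
def ndci(red_edge: float, red: float):
    """NDCI for one pixel: (red_edge - red) / (red_edge + red).

    Returns None when both reflectances are zero (e.g. masked or no-data pixels),
    because the ratio is undefined there.
    """
    total = red_edge + red
    if total == 0:
        return None
    return (red_edge - red) / total

def ndci_grid(red_edge_band, red_band):
    """Apply ndci() pixel by pixel to two equally shaped 2-D lists of reflectance."""
    return [
        [ndci(re, r) for re, r in zip(re_row, r_row)]
        for re_row, r_row in zip(red_edge_band, red_band)
    ]

if __name__ == "__main__":
    # Made-up surface reflectance values for a 2x2 patch of water pixels.
    red_edge = [[0.06, 0.05], [0.03, 0.00]]
    red = [[0.04, 0.05], [0.05, 0.00]]
    for row in ndci_grid(red_edge, red):
        print(row)
```

For non-negative reflectances the index lies between -1 and 1, with higher values indicating stronger red-edge reflectance relative to red, which is typically associated with more chlorophyll-a.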

The CyFi platform automates multi-parameter assessment within NAIAD’s workflow orchestration system to facilitate cyanobacterial bloom forecasting. This integration involves the automated retrieval and processing of relevant environmental data (including remotely sensed imagery, meteorological records, and in-situ sensor data) and its subsequent use in CyFi’s established algorithms for bloom risk assessment. CyFi provides standardized metrics and visualizations, streamlining the analysis process and enabling consistent, reproducible forecasts of bloom occurrence, intensity, and spatial extent. The platform’s automation capabilities reduce manual intervention, allowing for timely and scalable monitoring of water quality conditions conducive to cyanobacterial blooms.

Normalized Difference Chlorophyll Index (NDCI) overlays reveal chlorophyll distribution within Lake Lysimachia, Lake Trichonida, and artificial Lake Mornos.

From Data to Actionable Understanding: Assessing the Health of Water Systems

NAIAD utilizes sophisticated workflows to derive key indicators of water health, most notably turbidity and the trophic state index. Turbidity, a measure of water cloudiness, reveals the presence of suspended particles impacting light penetration and potentially harboring pollutants. Simultaneously, the trophic state index assesses the level of biological productivity, ranging from nutrient-poor oligotrophic waters to excessively enriched eutrophic conditions – often signaling algal blooms and oxygen depletion. By calculating these and other related parameters, NAIAD delivers a holistic assessment, moving beyond simple measurements to provide a nuanced understanding of a water body’s overall condition and its capacity to support aquatic life. This comprehensive approach enables proactive management and informed decision-making regarding water resource protection.
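To make the trophic state index concrete, the sketch below uses Carlson's (1977) formulation, one widely used definition. This is an assumption: the text does not say which formulation NAIAD applies. The class boundaries (40, 50, 70) follow a commonly cited convention and are likewise assumed, as are the function names.

```python
import math

def _require_positive(value: float, name: str) -> None:
    """Logarithms below are only defined for strictly positive inputs."""
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value}")

def tsi_chlorophyll(chl_ug_per_l: float) -> float:
    """Carlson TSI from chlorophyll-a in micrograms per litre."""
    _require_positive(chl_ug_per_l, "chlorophyll-a")
    return 9.81 * math.log(chl_ug_per_l) + 30.6

def tsi_secchi(depth_m: float) -> float:
    """Carlson TSI from Secchi disk depth in metres."""
    _require_positive(depth_m, "Secchi depth")
    return 60.0 - 14.41 * math.log(depth_m)

def tsi_total_phosphorus(tp_ug_per_l: float) -> float:
    """Carlson TSI from total phosphorus in micrograms per litre."""
    _require_positive(tp_ug_per_l, "total phosphorus")
    return 14.42 * math.log(tp_ug_per_l) + 4.15

def classify(tsi: float) -> str:
    """Map a TSI value to a trophic class (assumed, commonly cited boundaries)."""
    if tsi < 40:
        return "oligotrophic"
    if tsi < 50:
        return "mesotrophic"
    if tsi < 70:
        return "eutrophic"
    return "hypereutrophic"

if __name__ == "__main__":
    for chl in (1.0, 5.0, 20.0, 100.0):  # illustrative concentrations
        tsi = tsi_chlorophyll(chl)
        print(f"chl-a={chl} ug/L -> TSI={tsi:.1f} ({classify(tsi)})")
```

Because chlorophyll-a can be estimated from satellite indices such as NDCI, the chlorophyll-based variant is the one most directly connected to remote sensing; the Secchi and phosphorus variants rely on in-situ measurements.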

NAIAD’s assessment of water health benefits significantly from a synergistic combination of data sources: remote sensing and in-situ measurements. While satellite-based remote sensing provides broad spatial coverage, crucial details about water composition and immediate conditions often require direct, on-site measurements. By intelligently integrating these two approaches, NAIAD effectively bridges data gaps inherent in relying on a single source. Remote sensing data informs the overall picture, while in-situ measurements provide critical ground truth and refine the analysis, ultimately leading to more accurate and reliable assessments of water quality parameters and a more comprehensive understanding of aquatic ecosystem health. This combined methodology ensures a robust evaluation, even in areas with limited accessibility or infrequent monitoring capabilities.

The fusion of remotely sensed data with direct, on-site measurements represents a significant advancement in proactive water resource management. This integrated approach allows for the identification of subtle changes in water quality – such as increases in turbidity or shifts in algal blooms – that might otherwise go unnoticed until problems become severe. Consequently, authorities and resource managers gain critical lead time to implement targeted interventions, ranging from adjusting water treatment processes to addressing pollution sources before widespread ecological or public health impacts occur. This capacity for early detection not only minimizes the costs associated with remediation but also supports the preservation of aquatic ecosystems and the reliable provision of clean water supplies, fostering long-term sustainability.

Traditional water quality monitoring often strains resources, particularly for programs operating with limited personnel or funding. NAIAD addresses this challenge through automated analytical workflows, significantly decreasing the demand on manual data processing and interpretation. This automation enables monitoring programs to expand their spatial and temporal coverage, moving beyond infrequent, localized sampling to achieve broader, more frequent assessments of water health. By streamlining the analytical process, NAIAD facilitates a shift from reactive, problem-focused monitoring to proactive, preventative management, ultimately enhancing the sustainability and effectiveness of water resource protection efforts.

Envisioning the Future: An Integrated Approach to Water Monitoring

The NAIAD platform distinguishes itself through an agentic artificial intelligence framework, moving beyond static analysis to a system capable of continuous learning and performance enhancement. This isn’t simply about processing more data; the AI actively refines its analytical approaches based on incoming information and observed outcomes, effectively becoming more accurate and efficient over time. By autonomously adjusting its algorithms and prioritizing relevant data streams, NAIAD adapts to evolving environmental conditions and the nuances of different water systems. This dynamic capability allows the platform to not only detect anomalies and predict future trends, but also to improve the reliability of its insights, offering increasingly valuable support for informed water resource management decisions.

The continued evolution of integrated water monitoring platforms centers on broadening analytical capabilities and data integration. Current development efforts prioritize incorporating advanced spectroscopic techniques, like hyperspectral imaging for algal bloom detection, and enhancing the system’s ability to process unconventional data streams – including acoustic sensors for leak detection and environmental DNA for biodiversity assessment. This expansion isn’t limited to data type; researchers are also working to seamlessly integrate data from diverse sources – satellite imagery, ground-based sensors, citizen science initiatives, and historical records – to create a holistic, real-time picture of water resource health. Ultimately, this multi-faceted approach aims to move beyond simply monitoring water quality to providing predictive insights that support proactive and adaptive water management strategies.

NAIAD’s design prioritizes accessibility and growth through a fully open-source architecture and modular construction. This intentional approach allows researchers, engineers, and water managers to not only inspect and adapt the platform’s core functionality, but also to contribute new analytical tools and data integrations. By fostering a collaborative environment, NAIAD circumvents the limitations of proprietary systems, accelerating innovation in water monitoring techniques and ensuring the platform remains responsive to evolving community needs. This shared development model promises a continuously improving system, driven by diverse expertise and collective problem-solving, ultimately benefiting the broader scientific community and supporting more effective water resource management strategies.

NAIAD is conceived not merely as a monitoring system, but as a comprehensive decision-support tool designed to fundamentally alter how inland water resources are managed. By integrating diverse datasets and employing advanced analytical capabilities, the platform delivers actionable insights to a broad range of stakeholders – from local water authorities and environmental agencies to agricultural businesses and citizen scientists. This empowers these groups to move beyond reactive responses to water-related challenges, and instead proactively implement sustainable strategies based on data-driven predictions and a holistic understanding of complex hydrological systems. The ultimate goal is to foster a collaborative, adaptive approach to water management, ensuring the long-term health and availability of these critical resources for both present and future generations.

The design philosophy underpinning Naiad resonates with a basic tenet of robust system architecture: a system that survives only on duct tape is probably more complicated than it needs to be. Tim Berners-Lee aptly stated, “The web is more a social creation than a technical one.” This highlights the importance of accessibility and user interaction, principles clearly embodied in Naiad’s natural language interface. The system doesn’t merely process data; it facilitates a dialogue, connecting its components (remote sensing data, large language models, and directed acyclic graphs) into a cohesive and understandable whole. This approach mirrors the web’s original intent: to connect information in a meaningful way, fostering collaboration and understanding, rather than presenting complex data in isolation.

What Lies Beneath?

The introduction of an agentic system like NAIAD exposes a familiar truth: every new dependency is the hidden cost of freedom. While the integration of remote sensing, large language models, and directed acyclic graphs offers a compelling advance in inland water monitoring, the system’s efficacy remains inextricably linked to the quality and availability of its constituent data. The illusion of seamless access obscures the persistent challenge of data scarcity, bias, and the inherent limitations of any retrieval-augmented generation framework.

Future work must address not only the refinement of agentic capabilities, but a deeper understanding of the systemic constraints. Focus should shift from simply adding data sources to developing robust methods for assessing data lineage, quantifying uncertainty, and actively mitigating the propagation of error. The organism’s health depends on more than just increased sensory input; it requires a rigorous internal model of its own limitations.

Ultimately, the value of such a system lies not in its ability to predict water quality with ever-increasing precision, but in its capacity to reveal the fundamental gaps in knowledge. A truly intelligent system acknowledges what it doesn’t know, and designs its inquiries accordingly. The next iteration should prioritize not just answering questions, but formulating better ones.


Original article: https://arxiv.org/pdf/2601.05256.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/

2026-01-13 00:24