Turning Uncertainty into Logic: Formalizing Probabilistic System Requirements

Author: Denis Avetisyan

A new approach allows engineers to translate imprecise, natural language descriptions of system behavior into a formal, verifiable logic.

The system translates natural language requirements – initially expressed as unstructured English in a comments field – into formally defined semantics and a corresponding $PCTL^*$ formula, demonstrating a pathway from intuitive specification to rigorous, machine-processable logic.

This paper extends the FRET requirements language to support probabilistic temporal logic (PCTL*), enabling the formalization and verification of systems with uncertain behavior.

Specifying requirements for autonomous systems – particularly those operating under uncertainty – remains a significant challenge despite advances in formal methods. This paper, ‘Automated Formalization of Probabilistic Requirements from Structured Natural Language’, addresses this gap by extending a structured natural language – NASA’s FRET – to support unambiguous specification of probabilistic behaviors. We present a fully automated approach to translate these requirements into formulas of probabilistic temporal logic, coupled with validation techniques to ensure semantic correctness. Could this methodology substantially lower the barrier to formally verifying safety-critical, adaptive systems and accelerate the development of trustworthy autonomous technologies?

The Inevitable Decay of Specification: Framing the Challenge

Modern systems, increasingly reliant on complex interactions and stochastic processes, often exhibit behaviors not fully captured by traditional, deterministic requirements. These conventional specifications typically assume fixed inputs and predictable outcomes, a simplification that breaks down in the face of real-world variability. This disconnect arises because many systems now operate within environments characterized by inherent uncertainty – fluctuating network conditions, imprecise sensor data, or the unpredictable actions of users. Consequently, a system designed to meet rigid, absolute requirements may perform unexpectedly, or even fail, when confronted with the inevitable deviations from ideal conditions. The resulting unpredictable behavior isn’t necessarily a design flaw, but rather a consequence of specifying a system without acknowledging the probabilistic nature of its operating environment, highlighting the need for more nuanced and adaptable requirement methodologies.

For safety-critical systems – encompassing everything from autonomous vehicles to medical devices – simply stating what a system should do is insufficient; defining how reliably it must perform is paramount. Traditional requirements specifications often lack the necessary nuance to capture probabilistic behaviors, leading to ambiguity in crucial scenarios. Current methods for specifying these probabilities, such as relying on informal natural language or incomplete statistical models, present significant verification challenges. Determining whether a system truly meets a probabilistic requirement – for instance, a $99.999\%$ success rate (equivalently, a failure probability of at most $10^{-5}$) or a mean time between failures exceeding a certain threshold – demands rigorous mathematical analysis and testing. The difficulty lies not only in accurately modeling complex system interactions but also in providing conclusive evidence that the specified probabilities will be consistently maintained throughout the system’s operational lifecycle, necessitating the development of more precise and verifiable specification languages and analysis techniques.

Probabilistic FRET expands upon classic FRET by adding new components – highlighted with solid lines and colored backgrounds – to existing dashed-line structures.

Extending the Formal: Introducing Probabilistic FRET

NASA’s Formal Requirements Elicitation Tool (FRET) has been extended to incorporate explicit modeling of probabilistic behavior within system specifications. Traditional formal methods often struggle to represent uncertainty or randomness inherent in real-world systems; this extension addresses this limitation by allowing specification of probabilities associated with state transitions and event occurrences. This capability is achieved through the introduction of probabilistic operators and constructs within the FRET language, enabling requirements to be defined with quantifiable likelihoods. The result is a more precise and unambiguous specification, reducing the potential for misinterpretation and facilitating rigorous verification of systems exhibiting probabilistic characteristics. Specifically, requirements can now express constraints such as “the probability of state $X$ occurring within time $t$ is greater than $p$”, allowing for the formal validation of safety and performance criteria under uncertainty.
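As an illustration, the quoted constraint corresponds to what is usually called a bounded-reachability property. Written in common textbook PCTL notation (the symbols $\mathcal{P}$ and $\Diamond$ are the conventional ones and are an assumption here; the concrete syntax produced by Probabilistic FRET may differ), it reads:

```latex
\mathcal{P}_{>p}\left[\, \Diamond^{\le t}\, X \,\right]
```

That is, the set of executions that reach a state satisfying $X$ within $t$ time steps must have probability strictly greater than $p$.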

Probabilistic FRET is designed to be readily accessible to those already familiar with FRET. It keeps the structured-natural-language fields of a classic FRETish requirement (scope, condition, component, timing, and response) and adds constructs for stating probability bounds on the required behavior. Because the extension builds on the established notation rather than replacing it, experienced practitioners should need little retraining to start writing probabilistic requirements.

Traditional system requirements are often expressed in natural language, which is inherently ambiguous and prone to misinterpretation. Probabilistic FRET addresses this limitation by providing a formal language for specification, enabling the creation of models that are both human-readable and suitable for automated verification. This transition from informal descriptions to a formally defined language allows for the precise and unambiguous capture of system behavior, facilitating the development of machine-verifiable models. Consequently, ambiguities inherent in natural language are eliminated, and automated tools can be employed to rigorously analyze and validate system properties, ensuring correctness and reliability. The formal semantics of the language provide a clear and precise mapping between requirements and the intended system behavior, which is crucial for safety-critical applications and complex system design.

Deconstructing Complexity: Compositional Formalization and PCTL Verification

Compositional formalization, leveraging FRET Template Keys, enables the construction of complex probabilistic requirements from a library of simpler, pre-defined components. This methodology decomposes overarching system properties into smaller, manageable specifications, each associated with a specific FRET key. By combining these keys according to defined composition rules, complex requirements – such as probabilistic safety or liveness properties – can be expressed without specifying the entire system behavior at once. This approach promotes modularity and facilitates reuse of established specifications, reducing the complexity of formal verification and enhancing the scalability of the requirement specification process. The system supports the combination of these template keys to represent intricate probabilistic relationships within the modeled system.

The compositional formalization technique facilitates modularity and reusability in requirement specification by enabling the construction of complex probabilistic properties from predefined, reusable components – FRET Template Keys. This decomposition reduces specification complexity, minimizing the potential for errors inherent in monolithic property definitions. By leveraging these pre-verified components, engineers can assemble new requirements without requiring complete re-verification of the underlying logic, improving development efficiency and enhancing confidence in system correctness. The ability to combine and reuse these keys significantly reduces the effort required to express increasingly complex system behaviors.

The system’s capacity for expressing complex probabilistic requirements has been significantly enhanced by expanding the number of supported FRET Template Keys from 160 to 560. This increase directly correlates to a greater ability to define and verify a wider range of system behaviors. Each template key represents a distinct probabilistic pattern or condition that can be incorporated into requirement specifications. The expanded key set allows for the compositional construction of more intricate requirements, enabling a finer granularity of control and verification than previously possible. This scaling of supported keys facilitates the formalization of more complex and nuanced system properties.

Formalized requirements, expressed using FRET Template Keys, are translated into formulas of PCTL*, an extension of Probabilistic Computation Tree Logic (PCTL). This translation enables the verification of probabilistic properties using established probabilistic model checkers such as PRISM and other solvers capable of handling such specifications. Specifically, the translated formulas allow for the quantitative assessment of probabilities associated with system behaviors, confirming whether the system meets defined probabilistic guarantees. The use of model checking provides automated, rigorous verification, reducing the risk of errors in complex probabilistic systems.
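To make the compositional idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the field names, the tiny option lists, the fragment table, and the `formalize` helper are invented for illustration and are not the paper's actual template keys or translation rules. It shows only the general pattern of treating a key as a tuple of field options and assembling a PRISM-style property string from per-field fragments.

```python
from itertools import product

# Hypothetical, deliberately tiny option lists for each requirement field.
FIELD_OPTIONS = {
    "scope": ["null", "in"],
    "condition": ["null", "regular"],
    "probability": ["null", "bound"],
    "timing": ["eventually", "always", "within"],
}

# One formula fragment per timing option; {r} is the response, {d} a duration.
TIMING_FRAGMENTS = {
    "eventually": "F ({r})",
    "always": "G ({r})",
    "within": "F<={d} ({r})",
}


def all_template_keys():
    """Enumerate every combination of field options as a template key."""
    return list(product(*FIELD_OPTIONS.values()))


def formalize(key, response="result", cond="cond", mode="mode",
              bound=">=0.9", duration=5):
    """Compose a property string from the fragments selected by `key`."""
    scope, condition, probability, timing = key
    path = TIMING_FRAGMENTS[timing].format(r=response, d=duration)
    # Simplifying assumption of this sketch: a requirement without an
    # explicit bound is read as "must hold with probability 1".
    op = bound if probability == "bound" else ">=1"
    formula = f"P{op} [ {path} ]"
    if condition == "regular":
        formula = f"({cond} => {formula})"
    if scope == "in":
        formula = f"({mode} => {formula})"
    return formula


key = ("null", "null", "bound", "within")
print(formalize(key))  # P>=0.9 [ F<=5 (result) ]
```

With these toy option lists the Cartesian product yields 24 keys. Adding one option to any field multiplies the number of keys without requiring a hand-written formula for each combination, which is the appeal of composing formulas from fragments.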

The distribution of template key occurrences varies across different requirement sets.

Mapping States and Probabilities: Validation Framework and Discrete-Time Markov Chains

The validation framework utilizes Discrete-Time Markov Chains (DTMCs) to represent system behavior as a series of probabilistic state transitions. A DTMC is formally defined as a tuple $M = (S, P, I)$, where $S$ is a finite set of states, $P$ is the transition probability matrix with $P_{ij}$ representing the probability of transitioning from state $i$ to state $j$, and $I$ is the initial state distribution. This representation allows for the formalization of system properties and the quantitative assessment of their likelihood. By modeling the system as a DTMC, we establish a concrete and mathematically rigorous foundation for verification, enabling automated analysis and the identification of potential issues related to probabilistic requirements.
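The tuple $M = (S, P, I)$ maps directly onto a small data structure. The sketch below is an illustration only: the `DTMC` class, its method names, and the three-state example are assumptions, not the framework's implementation. It checks that $I$ and every row of $P$ are probability distributions and advances a state distribution by one time step.

```python
from dataclasses import dataclass


@dataclass
class DTMC:
    states: list   # S: state names
    P: list        # P[i][j]: probability of moving from state i to state j
    init: list     # I: initial distribution over states

    def validate(self, tol=1e-9):
        """Check that I and every row of P are probability distributions."""
        n = len(self.states)
        rows = [self.init] + self.P
        if len(self.P) != n or any(len(r) != n for r in rows):
            raise ValueError("dimension mismatch")
        for r in rows:
            if any(x < 0 for x in r) or abs(sum(r) - 1.0) > tol:
                raise ValueError("not a probability distribution")

    def step(self, dist):
        """Return the state distribution one time step after `dist`."""
        n = len(self.states)
        return [sum(dist[i] * self.P[i][j] for i in range(n))
                for j in range(n)]


# Hypothetical example: a request is pending, then succeeds or fails.
chain = DTMC(
    states=["pending", "ok", "fail"],
    P=[[0.5, 0.4, 0.1],
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 1.0]],
    init=[1.0, 0.0, 0.0],
)
chain.validate()
after_two = chain.step(chain.step(chain.init))
```

After two steps the distribution over `pending`, `ok`, and `fail` is approximately 0.25, 0.6, and 0.15.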

The validation framework incorporates automated generation and verification of Probabilistic Computation Tree Logic (PCTL) formulas. PCTL enables the formal specification of probabilistic requirements regarding system behavior, such as the probability of reaching a certain state within a given number of steps. The framework translates these requirements into PCTL formulas, which are then model-checked against a representation of the system’s behavior – in this case, a Discrete-Time Markov Chain. Successful verification confirms that the system meets the specified probabilistic guarantees; conversely, the process identifies violations and provides counterexamples detailing scenarios where the requirements are not met, allowing for targeted refinement of the system design. This automated approach reduces the potential for human error in manual verification and scales to complex systems with numerous probabilistic constraints.
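For the bounded-reachability case mentioned above, the check a model checker performs can be sketched in a few lines. This is the standard backward iteration for step-bounded reachability on a DTMC, not the framework's own code. The function names and the example chain are hypothetical, and weighting by the initial distribution is a simplification of this sketch.

```python
def bounded_reach_prob(P, target, k):
    """Probability, from each state, of reaching `target` within k steps."""
    n = len(P)
    # With zero steps left, only target states have already succeeded.
    x = [1.0 if s in target else 0.0 for s in range(n)]
    for _ in range(k):
        # One more step allowed: stay at 1 in target states, otherwise
        # average the previous values over the successor states.
        x = [1.0 if s in target
             else sum(P[s][t] * x[t] for t in range(n))
             for s in range(n)]
    return x


def check(P, init, target, k, threshold):
    """Decide P>=threshold [ F<=k target ] under initial distribution init."""
    x = bounded_reach_prob(P, target, k)
    prob = sum(init[s] * x[s] for s in range(len(P)))
    return prob >= threshold, prob


# Hypothetical chain: state 0 is pending, state 1 is success, state 2 failure.
P = [[0.5, 0.4, 0.1],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
init = [1.0, 0.0, 0.0]

holds_k2, prob_k2 = check(P, init, target={1}, k=2, threshold=0.7)
holds_k5, prob_k5 = check(P, init, target={1}, k=5, threshold=0.7)
```

In this example the requirement fails for a bound of two steps, where the probability is about 0.6, and holds for a bound of five steps, where it is about 0.775. This kind of quantitative result is what allows a violated requirement to be traced back to either the model or the stated threshold.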

Evaluation of the validation framework across four datasets, comprising a total of 391 requirements, demonstrated its ability to formally express 334, or 85%, as probabilistic requirements suitable for verification. This indicates a high degree of expressiveness for a substantial range of system properties. The remaining 57 requirements, while important for overall system functionality, were determined to be non-probabilistic in nature or fell outside the scope of supported PCTL formulas within the current framework implementation.

Systematic state space exploration, facilitated by the validation framework, involves a comprehensive traversal of all possible system states to identify instances where specified probabilistic requirements are not met. This process utilizes algorithms to analyze state transitions and determine if the system’s behavior deviates from the desired specifications, effectively pinpointing potential violations. The identified violations provide actionable insights for system refinement; designers can then modify the system to eliminate these violations and ensure adherence to the defined probabilistic guarantees. The granularity of the state space and the efficiency of the exploration algorithms directly impact the ability to detect subtle violations and optimize the system design.
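The traversal described above can be pictured as a breadth-first search over the transitions that have nonzero probability. The sketch below is a generic illustration under that assumption. The function names and the four-state example are hypothetical, and the exploration algorithm actually used by the framework is not specified here.

```python
from collections import deque


def reachable_states(P, init):
    """Breadth-first search over states reachable with nonzero probability."""
    frontier = deque(s for s, p in enumerate(init) if p > 0)
    seen = set(frontier)
    while frontier:
        s = frontier.popleft()
        for t, p in enumerate(P[s]):
            if p > 0 and t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen


def violating_states(P, init, bad):
    """Reachable states that carry a label the requirement forbids."""
    return sorted(reachable_states(P, init) & set(bad))


# Hypothetical four-state chain; states 2 and 3 cannot be reached from 0.
P = [[0.5, 0.5, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.5, 0.5]]
init = [1.0, 0.0, 0.0, 0.0]

found = violating_states(P, init, bad={1, 3})
```

Here only state 1 is reported: state 3 is also labeled as forbidden, but it is unreachable from the initial state and therefore irrelevant to the requirement.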

The generated discrete-time Markov chain (DTMC) validates system behavior by tracking time and the truth values (represented as 0 or 1) of four key propositions: mode, condition, stop, and result.

Bridging the Gap: LLMs and Structured Natural Language for Formalization

The process of formalizing system requirements often demands specialized knowledge, creating a barrier to entry for many developers. Recent advancements leverage the capabilities of Large Language Models (LLMs) to bridge this gap, automatically translating natural language descriptions of desired system behavior into formal specifications. This isn’t direct translation, however; the system employs Structured Natural Language as an intermediary, guiding the LLM to produce specifications in FRETish, the structured requirements language of FRET. By structuring the input and output, the approach ensures clarity and reduces ambiguity, allowing complex systems – particularly those involving probabilistic reasoning – to be rigorously defined and verified with minimal manual effort. The result is a streamlined workflow that democratizes access to formal methods and accelerates the development of reliable, safety-critical applications.

The conventional development of formal specifications demands significant training in specialized languages and techniques, creating a barrier to entry for many practitioners. This work addresses that limitation by leveraging the natural language processing capabilities of Large Language Models to bridge the gap between intuitive requirements and rigorous formalisms. By automating the translation process, the need for deep expertise in formal methods is substantially reduced, opening up the benefits of formal specification – increased reliability and verifiability – to a much broader audience including domain experts and systems engineers without dedicated formal methods backgrounds. This broadened accessibility promises to accelerate the adoption of robust specification practices across a wider range of critical systems.

Employing FRETish as an intermediary step in translating natural language requirements into formal specifications demonstrably boosts accuracy. Studies reveal an 18-43% improvement in translation correctness when compared to methods that attempt a direct conversion from natural language to formal notation. This gain stems from FRETish’s role as a clarifying layer; it distills the initial requirements into a structured, unambiguous form before final formalization. This process mitigates the inherent ambiguity of natural language, reducing errors that typically plague direct translation and allowing for a more reliable specification of complex systems. The structured nature of FRETish acts as a crucial bridge, enabling Large Language Models to leverage their strengths in natural language understanding while minimizing the pitfalls associated with directly generating formal expressions.

The confluence of Large Language Models and formal methods offers a pathway to efficiently and reliably specify complex probabilistic systems, previously a significant challenge for developers. These systems, often governed by intricate relationships and uncertainties, demand precise articulation to ensure correct behavior and avoid costly errors. Leveraging the pattern recognition and generative capabilities of LLMs, researchers are able to translate high-level, natural language descriptions into formal specifications, such as those expressed in FRETish. This automated translation process not only accelerates development but also introduces a level of rigor and verifiability absent in traditional approaches. The resulting formal models can then be subjected to analysis and validation techniques that check the system’s adherence to intended properties, enhancing its robustness in real-world applications. This synergy between LLM power and formal rigor promises to democratize the specification of complex systems, moving beyond the limitations of manual, error-prone processes.


The pursuit of formalizing probabilistic requirements, as detailed in this work, mirrors a fundamental challenge in all systems: accommodating inevitable decay. Just as erosion subtly alters landscapes over time, uncertainty creeps into even the most rigorously defined systems. This research, extending the FRET language to encompass probabilistic temporal logic, attempts to quantify and manage that erosion, striving for a state of ‘temporal harmony’ – a brief, stable phase within the larger cycle of change. Leopold Kronecker is credited with the remark, “God created the integers; all else is the work of man.” Similarly, while inherent uncertainty exists, this formalization represents a human effort to impose order and predictability onto systems grappling with that very uncertainty. The extension to PCTL* aims to capture these probabilities, offering a means to analyze and, to a degree, mitigate the effects of inherent randomness.

What Lies Ahead?

The extension of formal methods to encompass probabilistic requirements, as demonstrated by this work, is not a resolution, but a relocation of fragility. Any improvement ages faster than expected; the precise articulation of uncertainty merely reveals the inevitable decay of predictive power. While formalizing probabilities offers a temporary bulwark against ambiguity, the underlying systems remain susceptible to unforeseen distributions, unmodeled correlations, and the simple passage of time. Rollback is a journey back along the arrow of time, and even the most meticulous specification cannot guarantee a return to a pristine initial state.

Future work will inevitably confront the limitations of PCTL as a vehicle for truly complex probabilistic reasoning. The challenge isn’t merely scaling to larger models, but acknowledging the fundamental incompleteness of any formalization. A more fruitful avenue may lie in embracing approximations, meta-modeling, and techniques that explicitly account for epistemic uncertainty – recognizing what is not known, rather than striving for an illusion of complete knowledge.

Ultimately, the value of this work resides not in achieving a static, verifiable system, but in establishing a more nuanced understanding of systemic vulnerability. The formalization of probabilistic requirements is a sophisticated diagnostic, revealing the inherent transience of even the most carefully constructed designs. It is a mapping of fault lines, not a promise of invulnerability.

Original article: https://arxiv.org/pdf/2512.15788.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/