Small Models, Big Insights: Predicting Crash Injuries with Limited Resources

Author: Denis Avetisyan

A new pipeline leverages the efficiency of small language models alongside traditional tree ensembles to accurately predict injury severity from city-wide crash data.

Feature impact analysis for the XGBoost model shows how individual event characteristics, with values ranging from low (blue) to high (red), systematically shift the predicted injury severity, mapping the influence of each variable on the model’s assessment.

Researchers demonstrate a resource-efficient approach to injury severity prediction using a combined tree ensemble and small language model pipeline with explainability via SHAP values.

Despite increasing reliance on data-driven insights, city-scale injury prediction often struggles with computational cost and a lack of interpretability. This paper introduces RaX-Crash, a resource-efficient pipeline for predicting injury severity from New York City motor vehicle collisions that combines traditional tabular feature engineering with small language models. Our results demonstrate that compact tree ensembles (Random Forest and XGBoost) significantly outperform small language models as classifiers, while also enabling explainable predictions via SHAP value analysis and targeted class weighting. Can hybrid pipelines leveraging the strengths of both tabular and textual data unlock even more nuanced and actionable insights for public health and safety initiatives?

Dissecting the Inevitable: Forecasting Injury Severity

The ability to forecast the severity of injuries sustained in motor vehicle collisions represents a critical need within emergency response and healthcare systems. Precise predictions directly influence the efficient allocation of resources – ensuring ambulances are dispatched with appropriate staffing, hospitals can prepare operating rooms, and specialized trauma teams are mobilized when necessary. Beyond logistical improvements, accurate forecasting promises to significantly enhance patient outcomes; earlier and more targeted interventions, guided by predicted injury profiles, can reduce morbidity and mortality rates. Consequently, research focused on refining predictive models isn’t merely an academic exercise, but a vital step towards optimizing the entire continuum of care for those impacted by traffic accidents, translating directly into lives saved and improved quality of life for survivors.

Predicting the severity of injuries resulting from motor vehicle collisions presents a significant challenge due to the intricate interplay of forces at work during impact and the vast differences in individual vulnerability. Traditional predictive models frequently simplify these complex dynamics, often treating collisions as isolated events without fully accounting for vehicle-specific characteristics, environmental conditions, or the precise biomechanical factors influencing injury. Furthermore, human physiology introduces considerable heterogeneity; age, pre-existing health conditions, body mass index, and even gender can dramatically alter an individual’s susceptibility to injury from the same impact force. This combination of complex physics and varied biological responses necessitates analytical techniques capable of handling high-dimensional data and capturing non-linear relationships, pushing the limits of conventional statistical methods and driving the need for more sophisticated machine learning approaches.

The increasing availability of open datasets, such as the extensive records of motor vehicle collisions maintained by cities like New York, presents a significant opportunity to refine injury severity prediction. However, these datasets are rarely ‘ready-made’ solutions; their inherent complexity and potential for bias necessitate sophisticated analytical techniques. Simply accessing the data is insufficient; robust approaches – including machine learning algorithms, statistical modeling, and careful feature engineering – are crucial to extract meaningful insights. These methods must account for the multifaceted nature of collisions – encompassing vehicle dynamics, environmental factors, and individual patient vulnerabilities – to move beyond descriptive statistics and toward genuinely predictive models capable of informing pre-hospital care and resource allocation strategies. Without these analytical tools, the wealth of information contained within these open datasets remains largely untapped, hindering advancements in trauma care and public safety.

RaX-Crash: A Pipeline for Deconstructing Impact

The RaX-Crash pipeline utilizes a Unified Feature Schema to standardize and integrate data from multiple sources, facilitating the application of tree-based machine learning models for injury severity prediction. This schema enables consistent feature representation for both Random Forest and XGBoost algorithms, improving model performance and interpretability. The combination of a standardized data format and the inherent capabilities of tree-based models – including non-linear relationship handling and feature importance assessment – contributes to a robust and accurate injury prediction system. Data processed through this pipeline is formatted to be compatible with these algorithms, allowing for efficient training and prediction of injury severity levels.
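To make the idea of a shared schema concrete, here is a minimal sketch of how one raw crash record might be normalized into a fixed set of typed features before being handed to either tree model. The article does not list the schema's fields, so the `CrashFeatures` class, the `to_features` helper, and the four chosen fields are hypothetical; the raw column names are assumed to follow the public NYC collisions export and should be checked against the actual file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrashFeatures:
    """Hypothetical unified record: one row, same fields for every model."""
    hour_of_day: int
    borough: str
    vehicle_type: str
    contributing_factor: str


def to_features(raw: dict) -> CrashFeatures:
    """Map a raw row (column names are assumptions) onto the shared schema."""

    def clean(key: str) -> str:
        # Normalize case and whitespace; collapse missing values to one token.
        value = (raw.get(key) or "").strip().upper()
        return value or "UNKNOWN"

    # A missing time falls back to hour 0 here; a real pipeline would
    # probably flag it instead of silently defaulting.
    hour = int((raw.get("CRASH TIME") or "0:00").split(":")[0])
    return CrashFeatures(
        hour_of_day=hour,
        borough=clean("BOROUGH"),
        vehicle_type=clean("VEHICLE TYPE CODE 1"),
        contributing_factor=clean("CONTRIBUTING FACTOR VEHICLE 1"),
    )


# Made-up example row, not taken from the dataset.
row = {
    "CRASH TIME": "14:35",
    "BOROUGH": "brooklyn ",
    "VEHICLE TYPE CODE 1": "Sedan",
    "CONTRIBUTING FACTOR VEHICLE 1": "",
}
features = to_features(row)
```

Because every record passes through the same function, Random Forest and XGBoost see identical inputs, which is what makes a side-by-side comparison meaningful.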

Evaluation of the RaX-Crash pipeline showed that the XGBoost model achieved an accuracy of 0.7828 in predicting injury severity, while the Random Forest model attained 0.7794 on the same data. The gap of 0.0034, roughly a third of a percentage point, is marginal and suggests at most a slight advantage for XGBoost in this specific application. Both models were trained and tested using the same Unified Feature Schema and data preprocessing, ensuring a fair comparison of algorithmic performance. The accuracy scores were computed on a held-out test set of NYC open data.

The RaX-Crash pipeline incorporates techniques to address the class imbalance present in the NYC open data, specifically the under-representation of severe injury cases. Left uncorrected, this imbalance lets a model score well on overall accuracy while missing most of the rare, critical cases, which shows up as low recall for the severe classes. To mitigate this, weighted modeling approaches were implemented, assigning higher costs to misclassifications of severe injuries. This prioritization improved the model’s ability to identify fatal and critical injury scenarios, as measured by Fatal Recall, the share of truly fatal crashes that the model flags as fatal. The result is a more reliable injury severity prediction for the most critical cases.
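The weighting idea can be illustrated with a short sketch. It uses the common inverse-frequency heuristic (total samples divided by the number of classes times the class count), which is an assumption here since the article does not state the exact weights used. The class names and the toy labels are made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def recall_for_class(y_true, y_pred, target):
    """Share of true `target` cases that were also predicted as `target`."""
    relevant = [pred for true, pred in zip(y_true, y_pred) if true == target]
    if not relevant:
        return 0.0
    return sum(pred == target for pred in relevant) / len(relevant)


# Toy labels (not the NYC data): 7 no-injury, 2 injury, 1 fatal.
y_true = ["none"] * 7 + ["injury"] * 2 + ["fatal"]
weights = balanced_class_weights(y_true)

# A model that always predicts the majority class is 70% accurate
# on this toy set, yet it never catches the fatal case.
y_pred_majority = ["none"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred_majority)) / len(y_true)
fatal_recall = recall_for_class(y_true, y_pred_majority, "fatal")
```

In this toy set the rare fatal class receives the largest weight, so each fatal misclassification costs the model more during training. Tracking recall on that class, rather than accuracy alone, is what reveals whether the weighting helped.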

The RaX-Crash pipeline is designed for efficient processing of New York City’s publicly available crash datasets, enabling rapid injury severity prediction without substantial computational demands. Utilizing optimized data structures and algorithms, the pipeline minimizes resource utilization, allowing for deployment on standard hardware. Processing time is kept low through techniques such as vectorized operations and efficient data loading, while maintaining prediction accuracy comparable to more computationally intensive methods. This lightweight design facilitates scalability and enables frequent model retraining with updated NYC open data, ensuring predictions remain current and reliable.

Beyond Correlation: Illuminating Causation with Explainable AI

SHAP (SHapley Additive exPlanations) values represent a method for explaining the output of any machine learning model. They function by calculating the contribution of each feature to the prediction for a specific instance, providing a local explanation of the model’s behavior. This is achieved by considering all possible combinations of features and weighting their impact on the prediction, adhering to principles of cooperative game theory. The resulting SHAP value for a feature indicates its influence on the model’s output – a positive value suggests the feature increased the prediction, while a negative value suggests it decreased it. The sum of the SHAP values for all features equals the difference between the model’s prediction for that instance and the average prediction across the dataset.
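The additivity property described above can be checked on a toy case. The sketch below computes exact Shapley values by brute-force enumeration of every feature coalition, which is only feasible for a handful of features; practical SHAP tooling for tree ensembles relies on much faster tree-specific algorithms. The three-feature `model`, the `background` rows, and the `instance` are hypothetical and exist only to demonstrate the calculation.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values via enumeration of all coalitions."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Standard Shapley weight for a coalition of this size.
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def model(x):
    """Hypothetical severity score with one interaction term."""
    return 0.2 * x[0] + 0.5 * x[1] + 0.3 * x[0] * x[2]


background = [(0, 0, 0), (1, 0, 1), (0, 1, 0), (1, 1, 1)]
instance = (1, 1, 0)


def value_fn(coalition):
    """Mean model output with coalition features fixed to the instance's
    values and the remaining features drawn from each background row."""
    total = 0.0
    for row in background:
        mixed = tuple(
            instance[j] if j in coalition else row[j]
            for j in range(len(instance))
        )
        total += model(mixed)
    return total / len(background)


phi = shapley_values(value_fn, n_features=3)
base_value = value_fn(frozenset())  # average prediction over the background
prediction = model(instance)
# Additivity: base_value + sum(phi) reproduces the instance's prediction.
```

The final comment is the property stated in the paragraph above: the per-feature contributions account exactly for the gap between this instance's prediction and the average prediction.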

Small Language Models (SLMs) are employed to convert quantitative feature importance values, such as those generated by SHAP analysis, into natural language explanations. Specifically, models including LLaMA 3.2 and DeepSeek-R1 receive as input the SHAP attribution scores for individual predictions, alongside the corresponding feature values. These models then generate a textual summary detailing how each feature contributed to the model’s output for that specific instance. This translation process facilitates the interpretation of model behavior by presenting feature contributions in a human-readable format, enabling users to understand why a particular prediction was made rather than simply what the prediction was.
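One plausible way to hand SHAP output to a language model is to serialize the strongest attributions into a short text prompt. The sketch below shows that step only; it does not call any model. The function name `build_explanation_prompt`, the prompt wording, the feature names, and the sample values are all assumptions, since the article does not publish the prompt format used with LLaMA 3.2 or DeepSeek-R1.

```python
def build_explanation_prompt(prediction, feature_values, shap_values, top_k=3):
    """Format the top-k SHAP attributions for one prediction as a prompt."""
    ranked = sorted(
        shap_values.items(), key=lambda item: abs(item[1]), reverse=True
    )[:top_k]

    lines = [
        f"Predicted injury severity: {prediction}",
        "Top feature contributions (SHAP):",
    ]
    for name, contribution in ranked:
        if contribution > 0:
            direction = "raises"
        elif contribution < 0:
            direction = "lowers"
        else:
            direction = "does not change"
        lines.append(
            f"- {name} = {feature_values[name]}: "
            f"{contribution:+.3f} ({direction} the predicted severity)"
        )
    lines.append(
        "Explain in two or three plain sentences why the model made this "
        "prediction, using only the factors listed above."
    )
    return "\n".join(lines)


# Made-up attribution values for a single hypothetical crash record.
prompt = build_explanation_prompt(
    prediction="severe",
    feature_values={
        "vehicle_type": "TRUCK",
        "driver_age": 78,
        "hour_of_day": 14,
        "borough": "BROOKLYN",
    },
    shap_values={
        "vehicle_type": 0.42,
        "driver_age": 0.21,
        "hour_of_day": -0.10,
        "borough": 0.05,
    },
)
```

Restricting the prompt to the top few attributions keeps it short enough for a small model and discourages the narrative from mentioning features that barely affected the prediction.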

Quantitative evaluation of the correspondence between SHAP attribution values and the narrative explanations generated by Small Language Models yielded alignment scores of 0.610 for LLaMA 3.2 and 0.550 for DeepSeek-R1. This measurement was conducted to assess the fidelity of the translated explanations, indicating the degree to which the SLM-generated text accurately reflects the feature importance as determined by the SHAP values. Higher scores suggest a stronger correlation between the quantitative feature attributions and the qualitative, human-readable narratives.
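The article does not define how the alignment score is computed, so the following is only one possible interpretation, offered to make the idea tangible: the fraction of the top-ranked SHAP features that the generated narrative actually mentions. The `alignment_score` function, the alias lists, and the sample narrative are hypothetical, and the paper's metric may differ.

```python
def alignment_score(narrative, shap_values, aliases, top_k=3):
    """Fraction of the top-k SHAP features (by absolute value) that the
    narrative mentions, using simple substring matching on aliases."""
    ranked = sorted(
        shap_values, key=lambda name: abs(shap_values[name]), reverse=True
    )[:top_k]
    if not ranked:
        return 0.0
    text = narrative.lower()
    hits = sum(any(alias in text for alias in aliases[name]) for name in ranked)
    return hits / len(ranked)


# Made-up attributions and narrative for illustration.
shap_values = {
    "vehicle_type": 0.42,
    "driver_age": 0.21,
    "hour_of_day": -0.10,
    "borough": 0.05,
}
aliases = {
    "vehicle_type": ["truck", "vehicle"],
    "driver_age": ["age"],
    "hour_of_day": ["hour", "time of day", "night"],
    "borough": ["borough"],
}
narrative = (
    "The large truck involved and the driver's advanced age pushed the "
    "prediction toward a severe outcome."
)
score = alignment_score(narrative, shap_values, aliases)
```

Here the narrative covers vehicle type and driver age but omits the time of day, so two of the three top features are matched. Substring matching is crude; a stricter check would also verify that the narrative gets the direction of each effect right.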

Analysis of model explanations identifies specific factors significantly correlated with injury severity in collision events. Vehicle type, categorized by size and construction, consistently appears as a primary influence, with heavier vehicles generally associated with more severe outcomes. Collision angle, quantified in degrees relative to the impacted vehicle’s centerline, demonstrates a strong relationship, with frontal and oblique impacts frequently indicating higher injury risk. Furthermore, driver age is identified as a contributing factor, with both very young and elderly drivers exhibiting a statistically significant correlation with increased injury severity compared to middle-aged drivers.

From Prediction to Proactive Intervention: Rewriting the Narrative of Trauma Care

The RaX-Crash pipeline represents a significant advancement in predictive analytics for trauma care, offering a practical resource for those on the front lines of emergency response. By accurately forecasting the likely severity of injuries stemming from vehicle collisions, the system facilitates proactive resource allocation – ensuring that ambulances, specialized medical teams, and hospital staff are appropriately prepared before patient arrival. This isn’t merely about faster response times; it’s about delivering the right level of care, immediately, potentially mitigating long-term disability and improving patient outcomes. Beyond the immediate emergency, the pipeline’s data insights provide valuable intelligence for healthcare providers involved in long-term rehabilitation and urban planners seeking to identify and address high-risk intersections or infrastructure deficiencies, ultimately fostering safer communities.

Detailed analysis of crash data reveals specific mechanisms driving severe injury, moving beyond simple impact forces to consider factors like vehicle deformation, occupant biomechanics, and pre-existing conditions. This granular understanding allows for the development of targeted safety interventions; for instance, identifying common failure modes in vehicle structures can inform design improvements, while pinpointing vulnerable occupant positions can guide airbag deployment strategies. Furthermore, infrastructure improvements, such as optimized roadway geometry, enhanced lighting, and strategically placed protective barriers, can directly address collision characteristics associated with the most severe outcomes. By shifting from reactive emergency response to proactive risk mitigation, this approach promises a significant reduction in both the incidence and severity of traffic-related injuries, ultimately fostering safer mobility for all.

The RaX-Crash pipeline’s predictive capabilities are poised for significant advancement through the incorporation of live, real-time data feeds – including weather conditions, traffic flow, and even emergency vehicle locations. This integration promises to move beyond static risk assessment, enabling dynamic predictions of injury severity as incidents unfold and allowing for preemptive resource deployment. Simultaneously, research is expanding the model’s scope to encompass a broader spectrum of collision types – moving beyond typical car crashes to include pedestrian and cyclist incidents, as well as those involving motorcycles and commercial vehicles. This broadened scope, combined with real-time data, aims to create a comprehensive predictive system capable of informing proactive safety measures and ultimately minimizing the impact of traffic collisions across diverse urban and rural environments.

The pursuit of predictive accuracy, as demonstrated by RaX-Crash’s comparison of tree ensembles and small language models, often necessitates a willingness to dismantle established approaches. This mirrors a core tenet of knowledge acquisition: understanding limitations through rigorous testing. As John McCarthy aptly stated, “It is better to deal with reality as it is than to try to make it fit a preconceived framework.” RaX-Crash doesn’t simply accept the superiority of tree ensembles; it probes the potential of SLMs to offer explainable insights, specifically by narrating the SHAP values computed for the tree models, even when their own predictive power is comparatively lower. This deliberate attempt to understand how a system arrives at its conclusions is the essence of reverse-engineering reality and pushing the boundaries of what’s known, especially when addressing complex challenges like injury severity prediction from tabular data.

Beyond the Crash Test

The pursuit of predictive accuracy, as demonstrated by the continued dominance of tree ensembles, feels…predictable. RaX-Crash doesn’t dismantle that hierarchy, but it does expose a curious asymmetry. The superior performance of established methods shouldn’t overshadow the value of interpretable failure. Small language models, while lagging in raw prediction, offer a window into the ‘why’ – a glimpse at the decision-making process itself. The alignment of these explanations with SHAP values is not a destination, but a starting point for reverse-engineering the very notion of ‘injury severity’.

Future work isn’t about squeezing marginal gains from model accuracy. It’s about dismantling the black box – not to eliminate it, but to understand how it generates its outputs. The current focus on tabular data is a convenient limitation. A truly robust system would integrate multimodal inputs – sensor data, environmental conditions, even free-text reports – and reconcile conflicting signals. Can a language model, trained on structured data, convincingly narrate a crash event, and have that narrative align with observed injuries?

The real challenge isn’t prediction; it’s constructing a useful fiction. Injury severity isn’t an objective truth, but a classification imposed upon a chaotic event. RaX-Crash hints at a path where models don’t simply predict outcomes, but articulate the reasoning behind those predictions, inviting scrutiny and, ultimately, a more nuanced understanding of risk.

Original article: https://arxiv.org/pdf/2512.07848.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
