Author: Denis Avetisyan
A new approach leveraging artificial intelligence reconstructs the complex web of relationships between companies in the semiconductor industry, offering a dynamic view of supply chains and geopolitical influence.

This study demonstrates a scalable method for reconstructing temporal multi-relational firm networks using web-scraped data and large language models, focused on the semiconductor industry.
Despite the foundational role of the semiconductor industry in modern technology, a comprehensive understanding of its complex global network of firm relationships remains elusive. Here, we present a novel methodology, detailed in ‘Reconstructing temporal multi-relational firm networks at scale using large language models. The case of the semiconductor industry’, that leverages open web data and Large Language Models to reconstruct this network and its dynamic evolution at scale. Our approach reveals critical shifts in firm centrality during the 2022 chip shortage and geopolitical realignments, offering a validated, up-to-date map of over 1,300 linked firms. Will this framework unlock new insights into supply chain resilience and inform strategic policy across other critical sectors?
Mapping the Semiconductor Ecosystem: Unveiling Interdependencies
The semiconductor industry, foundational to modern technology, operates through a remarkably complex network of suppliers, manufacturers, and customers. This intricate web of relationships is not merely a structural characteristic, but a critical determinant of supply chain resilience. Disruptions – be they geopolitical events, natural disasters, or sudden shifts in demand – propagate through these connections, potentially cascading failures across the entire system. A robust understanding of these interdependencies allows for proactive risk assessment and mitigation, enabling stakeholders to anticipate vulnerabilities and diversify sourcing. Consequently, mapping these connections is essential for ensuring a stable and secure supply of semiconductors, safeguarding innovation and economic stability in a world increasingly reliant on microchips.
Historically, charting the relationships between companies in the semiconductor industry has proven remarkably difficult. Conventional approaches – relying on official reports, surveys, or limited proprietary data – often present a static, incomplete picture. These methods struggle to account for the rapid formation of new partnerships, shifting supplier arrangements, and the emergence of specialized firms that characterize this fast-moving sector. Critically, this limitation becomes especially pronounced during times of global disruption, such as geopolitical events or natural disasters, when established supply chains fracture and companies quickly seek alternative sources or forge unexpected collaborations. The resulting inability to accurately map these dynamic interdependencies hinders effective risk assessment and proactive supply chain management, leaving the industry vulnerable to unforeseen shocks.
A comprehensive mapping of the semiconductor industry’s complex relationships has been achieved through a novel data-driven methodology. Researchers moved beyond traditional surveys and databases, instead employing web data – including news articles, financial reports, and supply chain disclosures – to reconstruct a firm network encompassing over 1,300 linked entities. This approach automatically identifies connections between companies based on co-occurrence in publicly available text, revealing a far more detailed and dynamic picture of the ecosystem than previously possible. The resulting network isn’t merely a list of suppliers and customers; it captures nuanced relationships like joint ventures, research collaborations, and investment ties, offering critical insights into potential vulnerabilities and opportunities within this essential global industry.

Reconstructing the Network: A Data-Driven Approach
The foundation of our firm relationship reconstruction relies on data sourced from Common Crawl, a publicly accessible archive of web content. This archive provides a broad and continuously updated record of websites, allowing for the identification of textual mentions of firm relationships that are otherwise difficult to obtain. Utilizing Common Crawl circumvents the need for proprietary datasets and enables large-scale network analysis; the archive’s comprehensive coverage, while containing noise, offers a significantly larger pool of potential relationships than would be available through limited, curated sources. Data extraction is performed on snapshots of web pages archived by Common Crawl, capturing relationship data as it existed at specific points in time.
The extraction of firm relationships from web archives relies on the application of the GPT-4o-mini large language model. This model processes text from Common Crawl to identify instances of supply, customer, and ownership connections between firms. Performance is benchmarked via multi-class relationship type classification, yielding a precision score of 0.918. This metric indicates that, of all identified relationships, 91.8% are correctly categorized as supply, customer, or ownership links, demonstrating a high degree of accuracy in automated relationship extraction.
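To make the precision figure concrete, the sketch below computes per-class and overall precision from a handful of labeled examples. It is an illustration only: the `sample` pairs, the label names, and the `"none"` label for "no relationship" are assumptions made here, not the paper's evaluation set or its exact scoring procedure.

```python
from collections import Counter

def precision_by_class(pairs, ignore="none"):
    """Precision = correct predictions / all predictions, per predicted label.

    `pairs` holds (gold_label, predicted_label) tuples. Predictions equal to
    `ignore` (no relationship extracted) are not counted as identified links.
    """
    predicted = Counter()
    correct = Counter()
    for gold, pred in pairs:
        if pred == ignore:
            continue
        predicted[pred] += 1
        if gold == pred:
            correct[pred] += 1
    per_class = {label: correct[label] / predicted[label] for label in predicted}
    total = sum(predicted.values())
    overall = sum(correct.values()) / total if total else 0.0
    return per_class, overall

# Hypothetical hand-labeled sample, invented for illustration.
sample = [
    ("supplier", "supplier"),
    ("supplier", "supplier"),
    ("customer", "customer"),
    ("customer", "supplier"),   # wrong type assigned
    ("ownership", "ownership"),
    ("none", "customer"),       # spurious link
    ("supplier", "none"),       # missed link: affects recall, not precision
]
per_class, overall = precision_by_class(sample)
print(per_class, round(overall, 3))
```

Note that a missed link lowers recall but leaves precision untouched, which is why a high precision score alone says little about how many true relationships the extraction overlooks.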
The reconstruction of the firm network begins with initial firm identification and scoping utilizing the ORBIS Database. This commercially available database provides comprehensive global company information, including corporate hierarchies, ownership details, and key financial data. Leveraging ORBIS allows for the creation of a defined universe of firms for network analysis, mitigating the challenges of incomplete or inaccurate data often present in publicly available web sources. The database facilitates the identification of parent-subsidiary relationships and associated firm identifiers, establishing a reliable foundation for subsequent relationship extraction from web archives and ensuring the resulting network is focused on verified entities.
The reconstructed network extends beyond simple supplier-customer relationships to encompass a broader range of firm connections. Data integration identifies not only direct supply chains – detailing which companies provide goods or services to others – but also ownership structures, revealing parent-subsidiary relationships and cross-ownership. Furthermore, the network maps strategic partnerships, joint ventures, and alliances that indicate collaborative arrangements between firms, providing a holistic view of inter-company dependencies and influence beyond transactional exchanges. This multi-faceted approach facilitates analysis of systemic risk, competitive dynamics, and the broader economic landscape.
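One simple way to hold such a multi-relational network in memory is as a list of typed, directed edges that can be split into one layer per relationship type. The sketch below assumes this representation; the `Edge` class, the relation labels, and the firm names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: str  # assumed labels, e.g. "supplier", "ownership", "partnership"

def layers(edges):
    """Group directed edges into one adjacency map per relation type."""
    out = defaultdict(lambda: defaultdict(set))
    for e in edges:
        out[e.relation][e.source].add(e.target)
    return out

# Placeholder firms, invented for illustration.
edges = [
    Edge("FirmA", "FirmB", "supplier"),
    Edge("FirmA", "FirmC", "supplier"),
    Edge("FirmD", "FirmA", "ownership"),
    Edge("FirmB", "FirmC", "partnership"),
]
net = layers(edges)
print(sorted(net["supplier"]["FirmA"]))
```

Keeping the relation type on the edge rather than building separate graphs makes it easy to analyze one layer at a time or to merge layers when studying overall dependency.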

Validating the Network: Ensuring Accuracy and Robustness
The reconstructed firm network underwent validation using two external datasets: S&P Capital IQ and the BACI Database. S&P Capital IQ provides detailed corporate ownership and relationship data, enabling a comparison of identified linkages. The BACI Database, maintained by CEPII (Centre d'Études Prospectives et d'Informations Internationales), offers comprehensive international trade data at the product and country level, allowing firm relationships aggregated to the country level to be checked against observed trade flows. Comparison against these independent sources supports the network’s accuracy and provides a basis for assessing the robustness of the reconstruction methodology.
Statistical significance of the reconstructed firm network was assessed by comparing its edges to those present in the S&P Capital IQ dataset. A Configuration Model was employed to generate a null distribution of random networks with equivalent node degrees, against which observed edge overlap was compared. This analysis yielded a p-value of less than 10^-3, indicating a statistically significant overlap between the reconstructed network and the independently sourced S&P Capital IQ data. This result supports the validity of the network’s structure and suggests that the observed linkages are not attributable to random chance.
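The logic of such a null-model test can be sketched in a few lines. The example below is a stand-in, not the paper's procedure: it treats the network as undirected, samples degree-preserving random graphs with double edge swaps (a common substitute for configuration-model stub matching that avoids self-loops and duplicate edges), and uses toy integer-labeled graphs. The function names, the number of null samples, and the number of swaps are all assumptions.

```python
import random

def norm(u, v):
    """Canonical form of an undirected edge."""
    return (u, v) if u <= v else (v, u)

def degrees(edges):
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization: (a,b),(c,d) -> (a,d),(c,b)."""
    edge_list = sorted({norm(u, v) for u, v in edges})
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        new1, new2 = norm(a, d), norm(c, b)
        # Skip swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_set

def overlap_p_value(recon, reference, n_null=200, seed=0):
    """Empirical p-value: share of null graphs with overlap >= observed."""
    rng = random.Random(seed)
    ref = {norm(u, v) for u, v in reference}
    observed = len({norm(u, v) for u, v in recon} & ref)
    n_swaps = 10 * len(recon)
    hits = sum(len(rewire(recon, n_swaps, rng) & ref) >= observed
               for _ in range(n_null))
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_null + 1)

# Toy data: a 12-node ring plus two chords, checked against part of the ring.
recon = [(i, (i + 1) % 12) for i in range(12)] + [(0, 6), (3, 9)]
reference = [(i, i + 1) for i in range(10)]
observed, p_value = overlap_p_value(recon, reference)
print(observed, p_value)
```

With `n_null` random graphs the smallest p-value this estimator can report is `1 / (n_null + 1)`, so resolving a threshold such as 10^-3 requires at least a thousand null samples.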
The reconstructed firm network’s macroeconomic validity is supported by comparison with sector-level trade and production linkages detailed in the OECD Inter-Country Input-Output (ICIO) Tables. These tables provide comprehensive data on the flows of goods and services between industries and countries, enabling a cross-validation of the network’s structural properties at an aggregate level. Specifically, the correlations between the network’s linkages and the OECD ICIO data, averaged across the 2015-2022 period, indicate that the network reflects established patterns of inter-industry relationships and production dependencies.
The reconstructed firm network demonstrates a substantial increase in identified linkages compared to S&P Capital IQ data, with a 78% expansion in the number of detected connections. This expanded network exhibits strong correlation with independent macroeconomic datasets; specifically, a Pearson correlation coefficient of 0.64 was observed when compared to the OECD Input-Output Tables and 0.61 with the BACI database, averaged across the 2015-2022 period. These correlation values validate the network’s structural alignment with established trade and production data sources.
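As a rough illustration of how a year-averaged Pearson correlation of this kind is obtained, the sketch below correlates link counts with trade flows for each year and then averages the yearly coefficients. The numbers in `yearly` are invented, and the choice to correlate raw values across country pairs is an assumption; the paper's aggregation and any transformations it applies may differ.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: per year, network link counts and trade flows
# for the same four country pairs (values invented for illustration).
yearly = {
    2015: ([12, 5, 30, 8], [100.0, 40.0, 260.0, 90.0]),
    2016: ([14, 6, 28, 9], [120.0, 35.0, 250.0, 70.0]),
}
per_year = {year: pearson(links, flows) for year, (links, flows) in yearly.items()}
mean_r = sum(per_year.values()) / len(per_year)
print(per_year, mean_r)
```

Averaging yearly coefficients, rather than pooling all years into one sample, keeps a single unusual year from dominating the reported value.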
Multiple validation procedures confirm the robustness and reliability of the reconstructed firm network. Independent datasets, including S&P Capital IQ, the BACI Database, and OECD ICIO Tables, were used for comparative analysis. Statistical significance was assessed via a Configuration Model, yielding a p-value below 10^-3 for edge overlap with S&P Capital IQ. Quantitative results indicate a 78% increase in detected links compared to S&P data, alongside Pearson correlations of 0.64 with OECD ICIO data and 0.61 with BACI trade data, averaged across the 2015-2022 period. These findings collectively support the validity and accuracy of the data-driven methodology employed.

Network Dynamics and External Shocks: A Landscape in Flux
The structure of the semiconductor supply chain isn’t static; relationships between firms are constantly in flux, necessitating a shift from analyzing a snapshot of the network to a temporal network that maps these evolving connections. This approach reveals how dependencies aren’t fixed, but rather dynamically change over time, with firms gaining or losing importance as production patterns shift and new technologies emerge. By tracking these alterations, researchers can identify not only current vulnerabilities, but also predict potential future disruptions as relationships strengthen or weaken. Such analysis highlights how a firm seemingly peripheral today might become critically important tomorrow, and vice versa, providing a more nuanced understanding of supply chain resilience than traditional network assessments allow. This dynamic perspective is crucial for proactive risk management and strategic planning within the semiconductor industry.
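A temporal network can be represented minimally as one edge set per year, which makes change between snapshots easy to measure. The sketch below assumes that representation and reports, for each pair of consecutive years, which links appeared, which disappeared, and how similar the two snapshots are (Jaccard index). The snapshot contents and firm names are hypothetical.

```python
def edge_churn(snapshots):
    """Compare consecutive yearly edge sets.

    Returns a list of (prev_year, curr_year, added, removed, jaccard) tuples.
    """
    years = sorted(snapshots)
    report = []
    for prev, curr in zip(years, years[1:]):
        before, after = snapshots[prev], snapshots[curr]
        union = before | after
        jaccard = len(before & after) / len(union) if union else 1.0
        report.append((prev, curr, after - before, before - after, jaccard))
    return report

# Hypothetical (supplier, customer) links per year, invented for illustration.
snapshots = {
    2021: {("FirmA", "FirmB"), ("FirmB", "FirmC")},
    2022: {("FirmA", "FirmB"), ("FirmD", "FirmC")},
}
report = edge_churn(snapshots)
for prev, curr, added, removed, jaccard in report:
    print(prev, curr, sorted(added), sorted(removed), round(jaccard, 3))
```

A sharp drop in snapshot similarity around a given year is one simple signal that the network is reorganizing, which can then be examined firm by firm.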
Understanding the resilience of complex supply chains requires pinpointing firms with disproportionate influence, and network centrality metrics offer a powerful means of doing so. Analyses focusing on betweenness – a firm’s control over information or material flow between others – and closeness – how quickly a firm can reach all others in the network – reveal critical nodes vital for overall stability. Firms with high betweenness scores act as essential connectors; disruption to these entities can severely impede the flow of goods and information throughout the entire system. Similarly, those exhibiting high closeness centrality facilitate rapid responses to changing conditions and can mitigate the impact of localized disruptions. Identifying these key players allows for targeted risk management strategies, including diversification of sourcing, strategic stockpiling, and the development of contingency plans, ultimately bolstering the semiconductor supply chain against unforeseen challenges.
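Both metrics can be computed directly from shortest paths. The sketch below is a minimal, dependency-free version that assumes an unweighted, undirected network given as an adjacency dictionary; betweenness uses Brandes' algorithm and is left unnormalized. The four-firm chain is a toy example, and whether the paper uses directed, weighted, or normalized variants is not stated here.

```python
from collections import deque

def closeness(adj):
    """Reachable nodes divided by the sum of shortest-path distances."""
    out = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        out[source] = (len(dist) - 1) / total if total else 0.0
    return out

def betweenness(adj):
    """Brandes' algorithm: unweighted, undirected, unnormalized."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of BFS discovery
        preds = {v: [] for v in adj}     # shortest-path predecessors
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}    # accumulated dependencies
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}

# Toy chain of four placeholder firms: A - B - C - D.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
bc = betweenness(adj)
cl = closeness(adj)
print(bc, cl)
```

In the chain, the two middle firms sit on every path between the ends, so they carry all of the betweenness while the end firms carry none; recomputing these scores on each yearly snapshot is what exposes shifts in centrality over time.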
Analysis reveals the semiconductor supply chain experienced significant disruption during the COVID-19 pandemic and subsequent geopolitical instability. Quantified impacts demonstrate a marked increase in lead times for critical components, alongside a surge in pricing volatility – particularly for materials sourced from regions experiencing lockdowns or trade restrictions. The study identifies key bottlenecks concentrated around a limited number of specialized manufacturing facilities and raw material suppliers, exposing the network’s fragility. Furthermore, the research highlights how reliance on just-in-time inventory management amplified these effects, as even minor disruptions cascaded rapidly throughout the system. These findings underscore the necessity for diversified sourcing strategies and increased supply chain resilience to mitigate future shocks and ensure stable production of semiconductors, a foundational technology for numerous global industries.
The surge in artificial intelligence applications is fundamentally altering the landscape of the semiconductor ecosystem, creating new dependencies and reshaping existing relationships. Demand for specialized chips – particularly those optimized for machine learning and AI inference – has concentrated resources and investment towards a select group of firms capable of advanced fabrication. This isn’t simply an increase in overall demand; it’s a qualitative shift, fostering tighter bonds between AI developers and semiconductor manufacturers, while simultaneously creating potential vulnerabilities for companies further down the supply chain lacking direct access to these key components. The study reveals a growing asymmetry in the network, with AI-driven demand acting as a powerful force reorganizing power dynamics and highlighting the critical importance of securing access to cutting-edge AI-specific semiconductor technology for sustained competitiveness.

The reconstruction of complex networks, as demonstrated in this study of the semiconductor industry, demands not merely the aggregation of data, but a synthesis of information into a coherent and understandable form. This pursuit of clarity mirrors a fundamental principle of elegant design. Marie Curie observed, “One never notices what has been done; one can only see what remains to be done.” This sentiment aptly describes the iterative process of network reconstruction; each successfully identified relationship illuminates the vastness of what remains hidden. The paper’s method, leveraging Large Language Models to discern multi-relational ties, represents a step towards revealing these hidden connections, acknowledging the continuous nature of discovery within complex systems. A good interface, in this context, is an accurate and insightful network, almost invisible in its seamless presentation of intricate relationships.
Beyond the Silicon: Charting Future Networks
The reconstruction of multi-relational firm networks, as demonstrated within the semiconductor industry, offers more than a mere mapping of present connections. It illuminates the inherent fragility of complex systems, systems often perceived as robust precisely because of their intricacy. The current work, while a significant step, inevitably reveals the limitations of relying solely on publicly available, text-derived data. The shadows – the unarticulated agreements, the tacit understandings, the relationships never quite committed to a press release – remain stubbornly opaque. Future iterations must grapple with these absences, perhaps through the integration of alternative data streams, or the development of models capable of inferring relationships from structural anomalies.
Consistency is empathy; a network reconstruction that fails to account for the evolving nature of trust – or its erosion – is ultimately a static portrait of a dynamic reality. The pursuit of ever-larger networks risks obscuring the critical few relationships that truly dictate industry behavior. A shift towards qualitative depth, complementing quantitative scale, seems essential. The challenge lies not merely in finding connections, but in discerning which connections matter.
Beauty does not distract, it guides attention. The elegance of a network model is not judged by its complexity, but by its ability to reveal underlying principles. As these methods expand beyond the semiconductor industry, the true test will be their capacity to anticipate, not merely reflect, the geoeconomic shifts that reshape the global landscape. The pursuit of predictive power, grounded in a deep understanding of relational dynamics, remains the most compelling, and arguably the most difficult, frontier.
Original article: https://arxiv.org/pdf/2605.15842.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/
2026-05-18 13:13