Beyond Package Lists: Uncovering Hidden Vulnerabilities in Python

Author: Denis Avetisyan

A new approach to software security focuses on tracing dependencies beyond readily available package metadata to identify risks lurking in native libraries.

This approach constructs cross-ecosystem call graphs for Python applications and their dependencies, then leverages these graphs to computationally determine how vulnerabilities propagate from binary code throughout the interconnected system.

This paper introduces PyXSieve, a tool for cross-ecosystem dependency analysis that accurately identifies and traces vulnerabilities in Python applications through provenance analysis and the construction of call graphs.

Python applications increasingly rely on native libraries, creating a complex dependency landscape where vulnerability analysis is hampered by false positives and negatives. This paper, ‘Cross-Ecosystem Vulnerability Analysis for Python Applications’, introduces a provenance-aware approach to accurately pinpoint vulnerable packages by resolving dependencies on native libraries to specific OS package versions or upstream releases. Through content-based hashing and dynamic analysis, we construct cross-ecosystem call graphs enabling reachability analysis of vulnerable functions, identifying 39 directly and 312 indirectly vulnerable packages with up to 97% false positive reduction. Can this cross-ecosystem approach be extended to other languages and package managers to improve software supply chain security more broadly?

The Expanding Threat Landscape of Modern Python Dependencies

The pursuit of speed and efficiency in modern Python applications has led to a significant reliance on native extensions – code written in languages like C and C++ that interfaces directly with the operating system. While these extensions dramatically improve performance for computationally intensive tasks, they introduce considerable complexity into the software supply chain. A typical Python package now rarely consists solely of Python code; instead, it often incorporates numerous native extensions, each potentially harboring its own set of dependencies and vulnerabilities. This layered structure creates a cascading effect, where a compromise in a single, seemingly minor, dependency can propagate through the entire system, posing a substantial risk to application security and stability. The increasing prevalence of these extensions therefore necessitates a shift in how developers and security professionals approach vulnerability assessment and supply chain management.

The performance benefits of Python’s reliance on native extensions introduce a complex, multi-layered software supply chain that significantly expands the potential attack surface. These extensions frequently depend on system-level shared libraries – code shared across multiple applications – creating a chain of dependencies extending far beyond the Python package itself. A compromise at any point in this chain – whether within a seemingly innocuous shared library or a rarely updated component – can potentially compromise the security of the entire Python application. This layered structure makes traditional vulnerability assessments, often focused on Python packages alone, increasingly ineffective; a thorough security analysis must now account for the integrity and provenance of these underlying, cross-ecosystem dependencies, a task proving exceptionally challenging due to the scale and opacity of many software supply chains.

Current vulnerability assessment tools often falter when faced with the intricate dependencies spanning Python packages and their underlying native code. These tools are typically designed to analyze code within a single ecosystem, which proves inadequate when a vulnerability resides not in the Python code itself but in a shared library (for example, a C or C++ component) that the Python package relies upon. Identifying and mitigating these cross-ecosystem vulnerabilities presents a significant challenge, as it requires tracing the dependency chain beyond the boundaries of the Python environment and into the often less-visible world of native code. This creates a blind spot for security teams, potentially leaving applications vulnerable to attacks that exploit weaknesses in components outside the scope of traditional scanning methods, and necessitates a shift toward more holistic, cross-ecosystem analysis techniques.

PyPA’s auditwheel tool bundles shared library dependencies into Python wheels by copying them and appending an 8-character SHA-256 hash to the filename, as demonstrated here with the packaging of the igraph library.
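The renaming scheme described in the caption can be sketched in a few lines. This is an illustration of the naming idea only, not auditwheel's actual code; the function name `vendored_name` and the sample filename are assumptions made for the example.

```python
import hashlib


def vendored_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short SHA-256 prefix of the file content before the first dot."""
    # The first `digest_len` hex characters of the content hash tag this copy.
    short = hashlib.sha256(content).hexdigest()[:digest_len]
    # Split "libxml2.so.2" into "libxml2", ".", "so.2" so the suffix is preserved.
    base, dot, ext = filename.partition(".")
    return f"{base}-{short}{dot}{ext}"


if __name__ == "__main__":
    # Placeholder bytes stand in for the real shared library content.
    print(vendored_name("libxml2.so.2", b"placeholder library bytes"))
```

Because the suffix is derived from the file content, two wheels that bundle byte-identical copies of a library end up with the same tagged name, which is what makes content-based matching possible later on.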

Establishing Trust: Provenance and Dependency Mapping as Foundational Practices

Provenance analysis is the process of determining the origin and historical record of software components, encompassing their creators, build processes, and modifications over time. This practice is fundamental to software supply chain security because it allows organizations to verify the integrity of their software and identify potential vulnerabilities introduced through compromised or malicious components. Establishing a clear provenance record enables the rapid identification of affected systems in the event of a security incident and facilitates efficient patching and remediation. Without accurate provenance, it is difficult to assess the risk associated with third-party dependencies or confidently deploy software updates, increasing exposure to supply chain attacks.

Accurate dependency mapping necessitates the construction of a cross-ecosystem call graph to track interactions not only between Python packages but also between Python code and native libraries. This is crucial because Python applications frequently interface with code written in languages like C and C++ through mechanisms such as C extensions or shared objects. A comprehensive call graph must therefore trace function calls across these language boundaries to fully represent the application’s dependencies and identify potential security vulnerabilities or performance bottlenecks introduced by native code components. Failure to account for these inter-language dependencies results in an incomplete picture of the software’s attack surface and operational behavior.
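As a rough picture of what such a graph looks like as a data structure, the sketch below stores call edges in an adjacency map and picks out the edges that cross an ecosystem boundary. The node naming scheme (`ecosystem:component:function`) and the sample function names are illustrative assumptions, not the representation used in the paper.

```python
def build_call_graph(edges):
    """Build an adjacency map {caller: set(callees)} from (caller, callee) pairs."""
    graph = {}
    for caller, callee in edges:
        graph.setdefault(caller, set()).add(callee)
        graph.setdefault(callee, set())  # make sure leaf nodes appear as keys
    return graph


def ecosystem(node: str) -> str:
    """Return the ecosystem label, i.e. the text before the first colon."""
    return node.split(":", 1)[0]


def boundary_edges(graph):
    """List edges whose caller and callee live in different ecosystems."""
    return sorted(
        (caller, callee)
        for caller, callees in graph.items()
        for callee in callees
        if ecosystem(caller) != ecosystem(callee)
    )


# Hypothetical edges: Python code calling into an extension, which calls a vendored library.
EXAMPLE_EDGES = [
    ("pypi:demo_app:main", "pypi:igraph:read_graph"),
    ("pypi:igraph:read_graph", "native:_igraph.so:c_read_graph"),
    ("native:_igraph.so:c_read_graph", "native:libxml2.so.2:parse_document"),
]

if __name__ == "__main__":
    for edge in boundary_edges(build_call_graph(EXAMPLE_EDGES)):
        print(edge)
```

The boundary edges are the ones a Python-only scanner never sees, which is why they are worth isolating explicitly.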

Provenance analysis is significantly enhanced through the utilization of a hash database for component identification. Our methodology successfully resolves the origin of 63.1% of vendored libraries, a critical step in bolstering software supply chain security. This resolution is achieved by comparing component hashes against the database, allowing for the accurate identification of origins even when version numbers are unavailable or unreliable. Establishing provenance in this manner directly improves vulnerability assessment by enabling organizations to determine if a specific component with a known vulnerability is present in their codebase and, crucially, where it originated, facilitating targeted remediation efforts.

Provenance resolution currently relies on two primary methods: hash-based matching and upstream version identification. Hash-based matching, accounting for 52.5% of successful resolutions, directly compares component hashes against a known database to establish origin. Complementing this, upstream version identification resolves provenance in 10.5% of cases by correlating component metadata with publicly available version information from upstream sources. These two methods are used in combination to maximize provenance coverage and accuracy, providing a more complete understanding of software component origins.
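The two-step resolution can be sketched as a lookup with a fallback. This is a minimal sketch under stated assumptions: the hash database is modeled as a plain dictionary, the upstream release list as a set, and the way a version string is recovered from a binary is left out entirely because the article does not describe it. All names are hypothetical.

```python
import hashlib
from typing import Optional, Tuple


def resolve_provenance(
    content: bytes,
    hash_db: dict,
    embedded_version: Optional[str] = None,
    upstream_releases: frozenset = frozenset(),
) -> Tuple[str, Optional[str]]:
    """Try hash-based matching first, then fall back to upstream version identification."""
    digest = hashlib.sha256(content).hexdigest()
    if digest in hash_db:
        # Exact content match: the database tells us which package build this file came from.
        return ("hash", hash_db[digest])
    if embedded_version is not None and embedded_version in upstream_releases:
        # No hash match, but the version corresponds to a known upstream release.
        return ("upstream", embedded_version)
    return ("unresolved", None)


if __name__ == "__main__":
    blob = b"placeholder library bytes"
    db = {hashlib.sha256(blob).hexdigest(): "example-distro/libexample 1.0-1"}
    print(resolve_provenance(blob, db))
    print(resolve_provenance(b"other bytes", db, "1.1", frozenset({"1.0", "1.1"})))
```

Trying the hash match first reflects the ordering implied above: an exact content match identifies a specific build, whereas a version match only narrows the origin to an upstream release.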

Analysis of the igraph Python package (version 0.11.9) reveals a binary dependency graph composed of native extensions, vendored libraries originating from Red Hat packages, and system dependencies, despite deployment on a Debian host.

PyXSieve: A Targeted Vulnerability Scanner for Complex Dependencies

PyXSieve is a vulnerability scanner specifically designed for Python packages, with a primary focus on identifying weaknesses introduced by native extensions. These extensions, often written in C or C++, can introduce security vulnerabilities that are not present in pure Python code. The tool performs static analysis of package dependencies to detect known vulnerabilities in these native components. Unlike traditional scanners, PyXSieve prioritizes identifying vulnerabilities within the actual attack surface of a Python application, rather than simply reporting any dependency with a known issue. This targeted approach is crucial because not all vulnerable dependencies are actually exploitable in a given context.

PyXSieve employs provenance analysis to map the origin and build process of each package and its dependencies, specifically focusing on native extensions. This process identifies the source code used in compilation and any externally sourced libraries linked during the build. Complementing this, the tool constructs cross-ecosystem call graphs which trace function calls across package boundaries, including those originating from native extensions. By combining these techniques, PyXSieve generates a comprehensive view of the attack surface, detailing how external code integrates with the scanned Python application and highlighting potential entry points for exploitation originating from vulnerable native dependencies.

Reachability analysis in PyXSieve determines if a vulnerability in a native dependency is actually exploitable from within the application’s code. This process goes beyond simply identifying vulnerable dependencies; it traces the call paths from the application through its dependencies to the vulnerable code. If no path exists – meaning the vulnerable code is never called during normal application execution – the vulnerability is considered non-reachable and is flagged as a false positive. This significantly reduces alert fatigue by focusing on vulnerabilities that present a genuine risk, as demonstrated by a 92% reduction in alerts for indirectly vulnerable packages, narrowing the scope from 3,831 packages with pinned vulnerable dependencies to 312 reachable packages after analysis.
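The path-tracing idea can be illustrated with a breadth-first search over a call graph. This is a generic sketch, not PyXSieve's implementation: the graph is an adjacency map, and the function and node names are hypothetical.

```python
from collections import deque


def find_call_path(graph, entry_points, vulnerable):
    """Return a shortest call path from an entry point to a vulnerable function, or None."""
    queue = deque((entry, (entry,)) for entry in entry_points)
    seen = set(entry_points)
    while queue:
        node, path = queue.popleft()
        if node in vulnerable:
            return list(path)  # a concrete witness that the vulnerable code is reachable
        for callee in sorted(graph.get(node, ())):
            if callee not in seen:
                seen.add(callee)
                queue.append((callee, path + (callee,)))
    return None  # no path: the vulnerable function is not reachable from these entry points


# Hypothetical call graph: `unused_helper` calls the vulnerable function but is never called itself.
GRAPH = {
    "app.main": {"pkg.load"},
    "pkg.load": {"ext.c_load"},
    "ext.c_load": {"lib.parse"},
    "pkg.unused_helper": {"lib.vulnerable_fn"},
}

if __name__ == "__main__":
    print(find_call_path(GRAPH, ["app.main"], {"lib.parse"}))
    print(find_call_path(GRAPH, ["app.main"], {"lib.vulnerable_fn"}))
```

Returning the path rather than a boolean gives reviewers evidence they can inspect when deciding whether an alert is worth acting on.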

Analysis using PyXSieve identified 39 Python packages containing directly vulnerable native dependencies. These packages collectively account for over 47 million monthly downloads, indicating a substantial potential impact if exploited. In addition to these directly affected packages, the analysis revealed 312 packages transitively impacted by vulnerabilities present in their dependencies. This distinction highlights the breadth of the vulnerability, extending beyond immediate package users to those relying on packages that themselves depend on vulnerable code.

PyXSieve employs provenance analysis to trace the origin and build process of each package dependency, specifically focusing on native extensions. This provenance data is then cross-referenced with backported patch information, that is, cases where a distribution maintainer has applied a security fix to an older packaged version, so that the version number alone still looks vulnerable even though the flaw has been patched. By correlating these factors, PyXSieve significantly reduces false positive vulnerability reports for directly vulnerable packages, achieving up to a 97% reduction in inaccurate alerts. This improvement stems from the ability to distinguish between genuinely exploitable vulnerabilities and those already mitigated by patches applied within the dependency’s history.
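A small sketch shows why provenance matters for this check: a library resolved to a distribution build can carry a fix even though its upstream version number falls in the affected range. Everything here is hypothetical, including the data model, the origin labels, the placeholder CVE identifier, and the version strings. Real tooling would also need distribution-specific version ordering rather than the exact-match sets used below.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Provenance:
    origin: str   # "upstream" or a distribution label (hypothetical)
    version: str  # upstream version, or full distro build such as "1.2.3-4.example"


@dataclass
class Advisory:
    cve_id: str
    affected_upstream: set
    # distro label -> set of builds known to include a backported fix
    patched_distro_builds: dict = field(default_factory=dict)


def is_affected(prov: Provenance, adv: Advisory) -> bool:
    """Decide whether a resolved library should still be reported for this advisory."""
    if prov.origin == "upstream":
        return prov.version in adv.affected_upstream
    if prov.version in adv.patched_distro_builds.get(prov.origin, set()):
        return False  # fix was backported into this exact build
    # Otherwise fall back to the upstream part of the distro version string.
    return prov.version.split("-", 1)[0] in adv.affected_upstream


ADVISORY = Advisory(
    cve_id="CVE-0000-00000",  # placeholder, not a real identifier
    affected_upstream={"1.2.3"},
    patched_distro_builds={"example-distro": {"1.2.3-4.example"}},
)

if __name__ == "__main__":
    print(is_affected(Provenance("example-distro", "1.2.3-4.example"), ADVISORY))
    print(is_affected(Provenance("example-distro", "1.2.3-1.example"), ADVISORY))
```

A scanner that sees only "1.2.3" would flag both builds; with provenance, only the unpatched one remains.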

Initial scanning identified 3,831 packages that pin vulnerable dependencies; however, this number included many packages in which the vulnerable code was not actually reachable from the package’s own code. Through reachability analysis, PyXSieve reduced the number of alerted indirectly vulnerable packages by 92%, narrowing the scope to the 312 packages in which a vulnerable function is reachable through transitive dependencies. This reduction in false positives decreases alert fatigue and focuses security efforts on vulnerabilities that can actually be reached, improving the efficiency of remediation.
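The 92% figure follows directly from the two package counts quoted above, as a quick check confirms. The helper name is an assumption made for the example.

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage of alerts removed when going from `before` to `after`."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # 3,831 packages flagged initially, 312 remaining after reachability analysis.
    print(f"{reduction_percent(3831, 312):.1f}%")  # prints 91.9%
```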

Our analysis demonstrates reachability of vulnerable functions within the `scpp` Python package on a Debian host, revealing that while tools like Trivy detect system dependencies like `libcairo` (green), they fail to identify and trace the provenance of vulnerabilities within vendored libraries such as `libxml2` (orange) used by `igraph`, as further detailed in Figure 2.

Strengthening the Chain: Reproducible Builds, Secure Packaging, and Transparency

Python projects often rely on numerous external dependencies, creating challenges for consistent and reliable deployment. Tools such as cibuildwheel and auditwheel address this by supporting the creation of portable wheels: pre-built packages that bundle a project’s code together with the shared libraries it needs. cibuildwheel automates wheel builds across platforms and Python versions in controlled environments, while auditwheel inspects a built wheel’s external shared library dependencies and copies the required ones into the wheel. This approach reduces the “it works on my machine” problem by making behavior far less dependent on the user’s system configuration, and it supports supply chain security by making build inputs more explicit and repeatable.
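Since a wheel is a zip archive, the libraries bundled into it can be listed with the standard library alone. The sketch below assumes the vendored files sit under a directory whose name ends in `.libs`, which is the layout commonly seen in repaired Linux wheels; the function name and the sample paths are hypothetical.

```python
import io
import zipfile


def vendored_libraries(wheel) -> list:
    """List files stored under any directory ending in '.libs' inside a wheel.

    `wheel` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(wheel) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if not name.endswith("/")
            and any(part.endswith(".libs") for part in name.split("/")[:-1])
        )


if __name__ == "__main__":
    # Build a tiny in-memory archive so the example runs without a real wheel.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo/__init__.py", "")
        archive.writestr("demo.libs/libexample-0a1b2c3d.so.1", b"placeholder")
    buffer.seek(0)
    print(vendored_libraries(buffer))
```

A listing like this is the starting point for provenance analysis: each entry is a binary whose origin is not recorded in the wheel's Python-level metadata.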

The manylinux standard addresses a critical challenge in software distribution: ensuring consistent execution across diverse Linux environments. Historically, packaging Python projects for Linux could result in compatibility issues due to variations in glibc versions and other system libraries. Manylinux tackles this by defining platform tags, each tied to a baseline glibc version and a small set of system libraries that a wheel may link against externally; builds are typically run inside container images based on suitably old distributions. This process generates portable wheels that function reliably on a wide range of Linux distributions, significantly reducing the burden on end-users and streamlining deployment. By creating a consistent baseline, manylinux simplifies package installation and improves reproducibility, although every library outside that baseline must be bundled into the wheel.
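The glibc baseline a wheel targets is encoded in its platform tag, so it can be read off with a small parser. The sketch below handles the `manylinux_X_Y_arch` form and the three legacy aliases; the function name is an assumption, and the parser does no validation beyond the pattern match.

```python
import re
from typing import Optional, Tuple

# Legacy tag names and the glibc versions they correspond to.
LEGACY_ALIASES = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}
_TAG = re.compile(r"^manylinux_(\d+)_(\d+)_(.+)$")


def glibc_baseline(platform_tag: str) -> Optional[Tuple[int, int]]:
    """Return the (major, minor) glibc baseline implied by a manylinux platform tag."""
    match = _TAG.match(platform_tag)
    if match:
        return int(match.group(1)), int(match.group(2))
    for alias, version in LEGACY_ALIASES.items():
        if platform_tag.startswith(alias + "_"):
            return version
    return None  # not a manylinux tag


if __name__ == "__main__":
    for tag in ("manylinux_2_28_x86_64", "manylinux2014_aarch64", "win_amd64"):
        print(tag, glibc_baseline(tag))
```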

A Software Bill of Materials (SBOM) is a formal, machine-readable inventory detailing every component within a software package: not just the core code, but also dependencies, libraries, and the specific versions used. This detailed listing is crucial for bolstering software supply chain security, enabling organizations to quickly identify and address vulnerabilities when they are discovered in any component. By providing transparency into a package’s composition, SBOMs support vulnerability management, license compliance, and faster incident response. Furthermore, the increasing adoption of SBOMs supports automated security checks and allows developers to proactively manage risks associated with third-party software, ultimately fostering greater trust and reliability in the digital ecosystem.
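To make "machine-readable inventory" concrete, the sketch below emits a minimal document in the style of CycloneDX JSON, listing one Python package and one vendored native library. It is a simplified illustration that has not been validated against the schema; the native library's version and package URL are placeholders, and the helper name is an assumption.

```python
import json


def make_sbom(components):
    """Build a minimal CycloneDX-style document from (name, version, purl) tuples."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {"type": "library", "name": name, "version": version, "purl": purl}
            for name, version, purl in components
        ],
    }


if __name__ == "__main__":
    sbom = make_sbom(
        [
            ("igraph", "0.11.9", "pkg:pypi/igraph@0.11.9"),
            # Placeholder entry for a vendored native library; version is illustrative.
            ("libxml2", "0.0.0-example", "pkg:generic/libxml2@0.0.0-example"),
        ]
    )
    print(json.dumps(sbom, indent=2))
```

The second entry is the kind that matters here: an SBOM generated from Python metadata alone would omit it, because the vendored library is not declared as a Python dependency.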

The dependency tree for the `scppin` package reveals its recursive reliance on other packages as defined in their `.whl` metadata, mirroring the example presented in Figure 1.
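A dependency tree of this kind can be derived from the `Requires-Dist` fields in each wheel's `METADATA` file. The sketch below parses those fields from metadata text and expands them recursively; the package names and metadata snippets are made up for the example, and version specifiers, extras, and environment markers are deliberately ignored.

```python
import re
from email.parser import Parser

_NAME = re.compile(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def direct_requirements(metadata_text: str) -> list:
    """Extract requirement names from the Requires-Dist headers of a METADATA file."""
    message = Parser().parsestr(metadata_text)
    names = []
    for requirement in message.get_all("Requires-Dist") or []:
        match = _NAME.match(requirement)
        if match:
            names.append(match.group(1).lower())
    return names


def dependency_tree(package: str, metadata_by_package: dict, _path: tuple = ()) -> dict:
    """Recursively expand requirements into a nested dict; cycles are cut off."""
    if package in _path:
        return {}
    text = metadata_by_package.get(package, "")
    return {
        dep: dependency_tree(dep, metadata_by_package, _path + (package,))
        for dep in direct_requirements(text)
    }


# Hypothetical metadata for two packages; a third (`texttable`) has no entry.
SAMPLE_METADATA = {
    "demo-app": "Metadata-Version: 2.1\nName: demo-app\nRequires-Dist: igraph (>=0.11)\n",
    "igraph": "Metadata-Version: 2.1\nName: igraph\nRequires-Dist: texttable>=1.6.2\n",
}

if __name__ == "__main__":
    print(dependency_tree("demo-app", SAMPLE_METADATA))
```

Note what the tree cannot show: it stops at Python package names, so shared libraries bundled inside any of these wheels do not appear in it.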

The presented work embodies a principle of systemic understanding. PyXSieve’s approach to vulnerability analysis, extending beyond direct Python dependencies to encompass native libraries and their provenance, demonstrates that isolating components obscures the larger picture. This aligns with the observation that structure dictates behavior; the call graphs constructed are not merely diagrams, but representations of how a system will fail given certain inputs. The tool doesn’t simply identify vulnerable code, it maps the pathways through which that vulnerability can be exploited. As Ada Lovelace noted, “The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform.” Similarly, PyXSieve reveals vulnerabilities that exist not as inherent flaws, but as consequences of the system’s constructed dependencies and the instructions, or data, it receives.

What Lies Ahead?

The pursuit of software security often feels like chasing shadows – a vulnerability addressed merely reveals another, hidden deeper within the system’s intricate web. PyXSieve attempts to illuminate one particularly troublesome corner of that web: the opaque dependencies on native libraries within Python applications. However, accurate provenance tracking, even with the techniques presented, remains a brittle endeavor. The supply chain is dynamic, build processes are rarely canonical, and malicious actors are adept at obscuring origins. Future work must move beyond simply detecting vulnerable components to assessing the true cost of their exploitation – the actual reachability of vulnerable code within a running application.

Current approaches to dependency analysis largely treat the ecosystem as a directed graph, a neat abstraction. Yet, real-world systems are messier, exhibiting cyclical dependencies, implicit contracts, and emergent behavior. A truly robust analysis requires modeling not just what is connected, but what can be connected, and the likelihood of that connection being exploited. The construction of cross-ecosystem call graphs, while powerful, is computationally expensive and prone to false positives; a more nuanced approach, perhaps leveraging machine learning to prioritize likely attack vectors, is warranted.

Ultimately, the challenge isn’t merely technical. The increasing complexity of software ecosystems demands a fundamental shift in how security is approached. The focus must move from reactive vulnerability patching to proactive resilience – designing systems that can tolerate compromise, limit damage, and continue functioning even in the face of attack. Such systems will likely embrace principles of modularity, isolation, and verifiable trust, and will demand a level of transparency currently lacking in most software supply chains.

Original article: https://arxiv.org/pdf/2603.18693.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
