Unpatched 15-year-old Python bug allows code execution in 350k projects

A vulnerability in the Python programming language that has been overlooked for 15 years is back in the spotlight, as it is estimated to affect more than 350,000 open-source repositories and can lead to code execution.

Disclosed in 2007 and tracked as CVE-2007-4559, the security issue never received a patch; the only mitigation provided was a documentation update warning developers about the risk.

Unpatched since 2007

The vulnerability is in the Python tarfile package, in code that uses the un-sanitized tarfile.extract() function or the built-in defaults of tarfile.extractall(). It is a path traversal bug that enables an attacker to overwrite arbitrary files.

Technical details for CVE-2007-4559 have been available since the initial report in August 2007. Although there are no reports of the bug being used in attacks, it represents a risk in the software supply chain.

Earlier this year, while investigating another security issue, a researcher at Trellix rediscovered CVE-2007-4559. Trellix is a new business that provides extended detection and response (XDR) solutions, formed by the merger of McAfee Enterprise and FireEye.

“Failure to write any security code to sanitize member files before calling tarfile.extract() or tarfile.extractall() results in a directory traversal vulnerability, allowing bad actors access to the file system” – Charles McFarland, vulnerability researcher on the Trellix Advanced Threat Research team

The flaw stems from the fact that the code in the extract function of Python’s tarfile module explicitly trusts the information in the TarInfo object “and joins the path that is passed to the extract function and the name in the TarInfo object.”

CVE-2007-4559 - Joining path with filename
Source: Trellix
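To make the path-joining problem concrete, here is a minimal sketch. It only illustrates the pattern described above and is not the tarfile module's actual source; the helper names `build_demo_archive` and `naive_target` and the `srv/app/uploads` directory are hypothetical. The archive is built in memory and nothing is extracted to disk.

```python
import io
import os
import tarfile


def build_demo_archive(member_name: str) -> bytes:
    """Build a tiny in-memory tar archive holding one member with the given name."""
    payload = b"demo"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def naive_target(extract_dir: str, member_name: str) -> str:
    """Mimic the risky pattern: join the destination with the member name as-is."""
    return os.path.normpath(os.path.join(extract_dir, member_name))


# The member name is attacker-controlled and is stored in the archive unchanged.
archive_bytes = build_demo_archive("../../outside.txt")
with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as archive:
    member_names = archive.getnames()

# Hypothetical destination directory, used only to compute the resulting path.
extract_dir = os.path.join("srv", "app", "uploads")
resolved = naive_target(extract_dir, member_names[0])
escapes = not resolved.startswith(extract_dir + os.sep)
print(member_names[0], "->", resolved, "| escapes destination:", escapes)
```

Because the `../` components survive inside the archive, the joined path resolves to a location outside the intended destination directory.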

Less than a week after the disclosure, a message on the Python bug tracker announced that the issue had been closed, the fix being a documentation update with a warning that “extracting archives from untrusted sources can be dangerous.”
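Since the mitigation was left to developers, the kind of sanitization check they would need might look like the sketch below. This is one possible approach under stated assumptions, not the official fix or Trellix's patch: the helpers `is_within_directory`, `safe_members`, and `safe_extractall` are hypothetical names, and the sketch conservatively rejects symbolic and hard links rather than trying to validate them.

```python
import os
import tarfile


def is_within_directory(directory: str, target: str) -> bool:
    """Return True if target resolves to a location inside directory."""
    abs_directory = os.path.realpath(directory)
    abs_target = os.path.realpath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def safe_members(archive: tarfile.TarFile, destination: str):
    """Yield archive members, raising ValueError on anything suspicious."""
    for member in archive.getmembers():
        target = os.path.join(destination, member.name)
        if not is_within_directory(destination, target):
            raise ValueError(f"blocked path traversal attempt: {member.name!r}")
        if member.issym() or member.islnk():
            # Conservative assumption: links are refused outright.
            raise ValueError(f"blocked link member: {member.name!r}")
        yield member


def safe_extractall(archive_path: str, destination: str) -> None:
    """Extract an archive only after every member has passed the checks."""
    with tarfile.open(archive_path) as archive:
        checked = list(safe_members(archive, destination))
        archive.extractall(destination, members=checked)
```

All members are validated before anything is written, so a single bad entry stops the whole extraction. Recent Python releases also accept a `filter` argument to `extractall()` (for example `filter="data"`) that performs similar checks inside the standard library.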

Estimated 350,000 Projects Affected

Analyzing the impact, Trellix researchers found that the vulnerability was present in thousands of software projects, both open and closed source.

The researchers scraped a set of 257 repositories most likely to contain vulnerable code and manually checked 175 of them to see if they had been affected. This showed that 61% of them were vulnerable.

Running an automated check on the remaining repositories increased the number of affected projects to 65%, indicating a wider problem.
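The article does not describe how the automated check works. As a rough illustration only, a static scan could flag `extractall()` calls that pass no vetted member list. The function `find_unfiltered_extractions` below is a hypothetical heuristic, not the researchers' tool, and it would also flag unrelated objects that happen to have an `extractall` method (such as zip archives).

```python
import ast


def find_unfiltered_extractions(source: str) -> list:
    """Return line numbers of extractall() calls with no members or filter argument."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr != "extractall":
            continue
        keywords = {kw.arg for kw in node.keywords}
        # members is the second positional parameter of TarFile.extractall().
        has_members = len(node.args) >= 2 or "members" in keywords
        if not has_members and "filter" not in keywords:
            findings.append(node.lineno)
    return sorted(findings)


# The sample is only parsed, never executed.
sample = '''import tarfile
with tarfile.open("a.tar") as t:
    t.extractall("out")
with tarfile.open("b.tar") as t:
    t.extractall("out", members=checked)
'''

flagged_lines = find_unfiltered_extractions(sample)
print(flagged_lines)
```

A hit from a heuristic like this is only a candidate for review, which is consistent with the researchers relying on manual verification for their baseline rate.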

However, the small sample set only serves as a baseline to come up with an estimate of all the affected repositories available on GitHub.

“With the help of GitHub we were able to get a much larger dataset to include 588,840 unique repositories, which included ‘import tarfile’ in its Python code” – Charles McFarland

Using the manually verified 61% vulnerability rate, Trellix estimates there are over 350,000 vulnerable repositories, many of them used by machine learning tools (such as GitHub Copilot) that help developers complete a project faster.
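The estimate can be reproduced as a back-of-the-envelope calculation from the two figures quoted above. The variable names are illustrative, and this simply applies the sample rate to the whole dataset, which assumes the sample is representative.

```python
# Figures taken from the article: repositories containing "import tarfile"
# and the manually verified vulnerability rate, in percent.
repositories_importing_tarfile = 588_840
manually_verified_rate_percent = 61

# Integer arithmetic avoids floating-point rounding noise.
estimated_vulnerable = repositories_importing_tarfile * manually_verified_rate_percent // 100
print(estimated_vulnerable)
```

The result lands a little above the 350,000 figure cited in the estimate.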

Such automated tools rely on code from hundreds of thousands of repositories to provide an “auto-complete” option. If they provide unsafe code, the problem spreads to other projects without the developer knowing.

GitHub Copilot suggesting vulnerable tarfile extraction code
Source: Trellix

Looking further into the problem, Trellix found that open-source code vulnerable to CVE-2007-4559 spans a large number of industries.

As expected, the most affected is the development sector, followed by web and machine learning technology.

Code vulnerable to CVE-2007-4559 present across industries
Source: Trellix

Exploiting CVE-2007-4559

In a technical blog post today, Trellix vulnerability researcher Kasimir Schulz, who rediscovered the bug, described simple steps to exploit CVE-2007-4559 in the Windows version of Spyder IDE, an open-source cross-platform integrated development environment for scientific programming.

The researchers showed that the vulnerability could be exploited on Linux as well. They managed to achieve file write and code execution in a test on the Polemarch IT Infrastructure Management Service.

In addition to drawing attention to the vulnerability and the risk it poses, Trellix also created patches for more than 11,000 projects. The fixes will be available in forks of the affected repositories and will later be submitted to the main projects via pull requests.

Due to the large number of affected repositories, the researchers expect more than 70,000 projects to receive a fix in the next few weeks. Reaching the 100% mark is a tough challenge, however, as merge requests also need to be accepted by the maintainers.

BleepingComputer has contacted the Python Software Foundation for a comment regarding CVE-2007-4559, but has not received a reply at publication time.
