Analysis of GitHub Repositories Surfaces Nearly 23M Secrets

An analysis of public GitHub repositories, published today, found 22.8 million hardcoded secrets, a 25% increase over a similar study conducted a year ago.

Conducted by GitGuardian, the analysis also found that more than a third (35%) of all private repositories contained at least one plaintext secret that could be easily discovered.

More troubling still, 70% of the secrets that GitGuardian discovered in 2022 remain active three years later, according to the report.

In total, 4.6% of all public repositories contained a secret. Of all detected secrets, 58% were generic, a category that includes hardcoded passwords embedded in source code, database connection strings, custom authentication tokens and encryption keys stored in plaintext.
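To make those categories concrete, here is a minimal sketch of how pattern-based detection of hardcoded secrets can work. It is not GitGuardian's method; the three regular expressions, the `scan_text` helper and the sample values are illustrative assumptions, and production scanners rely on far larger rule sets along with validity checks.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a vendor rule set).
PATTERNS = {
    # AWS access key IDs commonly start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A quoted value of 8+ characters assigned to a password-like name.
    "generic_password": re.compile(
        r"(?i)\b(?:password|passwd|pwd)\b\s*[:=]\s*['\"]([^'\"]{8,})['\"]"
    ),
    # A database URL with inline user:password credentials.
    "connection_string": re.compile(
        r"\b(?:mongodb(?:\+srv)?|postgres(?:ql)?|mysql)://[^\s:/@]+:[^\s@]+@[^\s'\"]+"
    ),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for every pattern hit in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


# Hypothetical file contents; every value here is a made-up placeholder.
sample = '''
DB_URL = "mongodb://admin:hunter2pass@db.example.com:27017/app"
password = "correct-horse-battery"
AWS_KEY = "AKIAIOSFODNN7EXAMPLE"
timeout = 30
'''

for lineno, name in scan_text(sample):
    print(f"line {lineno}: possible {name}")
```

A scan like this only flags candidates. Confirming that a flagged credential is still valid, which the report's figures on active secrets depend on, requires a separate verification step.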

Soujanya Ain, senior product marketing manager for GitGuardian, said the report makes it clear that despite increased adoption of DevSecOps best practices and the availability of services such as GitHub push protection, the number of application secrets that cybercriminals can easily gain access to continues to increase.

Tools such as GitHub Copilot are also contributing to the problem. Out of the 20,000 repositories that provide access to GitHub Copilot, more than 1,200 had at least one secret, representing 6.4% of the sampled repositories.

In addition, of the 2,584 repositories where continuous integration/continuous delivery (CI/CD) configurations indicated the availability of a secrets manager, 132 had leaked at least one secret, representing 5.1% of the studied repositories.

Overall, the most frequently leaked secret discovered in public repositories (19%) provided access to the MongoDB document database. The fastest growing source of leaks (188% increase) is Neo4j, an open-source graph database.

However, private repositories also appear to be providing application development teams with a false sense of security. For example, identity and access management (IAM) keys for Amazon Web Services (AWS) appearing in plaintext accounted for 8% of the secrets discovered in private repositories, compared to 1% in public repositories. Generic passwords appeared three times more often (24%) in private repositories than they did in public ones.

Secrets are also being embedded in images, where they can be easily discovered. For example, nearly all (98%) of the secrets detected in DockerHub were in an image, with more than 7,000 AWS keys exposed. Approximately 100,000 valid secrets, including AWS keys, Google Cloud Platform (GCP) keys, and GitHub tokens belonging to Fortune 500 companies, are currently exposed. Out of 1,179,475 unique secrets detected, 101,186 were automatically verified as valid, according to the report.

The report also notes that IT tools routinely contain exposed secrets, with 2.4% of the Slack channels analyzed containing leaked secrets and 6.2% of tickets created using Jira software from Atlassian exposing secrets. These tools have become a major blind spot for many organizations, noted Ain.

It is unclear how much of the increase reflects more secrets being exposed rather than more being discovered using machine learning algorithms. Either way, in an era where cybercriminals have discovered they can use stolen credentials to simply log into application environments to spread malware, the management of application secrets needs to become a more pressing concern.
