12,000 Secrets Unleashed: AI Models Training on Hidden API Keys!

Nearly 12,000 valid API keys and passwords, including AWS and MailChimp keys, were found in the Common Crawl dataset. Because that dataset is used to train LLMs, models may be learning from insecure code despite efforts to filter sensitive data. Truffle Security warns developers against hardcoding secrets, citing the risk of data leaks and phishing.

Hot Take:

Whoever said secrets are meant to be kept clearly didn’t inform the Common Crawl dataset. It’s like an open treasure chest of passwords and keys, just waiting for pirates of the digital seas! Forget about hacking into mainframes: the real action is in the HTML forms and JavaScript snippets. Arr, matey, hardcoded treasures await!

Key Points:

  • Close to 12,000 valid secrets found in the Common Crawl dataset.
  • Truffle Security identified 219 distinct secret types, with MailChimp API keys being the most common.
  • Secrets were hardcoded into front-end HTML and JavaScript instead of being loaded from server-side environment variables.
  • 63% of secrets appeared across multiple pages, with one WalkScore API key found 57,029 times.
  • Truffle Security contacted vendors to revoke compromised keys to prevent misuse.
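
For the hardcoding point above, here is a minimal Python sketch of the safer habit: read the secret from the server's environment at runtime so it never lands in HTML or JavaScript that a crawler can scoop up. The variable name `MAILCHIMP_API_KEY` and the helper `load_api_key` are made up for illustration; they are not taken from Truffle Security's report or from any vendor's SDK.

```python
import os


def load_api_key(name: str = "MAILCHIMP_API_KEY") -> str:
    """Fetch a secret from the server-side environment (hypothetical helper).

    The variable name is an assumption for this sketch. The key stays on
    the server and is never written into page markup or client scripts.
    """
    key = os.environ.get(name)
    if not key:
        # Fail loudly rather than falling back to a hardcoded default.
        raise RuntimeError(f"{name} is not set in the environment")
    return key
```

A server-side handler would call `load_api_key()` when it talks to the vendor and return only the result to the browser, so the key itself is never part of a page that ends up in a crawl.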

The Nimble Nerd