Detecting secrets in code committed to GitLab (in real time)

May 16, 2020


  1. Detecting secrets in code committed to GitLab (in real time) Chandrapal Badshah
  2. About Me ● Chandrapal Badshah ● Security Engineer ● Stoic and spends time with philosophy ● Pentest, Automation, Read books ● Manage @HackwithGithub on Twitter
  3. Context ● Product based company, fail fast learn fast ● Hires a lot of devs* ● Uses GitLab Community Edition for code storage and CI/CD ● We do audit the code for secrets at regular intervals, but by then it's too late
  4. Problem Statement Need to detect and remove sensitive API keys (secrets) from code. This would reduce the impact when: ● Devs make an internal repo public ● Devs push commits to their personal GitHub repos by mistake ● Unauthorized members get access to the code (insider threat)
  5. This would help us in situations like Source :
  6. Let’s begin our journey
  7. Git flow → git commit → git push →
  8. Git hooks ● Git hooks are scripts that git executes before or after events such as: commit, push, and receive ● Git hooks are a built-in feature - no need to download anything. ● There are many types of git hooks. Check out ● We are interested in commit and receive based hooks: ○ pre-commit ○ post-commit ○ pre-receive ○ post-receive
  9. Git hooks in the flow Source:
  10. Comparison of Git hooks Pre-commit and post-commit hooks - run the scripts on dev machines. Advantages: ● Stops secrets even before they are committed Disadvantages: ● Adding new regexes & managing the script on dev machines is hard ● False positives make for a bad user experience ● Privacy issues? Nothing stops devs from removing the git hooks
  11. Comparison of Git hooks Pre-receive hook - runs on the server before the push is accepted, so any check it performs holds up the dev's git push until it finishes. There is also a pre-push hook, which executes even before the pre-receive hook runs on the server side, but the pre-push hook is still on the client side.
  12. Comparison of Git hooks Post receive hook - runs on the server side. Advantages: ● Can be configured for no delay when user does a git push. Devs don’t really see the difference. ● Easy to manage the scripts ● False positives are manageable Disadvantages: ● The secrets are already on the server
  13. Final Decision Go with post-receive hooks. If a secret is detected: ● automatically raise a confidential GitLab issue in the repo ● get feedback - check if it's a false positive ● if it's a secret, ask the devs to rotate the secret Post-receive hooks should be configured per repository
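The "raise a confidential issue" step could be sketched as below. This is not the speaker's Issue Manager, only a minimal illustration. It assumes GitLab's REST endpoint `POST /api/v4/projects/:id/issues` with the `confidential` parameter and a `PRIVATE-TOKEN` header; the function name, its arguments, and the issue wording are hypothetical.

```python
import json
import urllib.request


def build_confidential_issue_request(base_url, project_id, api_token,
                                     rule_name, commit_sha, file_path):
    """Build (but do not send) a request that opens a confidential issue.

    The secret itself is deliberately left out of the issue text; only the
    rule that fired and the location are reported.
    """
    payload = {
        "title": f"Possible secret detected: {rule_name}",
        "description": (
            f"Rule `{rule_name}` matched in `{file_path}` "
            f"at commit {commit_sha}.\n\n"
            "Please confirm whether this is a false positive. "
            "If it is a real secret, rotate it."
        ),
        # Assumed GitLab Issues API parameter: restricts who can see the issue.
        "confidential": True,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v4/projects/{project_id}/issues",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "PRIVATE-TOKEN": api_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping request construction separate from sending makes the payload easy to inspect in tests.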
  14. GitLab feature to help post-receive hooks ● GitLab has System hooks ● GitLab system hooks send an HTTP POST request for many events like push, group create, repo create, etc ● More details at
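A service receiving the system hook POST has to pick out push events and ignore the rest. The sketch below shows one way to do that. It assumes the payload field names `event_name`, `project_id`, `ref`, `before`, and `after`, and the `X-Gitlab-Token` secret header, as recalled from GitLab's system hook documentation; check them against the docs for your GitLab version. The function name is hypothetical.

```python
import hmac
import json


def parse_push_event(headers, body, expected_token):
    """Return the details needed to scan a push, or None for other events.

    headers: dict of request headers (exact-case keys in this sketch)
    body: raw JSON request body as str or bytes
    expected_token: the secret token configured on the system hook
    """
    token = headers.get("X-Gitlab-Token", "")
    # Constant-time comparison avoids leaking the token through timing.
    if not hmac.compare_digest(token.encode(), expected_token.encode()):
        raise PermissionError("unexpected or missing X-Gitlab-Token")

    event = json.loads(body)
    if event.get("event_name") != "push":
        return None  # group create, repo create, etc. are not scanned

    return {
        "project_id": event["project_id"],
        "ref": event["ref"],
        "before": event["before"],  # commit SHA before the push
        "after": event["after"],    # commit SHA after the push
    }
```

The returned `before` and `after` SHAs delimit the range of commits a scanner would need to diff. Returning quickly and scanning in the background keeps the hook from slowing anything down.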
  15. Existing secret detection tools There are lots of open source tools: ● truffleHog ● gitleaks ● git-secrets by AWS Labs ● detect-secrets by Yelp ● talisman by ThoughtWorks ● and more...
  16. TruffleHog ● Python based tool ● Customizable regex ● Easy install and CLI commands ● Good documentation
  17. Gitleaks ● Written in Golang ● Customizable regex ● Supports whitelisting of secrets ● Lots of options in CLI commands, but lacks documentation ● Allows scanning a single commit but downloads the entire repo
  18. Comparison of truffleHog and gitleaks ● truffleHog: 1. Efficient for smaller commits 2. Less memory intensive 3. After configuring with GitLab system hooks, the total time taken to complete scanning was less. ● gitleaks: 1. Same time as truffleHog for smaller commits. Comparatively fast for huge commits. 2. Very greedy for CPU and memory 3. After configuring with GitLab system hooks, the total time taken to complete scanning was less, but at the cost of CPU and memory.
  19. Changes made ● Took all the necessary code from truffleHog and stripped the rest. We internally call it “tattletale-rt”. ● The scan logic looks like the below: ○ Get the code changes in the commit (only the added content, not the removed) ○ Get all the regexes we need to scan ○ For each line in the code change, check if the regex matches ○ If it matches, report it ● Have a separate service called “Issue Manager” which manages issues.
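The scan logic on this slide can be illustrated with a short sketch. This is not the tattletale-rt code, which the talk does not show. It assumes the commit's changes arrive as unified diff text, and the two regexes are illustrative examples rather than the speaker's rule set.

```python
import re

# Hypothetical rules; a real deployment would load its own list.
RULES = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "Private key header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def added_lines(diff_text):
    """Yield the content of lines added by a unified diff.

    Only "+" lines inside a hunk count, so the "+++ b/file" header and
    removed ("-") lines are skipped.
    """
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False          # a new file section starts
        elif line.startswith("@@"):
            in_hunk = True           # hunk header: content follows
        elif in_hunk and line.startswith("+"):
            yield line[1:]


def scan_diff(diff_text, rules=RULES):
    """Return (rule name, matching line) pairs for added lines only."""
    findings = []
    for line in added_lines(diff_text):
        for name, pattern in rules.items():
            if pattern.search(line):
                findings.append((name, line.strip()))
    return findings
```

Scanning only added lines keeps the work proportional to the size of the push. A secret that a commit removes is not reported here, although it may still exist in earlier history.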
  20. Final architecture
  21. DEMO
  22. Thanks to Fahri Shihab @fahrishb Sanjog Panda @sanjogpanda
  23. What we learnt ● Not all API keys are sensitive. Google API keys are everywhere and are intended to be public - Google Maps API key, Firebase key, etc ● Deployments are different for each project - no “one solution” that fits all ● This detection is regex based. API keys / secrets will not be detected if: ○ The API key doesn't match the regex ○ The secrets are named in a different language. пароль (parol’) is “password” in Russian. ● Entropy based detection is noisy but can detect some secrets. ● Learn the secure way to store secrets for each tech stack.
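Entropy based detection, mentioned above, flags strings that look random regardless of any keyword around them. A minimal sketch using Shannon entropy follows. The token pattern, the minimum length of 20, and the threshold of 4.0 bits per character are illustrative assumptions, not values from the talk.

```python
import math
import re
from collections import Counter


def shannon_entropy(s):
    """Return the Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    total = len(s)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(s).values()
    )


def high_entropy_tokens(line, min_length=20, threshold=4.0):
    """Return long tokens in line whose entropy exceeds threshold.

    Tokens are runs of base64-like characters; both parameters are
    tunable assumptions.
    """
    tokens = re.split(r"[^A-Za-z0-9+/=_-]+", line)
    return [
        token for token in tokens
        if len(token) >= min_length and shannon_entropy(token) > threshold
    ]
```

Because the check never looks at variable names, it does not depend on the word "password" appearing in English. The same property makes it noisy: hashes, UUIDs, and minified assets can also score high, so the threshold usually needs tuning per codebase.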
  24. Thank you. Any questions?
  25. What are we working on now? Follow on Twitter to get more updates on: ● Mobile App Security Pipeline (Android & iOS) ● SAST