| @@ -0,0 +1,108 @@ | |||||
| --- | |||||
| title: CODEpendence | |||||
| description: > | |||||
| How to surreptitiously inject code via submodules that use GitHub repos | |||||
| created: !!timestamp '2021-07-07' | |||||
| time: 11:16 AM | |||||
| tags: | |||||
| - security | |||||
| - GitHub | |||||
| - git | |||||
| --- | |||||
| TL;DR: If you use submodules that point to a GitHub repo, make sure | |||||
| that the commit id matches an official branch or tag, especially if | |||||
| upgraded via a PR or submitted patch. | |||||
| This issue was disclosed to GitHub via the HackerOne Bug Bounty | |||||
| program and resolved by them in a timely manner. The | |||||
| [writeup](https://www.funkthat.com/~jmg/github.submodules.hash.txt) is | |||||
| available and is the same one that was provided to GitHub. It contains | |||||
| the complete steps in more detail than this blog post does. | |||||
| Discovery | |||||
| --------- | |||||
| Earlier this year, I was dealing with a git repo that used submodules. | |||||
| I've never been a fan of them due to the extra work involved in using | |||||
| them. But then a thought hit me: last year, when GitHub took down | |||||
| youtube-dl, someone was a bit sneaky and inserted a [copy of it into | |||||
| GitHub's DMCA repo](https://www.reddit.com/r/programming/comments/jhlhok/someone_replaced_the_github_dmca_repo_with/). | |||||
| They were able to do this because of a feature/bug in GitHub's | |||||
| backend: all the commits to a forked repo are accessible in the | |||||
| parent repo; only the branches and tags are maintained | |||||
| separately.<label for="sn-reason" class="margin-toggle sidenote-number"> | |||||
| </label><input type="checkbox" id="sn-reason" class="margin-toggle"/> | |||||
| <span class="sidenote">This makes sense to reduce storing duplicate data | |||||
| if a repo is large or has many forks.</span> | |||||
| Verification | |||||
| ------------ | |||||
| The next question: would combining these into an attack even work? | |||||
| What would things look like? I created a few accounts to test this, | |||||
| starting with a project to represent a code dependency, | |||||
| [depproj](https://github.com/upstream123/depproj), that would be | |||||
| imported into another project by another user, | |||||
| [proj](https://github.com/comproj/proj). Once those were created, | |||||
| I had a malicious user create a fork of both | |||||
| [depproj](https://github.com/maliciousrepo/depproj) and | |||||
| [proj](https://github.com/maliciousrepo/proj). | |||||
| Once the malicious forks were created, I cloned them locally. With the | |||||
| clones, malicious [code can be | |||||
| inserted](https://github.com/maliciousrepo/depproj/commit/91781e4b9e1b1c944e19db740db12304755666b5) | |||||
| into the depproj repo. If you look at the repo, the previous commit | |||||
| was done as the maliciousrepo user. While I was working on this, | |||||
| I remembered that with git you can set the commit author to be anything | |||||
| (signing helps prevent that), so this commit appears to be done by the | |||||
| correct upstream123 user. | |||||
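
To make the author point concrete, here is a minimal local sketch. It is not the exact command used for the demo commit; the function name `spoofed_commit` and both identities are placeholders. It shows that git stores whatever author string it is handed, independently of the committer identity.

```shell
# Sketch: git records whatever author identity it is given; nothing checks
# that it matches the person who actually creates or pushes the commit.
# The names and emails below are placeholders, not real accounts.
spoofed_commit() {
    repo="$1"
    git -C "$repo" \
        -c user.name="Some Committer" -c user.email="committer@example.com" \
        commit --quiet --allow-empty \
        --author="Upstream Maintainer <maintainer@example.com>" \
        -m "Update to new version"
    # Print the author and committer recorded in the new commit.
    git -C "$repo" log -1 --format='author=%an <%ae>%ncommitter=%cn <%ce>'
}
```

Running `spoofed_commit .` in a scratch repo prints the two identities side by side. For signed commits, `git log --show-signature` is one way to see whether a commit carries a valid signature.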
| Once the malicious code has been inserted, the malicious user can now | |||||
| update the submodule of the project to the commit id of the malicious | |||||
| code. This is done simply by running: | |||||
| ``` | |||||
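| # Inside the submodule checkout, whose origin is still the upstream123 repo | |||||
| cd depproj | |||||
| # Fetch the fork-only commit by id, straight from the parent repo | |||||
| git fetch origin <commitid> | |||||
| git checkout <commitid> | |||||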
| ``` | |||||
| Even though the depproj submodule still points to the upstream123 repo, | |||||
| the above works w/o any other changes, because fork commits appear IN | |||||
| the depproj repo. This is also what makes it dangerous: because the | |||||
| repo is not changed, it can be disguised as a simple version update. | |||||
| A [PR](https://github.com/comproj/proj/pull/3) is then submitted to the | |||||
| project being attacked. I did not control the author of commits as | |||||
| well as I should have, but it still is effective. If you click into | |||||
| the proposed change, and then click on code.c, the file changed, it'll | |||||
| bring you to the [change compare | |||||
| view](https://github.com/upstream123/depproj/compare/91781e4b9e1b1c944e19db740db12304755666b5...370d35ec5df81a16bb361111faeb665ea90de026#diff-e43700a08429a0231daba9a49ff36a118566849856da2811ae074417ebb552d0). | |||||
| For this demo, it was a small change, but if the project is large, it | |||||
| would be easy to bury a minor flaw in lots of changes. The other thing | |||||
| to note about this page is that the author displayed is NOT the real author | |||||
| of the change; instead, it appears to be a legitimate change by the | |||||
| author of the repo. []({{ media_url('images/codependence-comp-author.png') }}) | |||||
| Conclusion | |||||
| ---------- | |||||
| This is an interesting attack in that it leverages two features in a | |||||
| way that has surprising results. It demonstrates that software | |||||
| dependencies need to be reviewed and vetted, and that if you're using | |||||
| GitHub, just because a PR says it's updating a submodule to the | |||||
| new version doesn't mean that it is safe to simply merge in the | |||||
| change. | |||||
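
One way to act on this when reviewing a submodule bump is to check that the pinned commit is reachable from a branch or tag of the submodule's own remote. The sketch below is an illustration, not an official tool. It assumes the submodule is already checked out at the commit the PR proposes, that its remote is named `origin`, and `check_submodule_pin` is a made-up function name.

```shell
# Sketch: succeed only if the commit checked out in a submodule is reachable
# from a branch or tag of its configured remote (assumed to be "origin").
# Usage: check_submodule_pin <path-to-submodule>
check_submodule_pin() {
    sub="$1"
    # Commit currently checked out in the submodule working tree.
    sha=$(git -C "$sub" rev-parse HEAD) || return 2
    # Refresh our view of the remote's branches and tags.
    git -C "$sub" fetch --quiet --tags origin || return 2
    # Remote-tracking branches and tags whose history contains the commit.
    refs=$(git -C "$sub" for-each-ref --contains "$sha" \
        --format='%(refname)' refs/remotes/origin refs/tags)
    if [ -n "$refs" ]; then
        echo "ok: $sha is reachable from:"
        echo "$refs"
    else
        echo "WARNING: $sha is not on any branch or tag of origin" >&2
        return 1
    fi
}
```

A commit that exists only in a fork can still be fetched by id, as shown earlier, but it will not be contained in any of the upstream repo's branches or tags, so `check_submodule_pin depproj` would return non-zero for it.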
| Timeline | |||||
| -------- | |||||
| 2021-03-31 -- Reported to GitHub via HackerOne.<br> | |||||
| 2021-03-31 -- More info requested and provided.<br> | |||||
| 2021-04-01 -- Ack'd issue and started work on fix.<br> | |||||
| 2021-05-04 -- GitHub determined it was low risk, but did add a warning when viewing a commit.<br> | |||||
| 2021-05-05 -- Asked GitHub for disclosure timeline.<br> | |||||
| 2021-06-04 -- Pinged GitHub again.<br> | |||||
| 2021-07-07 -- Published blog post.<br> | |||||