The story of a personal data leak on GitHub
A story about one careless workshop participant from GeekBrains.

Background
As usual, I was searching GitHub for information about a company I was interested in.
This time, the road took me in a completely different direction.
I found a direct link to a document containing personal data that should never have been shared with anyone. The document is hosted on the website of the company I was interested in:

I assumed it was yet another SMS bomber, but it turned out to be something else.
When I opened the file, I immediately saw that it contained a login and password for the site highlighted in blue. That site is not affiliated with the company I was interested in:

Investigation
The commit history made the following clear: in the latest commit, a certain user had added 754 files containing personal data and other confidential information:

Examples of the data in those files:



How did it happen?
At this point, the story no longer had anything to do with the company I was interested in. So how did it happen? I asked myself the same question. I opened the user's Test repository, and the comments immediately made the cause clear:

Exploring the user's other repositories confirms the connection with GeekBrains:

The student learned how to fork the teacher's repository, create repositories, and upload files to them, without understanding what exactly was being uploaded.
Let’s take a look at this workshop:
Course program
Correspondence of groups and topics for the workshop.
What is a version control system
What is a version control system for?
Installing git on your PC (depending on the system)
Installing VSCode on your PC
What is a repository and instructions for creating local repositories.
Basic work with a local repository
What are branches and what are they for when working with a version control system.
Basic branching in git.
What is a remote repository and what is it for?
Basic work with remote GitHub repositories
How collaboration in version control systems is organized and why it is needed
Instructions for creating a pull request
Books and useful links on learning git.
Alternative version control systems.
Isn't a section on auditing repositories and on rules for deleting them missing?
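The kind of audit such a section could teach can be sketched in a few lines of Python. This is only a minimal sketch and not part of the GeekBrains course: the function name `find_risky_files`, the extension list, and the keyword list are hypothetical choices made for illustration. It walks a folder before it is added to a repository and flags files that look like documents or credentials.

```python
import os

# Hypothetical heuristics, chosen only for illustration; tune them for your own data.
RISKY_EXTENSIONS = {".xlsx", ".xls", ".docx", ".doc", ".pdf", ".csv", ".pem", ".key"}
RISKY_KEYWORDS = ("password", "passport", "login", "secret")


def find_risky_files(root):
    """Return sorted relative paths under `root` that deserve a look before `git add`."""
    flagged = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            lowered = name.lower()
            extension = os.path.splitext(lowered)[1]
            if extension in RISKY_EXTENSIONS or any(k in lowered for k in RISKY_KEYWORDS):
                flagged.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(flagged)


if __name__ == "__main__":
    for path in find_risky_files("."):
        print("review before committing:", path)
```

Running it from the folder you are about to turn into a repository prints every file worth a second look. A name-based check like this is crude and will miss plenty, so it complements reading the output of `git status` before each commit rather than replacing it.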
All the files discussed earlier ended up in the repository straight from the desktop, arranged in folders:

After also studying the history of the user's actions, I finally understood how it all happened.
A certain Anastasia, a GeekBrains student, did her coursework on a shared computer that other family members also used, including for work involving the processing of drivers' personal data. As a result, during the training, the work folder on the desktop was mistakenly uploaded to GitHub along with other files in the Test repository.
Conclusions
A user who has access to other people's personal data, presumably without any legal right to it, is trying to get into IT;
GeekBrains tried to train this user;
Nothing came of it;
The student did not even understand what had been done;
The teacher had little control over what the student was doing;
Both the company that employs the user's relative as a contractor and the relative himself have been notified; no response so far;
During training, go through everything you are about to do in detail, twice; if something is unclear, do not hesitate to ask the lecturer or friends;
Lecturers do not look closely enough at what their students are doing.
PS
If you do not want to delete the repository, you need to remove the confidential data by following the instructions: Removing sensitive data from a repository – GitHub Docs
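Before rewriting history, it helps to know which paths were ever committed, because deleting a file in a later commit leaves it retrievable from earlier ones. Below is a minimal Python sketch, assuming `git` is installed and the script is run inside the repository. The function names are hypothetical, and the actual removal should still follow the GitHub instructions.

```python
import subprocess


def paths_from_log(log_output):
    """Collect unique file paths from `git log --name-only` output."""
    return sorted({line.strip() for line in log_output.splitlines() if line.strip()})


def paths_ever_committed(repo_dir="."):
    """List every path touched by any commit reachable from any ref."""
    # --pretty=format: suppresses commit headers, leaving only file names.
    result = subprocess.run(
        ["git", "log", "--all", "--name-only", "--pretty=format:"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return paths_from_log(result.stdout)


if __name__ == "__main__":
    for path in paths_ever_committed():
        print(path)
```

Any path in that list that should never have been public is a candidate for removal from history, not just from the current working tree.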
Just today, in one of my own repositories, I found a Google API key that I had exposed back in 2019:

Be careful.
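A check for keys like the one mentioned above can be sketched in Python. This is a rough sketch under stated assumptions: the pattern is the commonly cited shape of a Google API key (`AIza` followed by 35 URL-safe characters), not an official specification, and the name `find_google_api_keys` is hypothetical.

```python
import re
import sys

# Commonly cited shape of a Google API key: "AIza" plus 35 URL-safe characters.
# This is an assumption for illustration, not an official specification.
GOOGLE_API_KEY_PATTERN = re.compile(r"AIza[0-9A-Za-z_\-]{35}")


def find_google_api_keys(text):
    """Return (line_number, match) pairs for strings that look like Google API keys."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in GOOGLE_API_KEY_PATTERN.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


if __name__ == "__main__":
    # Usage: pass the files to scan as command-line arguments.
    for filename in sys.argv[1:]:
        with open(filename, encoding="utf-8", errors="ignore") as handle:
            for line_number, key in find_google_api_keys(handle.read()):
                # Print only a prefix so the scan itself does not leak the key.
                print(f"{filename}:{line_number}: {key[:8]}...")
```

A match only means that a string looks like a key. Any key that was ever pushed should be treated as compromised and revoked, even after it is removed from the repository.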