Determining the location by commits in git
To start, here are maps of the “possible” locations of Telegram and React developers.
Parameters:
accuracy: roughly country level;
the guess can be off by up to a thousand km;
the probability of error is still ~20%.
Facts:
Current algorithm:
look at the time zone in the commit timestamps;
some time zones have only one major city (for example: +4:30 Kabul, +5:45 Kathmandu, +10:30 Adelaide during its summer time);
some time zones are dominated by a single country (for example: +05:30 India, +12:00/+13:00 New Zealand);
when a zone covers N countries, we keep only those with a high probability of having an IT industry (for example: in the zone shared by Burkina Faso and the UK, we exclude Burkina Faso);
check the top-level domain of the email address (for example: .mil is mainly used by the US military);
check the mail provider (for example: qq.com is popular in China);
check for characteristic letters in commit messages (for example: ł for Poland, ß for Germany, ñ for Spain);
check for common surnames (for example: in the Korea/Japan zone, Kim and Park cover ~15 million Koreans, while Suzuki and Sato cover ~4 million Japanese).
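The steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the actual implementation of the tool: the lookup tables are tiny hypothetical samples, the function names `extract_offset` and `rank_countries` are made up for this example, and each commit is assumed to be a dict with `date` (ISO 8601 with offset), `email`, `name` and `message` keys.

```python
import re
from collections import Counter

# Illustrative lookup tables (assumptions for this sketch, not the tool's real data).
OFFSET_TO_COUNTRIES = {
    "+04:30": ["Afghanistan"],
    "+05:45": ["Nepal"],
    "+05:30": ["India"],
    "+09:00": ["South Korea", "Japan"],
    "+01:00": ["Germany", "Poland", "Spain"],
}
TLD_TO_COUNTRY = {"mil": "United States", "de": "Germany", "pl": "Poland", "es": "Spain"}
PROVIDER_TO_COUNTRY = {"qq.com": "China"}
CHAR_TO_COUNTRY = {"ł": "Poland", "ß": "Germany", "ñ": "Spain"}
SURNAME_TO_COUNTRY = {"kim": "South Korea", "park": "South Korea",
                      "suzuki": "Japan", "sato": "Japan"}

OFFSET_RE = re.compile(r"([+-]\d{2}:\d{2})$")


def extract_offset(iso_date):
    """Return the UTC offset (e.g. '+09:00') of an ISO 8601 timestamp, or None."""
    if iso_date.endswith("Z"):
        return "+00:00"
    match = OFFSET_RE.search(iso_date)
    return match.group(1) if match else None


def rank_countries(commits):
    """Return candidate countries for one author, most likely first."""
    offsets = Counter()
    hints = Counter()
    for commit in commits:
        offset = extract_offset(commit["date"])
        if offset:
            offsets[offset] += 1
        # Email hints: the mail provider and the top-level domain.
        domain = commit["email"].rsplit("@", 1)[-1].lower()
        tld = domain.rsplit(".", 1)[-1]
        for country in (PROVIDER_TO_COUNTRY.get(domain), TLD_TO_COUNTRY.get(tld)):
            if country:
                hints[country] += 1
        # Characteristic letters in the commit message.
        text = commit["message"].lower()
        for char, country in CHAR_TO_COUNTRY.items():
            if char in text:
                hints[country] += 1
        # Common surnames, checked against every word of the author name.
        for token in commit["name"].lower().split():
            if token in SURNAME_TO_COUNTRY:
                hints[SURNAME_TO_COUNTRY[token]] += 1
    candidates = []
    if offsets:
        top_offset = offsets.most_common(1)[0][0]
        candidates = OFFSET_TO_COUNTRIES.get(top_offset, [])
    if not candidates:
        # Unknown time zone: fall back to the hints alone.
        return [country for country, _ in hints.most_common()]
    # The time zone narrows the range; the hints only reorder countries inside it.
    return sorted(candidates, key=lambda country: -hints[country])
```

The input could be collected with something like `git log --format='%aI|%ae|%an|%s'`, where `%aI` is the author date in strict ISO 8601 format. In this sketch the time zone acts as a hard filter and the other signals only reorder the countries inside the zone, which is one possible reading of the list above.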
What else can you do:
save the top 100 IT companies and their office addresses. Infer the company from the email (for example: ivan@luxoft.com most likely works at Luxoft), then match the company's offices against the current range of candidate countries;
if a person has committed a lot over a long period, build a histogram of their activity and compare the gaps in it with public holidays (for example: Christmas among Catholics, fiesta and siesta among the Spaniards, Independence Day in Papua New Guinea);
compare the location with other metrics and highlight on the map who is still working and who has already left (or who the core staff are). Then adjust the location of individual developers according to the position of the majority.
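The holiday idea could look roughly like the sketch below. It is a minimal illustration, not part of the current tool: the `HOLIDAYS` table is a tiny hand-picked sample, and the names `quiet_holiday_share` and `rank_by_holidays` are hypothetical. It builds a per-day histogram of commits and measures, for each candidate country, what share of its holidays have no commits at all.

```python
from collections import Counter
from datetime import date

# Illustrative sample of fixed-date public holidays as (month, day) pairs.
# A real table would need many more countries and movable holidays.
HOLIDAYS = {
    "Poland": [(5, 3), (11, 11)],
    "Spain": [(10, 12)],
    "Papua New Guinea": [(9, 16)],
}


def quiet_holiday_share(commit_dates, holidays):
    """Share of the given holidays on which the author never committed."""
    histogram = Counter((d.month, d.day) for d in commit_dates)
    quiet = [holiday for holiday in holidays if histogram[holiday] == 0]
    return len(quiet) / len(holidays)


def rank_by_holidays(commit_dates, candidates):
    """Rank candidate countries by how well their holidays match the gaps."""
    scores = {
        country: quiet_holiday_share(commit_dates, HOLIDAYS[country])
        for country in candidates
        if country in HOLIDAYS
    }
    return sorted(scores.items(), key=lambda item: -item[1])
```

A score of 1.0 means every listed holiday falls into a gap in the histogram. This only makes sense for authors with a long and dense history, because a sparse history produces gaps everywhere.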
Cons:
there are a lot of “ifs”, so there will be mistakes. The goal is not to guess right 100% of the time, but to guess right “for the majority”;
the algorithm is easy to fool, but “for most” people fooling it is a pointless task.
Yes, the method is not the most accurate. But the current implementation (bugs included) already guesses well, and with correct handling of the transitions between summer and winter time, plus a wider set of metrics, it will get even better. Sources here, online demo here.
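One way the summer/winter time transitions might be used is sketched below. This is an assumption about a possible refinement, not a description of the existing code: if one author shows exactly two offsets that differ by one hour, the months in which the larger offset appears hint at the hemisphere. The function name `dst_hemisphere` and the month sets are hypothetical choices for this example.

```python
from collections import Counter

# Months that are clearly summer in each hemisphere (a deliberate simplification;
# the transition months themselves are left out).
NORTHERN_SUMMER = {4, 5, 6, 7, 8, 9}
SOUTHERN_SUMMER = {11, 12, 1, 2}


def dst_hemisphere(samples):
    """Guess the hemisphere from (month, utc_offset_minutes) commit samples.

    Returns 'north', 'south', 'no-dst' or 'unknown'.
    """
    samples = list(samples)
    offsets = Counter(offset for _, offset in samples)
    if len(offsets) == 1:
        return "no-dst"
    if len(offsets) != 2:
        # No data, or more than two offsets (for example, a travelling author).
        return "unknown"
    low, high = sorted(offsets)
    if high - low != 60:
        return "unknown"
    # The larger offset is the summer-time offset.
    summer_months = [month for month, offset in samples if offset == high]
    north = sum(month in NORTHERN_SUMMER for month in summer_months)
    south = sum(month in SOUTHERN_SUMMER for month in summer_months)
    if north > south:
        return "north"
    if south > north:
        return "south"
    return "unknown"
```

For example, an author who commits at +10:30 in December and January and at +9:30 in July would be classified as `south`, which fits Adelaide, while a +1:00/+2:00 pattern with the larger offset in June points north.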
Packages for Python, Ruby, JS, PHP, Docker
Python:
installation: pipx install assayo
create report: assayo
Ruby:
installation: gem install assayo
create report: assayo
JS:
create report: npx assayo
PHP:
installation: composer require bakhirev/assayo
create report: vendor/bin/assayo
Docker:
image: https://hub.docker.com/r/bakhirev/assayo
PS: When I wrote this, there was no news about Linux yet. Now it feels somewhat strange, because this could potentially be used for mass bans. On the other hand, a tool that lists locations is not the reason for the bans, so it would be strange to delete this.