Determining location from git commits

For starters, here are maps of the “possible” locations of Telegram and React developers.

Telegram Desktop: 205 contributors in total, 3 of them core. Two (active since 2014 and 2019) are in the Samara / Caucasus zone (Armenia, Georgia, Azerbaijan), and one (active since 2018) is probably in Turkey.

ReactJS: 1854 contributors in total. Core team: 14 still active, 26 have left. Roughly half sit on the east coast of the US and half on the west coast.

Parameters:

  • resolution: roughly country level;

  • the guess can be off by up to a thousand km;

  • the probability of error is still ~20%.

Facts:

Current algorithm:

  • look at the time zone in the commit timestamps;

  • some time zones have only one major city (for example: +4:30 Kabul, +5:45 Kathmandu, +10:30 Adelaide);

  • some time zones are dominated by one country (for example: +05:30 India, +12:00/+13:00 New Zealand);

  • when a zone covers N countries, we keep only those with a sizeable IT industry (for example: in the Burkina Faso / UK zone, we exclude Burkina Faso);

  • we check the top-level domain of the email address (for example: mil is used mainly by the US military);

  • we check the mail provider (for example: Chinese users prefer qq.com);

  • we check for unique characters in commit texts (for example: ł for Poland, ß for Germany, ñ for Spain);

  • we check popular surnames (for example: in the Korea/Japan zone, Kim and Park cover ~15 million Koreans, while Suzuki and Sato cover ~4 million Japanese).
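
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual assayo implementation: the lookup tables are tiny hypothetical samples built from the examples in the list, and the function name `guess_country` is made up for this sketch. It assumes commits were exported with something like `git log --pretty=format:'%an%x09%ae%x09%aI%x09%s'`, so that every date carries its UTC offset.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample tables; a real tool would need far more data.
OFFSET_TO_COUNTRIES = {        # UTC offset in minutes -> candidate countries
    0: ["United Kingdom"],     # Burkina Faso excluded, as in the text
    60: ["Germany", "Poland", "Spain"],
    270: ["Afghanistan"],
    330: ["India"],
    345: ["Nepal"],
    540: ["Japan", "South Korea"],
}
TLD_HINTS = {"mil": "United States"}
MAIL_SERVER_HINTS = {"qq.com": "China"}
CHAR_HINTS = {"ł": "Poland", "ß": "Germany", "ñ": "Spain"}
SURNAME_HINTS = {"kim": "South Korea", "park": "South Korea",
                 "suzuki": "Japan", "sato": "Japan"}


def guess_country(commits):
    """Guess a country for one author.

    commits: iterable of (name, email, iso_date, subject) tuples.
    Returns a country name, or None when nothing can be said.
    """
    offsets = Counter()
    hints = Counter()
    for name, email, iso_date, subject in commits:
        offset = datetime.fromisoformat(iso_date).utcoffset()
        if offset is not None:
            offsets[int(offset.total_seconds() // 60)] += 1

        # Email: full mail server first, then the top-level domain.
        domain = email.rsplit("@", 1)[-1].lower()
        tld = domain.rsplit(".", 1)[-1]
        if domain in MAIL_SERVER_HINTS:
            hints[MAIL_SERVER_HINTS[domain]] += 1
        if tld in TLD_HINTS:
            hints[TLD_HINTS[tld]] += 1

        # Characters that are typical for one language.
        for char, country in CHAR_HINTS.items():
            if char in subject:
                hints[country] += 1

        # Popular surnames; any token counts because name order varies.
        for token in name.lower().split():
            if token in SURNAME_HINTS:
                hints[SURNAME_HINTS[token]] += 1

    if not offsets:
        return None
    main_offset = offsets.most_common(1)[0][0]
    candidates = OFFSET_TO_COUNTRIES.get(main_offset, [])
    if not candidates:
        # Unknown zone: fall back to the strongest hint, if any.
        return hints.most_common(1)[0][0] if hints else None
    # The time zone narrows the list; hints pick a country within it.
    return max(candidates, key=lambda country: hints[country])
```

For example, `guess_country([("Jan Kowalski", "jan@example.com", "2024-01-10T10:00:00+01:00", "Dodano obsługę błędów")])` picks Poland: the +01:00 offset leaves three candidates and the ł in the commit message decides between them. On a tie the first candidate in the list wins, which is an arbitrary choice made for this sketch.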

What else can you do:

  • store the top 100 IT companies and their office addresses. Work out the company from the email (for example: ivan@luxoft.com most likely works at Luxoft), then match the email, the offices and the current list of candidate countries;

  • if a person has committed a lot over a long period, build a histogram of activity and compare the gaps in it with public holidays (for example: Christmas for Catholics, fiesta and siesta for Spaniards, Independence Day in Papua New Guinea);

  • compare the location with other metrics and highlight on the map who is still working and who has left (or who belongs to the core team). Then adjust the location of individual people according to where the majority is.
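
The holiday idea from the list above could look something like this. It is only a sketch under assumptions: the `HOLIDAYS` table is a hypothetical sample with a few fixed-date holidays, and the function names are invented here. The score is the average number of commits per holiday, so the country whose holidays show the least activity comes first.

```python
from collections import Counter
from datetime import date

# Hypothetical sample: a few fixed-date public holidays as (month, day).
HOLIDAYS = {
    "Papua New Guinea": {(9, 16)},       # Independence Day
    "Spain": {(10, 12), (12, 25)},       # National Day, Christmas
    "United States": {(7, 4), (12, 25)}, # Independence Day, Christmas
}


def holiday_activity(commit_dates, holidays):
    """Average number of commits made on the given holidays."""
    per_day = Counter((d.month, d.day) for d in commit_dates)
    return sum(per_day[day] for day in holidays) / len(holidays)


def rank_by_holiday_gaps(commit_dates, candidates):
    """Sort candidate countries: quietest holidays first."""
    return sorted(
        candidates,
        key=lambda country: holiday_activity(commit_dates, HOLIDAYS[country]),
    )
```

This only makes sense for authors with several years of dense history, and it should refine a candidate list that the time zone has already narrowed rather than replace it.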

Cons:

  • there are a lot of “ifs”, so there will be mistakes. My goal is not to guess right 100% of the time, but to guess right “for the majority”;

  • the algorithm is easy to fool, but most people have no reason to bother.

Yes, the method is not the most accurate. But the current implementation (bugs included) already guesses well, and it will get even better once it handles summer and winter time transitions correctly and uses more metrics. Sources here, online demo here.
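
One possible way to use summer and winter time, sketched under assumptions (this is not how assayo does it, and `classify_dst` is a name made up here): if an author's commits show exactly two offsets one hour apart, the larger one is summer time, and the months in which it appears tell the hemisphere. That alone separates, say, Central Europe from countries at the same winter offset that do not switch clocks.

```python
from collections import Counter, defaultdict
from datetime import datetime


def classify_dst(iso_dates):
    """Classify one author's clock-change pattern.

    iso_dates: ISO 8601 strings with a UTC offset (e.g. from git's %aI).
    Returns "none", "northern", "southern" or "unclear".
    """
    months_by_offset = defaultdict(Counter)
    for iso in iso_dates:
        dt = datetime.fromisoformat(iso)
        offset = dt.utcoffset()
        if offset is None:
            continue
        minutes = int(offset.total_seconds() // 60)
        months_by_offset[minutes][dt.month] += 1

    if len(months_by_offset) == 1:
        return "none"      # one offset all year (or too few commits)
    if len(months_by_offset) != 2:
        return "unclear"   # no data, or the author moved or travelled
    low, high = sorted(months_by_offset)
    if high - low != 60:
        return "unclear"   # not a one-hour summer/winter pair

    # The larger offset is summer time; see when it was observed.
    summer = months_by_offset[high]
    north = sum(summer[m] for m in (5, 6, 7, 8))
    south = sum(summer[m] for m in (11, 12, 1, 2))
    if north > south:
        return "northern"
    if south > north:
        return "southern"
    return "unclear"
```

Only months deep inside the summer-time period are counted, because the exact switch dates differ between countries.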

Packages for Python, Ruby, JS, PHP, Docker

Python:
installation: pipx install assayo
create report: assayo

Ruby:
installation: gem install assayo
create report: assayo

JS:
create report: npx assayo

PHP:
installation: composer require bakhirev/assayo
create report: vendor/bin/assayo

Docker:
image: https://hub.docker.com/r/bakhirev/assayo

PS: When I wrote this, the news about Linux had not come out yet. Now it feels a bit awkward, because this could potentially be used for mass bans. On the other hand, the cause of the bans is not the tool that produces the list, so it would be strange to delete this.
