Ranking System v1 #960

CrafterKolyan · 2021-03-22T16:55:50Z

Using a normal distribution is not justified at all, and since its support is (-inf; inf) you get some problems (e.g. #883, #455). The exponential distribution's support is [0; inf), which means that if we calculate the survival function (same as 1 - cdf (https://en.wikipedia.org/wiki/Survival_function#Definition)), then a person with all-zero stats gets a score equal to 100, which makes a lot of sense. Also, in practice the exponential distribution describes a person's activity quite accurately. See the examples below:

This is the real activity distribution histogram (taken from https://movespring.com/blog/how-to-set-a-goal-for-your-next-activity-or-step-challenge-5f74c65ac49982000764facf):

This is the exponential distribution histogram with different parameters:

Here is one more example with real distribution (as blue) and fitted exponential distribution (as red):

The next step is to estimate the parameters of the exponential distribution from the real distribution. In my opinion, the Method of Moments (https://en.wikipedia.org/wiki/Method_of_moments_(statistics)) is the easiest to understand and comes from a single property we would like to have: the expectation of our parameterized distribution should equal the expectation of the real distribution (see the *_VALUE variables in the code). Since the expectation of an exponential distribution with parameter \lambda is 1 / \lambda, if we know the expectation of the real distribution (which is the average over users), then the estimated \lambda = 1 / expectation.
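As a minimal sketch of the estimation step described above (not the code from this PR): the function names and the sample star counts below are made up for illustration, and the real constants are the *_VALUE variables in src/calculateRank.js.

```javascript
// Method-of-moments fit for an exponential distribution:
// match the sample mean to the distribution mean 1 / lambda.
function fitLambda(samples) {
  const mean = samples.reduce((sum, x) => sum + x, 0) / samples.length;
  return 1 / mean;
}

// Survival function of Exp(lambda): P(X > x) = exp(-lambda * x) for x >= 0.
// Negative inputs are clamped to 0 because the support is [0; inf).
function survival(lambda, x) {
  return Math.exp(-lambda * Math.max(x, 0));
}

// Hypothetical star counts for a handful of users (made-up data).
const sampleStars = [0, 5, 10, 25, 60];
const starsLambda = fitLambda(sampleStars); // sample mean is 20

console.log(starsLambda);
console.log(survival(starsLambda, 0)); // a user with 0 stars: nobody is worse
console.log(survival(starsLambda, 20)); // a user exactly at the mean
```

A user with zero stars gets a survival value of 1 (score 100 after scaling), and a user exactly at the mean gets exp(-1), roughly 0.37.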

Now we have 7 distributions over different aspects of a GitHub profile: Commits, Contributions, Issues, PRs, Stars, Followers, Repositories. We can measure how "good" a GitHub profile is in each aspect by evaluating the survival function of each of these 7 distributions at the points corresponding to the profile's stats (the lower the value, the better). To get a single number from 7 numbers we could, for example, take their average, but that wouldn't be good: a person who is great in one aspect and bad in others (e.g. Linus Torvalds, with only 2 repositories and low stats in PRs and Issues but a ton of stars and followers) would never get an S+ rank. So we need an aggregate function that rewards a low value in at least one aspect. One such function is min(...) over the 7 aspects, but this doesn't encourage you to develop any aspect except the one you are best in, so we will use the harmonic mean (https://en.wikipedia.org/wiki/Harmonic_mean), which fits our needs (it is dominated by the smallest value when that value is much less than the others, much like min(...), and it also has a non-zero gradient with respect to each variable).
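To make the comparison of aggregates concrete, here is a small sketch with made-up survival values (one outstanding aspect, six mediocre ones). The helper names are hypothetical and this is not the PR's implementation.

```javascript
// Harmonic mean: n / (1/v1 + 1/v2 + ... + 1/vn).
// In JavaScript 1 / 0 is Infinity, so a single exact 0 yields a result of 0.
function harmonicMean(values) {
  const sumOfInverses = values.reduce((sum, v) => sum + 1 / v, 0);
  return values.length / sumOfInverses;
}

function arithmeticMean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Hypothetical survival values for the 7 aspects (lower is better).
const ranks = [1e-6, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5];

console.log(arithmeticMean(ranks)); // about 0.43: the one great aspect is drowned out
console.log(Math.min(...ranks));    // 1e-6: only the best aspect counts
console.log(harmonicMean(ranks));   // about 7e-6: close to n * min, still tiny
```

Note that with one dominant value the harmonic mean is roughly n times the minimum rather than equal to it, while the other six values still nudge the result slightly.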

As you can see from tests:

A decent user (400 stars and 100 followers) has an A++ rank
Newly created account has lowest possible score = 100
Linus Torvalds has the highest possible score = 0 (this wasn't intended; I just noticed it when I ran the script)

IMPORTANT NOTICE
Current values are not set to be equal to 1 / <average stat over users> as I couldn't find any official (and even unofficial) statistics referring to these. So they are just set to what I see as an average Github User.

vercel · 2021-03-22T16:55:55Z

@CrafterKolyan is attempting to deploy a commit to the github readme stats Team on Vercel.

A member of the Team first needs to authorize it.

codecov · 2021-03-22T17:02:57Z

Codecov Report

Merging #960 (6f370e3) into master (86b9ad6) will increase coverage by 0.28%.
The diff coverage is 100.00%.

❗ Current head 6f370e3 differs from pull request most recent head 034f47c. Consider uploading reports for the commit 034f47c to get more accurate results

@@            Coverage Diff             @@
##           master     #960      +/-   ##
==========================================
+ Coverage   93.98%   94.26%   +0.28%     
==========================================
  Files          22       22              
  Lines         682      663      -19     
  Branches      191      185       -6     
==========================================
- Hits          641      625      -16     
+ Misses         37       34       -3     
  Partials        4        4

Impacted Files	Coverage Δ
src/calculateRank.js	`96.42% <100.00%> (+4.93%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 86b9ad6...034f47c. Read the comment docs.

tkrotoff · 2021-04-22T17:20:43Z

I hate bots that close issues and now they also close PRs 😲

atinba · 2021-04-25T10:07:06Z

This seems much better than the current ranking system. Why hasn't this been merged yet?

CrafterKolyan · 2021-04-27T10:46:45Z

I think the main problem is that @anuraghazra needs a lot of time to fully understand the solution and also he may want to do some extra testing on his side rather than rely on my research. (But maybe he simply missed this PR)

anuraghazra · 2021-07-18T12:40:17Z

> I think the main problem is that @anuraghazra needs a lot of time to fully understand the solution and also he may want to do some extra testing on his side rather than rely on my research. (But maybe he simply missed this PR)

Oh hi! So I just looked at it. Actually, I'm very cautious when it comes to changing these stats calculations because people will go mad if they see their ranks are not the same and a breaking change happened.
Also another reason is that I SUCK AT MATHS.

But this PR and your description look very promising.
@francois-rozet on #1186 seems to be doing the same thing but with a few differences. It would be great if you folks could help me review and pick the better stats algorithm.

francois-rozet · 2021-07-18T13:12:39Z

Hi @anuraghazra,

The principle is very similar in this PR and #1186. Each metric (repos, commits, stars, ...) is associated with its own rank. For instance, the "stars" rank is computed as

stars_rank = exp(-stars / STARS_MEAN)

which ranges from 0 (no one is better) to 1 (everyone is better).

The difference lies in how we aggregate the individual ranks.

In this PR, the author considers that if you are extremely good in one metric, your overall rank should be as well. This is done as

rank = 7 / ( 1 / stars_rank + 1 / commits_rank + 1 / followers_rank + ...)

so if followers_rank = 0.000001 (extremely good), rank will be mostly influenced by followers_rank. I don't think this is a good idea because if someone has literally 0 commits/repos/stars but a ton of followers, his rank would still be S+.

For instance, this user (esin) has 2.5k followers. In this PR, he would get S+. In mine he is an A (almost A+).

In #1186, the overall rank is a weighted average of the individual ranks.

rank = (1 * stars_rank + 0.25 * commits_rank + 0.5 * followers_rank + ...) / (1 + 0.25 + 0.5 + ...)

This prevents the problem mentioned above, but it also means that, unless you are perfect (ranks = 0.) everywhere, your overall rank will not be perfect. The weights are there to mitigate this by reducing the impact of commits_rank with respect to stars_rank, for example.
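The difference between the two aggregations can be sketched as follows. This is only an illustration restricted to three metrics: the MEANS values and the followersOnly profile are made up, and only the weights 1, 0.25 and 0.5 come from the formula above.

```javascript
// Per-metric rank, as in stars_rank = exp(-stars / STARS_MEAN).
const metricRank = (value, mean) => Math.exp(-value / mean);

// Hypothetical means; the actual constants live in the two PRs.
const MEANS = { stars: 50, commits: 250, followers: 10 };
// Weights taken from the weighted-average formula above.
const WEIGHTS = { stars: 1, commits: 0.25, followers: 0.5 };
const METRICS = Object.keys(MEANS);

// Aggregation of this PR: harmonic mean of the individual ranks.
function harmonicRank(stats) {
  const sumOfInverses = METRICS.reduce(
    (sum, k) => sum + 1 / metricRank(stats[k], MEANS[k]),
    0
  );
  return METRICS.length / sumOfInverses;
}

// Aggregation of #1186: weighted average of the individual ranks.
function weightedRank(stats) {
  const weightedSum = METRICS.reduce(
    (sum, k) => sum + WEIGHTS[k] * metricRank(stats[k], MEANS[k]),
    0
  );
  const totalWeight = METRICS.reduce((sum, k) => sum + WEIGHTS[k], 0);
  return weightedSum / totalWeight;
}

// Made-up profile: lots of followers, nothing else.
const followersOnly = { stars: 0, commits: 0, followers: 2500 };

console.log(harmonicRank(followersOnly)); // practically 0, i.e. top rank
console.log(weightedRank(followersOnly)); // about 0.71, far from top rank
```

With these assumed numbers the harmonic mean collapses to almost 0 because of the single tiny followers rank, while the weighted average stays near (1 + 0.25) / 1.75.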

The reason why Linus Torvalds is not S+ is that he doesn't have a lot of repos compared to the average user (only 4 instead of 10) and not a lot of PRs/issues. However, this is easily changed: you can either reduce the "weight" of repos_rank or set REPOS_MEAN a bit lower (e.g. 5).

I've edited the weights so that Linus Torvalds is now S+, check #1186.

CrafterKolyan · 2021-07-20T12:57:35Z

Hi @anuraghazra.

I understand your fears about changing the algorithm. Of course, almost nobody would like to find out that they are not that good compared to others, and almost nobody would share such a grade for their work on their profile. To be honest, I'm not sure whether having a problem in the ranking algorithm is good or bad. It ranks people higher and gives them extra motivation and self-confidence, even though the algorithm may lie to them.

From my point of view, now that your application has become quite popular, people don't care much about the exact grading algorithm; they want to feel their significance to society, which is what they are given here. I feel that your "encouraging" algorithm can do more for the open-source community than my "strict mathematical" approach. It is not about math and programming but about psychology.

Anyway, I will keep my pull request open in case you want to change the calculation algorithm to something better, and also as a reference for those who are curious about how to approach this kind of ranking task.

dreamyguy · 2021-07-20T15:38:52Z

Just wanted to applaud @francois-rozet and @CrafterKolyan for their great take on this. 👏👏👏👏👏

I personally like the idea of starting from 0 and getting to the Moon, it gives me a much greater sense of accomplishment. 🚀 🌔

But I don't judge those who get a boost in confidence by starting with half a circle and an A+. Some days I feel like I need these...

Really happy with it as it is @anuraghazra, you've done amazing work!

vercel · 2021-09-07T14:25:43Z

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/github-readme-stats-team/github-readme-stats/92MYUQNT1iBguNAJw3JbXddgb4kz
✅ Preview: https://github-readme-stats-git-fork-cr-1839b0-github-readme-stats-team.vercel.app

anuraghazra · 2021-09-07T14:32:37Z

Okay I was just testing this out, planning to sort this ranking thing this week.

Will consider both of the PRs, and release it under experimental flags ?enable_experiments=new_ranking_system to get feedback from the community and roll out the change gradually.

@CrafterKolyan but I found this, how is this username getting S rank? (username=aju100)

With @francois-rozet's PR #1186 it is rank "A", which seems more correct.
From my experiments, @francois-rozet's changes seem more balanced overall.

francois-rozet · 2021-09-07T15:03:58Z

> @CrafterKolyan but I found this, how is this username getting S rank?

@anuraghazra It's because the user (aju100) has an outstanding number of repositories and, as mentioned in #960 (comment), a single good rank among the metrics leads to a good overall rank in #960, but not in #1186.

Thank you for taking the time to sort this out!

anuraghazra · 2021-09-07T15:07:44Z

Ahh, I see. aju100 has only 100 repos, which is maybe not that much, but anyway it should not be S rank.

francois-rozet · 2021-09-07T15:13:27Z

It is not that much, but much more than the average user. I should mention that the number of repos is not taken into account in #1186 (otherwise Linus Torvalds would not be S+).

rickstaa · 2021-11-08T13:54:11Z

First of all, @francois-rozet and @CrafterKolyan, thanks a lot for addressing this topic. Here are my two cents.

Overall, I think @francois-rozet's algorithm is better balanced. I agree with @francois-rozet that @CrafterKolyan's algorithm creates an incorrect score when somebody has a lot of followers but 0 commits/repos/stars (see #960 (comment)). If @CrafterKolyan fixed this, I would have no preference between the two implementations.

I, however, also see one shortcoming in @francois-rozet's implementation. The current version does not take the number of contributions into account. I understand why totalRepos is unused: people can fork a lot of repositories while doing nothing with them, and their score would still increase. However, the number of contributions should be considered, since a person who contributed to several critical open-source repositories has more impact than somebody who contributed to one repository. I noticed that this is one of the problems people have with the current implementation (see #1425).

francois-rozet · 2021-11-08T13:57:07Z

Hello @rickstaa, the reason why I don't consider contributions is that they are redundant with PRs, issues and commits. Since I take the latter into account, I don't need the former.

rickstaa · 2021-11-08T14:02:07Z

@francois-rozet Good point, you are right. I overlooked that fact while quickly scanning your code to answer #1425. In that case, I think we should go with @francois-rozet's algorithm.

CrafterKolyan · 2021-11-09T06:22:55Z

> First of all, @francois-rozet and @CrafterKolyan, thanks a lot for addressing this topic. Here are my two cents.
>
> Overall, I think @francois-rozet's algorithm is better balanced. I agree with @francois-rozet that @CrafterKolyan's algorithm creates an incorrect score when somebody has a lot of followers but 0 commits/repos/stars (see #960 (comment)). If @CrafterKolyan fixed this, I would have no preference between the two implementations.
>
> I, however, also see one shortcoming in @francois-rozet's implementation. The current version does not take the number of contributions into account. I understand why totalRepos is unused: people can fork a lot of repositories while doing nothing with them, and their score would still increase. However, the number of contributions should be considered, since a person who contributed to several critical open-source repositories has more impact than somebody who contributed to one repository. I noticed that this is one of the problems people have with the current implementation (see #1425).

To be honest, I don't see the problem with many followers and 0 stars/commits/etc. The follower count is the hardest statistic to manipulate, as GitHub has some system to prevent multi-accounting.

francois-rozet · 2021-11-09T06:45:24Z

IMHO, the rank should measure your stats as a developer not as an "influencer". Having tons of followers does not make you a good developer, GitHub is not Twitter or Instagram...

Also, the problem does not arise only with followers. Someone with a very large number of empty repos but nothing else still gets S+. Same for commits, issues, ... you get it.

Anyway, I would rather have your version of the rank than the one currently implemented, but @anuraghazra seemingly abandoned the idea...

markus-wa · 2021-12-08T12:36:27Z

👋 is this still coming @anuraghazra ? 🙂

smart2pet · 2022-12-22T22:46:15Z

Probably a lot of people are stuck at A+ like me.

smart2pet

In my mind, this solution for the ranking system is good enough. @anuraghazra Please look here.

rickstaa · 2022-12-27T23:04:18Z

> In my mind, this solution for the ranking system is good enough. @anuraghazra Please look here.

My preference is with #1186.

chrisK824 · 2023-03-25T00:15:23Z

Hey @rickstaa @anuraghazra will this ranking system be adopted after all?

rickstaa · 2023-03-27T10:03:19Z

> Hey @rickstaa @anuraghazra will this ranking system be adopted after all?

I am in favour of merging #1186 since it is more balanced (see #1186 (comment)). I, however, would like to have @anuraghazra's opinion before making such a breaking change.

rickstaa · 2023-04-08T10:10:46Z

Closing, in favour of #1186.

Change calculation algorithm to use exponential distribution

24a32f1

CrafterKolyan added 2 commits March 22, 2021 19:56

export module

46a98fb

Update calculateRank.test.js

159296a

CrafterKolyan added 6 commits March 22, 2021 20:11

Update calculateRank.test.js

e765a4e

Update calculateRank.js

9d8d98c

Update calculateRank.test.js

8a6060b

Update calculateRank.test.js

970bb2f

Update calculateRank.js

ebfdd75

Update calculateRank.test.js

48a5014

CrafterKolyan changed the title ~~Change rank algorithm so it would be possible to get B+ rank (Fix #883)~~ Change rank algorithm so it would be possible to get B+ rank (Fix #883 Fix #455) Mar 23, 2021

Refactor variable name

6f370e3

CrafterKolyan mentioned this pull request Mar 30, 2021

Improve ranking system #455

Closed

5 tasks

stale bot added the stale Issue is marked as stale. label Apr 22, 2021

stale bot removed the stale Issue is marked as stale. label Apr 22, 2021

GaryHilares approved these changes May 5, 2021

GaryHilares mentioned this pull request Jun 25, 2021

Important Github-Readme-Stats PR anuraghazra/anuraghazra#42

Closed

francois-rozet mentioned this pull request Jul 13, 2021

Ranking System v2 #1186

Merged

Repository owner deleted a comment from stale bot Jul 18, 2021

anuraghazra added enhancement New feature or request. stats-card Feature, Enhancement, Fixes related to stats the stats card. labels Jul 18, 2021

Refactor constants names for lambdas

034f47c

Dosx001 mentioned this pull request Aug 30, 2021

Changed ranking system #1273

Closed

anuraghazra changed the title ~~Change rank algorithm so it would be possible to get B+ rank (Fix #883 Fix #455)~~ Ranking System v1 Sep 7, 2021

This was referenced Oct 1, 2021

Followers #1347

Closed

Retire letter-based grades. Instead display (mode rating - rating)/sigma. #1369

Closed

rickstaa mentioned this pull request Nov 8, 2021

Ranking does not appear to reflect contributions #1425

Closed

rickstaa mentioned this pull request Aug 10, 2022

Top Issues Dashboard #1935

Open

rickstaa mentioned this pull request Oct 2, 2022

Lower ranking system from A++ -> A+ and A+ -> A #2089

Closed

rickstaa added the ranks Feature, Bug fix, improvement related to ranking system. label Oct 8, 2022

rickstaa mentioned this pull request Nov 18, 2022

Change the ranking/level system #2265

Closed

smart2pet approved these changes Dec 27, 2022

rickstaa force-pushed the master branch 2 times, most recently from 86aafe8 to 8bc69e7 Compare January 21, 2023 16:47

rickstaa closed this Apr 8, 2023
