Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconcile Gitcoin Round IDs #2583

Closed
ccerv1 opened this issue Dec 4, 2024 · 0 comments · Fixed by #2600
Closed

Reconcile Gitcoin Round IDs #2583

ccerv1 opened this issue Dec 4, 2024 · 0 comments · Fixed by #2600
Assignees
Labels
c:data Gathering data (e.g. indexing)

Comments

@ccerv1
Copy link
Member

ccerv1 commented Dec 4, 2024

What is it?

From Slack chat with Gitcoin team:

I got deeply rabbit-holed trying to create my own roundId logic based on what’s in the all_matching and all_donations tables. And then I saw a table that I think I need access to called public.Round, which is used in this view. I’ve diffed the results I get from both sources and there are a lot of holes / discrepancies, which I think is due to the roundId logic being inconsistent.

FYI, here is the query I’m using to construct my version of rounds. Let me know if this is incorrect somehow. In this case, I’m only trying to look at GrantsStack rounds. Basically, I’m using the round_id and the chain_id as my indices:

with donations as (
  select
    round_id,
    chain_id,
    array_agg(distinct round_name) as round_names,
    sum(amount_in_usd) as donation_amount_in_usd
  from `gitcoin.all_donations`
  where
    `source` = 'GrantsStack'
    and round_name is not null
  group by 1,2
),
matching as (
  select
    round_id,
    chain_id,
    sum(match_amount_in_usd) as matching_amount_in_usd
  from `gitcoin.all_matching`
  group by 1,2
)

select
  m.round_id,
  m.chain_id,
  m.matching_amount_in_usd,
  d.donation_amount_in_usd,
  d.round_names
from matching m
join donations d
  on d.round_id = m.round_id
  and d.chain_id = m.chain_id
order by m.matching_amount_in_usd desc

Here are the issues that are puzzling me:

  • Some roundIds are hashes and others are integers
  • The roundIds in here are all hashes. Are the integers in all_donations and all_matching the nonce or something else related to the transaction that initialized the round???
  • Even in cases where there’s a clean join, the matching fund numbers are not adding up. Why could this be?
  • Is there any formal logic or heuristics for filtering out test rounds?

Response from Henry:

Some roundIds are hashes and others are integers

this one is a legacy of Allo v1 vs Allo v2. integers are v2

PAID rounds stopped around Feb 2024 so I would assume this is allo v1 only rounds

Is there any formal logic or heuristics for filtering out test rounds?

its not complete but we maintain a list of over 600 known spam rounds here

Some additional discrepencies:

These rounds appear in the matching table, but not the donations table. Any idea why this could be the case?

with donations as (
  select
    round_id,
    chain_id,
    array_agg(distinct round_name) as round_names,
    sum(amount_in_usd) as donation_amount_in_usd
  from `gitcoin.all_donations`
  where
    `source` = 'GrantsStack'
    and round_name is not null
  group by 1,2
),
matching as (
  select
    round_id,
    chain_id,
    sum(match_amount_in_usd) as matching_amount_in_usd
  from `gitcoin.all_matching`
  where round_num > 16
  group by 1,2
)

select
  m.round_id,
  m.chain_id,
  m.matching_amount_in_usd,
  d.donation_amount_in_usd
  --d.round_names
from matching m
left join donations d
  on d.round_id = m.round_id
  and d.chain_id = m.chain_id
where d.round_id is null -- Filter to show rounds in matching without donations
order by m.matching_amount_in_usd desc;

Meanwhile, Round 17 data appears in the donations table but not in the matching table:

with donations as (
  select
    round_id,
    chain_id,
    array_agg(distinct round_name) as round_names,
    sum(amount_in_usd) as donation_amount_in_usd
  from `gitcoin.all_donations`
  where round_num = 17
  group by 1,2
),
matching as (
  select
    round_id,
    chain_id,
    sum(match_amount_in_usd) as matching_amount_in_usd
  from `gitcoin.all_matching`
  group by 1,2
)

select
  d.round_id,
  d.chain_id,
  m.matching_amount_in_usd,
  d.donation_amount_in_usd,
  d.round_names
from donations d
left join matching m
  on d.round_id = m.round_id
  and d.chain_id = m.chain_id
order by d.donation_amount_in_usd desc
@github-project-automation github-project-automation bot moved this to Backlog in OSO Dec 4, 2024
@ccerv1 ccerv1 self-assigned this Dec 4, 2024
@ccerv1 ccerv1 added the c:data Gathering data (e.g. indexing) label Dec 4, 2024
@ccerv1 ccerv1 added this to the [f] Funding Event Data milestone Dec 4, 2024
@ccerv1 ccerv1 moved this from Backlog to Blocked in OSO Dec 4, 2024
@ccerv1 ccerv1 moved this from Blocked to In Progress in OSO Dec 4, 2024
@ccerv1 ccerv1 linked a pull request Dec 6, 2024 that will close this issue
@github-project-automation github-project-automation bot moved this from In Progress to Done in OSO Dec 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c:data Gathering data (e.g. indexing)
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

1 participant