Hmm, right now pupa doesn't delete any top-level objects, so we'd need to think carefully about how this would be handled. I'm open to discussion/proposals but tend to think that we should favor some other mechanism here.
Thanks for the reply, @jamesturk, and I agree this is something we'd want to think about carefully before implementation. I've opened a new issue, which broadens our conversation: #295
Recently, LA Metro had "duplicate" events in Legistar (i.e., same name and time, but different EventId):
http://webapi.legistar.com/v1/metro/events/1265
http://webapi.legistar.com/v1/metro/events/1259 (now defunct)
The scrapers for Metro run multiple times per day, and at the time of a scrape, both events were present.
We use the EventId to create each event's unique instance of the Identifier class, so the importer had no way of knowing that these two records were the same event.
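To illustrate why identifier-based matching misses this case, here is a minimal sketch that groups events by name and start time instead of EventId. The function name, the dictionary keys, and the sample names and times are all hypothetical and invented for illustration; none of this is pupa's or Legistar's actual API or data.

```python
from collections import defaultdict


def find_duplicate_groups(events):
    """Group events by (name, start) and return groups with more than one EventId.

    `events` is assumed to be a list of dicts with hypothetical keys
    "event_id", "name", and "start".
    """
    groups = defaultdict(list)
    for event in events:
        key = (event["name"], event["start"])
        groups[key].append(event["event_id"])
    # Keep only the keys that map to several distinct EventIds.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Made-up sample data: two records share a name and time but not an EventId.
events = [
    {"event_id": 1259, "name": "Example Committee", "start": "2017-01-01T09:00"},
    {"event_id": 1265, "name": "Example Committee", "start": "2017-01-01T09:00"},
    {"event_id": 1300, "name": "Other Meeting", "start": "2017-01-02T10:00"},
]
duplicates = find_duplicate_groups(events)
```

An importer keyed only on EventId would treat 1259 and 1265 as unrelated, whereas a name-and-time key puts them in the same group.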
Let's add a "clean" scrape mode to pupa, i.e., a scrape that removes stored data that no longer appears in the Legistar Web API.
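As a rough sketch of what a "clean" pass might compute, assume the stored events and the freshly scraped events can each be reduced to a set of EventIds. The function name and data shapes here are hypothetical and are not part of pupa.

```python
def find_stale_event_ids(stored_ids, scraped_ids):
    """Return EventIds that are stored locally but absent from the latest scrape."""
    return set(stored_ids) - set(scraped_ids)


# Using the two EventIds from this issue: 1259 is gone from the API, 1265 remains.
stored = {1259, 1265}
scraped = {1265}
stale = find_stale_event_ids(stored, scraped)
```

One caveat with this approach: if a scrape is partial or fails midway, every event it did not reach would look stale, so a clean mode would probably need to run only after a complete scrape.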