RFC: Remove country name from Location field #224
Comments
Hmm, this seems like a good idea because it would improve data consistency,
but I will wait to see if anyone else voices their opinion.
On Wed, Jun 12, 2024, 12:08 PM Jon Banafato wrote:
I'd like to remove the country name from the Location field. This field
is both redundant with the more specific Country field and inconsistently
used (e.g. sometimes not used, sometimes with spelling variations like
"USA" vs. "United States of America"). I'm opening this issue first to:
1. get feedback in case there are cases where the country information
cannot be expressed with the three-letter country code
2. identify any tooling that uses this repository that would not
handle this change gracefully or would need additional updates in order to
do so
|
I think my rationale in the past for being more specific about location in certain cases was to reduce ambiguity. I agree removing the country from the location might be a good idea since we already have the three-letter code. I just want to make the point that the location should be specific enough to remove ambiguity, for example, if a country has two cities with the same name, like Lexington, Kentucky vs. Lexington, Massachusetts. |
I agree with this, and I'm not suggesting that we remove state / province / etc. kind of details, just the country information that's already stored in a dedicated field. |
Your third point is valid. Some of the three-letter codes are not immediately clear. It would take away from the readability of the CSV, especially if some people refer directly to the GitHub repository and not a third-party calendar or website. |
As a downstream user of the CSVs, it would be a minor inconvenience since I would have to adjust my scripts, but I already have to do a bunch of data cleaning anyway, so it would just be one more adjustment. I agree with most points, but to add something of substance: On the positive side, this would also circumvent "data problems" around the self-determination of countries, such as Turkiye asking not to be called Turkey and Czechia asking not to be called the Czech Republic. On the negative side, PyCon DE, with the 3-letter code DEU, would be thoroughly confusing for most people who don't already know the code. So, as long as the data is consistent across the data set, it'd probably be okay. But if it changes halfway through the 2024 file, I'd probably struggle slightly downstream. |
This would be another benefit of this change. A repository covering a set of global conferences is already going to encounter language and translation issues, so this would remove one point of confusion.
This is mostly a question of how end users are consuming this data. Automated tooling can perform lookups (e.g. we use https://pypi.org/project/iso3166/ for some CI here), and I would imagine most conference participants are either familiar with their local events or fine with clicking through to the conference website. This is good feedback, though, and the reason that I'm opening this issue up for discussion.
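For anyone worried about readability, here is a rough sketch of the kind of lookup a downstream tool could do to show a readable name next to the code. The table is a hand-written, hypothetical subset used only for illustration, and `display_country` is a made-up helper name. A real tool would presumably take the mapping from a maintained source such as the package linked above.

```python
# Hypothetical, hand-written subset of ISO 3166-1 alpha-3 codes.
# This is an illustration only, not data taken from the repository.
ALPHA3_TO_NAME = {
    "DEU": "Germany",
    "USA": "United States of America",
    "CZE": "Czechia",
    "TUR": "Türkiye",
}


def display_country(code: str) -> str:
    """Return 'Name (CODE)' for a known alpha-3 code, else the bare code."""
    normalized = code.strip().upper()
    name = ALPHA3_TO_NAME.get(normalized)
    # Fall back to the code itself so unknown values are never dropped.
    return f"{name} ({normalized})" if name else normalized
```

With this, a calendar or website could render `DEU` as `Germany (DEU)` without the CSV having to carry the country name in two places.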
Any change implemented here would be a global update in a single commit. As long as tools are able to deal with a new version of the data set, they shouldn't need to worry about supporting mixed formats. |
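To make the proposed one-time update concrete, a minimal sketch of what such a cleanup pass might look like follows. It assumes the CSV has `Location` and `Country` columns as discussed in this thread. The spelling table and the helper names `strip_country` and `migrate` are hypothetical, and this is not the script that would actually be used.

```python
import csv
import io

# Hypothetical spellings per country code, for illustration only.
COUNTRY_SPELLINGS = {
    "USA": {"usa", "united states", "united states of america"},
    "DEU": {"germany", "deutschland"},
}


def strip_country(location: str, code: str) -> str:
    """Drop a trailing country name from location if it matches the row's code."""
    parts = [part.strip() for part in location.split(",")]
    known = COUNTRY_SPELLINGS.get(code, set())
    # Only strip when something remains, so a bare "USA" is left alone.
    if len(parts) > 1 and parts[-1].lower() in known:
        parts = parts[:-1]
    return ", ".join(parts)


def migrate(csv_text: str) -> str:
    """Rewrite every row of a CSV that has Location and Country columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["Location"] = strip_country(row["Location"], row["Country"])
        writer.writerow(row)
    return out.getvalue()
```

State and province details are kept, so "Lexington, Kentucky, USA" becomes "Lexington, Kentucky" while "Lexington, Massachusetts" is unchanged. Note that this sketch also normalizes spacing around commas as a side effect.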