Can't load/activate datasets with identical column names #166

Open
evertrol opened this issue Sep 26, 2018 · 0 comments

I have tried to load two datasets that share a number of identical variable names. While I can load them both, I can only switch on one at a time. If I try to switch on another dataset, I get an endless "wait circle animation".

Use case

The general use case is where someone wants to compare two or more similar data sets.
A slightly more specific use case would be where someone would like to compare their dataset with that of a published paper, to spot any differences between their result and an existing result.
It is likely in both cases that variables are identically named.

Workaround

A user workaround is to rename the relevant variables, for example by simply appending an underscore or similar to variable names. This does not work, however, when both datasets come from a database server.

UI implementation

On the UI side, I see at least two options:

  • namespace variables with their dataset, for example <dataset1>:<variable1>, <dataset1>:<variable2>, <dataset2>:<variable1>, <dataset2>:<variable2>.
    This breaks when two identically named datasets are used. SPOT could (should?) warn the user when an identically named dataset is used, and then either:

    • abort for the new dataset
    • automatically rename the new dataset, for example by appending a -1, -2 etc., much as various tools rename duplicate file downloads.
  • keep variables separate in the UI per dataset. This requires an extra subbox per dataset. It makes it easier for the user to see which dataset a set of variables belongs to, and avoids the longer names of the namespacing option. (Under the hood, variables likely need to be namespaced, but that is an implementation detail.)
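
To make the namespacing and auto-rename ideas concrete, here is a minimal sketch in Python. It is not SPOT code: the function names `unique_dataset_name` and `namespace_variables`, the `(name, variables)` input shape, and the example dataset names are all hypothetical, chosen only to illustrate the `<dataset>:<variable>` scheme and the `-1`, `-2` suffixes suggested above.

```python
def unique_dataset_name(name, existing):
    """Return `name` unchanged, or `name-1`, `name-2`, ... if it is already taken."""
    if name not in existing:
        return name
    suffix = 1
    while f"{name}-{suffix}" in existing:
        suffix += 1
    return f"{name}-{suffix}"


def namespace_variables(datasets):
    """Build a mapping from '<dataset>:<variable>' to (dataset, variable).

    `datasets` is a list of (dataset_name, [variable names]) tuples
    (a hypothetical input shape, not SPOT's internal representation).
    Identically named datasets are renamed before their variables are added.
    """
    taken = set()
    namespaced = {}
    for name, variables in datasets:
        name = unique_dataset_name(name, taken)
        taken.add(name)
        for variable in variables:
            namespaced[f"{name}:{variable}"] = (name, variable)
    return namespaced


loaded = namespace_variables([
    ("survey", ["age", "height"]),
    ("paper", ["age", "height"]),
    ("survey", ["age"]),  # same dataset name loaded a second time
])
for key in sorted(loaded):
    print(key)
```

With this scheme, identical variable names in different datasets no longer collide, and a second dataset called `survey` ends up as `survey-1` rather than overwriting the first. A UI that groups variables per dataset could use the same keys internally and display only the variable part.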
