Add "Railway Station alias" to improve searches #470

Open
wants to merge 8 commits into
base: master

NickStallman
Contributor

Many railway stations just have the station name without any mention of "railway" or "station" in it.
E.g. https://www.openstreetmap.org/node/1886265358
In that case the station name is the same as a locality so it's virtually impossible to find the station in search results.

This patch will add "Railway Station" as an alias to allow for easier searching.
It also adds a small popularity boost to the station since they are common landmarks.

TODO: Likely this needs to be expanded for bus stations, light rail, ferry, etc...
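
For illustration, a minimal sketch of the idea described above. The `record` shape, its `aliases` and `popularity` fields, and the boost value are all hypothetical and are not taken from the patch itself:

```javascript
// Hypothetical, simplified record; the importer's real document model may differ.
function addRailwayStationAlias(record, tags) {
  // Only act on nodes tagged railway=station.
  if (tags.railway !== 'station') { return record; }
  // The alias is meant for matching only; the displayed name is left untouched.
  record.aliases.push(record.name + ' Railway Station');
  // Assumed boost value, chosen purely for illustration.
  record.popularity += 1000;
  return record;
}

const record = { name: 'Gosford', aliases: [], popularity: 0 };
addRailwayStationAlias(record, { railway: 'station' });
```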

Member

@missinglink missinglink left a comment

The tests are failing, I think you have an extra opening curly brace there?

@missinglink
Member

missinglink commented Feb 24, 2019

I think this might need a bit more consideration for cases such as:

  • non-English-speaking regions (e.g. 'bahnhof')
  • names which are not just the locality name, as per the example, or names which already include 'railway station'

I think we can probably still merge this, the nice thing about aliases is that they are used for search but not for display, so we are free to put whatever we want if it improves matching.

The label generation might still be an issue here, as the resulting geojson record would still not have 'station' in the label, which might be confusing from a UX perspective.

Regarding the language, we can probably deal with this in Elasticsearch with synonyms; we could, for instance, add a synonym "railway station, bahnhof".
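
As a rough sketch of what such a synonym could look like, the object below uses Elasticsearch's `synonym` token filter inside index analysis settings. The filter name `station_synonyms` and the analyzer name `venue_name` are made up for this example and do not reflect the project's actual schema:

```javascript
// Assumed names: `station_synonyms` and `venue_name` are illustrative only.
const analysisSettings = {
  analysis: {
    filter: {
      station_synonyms: {
        type: 'synonym',
        // Comma-separated terms on one line are treated as equivalent.
        synonyms: ['railway station, bahnhof']
      }
    },
    analyzer: {
      venue_name: {
        type: 'custom',
        tokenizer: 'standard',
        // Lowercase first so the synonym rule matches regardless of case.
        filter: ['lowercase', 'station_synonyms']
      }
    }
  }
};
```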

@NickStallman
Contributor Author

Whoops, typo fixed.

Yep I guessed that aliases were good to use for this purpose.
And synonyms sound like a great way to handle it for different languages.

I'm not sure how many cases like this there are where additional aliases would substantially improve searchability; railway stations are just one example I've noticed myself. If there are many others, it might be worth refactoring to handle them in a more generic way. Adding an indexOf check probably can't hurt either, to ensure the phrase isn't already in the name.
I'll do a bunch of searches and try and find others.
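
A small sketch of the indexOf guard mentioned above; the helper names `alreadyMentions` and `aliasFor` are hypothetical:

```javascript
// Returns true when `phrase` already occurs in `name`, ignoring case.
function alreadyMentions(name, phrase) {
  return name.toLowerCase().indexOf(phrase.toLowerCase()) !== -1;
}

// Returns the alias to add, or null when the name already contains the phrase.
function aliasFor(name, phrase) {
  return alreadyMentions(name, phrase) ? null : name + ' ' + phrase;
}
```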

As for the label, it may be better to expose the categories rather than change the label, so it can be annotated at the application level, e.g. show an icon to indicate a train station instead of including it in the name. That also means we don't need to change the label for other languages.

@NickStallman
Contributor Author

Actually another use for synonyms would be to allow "railway station" and "train station" to both be used when searching. Are there currently any synonyms specifically for venue searching?

@missinglink
Member

Hi @NickStallman, the tests are still failing. You can run the same tests locally that run on CI with npm run travis, and you can view the CI logs here: https://travis-ci.org/pelias/openstreetmap/jobs/497856259

I had a quick look over the failures and see that https://www.openstreetmap.org/node/25751159 is producing the alias "undefined Railway Station", which is not correct. The same goes for Waterfront Station: it adds an alias "undefined Railway Station".
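
The "undefined Railway Station" alias is what JavaScript string concatenation produces when the name is missing. A minimal sketch of a guard follows; the helper name `stationAlias` is hypothetical, and why the name is missing on those nodes is not established in this thread:

```javascript
// Builds the alias only when a usable name is present.
function stationAlias(name) {
  // Without this guard, undefined + ' Railway Station' gives 'undefined Railway Station'.
  if (typeof name !== 'string' || name.trim() === '') { return null; }
  return name.trim() + ' Railway Station';
}
```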

@NickStallman
Contributor Author

Sorry, for some reason the unit tests are throwing other unrelated errors for me. When I have a sec I'll track that down and sort that out. :)

@NickStallman
Contributor Author

I've had a chance to sit down and do this properly so I've redone it entirely.
After further thought it looks like this will actually require a separate stream with a config file as there would be quite a few areas where 'normalizing' of venue names is useful. Public transport is a key one.

I've added test cases to cover the new code and all tests are now passing perfectly.
The new code will handle these cases correctly:
"Gosford" -> "Gosford Railway Station"
"Gosford Station" -> "Gosford Railway Station"
There is an optional alt_suffixes list which allows it to remove suffixes which may already be part of the name.
I may need to add to this to handle checking two tags instead of just one. E.g. public_transport=stop_position and ferry=yes is one I've seen that can't currently be handled.
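
To make the two cases above concrete, here is a minimal sketch of suffix normalisation. Only `alt_suffixes` is named in this comment; the `rule` shape, the `suffix` key, and the `normalizeName` helper are assumptions for illustration and are not the code in this PR:

```javascript
// Hypothetical rule shape; only `alt_suffixes` comes from the description above.
const rule = { suffix: 'Railway Station', alt_suffixes: ['Station'] };

function normalizeName(name, rule) {
  const lower = name.toLowerCase();
  // Already ends with the full suffix: nothing to do.
  if (lower.endsWith(rule.suffix.toLowerCase())) { return name; }
  // Strip a shorter suffix that is already part of the name.
  for (const alt of rule.alt_suffixes || []) {
    if (lower.endsWith(' ' + alt.toLowerCase())) {
      name = name.slice(0, name.length - alt.length).trim();
      break;
    }
  }
  return name + ' ' + rule.suffix;
}
```

With this rule, both "Gosford" and "Gosford Station" come out as "Gosford Railway Station", and a name that already ends in "Railway Station" is returned unchanged.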

The idea is then to add synonyms to handle different languages and terminology, e.g. "Train Station".
This'll allow for some very flexible and useful venue searching so a user's input syntax doesn't have to be as precise.

@missinglink let me know what you think, this is my first substantial contribution so let me know if the style or anything else isn't quite in line with what the project would like. :)

@missinglink
Member

This is looking really good, I opened a small PR against your fork with some minor formatting changes.

Other than that, I think we should merge this.

@NickStallman
Contributor Author

Awesome. I think I will first add support for multiple conditions as that appears to be required for ferries and some other places where this will be useful.
Should have that done in the next couple of days. It's probably easiest to keep the main match as it currently works, and add another "condition" parameter which is just an array of additional tags which must be equal.
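
A sketch of how a main match plus a list of extra conditions could be evaluated. The rule layout and the names `ferryRule`, `match`, `conditions`, and `ruleApplies` are hypothetical and are not the final config format:

```javascript
// Hypothetical rule: one main tag match plus extra tags that must all be equal.
const ferryRule = {
  match: { key: 'public_transport', value: 'stop_position' },
  conditions: [{ key: 'ferry', value: 'yes' }]
};

function ruleApplies(tags, rule) {
  // The main match must hold first.
  if (tags[rule.match.key] !== rule.match.value) { return false; }
  // Every additional condition must also be satisfied.
  return (rule.conditions || []).every(c => tags[c.key] === c.value);
}
```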

@missinglink
Member

All good, I will hold off merging until I hear more.

Add Ferry normalisation
Add Car Park normalisation
@NickStallman
Contributor Author

@missinglink I've rewritten this so it can handle multiple conditions, which is a requirement for things like public_transport=stop_position ferry=yes.
Unfortunately this means it no longer looks similar to category_map.js, but it should be reasonably clear.
Most of the public_transport variations will require multiple conditions; I'm not sure what other POI types are similar.

Let me know what you think. I'll file another PR to the API to update the synonym list so things like "train station" will work as you'd expect.

I'll also be using this code on a planet build to test out real world results.

@NickStallman
Contributor Author

I think this is ready for merging. I've gotten rid of the expected output conflict and I've made a full planet build with the desired outcome.

Searching "railway, " where the train station's name is just "" is now working correctly.

Next steps:
1. The new synonyms need to be merged, otherwise "railway station" will match but "train station" will not. pelias/schema#358
2. The list of normalised names needs to be expanded and internationalised. Currently it's just train stations and ferries, tested only around Sydney, so there is certainly more to be done.
3. Venue scoring now becomes more important; in one case the train station's car park ranked higher than the station itself.
