add alphanumeric postcodes post-processing script #158

Merged
missinglink merged 1 commit into master from alphanumeric-postcodes on May 27, 2024

Conversation

missinglink
Member

@missinglink missinglink commented May 24, 2024

/**
 * Alphanumeric postcodes post-processing script ensures that both the expanded
 * and contracted versions of alphanumeric postcodes are indexed.
 *
 * Without this script, a postcode such as '1383GN' would not be matched to the
 * query '1383'.
 * 
 * The script is intended to detect these alphanumeric postcodes and index both
 * permutations, i.e. '1383GN' = ['1383GN', '1383 GN'].
 * 
 * The inverse case should also be covered, i.e. '1383 GN' = ['1383 GN', '1383GN'].
 * 
 * Note: the regex is currently restrictive by design. The UK, for instance, uses
 * alphanumeric postcodes in the format 'E81DN', which could cause errors when split
 * with this method, so they are currently ignored. Future work should consider global
 * postcode formats.
 * 
 * Note: this script is intended to run *before* the 'deduplication' post-processing
 * script so that prior aliases don't generate duplicate terms.
 */
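
For readers skimming the description, here is a rough sketch of the expansion it describes. It is not the code merged in this PR: the function name `postcodePermutations` and the regex are assumptions chosen to match only the '1383GN' shape (four digits, an optional single space, two letters), and the real script may differ.

```javascript
// Hypothetical sketch; names and regex are assumptions, not the merged code.

// Restrictive by design: exactly four digits, an optional single space, then
// exactly two letters. UK-style codes such as 'E81DN' start with a letter,
// so they do not match and are passed through untouched.
const ALPHANUMERIC_POSTCODE = /^(\d{4}) ?([A-Za-z]{2})$/;

// Returns both the contracted and expanded forms, with the supplied form first.
// Input that does not match is returned unchanged as a single-element array.
function postcodePermutations(postcode) {
  const trimmed = postcode.trim();
  const match = ALPHANUMERIC_POSTCODE.exec(trimmed);
  if (!match) {
    return [postcode];
  }

  const [, digits, letters] = match;
  const contracted = `${digits}${letters}`;
  const expanded = `${digits} ${letters}`;

  // Keep the form that was supplied first, then add the other one.
  return trimmed.includes(' ')
    ? [expanded, contracted]
    : [contracted, expanded];
}

console.log(postcodePermutations('1383GN'));  // [ '1383GN', '1383 GN' ]
console.log(postcodePermutations('1383 GN')); // [ '1383 GN', '1383GN' ]
console.log(postcodePermutations('E81DN'));   // [ 'E81DN' ]
```

Returning the supplied form first keeps the original value as the primary term, and any repeats created by prior aliases would be left for the 'deduplication' script that runs afterwards.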

@missinglink missinglink merged commit f64cd68 into master May 27, 2024
6 checks passed
@missinglink missinglink deleted the alphanumeric-postcodes branch May 27, 2024 11:42