-
-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reduce memory consumption? #509
Comments
I don't think there are any major wins to be had here. The https://github.com/pelias/whosonfirst/blob/master/src/readStream.js#L56 Looking at one of the records as its stored in memory: {
id: 85633729,
name: 'Puerto Rico',
name_aliases: [
'Puerto Rico',
'Commonwealth of Puerto Rico',
'Puerto Rican',
'Puertorico'
],
name_langs: {
am: [ 'ፕዌርቶ ሪኮ' ],
ar: [ 'بورتوريكو' ],
ay: [ 'Burinkin' ],
az: [ 'Puerto Riko' ],
ba: [ 'Пуэрто-Рико' ],
be: [ 'Пуэрта-Рыка' ],
bn: [ 'পুয়ের্তো রিকো' ],
bs: [ 'Porto Riko' ],
bg: [ 'Пуерто Рико' ],
cs: [ 'Portoriko' ],
ce: [ 'Пуэрто-Рико' ],
cv: [ 'Пуэрто-Рико' ],
dv: [ 'ޕުއެރްތޮ ރީކޯ' ],
el: [ 'Πουέρτο Ρίκο' ],
en: [ 'Commonwealth of Puerto Rico', 'Puerto Rican', 'Puertorico' ],
eo: [ 'Porto-Riko' ],
fo: [ 'Puerto Riko' ],
fa: [ 'پورتوریکو' ],
fr: [ 'Porto Rico' ],
fy: [ 'Puerto Riko' ],
ga: [ 'Pórtó Ríce' ],
gl: [ 'Porto Rico' ],
gv: [ 'Yn Phurt Verçhagh' ],
gu: [ 'પ્યુએર્ટો રિકો' ],
ht: [ 'Pòtoriko' ],
he: [ 'פוארטו ריקו' ],
hi: [ 'प्युर्तो रिको', 'पोर्टो रीको' ],
hr: [ 'Portoriko' ],
hy: [ 'Պուերտո Ռիկո' ],
io: [ 'Portuo Riko' ],
ia: [ 'Porto Rico' ],
id: [ 'Puerto Riko' ],
is: [ 'Púertó Ríkó' ],
it: [ 'Porto Rico', 'Portorico' ],
jv: [ 'Puerto Riko' ],
ja: [ 'プエルトリコ' ],
kn: [ 'ಪೋರ್ಟೊ ರಿಕೊ' ],
ka: [ 'პუერტო-რიკო' ],
kk: [ 'Пуэрто-Рико' ],
rw: [ 'Puwerito Riko' ],
ky: [ 'Пуэрто-Рико' ],
ko: [ '푸에르토리코' ],
ku: [ 'Porto Rîko' ],
la: [ 'Portus Dives' ],
lv: [ 'Puertoriko' ],
lt: [ 'Puerto Rikas' ],
ml: [ 'പോർട്ടോ റിക്കോ' ],
mr: [ 'पोर्तो रिको' ],
mk: [ 'Порторико' ],
mt: [ 'Porto Riku' ],
ne: [ 'प्युर्तो रिको' ],
os: [ 'Пуэрто-Рико' ],
pa: [ 'ਪੁਏਰਤੋ ਰੀਕੋ' ],
pl: [ 'Portoryko' ],
pt: [ 'Porto Rico' ],
qu: [ 'Burinkin' ],
ru: [ 'Пуэрто-Рико' ],
si: [ 'පුවටෝ රිකෝ' ],
sk: [ 'Portoriko' ],
sl: [ 'Portoriko' ],
so: [ 'Buerto Riko' ],
sq: [ 'Portorikoja' ],
sr: [ 'Порторико' ],
su: [ 'Puérto Riko' ],
ta: [ 'புவேர்ட்டோ ரிக்கோ' ],
tt: [ 'Пуэрто-Рико' ],
te: [ 'ఫ్యూర్టో రికో' ],
tg: [ 'Пуэрто Рико' ],
th: [ 'ปวยร์โตรีโก' ],
tk: [ 'Puerto-riko' ],
tr: [ 'Porto Riko' ],
ug: [ 'پوئېرتو رىكو' ],
uk: [ 'Пуерто-Рико' ],
ur: [ 'پورٹو ریکو' ],
uz: [ 'Puerto-Riko' ],
vo: [ 'Puertorikeäns' ],
wa: [ 'Porto Rico' ],
wo: [ 'Poortorikoo' ],
yo: [ 'Púẹ́rtò Ríkò' ],
zh: [ '波多黎各自由邦', '波多黎各' ]
},
abbreviation: 'PR',
place_type: 'dependency',
lat: 18.234668,
lon: -66.481065,
bounding_box: '-67.937815,17.922919,-65.244618,18.522773',
population: 3971020,
popularity: undefined,
hierarchies: [
{
continent_id: 102191575,
dependency_id: 85633729,
empire_id: 136253057
}
]
} There's a lot, but its mostly names. While we could prune out some unneeded languages, we still need the names to set the Overall, I don't believe memory usage of this importer is much of an issue right now, and I don't see any super compelling benefits to trying to prune out stuff we don't need. |
Agh ok, that makes sense, I didn't realize the name fields were also required but makes sense once you mention it. At some point we can probably migrate this code to use the I'm going to mark this as |
I was looking over the code today and noticed this block introduced in c500aaf which stores admin records in memory so they can later be used for computing hierarchies.
It seems as though the entire
geojson
record is being stored in RAM when we probably only need to store thewofRecord.hierarchies
path which is used by thehierarchyFinder.js
code.@orangejulius I'm not that familiar, does that sound like a reasonable change to reduce RAM usage?
The text was updated successfully, but these errors were encountered: