generate csv file #1
base: main
This looks really great! I was especially impressed with the scraping method you wrote. I still hadn't figured that part out, so it was great to see your solution. I left one question and one comment about cleaning up the iteration in the scraping method. I think reaching for a higher-order array method for iterative tasks will be viewed more favorably than a loop. You'd likely get a similar request from engineers on a larger project, and it could be great practice with those higher-order methods. As always, let me know if you have questions! Great job!
async function scrapeData(page) {
// Find all the elements with a className of 'DataletSideHeading'...
// ... loop through to add the key/value pair into the resultsArray
const resultObj = await page.$$eval('.DataletSideHeading', (titles) => {
Just curious about the double `$` for the `$$eval` handler. I've seen `$eval` but not `$$eval`. What's the difference here?
In my understanding, `$eval` runs the callback on a single HTML element: the first one that matches the selector. `$$eval` selects every matching element and passes them to the callback as an array.
index.js
Outdated
// ... loop through to add the key/value pair into the resultsArray
const resultObj = await page.$$eval('.DataletSideHeading', (titles) => {
let result = {};
for (let i = 0; i < titles.length; i++) {
Is it possible to refactor this to a `forEach`? Or perhaps `reduce`, since we are returning a `result` object?
const result = titles.reduce(...)
Or perhaps:
const resultsArray = titles.reduce(...)
I chose a plain for loop for two reasons. First, the array methods I'm familiar with return an array, but I wanted an object. Second, methods like map or forEach don't let me flag the next iteration, which I need in order to combine the Owner Mailing / column with the Contact Address column. I don't have much experience with array methods, so I'll definitely look into reduce()! Thanks for that!
I think `reduce` is the right approach here. Let's work together on an implementation?
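As a starting point for that conversation, here is a minimal sketch of how `reduce` could build an object and still merge a heading with the one that follows it. It is not the PR's actual loop body, which isn't shown in this thread. The `rows` array, its `title`/`value` fields, the sample values, and the `rowsToObject` name are all hypothetical stand-ins for whatever is extracted from the `.DataletSideHeading` elements, and the rule "a title ending in `/` continues on the next row" is an assumption based on the Owner Mailing / and Contact Address columns mentioned above.

```javascript
// Hypothetical stand-in for the scraped heading/value pairs.
const rows = [
  { title: 'Parcel ID', value: '123' },
  { title: 'Owner Mailing /', value: '1 Main St' },
  { title: 'Contact Address', value: 'Springfield' },
  { title: 'Owner', value: 'Jane Doe' },
];

function rowsToObject(rows) {
  // The reduce callback also receives the index and the full array,
  // so it can look at the previous row instead of flagging the next iteration.
  return rows.reduce((result, row, i, all) => {
    const prev = all[i - 1];

    // Assumption: a title ending in '/' continues on the next row.
    // Skip it here; it is merged when the next row is processed.
    // (A trailing '/' row with nothing after it would be dropped.)
    if (row.title.endsWith('/')) return result;

    if (prev && prev.title.endsWith('/')) {
      // Combine the two halves into a single column.
      result[`${prev.title}${row.title}`] = `${prev.value} ${row.value}`;
      return result;
    }

    result[row.title] = row.value;
    return result;
  }, {}); // Starting from {} is what makes reduce return an object.
}

console.log(rowsToObject(rows));
```

The initial value passed as the second argument to `reduce` decides the return type, so the same method can produce an object, an array, or a single number.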
@@ -4,12 +4,14 @@
 "description": "",
 "main": "index.js",
 "scripts": {
-"test": "echo \"Error: no test specified\" && exit 1"
+"test": "echo \"Error: no test specified\" && exit 1",
+"start": "nodemon index.js"
smort. 👍🏻
Best code ever. Worth every character space!
Parsed all the data from the 3 tables on the website. Overall the format is clean, but I'm struggling to get rid of the extra spaces in the Owner Mailing /Contact Address column. I tried replacing the non-breaking space with a regular space first and then trimming, but it didn't work for me. I'd love to learn how to fix that!
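One possible fix, offered as a sketch because the attempted code isn't shown here: `trim()` only removes whitespace at the ends of a string, and `replace()` with a plain string pattern only replaces the first match, so runs of spaces or non-breaking spaces in the middle of the value can survive both. Collapsing every run of whitespace with a global regular expression before trimming handles that case. The `normalizeSpaces` name and the sample string are hypothetical.

```javascript
// Collapse every run of whitespace (including non-breaking spaces, U+00A0,
// and newlines) into a single regular space, then trim both ends.
function normalizeSpaces(text) {
  return text.replace(/[\s\u00a0]+/g, ' ').trim();
}

// Hypothetical cell value with mixed regular and non-breaking spaces.
const raw = '\u00a0 1 Main St\u00a0\u00a0\n  Springfield \u00a0';
console.log(normalizeSpaces(raw)); // prints "1 Main St Springfield"
```

If this runs inside the `$$eval` callback, the helper would need to be defined inside that callback, since the callback executes in the page context rather than in Node.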