More KeyStats #31

Phishcook · 2020-01-15T06:47:35Z

Hi! I just cloned your project and am messing around with it. Though I am an experienced software engineer, I am new to machine learning, so feel free to tell me if my insights are incorrect!

After reading the code, I noticed that the prediction modeling relies heavily on the KeyStats, yet that data is extremely limited. Would it not be SUPER beneficial to backfill this data with one record per quarter? The provided data is very erratic, yet most 'feature' data points are reported by the company every quarter.

In addition, a cron job or a simple get_missing_quartly_keystats.py script that can be invoked on demand to fill in new stats would keep the project current over time. It would make the modeling more accurate (more data) and also bring the project closer to being a practical, live-use tool.
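To make the backfill idea concrete, here is a rough sketch of how such a script could work out which quarters still need a record. The function names are made up for illustration and are not part of the project, and it assumes calendar quarters, which will not match every company's fiscal calendar.

```python
from datetime import date


def quarter_of(d):
    """Return (year, quarter) for a date, with quarters numbered 1 to 4."""
    return d.year, (d.month - 1) // 3 + 1


def missing_quarters(existing_dates, start, end):
    """List the (year, quarter) pairs from start to end that have no record.

    existing_dates: dates of the snapshots already on disk (hypothetical input).
    Assumes calendar quarters, not company fiscal quarters.
    """
    have = {quarter_of(d) for d in existing_dates}
    year, quarter = quarter_of(start)
    last = quarter_of(end)
    missing = []
    while (year, quarter) <= last:
        if (year, quarter) not in have:
            missing.append((year, quarter))
        # Step to the next quarter, rolling over into the next year after Q4.
        if quarter < 4:
            quarter += 1
        else:
            year, quarter = year + 1, 1
    return missing
```

A cron job could call something like this per ticker and only fetch the quarters it returns, so repeated runs would not re-download data that is already there.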

Most of the historical quarterly feature data points can be found directly, or derived through calculations, on https://www.macrotrends.net/. Example: https://www.macrotrends.net/stocks/charts/GNW/genworth-financial/financial-statements

There are many categories and subcategories that can most likely be scraped and parsed. For example, the full historical market cap chart served here: https://www.macrotrends.net/stocks/charts/GNW/genworth-financial/market-cap
can be parsed, because the HTML contains a `<script>` tag that defines `var chartData` with all the values by date.
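A minimal sketch of that parsing step, assuming the page assigns a JSON-compatible array literal to `chartData` and ends the statement with a semicolon. I have not verified the exact markup, so treat the regex and the helper name `extract_chart_data` as assumptions. Fetching the page is left out; the function only takes the HTML text.

```python
import json
import re

# Assumption: the page contains something like `var chartData = [ ... ];`
# and the array literal is valid JSON (double-quoted keys and strings).
CHART_DATA_RE = re.compile(r"var\s+chartData\s*=\s*(\[.*?\])\s*;", re.DOTALL)


def extract_chart_data(html):
    """Return the chartData array embedded in a page's <script> tag."""
    match = CHART_DATA_RE.search(html)
    if match is None:
        raise ValueError("no chartData assignment found in page")
    return json.loads(match.group(1))
```

If the literal turns out not to be strict JSON (single quotes, trailing commas), `json.loads` will raise an error and a more tolerant parser would be needed.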

Between the balance sheets and financial records they provide, you may even find other influential data points to add to the ML portion of this script.

Let me know what you think, or if my logic is simply way off. If you think it is a good idea, I can help out with refactoring!


robertmartin8 · 2020-01-15T14:16:20Z

Hi,

You have struck upon the core issue when it comes to financial data science – data availability. I fully agree that this current collection of keystats data is not great. This project is meant to be a starting point for people to see a complete machine learning pipeline applied to investing.

Good find regarding macrotrends – the data looks pretty good! If you submit a PR with a scraper I'd be more than happy to merge it and credit you in the readme.

Best,
Robert

robertmartin8 added the enhancement label Jan 15, 2020
