The project is fantastic; here are a few suggestions:
It would be good to have separate repos for the redditflow data API and the redditflow model API. Sometimes developers want to extract only the data and use their own model, and sometimes they want to use the models with different data. Combining both makes the repo larger, and if I only want to scrape data, I still need to install torch, sentence-transformers, sentencepiece, etc. (Hugging Face's datasets and models APIs could serve as a reference.)
Update the redditflow docs to cover how to extract data based on a single keyword and how to extract all comments and posts from a single subreddit.
Organize the nfflow repo into base functions that can be reused for other platform APIs, such as Twitter.
Add ML intelligence to data fetching and scraping (for example, OpenAI's CLIP).
It could also include Elasticsearch to fetch data faster from the downloaded archive.
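To make the "base functions" suggestion more concrete, here is a rough sketch of what a platform-agnostic base could look like. All names here (`BaseFetcher`, `Record`, `InMemoryRedditFetcher`) are hypothetical and not part of nfflow or redditflow; the in-memory fetcher stands in for a real API client so the example runs without network access or any ML dependency.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Record:
    """One fetched item, in a shape shared by every platform."""
    platform: str
    text: str


class BaseFetcher(ABC):
    """Hypothetical base class: subclasses only implement fetch_raw()."""
    platform = "unknown"

    @abstractmethod
    def fetch_raw(self, query: str) -> Iterable[str]:
        """Return raw text items matching the query (platform-specific)."""

    def fetch(self, query: str) -> List[Record]:
        # Shared cleanup lives here: trim whitespace and drop empty items.
        return [
            Record(self.platform, text.strip())
            for text in self.fetch_raw(query)
            if text.strip()
        ]


class InMemoryRedditFetcher(BaseFetcher):
    """Stand-in for a real Reddit client; filters a list held in memory."""
    platform = "reddit"

    def __init__(self, posts: Iterable[str]):
        self._posts = list(posts)

    def fetch_raw(self, query: str) -> Iterable[str]:
        needle = query.lower()
        return (post for post in self._posts if needle in post.lower())


fetcher = InMemoryRedditFetcher(["Python tips", "  ", "Rust vs python ", "Cooking"])
records = fetcher.fetch("python")
for record in records:
    print(record.platform, record.text)
```

A Twitter fetcher would then only need its own `fetch_raw`, and nothing in this data path imports torch or any model code, which also fits the data/model split suggested above.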
Here is a simple overview of integrating OpenAI's CLIP project into nfflow:
Download image data from different sources.
Use Colab to load the data and train OpenAI's CLIP model to convert images into vectors.
Save the vectors to the user's Google Drive.
Perform evaluation (search queries) over the downloaded data.
This could be automated end to end if both the training on Colab and the fetching of vectors from Drive can be automated.
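The last step (search over the saved vectors) could look roughly like the sketch below. It assumes the image vectors have already been computed and loaded into a dict; the three-dimensional vectors and file names are toy values made up for illustration, not real CLIP output, and a real query vector would come from the model's text encoder.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search(
    query_vec: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (name, score) pairs, best match first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for image vectors saved in an earlier step.
index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}

# Toy stand-in for an encoded text query.
results = search([1.0, 0.0, 0.0], index, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

A brute-force scan like this is fine for small archives; for larger ones, a vector index or the Elasticsearch idea mentioned above would be the natural replacement for the `search` function.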