OPENABC2_DATASET contains only step0 files? #16

Open
udaymallappa opened this issue Aug 19, 2024 · 8 comments

Comments

@udaymallappa

After downloading and unzipping OPENABC2_DATASET.zip (Torch tensor format), I see only step0 files in the processed/ directory. Don't we have AIGs for all 20 steps?
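A minimal sketch for checking which steps are present after unzipping. It assumes each file name in processed/ contains a `step<N>` token (as the "step0 files" above suggest); the helper names and the example file names are hypothetical, not part of the dataset's tooling.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: file names embed the step index as "step<N>", e.g. "..._step0.pt".
STEP_PATTERN = re.compile(r"step(\d+)")


def count_files_per_step(filenames):
    """Return {step_index: number_of_files} for names containing 'step<N>'."""
    counts = Counter()
    for name in filenames:
        match = STEP_PATTERN.search(name)
        if match:  # names without a step token are ignored
            counts[int(match.group(1))] += 1
    return dict(sorted(counts.items()))


def scan_processed_dir(processed_dir):
    """Count the .pt files per step in a directory such as 'processed/'."""
    pt_files = Path(processed_dir).glob("*.pt")
    return count_files_per_step(p.name for p in pt_files)
```

If `scan_processed_dir("processed")` returns a dictionary whose only key is `0`, the archive contains step0 files only.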

@animeshbchowdhury
Contributor

Hi @udaymallappa,

Sorry, Zenodo only allowed datasets up to 50 GB at the time I uploaded this one. The original dataset does have the AIGs processed for every step. The reason for keeping this dataset short was to support predicting the QoR using only the synthesis recipe and the starting AIG.

If you face any issues downloading and unzipping the original dataset, let me know.

@udaymallappa
Author

Thanks for your quick response.
I am particularly interested in the ML-ready torch dataset, and the original dataset has a lot of other information that I do not need. In any case, I was trying to load the *.pt files corresponding to step0, but ran into data loading issues caused by the torch version.
Also, does "step0" correspond to the unoptimized starting AIG circuits? If so, the step0 AIGs for all 1500 synthesis recipes of a design would be identical, right?

@animeshbchowdhury
Contributor

Yes, you're correct.

The step0 .pt files will all contain the same original unoptimized AIG; however, the labels will differ, since they capture the post-synthesis results.

@udaymallappa
Author

Okay. Is there a way to host the 50GB torch-only dataset with all 20 steps?

@animeshbchowdhury
Contributor

Let me see what can be done. Can you explain what your requirement is?

The one already hosted on Zenodo takes up 18 GB. Now, for each step-n AIG, there are 1500 recipes of length (20-n) across all designs, so it will be even more difficult to host all the steps with all the recipe labels.
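To illustrate the (20-n) recipe length: a minimal sketch, assuming a recipe is an ordered list of 20 synthesis operations and that the step-n AIG is the result of applying the first n of them. The operation names below are placeholders, not the dataset's actual commands.

```python
def remaining_recipe(recipe, n):
    """Return the operations still to be applied to the step-n AIG."""
    if not 0 <= n <= len(recipe):
        raise ValueError(f"step index {n} outside 0..{len(recipe)}")
    return recipe[n:]


# Placeholder operation names; a real recipe would list synthesis commands.
recipe = [f"op{i}" for i in range(20)]

for n in (0, 5, 19):
    suffix = remaining_recipe(recipe, n)
    print(f"step{n}: {len(suffix)} operations remaining")
```

Under this reading, step0 pairs the unoptimized AIG with the full 20-operation recipe, and each later step pairs an intermediate AIG with a shorter suffix.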

@udaymallappa
Author

We are looking to learn representations for AIG circuits. Because all step0 files for a design correspond to the same circuit, all we have is 29 unique circuits. If you could generate the 20 steps for each circuit, with 1500 synthesis recipes each, that would give us many more AIG circuits for training.

@animeshbchowdhury
Contributor

Hi @udaymallappa, the data you're asking for is the processed pt-format dataset of all 870k AIGs. It's part of the original dataset; however, the entire dataset is too large to upload to Zenodo, since even the compressed version needs more than 500 GB.

Let me figure out a way to host the original dataset so that no one has to download the entire data, only the processed dataset.
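The 870k figure is consistent with the numbers mentioned earlier in this thread (29 designs, 1500 recipes per design, 20 steps per recipe). A quick arithmetic check, with constant and function names chosen here for illustration:

```python
NUM_DESIGNS = 29           # unique starting circuits mentioned in the thread
RECIPES_PER_DESIGN = 1500  # synthesis recipes per design
STEPS_PER_RECIPE = 20      # one AIG per synthesis step


def total_aigs(designs=NUM_DESIGNS, recipes=RECIPES_PER_DESIGN, steps=STEPS_PER_RECIPE):
    """Number of AIGs when every step of every recipe is stored."""
    return designs * recipes * steps


def step0_samples(designs=NUM_DESIGNS, recipes=RECIPES_PER_DESIGN):
    """Number of step0 samples: one per (design, recipe) pair."""
    return designs * recipes


print(total_aigs())     # 870000
print(step0_samples())  # 43500
```

The step0-only release therefore has 43,500 labeled samples but only 29 distinct graphs, which is why it is of limited use for representation learning.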

@animeshbchowdhury
Contributor

@udaymallappa, give me until the end of this week. I need to coordinate with the NYU IT dept. to manage this. Their requirement is that the entire dataset be split into even chunks before hosting. That was the main reason the entire dataset was zipped and chunked.

I will try to find a way to group only the processed pt files together, with the rest zipped together and chunked. This will take some time on my side.

Also, I believe the dataset was dumped with an older version of PyG. For compatibility with newer versions, please follow this thread:

pyg-team/pytorch_geometric#5528

I plan to migrate the entire dataset to a newer PyTorch version, but it will take some time. Thank you for your patience.


2 participants