This repository has been archived by the owner on Jan 13, 2023. It is now read-only.

Dataflow Batch job creates a Zero Byte TensorFlow record file #19

Open
jeffreycunn opened this issue Nov 18, 2020 · 0 comments


jeffreycunn commented Nov 18, 2020

I am working through the following notebook: https://github.com/GoogleCloudPlatform/ml-design-patterns/blob/master/02_data_representation/weather_search/wx_embeddings.ipynb. I am running it on a GCP AI Notebook VM with JupyterLab.

When I reach the following line of code: `%run -m wxsearch.hrrr_to_tfrecord -- --startdate 20190915 --enddate 20190916 --outdir gs://{BUCKET}/wxsearch/data/2019 --project {PROJECT}`, the Dataflow batch job reports that it runs to completion (first image below). However, the job produces a zero-byte TensorFlow record file (second image below). The throughput of zero elements per second at the `create_tfr` step also seems concerning, although I don't know whether that indicates a problem.

Any thoughts as to what may be happening? The only modifications I made were to substitute my own bucket and project values into the command.

[Screenshot 1: Dataflow batch job status, showing the job completed]

[Screenshot 2: output TensorFlow record file in the bucket, showing zero bytes]
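For what it's worth, here is a minimal sanity check I used to confirm the file really contains no records. It is a sketch, not part of the notebook: it assumes the output file has been copied locally (e.g. with `gsutil cp`), and it walks the TFRecord on-disk framing directly (each record is a little-endian uint64 length, a 4-byte length CRC, the payload, and a 4-byte payload CRC) without validating the CRCs:

```python
import struct

def count_tfrecords(path):
    """Count records in a TFRecord file by walking its framing:
    [uint64 length][uint32 len-crc][payload][uint32 payload-crc] per record.
    CRCs are skipped, not verified."""
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header)
            # Skip the length CRC, the payload, and the payload CRC.
            f.seek(4 + length + 4, 1)
            count += 1
    return count
```

A zero-byte file returns 0 immediately, so this at least distinguishes "empty file" from "file with empty records".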
