
Hafsa Junaid

08/04/2022, 11:08 PM
Team, my Python program saves a CSV file as output in the same directory as the program. When I try to run this pipeline through Prefect 2.0 Cloud, I get this error. How can I get rid of it? The storage is configured by Prefect itself, and when I try to specify a storage block, the deployment.yaml file is not created. Also, is there a working Prefect 2.0 example that uses remote storage?

Anna Geller

08/05/2022, 12:37 AM
Could it be that you want to avoid uploading this CSV file to your storage? If so, you can add it to .prefectignore.
But the error looks like it comes from PySpark. Do you get the same error if you run it without Prefect?
Also, do you get the same error when you run this flow locally without a deployment?

Hafsa Junaid

08/05/2022, 2:44 AM
The CSV file is required since that's my output, but it gets saved somewhere in a temp directory. And yes, this works when I run it without a deployment @Anna Geller

Open AIMP

08/05/2022, 5:23 AM
@Hafsa Junaid do you still have the tmp file issue?

Hafsa Junaid

08/05/2022, 5:34 AM
@Open AIMP Yes, the file is being saved in a temp directory, and that path is not supported

Anna Geller

08/05/2022, 11:12 AM
@Hafsa Junaid, generally speaking, flow and task runs are supposed to be stateless. I would encourage you to persist your data in a database, S3, or similar. This is a data engineering best practice, too.
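To make the point about stateless runs concrete, here is a minimal sketch (plain Python, no Prefect API involved) of writing the output to an explicitly configured absolute location instead of the current working directory, which may be a temp directory when the code runs from a deployment. The environment variable `PIPELINE_OUTPUT_DIR` and the helper names `resolve_output_dir` and `save_rows_as_csv` are hypothetical, chosen only for illustration. For S3, GCS, or a database you would upload with the matching client library instead of writing to a local path.

```python
import csv
import os
from pathlib import Path


def resolve_output_dir(env_var="PIPELINE_OUTPUT_DIR"):
    """Return an absolute output directory read from an environment variable.

    The variable name is an assumption for this sketch, not a Prefect setting.
    """
    raw = os.environ.get(env_var)
    if not raw:
        # Fail loudly rather than silently falling back to the working directory.
        raise RuntimeError(f"{env_var} is not set; refusing to write to the working directory")
    output_dir = Path(raw).expanduser().resolve()
    output_dir.mkdir(parents=True, exist_ok=True)
    return output_dir


def save_rows_as_csv(rows, header, filename, output_dir):
    """Write a header plus rows to output_dir/filename and return the full path."""
    path = Path(output_dir) / filename
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return path
```

Calling `save_rows_as_csv(rows, header, "result.csv", resolve_output_dir())` at the end of the flow keeps the destination independent of wherever the run happens to execute.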

Hafsa Junaid

08/07/2022, 9:35 PM
@Anna Geller okay, I understand this, but when I refer to the block name in the deployment command
prefect deployment build ./als_labtest.py:als_labtest --name "alslabtest" -sb GCS/gcsblock
I get an error. Following your suggestion, I did not reference any storage block and the deployment was successful, but now it saves files to an anonymous temp location.

Anna Geller

08/07/2022, 10:43 PM
Can you share a screenshot of your block from the UI? Redact any private information.

Hafsa Junaid

08/08/2022, 3:49 AM