Kyle Austin

05/19/2022, 3:34 PM
Our team uses the BigQueryLoadGoogleCloudStorage task to load blob URIs into BigQuery. The docstring specifies that uri is an optional string: https://github.com/PrefectHQ/prefect/blob/master/src/prefect/tasks/gcp/bigquery.py However, uri is passed as the source_uris parameter of the BigQuery client's load_table_from_uri method, and according to the GCP BigQuery client documentation, that parameter can be either a string or an iterable of strings: https://cloud.google.com/python/docs/reference/bigquery/latest/google.cloud.bigquery.client.Client#google_cloud_bigquery_c[…]ent_load_table_from_uri Is there any reason why the Prefect task BigQueryLoadGoogleCloudStorage specifies an optional string rather than Union[str, Sequence[str]]? Looking at the code, it seems like passing a sequence should be fine, but I want to make sure we don't fall into an anti-pattern.

Andrew Huang

05/19/2022, 4:06 PM
One reason is that you could map() the URIs as separate tasks and monitor each one separately:
BigQueryLoadGoogleCloudStorage(…).map(list_of_uris)

Kyle Austin

05/19/2022, 6:01 PM
I definitely thought about that -- but I always worry about having too many load processes writing to the same table concurrently. It would be nice to have the list of URIs run sequentially instead, to avoid hitting any concurrency limits. I am not sure how load_table_from_uri handles a list of URIs (sequentially? concurrently? I don't want to dig too deep into their code to find out). But I feel more confident letting their library handle it as one load job than creating several concurrent load jobs that might conflict or cause too many concurrent writes to a table.
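
To make the trade-off concrete, here is a minimal sketch of the two approaches. FakeClient is a hypothetical stand-in, not part of google-cloud-bigquery; it only records what would be submitted. The sketch assumes that one load_table_from_uri call corresponds to one load job, whether it receives a single URI or several. The bucket and table names are placeholders.

```python
from typing import List, Sequence, Union


class FakeClient:
    """Hypothetical stand-in for a BigQuery client; records jobs instead of running them."""

    def __init__(self) -> None:
        self.jobs: List[List[str]] = []

    def load_table_from_uri(
        self, source_uris: Union[str, Sequence[str]], destination: str
    ) -> int:
        # Assumption: one call starts one load job, regardless of how many URIs it gets.
        uris = [source_uris] if isinstance(source_uris, str) else list(source_uris)
        self.jobs.append(uris)
        return len(self.jobs)  # fake job number


uris = [
    "gs://example-bucket/a.csv",
    "gs://example-bucket/b.csv",
    "gs://example-bucket/c.csv",
]

# Mapped style: one call (and so one job) per URI, all targeting the same table.
per_uri = FakeClient()
for uri in uris:
    per_uri.load_table_from_uri(uri, "my_dataset.my_table")

# Single-call style: every URI is handed to one job.
single = FakeClient()
single.load_table_from_uri(uris, "my_dataset.my_table")

print(len(per_uri.jobs), len(single.jobs))  # prints: 3 1
```

With the mapped style, how many of those jobs run at once depends on the flow's executor; with the single call, there is only one job to schedule.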

Andrew Huang

05/19/2022, 6:04 PM
Since load_table_from_uri accepts a sequence of strings, and the Prefect task library uses it internally, passing in a sequence of strings should work! Feel free to also make a PR to update the docstring.
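
If a PR does widen the annotation, the change might look roughly like the sketch below. UriInput and normalize_uris are hypothetical names used only for illustration; neither is part of Prefect or google-cloud-bigquery. The helper just shows how a single string and a sequence of strings can be treated uniformly.

```python
from typing import List, Sequence, Union

# Hypothetical alias for the widened parameter type discussed above.
UriInput = Union[str, Sequence[str]]


def normalize_uris(uri: UriInput) -> List[str]:
    """Return a list of URIs whether given one string or a sequence of strings."""
    # A str is itself a sequence of characters, so check for it first.
    if isinstance(uri, str):
        return [uri]
    return list(uri)


one = normalize_uris("gs://example-bucket/a.csv")
many = normalize_uris(("gs://example-bucket/a.csv", "gs://example-bucket/b.csv"))
```

Checking for str before treating the value as a sequence matters here, because iterating over a bare string would otherwise yield individual characters.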

Kyle Austin

05/19/2022, 6:07 PM
Thanks Andrew -- honestly, I really just wanted to make sure I wasn't going against a recommended pattern (especially if it would mean having to make major updates in the future based on developments on Prefect's side).

Andrew Huang

05/19/2022, 6:08 PM
no problem!