Joshua Greenhalgh
06/12/2022, 1:48 PM
Dung Khuc
06/13/2022, 8:04 AM
Joshua Greenhalgh
06/13/2022, 10:40 AM
063bd788-bc0c-46f5-85e4-1092c2b19297
Priyank
06/13/2022, 11:14 AM
Izu
06/13/2022, 11:28 AM
Anna Geller
Jelle Vegter
06/13/2022, 1:36 PM
Oscar Krantz
06/13/2022, 2:29 PM
Renuka
06/13/2022, 2:59 PM
yu zeng
06/13/2022, 3:20 PM
Brian Phillips
06/13/2022, 5:44 PM
Mary Clair Thompson
06/13/2022, 7:08 PM
Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "/tmp/flow-script-calculate-blob-size4cemrmn_.py", line 1, in <module>
ModuleNotFoundError: No module named 'local_module'
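The traceback above shows the flow script failing on its first line because local_module cannot be found where the flow is loaded. A minimal diagnostic sketch, using only the standard library, checks whether a top-level module name resolves on the current sys.path. The helper name is_importable is hypothetical, and running it inside the environment that executes the flow is an assumption about where the failure occurs.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located on the current sys.path."""
    # find_spec returns None when no finder can locate a top-level module.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    name = "local_module"  # the module named in the traceback
    print(f"{name} importable: {is_importable(name)}")
    print("search path:")
    for entry in sys.path:
        print(f"  {entry}")
```

If the check reports False in the execution environment but True locally, the module is likely not being shipped with the flow or its directory is not on the search path there.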
Slackbot
06/13/2022, 7:17 PM
Mitchell Bregman
06/13/2022, 8:01 PM
DbtShellTask; locally, everything works as expected. When I deploy to Prefect Cloud, I get "dbt: command not found". My requirements include dbt-redshift, and my flow storage is Docker. When I build the Docker image locally, I also get "dbt: command not found" inside the container, even though dbt is installed in my Python environment packages. Has anyone run into this before?
Ken Nguyen
06/13/2022, 11:15 PM
a -> b -> c
I want it so that if a or b fails, it sends a Slack notification saying “Flow failed”. But if c fails, I want it to alter the Slack message by adding b’s output.
Wei Mei
06/14/2022, 12:32 AM
Mitchell Bregman
06/14/2022, 1:58 AM
01:13:51 Running with dbt=1.1.0
01:13:51 Partial parse save file not found. Starting full parse.
01:13:52 Found 9 models, 75 tests, 0 snapshots, 0 analyses, 196 macros, 0 operations, 0 seed files, 1 source, 0 exposures, 0 metrics
01:13:52
01:13:52 Encountered an error:
Database Error
  could not connect to server: No such file or directory
    Is the server running locally and accepting
    connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5439"?
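The socket path in the error above suggests the client was given no host name. PostgreSQL clients fall back to a local Unix domain socket when the host is empty, and 5439 is the port Redshift commonly uses. A minimal pre-flight sketch follows, assuming the connection settings reach the container through environment variables. The variable names DBT_HOST, DBT_USER, and DBT_PASSWORD and the helper missing_settings are hypothetical; the original message does not say how the profile is configured.

```python
import os
from typing import List, Mapping

# Hypothetical variable names; substitute whatever your dbt profile reads.
REQUIRED_SETTINGS = ("DBT_HOST", "DBT_USER", "DBT_PASSWORD")


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return the required settings that are unset or blank in env."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings(os.environ)
    if missing:
        # An empty host is one way to end up on the local Unix socket.
        print("missing connection settings: " + ", ".join(missing))
    else:
        print("all connection settings are present")
```

Running a check like this inside the container before invoking dbt can show whether the values present locally are also present in the deployed environment.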
Michelle Brochmann
06/14/2022, 3:10 AM
Result object. I am doing operations on spark DataFrames in Tasks and would like to pass a DataFrame from Task to Task via the Result object. If the DataFrame is too large to fit in memory simply passing a custom serializer won’t work. So I am thinking I could create a subclass of Result called SparkDataFrameResult where the read and write methods are overridden to use the spark load and save methods.
1. Is there any reason this wouldn’t work or wouldn’t be recommended?
2. Are there any best practices for doing something like this?
William Jamir
06/14/2022, 9:01 AM
Satnam Singh
06/14/2022, 9:22 AM
Florian Guily
06/14/2022, 10:12 AM
Tarek
06/14/2022, 10:39 AM
Sang Young Noh
06/14/2022, 1:06 PM
Toby Rahloff
06/14/2022, 1:17 PM
Ayush Bairagi
06/14/2022, 1:33 PM
Ilhom Hayot o'g'li
06/14/2022, 1:44 PM
Joseph Mathes
06/14/2022, 2:25 PM
Paco Ibañez
06/14/2022, 2:34 PM
@flow
def records_analysis_flow():
    ...
    future = ingest_csv(config)
    future.wait()
    future = analyze_records(ra_config)
    future.wait()
I noticed somewhat inconsistent performance results. If I run the code in Docker without Prefect, it takes around 154 seconds, but when I run it in Prefect it takes almost 900 seconds. However, if I explicitly set the task runner to sequential, it takes ~180 seconds. Is my flow doing something wrong? Why does the concurrent task runner take so long? Thanks!
John-Craig Borman
06/14/2022, 2:48 PM
Jason Damiani
06/14/2022, 4:40 PM