Rafal
07/11/2020, 11:42 AM
Sebastian
07/11/2020, 6:06 PM
Beibei
07/13/2020, 7:25 AM
from prefect import Parameter

a = Parameter('a', default=1)
b = Parameter('b', default=2)
In the Prefect API, I only see one Parameter, for b. Did I miss something? If I need multiple controls for different arguments in functions, which do you suggest I use, Parameter or Context?
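A minimal sketch of the likely cause, assuming Prefect 0.x semantics: a Parameter only appears in the API once it is used (or explicitly added) inside the flow context, so a Parameter defined at module level but never referenced in a flow is not attached to it. Task and flow names below are illustrative.
from prefect import Flow, Parameter, task

@task
def add(x, y):
    return x + y

with Flow("params-demo") as flow:
    a = Parameter("a", default=1)
    b = Parameter("b", default=2)
    total = add(a, b)  # using both parameters attaches them to the flow

# after registration, the API should now list both 'a' and 'b'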
Klemen Strojan
07/13/2020, 9:17 AM
itay livni
07/13/2020, 2:52 PM
Amit Singh
07/13/2020, 3:21 PM
import importlib

from prefect import Flow
from prefect.schedules import CronSchedule

with Flow('ETL Status Cron Flow', schedule=CronSchedule("*/5 * * * *")) as cron_flow:
    work_conf = importlib.import_module('settings.work_settings')
    importlib.reload(work_conf)
    my_settings = getattr(work_conf, 'my_settings')
    if not my_settings['active']:
        pass  # 'return' is a SyntaxError at module level; note this block runs at flow *build* time
    else:
        print('do something')

state = cron_flow.run()
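If the goal is for the active check to happen on every scheduled run (not once at build time), a hedged sketch using Prefect 0.x's control-flow helpers; ifelse is a real Prefect task, and the settings module path is taken from the snippet above:
import importlib

from prefect import Flow, task
from prefect.schedules import CronSchedule
from prefect.tasks.control_flow import ifelse

@task
def check_active():
    # re-read the settings module each time the flow runs
    work_conf = importlib.import_module('settings.work_settings')
    importlib.reload(work_conf)
    return work_conf.my_settings['active']

@task
def do_something():
    print('do something')

@task
def skip():
    pass

with Flow('ETL Status Cron Flow', schedule=CronSchedule("*/5 * * * *")) as cron_flow:
    ifelse(check_active(), do_something(), skip())

state = cron_flow.run()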
Slackbot
07/13/2020, 3:48 PM
Aaron Y
07/13/2020, 3:53 PM
fargate.py
from prefect.agent.fargate import FargateAgent

agent = FargateAgent(
    enable_task_revisions=True,
    launch_type="FARGATE",
    taskRoleArn="arn:aws:iam::#####61:role/ayang-role",
    executionRoleArn="arn:aws:iam::#####61:role/ayang-role",
    family="#####-task",
    cluster="#####-fargate-cluster",
    networkConfiguration={
        "awsvpcConfiguration": {
            "assignPublicIp": "ENABLED",
            "subnets": ["subnet-#####11"],
            "securityGroups": []
        }
    },
    cpu="1024",
    memory="4096",
    containerDefinitions=[
        {
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/my-first-task",
                    "awslogs-region": "us-east-2",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "image": "#####.dkr.ecr.us-east-2.amazonaws.com/pga_scraper:latest",
            "name": "pga_scraper"
        }
    ],
    labels=["s3-flow-storage"]
)
agent.start()
Hugo Shi
07/13/2020, 5:32 PM
Jacob Blanco
07/14/2020, 1:24 AM
my_password, it'll be a different password than the one Flow B gets if it pulls the same secret?
Jackson Maxfield Brown
07/14/2020, 4:46 AM
Serializer, but has there been any thought about a Verifier or similar? (Entirely just curious at this point)
Philip MacMenamin
07/14/2020, 6:33 AM
Ran prefect backend server
Started prefect agent start docker
Copied the Hello Docker flow from the web page above,
Registered that flow; it created the image without issue:
Successfully tagged hello-docker:2020-07-14t05-57-08-042222-00-00
Flow: http://localhost:8080/flow/e25b220f-c9e4-4051-9f00-20ca88122a99
and submitted a run via the GUI server running on localhost:8080.
The run enters a Submitted state and does not seem to get any further; it stays in that state.
Darragh
07/14/2020, 8:27 AM
Failed to load and execute Flow's environment: ClientError('An error occurred (UnrecognizedClientException) when calling the RegisterTaskDefinition operation: The security token included in the request is invalid')
Sample config for the environment:
flow.environment = FargateTaskEnvironment(
    launch_type="FARGATE",
    region="us-east-1",
    aws_session_token="FwoGZXIvYXdzEOL#####",  # token redacted
    cpu="256",
    memory="512",
    enable_task_revisions=True,
    ....
    ....
)
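One hedged observation: STS session tokens expire quickly, and boto3 needs the matching access key and secret alongside the token. A minimal sketch of supplying fresh credentials through boto3's standard environment-variable chain instead of hard-coding a token into the flow definition (all values are placeholders):
import os

# placeholders; boto3's default credential chain reads these at call time
os.environ["AWS_ACCESS_KEY_ID"] = "<access-key-id>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<secret-access-key>"
os.environ["AWS_SESSION_TOKEN"] = "<session-token>"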
joao
07/14/2020, 8:48 AM
the prefect auth login step.
Prefect tries to connect locally rather than to the cloud:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=4200): Max retries exceeded with url: /graphql (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5aa8c9f630>: Failed to establish a new connection: [Errno 111] Connection refused',))
How do I tell Prefect that it should connect to the cloud?
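A hedged sketch of the usual fix in Prefect 0.12+: switch the client's backend to Cloud before logging in, so it stops targeting localhost:4200 (the token value is a placeholder):
prefect backend cloud
prefect auth login -t <YOUR_CLOUD_TOKEN>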
karteekaddanki
07/14/2020, 8:55 AM
Bernard Greyling
07/14/2020, 9:39 AM
CloudFlowRunner flows.
Locally I have a flow set up and registered to the cloud backend.
I can trigger the flow via the web UI; however, when running locally I get the following:
>> cloud_flow = CloudFlowRunner(flow)
>> flow_state = cloud_flow.run(return_tasks=flow.tasks)
>> flow_state
[2020-07-14 09:37:58] INFO - prefect.CloudFlowRunner | Beginning Flow run for 'Map Reduce'
<Failed: "Could not retrieve state from Prefect Cloud">
Am I missing something here?
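A hedged sketch of the more common pattern: CloudFlowRunner expects a flow run to already exist in Cloud, so to start a run programmatically you would usually go through the Client and let an agent execute it (the flow id is a placeholder):
from prefect import Client

client = Client()
# creates a flow run for an already-registered flow; an agent picks it up
flow_run_id = client.create_flow_run(flow_id="<your-flow-id>")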
Robin
07/14/2020, 10:55 AM
Klemen Strojan
07/14/2020, 11:59 AM
task = ShellTask()
m = Secret('my_secret_on_the_cloud').get()

with Flow('my_flow') as flow:
    name = task(
        helper_script='cd ~',
        env={'CREDENTIAL': m},
        command="mysqldump -u my_user -p$CREDENTIAL my_db > my_db.sql"
    )
while I should be using it like this:
task = ShellTask()

with Flow('my_flow') as flow:
    m = Secret('my_secret_on_the_cloud')
    name = task(
        helper_script='cd ~',
        env={'CREDENTIAL': m},
        command="mysqldump -u my_user -p$CREDENTIAL my_db > my_db.sql"
    )
The latter fails with TypeError. I have the secret on the cloud and in the local config.toml for testing purposes. What am I doing wrong?
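A hedged guess at the fix: inside a flow the secret should be a task, and in Prefect 0.x that is PrefectSecret rather than the plain Secret object (whose value only exists after calling .get()). A minimal sketch:
from prefect import Flow
from prefect.tasks.secrets import PrefectSecret
from prefect.tasks.shell import ShellTask

task = ShellTask()

with Flow('my_flow') as flow:
    m = PrefectSecret('my_secret_on_the_cloud')  # resolved at run time
    name = task(
        helper_script='cd ~',
        env={'CREDENTIAL': m},
        command="mysqldump -u my_user -p$CREDENTIAL my_db > my_db.sql"
    )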
Peter Peter
07/14/2020, 12:18 PM
10:33:16 | Concurrency: 1 threads (target='dev')
10:33:16 |
10:33:16 | 1 of 4 START table model dbt_test.add_rid............................ [RUN]
10:33:28 | 1 of 4 OK created table model dbt_test.add_rid....................... [SELECT 161069 in 11.75s]
10:33:28 | 2 of 4 START table model dbt_test.union.............................. [RUN]
10:33:36 | 2 of 4 OK created table model dbt_test.union......................... [SELECT 159280 in 7.86s]
10:33:36 | 3 of 4 START table model dbt_test.union_join_test.................... [RUN]
* Deprecation Warning: The adapter function `adapter.get_columns_in_table` is
deprecated and will be removed in a future release of dbt. Please use
`adapter.get_columns_in_relation` instead.
Documentation for get_columns_in_relation can be found here:
   https://docs.getdbt.com/docs/adapter
10:33:36 | 3 of 4 ERROR creating table model dbt_test.union_join_test........... [ERROR in 0.22s]
10:33:36 | 4 of 4 START table model dbt_test.union_join......................... [RUN]
10:33:50 | 4 of 4 OK created table model dbt_test.union_join.................... [SELECT 159280 in 14.28s]
10:33:50 |
10:33:50 | Finished running 4 table models in 35.12s.
When I run the same dbt workflow from DbtShellTask, I get this message:

July 14th 2020 at 7:37:14am | prefect.DbtShellTask
ERROR
Command failed with exit code 1: Done. PASS=3 WARN=0 ERROR=1 SKIP=0 TOTAL=4

Trying to make it easier to track down the error. Is there any way to include these full error messages?

Here is a sample of how I am doing this. I was hoping return_all would return all messages from dbt.
with Flow(name="dbt_flow") as flow:
    task = DbtShellTask(
        profile_name='default',
        environment='dev',
        dbt_kwargs={
            'type': 'postgres',
            'threads': 1,
            'host': 'IP',
            'port': 5433,
            'user': 'username',
            'pass': 'docker',
            'dbname': 'actualDbName',
            'schema': 'dbt_test'
        },
        overwrite_profiles=True,
        profiles_dir='Actual Path',
        return_all=True
    )(command='dbt run')

flow.register()

Any help would be great.
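A hedged sketch, assuming your Prefect version's ShellTask supports the log_stderr flag (DbtShellTask forwards ShellTask kwargs): asking the task to log stderr on failure should surface the full dbt error rather than just the exit-code summary.
from prefect import Flow
from prefect.tasks.dbt import DbtShellTask

dbt_task = DbtShellTask(
    profile_name='default',
    environment='dev',
    profiles_dir='Actual Path',
    return_all=True,   # return every output line, not just the last
    log_stderr=True,   # assumption: available in recent 0.12.x releases
)

with Flow(name="dbt_flow") as flow:
    dbt_task(command='dbt run')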
karteekaddanki
07/14/2020, 3:23 PM
Zach Angell
07/14/2020, 4:11 PM
Submitted state by default.
Ben Fogelson
07/14/2020, 7:09 PM
import prefect as p

@p.task
def add_one(n):
    return n + 1

@p.task
def add_two(n):
    return n + 2

with p.Flow('f') as f:
    a = p.Parameter('a')
    b = add_one(a)
    b.slug = 'my_slug'
    c = add_two(a)
    c.slug = 'my_slug'

f.get_tasks(slug='my_slug')  # returns both tasks
f.validate()  # passes
Kyle Combs
07/14/2020, 9:15 PM
Philip MacMenamin
07/15/2020, 12:43 AM
Karen
07/15/2020, 1:12 AM
Jacob Blanco
07/15/2020, 7:01 AM
my_project/postgres_user that got converted to my_project/ and now I cannot delete it.
Romain
07/15/2020, 8:46 AM
DASK_DISTRIBUTED__SCHEDULER__BLOCKED_HANDLERS?
Avi A
07/15/2020, 1:21 PM
dict, prefect handles it by planting the GetItem task in the DAG:
dict_output = generate_dict()
a = task_a(dict_output['a'])
b = task_b(dict_output['b'])
How would you recommend working with dataclasses? The following piece of code won't work, naturally (since a task doesn't have these attributes):
@dataclass
class MyClass:
    a: int
    b: int
...
...
dataclass_output = generate_dataclass()
a = task_a(dataclass_output.a)
b = task_b(dataclass_output.b)
c = another_task(dataclass_output)
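A hedged workaround sketch: attribute access is not intercepted the way __getitem__ is, so a small extractor task can pull the fields out at run time. get_attr below is a hypothetical helper, not a Prefect API.
from dataclasses import dataclass
from prefect import Flow, task

@dataclass
class MyClass:
    a: int
    b: int

@task
def generate_dataclass():
    return MyClass(a=1, b=2)

@task
def get_attr(obj, name):
    # hypothetical helper: resolves the attribute when the flow runs
    return getattr(obj, name)

@task
def task_a(a):
    return a + 1

with Flow("dataclass-demo") as flow:
    dataclass_output = generate_dataclass()
    a = task_a(get_attr(dataclass_output, "a"))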
simone
07/15/2020, 3:02 PM
Sven Teresniak
07/15/2020, 4:38 PM
flow.register() and it's working fine. I see the work in the dask-ui and in the prefect-ui as well. Now the question: is there an easy way to modularize a flow, i.e. to allow imports of local modules (e.g. boilerplate code) so code can be re-used in different flows? In Spark I can bundle a boilerplate.jar and publish the jar together with the job for a single job run (or the same jar for every job). Is there any mechanism like that in Prefect?
Do I have to deploy the boilerplate code to my (every) dask-worker in advance?
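A hedged sketch of one Dask-side option (not Prefect-specific): distributed's Client.upload_file ships a local module, .egg, or .zip to every worker, which is close in spirit to publishing a jar with the job. The scheduler address is illustrative.
from dask.distributed import Client

client = Client("tcp://dask-scheduler:8786")  # illustrative address
# distribute a local module to all current (and newly joining) workers
client.upload_file("boilerplate.py")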