r

Rajan Subramanian

03/16/2022, 6:36 PM
Hello, I am having issues installing Prefect 2.0 on my AWS instance. I have Ubuntu 18.04 running, which comes with an older version of sqlite3. Prefect 2.0 requires a version greater than 3.24. Is there any way to skip the sqlite3 dependency when installing Prefect? @Kevin Kho @Anna Geller
k

Kevin Kho

03/16/2022, 6:37 PM
You can try using
pip install --no-deps
to skip installing dependencies, and then install them manually
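A rough sketch of what that could look like (the dependency list below is illustrative, not Prefect's actual requirements):
```
# Install Prefect itself without pulling in its dependency tree
pip install prefect --no-deps

# Then install the dependencies you need by hand -- the packages named
# here are illustrative examples, not the full or exact list
pip install pydantic httpx typer
```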
r

Rajan Subramanian

03/16/2022, 6:38 PM
pip install prefect --no-deps?
k

Kevin Kho

03/16/2022, 6:42 PM
Yep
See this
r

Rajan Subramanian

03/16/2022, 6:44 PM
Seems I'm still getting this error: Orion requires sqlite >= 3.24.0 but we found version 3.22.0
z

Zanie

03/16/2022, 6:44 PM
You won’t be able to use Prefect with the old version of SQLite since we use newer features
Unfortunately, it’s really hard to install a new version of SQLite because it’s linked into Python
r

Rajan Subramanian

03/16/2022, 6:45 PM
Oh I see, so I suppose I have to upgrade to Ubuntu 20, which ships with a newer sqlite package?
z

Zanie

03/16/2022, 6:46 PM
Yeah we test on Ubuntu 20.04
You can also try something like https://charlesleifer.com/blog/compiling-sqlite-for-use-with-python-applications/ but really it’s a pain
If there was a good way to get a newer sqlite installed we’d have a tutorial on it but I haven’t found a reliable method yet 😕
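For anyone who does attempt it, the general shape of that blog post's approach is roughly this (the SQLite version, URL, and paths below are examples, not verified instructions):
```
# Build a newer SQLite from source (version and URL are examples)
wget https://www.sqlite.org/2022/sqlite-autoconf-3380000.tar.gz
tar xzf sqlite-autoconf-3380000.tar.gz
cd sqlite-autoconf-3380000
./configure && make && sudo make install

# Point the dynamic linker at the new library before starting Python,
# otherwise the interpreter keeps loading the system SQLite
export LD_LIBRARY_PATH=/usr/local/lib
python3 -c "import sqlite3; print(sqlite3.sqlite_version)"
```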
r

Rajan Subramanian

03/16/2022, 8:08 PM
Thanks for your reply. I installed Ubuntu 20.04 on my machine, but now I'm getting this error. Any idea what to do?
Can't locate revision identified by '71a57ec351d1'
z

Zanie

03/16/2022, 8:10 PM
prefect orion database stamp HEAD
then
prefect orion database reset
should fix it; if not you can delete the database file manually from
~/.prefect/orion.db
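Putting those recovery steps in one place (note the reset recreates the local Orion database, so only run it if you can lose that data):
```
# Mark the database as being at the latest migration
prefect orion database stamp HEAD

# Rebuild the schema from scratch (destroys existing data)
prefect orion database reset

# Last resort: delete the SQLite database file itself
rm ~/.prefect/orion.db
```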
r

Rajan Subramanian

03/16/2022, 8:10 PM
oh ok
@Zanie, thank you. The last one, going to the Prefect directory and deleting the database, fixed it.
@Zanie, when I SSH into an EC2 instance, how do I view the Orion dashboard? I did prefect orion start and see a URL pop up. Is there a way to access it within EC2, or from my local machine?
z

Zanie

03/16/2022, 9:32 PM
When you start orion, you can expose it to remote connections with
--host 0.0.0.0
Then if you know the EC2 instance’s IP you can connect to it at http://<ec2-ip>:4200/
However, I would only recommend doing this if your instance is on a VPC or something
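Concretely, something like this on the EC2 box, assuming the instance's security group allows inbound traffic on port 4200 (Orion's default):
```
# On the EC2 instance: bind the API/UI to all interfaces instead of localhost
prefect orion start --host 0.0.0.0

# Then from a browser on your local machine (public IP is a placeholder):
#   http://<ec2-public-ip>:4200/
```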
r

Rajan Subramanian

03/16/2022, 9:48 PM
@Zanie, wow, awesome, thanks, it worked
why a VPC and not EC2?
sorry, I mean why a VPC and not just plain AWS?
z

Zanie

03/16/2022, 9:49 PM
You’ve now exposed your server to the internet, so anyone could send requests to it
r

Rajan Subramanian

03/16/2022, 9:50 PM
ahh ok got it
z

Zanie

03/16/2022, 9:50 PM
Putting your EC2 instance behind a private network would be safer.
r

Rajan Subramanian

03/16/2022, 9:50 PM
So when you say anyone can send requests to it, you mean if someone uses the same port and IP they can send requests to it?
The only people who have the IP are me and my boss.
z

Zanie

03/16/2022, 9:52 PM
Malicious actors will often test ports across the entire range of AWS IPs
r

Rajan Subramanian

03/16/2022, 9:52 PM
oh ok damn
OK, I will definitely let my team know that we need to migrate it to a VPC
I wanted an MVP running ASAP. Do you think that can be done as a next step once I have the MVP up and running, i.e. set up a VPC next?
z

Zanie

03/16/2022, 9:54 PM
Yeah, you can use an AWS VPC and something like AWS VPN to limit access https://aws.amazon.com/vpn/
r

Rajan Subramanian

03/16/2022, 9:54 PM
k thanks i will look into this
security is pretty important for us
z

Zanie

03/16/2022, 9:55 PM
I would not leave it exposed, personally. Someone could execute arbitrary code by registering and running flows.
r

Rajan Subramanian

03/16/2022, 9:56 PM
Oh OK, hmm, alright, let me set up a VPC then.
@Zanie, when I start the Orion server over SSH, I see the dashboard in my local web browser, but when I do prefect deployment create deployment_name, it's not showing any deployments on the page.
Do I need to adjust any AWS security settings, like allowing an incoming port?
Hmm, it's not really updating my deployments on the dashboard after I do prefect deployment create.
k

Kevin Kho

03/16/2022, 11:47 PM
I think this is more about the UI still being hardcoded to localhost, so if you view the UI on your local machine, it will show the local database
r

Rajan Subramanian

03/17/2022, 1:30 PM
Oh OK, then how can I view the UI from my local machine with the remote database option?
Sorry for all these questions; it's not clear to me how to deploy Prefect via an SSH tunnel and view the dashboard it creates.
@Kevin Kho, I am reading the docs and curious: if I am using the remote Postgres database I created in EC2, and firing up the UI from my local machine, is one way to view the remote database
export PREFECT_ORION_DATABASE_CONNECTION_URL="postgresql+asyncpg://ubuntu:mypassword@ec2-34-????.compute-1.amazonaws.com:5432/orion"
?
k

Kevin Kho

03/17/2022, 2:29 PM
You can’t use the UI against a remote machine yet, but you can already interact with the API remotely. The UI portion just hasn’t been released, but it will surely be part of the immediate roadmap
r

Rajan Subramanian

03/17/2022, 2:31 PM
Oh OK, then how will I deploy this on an EC2 instance? What I am doing is SSH into my EC2 instance and cd into the directory where I have my deployments; I just want to be able to run them.
Would I open separate SSH tunnels and do a prefect deployment execute my_deployment?
@Kevin Kho, is the new Prefect version Cloud compatible now?
k

Kevin Kho

03/17/2022, 3:33 PM
Yes, that might be the way for now. And yes, the hosted Cloud might be better for you; it’s free to start with
r

Rajan Subramanian

03/17/2022, 3:33 PM
thank god
i was dying here
lol
k

Kevin Kho

03/17/2022, 3:34 PM
That experience will definitely be improved though when the localhost is no longer hardcoded and when the remote storage story gets solved
r

Rajan Subramanian

03/17/2022, 3:37 PM
Yeah, hopefully. I guess I was running short on time and couldn't figure out how to deploy remotely and view it locally.
But the Cloud solves that tremendously.
k

Kevin Kho

03/17/2022, 3:41 PM
I think the Cloud solves the UI issue, but you still need to get your files on the execution environment
r

Rajan Subramanian

03/17/2022, 3:46 PM
@Kevin Kho, that's easy, right? I am assuming I set up Prefect to use cloud storage on the backend, similarly to how it was done in Prefect Core.
And then I create deployments and they automatically show up in the Cloud?
k

Kevin Kho

03/17/2022, 4:09 PM
Yeah as long as the files are in the same execution environment. The default storage tutorials will be updated as Michael mentioned earlier
r

Rajan Subramanian

03/17/2022, 4:19 PM
@Kevin Kho, when I configured Cloud in Prefect 2.0, this step is creating an error for me: prefect cloud login --key secret_key -w my_company_workspace. I get the notification:
Successfully logged in and set workspace to my_company_workspace in profile: default
But then I see an error at the end:
Traceback (most recent call last):
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/cli/base.py", line 58, in wrapper
    return fn(*args, **kwargs)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/asyncio.py", line 120, in wrapper
    return run_async_in_new_loop(async_fn, *args, **kwargs)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/asyncio.py", line 67, in run_async_in_new_loop
    return anyio.run(partial(__fn, *args, **kwargs))
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_core/_eventloop.py", line 56, in run
    return asynclib.run(func, *args, **backend_options)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 233, in run
    return native_run(wrapper(), debug=debug)
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 228, in wrapper
    return await func(*args)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/cli/cloud.py", line 216, in login
    exit_with_success(
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/cli/base.py", line 193, in exit_with_success
    raise typer.Exit(0)
click.exceptions.Exit: 0
An exception occurred.
k

Kevin Kho

03/17/2022, 4:20 PM
This I’ll need to ask the team about
z

Zanie

03/17/2022, 4:22 PM
The error can be safely ignored
That’s a bug in exit code handling, there will be a fix out today.
r

Rajan Subramanian

03/17/2022, 4:22 PM
Oh OK, thanks.
Also, prefect storage ls has a bug too:
engine$ prefect storage ls
Traceback (most recent call last):
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/cli/base.py", line 58, in wrapper
    return fn(*args, **kwargs)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/asyncio.py", line 120, in wrapper
    return run_async_in_new_loop(async_fn, *args, **kwargs)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/asyncio.py", line 67, in run_async_in_new_loop
    return anyio.run(partial(__fn, *args, **kwargs))
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_core/_eventloop.py", line 56, in run
    return asynclib.run(func, *args, **backend_options)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 233, in run
    return native_run(wrapper(), debug=debug)
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 228, in wrapper
    return await func(*args)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/cli/storage.py", line 178, in ls
    json_blocks = await client.read_blocks(block_spec_type="STORAGE", as_json=True)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/client.py", line 937, in read_blocks
    response = await self._client.post(
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/httpx.py", line 137, in post
    return await self.request(
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/httpx.py", line 47, in request
    request = self.build_request(
TypeError: build_request() got an unexpected keyword argument 'extensions'
An exception occurred.
z

Zanie

03/17/2022, 4:24 PM
Your httpx version is 1.0.0b1
You’ll want to
pip uninstall httpx
and
pip install httpx
(I presume, since we’ve seen this with that before — you could confirm with
pip show httpx
as well)
r

Rajan Subramanian

03/17/2022, 4:29 PM
Oh OK, thanks. So since I have configured the Cloud setup, I assume the next step is to create deployments and work queues,
and the Cloud API will automatically pick them up?
k

Kevin Kho

03/17/2022, 4:29 PM
Yes, if you configure your local environment to hit the Cloud endpoint, then the commands will be run against the Cloud
r

Rajan Subramanian

03/17/2022, 4:30 PM
Awesome, thanks. If I run into any issues I will bother you here.
@Kevin Kho, I deployed the pipeline on the Cloud, but now I'm having an issue where I see late runs in my deployments. I did the following steps; am I missing anything?
prefect deployment create binance/orderbook_deployment.py
prefect work-queue create test_queue
prefect agent start '<work-queue-uuid>'
where the UUID comes from the previous work-queue step. How do I get these to execute on the Cloud?
Hmm, all my scheduled late deployments suddenly disappeared from the Cloud.
k

Kevin Kho

03/19/2022, 3:39 PM
did they run?
r

Rajan Subramanian

03/19/2022, 3:39 PM
nope none did
k

Kevin Kho

03/19/2022, 3:39 PM
Will try this myself later or tomorrow
r

Rajan Subramanian

03/19/2022, 3:39 PM
Geez, the filters get automatically applied
k

Kevin Kho

03/19/2022, 3:40 PM
what filters did you have?
r

Rajan Subramanian

03/19/2022, 3:40 PM
I filtered by my pipeline name,
but after 15 minutes the filter disappears and the original 1d filter comes back.
That's not a big deal to me, I just want these to run, but yeah, I'll wait for you to check it out.
Also, I hit the quick run button, and hence it shows the flow runs, but they are all late.
k

Kevin Kho

03/19/2022, 3:44 PM
are you saying the filters on the dashboard get automatically applied or filters on the work queue?
r

Rajan Subramanian

03/19/2022, 3:44 PM
Filters on the dashboard.
Hmm, this is weird:
when I click on the work queue I created,
I don't see any deployments listed there.
k

Kevin Kho

03/19/2022, 3:47 PM
Can you try on the CLI
prefect work-queue preview --hours 12 'acffbcc8-ae65-4c83-a38a-96e2e5e5b441'
r

Rajan Subramanian

03/19/2022, 3:54 PM
Give me a sec, for some reason my remote EC2 instance froze
@Kevin Kho
k

Kevin Kho

03/19/2022, 5:07 PM
Looks like the work queue is working right?
r

Rajan Subramanian

03/21/2022, 3:58 PM
@Kevin Kho, can I start a work-queue agent from the Cloud UI rather than from the terminal, where I have to SSH into Linux?
k

Kevin Kho

03/21/2022, 4:00 PM
No because Cloud does not host compute for jobs
r

Rajan Subramanian

03/21/2022, 4:00 PM
Yeah, it's very weird. When I SSH into a Linux terminal and call prefect agent,
@Kevin Kho, it freezes my terminal
and I'm forced to reboot.
k

Kevin Kho

03/21/2022, 4:02 PM
But not on local right? Does it only freeze after a flow is deployed?
r

Rajan Subramanian

03/21/2022, 4:03 PM
Yeah, not on local;
on local it's working fine.
It freezes only after I call
prefect agent start '<unique-uuid>'
where the unique UUID is the one that's generated when I do
prefect work-queue create live_feeds_to_postgres_agent
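For reference, the sequence being described looks roughly like this (the queue name comes from the message above; the UUID placeholder is whatever the create command prints):
```
# Create the work queue; the command prints the queue's UUID
prefect work-queue create live_feeds_to_postgres_agent

# Start an agent that polls that queue, using the printed UUID
prefect agent start '<work-queue-uuid>'
```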
k

Kevin Kho

03/21/2022, 4:04 PM
Ok will try to see if I can replicate
r

Rajan Subramanian

03/21/2022, 4:05 PM
Do you want me to create a sample hello-world deployment and try that first,
rather than the complicated deployment I have,
and see if that works on the Linux side?
a

Anna Geller

03/21/2022, 4:06 PM
Always great to build and share a simple example we can reproduce on our end 👍
r

Rajan Subramanian

03/21/2022, 4:07 PM
OK, I will make a fake hello-world deployment, see if I can replicate the problem, and send you the same file.
It is executing, but my shell is frozen.
Then when I retry to log in, it breaks and doesn't allow me to SSH anymore, so I have to reboot the instance.
Give me 10 minutes; I have to reboot the instance and see if I can replicate the problem.
a

Anna Geller

03/21/2022, 4:12 PM
We actually prefer asynchronous communication 😄 so please take your time and post one larger message once you have a reproducible example and have calmly investigated the issue in detail
r

Rajan Subramanian

03/21/2022, 7:38 PM
Hey, OK, so I ran some tests. I created a sample hello-world deployment and it deploys to my Prefect Cloud with no issues, both locally and from the SSH terminal. So it seems the Linux side is working. I then ran the deployment I created, 'redis_to_postgres_deployment'; it deploys successfully on the Linux side and gets scheduled. Now, the next step: deploying 10 different processes for the FTX exchange. I ran
prefect deployment create ftx/order_book_deployments.py and I see 10 deployments created in the UI.
When I go to the UI and click run, the EC2 instance suddenly freezes up.
I then rebooted and repeated the steps above, but instead of going to the UI to hit run, I did the following:
prefect deployment execute live_feeds_to_redis_pipeline/ftx_L1_avaxusd and then I see the following message: it's completely stuck, I can't hit cancel or anything, and it shows up late in the UI. The same sequence of steps works completely fine on my local machine. The pipeline makes a shell call to a Python file. When I go to the SSH terminal, cd to the location of the Python file, and do python3 file.py, it streams perfectly in the shell, but the deployment is creating a problem.
k

Kevin Kho

03/21/2022, 7:42 PM
I think this might be because the DaskTaskRunner is already occupying all of the cores of the machine. Did you configure it? I don’t know if this is the main issue, but you can’t deploy these concurrent flows that all use the DaskTaskRunner because they will then compete for resources.
r

Rajan Subramanian

03/21/2022, 7:44 PM
oh
k

Kevin Kho

03/21/2022, 7:45 PM
LocalDask is like a multiprocessing pool, and then you’ll be spinning up multiple pools that all try to occupy all the threads of the machine
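If several flows have to share one small box, a hedged sketch of capping each flow's local Dask cluster looks like this. The exact import path depends on the Prefect 2.0 beta in use (in later releases DaskTaskRunner moved to the prefect-dask package), and the task body is a placeholder:
```
from prefect import flow, task
from prefect.task_runners import DaskTaskRunner  # prefect_dask.DaskTaskRunner in later releases


@task
def stream_order_book(symbol: str):
    ...  # placeholder for the actual streaming logic


# Cap the local Dask cluster so several flows can coexist on one machine
@flow(
    task_runner=DaskTaskRunner(
        cluster_kwargs={"n_workers": 1, "threads_per_worker": 1}
    )
)
def order_book_pipeline(symbol: str = "avaxusd"):
    stream_order_book(symbol)
```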
r

Rajan Subramanian

03/21/2022, 7:46 PM
Oh OK.
@Kevin Kho, hmm, so when I checked my Mac, I have 10 cores, and I think for the Dask cluster I allocated 4 workers and 2 threads per worker, so 8 cores in total used. But on the Linux side I have only 1 core.
Perhaps that's why.
k

Kevin Kho

03/21/2022, 8:54 PM
That makes sense. There is no other core to continue deploying flows or it’s running into some deadlock
r

Rajan Subramanian

03/21/2022, 8:54 PM
yea lol
jesus
OK, thanks a lot, hopefully that fixes things for me.
Will circle back if anything pops up.
Hey, I am still getting the same error. I changed n_workers to 1 and n_threads to 1; on the EC2 instance, it still freezes as soon as I run this.
k

Kevin Kho

03/21/2022, 10:36 PM
But I think you still need more than 1 core because the agent is using that single core that you have?
r

Rajan Subramanian

03/21/2022, 10:37 PM
So if in EC2 I see cpu=1,
what am I allocating?
k

Kevin Kho

03/21/2022, 10:37 PM
More CPUs, like 4, just to test
r

Rajan Subramanian

03/21/2022, 10:37 PM
And one thread?
Initially I was running one deployment in EC2 and it runs fine;
it's only when I run the second deployment that it creates issues for me.
k

Kevin Kho

03/21/2022, 10:39 PM
I don’t think you define threads on the VM right? It’s just CPU count?
r

Rajan Subramanian

03/21/2022, 10:39 PM
Yeah, pretty much.
So what do I pass for parameters,
for threads_per_worker?
Yeah, I just tested; I'm not able to run more than one flow at a time
in EC2.
k

Kevin Kho

03/21/2022, 11:14 PM
No, you have to make the EC2 instance more powerful to support the additional flow runs
r

Rajan Subramanian

03/22/2022, 1:56 PM
@Kevin Kho, I'm looking at AWS EC2 instance types; is there anything recommended for live trading? How many vCPUs would be good?
k

Kevin Kho

03/22/2022, 2:25 PM
I don’t know how many flows you eventually intend to run concurrently, but the same specs as your local development machine would be a good start?
r

Rajan Subramanian

03/22/2022, 2:26 PM
Ideally it will be in the hundreds.
k

Kevin Kho

03/22/2022, 2:30 PM
Man you might need a really powerful EC2 instance then
r

Rajan Subramanian

03/22/2022, 2:31 PM
How are people doing it in practice, running thousands of flows?
k

Kevin Kho

03/22/2022, 2:33 PM
Kubernetes and autoscaling so that you can spin up a job
But I think your scrapers are always going to be on?
r

Rajan Subramanian

03/22/2022, 2:34 PM
yep
im scraping 100 currencies from 10 different exchanges
so 400 live streams of data
sorry 1000 live streams of data
k

Kevin Kho

03/22/2022, 2:41 PM
I think if that is the goal, each stream effectively needs its own CPU/core
r

Rajan Subramanian

03/22/2022, 3:09 PM
Yep. BTW, thanks a lot for your help @Kevin Kho, it's running on EC2 with no issues for 30 coins.
k

Kevin Kho

03/22/2022, 3:10 PM
wow ok and how many cores is that EC2?
r

Rajan Subramanian

03/22/2022, 3:24 PM
I chose 16 at the moment,
and it's running with no issues for 40 coins.
I wonder why that is the case? My understanding of cores vs. threads is getting blurry. If I have 16 vCPUs and I allocate 1 core to my pipeline, wouldn't that imply I can only deploy a maximum of 16 streams, since each deployment takes up one core?
Or does EC2 or Prefect automatically use the unused threads to run 16 other streams?
k

Kevin Kho

03/22/2022, 3:28 PM
I think this is because the default task submission is actually async, so it can switch between tasks while waiting on I/O. But I wouldn’t know what the ratio of CPUs to concurrent tasks needs to be for your use case.
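As a rough illustration of why I/O-bound streams do not each need their own core (plain asyncio, not Prefect internals): a single thread interleaves many streams because each one spends most of its time waiting on the network.
```
import asyncio


async def fake_stream(symbol: str, ticks: int = 3):
    # Each "stream" mostly waits on I/O; while it waits, the event loop
    # runs the other streams on the same single thread/core.
    for i in range(ticks):
        await asyncio.sleep(0.1)  # stand-in for awaiting a websocket message
        print(f"{symbol}: tick {i}")


async def main():
    symbols = [f"coin{i}" for i in range(40)]
    await asyncio.gather(*(fake_stream(s) for s in symbols))


if __name__ == "__main__":
    asyncio.run(main())
```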
r

Rajan Subramanian

03/22/2022, 3:29 PM
Uhh... I want the deployments to be completely independent of one another but running concurrently,
so if one deployment fails, I should ideally get an email saying it failed and that it's attempting to restart. That way I can log back in and troubleshoot if necessary to get the system back up.
k

Kevin Kho

03/22/2022, 3:30 PM
I think it is independent now, right?
r

Rajan Subramanian

03/22/2022, 3:33 PM
yep independent