hello, i am having issues with installing prefect ...
# ask-community
r
hello, i am having issues with installing Prefect 2.0 on AWS. i have Ubuntu 18.04 running, which comes with an older version of sqlite3. Prefect 2.0 requires a version greater than 3.24. is there any way i can skip installing the sqlite3 dependency when installing Prefect? @Kevin Kho @Anna Geller
k
You can try using
pip install --no-deps
to skip installing dependencies, then install them manually
r
pip install prefect --no-deps?
k
Yep
See this
r
seems i'm still getting this error: Orion requires sqlite >= 3.24.0 but we found version 3.22.0
z
You won’t be able to use Prefect with the old version of SQLite since we use newer features
Unfortunately, it’s really hard to install a new version of sqlite because it’s linked into python
r
oh i see, so i suppose i have to upgrade to Ubuntu 20, which ships with a newer sqlite package?
z
Yeah we test on Ubuntu 20.04
You can also try something like https://charlesleifer.com/blog/compiling-sqlite-for-use-with-python-applications/ but really it’s a pain
If there was a good way to get a newer sqlite installed we’d have a tutorial on it but I haven’t found a reliable method yet 😕
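The rough shape of that approach, for reference (the version and URL below are just examples; check sqlite.org/download for a current tarball):
```
# build a newer SQLite from source (example version)
wget https://www.sqlite.org/2022/sqlite-autoconf-3380000.tar.gz
tar xzf sqlite-autoconf-3380000.tar.gz
cd sqlite-autoconf-3380000
./configure && make && sudo make install
# point the dynamic linker at the new library before starting Python
export LD_LIBRARY_PATH=/usr/local/lib
python3 -c "import sqlite3; print(sqlite3.sqlite_version)"
```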
r
thanks for your reply. i installed Ubuntu 20.04 on my machine, but now i'm getting this error, any idea what to do?
```
Can't locate revision identified by '71a57ec351d1'
```
z
prefect orion database stamp HEAD
then
prefect orion database reset
should fix it; if not you can delete the database file manually from
~/.prefect/orion.db
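All together, with the manual delete as a last resort:
```
prefect orion database stamp HEAD
prefect orion database reset
# last resort: remove the database file entirely
rm ~/.prefect/orion.db
```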
r
oh ok
@Zanie, thank you. the last one, deleting the database file from the .prefect directory, fixed it
@Zanie, when i ssh into an ec2 instance, how do i view the orion dashboard? i did prefect orion start and see a url pop up. is there a way to access it within ec2? or from my local machine?
z
When you start orion, you can expose it to remote connections with
--host 0.0.0.0
Then if you know the EC2 instance's IP you can connect to it at http://<ec2 ip>:4200/
However, I would only recommend doing this if your instance is on a VPC or something
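Concretely (this assumes port 4200 is open in the instance's security group):
```
# on the EC2 instance
prefect orion start --host 0.0.0.0
# then browse from your local machine to:
# http://<ec2-public-ip>:4200/
```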
r
@Zanie, wow awesome thanks it worked
why vpc and not ec2?
sorry, why vpc and not just plain aws?
z
You’ve now exposed your server to the internet, so anyone could send requests to it
r
ahh ok got it
z
Putting your EC2 instance behind a private network would be safer.
r
so when you say anyone can send requests to it, you mean if someone uses the same port and ip they can send requests to it?
the only people who have the ip are me and my boss
z
Malicious actors will often test ports across the entire range of AWS IPs
r
oh ok damn
ok, i will definitely let my team know that we need to migrate it to a vpc
i wanted an mvp running asap. do you think that can be done as a next step once i have an mvp up and running? i.e. set up a vpc next
z
Yeah, you can use an AWS VPC and something like AWS VPN to limit access https://aws.amazon.com/vpn/
r
k thanks i will look into this
security is pretty important for us
z
I would not leave it exposed, personally. Someone could execute arbitrary code by registering and running flows.
r
oh ok, hmm, alright, let me then set up a vpc
@Zanie, when i start the orion server over ssh, i see the dashboard in my local web browser, but when i do prefect deployment create deployment_name, it's not showing any deployments on the page
do i need to adjust any security settings on aws? like allowing an incoming port?
hmm, it's not really updating my deployments on the dashboard after i do prefect deployment create.
k
I think this is more about the UI still being hardcoded to localhost, so if you view the UI on your local machine, it will show the local database
r
oh ok, then how can i view the UI from my local machine with the remote database option?
sorry for all these questions, it's just not clear to me how to deploy prefect via an ssh tunnel and view the dashboard it creates
@Kevin Kho, i am reading the docs and curious: if i'm using the remote postgres database i created on ec2, and i'm firing up the ui from my local machine, is this one way to view the remote database:
```
export PREFECT_ORION_DATABASE_CONNECTION_URL="postgresql+asyncpg://ubuntu:mypassword@ec2-34-????.compute-1.amazonaws.com:5432/orion"
```
?
k
You can't use the UI from a remote machine yet, but you can already interact with the API remotely. The UI portion just hasn't been released, but it will surely be part of the immediate roadmap
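For the API side, something like this should work (a sketch; the exact setting name has shifted during the beta, PREFECT_API_URL in recent builds):
```
# hypothetical values: point your local client at the remote server's API
export PREFECT_API_URL="http://<ec2-public-ip>:4200/api"
prefect deployment ls   # CLI calls now run against the remote API
```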
r
oh ok, then how will i deploy this on an ec2? because what i am doing is: i ssh into my ec2 instance and cd into the directory where i have my deployments. i just want to be able to run them
would i do separate ssh tunnels and run prefect deployment execute my_deployment?
@Kevin Kho, is the new prefect version cloud compatible now?
k
Yes that might be the way for now. Yes the hosted cloud might be better for you and it’s free to start with
r
thank god
i was dying here
lol
k
That experience will definitely be improved though when the localhost is no longer hardcoded and when the remote storage story gets solved
r
yea hopefully, i guess i was running short of time and couldn't figure out how to manage remote deployments from my local machine
but the cloud solves it tremendously
k
I think the Cloud solves the UI issue, but you still need to get your files on the execution environment
r
@Kevin Kho, that's easy, right? i am assuming i set up prefect to use cloud storage on the backend, similar to how it was done in prefect core.
and then i create deployments and they automatically show up in the cloud?
k
Yeah as long as the files are in the same execution environment. The default storage tutorials will be updated as Michael mentioned earlier
r
@Kevin Kho, when i configured cloud in prefect 2.0, this step is creating an error for me: prefect cloud login --key secret_key -w my_company_workspace. i get the notification:
```
Successfully logged in and set workspace to my_company_workspace in profile: default
```
But then i see an error at the end:
```
Traceback (most recent call last):
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/cli/base.py", line 58, in wrapper
    return fn(*args, **kwargs)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/asyncio.py", line 120, in wrapper
    return run_async_in_new_loop(async_fn, *args, **kwargs)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/asyncio.py", line 67, in run_async_in_new_loop
    return anyio.run(partial(__fn, *args, **kwargs))
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_core/_eventloop.py", line 56, in run
    return asynclib.run(func, *args, **backend_options)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 233, in run
    return native_run(wrapper(), debug=debug)
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 228, in wrapper
    return await func(*args)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/cli/cloud.py", line 216, in login
    exit_with_success(
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/cli/base.py", line 193, in exit_with_success
    raise typer.Exit(0)
click.exceptions.Exit: 0
An exception occurred.
```
k
This one I'll need to ask the team about
z
The error can be safely ignored
That’s a bug in exit code handling, there will be a fix out today.
r
oh ok thanks
also, prefect storage ls has a bug too:
```
engine$ prefect storage ls
Traceback (most recent call last):
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/cli/base.py", line 58, in wrapper
    return fn(*args, **kwargs)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/asyncio.py", line 120, in wrapper
    return run_async_in_new_loop(async_fn, *args, **kwargs)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/asyncio.py", line 67, in run_async_in_new_loop
    return anyio.run(partial(__fn, *args, **kwargs))
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_core/_eventloop.py", line 56, in run
    return asynclib.run(func, *args, **backend_options)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 233, in run
    return native_run(wrapper(), debug=debug)
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 228, in wrapper
    return await func(*args)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/cli/storage.py", line 178, in ls
    json_blocks = await client.read_blocks(block_spec_type="STORAGE", as_json=True)
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/client.py", line 937, in read_blocks
    response = await self._client.post(
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/httpx.py", line 137, in post
    return await self.request(
  File "/home/ubuntu/phobos/env/lib/python3.9/site-packages/prefect/utilities/httpx.py", line 47, in request
    request = self.build_request(
TypeError: build_request() got an unexpected keyword argument 'extensions'
An exception occurred.
```
z
Your httpx version is 1.0.0b1
You’ll want to
pip uninstall httpx
and
pip install httpx
(I presume, since we've seen this error with that version before; you could confirm with
pip show httpx
as well)
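i.e.:
```
pip show httpx        # confirm the installed version
pip uninstall -y httpx
pip install httpx
```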
r
oh ok thanks. so since i have the cloud configured, i assume the next step is to create deployments and work-queues
and the cloud api will automatically pick them up?
k
Yes, if you configure your local environment to hit the Cloud endpoint, then the commands will be run against the Cloud
r
awesome thanks, if i run into any issues i will bother you here
@Kevin Kho, i deployed the pipeline on the cloud, but now i'm having an issue where i see late runs in my deployments. i did the following steps, am i missing anything?
```
prefect deployment create binance/orderbook_deployment.py
prefect work-queue create test_queue
prefect agent start '<uuid>'
```
where the uuid is the one returned by the work-queue create step. how do i get these to execute on the cloud?
hmm all my scheduled late deployments suddenly disappeared from the cloud
k
did they run?
r
nope none did
k
will try this myself later or tomorrow
r
geez, the filters get automatically applied
k
what filters did you have?
r
i filtered by my pipeline name
but after 15 minutes the filter disappears and the original 1d filter comes back
that's not a big deal to me, i just want these to run, but yea, will wait for you to check it out
also, i hit the quick run button, hence it shows the flow runs, but they are all late
k
are you saying the filters on the dashboard get automatically applied or filters on the work queue?
r
filters on the dashboard
hmm this is weird
when i click on the work-queue i created
i don't see any deployments listed there
k
Can you try this on the CLI:
```
prefect work-queue preview --hours 12 'acffbcc8-ae65-4c83-a38a-96e2e5e5b441'
```
r
give me a sec, for some reason my remote ec2 instance got frozen
@Kevin Kho
k
Looks like the work queue is working right?
r
@Kevin Kho, can i start a work-queue agent from the cloud ui, rather than from the terminal where i have to ssh into linux?
k
No because Cloud does not host compute for jobs
r
yea it's very weird, when i ssh into a linux terminal and call prefect agent
@Kevin Kho, it freezes my terminal
and i'm forced to reboot
k
But not on local right? Does it only freeze after a flow is deployed?
r
yea not on local
on local it's working fine
it freezes only after i call
```
prefect agent start 'unique_uuid'
```
where unique_uuid is the uuid that's generated when i do
```
prefect work-queue create live_feeds_to_postgres_agent
```
k
Ok will try to see if I can replicate
r
do you want me to create a sample hello world deployment and try that first,
rather than the complicated deployment i have,
and see if that works on the linux side?
a
Always great to build and share a simple example we can reproduce on our end 👍
r
ok, i will make a fake hello world, see if i can replicate the problem, and send you the same file
it is executing but my shell is frozen
then when i retry to log in, it breaks and doesn't allow me to ssh anymore, so i have to reboot the instance
give me 10 minutes, i have to reboot the instance and see if i can replicate the problem
a
We actually prefer asynchronous communication 😄 so please take your time and post one larger message once you have a reproducible example and have calmly investigated the issue in detail
r
hey, ok, so i ran some tests. i created a sample hello-world deployment and it deploys to my prefect cloud with no issues, both locally and from the ssh terminal, so it seems the linux side is working. i then ran the deployment i created, 'redis_to_postgres_deployment'; it successfully deploys and gets scheduled on the linux side. now, the next step: deploying 10 different processes for the ftx exchange. i ran
```
prefect deployment create ftx/order_book_deployments.py
```
and i see 10 deployments created in the UI. when i go to the UI and click run, the ec2 instance suddenly freezes up. i then rebooted and repeated the steps above, but instead of going to the UI to hit run, i ran
```
prefect deployment execute live_feeds_to_redis_pipeline/ftx_L1_avaxusd
```
and it's completely stuck; i can't hit cancel or anything, and it shows up as late in the UI. the same sequence of steps works completely fine on my local machine. the pipeline makes a shell call to a python file; when i go to the ssh terminal, cd to where the python file lives, and run python3 file.py, it streams perfectly in the shell. but the deployment is creating a problem.
k
I think this might be because the DaskTaskRunner is already occupying all of the cores of the machine. Did you configure it? I don’t know if this is the main issue, but you can’t deploy these concurrent flows that all use the DaskTaskRunner because they will then compete for resources.
r
oh
k
LocalDask is like a multiprocessing pool, and then you’ll be spinning up multiple pools that all try to occupy all the threads of the machine
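For reference, configuring it looks roughly like this (a sketch; the import path is the 2.0-beta one and the flow name is hypothetical):
```python
from prefect import flow
from prefect.task_runners import DaskTaskRunner  # import path in the 2.0 betas

# cap the local Dask cluster so concurrent flows don't compete for every core
@flow(task_runner=DaskTaskRunner(cluster_kwargs={"n_workers": 1, "threads_per_worker": 1}))
def my_flow():  # hypothetical flow
    ...
```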
r
oh ok
@Kevin Kho, hmm, so when i checked my mac, i have 10 cores. for the dask cluster i allocated 4 workers and 2 threads per worker, so 8 cores used in total. but on the linux side i have only 1 core.
perhaps that's why
k
That makes sense. There is no other core left to continue deploying flows, or it's running into some deadlock
r
yea lol
jesus
k thanks a lot, hopefully that fixes things for me
will circle back if anything pops up
hey, i am still getting the same error. i changed n_workers to 1 and threads_per_worker to 1, but the ec2 instance still freezes as soon as i run this
k
But I think you still need more than 1 core because the agent is using that single core that you have?
r
so if in ec2, i see cpu=1,
what am i allocating?
k
More CPUs, like 4, just to test
r
and one thread?
initially i was running one deployment in ec2 and it ran fine
it's only when i run the 2nd deployment that it creates issues for me
k
I don’t think you define threads on the VM right? It’s just CPU count?
r
yea pretty much
so what do i pass for parameters?
for threads_per_worker?
yea, i just tested; i'm not able to run more than one flow at a time
in ec2
k
no you have to make the EC2 instance more powerful to support the additional flow runs
r
@Kevin Kho, i'm looking at aws ec2 instance types, is there anything recommended for live trading? how many vCPUs would be good?
k
I don't know how many flows you eventually intend to run concurrently, but the same specs as your local development machine would be a good start?
r
ideally it will be in the 100s
k
Man you might need a really powerful EC2 instance then
r
how are people doing it in practice? running 1000s of flows?
k
Kubernetes and autoscaling so that you can spin up a job
But I think your scrapers are always going to be on?
r
yep
i'm scraping 100 currencies from 10 different exchanges
so 400 live streams of data
sorry, 1000 live streams of data
k
I think if that is the goal, each stream effectively needs its own cpu/core
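Back-of-the-envelope, under that one-core-per-stream assumption:
```
# 1000 streams x 1 core        = ~1000 vCPUs
# 1000 vCPUs / 16 vCPUs per VM = ~63 instances of the 16-core size mentioned below
```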
r
yep. btw, thanks a lot for your help @Kevin Kho, it's running on ec2 with no issues for 30 coins
k
wow ok, and how many cores does that EC2 have?
r
i chose 16 at the moment
and it's running with no issues for 40 coins
i wonder why that is? my understanding of cores vs threads is getting blurry. if i have 16 vCPUs and i allocate 1 core to my pipeline, wouldn't that imply i can deploy a maximum of 16 streams, since each deployment takes up one core?
or does ec2 or prefect automatically use the unused threads to run 16 other streams?
k
I think this is because the default task submission is actually
async
so it can switch between tasks while waiting on other work. But I wouldn't know what ratio of CPUs to concurrent tasks your use case needs
r
uhh... i want the deployments to be completely independent of one another but running concurrently
so if one deployment fails, i should ideally get an email saying it failed and that it's attempting to restart. that way i can log back in and troubleshoot if necessary to get the system back up
k
I think it is independent now right?
r
yep independent