# ask-community
g
Hi all, we are currently facing a problem where a flow run in Prefect Cloud takes significantly more time (a factor of 20-25) than the same flow run in Prefect Server. The flow retrieves data from a MySQL database and inserts it into a SQL Server database. When running the flow in Prefect Server, with Docker storage and a Docker agent, I get around 1000 rows/sec inserted into the SQL Server db. In Prefect Cloud, with Docker storage and a Kubernetes agent, I get around 40 rows/sec. Any ideas on the cause of this performance issue?
m
Hi. Could you tell us which execution layer you use in each setup, and how you count rows per second? Is it the total flow time or just the SQL part?
k
Hi @g.suijker! This sounds odd. Are Cloud and Server running the same Agent? Where does the SQL Server live? Are you using mapping for insertions?
g
@Marko Herkaliuk, the timing is on the SQL part that inserts the rows. @Kevin Kho, for Cloud we use a Kubernetes Agent and for Server a Docker Agent. I think the SQL Server is running on a remote server. What do you mean by mapping for insertions? I do not use a mapping task, if that's what you mean.
m
Maybe it's network delays? Where is the database deployed, and where exactly is the execution layer in each setup?
g
The weird thing is that, according to my DevOps colleagues, the Kubernetes agent runs closer to the remote server than the Docker agent running on my laptop. So the network delay should be less.
k
Yes, that is what I meant by map. I was thinking it could be something to do with concurrent connections. It's really hard to pinpoint the issue here unless you run a Docker agent with Cloud in the same place you ran your Kubernetes agent; that would allow us to compare.
m
Can you look at your database monitoring during the test queries? Maybe there was a different load on the database at that time, or other queries against the table the data was being inserted into.
g
What also struck me was that after we switched to the new payment plan with unlimited concurrency, the performance on Cloud for this specific flow deteriorated (running times roughly doubled). I do not understand how that could happen, since it is just one flow without mapping or dependent tasks.
@Marko Herkaliuk, will ask the dba to monitor while running the queries.
k
That is very puzzling. The Cloud subscription shouldn't affect your flow runtime (since Prefect doesn't run the code). Was your flow completing in a consistent time before?
d
> When running the flow in Prefect Server, with Docker storage and a Docker agent, I get around 1000 rows/sec inserted into the SQL Server db. In Prefect Cloud, with Docker storage and a Kubernetes agent, I get around 40 rows/sec.
Hi @g.suijker!
It sounds like your Kubernetes cluster and Docker container may have different resource constraints for execution. Would you mind sharing your Flow's config for both cases (the Run Config, Results, Executor, etc.)?
> What also struck me was that after we switched to the new payment plan with unlimited concurrency, the performance on Cloud for this specific flow deteriorated (running times roughly doubled). I do not understand how that could happen, since it is just one flow without mapping or dependent tasks.
Depending on where/when your Flow Runs are being executed in K8s, the increased concurrency may have put additional resource strain on your cluster. Your Flow’s config for the docker vs. k8s setups should help us debug 😄
g
I'm not sure what you mean exactly, but I do not set anything other than the label within the run config.
For Prefect Server:

```python
storage = Docker(
    python_dependencies=["pyodbc", "pandas", "PyMySQL"],
    dockerfile="dockerfile",
)

run_config = DockerRun(labels=["kube-office"])

with Flow(f"{flow_name}", storage=storage, run_config=run_config) as flow:
    ...
```

For Prefect Cloud:

```python
docker_storage = Docker(
    registry_url="xxx",
    image_name=f"prefect/{flow_name}",
    image_tag="latest",
    python_dependencies=["pyodbc", "pandas", "PyMySQL"],
    dockerfile="dockerfile",
)

run_config = KubernetesRun(labels=["kube-office"])

with Flow(f"{flow_name}", storage=docker_storage, run_config=run_config) as flow:
    ...
```

Using the same dockerfile for both.
d
This makes sense, then! The default resource allocations (memory, CPU, etc.) can be very different for Docker and Kubernetes. In your Kubernetes run config, try specifying `cpu_limit`, `cpu_request`, `memory_limit`, and `memory_request` to make sure the Kubernetes job that the Flow Run executes in has the proper resources available. If you find that specifying resources doesn't solve your issue, please let us know, and feel free to follow up with any questions you have 😄
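For example, a minimal sketch (the resource values here are purely illustrative, not recommendations):

```python
from prefect.run_configs import KubernetesRun

# Illustrative resource values -- tune these to your workload
run_config = KubernetesRun(
    labels=["kube-office"],
    cpu_request="500m",
    cpu_limit="1000m",
    memory_request="512Mi",
    memory_limit="2Gi",
)
```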
g
Thanks for the help all, will give it a shot.
d
👍
g
Hi guys, we tried a couple of different resource allocations, but none of them resulted in a significant performance increase. We ran some local tests on the same flow:
1. Running the flow with flow.run(), inserting one chunk of data into MSSQL took on average 300-400 ms.
2. Running the flow on Prefect Server, inserting one chunk of data into MSSQL took on average 2800-3000 ms.
Where is this difference coming from?
d
Hi @g.suijker have you configured any checkpointing or results for your tasks?
g
Hi @Dylan, I have set checkpointing to False on all tasks and did not specify any Result subclass.
z
Are you measuring actual insertion time or are you measuring time per task?
And are you running one insertion per task, or many insertions in a single task?
g
I'm measuring the actual insertion time and doing many insertions within a single task (chunking the full dataset and inserting each chunk)
z
I see. In your message above you noted that the insertion time was per-chunk. We'd expect a couple of seconds of spin-up/down time for tasks reporting their states to the API. Are you still running one test on Kubernetes and one on Docker? It's going to be very hard to pin down the difference if you're running on entirely different architectures.
g
I understand, that's what's making it difficult to track down. In our latest tests the insertion time per chunk was measured on my local machine: the test with flow.run() and the test with LocalRun & Local storage via Prefect Server gave the same performance, while the test with DockerRun & Docker storage was around 10x slower. I tried changing the Docker resources, but that didn't seem to make a difference.
We finally found the issue. The bad performance was due to closing the database connection after each insert query (this flow runs a lot of insert queries within a loop). On my Windows laptop closing the connection did not lower the insert speed, but within a Docker container it did. So running the flow with flow.run() showed no performance issue, but running it via Cloud or Server with Docker storage and a Docker/Kubernetes Agent did. The query execution function first looked like this:
```python
def sqlServerExecute(query, params=None):
    """Execute query with values provided to SQL Server."""
    dwh_connection_string = connection_string

    # A new connection is opened for every call and closed in the finally block
    connection = pyodbc.connect(dwh_connection_string)

    try:
        cursor = connection.cursor()
        cursor.fast_executemany = True
        if params:
            cursor.executemany(query, params)
        else:
            cursor.execute(query)
        connection.commit()

    except (Exception, pyodbc.DatabaseError):
        connection.rollback()
        raise

    finally:
        cursor.close()
        connection.close()
```
We changed it to the following, where the connection is created once before the insert loop and closed after it:
```python
def sqlServerExecute(query, connection, params=None):
    """Execute query with values provided to SQL Server."""
    try:
        cursor = connection.cursor()
        cursor.fast_executemany = True
        if params:
            cursor.executemany(query, params)
        else:
            cursor.execute(query)
        connection.commit()
    except (Exception, pyodbc.DatabaseError):
        connection.rollback()
        raise
    finally:
        # Only the cursor is closed; the caller owns the connection
        cursor.close()
```
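For illustration, the calling pattern then becomes something like the sketch below (`chunks`, `insert_query`, and `connection_string` are placeholder names, not taken from the actual flow):

```python
import pyodbc

# Open one connection, reuse it for every chunk, close it once at the end
connection = pyodbc.connect(connection_string)
try:
    for chunk in chunks:
        sqlServerExecute(insert_query, connection, params=chunk)
finally:
    connection.close()
```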
This resulted in equal insert performance locally (flow.run()) and on Cloud with Docker storage and a Kubernetes Agent.
When I look at the source code for the SqlServerExecute task (https://github.com/PrefectHQ/prefect/blob/5de58efaba956b431335d99acab07eaf6a362e1b/src/prefect/tasks/sql_server/sql_server.py#L103), the connection is closed after each query as well, so I guess more people might run into the same issue.
To be more precise, the issue appeared because connection pooling is available on my Windows machine but not within the Docker container. Any ideas on how I can enable connection pooling within the Docker container?
z
Very interesting! Thanks for reporting back @g.suijker -- I'm not sure how to enable connection pooling. Perhaps https://github.com/mkleehammer/pyodbc/issues/774 will be relevant?
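For anyone hitting the same wall: on Linux containers the ODBC driver manager is typically unixODBC, which supports connection pooling via its odbcinst.ini. A minimal sketch, assuming the Microsoft ODBC Driver 17 for SQL Server is installed (adjust the section name to whatever driver your image actually registers):

```ini
; /etc/odbcinst.ini
[ODBC]
Pooling = Yes

[ODBC Driver 17 for SQL Server]
; keep your image's existing Description/Driver lines here
CPTimeout = 120
```

Note also that `pyodbc.pooling` defaults to True and can only be changed before the first connection is opened, so the driver-manager config above is usually the piece that's missing inside a container.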