Prefect Community

Hey all, I was wondering if anyone has run into the error `Segmentation fault (core dumped)` while trying to run a Prefect flow (version 1) that uses the pandas package. I have run this with different versions of pandas (1.5.1 and 1.3.5) and still get the same error. I am able to import pandas and run a command such as `pd.show_versions()` without issues, but whenever I try to run anything that would create a dataframe I get the segfault.

My currently installed versions and the flow I am trying to run are below. Thanks in advance!

```
INSTALLED VERSIONS
------------------
commit : 66e3805b8cabe977f40c05259cc3fcf7ead5687d
python : 3.8.10.final.0
python-bits : 64
OS : Linux
OS-release : 4.14.294-220.533.amzn2.x86_64
Version : #1 SMP Thu Sep 29 01:01:23 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : None
LOCALE : en_US.UTF-8
pandas : 1.3.5
numpy : 1.23.4
pytz : 2022.6
dateutil : 2.8.2
pip : 20.0.2
setuptools : 45.2.0
Cython : None
pytest : 7.2.0
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.1
html5lib : None
pymysql : None
psycopg2 : 2.9.5 (dt dec pq3 ext lo64)
jinja2 : 3.1.2
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : 2022.11.0
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 10.0.0
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : 0.9.0
xarray : None
xlrd : None
xlwt : None
numba : None
```

```
import os
import prefect
import sys
import pandas as pd
from prefect import task, Flow
from prefect.storage import S3
from prefect.run_configs import ECSRun
from prefect.run_configs import LocalRun

# create logger
logger = prefect.utilities.logging.get_logger()

@task
def say_hello():
    print(pd.show_versions())
    data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
    df = pd.DataFrame.from_dict(data)
    logger.info(df.head())
    logger.info("Hello, Cloud!")



flow = Flow("jrose_flow", tasks=[say_hello])

kwargs = {}
kwargs["cluster"] = f"arn:aws:ecs:us-west-2:XXXXXXXXXXX:cluster/prefect-agent-dev"

flow.run_config = ECSRun(task_role_arn="arn:aws:iam::XXXXXXXXXXX:role/prefect-dev-rpt-services-role",
                         execution_role_arn="arn:aws:iam::XXXXXXXXXXX:role/prefect-dev-rpt-services-role",
                         task_definition_arn="arn:aws:ecs:us-west-2:XXXXXXXXXXX:task-definition/rn-rpt-services-dev",
                         run_task_kwargs = kwargs)

flow.storage = S3(bucket="bucket_name", key="prefecttest/rpt/jrose_flow.py",
                  stored_as_script=False)
```

Hi Jeff! I had to do a bit of poking around the internet to see where this error could come from. It didn't strike me as a Prefect-related error at first. So far, what I've been able to gather is that a `Segmentation fault` occurs when a process tries to access memory that it is not allowed to access, or memory that doesn't exist.
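One way to check whether the crash comes from pandas itself rather than from Prefect might be to run the same DataFrame creation in a fresh Python process and look at how it exits. This is only a minimal sketch: `REPRO` and `describe_exit` are hypothetical names made up for illustration, and it assumes a POSIX system (like the Linux container above), where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys

# Hypothetical reproduction snippet: the same DataFrame creation as in the
# flow's task, but without any Prefect code involved.
REPRO = (
    "import pandas as pd\n"
    "pd.DataFrame.from_dict({'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']})\n"
)


def describe_exit(code_snippet):
    """Run code_snippet in a fresh interpreter and describe how it exited."""
    result = subprocess.run([sys.executable, "-c", code_snippet])
    if result.returncode < 0:
        # On POSIX, a negative return code is the number of the signal
        # that terminated the child process (SIGSEGV for a segfault).
        return "killed by " + signal.Signals(-result.returncode).name
    return "exited with code " + str(result.returncode)
```

Calling `print(describe_exit(REPRO))` in the same environment as the flow should report `killed by SIGSEGV` if pandas alone reproduces the crash; a normal exit would point back toward something in the flow setup.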

I was reading through this [webpage](https://www.javatpoint.com/segmentation-fault-core-dumped-ubuntu); maybe there is something there that could help?
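If it helps narrow things down, Python's standard-library `faulthandler` module can dump the Python-level traceback when the interpreter receives a fatal signal such as SIGSEGV. The sketch below is an assumption about how one might wire it up, not something from this thread: `enable_crash_traceback` and the log path are hypothetical.

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Dump the Python traceback to log_path if the process hits a fatal signal."""
    # The file must stay open for as long as faulthandler is enabled,
    # so the handle is returned to the caller instead of being closed here.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling something like `enable_crash_traceback("/tmp/flow_crash.log")` at the top of the flow script, before the task runs, would record which Python line was executing when the segfault happened. Setting the `PYTHONFAULTHANDLER=1` environment variable or starting the interpreter with `python -X faulthandler` enables the same handler with output going to stderr.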

Thanks! I think this might be due to the pandas version I have been using.

It appears to be working when I switch to a different version.

Sweet! Thanks for sharing that here. If someone else runs into the same issue they'll certainly appreciate this thread.

Just an FYI, I switched from pandas 1.5.1 -> 1.3.5 and that fixed it (while using Python 3.8.10).