
Prefect Community

Hi, I'm a Prefect noob, so please bear with me. I seem to be having a strange problem decoding byte strings. I am using Prefect to query my database, which holds text in RTF format (this is an old database!) stored as latin-1 encoded byte strings. The query is performed with `pd.read_sql()`, so the text ends up in a column of a pandas dataframe. I then decode the strings and use `striprtf` to convert them to plain text. I have never had a problem with this step in Jupyter notebooks or on multiple machines, but when I run it in Prefect, I get a `UnicodeDecodeError` for a portion of the text despite using `text.decode(encoding='latin-1', errors='replace')`. I've tried using `chardet` but have had no luck. Thanks in advance for the help.
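For anyone reading along, here is a minimal sketch of the decode step using only the standard library. `decode_latin1` is a hypothetical helper name, not something from Prefect or pandas. Latin-1 maps every byte value from 0 to 255 to a code point, so decoding a `bytes` object this way should not be able to raise `UnicodeDecodeError` by itself, which suggests looking at the steps around it.

```python
def decode_latin1(value):
    """Decode a bytes-like value as latin-1; return str values unchanged.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, (bytes, bytearray, memoryview)):
        return bytes(value).decode("latin-1", errors="replace")
    return value


# Every possible byte value decodes without raising.
all_bytes = bytes(range(256))
decoded = decode_latin1(all_bytes)
print(len(decoded))  # one character per input byte
```

If the column comes back from the database driver as `str` rather than `bytes` in one environment, the helper passes it through instead of failing on a missing `.decode` method.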

So that no one wastes time: this turned out to be a problem with the conversion from RTF to text (`striprtf`), not with Prefect!
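In case it helps someone with a similar symptom, this is a rough sketch of how the failing step can be isolated row by row. `find_failures` and `stand_in_convert` are hypothetical names; the stand-in converter is NOT how `striprtf` works, it only simulates a conversion step that chokes on non-ASCII text. With the real library you would presumably pass its `rtf_to_text` function as `convert` instead.

```python
def find_failures(values, convert):
    """Return (index, step, exception name) for every value that fails.

    The decode and convert steps are tried separately so the
    traceback can be attributed to the right one.
    """
    failures = []
    for i, raw in enumerate(values):
        try:
            text = raw.decode("latin-1") if isinstance(raw, bytes) else raw
        except UnicodeError as exc:
            failures.append((i, "decode", type(exc).__name__))
            continue
        try:
            convert(text)
        except Exception as exc:  # broad on purpose: we only want to locate the row
            failures.append((i, "convert", type(exc).__name__))
    return failures


def stand_in_convert(text):
    # Hypothetical converter, not striprtf: re-decoding as UTF-8
    # fails on latin-1 bytes such as 0xE9.
    return text.encode("latin-1").decode("utf-8")


rows = [b"plain text", b"caf\xe9"]
failures = find_failures(rows, stand_in_convert)
print(failures)
```

Running this against the dataframe column (for example `df["col"].tolist()`, where `col` is a placeholder name) shows whether the error comes from the decode or from the conversion, and for which rows.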