
Frances Sun

12/09/2020, 2:52 PM
Hi! Curious to see if anyone has been able to use Singer and Prefect together?

Will Milner

12/14/2020, 8:25 PM
I just started a project that is using singer and prefect
@Frances Sun I have some working examples of using Singer and Prefect if you're interested in chatting about it

Frances Sun

12/16/2020, 3:28 PM
Thanks Will! I'd love to chat more about your usage in a few weeks. We're trying to meet some year-end deadlines this month, so we're holding off on any new integrations until 2021. Until then, what taps & targets have you used? I've only tested a klaviyo tap with a jsonl & csv target

Will Milner

12/16/2020, 3:32 PM
currently we're using the Salesforce tap and a custom tap we developed, both going to a snowflake target
i've also used a redshift target in the past as well

Frances Sun

12/16/2020, 3:35 PM
Curious: I'm guessing the taps return json data. What mappings were necessary to push that data into redshift/snowflake, if any?

Will Milner

12/16/2020, 4:46 PM
nothing, the singer spec is able to handle the json output. The targets handle the json output and convert it as necessary
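To illustrate why no mapping step is needed: under the Singer spec a tap writes newline-delimited JSON messages of type SCHEMA, RECORD, and STATE to stdout, and the target reads them from stdin. Below is a minimal sketch of the reading side, not how target-snowflake or target-redshift is actually implemented. The function name and the "accounts" stream with its fields are made up for the example.

```python
import json
from typing import Dict, Iterable, List, Optional, Tuple


def consume_singer_messages(
    lines: Iterable[str],
) -> Tuple[Dict[str, dict], Dict[str, List[dict]], Optional[dict]]:
    """Toy stand-in for a target: group Singer messages by type and stream."""
    schemas: Dict[str, dict] = {}        # stream name -> JSON schema
    records: Dict[str, List[dict]] = {}  # stream name -> list of rows
    state: Optional[dict] = None         # most recent STATE value seen
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        kind = message["type"]
        if kind == "SCHEMA":
            schemas[message["stream"]] = message["schema"]
        elif kind == "RECORD":
            records.setdefault(message["stream"], []).append(message["record"])
        elif kind == "STATE":
            state = message["value"]
    return schemas, records, state


# Hypothetical tap output for an invented "accounts" stream.
sample_output = [
    '{"type": "SCHEMA", "stream": "accounts", "key_properties": ["id"], '
    '"schema": {"properties": {"id": {"type": "string"}}}}',
    '{"type": "RECORD", "stream": "accounts", "record": {"id": "001"}}',
    '{"type": "RECORD", "stream": "accounts", "record": {"id": "002"}}',
    '{"type": "STATE", "value": {"accounts": "002"}}',
]
schemas, records, state = consume_singer_messages(sample_output)
```

A real target uses the SCHEMA message to create or alter the destination table and then loads the RECORD rows, which is why the JSON needs no hand-written mapping.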

Sawyer Waugh

01/16/2021, 10:31 PM
@Will Milner Hi Will. Would you be so kind as to pass along the aforementioned examples of Singer+Prefect? I have written several over the past 2 years, and our pipeline has grown well past the point where cron suffices for orchestration
Did you have any luck with this combination, @Frances Sun? Is the thought to make a generic tap task? Or something that is specific to a given tap…or to just use either the python or bash tasks?

Will Milner

01/19/2021, 6:21 PM
sorry for the late reply. Since all singer taps/targets can be invoked via bash, our approach is to install them into the docker image the flow runs in and invoke them via a Shell task
the only odd part is that we set up a virtual environment for each tap and target within the docker image, which is a bit unorthodox for docker but is best practice with singer. An example task invocation looks a bit like this
from prefect.tasks.shell import ShellTask

task_shell = ShellTask()
# called inside a `with Flow(...)` block so the task is added to the flow
salesforce = task_shell(command="/salesforce_tap_env/bin/tap-salesforce --config /config/salesforce_config.json --properties /config/salesforce_props.json --state /config/salesforce_state.json |"
                                " /snowflake_target_env/bin/target-snowflake --config /config/snowflake_salesforce_config.json",
                        task_args={"name": "salesforce"})
you could also do them as python tasks as you suggested, but I find it easier to call them via shell. All of our taps and targets are their own standalone packages that we deploy to our own internal pypi server, from which they can be installed into our docker images
I have run into some dependency conflicts between prefect and some singer taps, so sometimes I need to install a specific version of a dependency. Here is a snippet from the docker file for our flows, which installs https://github.com/singer-io/tap-salesforce
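For anyone who wants the python-task route instead, here is a rough sketch of the same tap-to-target pipe using only the standard library's subprocess module. It is not what we run in production. The function name is made up, and wrapping it in a Prefect task is left out so the snippet runs without prefect installed.

```python
import subprocess
from typing import Sequence


def run_singer_pipeline(tap_cmd: Sequence[str], target_cmd: Sequence[str]) -> str:
    """Pipe a tap's stdout into a target's stdin and return the target's stdout."""
    tap = subprocess.Popen(tap_cmd, stdout=subprocess.PIPE)
    target = subprocess.Popen(
        target_cmd, stdin=tap.stdout, stdout=subprocess.PIPE, text=True
    )
    # Close our copy of the pipe so the tap sees a broken pipe
    # if the target exits early, instead of blocking forever.
    tap.stdout.close()
    target_output, _ = target.communicate()
    tap_returncode = tap.wait()
    if tap_returncode != 0:
        raise RuntimeError(f"tap exited with code {tap_returncode}")
    if target.returncode != 0:
        raise RuntimeError(f"target exited with code {target.returncode}")
    return target_output


# Example call using the paths from the shell command above (not executed here):
# run_singer_pipeline(
#     ["/salesforce_tap_env/bin/tap-salesforce",
#      "--config", "/config/salesforce_config.json",
#      "--properties", "/config/salesforce_props.json",
#      "--state", "/config/salesforce_state.json"],
#     ["/snowflake_target_env/bin/target-snowflake",
#      "--config", "/config/snowflake_salesforce_config.json"],
# )
```

Unlike a plain shell pipe, this checks both exit codes, so a failing tap is not hidden by a target that exits cleanly. Singer targets conventionally write state to stdout, so the returned string is where you would look for the latest state, assuming the target follows that convention.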
FROM python:3.7


COPY flows/config /config



# prefect tries to install a different version of requests that is not compatible with tap-salesforce
RUN pip install requests==2.23.0

# install singer tap-salesforce
RUN python -m venv salesforce_tap_env
RUN /bin/bash -c "source salesforce_tap_env/bin/activate && pip install tap-salesforce && deactivate"