Prefect Community #introductions

Elliot Winard

10/11/2024, 8:24 PM

Hello! I'm Elliot - technical founder of a pre-seed startup looking for product/market fit, experienced software engineer and engineering manager, teller of dad jokes. I worked at Spotify for a bunch of years and my team there used our home-grown feature flagging and analytics systems. At my last company we used LaunchDarkly. I'm excited about getting to know Prefect now.

👋 9

🚀 6

🙌 7

Alexander Azzam

10/11/2024, 8:28 PM

What's up Elliot! Have been in your boat before, welcome to the community!

Ravi Tharisayi

10/11/2024, 8:31 PM

Hey there Elliot - Welcome! I recently worked at LaunchDarkly so can relate to joining the Prefect train. Thanks for trying us out!

Elliot Winard

10/11/2024, 8:38 PM

Oh sheesh. I'm adding feature flagging and workflow orchestration and it's Friday end-of-day and wires got crossed in my head.

Elliot Winard

10/11/2024, 8:39 PM

We went from Luigi to Flyte and Airflow on different teams.

👍 1

Elliot Winard

10/11/2024, 8:41 PM

I'm looking at flipt / OpenFeature for feature flagging.

Jeff Hale

10/11/2024, 8:42 PM

Welcome to the group, Elliot! Happy to have you here no matter what your tech needs.🎉

Ravi Tharisayi

10/11/2024, 8:43 PM

haha that's great.. I was literally saying to someone "Prefect for feature flagging?" 😅. Welcome.

Elliot Winard

10/11/2024, 8:43 PM

Yeah, right?

Elliot Winard

10/11/2024, 8:52 PM

I have a pipeline that went from a "try it in a notebook and with individual python scripts" to "one big script to do all the things" and now I'm breaking it into smaller pieces and looking to use Prefect to manage the pipeline through subflows. The different stages are used to take data, refine it with LLMs and content from web searches, insert into databases. Am I thinking about Prefect in the right way?

👍 3

yess 3

Alexander Azzam

10/11/2024, 8:53 PM

Yep, that's exactly the right way

Elliot Winard

10/11/2024, 8:54 PM

Sweet. Thx thx.

Elliot Winard

10/11/2024, 8:57 PM

Any suggestions for "free" web searches that I can use as an alternative to SerpAPI?

Alexander Azzam

10/11/2024, 8:58 PM

free search engine lookups or are you trying to scrape a particular type of the world

Alexander Azzam

10/11/2024, 8:58 PM

are you trying to go "English -> search results -> snag those pages -> extract/summarize" or something?

Elliot Winard

10/11/2024, 9:00 PM

more like "<a thing some business did> press release filetype:pdf". I have a lot of those things and am looking for "real data" to back that up.
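
A small sketch of generating queries in that shape from a list of events, using only the standard library. The helper names and the example events are hypothetical; no particular search provider or endpoint is assumed.

```python
from urllib.parse import quote_plus


def build_query(event: str, filetype: str = "pdf") -> str:
    """Build a query of the form '<event> press release filetype:pdf'."""
    return f"{event.strip()} press release filetype:{filetype}"


def encode_query(query: str) -> str:
    """URL-encode a query so it can be used as a query-string value."""
    return quote_plus(query)


# Hypothetical events; in practice these would come from the pipeline's input data.
events = ["Acme opens Austin office", "Globex acquires Initech"]
queries = [build_query(event) for event in events]
```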

Elliot Winard

10/11/2024, 9:04 PM

The work pool feature also seems friendlier than "write some celery thing and a bunch of cronjobs"

Alexander Azzam

10/11/2024, 9:05 PM

yeah dude it's sick. I ran a pretty large scale scraping operation at my last startup and getting from celery hell into Prefect was 🤌 (also why I later joined, haha)

Alexander Azzam

10/11/2024, 9:06 PM

for "real data" to back that up.

so Bing has an RSS feed for its news that's "free" (in the sense that it's against their TOS but they don't care until you're big, and then you can just pay folks for it legitly)

Elliot Winard

10/11/2024, 9:06 PM

A question.... because we're an early-stage startup, I am the data team and the web team and the guy who changes the batteries on my co-founder's boyfriend's TV remote control. Can you help me articulate why it makes sense to use a hosted (or digitalocean-hosted...) Prefect instance?

Elliot Winard

10/11/2024, 9:09 PM

Hah about the Bing thing. Hopefully Microsoft acquires us before we get big enough for them to complain about that. Are you talking about a RSS news feed?

Alexander Azzam

10/11/2024, 9:09 PM

use a hosted Prefect instance

like vs free tier on prefect cloud

Alexander Azzam

10/11/2024, 9:10 PM

about a RSS news feed?

yeah, but tl;dr you can just hit bing.com/somethingIforget/?topic=what+you+need and it gives you like the top 10 things that relate to your query

🙌 1

Elliot Winard

10/11/2024, 9:10 PM

yeah, i mean "prefect cloud" or "a prefect server instance I am running on digital ocean"

Alexander Azzam

10/11/2024, 9:15 PM

hosting your own means you can build indexes on queries you need to tune, which makes sense when you're really sweating the thing. It also fits if you have security / compliance stuff where all the data needs to live in your VPC, etc. Prefect Cloud is also multi-tenant, which means you're in some sense sharing resources with other folks. That isn't really an issue, but I think that's the usual reason why folks roll their own. I ran Prefect OSS on RDS in AWS for a bit and it was fine, but I ran out of AWS credits and then realized I could just run my startup on Prefect Cloud free tier, and it was fine.

Alexander Azzam

10/11/2024, 9:16 PM

right now OSS doesn't have "push work pools" which is a fancy phrase for "serverless / autoscaling workloads". So if you hate celery and having to host workers all the time, that's maybe an argument for the cloud life.

👌 1

Elliot Winard

10/11/2024, 9:33 PM

We're hand-waving about compliance right now. It will come, but not yet.

Elliot Winard

10/11/2024, 10:19 PM

Is there any way to adjust the frequency of running of scheduled deployments in the web interface?

Alexander Azzam

10/11/2024, 10:20 PM

image.png

Elliot Winard

10/11/2024, 10:23 PM

thank you

Alexander Azzam

10/11/2024, 10:23 PM

btw as you're going through this if anything feels rough around the edges lmk

Alexander Azzam

10/11/2024, 10:23 PM

thinking a lot about the UI recently

Elliot Winard

10/11/2024, 10:29 PM

I got a little worried that this thing is running running running because I didn't realize that the 2 viz on the left are just based on the runs, not on time. And the one on the right too, I guess.

Elliot Winard

10/11/2024, 10:31 PM

and this kinda made me think that this was running because of the "every minute every day". I tried to turn it off there, but can't. I think it's not running because it's "not ready".

Alexander Azzam

10/11/2024, 10:31 PM

woof, yeah I can see how that's confusing

Elliot Winard

10/11/2024, 10:37 PM

Can you point me at any example "best practices" on github for structuring LLM+search+data pipelines?

Elliot Winard

10/11/2024, 10:38 PM

Is "AskMarvin" you guys? https://www.askmarvin.ai/

Alexander Azzam

10/11/2024, 10:38 PM

Sure is!

Alexander Azzam

10/11/2024, 10:39 PM

on my phone so a little curt, but tl;dr wrap every HTTP call in a task so you get retries / caching / idempotency for each atomic piece that can fail. You also get .map and TaskRunners, which let you do massively parallelized operations.

Alexander Azzam

10/11/2024, 10:39 PM

so openai calls, give them their own task; scraping a single page, give that its own task

Alexander Azzam

10/11/2024, 10:40 PM

my usual suggestion is think through where you expect the most failure, and that's usually where I start encapsulating them in a task

Alexander Azzam

10/11/2024, 10:40 PM

sorry if this is a little generic 😅

Elliot Winard

10/11/2024, 10:41 PM

haha. all good.

Elliot Winard

10/11/2024, 10:44 PM

I was just thinking about using Prefect for the pipelines. It feels like it might be useful in "retrieval" part of our app too. But maybe not.

Elliot Winard

10/11/2024, 10:45 PM

I'm using the retries in Instructor now. Not making any "direct" calls to openai. https://python.useinstructor.com/

Alexander Azzam

10/11/2024, 10:46 PM

yeah those retries are local or "client"-side.

Alexander Azzam

10/11/2024, 10:47 PM

which means like "if your machine dies while you're on the 3rd retry you're f'd"

Alexander Azzam

10/11/2024, 10:47 PM

might be a corner case a small scale but as you start beefing this up you want the "state" of your retry stored somewhere else

Alexander Azzam

10/11/2024, 10:47 PM

corner case at* small scale
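
To illustrate the point about keeping retry state outside the process, here is a standard-library sketch that records attempt counts in a JSON file, so a restarted process picks up the remaining retry budget instead of starting over. `RetryLedger` and `run_with_ledger` are hypothetical helpers for illustration only; this is not how Prefect stores state internally.

```python
import json
import tempfile
from pathlib import Path


class RetryLedger:
    """Persist per-job attempt counts in a JSON file (hypothetical helper)."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def _load(self) -> dict[str, int]:
        if self.path.exists():
            return json.loads(self.path.read_text())
        return {}

    def attempts(self, job_id: str) -> int:
        return self._load().get(job_id, 0)

    def record_attempt(self, job_id: str) -> int:
        state = self._load()
        state[job_id] = state.get(job_id, 0) + 1
        self.path.write_text(json.dumps(state))
        return state[job_id]


def run_with_ledger(job_id, fn, ledger, max_attempts=3):
    """Call fn until it succeeds or the persisted attempt budget is spent."""
    while ledger.attempts(job_id) < max_attempts:
        # Record the attempt before running, so a crash mid-call still counts.
        ledger.record_attempt(job_id)
        try:
            return fn()
        except Exception:
            continue
    raise RuntimeError(f"{job_id}: retry budget exhausted")


ledger_path = Path(tempfile.mkdtemp()) / "retries.json"
first = RetryLedger(ledger_path)
first.record_attempt("summarize-42")
first.record_attempt("summarize-42")

# A fresh ledger object stands in for a restarted process reading the same file.
restarted = RetryLedger(ledger_path)
```

With purely in-memory retries, the restarted process would begin again at attempt one; here it sees the two attempts already made.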

Elliot Winard

10/11/2024, 10:49 PM

you're talking about data pipelines, right?

Elliot Winard

10/11/2024, 10:53 PM

hrm. thinking... we've got web code calling an API which does the agentic and RAG and LLM-calling stuff. So the "client" I am thinking about here is the API in my app.

Elliot Winard

10/11/2024, 10:54 PM

thx for talking through this stuff. I need to go feed my kids.

Alexander Azzam

10/11/2024, 10:55 PM

🙇

Yaron Levi

10/15/2024, 8:13 AM

@Elliot Winard Are you sure you need Sub Flows? With Prefect 3.0 you should be able to define Tasks within Tasks. Creating a new sub flow might be too much, and you lose context.

Elliot Winard

10/15/2024, 11:59 AM

Thx. Not sure I need Sub Flows. I'm just getting my head around the tool.

👍 1
