@Marvin My flow is failing with 'Value error,...
# ask-marvin
j
@Marvin My flow is failing with 'Value error, Flow run parameters must be less than 512KB when serialized.', I have lots of questions - 1. Is that limit configurable at all? 2. How do I get the actual size of my serialized parameters for a given flow run? 3. On prefect UI, it just says "flow run failed" for my subflow and I had to instrument my worker to get the above error. Are the detailed errors supposed to be logged somewhere? If so, where do I look?
m
thought for 86.5 seconds
You can use `prefect flow-run logs <flow-run-id>` to see detailed logs for any flow run, including subflows.

b) In Your Code: To ensure errors are properly logged, use Prefect's built-in logging:
```python
from prefect import flow, get_run_logger

@flow
def your_flow():
    logger = get_run_logger()
    try:
        ...  # your flow code
    except Exception as e:
        logger.error(f"Flow failed with error: {str(e)}")
        raise
```
c) Worker Logs: Since you mentioned instrumenting your worker, you're on the right track. Worker logs can be valuable for debugging deployment-related issues. Make sure your worker's log level is set appropriately (INFO, or DEBUG for more detail).

Recommendations for your case:
1. Consider breaking large parameters into smaller chunks, or use alternative methods such as:
   - Storing large data in a block and passing the block ID instead
   - Using intermediate storage (S3, local filesystem, etc.) and passing references
   - Using Prefect's artifact system for large data transfers between flows
2. For better error visibility, make sure you're using proper logging in your flows and subflows via `get_run_logger()`.
3. If you need to debug specific flow runs, use `prefect flow-run logs <flow-run-id>` to get detailed logs.

Would you like me to elaborate on any of these points or provide more specific examples for your use case?
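On question 2, a rough way to estimate the serialized size yourself, assuming the parameters are JSON-serializable (Prefect's own serialization may not match this byte-for-byte, but it gives the order of magnitude relative to the 512KB limit):

```python
import json

# Example parameters; substitute the real arguments you pass to the flow.
params = {"customer_id": 123, "rows": list(range(10_000))}

# Rough estimate of the serialized size in bytes.
size_bytes = len(json.dumps(params, default=str).encode("utf-8"))
print(f"~{size_bytes} bytes ({size_bytes / 1024:.1f} KB)")
```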
j
@Marvin The logger also failed: ValueError: Log of size 3497160 is greater than the max size of 1000000
m
thought for 33.3 seconds
k
Worker logs are reachable in the UI too! go to your work pool, then workers, then click on the worker name to see its logs
on cloud anyway
m
Based on the research, I can provide a comprehensive answer about handling both your parameter and log size limits:

1. Log Size Limit (1MB)
The 1MB cap you hit is the client-side limit on a single log record sent to the API. You can raise it with an environment variable (Prefect Cloud may still enforce its own server-side limits):
```bash
# Raise the maximum size of a single log sent to the API to 5MB (5,000,000 bytes)
export PREFECT_LOGGING_TO_API_MAX_LOG_SIZE=5000000
```
2. Alternative Logging Strategies
When dealing with large logs, consider these approaches:
- Break your logging into smaller chunks
- Use log rotation
- Log only critical information to Prefect and send verbose output to an external logging system
- Use Prefect's `get_run_logger()` with appropriate log levels:
```python
from prefect import flow, get_run_logger

@flow
def your_flow():
    logger = get_run_logger()
    # Use appropriate log levels
    logger.debug("Detailed debug info")          # For detailed debugging
    logger.info("Important but not huge info")   # For general information
    logger.error("Critical errors only")         # For errors
```
3. Handling Large Parameters (512KB limit)
Since you're hitting both parameter and log size limits, here are some recommended approaches:

a) Use Blocks for Large Data:
```python
from prefect import flow
from prefect.blocks.system import JSON

@flow
def your_flow():
    # However the large data is produced (loaded, computed, ...)
    your_large_data = {"example": "large payload"}

    # Store the large data in a block
    json_block = JSON(value=your_large_data)
    json_block.save("large-data-block", overwrite=True)

    # Pass the block name instead of the data
    process_data(block_name="large-data-block")

@flow
def process_data(block_name: str):
    # Retrieve the data from the block
    data = JSON.load(block_name).value
    # Process data
```
b) Use File Storage:
```python
import json
from pathlib import Path

from prefect import flow

@flow
def your_flow():
    # However the large data is produced (loaded, computed, ...)
    your_large_data = {"example": "large payload"}

    # Save the large data to a file
    temp_path = Path("/tmp/large_data.json")
    with open(temp_path, "w") as f:
        json.dump(your_large_data, f)

    # Pass the file path instead of the data
    process_data(file_path=str(temp_path))
```
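For completeness, a minimal sketch of the matching `process_data` subflow that reads the file back in (this assumes the parent flow and subflow run on the same machine or a shared filesystem):

```python
import json

from prefect import flow

@flow
def process_data(file_path: str):
    # Only the small path string was sent as a flow run parameter;
    # the large data is read back from disk here.
    with open(file_path) as f:
        data = json.load(f)
    # Process data
```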
c) Break Down into Smaller Chunks:
```python
from prefect import flow

@flow
def main_flow(large_data):
    # Break the data into chunks
    chunks = [large_data[i:i + 1000] for i in range(0, len(large_data), 1000)]

    # Process each chunk separately
    for chunk in chunks:
        process_chunk(chunk)

@flow
def process_chunk(chunk):
    # Process a smaller amount of data
    pass
```
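One caveat: `process_chunk` is itself a flow in this sketch, so each chunk is still sent to the API as a flow run parameter and each chunk must stay under the 512KB limit on its own. Making `process_chunk` a task instead avoids that, since task run parameters are never sent to the API.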
4. Monitoring and Debugging
To better track these issues:
- Use the Prefect UI's logs view for successful runs
- Check worker logs for deployment issues
- Set up external logging for detailed debugging
- Use `prefect flow-run logs <flow-run-id>` for specific flow runs

Would you like me to elaborate on any of these solutions or provide more specific examples for your use case? Also, it would be helpful to know more about the nature of your large parameters - are they data files, configuration settings, or something else?
j
@Kevin Grismore I'm on cloud, but the log was too long to be sent back to prefect, see second note above
(I'm on cloud, but local infrastructure)
k
that log being the error from the worker? was it a huge stack trace
j
Not a terribly huge stack trace, but a terribly huge parameter that was trying to be printed
and logged
and serialized
Maybe "quote" it?
k
ahhh I see
quoting just skips in-memory recursive walking of the param to see if it was returned from another task
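For reference, a minimal sketch of what quoting a parameter looks like at a call site, using `prefect.utilities.annotations.quote`; as discussed below, it only skips that recursive inspection and does not change what is sent to the API:

```python
from prefect import flow, task
from prefect.utilities.annotations import quote

@task
def process(df):
    ...

@flow
def my_flow(df):
    # quote() skips Prefect's recursive walk of the argument looking for
    # futures/states; it does not shrink what gets serialized for the API.
    process(quote(df))
```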
if it's large data, you could write it out to blob storage, pass the path to it as a param, and read it in in your subflow
j
So, if I move pulling the parameter from the database from the parent flow, to the start of the subflow, and then "quote" it for each task in the subflow, that might work?
@Kevin Grismore FYI - Logged #18545 and #18546 on github related to this problem.
n
quoting the parameters will not make a difference in terms of what is sent to the api, like kevin said it will just skip recursive exploration of the object that is the parameter (used to automatically resolve futures etc). also worth mentioning that task run parameters are never sent to the API, whereas flow parameters are
taking a look at the issues, thanks!
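Pulling the thread's conclusion together, a rough sketch of the restructuring discussed above: the parent flow passes only a small reference, the subflow loads the large data itself and hands it directly to its tasks (task run parameters are never sent to the API). `load_big_data` is a hypothetical stand-in for the database query mentioned earlier:

```python
from prefect import flow, task

def load_big_data(record_id: int):
    # Hypothetical stand-in for the database query that returns the
    # large object; replace with your own data-access code.
    ...

@task
def transform(data):
    ...

@flow
def subflow(record_id: int):
    # Only the small record_id is sent to the API as a flow run parameter.
    data = load_big_data(record_id)
    # Passing the large object to tasks is fine: task run parameters
    # are never sent to the API.
    transform(data)

@flow
def parent_flow(record_id: int):
    subflow(record_id)
```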