# ask-community
j
We have a task that waits for a Spark step to finish; if the step fails, it gathers the Spark error and raises it as an exception so that it gets logged in Prefect Cloud. We've noticed that the Spark Scala stack trace is shown twice in Cloud, as you can see in the example below. Any ideas how to stop that from happening? Is it something specific to our setup, or is it expected behaviour? Spark traces are massive, and having them twice makes the logs hard to read and debug.
Encountered exception during execution: Exception('Step s-08129852Y2AW3G3A7H9F failed with status FAILED. Reason: Unknown Error. Message: \nYmir error stacktrace: java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down\n\tat com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)\n\tat com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)\n\tat com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)\n\tat com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)\n\tat com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2410)\n\tat com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2380)\n\tat

/// Truncated for clarity

)
Traceback (most recent call last):
  File "/opt/venv/lib/python3.11/site-packages/prefect/flow_engine.py", line 710, in run_context
    yield self
  File "/opt/venv/lib/python3.11/site-packages/prefect/flow_engine.py", line 754, in run_flow_sync
    engine.call_flow_fn()

  /// Truncated Stack trace for clarity

  File "/opt/venv/lib/python3.11/site-packages/Tasks/ymir/run_tasks.py", line 178, in run_command
    _emr.wait_for_step_finished(cluster_id, step_id, job_timeout_minutes)
  File "/opt/venv/lib/python3.11/site-packages/Tasks/ymir/_emr.py", line 321, in wait_for_step_finished
    raise Exception(
Exception: Step s-08129852Y2AW3G3A7H9F failed with status FAILED. Reason: Unknown Error. Message: 
Ymir error stacktrace: java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down
	at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
	at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)
	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
	at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
	at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2410)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2380)
	
/// Truncated for clarity
j
The message contents aren't the key to diagnosing and fixing this. You have to identify which logger(s) are emitting the duplicate records. You may need to set a logger's `propagate` attribute to `False`, or raise the log level on a logger somewhere.
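To illustrate the propagation mechanism referred to above: in Python's `logging` module, a record emitted on a child logger is handled by the child's own handlers and then travels up to every ancestor's handlers as well, so the same line can appear twice. The logger names below are illustrative only, not Prefect's actual logger names; this is a minimal sketch of the general mechanism, not a fix specific to Prefect Cloud.

```python
import logging

# Collect handled messages in a list so the duplication is visible.
records = []

class ListHandler(logging.Handler):
    def emit(self, record):
        records.append(record.getMessage())

# A parent/child logger pair, each with its own handler, reproduces
# the double-logging: the child's record is handled locally, then
# propagates up and is handled again by the parent's handler.
parent = logging.getLogger("app")            # illustrative name
child = logging.getLogger("app.emr")         # illustrative name
parent.addHandler(ListHandler())
child.addHandler(ListHandler())

child.error("Step failed")
print(records.count("Step failed"))          # handled twice

# Fix: stop the child's records from reaching ancestor handlers.
records.clear()
child.propagate = False
child.error("Step failed")
print(records.count("Step failed"))          # handled once
```

The same effect can occur when a library attaches its own handler to a logger that also propagates to the root logger; `propagate = False` (or removing one of the redundant handlers) is the usual remedy.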