Paul Gierz
12/22/2021, 4:34 PM
@task
def get_n_newest_files_for_pattern(pattern: str, path: str, n: int) -> list:
    """
    Task to get the n newest files for a given pattern.
    """
    logger = prefect.context.get("logger")
    logger.info(f"Getting {n} newest files in {path} for pattern {pattern}")
    path_files = os.listdir(path)
    files_with_path = [os.path.join(path, f) for f in path_files]
    files = [pathlib.Path(f) for f in files_with_path if re.search(pattern, f)]
    logger.info(f"Found {len(files)} files for pattern {pattern}")
    logger.debug(f"Files: {files}")
    logger.info("Sorting files by modification time")
    files.sort(key=lambda x: x.stat().st_mtime, reverse=True)
    logger.info(f"Returning the {n} newest files")
    return files[:n]
I am getting file not found errors with:
FileNotFoundError: [Errno 2] No such file or directory: '<Parameter: Path to the top level of the experiment tree>/outdata/fesom'
I thought that once it was loaded, any Parameter would behave as whatever type it is supposed to be?
It is defined like this:
path = Parameter(name="Path to the top level of the experiment tree")
At one point beforehand I do an f-string conversion:
outdata_path = f"{path}/outdata/fesom"
Not having f-strings would be possible, but a bit annoying.
Zanie
Paul Gierz
12/22/2021, 4:53 PM
with Flow(
    "Regridded Timmean of Newest N Files for a FESOM 2D Variable (ESM Tools Layout)"
) as flow:
    # Get the experiment ID
    expid = Parameter(name="Experiment ID")
    # Get the main path of the output directory from the top of the experiment tree from the user:
    path = Parameter(name="Path to the top level of the experiment tree")
    # Get the 2D variable as a user parameter
    varname = Parameter(name="FESOM Variable Name")
    # Get the number of files to average as a user parameter
    nfiles = Parameter(name="nfiles", default=30)
    # Get the regrid size from the user:
    lat_size = Parameter("Latitude Size (e.g 1 for a 1x1 degree grid)", default=1.0)
    lon_size = Parameter("Longitude Size (e.g 1 for a 1x1 degree grid)", default=1.0)

    lons = np_arange(-180, 180, lon_size)
    lats = np_arange(-90, 90, lat_size)
    output_dir = path + "/outdata/fesom"
    pattern = finalize_pattern(varname)
    # Get all files in the output directory
    files = get_n_newest_files_for_pattern(pattern, output_dir, nfiles)
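The file-selection step called on the last line above can be tried outside Prefect, which may help separate path problems from flow problems. The following is only a sketch using the standard library: the function name n_newest_files and the file names are made up for illustration, logging is left out, and unlike the task posted earlier it matches the pattern against the file name rather than the full path.

```python
import os
import pathlib
import re
import tempfile


def n_newest_files(pattern: str, path: str, n: int) -> list:
    """Return the n most recently modified entries in path whose name matches pattern."""
    candidates = [pathlib.Path(path, name) for name in os.listdir(path)]
    # Assumption: match on the file name only, not on the directory part.
    matches = [p for p in candidates if re.search(pattern, p.name)]
    # Newest first, using the modification time reported by the filesystem.
    matches.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return matches[:n]


# Throwaway files with explicitly set modification times (made-up names).
with tempfile.TemporaryDirectory() as tmp:
    names = ["sst.fesom.2003.nc", "sst.fesom.2002.nc", "a_ice.fesom.2003.nc"]
    for age, name in enumerate(names):
        file_path = pathlib.Path(tmp, name)
        file_path.touch()
        # Later entries in the list get older timestamps.
        os.utime(file_path, (1_000_000 - age, 1_000_000 - age))
    newest = [p.name for p in n_newest_files(r"sst\.fesom", tmp, 2)]

print(newest)
```

If this works on a real directory but the flow still fails, the directory string handed to the task is the more likely culprit.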
Paul Gierz
12/22/2021, 4:54 PM
output_dir = f"{path}/outdata/fesom"
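The FileNotFoundError quoted earlier contains the text '<Parameter: ...>', which is consistent with the f-string being evaluated while path is still a placeholder object rather than a string. The sketch below reproduces that effect in plain Python. FakeParameter is a hypothetical stand-in written for this illustration, not Prefect's Parameter class, and "/work/exp001" is a made-up value.

```python
class FakeParameter:
    """Hypothetical stand-in for a flow parameter; NOT Prefect's Parameter class."""

    def __init__(self, name: str):
        self.name = name
        self.value = None  # only known once the flow actually runs

    def __repr__(self) -> str:
        return f"<Parameter: {self.name}>"


def build_output_dir(top_level: str) -> str:
    """Called at run time, so it receives a real string and f-strings work."""
    return f"{top_level}/outdata/fesom"


path = FakeParameter("Path to the top level of the experiment tree")

# Build time: the f-string formats the placeholder object itself.
eager = f"{path}/outdata/fesom"

# Run time: the value has been supplied and is passed into the function.
path.value = "/work/exp001"  # made-up example value
deferred = build_output_dir(path.value)

print(eager)
print(deferred)
```

In the flow itself, the equivalent of build_output_dir would presumably be a function decorated with @task, like get_n_newest_files_for_pattern, so that the f-string only runs once the parameter has a value.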
Paul Gierz
12/22/2021, 4:55 PM
Zanie
[...] Flow block. Any mutation needs to happen in a task, e.g. output_dir = path + "/outdata/fesom" will throw an error because path is still a Parameter type until it is sent to a task at runtime.
Paul Gierz
12/22/2021, 5:04 PM
Zanie
Zanie
Paul Gierz
12/22/2021, 5:11 PM
Zanie
Paul Gierz
12/22/2021, 5:27 PM
Zanie