
Compressing COSMO output

The nczip tool, written by Urs Beyerle (IAC), compresses the netCDF (.nc) COSMO output files and can save a lot of disk space. nczip is a bash script based on the NCO command ncpdq (https://linux.die.net/man/1/ncpdq). How much disk space is saved depends strongly on the properties of the file being compressed. For a field like precipitation, where most of the domain is zero, the compressed file can be as small as 10% of the original. For a strongly variable field such as specific humidity, the saving is much smaller and the compressed file is around 40-60% of the original size.
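To see where a given field falls within this range, one can compare file sizes before and after compression. The following is a minimal sketch, assuming a copy of the uncompressed file is kept next to the compressed one; the function name and the file names in the usage note are hypothetical and not part of nczip.

```python
from pathlib import Path


def size_ratio(original, compressed):
    """Return the compressed file size as a fraction of the original size."""
    original_size = Path(original).stat().st_size
    if original_size == 0:
        raise ValueError(f"{original} is empty, ratio is undefined")
    return Path(compressed).stat().st_size / original_size
```

For example, `size_ratio("precip_orig.nc", "precip_zip.nc")` would return a value near 0.1 for a file that shrinks to 10% of its original size.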

To facilitate compressing typical COSMO output, we created a Python 3 wrapper around nczip that runs the compression on the lffd.nc COSMO output files in parallel. The script was implemented and tested on Daint. It can be run with 12 tasks in parallel.
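The general idea of such a wrapper can be sketched as follows. This is not the actual code from the repository, only an illustration under stated assumptions: that nczip is called as `nczip <file>` (check the nczip help for the real calling convention), and that the output files match the pattern `lffd*.nc`. All function names are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def find_output_files(directory, pattern="lffd*.nc"):
    """Return the sorted files in `directory` that match `pattern`."""
    return sorted(Path(directory).glob(pattern))


def compress_file(path, command=("nczip",)):
    """Run the compression command on one file and return its exit code.

    ASSUMPTION: the tool takes the file name as its last argument.
    """
    result = subprocess.run([*command, str(path)], check=False)
    return result.returncode


def compress_all(files, command=("nczip",), n_tasks=12):
    """Compress `files` with up to `n_tasks` external processes at a time."""
    # Threads are enough here: each one only waits for an external process.
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        codes = list(pool.map(lambda f: compress_file(f, command), files))
    # Map each file name to the exit code of its compression run.
    return dict(zip(map(str, files), codes))
```

A call such as `compress_all(find_output_files("output/"))` would then return the exit code for every file, so that failed compressions can be identified and repeated.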

The code is currently hosted on the HYMET group GitLab page: https://gitlab.ethz.ch/hymet/nc_compression, but we will likely move it elsewhere soon to make it readily accessible to everyone. The README file explains how to run the code. This is only a first version, and we hope that users will improve it over time. If you encounter problems running the code, please contact christoph.heim@env.ethz.ch.

Running COSMO nested into ERA5

This section shares first experiences from running COSMO nested into ERA5. ERA5 has a horizontal resolution of 30 km, a 1-hourly time resolution, and 137 vertical levels. The setup presented here was used for a rather unconventional application of COSMO over the subtropical South-East Atlantic. The lateral forcing over the subtropics is quite different from that over the mid-latitudes, so some of the conclusions drawn here may not hold for a more typical application of COSMO.
  • ERA5 reanalysis data converted to int2lm-ready caf and cas files is available for the period 1979/01 - 2019/05 at DKRZ (https://www.dkrz.de). Information on where the data is stored can be found at https://redc.clm-community.eu/projects/cclmdkrz/wiki/INT2LM_input_data_sets (log in with your CLM-Community member account). Source: e-mail sent to the CLM community by Burkhardt Rockel on 13.12.2019.
  • Retrieving the files from the DKRZ tape system and copying them to Daint takes a long time because the files are very large. The subsequent steps are much faster.
  • int2lm runs very slowly (in our case slower than COSMO!) when the obtained cas files are used directly. It appears to scale very badly with input file size. To avoid this problem, and to save a lot of disk space, we cut out a subdomain from the cas files using the NCO command ncks (e.g. ncks -O -F -d level1,30,137 -d level,30,136 -d lon,-33.5,23.5 -d lat,-30.5,20.5 cas_in.nc cas_out.nc). The uppermost vertical levels are usually not needed to run COSMO. After cutting out the subdomain, int2lm runtimes were again comparable to those with global ERA-Interim files. Depending on the simulation period, the resulting files are also small enough that storing them locally may be feasible.
  • ERA5 reanalysis data is provided at 1-hourly time resolution. In our tests for the subtropical domain, using only 3-hourly forcing made no substantial difference. This saved a lot of time because we only needed to copy 1/3 of the files from DKRZ. However, the lateral forcing on our subtropical domain is relatively constant in time (trade winds). On a typical mid-latitude domain, 1-hourly forcing may well give substantial benefits over 3-hourly forcing.
  • Given the high horizontal resolution of ERA5 (30 km), we found that a 4.4 km COSMO simulation can be nested directly into ERA5, avoiding an intermediate nesting step at e.g. 12 km. Again, this may be different for another domain.
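The subdomain cut-out and the 3-hourly file selection described in the list above can be scripted along the following lines. This is a minimal sketch, not the workflow we actually ran: the default bounds are those of the ncks example above, the function names are hypothetical, and the file naming scheme (a 10-digit YYYYMMDDHH timestamp in the file name, e.g. cas2006010103.nc) is an assumption that should be checked against the files obtained from DKRZ.

```python
import re
import subprocess


def build_ncks_command(src, dst, level1=(30, 137), level=(30, 136),
                       lon=(-33.5, 23.5), lat=(-30.5, 20.5)):
    """Build the ncks call that cuts a subdomain out of one cas file."""
    # With -F, integer bounds are 1-based indices; bounds written with a
    # decimal point are interpreted by ncks as coordinate values.
    return [
        "ncks", "-O", "-F",
        "-d", f"level1,{level1[0]},{level1[1]}",
        "-d", f"level,{level[0]},{level[1]}",
        "-d", f"lon,{lon[0]},{lon[1]}",
        "-d", f"lat,{lat[0]},{lat[1]}",
        str(src), str(dst),
    ]


def is_forcing_step(filename, step_hours=3):
    """Return True if the file's hour is a multiple of `step_hours`.

    ASSUMPTION: the file name contains a YYYYMMDDHH timestamp.
    """
    match = re.search(r"\d{10}", filename)
    if match is None:
        return False
    hour = int(match.group(0)[-2:])
    return hour % step_hours == 0


def cut_subdomain(src, dst, **bounds):
    """Run ncks on one file; raises CalledProcessError if ncks fails."""
    subprocess.run(build_ncks_command(src, dst, **bounds), check=True)
```

Filtering the file list with `is_forcing_step` before retrieving the data keeps only every third hourly file, which is where the reduction to 1/3 of the copied files comes from.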