WRF on the Cloud

LADCO is seeking to understand the best practices for submitting and managing multiprocessor computing jobs on a cloud computing platform. In particular, LADCO would like to develop a WRF production environment that uses cloud-based computing. The goal of this project is to prototype a WRF production environment on a public, on-demand high-performance computing service in the cloud to create a WRF platform-as-a-service (PaaS) solution. The WRF PaaS must meet the following objectives:

  • Configurable computing and storage to scale, as needed, to meet the needs of different WRF applications
  • Configurable WRF options to enable changing grids, simulation periods, physics options, and input data
  • Flexible cloud deployment from a command-line interface to initiate computing clusters and spawn WRF jobs in the cloud (see the sketch after this list)
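
As a rough illustration of the third objective, the sketch below spawns compute nodes from a command line with boto3. It is a minimal sketch assuming AWS EC2 as the provider; the AMI, default instance type, and node count are hypothetical placeholders, and a real deployment would also provision the head node, shared storage, and MPI networking.

    # launch_nodes.py -- minimal sketch of spawning WRF compute nodes on EC2.
    # The AMI ID and default instance type are hypothetical placeholders.
    import argparse
    import boto3

    def launch_nodes(ami_id, instance_type, count):
        """Start `count` EC2 instances and return their instance IDs."""
        ec2 = boto3.client("ec2")
        resp = ec2.run_instances(
            ImageId=ami_id,              # AMI pre-built with WRF and netCDF libraries
            InstanceType=instance_type,
            MinCount=count,
            MaxCount=count,
        )
        return [inst["InstanceId"] for inst in resp["Instances"]]

    if __name__ == "__main__":
        p = argparse.ArgumentParser(description="Spawn WRF compute nodes")
        p.add_argument("--ami", required=True, help="AMI with WRF installed")
        p.add_argument("--type", default="r4.8xlarge", help="EC2 instance type")
        p.add_argument("--count", type=int, default=1, help="number of nodes")
        args = p.parse_args()
        print(launch_nodes(args.ami, args.type, args.count))

The memory-optimized default (r4 family) is only a guess motivated by the CAMx observation in the costing notes below; the right instance type for WRF would come out of the benchmarking.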

Call Notes

  • WRF benchmarking (emulating the 2016 WRF 12/4/1.3 km grids): costing for CPUs, RAM, and storage
  • CPU: the 5.5-day benchmark run completed in 4 days on 8 cores and 3 days on 24 cores
  • RAM: ~22 GB per run (~2.5 GB/core)
  • Storage: tested netCDF4 with and without compression; compressed output is about 1/3 the size of uncompressed netCDF (~70% compression); downstream programs need the HDF5 and netCDF4 libraries linked in to read compressed output; annual output is estimated at about 5.8 TB compressed, rising to 16.9 TB uncompressed (see the check after this list)
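
A quick arithmetic check of the storage figures above, taking GB and TB as the intended units:

    # Sanity check of the compression estimates quoted in the notes.
    compressed_tb = 5.8      # estimated annual output with netCDF4 compression
    uncompressed_tb = 16.9   # estimated annual output, uncompressed

    ratio = compressed_tb / uncompressed_tb         # ~0.34, i.e. about 1/3
    savings = 1 - ratio                             # ~0.66, close to the ~70% quoted
    print(f"compressed/uncompressed: {ratio:.2f}")  # 0.34
    print(f"space saved: {savings:.0%}")            # 66%

Existing uncompressed files can also be converted after the fact with the stock netCDF nccopy utility (e.g., nccopy -d 4 wrfout.nc wrfout_c.nc), provided downstream readers are built against HDF5/netCDF4.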

Costing Analysis

  • Cluster management would launch a head node and compute nodes
  • 77 chunks, run on 20 compute nodes for 16 days
  • Head node running constantly
  • Compute nodes running for the length of the project
  • Could probably use 80 nodes for 4 days instead of 20 nodes for 16 days (see the sketch after this list)
  • Memory-optimized machines performed better than compute-optimized machines for CAMx
  • Storage
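
The two schedules in the notes above involve the same total compute; only the wall-clock time differs. A minimal sketch of the trade-off, with a hypothetical per-node price rather than a quoted figure:

    # Node-days are identical between the two schedules; wall clock changes.
    slow = 20 * 16   # 20 nodes for 16 days = 320 node-days
    fast = 80 * 4    # 80 nodes for  4 days = 320 node-days
    assert slow == fast

    # Under flat on-demand pricing, cost scales with node-hours, so both
    # schedules cost about the same; HOURLY_RATE is a placeholder value.
    HOURLY_RATE = 2.00  # USD per node-hour (hypothetical, not a quoted price)
    cost = slow * 24 * HOURLY_RATE
    print(f"{slow} node-days ~ ${cost:,.0f} at ${HOURLY_RATE:.2f}/node-hour")

The wider, shorter schedule finishes four times sooner for roughly the same compute cost, which is the usual argument for scaling out in the cloud; the always-on head node adds a small fixed cost on top.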