WRF on AWS: LADCO User's Guide
Last update: 16 May 2019

This guide describes how to configure and optimize AWS for running WRF.
Summary
- AWS ParallelCluster (pcluster) v2.1.0 using Amazon Linux (alinux)
- WRFv3.9.1 compiled with netCDF4 (compression)
- PGI compiler 2018 with OpenMPI v3.1.3
- NetCDF C 4.6.2, Fortran 4.2
- Spot instances with sc1 cold storage volumes
The AWS ParallelCluster package builds a computing cluster with a constantly running master node that is used to launch jobs on compute nodes. The compute nodes are started only when a job is initiated and shut down after a default of 10 minutes of idle time (the idle timeout can be changed through the pcluster config file). This system lets you choose different instance types for the master and compute nodes. We chose an inexpensive master instance (c4.large: 2 CPUs, 3.75 GB RAM) and compute-optimized compute instances (c4.4xlarge: 16 CPUs, 30 GB RAM). We attached a 10 TB EBS volume (sc1 Cold HDD) for storage.
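For reference, a minimal sketch of how the idle timeout can be changed in a ParallelCluster v2 config; the scaling section name (custom) and the 20-minute value here are illustrative assumptions and are not part of the configs shown below.

 # Hypothetical addition to a pcluster config: point the cluster section at a
 # scaling section, then set the compute-node idle timeout (in minutes) there.
 [cluster ladcospot]
 scaling_settings = custom

 [scaling custom]
 # Compute nodes are terminated after this many minutes of idle time (default 10)
 scaledown_idletime = 20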
For the workflow, we run an operational script system that downloads MADIS observations, runs WPS, REAL, and WRF, and runs a script to replace the NOAA SST with GLSEA SST. The simulation is split into multiple 32-CPU, 5.5-day segments that run at the same time.
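A minimal sketch of how one 32-CPU segment might be submitted under the cluster's SGE scheduler with OpenMPI; the parallel environment name ("mpi") and the submit script itself are assumptions, since the operational scripts referenced above handle this step in practice.

 #!/bin/bash
 # Hedged sketch: submit one 32-CPU WRF segment through SGE with OpenMPI.
 # The parallel environment name "mpi" is an assumption; run `qconf -spl`
 # on the master node to list the PEs defined on your cluster.
 #$ -N WRF_seg
 #$ -pe mpi 32
 #$ -cwd
 #$ -j y
 cd /data/apps/WRFV3.9.1/sims/LADCO_2016_WRFv39_YNT_NAM
 mpirun -np $NSLOTS ./wrf.exe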
After WRF completes, we run MCIP and WRFCAMx, and ingest the results into AMET. We archive the wrfout, MCIP, and WRFCAMx data to S3 Glacier.
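A hedged example of the archive step using the AWS CLI's GLACIER storage class; the bucket name and key prefix are placeholders, not the actual archive location.

 # Push a completed run's wrfout files to S3 under the Glacier storage class.
 # "your-archive-bucket" is a placeholder bucket name.
 aws s3 cp --recursive --storage-class GLACIER \
     /data/apps/WRFV3.9.1/sims/LADCO_2016_WRFv39_YNT_NAM/ \
     s3://your-archive-bucket/LADCO_2016_WRFv39_YNT_NAM/ \
     --exclude "*" --include "wrfout_d0*"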
Notes
- No success with MPICH (v3.2.1) on AWS ALinux. We tried the GCC (7.2.1), PGI (2018), and Intel (XE 2019) compilers, and also tried WRFV4.0; nothing worked with MPICH. WRF compiles, but it was unstable and crashed consistently with segfaults after a seemingly random number of output timesteps.
- A stable WRF is produced with OpenMPI; we ultimately settled on PGI 2018 and OpenMPI v3.1.3.
- Prototyping and testing were done with EC2 on-demand m5a.* instances (see the on demand pcluster config below) and gp2 EBS volumes.
- Production runs were done with EC2 spot c4.* instances (see the spot pcluster config below) and sc1 volumes.
AWS Parallel Cluster
- Install AWS-CLI: https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html
- Install pcluster: https://aws.amazon.com/blogs/opensource/aws-parallelcluster/
- Don't set the spot price. When you do not set a spot price, AWS gives you the spot market price capped at the on-demand price (https://aws.amazon.com/blogs/compute/new-amazon-ec2-spot-pricing/). Setting a spot price makes the instance more prone to being reclaimed and your job terminated. As there is currently no checkpoint/restart functionality automatically enabled on EC2 instances, losing an instance is a show-stopper for WRF production runs. For WRF MPI applications it is not worth playing the spot market if the tradeoff is instance reliability.
- Use the base alinux AMI for your version of pcluster; e.g., for v2.4.0: https://github.com/aws/aws-parallelcluster/blob/v2.4.0/amis.txt
Configure the cluster with a config file:
spot pcluster config
 [aws]
 aws_region_name = us-east-2

 [cluster ladcospot]
 vpc_settings = public
 ebs_settings = ladcosc1
 scheduler = sge
 master_instance_type = c4.large
 compute_instance_type = c4.4xlarge
 placement = compute
 placement_group = DYNAMIC
 master_root_volume_size = 40
 cluster_type = spot
 #spot_price = 0.2
 base_os = alinux
 key_name = *****
 # Base AMI for pcluster v2.1.0
 custom_ami = ami-0381cb7486cdc973f

 # Create a cold storage I/O directory
 [ebs ladcosc1]
 shared_dir = data
 volume_type = sc1
 volume_size = 10000
 volume_iops = 1500
 encrypted = false

 [vpc public]
 master_subnet_id = subnet-******
 vpc_id = vpc-******

 [global]
 update_check = true
 sanity_check = true
 cluster_template = ladcospot

 [aliases]
 ssh = ssh -Y {CFN_USER}@{MASTER_IP} {ARGS}
on demand pcluster config
 [aws]
 aws_region_name = us-east-2

 [cluster ladcowrf]
 vpc_settings = public
 ebs_settings = ladcowrf
 scheduler = sge
 master_instance_type = m4.large
 compute_instance_type = m5a.4xlarge
 placement = cluster
 placement_group = DYNAMIC
 master_root_volume_size = 40
 cluster_type = ondemand
 base_os = alinux
 key_name = *****
 min_vcpus = 0
 max_vcpus = 64
 desired_vcpus = 0
 # Base AMI for pcluster v2.1.0
 custom_ami = ami-0381cb7486cdc973f

 [ebs ladcowrf]
 shared_dir = data
 volume_type = gp2
 volume_size = 10000
 volume_iops = 1500
 encrypted = false

 [vpc public]
 master_subnet_id = subnet-******
 vpc_id = vpc-******

 [global]
 update_check = true
 sanity_check = true
 cluster_template = ladcowrf

 [aliases]
 ssh = ssh -Y {CFN_USER}@{MASTER_IP} {ARGS}
Cluster Access
Start the cluster:
 pcluster create -c config.spot ladcospot

Log in to the cluster:
 pcluster ssh ladcospot -i {name of your AWS Key}
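A few other pcluster commands that are useful for day-to-day management of the cluster:

 pcluster status ladcospot    # check the state of the cluster stack
 pcluster stop ladcospot      # stop the compute fleet (the master keeps running)
 pcluster delete ladcospot    # tear down the whole cluster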
Fault Tolerance
Script to watch for a spot termination notice (by polling the EC2 instance metadata) and restart WRF:
 #!/bin/bash
 # Poll the EC2 instance metadata for a spot termination notice; when one
 # appears, resubmit the WRF restart job.
 CASE=LADCO_2016_WRFv39_YNT_NAM
 JSTART=2016095
 wrk_dir=/data/apps/WRFV3.9.1/sims/LADCO_2016_WRFv39_YNT_NAM
 while true
 do
   # The termination-time path returns a timestamp (e.g. 2019-05-16T12:00:00Z)
   # only after AWS has scheduled the instance for reclamation.
   if curl -s http://169.254.169.254/latest/meta-data/spot/termination-time | grep -q 'T.*Z'; then
     echo "Termination notice received"
     break
   else
     echo "Still running fine"
     sleep 3
   fi
 done
 echo "Restarting WRF job"
 #qsub -N WRF_rest $wrk_dir/wrapper_restart_wrf.csh $JSTART 6
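A hedged note on usage: the spot/termination-time metadata path only returns a timestamp on a spot instance that has actually received a reclaim notice, so a watchdog like this would need to run on one of the spot compute nodes, for example started in the background alongside the WRF job. The script name spot_watch.sh below is a placeholder.

 # Hypothetical: launch the watchdog in the background next to wrf.exe
 nohup ./spot_watch.sh > spot_watch.log 2>&1 &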
AWS Configuration
Packages/software installed (a hedged build-environment sketch follows this list):
- PGI 2018
- GCC and Gfortran 7.2.1
- NetCDF C 4.6.2, Fortran 4.2
- HDF5 1.10.1
- JASPER 1.900.2
- ZLIB 1.2.11
- R 3.4.1
- OpenMPI 3.1.3
- System utilities: yum -y install screen dstat htop strace perf pdsh ImageMagick
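A hedged sketch of the kind of environment used to build WRF against these packages; the /opt install prefixes and the PGI directory layout are assumptions rather than the paths used on this cluster, and NETCDF4=1 is what enables the compressed netCDF4 output noted in the Summary.

 # Illustrative environment for a PGI + OpenMPI + netCDF4 WRF 3.9.1 build.
 # All install prefixes below are assumptions.
 export PGI=/opt/pgi
 export PATH=$PGI/linux86-64/18.4/bin:/opt/openmpi-3.1.3/bin:$PATH
 export NETCDF=/opt/netcdf            # netCDF C 4.6.2 + Fortran 4.2 installed here
 export HDF5=/opt/hdf5-1.10.1
 export JASPERLIB=/opt/jasper/lib     # used by WPS for GRIB2 support
 export JASPERINC=/opt/jasper/include
 export NETCDF4=1                     # build WRF with netCDF4/HDF5 compression
 ./configure                          # select a PGI dmpar option
 ./compile em_real >& compile.log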