pegasus-isi / ACME-Workflow

Commit f9ab44d9, authored Mar 12, 2015 by Juve, Gideon

Add setup script

parent 059baf0e
README.md
=========
Pegasus workflow for ACME climate models.

Consult the [CESM User's Guide](http://www.cesm.ucar.edu/models/cesm1.2/cesm/doc/usersguide/book1.html)
for more information about the climate code used by this workflow.

On Hopper
---------

You need to create the case while logged into Hopper. In the following,
assume that these variables are set to the values below:

    ACMEPATH   = /global/project/projectdirs/m2187/acme_code/ACME
    CASENAME   = F1850.g37.case2
    SCRATCHDIR = /scratch/scratchdirs/juve/
    CASEDIR    = $SCRATCHDIR/$CASENAME

Note that the "shared-scratch" directory in your site catalog must match the
value of SCRATCHDIR above. This is because we use CASENAME as the --relative-dir
option for the Pegasus planner, so that CASEDIR becomes Pegasus' scratch dir.

1. Create the case directory. From $ACMEPATH/scripts run:

       $ ./create_newcase -case $CASEDIR -mach hopper -compset F1850 -res T31_g37 -project m2187

   In this case "m2187" is our Hopper project ID, and "mach" means "machine"
   and should be set to "hopper". The "compset" defines the component set, or
   the set of model components and configuration you want to use, and "res"
   specifies the resolution, or grid setting. Possible compsets include:
   F1850, B1850. Possible grids include: ne30_g16, T31_g37.

2. Make any manual changes to the case that are required for your simulation.

3. Set up the case. From $CASEDIR run:

       $ ./cesm_setup

4. Compile the code. This should take about 20 minutes. From $CASEDIR run:

       $ ./$CASENAME.build
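The variable assignments above can be written directly in the shell; a minimal sketch using the README's example values (substitute your own username and case name):

```shell
# Example values from the README; replace "juve" with your NERSC username
ACMEPATH=/global/project/projectdirs/m2187/acme_code/ACME
CASENAME=F1850.g37.case2
SCRATCHDIR=/scratch/scratchdirs/juve
CASEDIR=$SCRATCHDIR/$CASENAME   # the case directory Pegasus will use as scratch
echo "$CASEDIR"
```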
On Submit Host
--------------

Steps to Run the Workflow
-------------------------

1. Create/edit the configuration file (e.g. test.cfg)

   a. Set "casename" to match the name of your case.
   b. Set "mppwidth" to the number of cores that your run requires.
   c. Set "stop_n" and "walltime" to create the number of stages you want
      the workflow to have.

2. Create/edit the setup script (e.g. setup.sh)

   a. Set the create_newcase parameters

3. Generate the DAX:

       $ python daxgen.py test.cfg setup.sh DIRNAME

4. Edit the site catalog, sites.xml:

   a. Update the "shared-scratch" directory entry to have your username
   b. Update the "shared-storage" directory entry

5. Plan the DAX:

       $ ./plan.sh DIRNAME

6. Get a NERSC grid proxy using:

       $ myproxy-logon -s nerscca.nersc.gov:7512 -t 24 -T -l YOUR_NERSC_USERNAME

7. Follow the output of plan.sh to submit the workflow:

       $ pegasus-run DIRNAME/path/to/submit/dir

8. Monitor the workflow:

       $ pegasus-status -l DIRNAME/path/to/submit/dir
bin/acme-setup.sh (new file)
============================
```bash
#!/bin/bash
# This script creates a new case, runs cesm_setup, and builds the code
set -e

if [ -z "$ACMEROOT" ]; then
    echo "ERROR: Set the ACMEROOT environment variable in sites.xml"
    exit 1
fi

if ! [ -d "$ACMEROOT" ]; then
    echo "ERROR: ACMEROOT does not point to a valid path"
    exit 1
fi

function usage() {
    echo "Usage: $0 -case CASENAME -setup SETUP_SCRIPT"
}

if [ $# -eq 0 ]; then
    usage
    exit 1
fi

CONTINUE_RUN=FALSE

while [ "$#" -ne 0 ]; do
    case "$1" in
        -case)
            shift
            CASENAME=$1
            ;;
        -setup)
            shift
            SETUP_SCRIPT=$1
            ;;
        *)
            usage
            exit 1
            ;;
    esac
    shift
done

if [ -z "$CASENAME" ]; then
    echo "ERROR: Specify -case"
    exit 1
fi

if [ -z "$SETUP_SCRIPT" ]; then
    echo "ERROR: Specify -setup"
    exit 1
fi

SCRATCHDIR=$PWD
export CASEROOT=$SCRATCHDIR/$CASENAME

echo "CASENAME is $CASENAME"
echo "Setting CASEROOT to $CASEROOT"
echo "Setup script is $SETUP_SCRIPT"

# Clean up after any previous failed run
if [ -d "$CASEROOT" ]; then
    echo "WARNING: CASEROOT already exists: $CASEROOT"
    echo "Removing existing CASEROOT"
    rm -rf $CASEROOT
fi

# Mark the setup script as executable
chmod 755 $SETUP_SCRIPT

# Run the setup script
./$SETUP_SCRIPT
```
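The script's option parsing can be exercised in isolation. A minimal sketch of the same `-case`/`-setup` loop (the `parse_args` wrapper is ours, not part of the script):

```shell
# Minimal sketch of acme-setup.sh's -case/-setup parsing loop
parse_args() {
    CASENAME=""
    SETUP_SCRIPT=""
    while [ "$#" -ne 0 ]; do
        case "$1" in
            -case)  shift; CASENAME=$1 ;;
            -setup) shift; SETUP_SCRIPT=$1 ;;
            *)      echo "Usage: -case CASENAME -setup SETUP_SCRIPT"; return 1 ;;
        esac
        shift
    done
}

parse_args -case F1850.g37.case2 -setup setup.sh
echo "$CASENAME $SETUP_SCRIPT"
```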
daxgen.py
=========
...

```python
import os
import sys
import re
import shutil
import string
from ConfigParser import ConfigParser
from Pegasus.DAX3 import *
```
...

```python
    f.close()


class ACMEWorkflow(object):
    def __init__(self, outdir, setup, config):
        "'outdir' is the directory where the workflow is written, and 'config' is a ConfigParser object"
        self.outdir = outdir
        self.setup = setup
        self.config = config
        self.daxfile = os.path.join(self.outdir, "dax.xml")
        self.replicas = {}
```
...

```python
        f = open(path, "w")
        try:
            f.write("""
tr acme-setup {
    site local {
        pfn "file://%s/bin/acme-setup.sh"
        arch "x86_64"
        os "linux"
        type "STAGEABLE"
        profile globus "count" "1"
        profile globus "jobtype" "single"
        profile hints "globusScheduler" "auxiliary"
    }
}

tr acme-run {
    site local {
        pfn "file://%s/bin/acme-run.sh"
```

...

```python
        profile pegasus "exitcode.failuremsg" "Segmentation fault"
    }
}
""" % (DAXGEN_DIR, DAXGEN_DIR, self.mppwidth, DAXGEN_DIR, DAXGEN_DIR))
        finally:
            f.close()
```
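The substitution tuple changes here because the new `tr acme-setup` block adds a `%s` placeholder ahead of the existing ones, so a second `DAXGEN_DIR` must now precede `self.mppwidth`. A toy illustration of that left-to-right ordering constraint (the values are hypothetical, not from the repository):

```python
# Old-style %-formatting fills placeholders strictly left to right, so
# inserting a new %s earlier in the template shifts every later argument.
DAXGEN_DIR = "/path/to/daxgen"   # hypothetical install dir
mppwidth = 24                    # hypothetical core count

template = "setup=%s run=%s cores=%s"
result = template % (DAXGEN_DIR, DAXGEN_DIR, mppwidth)
print(result)
```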
...

```python
        "Generate a workflow (DAX, config files, and replica catalog)"
        dax = ADAG(self.casename)

        if self.stop_option in ["nyear", "nyears"]:
            amwg = True
        else:
            print "WARNING: Diagnostics not added to workflow unless stop option is 'nyears'. Current setting is '%s'" % self.stop_option
            amwg = False

        # Add the setup stage
        setupscript = File(self.setup)
        setup = Job(name="acme-setup")
        setup.addArguments("-case", self.casename, "-setup", setupscript)
        setup.uses(setupscript, link=Link.INPUT, register=False, transfer=True)
        dax.addJob(setup)
        self.add_replica(self.setup, os.path.join(self.outdir, self.setup))

        last = None
        tot_years = 0
        i = 1
        for stop_n, walltime in zip(self.stop_n, self.walltime):
            stage = Job(name="acme-run")
            stage.addArguments("-stage %s -stop %s -n %s" % (i, self.stop_option, stop_n))
            stage.addProfile(Profile(namespace="globus", key="maxwalltime", value=walltime))
            dax.addJob(stage)
            if i == 1:
                dax.depends(stage, setup)
            else:
                stage.addArguments("-continue")
            if last is not None:
                dax.depends(stage, last)
```
...

```python
        dax.writeXMLFile(self.daxfile)

    def generate_workflow(self):
        self.generate_dax()
        self.generate_replica_catalog()
        self.generate_transformation_catalog()
        self.generate_env()


def main():
    if len(sys.argv) != 4:
        raise Exception("Usage: %s CONFIGFILE SETUP OUTDIR" % sys.argv[0])

    configfile = sys.argv[1]
    setup = sys.argv[2]
    outdir = sys.argv[3]

    if not os.path.isfile(configfile):
        raise Exception("Invalid CONFIGFILE: No such file: %s" % configfile)

    if not os.path.isfile(setup):
        raise Exception("Invalid SETUP script: No such file: %s" % setup)

    outdir = os.path.abspath(outdir)
    if os.path.isdir(outdir):
        raise Exception("Directory exists: %s" % outdir)

    # Read the config file
    config = ConfigParser()
    config.read(configfile)

    # Create the output directory
    os.makedirs(outdir)

    # Save a copy of the config file and setup script
    shutil.copy(configfile, outdir)
    shutil.copy(setup, outdir)

    # Generate the workflow in outdir based on the config file
    workflow = ACMEWorkflow(outdir, os.path.basename(setup), config)
    workflow.generate_workflow()
```

...
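The dependency structure built by `generate_dax` — stage 1 depends on the setup job, each later stage depends on its predecessor and gets a `-continue` flag — can be sketched in plain Python without Pegasus. `chain_stages` is a hypothetical helper for illustration, not part of daxgen.py:

```python
def chain_stages(stop_ns):
    """Return (deps, args): the parent of each acme-run stage plus its arguments."""
    deps = []   # (child, parent) pairs, mirroring dax.depends(stage, parent)
    args = {}
    last = "acme-setup"
    for i, stop_n in enumerate(stop_ns, start=1):
        name = "acme-run-%d" % i
        args[name] = ["-stage", str(i), "-n", str(stop_n)]
        if i > 1:
            args[name].insert(0, "-continue")   # later stages continue the run
        deps.append((name, last))
        last = name
    return deps, args

deps, args = chain_stages([5, 5, 5])
print(deps)
```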
plan.sh
=======
...

```bash
pegasus-plan \
    --conf $PP \
    --dax $DAX \
    --dir $SUBMIT_DIR \
    --sites $SITE \
    --output-site $OUTPUT_SITE \
    --cleanup none \
```

...
setup-hopper-F1850-T31_g37.sh (new file)
========================================
```bash
#!/bin/bash
# This is an example ACME setup script. The workflow runs this script with
# several environment variables:
#
# ACMEROOT: This is the path to the ACME/CESM source code
# CASENAME: The name of the case
# CASEROOT: This is the path to the case directory
#
# This script should run create_newcase, cesm_setup, and $CASENAME.build.

set -e

$ACMEROOT/scripts/create_newcase -case $CASEROOT -mach hopper -compset F1850 -res T31_g37

cd $CASEROOT
./cesm_setup
./$CASENAME.build
```
sites.xml
=========
...

```xml
<site handle="hopper" arch="x86_64" os="LINUX">
    <grid type="gt5" contact="hoppergrid.nersc.gov/jobmanager" scheduler="Fork" jobtype="auxillary"/>
    <grid type="gt5" contact="hoppergrid.nersc.gov/jobmanager-pbs" scheduler="PBS" jobtype="compute"/>

    <!-- This is where the casedir goes. Put your username here instead of "juve". -->
    <directory type="shared-scratch" path="/scratch/scratchdirs/juve">
        <file-server operation="all" url="gsiftp://hoppergrid.nersc.gov/scratch/scratchdirs/juve"/>
    </directory>

    <!-- This is where output files go. -->
    <directory type="shared-storage" path="/project/projectdirs/m2187/pegasus">
        <file-server operation="all" url="gsiftp://hoppergrid.nersc.gov/project/projectdirs/m2187"/>
    </directory>

    <profile namespace="env" key="PEGASUS_HOME">/project/projectdirs/m2187/pegasus/pegasus-4.4.0</profile>
    <profile namespace="env" key="ACMEROOT">/global/project/projectdirs/m2187/acme_code/ACME</profile>
    <profile namespace="env" key="DIAG_HOME">/project/projectdirs/m2187/amwg/amwg_diagnostics</profile>
    <profile namespace="env" key="MAGICK_HOME">/project/projectdirs/m2187/ImageMagick-6.9.0.4</profile>
    <profile namespace="globus" key="project">m2187</profile>
```

...
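Updating the "shared-scratch" entry with your username amounts to editing the `path` attribute above. As a sanity check, the attribute can be read back with the standard library's XML parser; the inline fragment below is a hypothetical minimal stand-in for the catalog, not the full sites.xml:

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal fragment mirroring the sites.xml entries above
fragment = """
<site handle="hopper" arch="x86_64" os="LINUX">
  <directory type="shared-scratch" path="/scratch/scratchdirs/juve"/>
  <directory type="shared-storage" path="/project/projectdirs/m2187/pegasus"/>
</site>
"""

site = ET.fromstring(fragment)
scratch = site.find("directory[@type='shared-scratch']").get("path")
print(scratch)
```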
test.cfg
========
```ini
[acme]

# This is the name of the case.
casename = F1850.g37.case2

# This is the number of cores to use for each stage. For now, this needs to
# be set manually. Eventually we might be able to derive it automatically
# from the compset and grid.
mppwidth = 24

# This is the unit of simulation time
```

...
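daxgen.py reads this file with Python 2's `ConfigParser`; the same fields can be read under Python 3's `configparser` as sketched below (the inline string stands in for test.cfg):

```python
from configparser import ConfigParser

cfg = ConfigParser()
cfg.read_string("""
[acme]
casename = F1850.g37.case2
mppwidth = 24
""")

casename = cfg.get("acme", "casename")   # used as the case/scratch dir name
mppwidth = cfg.getint("acme", "mppwidth")  # cores per acme-run stage
print(casename, mppwidth)
```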