Merge branch 'docs/examples' into 'main' (ea3b243e) · Commits · Research Enablement / Xylem

examples/atmospheric_correction-pansharpening-orthorectification/README.md

0 → 100644

+103 −0

Original line number	Diff line number	Diff line
		# Configuration for: Atmospheric Correction, Pansharpening, and Orthorectification

		## Breaking Down the Configuration
		- Specify the input directory or series of directories:
		- Use the `input` value to parse through imagery in parallel.
		- In the associated `config.json`, there is a single directory listed. Xylem will parse through this main directory and concurrently process all available subdirectories.
		```json
		"input":
		"/data/test-cases"
		```
		- Multiple directories may also be specified. When listing multiple directories, enclose them in `[ ]`, which signals to Xylem that these specific directories should be processed concurrently.
		```json
		"input": [
		"/data/test-cases/test-case-GE01-medium-low-mid",
		"/data/test-cases/test-case-WV02-medium-mid-high",
		"/data/test-cases/test-case-wv03-large-high-mid",
		"/data/test-cases/test-case-wv03-large-mid-low"
		]
		```
		- Define each module:
		- Within the module variable, specify the `name` and `uri` of the module. The `uri` tells Xylem where to install the module environment from. For example, if a user has Xylem installed and has cloned each of the modules into their `dev` directory within a Docker container, this path may look something like:
		```json
		"uri": "file:///root/dev/module-a"
		```
		- Define each module's variables:
		- Variables are unique to each module (detailed below).
		- Simply follow the `template`:
		- This section is built from the module's `Makefile`. Here, users can specify arguments and associated variables. Below is an example template for Module A:
		```json
		"template": {
		"command": "python",
		"environment": {
		"name": "module-a",
		"manager": "conda"
		},
		"arguments": [
		"-m",
		"lib",
		"--input_directory",
		"{{ INPUT }}",
		"--output_directory",
		"{{ OUTPUT_DIRECTORY }}",
		"--method",
		"{{ METHOD }}",
		"--profile",
		"{{ PROFILE }}"
		]
		}
		```
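As a rough illustration of how the `variables` and `template` sections relate, the sketch below fills the `{{ NAME }}` placeholders in a template's `arguments` and joins the result into a command line. It is not Xylem's implementation: `render_command` is a hypothetical helper, and it assumes that `INPUT` is supplied from the top-level `input` value while the other names come from the module's `variables`.

```python
import re
import shlex

# Matches placeholders such as "{{ OUTPUT_DIRECTORY }}" (inner whitespace optional).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_command(template, variables):
    """Hypothetical helper: return the command as a list with placeholders filled in."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for template variable {name}")
        return str(variables[name])

    arguments = [PLACEHOLDER.sub(substitute, argument) for argument in template["arguments"]]
    return [template["command"], *arguments]


# Template copied from the Module A example above.
template = {
    "command": "python",
    "arguments": [
        "-m", "lib",
        "--input_directory", "{{ INPUT }}",
        "--output_directory", "{{ OUTPUT_DIRECTORY }}",
        "--method", "{{ METHOD }}",
        "--profile", "{{ PROFILE }}",
    ],
}

# Assumption: INPUT is one of the directories listed under `input`.
variables = {
    "INPUT": "/data/test-cases/test-case-GE01-medium-low-mid",
    "OUTPUT_DIRECTORY": "/data/output",
    "METHOD": "boa_reflectance",
    "PROFILE": "urban",
}

command = render_command(template, variables)
print(shlex.join(command))
```

A missing variable raises `KeyError` rather than leaving a literal `{{ NAME }}` in the command, which makes a misspelled variable name easier to spot.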

		## Module Variables
		For more thorough discussions of the variables and overall structure, along with links to technical documentation, see each module's README:
		- [Module A](https://code-int.ornl.gov/gshs/common/imagery-processing/module-a/-/blob/main/README.md)
		- [Module P](https://code-int.ornl.gov/gshs/common/imagery-processing/module-p/-/blob/main/README.md)
		- [Module O](https://code-int.ornl.gov/gshs/common/imagery-processing/module-o/-/blob/main/README.md)

		### Module A: *A*tmospheric Correction
		- `input_directory`: the directory containing the raw, Level 1B Maxar imagery. Module A currently expects a Maxar data directory structure. This argument will always be `INPUT` in the template.
		- `output_directory`: the directory in which to store the output of Module A processing.
		- `method`: specification for users to define if they want the output to contain top-of-atmosphere reflectance (`toa_reflectance`) or bottom-of-atmosphere reflectance (`boa_reflectance`). For the best representation of true surface reflectance (e.g., the removal of the blue effects of the atmosphere), users should select the latter.
		- `profile`: specification for users to define the aerosol profile selected in the Py6S model. Module A is currently optimized for `urban` applications, so consider using `maritime` sparingly.
		- Future work for Module A will incorporate the maritime-optimized atmospheric correction efforts led by Matt McCarthy.

		### Module P: *P*ansharpening
		- `source_directory`: the directory containing the output of Module A. This argument should match the `output_directory` of Module A.
		- `output_directory`: the directory in which to store the output of Module P processing. This argument should match both the `source_directory` for Module P and the `output_directory` of Module A.
		- `method`: specification for users to define a pansharpening method. Currently, only `nn_diffuse` is supported.
		- `module_list`: specification for users to identify all modules used in the given configuration. The presence, or absence, of the string `MODA` in this argument determines how the input directories are processed. In this example `config.json`, users would specify Modules A, P, and O (`MODA, MODP, MODO`). Options for this argument are: `MODP`, `MODP, MODO`, `MODA, MODP`, and `MODA, MODP, MODO`.
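
The four accepted `module_list` values can be checked before a run. The sketch below is illustrative only: `parse_module_list` and `includes_atmospheric_correction` are hypothetical helpers, and Module P's own parsing of this argument is not shown in this README.

```python
# The four combinations listed above, as tuples of module tags.
ALLOWED_MODULE_LISTS = (
    ("MODP",),
    ("MODP", "MODO"),
    ("MODA", "MODP"),
    ("MODA", "MODP", "MODO"),
)


def parse_module_list(value):
    """Split a string such as "MODA, MODP, MODO" and reject unlisted combinations."""
    modules = tuple(part.strip() for part in value.split(","))
    if modules not in ALLOWED_MODULE_LISTS:
        raise ValueError(f"unsupported module_list: {value!r}")
    return modules


def includes_atmospheric_correction(value):
    """True when MODA is part of the list, i.e. Module A runs before Module P."""
    return "MODA" in parse_module_list(value)


print(parse_module_list("MODA, MODP, MODO"))
print(includes_atmospheric_correction("MODP, MODO"))
```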

		### Module O: *O*rthorectification
		- `source_directory`: the directory containing the output of Module P. This argument should match the `output_directory` of Module P.
		- `output_directory`: the directory in which to store the output of Module O processing. This argument should match both the `source_directory` for Module O and the `output_directory` of Module P.
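
Because the directory variables must line up across Modules A, P, and O, a small pre-flight check can catch mismatches. The sketch below is an assumption-laden illustration: `check_directory_chain` is a hypothetical helper that reads the `variables` of each entry in the `modules` array of `config.json`; it is not part of Xylem.

```python
def check_directory_chain(modules):
    """Return a list of human-readable problems; an empty list means the chain lines up."""
    problems = []
    for previous, current in zip(modules, modules[1:]):
        previous_output = previous["variables"].get("OUTPUT_DIRECTORY")
        source = current["variables"].get("SOURCE_DIRECTORY")
        output = current["variables"].get("OUTPUT_DIRECTORY")
        # A module reads from the directory the previous module wrote to.
        if source != previous_output:
            problems.append(
                f"{current['name']}: SOURCE_DIRECTORY {source!r} does not match "
                f"OUTPUT_DIRECTORY {previous_output!r} of {previous['name']}"
            )
        # Modules after the first use the same path for source and output.
        if output != source:
            problems.append(
                f"{current['name']}: OUTPUT_DIRECTORY {output!r} differs from "
                f"SOURCE_DIRECTORY {source!r}"
            )
    return problems


# Variables taken from the example config.json; other keys are omitted for brevity.
modules = [
    {"name": "Module A (local)", "variables": {"OUTPUT_DIRECTORY": "/data/output"}},
    {"name": "Module P (local)", "variables": {"SOURCE_DIRECTORY": "/data/output",
                                               "OUTPUT_DIRECTORY": "/data/output"}},
    {"name": "Module O (local)", "variables": {"SOURCE_DIRECTORY": "/data/output",
                                               "OUTPUT_DIRECTORY": "/data/output"}},
]

problems = check_directory_chain(modules)
print(problems or "directories are aligned")
```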

		## Running the Module
		- To run the standalone module independently of any workflow:
		```bash
		make run
		```
		- To run the module with Xylem, at maximum verbosity, using the `config.json`:
		```bash
		xylem run -vvvv
		```

		## General Notes for a Multi-Module Configuration
		- The general structure of the configuration remains the same:
		- Specify the input directory or series of directories:
		- There will still be the initial `input` value to include outside of the module definitions. This is generally going to be the directory (or specific subdirectories) of raw, level 1B imagery.
		- Define the modules:
		- Just as with a standalone configuration, users will need to identify all of the modules being used for processing, in the correct order and with the correct paths.
		- Define each module's variables:
		- All variables are defined as above. The one caveat in a multi-module configuration is aligning the input/source and output directories between modules.
		- The output of a given module will always be the same path as the input to the following module.
		- All non-primary modules will have the same path for their source and output directories.
		- Follow the template:
		- Just as with a standalone configuration, users can specify module arguments within each module's template.
		- The multi-module configuration will provide outputs at each stage of processing, which are used as input to subsequent stages. This design is meant to support easier troubleshooting between modules, if needed.
		- As imagery is processed, output files will be appended with the name of the processing module. Therefore, imagery processed with Modules A, P, and O will have filenames ending in `MODA_MODP_MODO`.
		- Module P also updates the middle of the filenames from `P1BS` (panchromatic imagery) and `M1BS` (multispectral imagery) to `S3XS`, which is a Maxar naming standard and a naming convention upheld throughout the use of Legacy PIPE.
		- For multi-module processing, combinations can only be made in these two orders:
		> Module A -> Module P -> Module O

		> Module P -> Module O
		- Atmospheric correction is always the first processing step if it is included, and imagery must be pansharpened before it is orthorectified.
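
The naming behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the filename `scene-M1BS-0001.tif` is hypothetical, the underscore separator and the placement of the tag before the file extension are guesses, and the sketch ignores that pansharpening combines the panchromatic and multispectral inputs into one product.

```python
from pathlib import PurePosixPath


def append_module_tag(filename, tag):
    """Append a module tag to the filename stem (separator is an assumption)."""
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}_{tag}{path.suffix}"))


def pansharpened_name(filename):
    """Replace the P1BS/M1BS product code with S3XS, then add the Module P tag."""
    renamed = filename.replace("P1BS", "S3XS").replace("M1BS", "S3XS")
    return append_module_tag(renamed, "MODP")


# Hypothetical input filename passed through Modules A, P, and O in order.
name = "scene-M1BS-0001.tif"
after_a = append_module_tag(name, "MODA")
after_p = pansharpened_name(after_a)
final_name = append_module_tag(after_p, "MODO")
print(final_name)
```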
		No newline at end of file

examples/atmospheric_correction-pansharpening-orthorectification/config.json

0 → 100644

+99 −0

		{
		"version": "0.0.9",
		"description": "Test workflow configuration",
		"requirements": [
		"python",
		"conda"
		],
		"keywords": [],
		"input":
		"/data/test-cases",
		"modules": [
		{
		"name": "Module A (local)",
		"type": "script",
		"programmingLanguage": "python",
		"uri": "file:///path/to/module-a",
		"variables": {
		"OUTPUT_DIRECTORY": "/data/output",
		"METHOD": "boa_reflectance",
		"PROFILE": "urban"
		},
		"template": {
		"command": "python",
		"environment": {
		"name": "module-a",
		"manager": "conda"
		},
		"arguments": [
		"-m",
		"lib",
		"--input_directory",
		"{{ INPUT }}",
		"--output_directory",
		"{{ OUTPUT_DIRECTORY }}",
		"--method",
		"{{ METHOD }}",
		"--profile",
		"{{ PROFILE }}"
		]
		}
		},
		{
		"name": "Module P (local)",
		"type": "script",
		"programmingLanguage": "python",
		"uri": "file:///path/to/module-p",
		"variables": {
		"SOURCE_DIRECTORY": "/data/output",
		"OUTPUT_DIRECTORY": "/data/output",
		"METHOD": "nn_diffuse",
		"MODULE_LIST": "MODA, MODP, MODO"
		},
		"template": {
		"command": "python",
		"environment": {
		"name": "module-p",
		"manager": "conda"
		},
		"arguments": [
		"-m",
		"lib",
		"--source_directory",
		"{{ SOURCE_DIRECTORY }}",
		"--output_directory",
		"{{ OUTPUT_DIRECTORY }}",
		"--method",
		"{{ METHOD }}",
		"--module_list",
		"{{ MODULE_LIST }}"
		]
		}
		},
		{
		"name": "Module O (local)",
		"type": "script",
		"programmingLanguage": "python",
		"uri": "file:///path/to/module-o",
		"variables": {
		"SOURCE_DIRECTORY": "/data/output",
		"OUTPUT_DIRECTORY": "/data/output"
		},
		"template": {
		"command": "python",
		"environment": {
		"name": "module-o",
		"manager": "conda"
		},
		"arguments": [
		"-m",
		"lib",
		"--source_directory",
		"{{ SOURCE_DIRECTORY }}",
		"--output_directory",
		"{{ OUTPUT_DIRECTORY }}"
		]
		}
		}
		]
		}
		No newline at end of file

examples/atmospheric_correction/README.md

0 → 100644

+70 −0

		# Configuration for: Atmospheric Correction

		## Breaking Down the Configuration
		- Specify the input directory or series of directories:
		- Use the `input` value to parse through imagery in parallel.
		- In the associated `config.json`, there is a single directory listed. Xylem will parse through this main directory and concurrently process all available subdirectories.
		```json
		"input":
		"/data/test-cases"
		```
		- Multiple directories may also be specified. When listing multiple directories, enclose them in `[ ]`, which signals to Xylem that these specific directories should be processed concurrently.
		```json
		"input": [
		"/data/test-cases/test-case-GE01-medium-low-mid",
		"/data/test-cases/test-case-WV02-medium-mid-high",
		"/data/test-cases/test-case-wv03-large-high-mid",
		"/data/test-cases/test-case-wv03-large-mid-low"
		]
		```
		- Define the module:
		- Within the module variable, specify the `name` and `uri` of the module. The `uri` tells Xylem where to install the module environment from. For example, if a user has Xylem installed and has cloned each of the modules into their `dev` directory within a Docker container, this path may look something like:
		```json
		"uri": "file:///root/dev/module-a"
		```
		- Define the module variables:
		- Variables are unique to each module (detailed below).
		- Simply follow the `template`:
		- This section is built from the module's `Makefile`. Here, users can specify arguments and associated variables. Below is an example template for Module A:
		```json
		"template": {
		"command": "python",
		"environment": {
		"name": "module-a",
		"manager": "conda"
		},
		"arguments": [
		"-m",
		"lib",
		"--input_directory",
		"{{ INPUT }}",
		"--output_directory",
		"{{ OUTPUT_DIRECTORY }}",
		"--method",
		"{{ METHOD }}",
		"--profile",
		"{{ PROFILE }}"
		]
		}
		```
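The two accepted shapes of `input` (a single directory, or a list of directories) can be pictured with the sketch below. It is a guess at the behaviour described above, not Xylem's code: `resolve_work_items` is a hypothetical helper, and it assumes that a single directory expands to its immediate subdirectories only.

```python
import tempfile
from pathlib import Path


def resolve_work_items(input_value):
    """Return the directories that would be processed concurrently."""
    if isinstance(input_value, str):
        # Single directory: every immediate subdirectory becomes a work item.
        root = Path(input_value)
        return sorted(str(child) for child in root.iterdir() if child.is_dir())
    # List of directories: process exactly the ones that were named.
    return [str(item) for item in input_value]


# Demonstrate the single-directory case with a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    for name in ("scene-a", "scene-b"):
        (Path(root) / name).mkdir()
    (Path(root) / "notes.txt").write_text("plain files are skipped")
    single_names = [Path(item).name for item in resolve_work_items(root)]

explicit = resolve_work_items([
    "/data/test-cases/test-case-GE01-medium-low-mid",
    "/data/test-cases/test-case-WV02-medium-mid-high",
])

print(single_names)
print(explicit)
```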

		## Module Variables
		For more thorough discussions on variables and overall structure, along with a link to technical documentation, please browse the README:
		- [Module A](https://code-int.ornl.gov/gshs/common/imagery-processing/module-a/-/blob/main/README.md)

		### Module A: *A*tmospheric Correction
		- `input_directory`: the directory containing the raw, Level 1B Maxar imagery. Module A currently expects a Maxar data directory structure. Because Module A is either run as a standalone module or always the first in a multi-module sequence, this argument will always be `INPUT` in the template.
		- `output_directory`: the directory in which to store the output of Module A processing.
		- `method`: specification for users to define if they want the output to contain top-of-atmosphere reflectance (`toa_reflectance`) or bottom-of-atmosphere reflectance (`boa_reflectance`). For the best representation of true surface reflectance (e.g., the removal of the blue effects of the atmosphere), users should select the latter.
		- `profile`: specification for users to define the aerosol profile selected in the Py6S model. Module A is currently optimized for `urban` applications, so consider using `maritime` sparingly.
		- Future work for Module A will incorporate the maritime-optimized atmospheric correction efforts led by Matt McCarthy.
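
A quick sanity check of the Module A `variables` block might look like the sketch below. It is an assumption-based illustration: `validate_module_a_variables` is a hypothetical helper, and the accepted values are limited to those named in this README (Module A itself may accept others).

```python
# Values named in this README; treat both sets as assumptions about Module A.
VALID_METHODS = {"toa_reflectance", "boa_reflectance"}
VALID_PROFILES = {"urban", "maritime"}
REQUIRED_KEYS = ("OUTPUT_DIRECTORY", "METHOD", "PROFILE")


def validate_module_a_variables(variables):
    """Return a list of problems found in a Module A `variables` mapping."""
    errors = [f"missing variable: {key}" for key in REQUIRED_KEYS if key not in variables]
    method = variables.get("METHOD")
    if method is not None and method not in VALID_METHODS:
        errors.append(f"unknown METHOD: {method!r}")
    profile = variables.get("PROFILE")
    if profile is not None and profile not in VALID_PROFILES:
        errors.append(f"unknown PROFILE: {profile!r}")
    return errors


# Variables from the example config.json.
variables = {
    "OUTPUT_DIRECTORY": "/data/output",
    "METHOD": "boa_reflectance",
    "PROFILE": "urban",
}

errors = validate_module_a_variables(variables)
print(errors or "variables look consistent")
```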

		## Running the Module
		- To run the standalone module independently of any workflow:
		```bash
		make run
		```
		- To run the module with Xylem, at maximum verbosity, using the `config.json`:
		```bash
		xylem run -vvvv
		```
		No newline at end of file

examples/atmospheric_correction/config.json

0 → 100644

+43 −0

		{
		"version": "0.0.9",
		"description": "Test workflow configuration",
		"requirements": [
		"python",
		"conda"
		],
		"keywords": [],
		"input":
		"/data/test-cases",
		"modules": [
		{
		"name": "Module A (local)",
		"type": "script",
		"programmingLanguage": "python",
		"uri": "file:///path/to/module-a",
		"variables": {
		"OUTPUT_DIRECTORY": "/data/output",
		"METHOD": "boa_reflectance",
		"PROFILE": "urban"
		},
		"template": {
		"command": "python",
		"environment": {
		"name": "module-a",
		"manager": "conda"
		},
		"arguments": [
		"-m",
		"lib",
		"--input_directory",
		"{{ INPUT }}",
		"--output_directory",
		"{{ OUTPUT_DIRECTORY }}",
		"--method",
		"{{ METHOD }}",
		"--profile",
		"{{ PROFILE }}"
		]
		}
		}
		]
		}
		No newline at end of file

examples/orthorectification/README.md

0 → 100644

+63 −0

		# Configuration for: Orthorectification

		## Breaking Down the Configuration
		- Specify the input directory or series of directories:
		- Use the `input` value to parse through imagery in parallel.
		- In the associated `config.json`, multiple directories are listed. When listing multiple directories, enclose them in `[ ]`, which signals to Xylem that these specific directories should be processed concurrently.
		```json
		"input": [
		"/data/test-cases/test-case-GE01-medium-low-mid",
		"/data/test-cases/test-case-WV02-medium-mid-high",
		"/data/test-cases/test-case-wv03-large-high-mid",
		"/data/test-cases/test-case-wv03-large-mid-low"
		]
		```
		- A single directory may also be specified. Xylem will parse through this main directory and concurrently process all available subdirectories.
		```json
		"input":
		"/data/test-cases"
		```
		- Define the module:
		- Within the module variable, specify the `name` and `uri` of the module. The `uri` tells Xylem where to install the module environment from. For example, if a user has Xylem installed and has cloned each of the modules into their `dev` directory within a Docker container, this path may look something like:
		```json
		"uri": "file:///root/dev/module-o"
		```
		- Define the module variables:
		- Variables are unique to each module (detailed below).
		- Simply follow the `template`:
		- This section is built from the module's `Makefile`. Here, users can specify arguments and associated variables. Below is an example template for Module O:
		```json
		"template": {
		"command": "python",
		"environment": {
		"name": "module-o",
		"manager": "conda"
		},
		"arguments": [
		"-m",
		"lib",
		"--source_directory",
		"{{ INPUT }}",
		"--output_directory",
		"{{ OUTPUT_DIRECTORY }}"
		]
		}
		```

		## Module Variables
		For more thorough discussions on variables and overall structure, along with a link to technical documentation, please browse the README:
		- [Module O](https://code-int.ornl.gov/gshs/common/imagery-processing/module-o/-/blob/main/README.md)

		### Module O: *O*rthorectification
		- `source_directory`: the directory containing the output of Module P. Module O can be run as a standalone module following the example `config.json`. However, the input must already have been processed by Module P.
		- `output_directory`: the directory in which to store the output of Module O processing. This argument should match the `source_directory` for Module O.
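
Since Module O requires input that has already been pansharpened, the input can be screened before a run. The sketch below relies on the naming convention described in the multi-module example, where processed files carry the tag of each module that handled them. `not_pansharpened` is a hypothetical helper, the filenames are invented for illustration, and checking for `MODP` anywhere in the filename stem is an assumption.

```python
from pathlib import PurePosixPath


def not_pansharpened(filenames):
    """Return the filenames whose stem carries no MODP tag."""
    return [name for name in filenames if "MODP" not in PurePosixPath(name).stem]


# Hypothetical filenames: the first two have been through Module P, the last has not.
candidates = [
    "/data/output/scene-S3XS-0001_MODA_MODP.tif",
    "/data/output/scene-S3XS-0002_MODP.tif",
    "/data/output/scene-M1BS-0003_MODA.tif",
]

skipped = not_pansharpened(candidates)
print(skipped)
```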

		## Running the Module
		- To run the standalone module independently of any workflow:
		```bash
		make run
		```
		- To run the module with Xylem, at maximum verbosity, using the `config.json`:
		```bash
		xylem run -vvvv
		```
		No newline at end of file