docs/deploy-ansible.md +118 −12

@@ -3,8 +3,8 @@
 This project provides an [Ansible playbook](https://docs.ansible.com/ansible/latest/index.html) that will
 place [Docker containers](https://www.docker.com/) onto specific hosts.

-The playbook and associated roles and variables are in [deploy/ansible/](../deploy/ansible/). Please look at
-these files to understand exactly what the playbook is doing.
+The playbook and associated roles and variables are in [deploy/ansible/](../deploy/ansible/). Please look at
+these files to understand exactly what the playbook is doing.

 For a complete list of all variables that are available, please look at the defaults for each of the
 [playbook roles](../deploy/ansible/roles/). The defaults have extensive comments that

@@ -16,10 +16,13 @@ configurations.
 - [Running the playbook](#running-the-playbook)
 - [Local test installation with Vagrant](#local-test-installation-with-vagrant)
 - [Configuring the installation](#configuring-the-installation)
-- [Setting the version](#setting-the-version)
+- [Setting the image version](#setting-the-image-version)
 - [Setting the web address](#setting-the-web-address)
+- [Enabling and disabling components](#enabling-and-disabling-components)
 - [Configuring the available datasets](#configuring-the-available-datasets)
 - [Using existing THREDDS catalogs](#using-existing-thredds-catalogs)
+- [Configuring Solr replicas](#configuring-solr-replicas)
+- [Using external Solr instances](#using-external-solr-instances)

 <!-- /TOC -->

@@ -31,7 +34,9 @@ Before attempting to run the playbook, make sure that you have
 Next, make a configuration directory - this can be anywhere on your machine that is **not**
 under `esgf-docker`. You can also place this directory under version control if you wish - this can be
 very useful for tracking changes to the configuration, or even triggering deployments automatically when
-configuration changes.
+configuration changes. If you do, make sure not to commit any plain-text secrets to your version
+control repository (e.g. by using an encryption tool such as
+[Ansible Vault](https://docs.ansible.com/ansible/latest/user_guide/vault.html)).

 In your configuration directory, make an
 [inventory file](https://docs.ansible.com/ansible/latest/user_guide/intro_inventory.html)

@@ -42,14 +47,19 @@ defining the hosts that you want to deploy to:
 [data]
 esgf.example.org
+
+[index]
+esgf.example.org
 ```

-Currently, ESGF deployments only respect the `data` group. Hosts in this group will be deployed as
-data nodes.
+Currently, ESGF deployments respect the `data` and `index` groups. Hosts in these groups will be
+deployed as data and/or index nodes respectively. A host can be in both groups, and will be deployed
+as a combined data and index node.

-Variables can be overridden on a per-group or per-host basis by placing YAML files at
-`/my/esgf/config/group_vars/[group name].yaml` or `/my/esgf/config/host_vars/[host name].yml`.
-See below for some common examples, and consult the [role defaults](../deploy/ansible/roles/)
-for a complete list of available variables.
+Variables can be overridden on a per-group or per-host basis by placing YAML files in your
+configuration directory at `/my/esgf/config/group_vars/[group name].yaml` or
+`/my/esgf/config/host_vars/[host name].yml`. See below for some common examples, and consult the
+[role defaults](../deploy/ansible/roles/) for a complete list of available variables.

 Once you have configured your inventory and host/group variables, you can run the playbook:

@@ -77,7 +87,7 @@ After waiting for the containers to start, the THREDDS interface will be availab
 This section describes the most commonly modified configuration options. For a full list of
 available variables, please consult the playbook [role defaults](../deploy/ansible/roles/).
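As a rough illustration of the per-group override mechanism described above, a variables file for the `data` group might look like the sketch below. The file name `data.yml` is an assumption that follows the `group_vars/[group name]` pattern; `image_tag` and `image_pull` are the variables described under "Setting the image version", and the values are placeholders rather than recommendations.

```yaml
# /my/esgf/config/group_vars/data.yml  (hypothetical file for the `data` group)
# Variables set here apply to every host in the [data] inventory group.
image_tag: a031a2ca   # example tag taken from this guide
image_pull: false
```

A file under `host_vars/` containing the same keys would apply to a single host only; Ansible gives host variables precedence over group variables.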
-### Setting the version
+### Setting the image version

 By default, the Ansible playbook will use the `latest` tag when specifying Docker images. For production
 installations, it is recommended to use an immutable tag (see [Image tags](../README.md#image-tags)).

@@ -93,6 +103,21 @@ image_tag: a031a2ca
 image_pull: false
 ```
+
+To use images from a custom registry, e.g. if you need to perform additional security checks:
+
+```yaml
+# Set the prefix for the images
+image_prefix: registry.example.com/esgf
+```
+
+Properties can also be overridden on a per-image basis, e.g.:
+
+```yaml
+# Use a different branch for the THREDDS image
+thredds_image_tag: my-branch
+thredds_image_pull: true
+```

 ### Setting the web address

 By default, the web address is the FQDN of the host (i.e. the output of `hostname --fqdn`). This can

@@ -121,8 +146,23 @@ esgf-data02.example.org hostname=esgf-data.example.org
 ```

 The Ansible playbook does **not** configure the DNS load-balancing automatically - you will need to
-separately configure [Round-robin DNS](https://en.wikipedia.org/wiki/Round-robin_DNS) or a more
-sophisticated service like [AWS Route 53](https://aws.amazon.com/route53/) to do this.
+separately configure [Round-robin DNS](https://en.wikipedia.org/wiki/Round-robin_DNS) or use a more
+sophisticated service like [AWS Route 53](https://aws.amazon.com/route53/) to do this.
+
+### Enabling and disabling components
+
+As well as defining each node as a data and/or index node using groups, the Ansible playbook allows
+individual components to be enabled or disabled using variables. By default, all components
+for the node type (as determined by the groups) will be deployed.
+
+The following variables control which components are deployed:
+
+```yaml
+thredds_enabled: true/false
+fileserver_enabled: true/false
+solr_enabled: true/false
+search_enabled: true/false
+```

 ### Configuring the available datasets

@@ -185,3 +225,69 @@ When the catalogs change, run the Ansible playbook in order to restart the conta
 load the new catalogs. THREDDS is configured to use a persistent volume for cache files, meaning
 that although the first start may be slow for large catalogs, subsequent restarts should be much
 faster (depending on how many files have changed).
+
+### Configuring Solr replicas
+
+By default, the Ansible playbook configures local master and slave Solr instances for locally
+published data and configures the `esg-search` application to talk to them. However, `esg-search`
+can also include results from indexes at other sites, which are replicated locally. Each replica
+gets its own Solr instance and the `esg-search` application is configured to use these replicas.
+
+To configure the available replicas, use the variable `solr_replicas`. The value should be a list
+in which the following keys are required for each item:
+
+* `name`: Used in the names of Docker resources for the replica
+* `master_url`: The URL to replicate, including scheme, port and path, e.g. `https://esgf-index1.ceda.ac.uk/solr`
+
+For example, the following configures two replicas, and will result in four Solr containers running:
+
+* `master`
+* `slave`
+* `ceda-index-3`
+* `llnl`
+
+```yaml
+solr_replicas:
+  - name: ceda-index-3
+    master_url: https://esgf-index3.ceda.ac.uk/solr
+  - name: llnl
+    master_url: https://esgf-node.llnl.gov/solr
+```
+
+Additional variables are available to customise behaviour, e.g. poll intervals - please see the
+[role defaults for the index role](../deploy/ansible/roles/index/defaults/main.yml).
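Following the same pattern, a single replica should give three Solr containers: `master`, `slave` and the replica itself. The sketch below assumes the variable is placed in a hypothetical `group_vars/index.yml` file for the `index` group, and the replica name `ceda` is a made-up label chosen for illustration.

```yaml
# /my/esgf/config/group_vars/index.yml  (hypothetical location for index-node variables)
solr_replicas:
  # `name` is an arbitrary label used to identify this replica
  - name: ceda
    # Example URL taken from this guide: scheme and path included, default HTTPS port implied
    master_url: https://esgf-index1.ceda.ac.uk/solr
```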
+
+### Using external Solr instances
+
+If you have existing Solr instances that you do not wish to migrate, or need to run Solr outside
+of Docker for persistence or performance reasons, the Ansible playbook can configure the
+`esg-search` application to use external Solr instances. To do this, just disable Solr and set
+the external URLs to use. For any replicas that are specified, `esg-search` will be configured
+to use the `master_url` directly.
+
+> **WARNING**
+>
+> If you want to use a Solr instance configured using `esgf-ansible` as an external Solr instance,
+> you will need to configure the firewall on that host to expose the port `8984` where the
+> master listens.
+
+Example configuration using external Solr instances:
+
+```yaml
+# Disable local Solr instances
+solr_enabled: false
+
+# Set the external URLs for Solr
+solr_master_external_url: http://external.solr:8984/solr
+solr_slave_external_url: http://external.solr:8983/solr
+
+# Configure the replicas
+# No local containers will be deployed - esg-search will use the master_url directly
+solr_replicas:
+  - name: ceda-index-3
+    master_url: https://esgf-index3.ceda.ac.uk/solr
+  - name: llnl
+    master_url: https://esgf-node.llnl.gov/solr
+```