Commit 7d4a6f8b authored by Hines, Jesse

Merge branch 'develop' into 'main'

Update raps in simulation server

See merge request !1
parents 1aefa218 44a83520
+5 −2
[submodule "simulation_server/simulation/raps"]
	path = simulation_server/simulation/raps
[submodule "raps"]
	path = raps
	url = https://github.com/ExaDigiT/RAPS.git
	branch = main
[submodule "simulation_dashboard"]
	path = simulation_dashboard
	url = https://github.com/ExaDigiT/SimulationDashboard.git

Dockerfile

0 → 100644
+31 −0
FROM python:3.12.11

RUN apt-get update \
  && apt-get install -y git libsnappy-dev \
  && rm -rf /var/lib/apt/lists/*

RUN pip install --no-cache-dir --upgrade pip
RUN pip install --no-cache-dir uv
ENV UV_NO_CACHE=true

WORKDIR /app

# Install RAPS dependencies as first layer for caching
COPY raps/pyproject.toml /app/raps/
RUN uv pip install --system -r /app/raps/pyproject.toml

# Install server dependencies (including raps) for caching
COPY raps/ /app/raps/
COPY pyproject.toml /app/
# pip install expects README to exist
RUN touch /app/README.md
RUN uv pip install --system -r /app/pyproject.toml

# Install simulation server
COPY druid_ingests/ /app/druid_ingests/
COPY simulation_server/ /app/simulation_server/
RUN uv pip install --system -e .
# Re-install RAPS as editable (TODO: RAPS currently doesn't work in non-editable mode)
RUN uv pip install --system -e ./raps

CMD ["python", "-m", "simulation_server.server.main"]

Dockerfile.server

deleted 100644 → 0
+0 −25
FROM python:3.9

RUN apt-get update \
  && apt-get install -y libsnappy-dev \
  && rm -rf /var/lib/apt/lists/*

RUN pip install --upgrade pip
RUN pip install hatch

WORKDIR /app

COPY pyproject.toml /app

RUN hatch dep show requirements > /app/requirements.txt
# RUN hatch dep show requirements --feature=server >> /app/requirements.txt
RUN python3 -m pip install -r /app/requirements.txt
ENV RAPS_CONFIG=/app/simulation_server/simulation/raps/config

COPY ["druid_ingests", "/app/druid_ingests/"]
COPY ["models", "/app/models"]
COPY ["simulation_server", "/app/simulation_server/"]
COPY ["README.md", "/app"]
RUN python3 -m pip install -e .

CMD ["python3", "-m", "simulation_server.server.main"]

Dockerfile.simulation

deleted 100644 → 0
+0 −24
FROM ubuntu:22.04

RUN apt-get update \
  && apt-get install -y python3 python3-pip git libsnappy-dev \
  && rm -rf /var/lib/apt/lists/*

RUN pip install --upgrade pip
RUN pip install hatch

WORKDIR /app

COPY pyproject.toml /app

RUN hatch dep show requirements > /app/requirements.txt
# RUN hatch dep show requirements --feature=simulation >> /app/requirements.txt
RUN python3 -m pip install -r /app/requirements.txt
ENV RAPS_CONFIG=/app/simulation_server/simulation/raps/config

COPY ["simulation_server", "/app/simulation_server/"]
COPY ["models", "/app/models"]
COPY ["README.md", "/app"]
RUN python3 -m pip install -e .

CMD ["python3", "-m", "simulation_server.simulation.main"]
+36 −15
@@ -2,41 +2,62 @@

REST API that allows running and querying the results from the ExaDigiT simulation and RAPS.

## Loading RAPS submodule
## Loading RAPS and Dashboard submodules
This uses [RAPS](https://github.com/ExaDigiT/RAPS) to run the simulation, which is loaded as a 
submodule. Make sure to run
submodule. The [Simulation Dashboard](https://github.com/ExaDigiT/SimulationDashboard) is also in a
separate repo and loaded as a submodule. Make sure to load the submodules by running:
```
git submodule update --init --recursive
```
to load the submodule.

## Downloading FMU models
The Frontier FMU models aren't currently publicly available. To run Frontier simulations with cooling enabled, use this
command to download them (if you have access to the fmu-models repo).
```
cd ./raps
make fetch-fmu-models
```

You can run the job and power simulation without downloading any FMU models. But to use the cooling
simulation you'll need to download FMU models into the `models` directory. You can download
`Simulator_olcf5_base.fmu` from https://code.ornl.gov/exadigit/fmu-models if you have access. (The
FMU models aren't currently publicly available.)

## Running locally
To run a local version of the server run
```bash
docker compose up --wait
```
The API server will be hosted on http://localhost:8081. The dashboard will be hosted on http://localhost:8080.
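
As a rough way to confirm that both services are reachable after `docker compose up --wait`, the sketch below polls the two ports named above until they accept TCP connections. The helper names `port_open` and `wait_for_ports` are hypothetical and not part of this repository, and a successful TCP connection only shows that something is listening on the port, not that the API itself is healthy.

```python
import socket
import time


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors.
        return False


def wait_for_ports(ports, host="localhost", attempts=30, delay=2.0):
    """Poll until every port accepts connections, or the attempts run out."""
    pending = set(ports)
    for _ in range(attempts):
        # Keep only the ports that are still not accepting connections.
        pending = {p for p in pending if not port_open(host, p)}
        if not pending:
            return True
        time.sleep(delay)
    return False
```

Calling `wait_for_ports([8081, 8080])` returns `True` once both the API server and the dashboard ports are listening, and `False` if they are still unreachable after the configured number of attempts.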

You'll need at least 32 GiB of RAM for Druid and RAPS to run smoothly.
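
A minimal sketch for checking the host against that figure before launching, assuming a Linux host where `os.sysconf` exposes `SC_PAGE_SIZE` and `SC_PHYS_PAGES`. The function names are hypothetical. Note that a machine sold as "32 GiB" typically reports slightly less than that to the OS, so treat the result as a hint rather than a hard gate.

```python
import os

GIB = 1024 ** 3
REQUIRED_GIB = 32  # figure taken from the README text above


def total_ram_gib():
    """Return total physical memory in GiB (Linux; relies on os.sysconf)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    pages = os.sysconf("SC_PHYS_PAGES")
    return page_size * pages / GIB


def has_enough_ram(total_gib, required_gib=REQUIRED_GIB):
    """Return True if the given total meets the required amount."""
    return total_gib >= required_gib
```

For example, `has_enough_ram(total_ram_gib())` reports whether the current host meets the stated requirement.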

## Deploying
To deploy the server, run
If you want to run replay data locally, you'll need to download the datasets and then ingest them into
Druid. You can fetch the datasets with `./scripts/fetch_data.sh`, and use the `./scripts/submit_data_ingests.py`
script to ingest them into Druid.

View the server logs with:
```bash
./scripts/deploy.sh prod
docker compose logs -f --no-log-prefix simulation-server
```

This will build both the server and simulation docker images, and push them to Slate.
To shut down the server run:
```bash
docker compose down
```

## Running locally
To run a local version of the server run
Use this if you want to wipe all the database data as well:
```bash
./scripts/launch_local.sh
docker compose down --volumes
```
The server will be hosted on http://localhost:8080

You'll need at least 16 GiB of RAM, preferably 32 GiB for druid to run smoothly.
## Deploying
To deploy the server, run
```bash
./scripts/deploy.sh prod
```

If you want to run replay data locally, you'll need to download the datasets (see ./scripts/fetch.sh)
and then ingest them in Druid. After launching, you can access the Druid UI at http://localhost:8888
and submit druid ingests for the system you want.
This will build both the server and simulation docker images, and push them to Slate.

## API Docs
You can view the API docs and the `openapi.json` with the API specification at