container-builder issues
https://code.ornl.gov/olcf/container-builder/-/issues
2017-12-29T13:01:39Z
https://code.ornl.gov/olcf/container-builder/-/issues/25
Use client to test if we're in a tty and inform the builder
2017-12-29T13:01:39Z
Simpson, Adam B
*Created by: AdamSimpson*
Don't run through `unbuffer` if the client isn't running in a tty
https://code.ornl.gov/olcf/container-builder/-/issues/8
Typo
2017-12-20T17:25:18Z
French, Robert
https://github.com/AdamSimpson/ContainerBuilder/blob/1aab208710b5d5b095dfb8429ca2c3f79907edcb/Common/src/Messenger.cpp#L84
You have "failbit" and "badbit" but it's actually "Fitbit (TM)"
https://code.ornl.gov/olcf/container-builder/-/issues/9
Trick wget into printing nice progress bars without tty
2017-12-20T17:25:18Z
Simpson, Adam B
*Created by: AdamSimpson*
wget uses ioctl to check the screen size:
`ioctl(0, TIOCGWINSZ, &screen_size);`
When it returns 0, as is the case for our subprocess, a newline is printed after each progress bar image is printed to the screen, which isn't desirable. Try an LD_PRELOAD ioctl that provides a fake screen to see if it can trick wget.
https://code.ornl.gov/olcf/container-builder/-/issues/5
Shutdown builder service gracefully before stopping VM and creating image
2017-12-20T17:25:18Z
Simpson, Adam B
*Created by: AdamSimpson*
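The two tty-related issues above (#25, testing for a tty on the client, and #9, wget's `TIOCGWINSZ` query) both come down to asking the kernel about a file descriptor. A minimal sketch of those two checks follows. The helper names `is_tty` and `terminal_width` are hypothetical and are not taken from the ContainerBuilder source. The fallback width is an assumption for illustration only.

```cpp
#include <sys/ioctl.h>
#include <unistd.h>

// Hypothetical helper: true when fd refers to a terminal.
bool is_tty(int fd) {
    return isatty(fd) == 1;
}

// Hypothetical helper: width of the terminal behind fd, or `fallback` when
// fd is not a terminal or the reported width is zero.
int terminal_width(int fd, int fallback) {
    winsize size{};
    if (ioctl(fd, TIOCGWINSZ, &size) != 0 || size.ws_col == 0) {
        return fallback;
    }
    return size.ws_col;
}
```

A client could call `is_tty(STDOUT_FILENO)` once at startup and pass the result to the builder, which could then decide whether wrapping the build in `unbuffer` makes sense.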
https://code.ornl.gov/olcf/container-builder/-/issues/32
Severity level doesn't show up in logs
2017-12-29T13:01:18Z
Simpson, Adam B
2017-12-29 05:11:32.679480 []: Failed to fetch server list
https://code.ornl.gov/olcf/container-builder/-/issues/19
QUEUE_HOST / QUEUE_PORT give cryptic errors when undefined
2017-12-20T17:25:18Z
French, Robert
Trying to connect be like
```
~/P/C/build ❯❯❯ ./ContainerBuilderClient ../ContainerBuilderTitan.def poop.img
Attempting to connect to BuilderQueue: Failed to build container: QUEUE_HOST: Operation not supported
```
same for QUEUE_PORT. The client should complain in a more straightforward way if they are missing (though I reckon this will be deployed via environment modules that would set these variables anyhow?)
https://code.ornl.gov/olcf/container-builder/-/issues/20
Make file chunk size default to the asio buffer sizes
2017-12-20T17:25:18Z
Simpson, Adam B
*Created by: AdamSimpson*
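For the QUEUE_HOST / QUEUE_PORT report above (#19), one way to get a more straightforward complaint is to validate the variables before any connection attempt. This is only a sketch. `require_env` is a hypothetical helper, not a function from the ContainerBuilder code base, and the wording of the message is an assumption.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: return the value of an environment variable, or throw
// an error that names the missing variable.
std::string require_env(const std::string& name) {
    const char* value = std::getenv(name.c_str());
    if (value == nullptr || *value == '\0') {
        throw std::runtime_error("Environment variable " + name +
                                 " is not set; it is needed to reach the BuilderQueue");
    }
    return value;
}
```

Calling `require_env("QUEUE_HOST")` and `require_env("QUEUE_PORT")` at client startup would surface a missing variable by name, before any socket error code gets attached to it.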
https://code.ornl.gov/olcf/container-builder/-/issues/33
Make better use of exceptions
2018-01-06T21:19:42Z
Simpson, Adam B
https://code.ornl.gov/olcf/container-builder/-/issues/22
install yum on builder
2017-12-20T17:25:18Z
Simpson, Adam B
*Created by: AdamSimpson*
https://code.ornl.gov/olcf/container-builder/-/issues/10
INSTALL_PREFIX?
2017-12-20T17:25:18Z
French, Robert
https://github.com/AdamSimpson/ContainerBuilder/blob/488ced8aa87844868812e285033e7098f80ddea9/Scripts/ProvisionBuilder#L49
https://code.ornl.gov/olcf/container-builder/-/issues/2
Handle a builder going down
2017-12-20T17:25:18Z
Simpson, Adam B
*Created by: AdamSimpson*
If a builder goes down, attempt to notify the queue
https://code.ornl.gov/olcf/container-builder/-/issues/4
Get a grip on connection exception handling
2017-12-20T17:25:18Z
Simpson, Adam B
*Created by: AdamSimpson*
A failure in OpenStack create is uncaught by the connection and will bring down the server... this is not desirable.
https://code.ornl.gov/olcf/container-builder/-/issues/30
Fix whatever caused the queue to go down...
2017-12-13T00:44:09Z
Simpson, Adam B
```
cades@builderqueue:~$ tail -f /home/queue/ContainerBuilder.log
2017-12-11 17:22:31.440361 (656) [128.219.164.233:43774] : Established connection
2017-12-11 17:22:31.542644 (657) [128.219.164.233:43774] : Connection initial request error received bad message type: Bad message
2017-12-11 17:22:31.542674 (658) [128.219.164.233:43774] : Ending connection
2017-12-11 17:22:31.644552 (659) [128.219.164.233:43850] : Established connection
2017-12-11 17:22:31.745598 (660) [128.219.164.233:43850] : Connection initial request error received bad message type: Bad message
2017-12-11 17:22:31.746069 (661) [128.219.164.233:43850] : Ending connection
2017-12-11 17:22:31.847465 (662) [128.219.164.233:43934] : Established connection
2017-12-11 17:22:32.954434 (663) Running command: /home/queue/GetBuilders
2017-12-11 17:22:35.700940 (664) Running command: /home/queue/GetBuilders
2017-12-11 17:22:38.649908 (665) Running command: /home/queue/GetBuilders
```
```
cades@builderqueue:~$ sudo systemctl status BuilderQueue
● BuilderQueue.service - BuilderQueue daemon
Loaded: loaded (/etc/systemd/system/BuilderQueue.service; enabled; vendor preset: enabled)
Active: failed (Result: core-dump) since Mon 2017-12-11 17:22:39 UTC; 20h ago
Main PID: 964 (code=dumped, signal=ABRT)
Dec 11 16:53:51 builderqueue systemd[1]: Started BuilderQueue daemon.
Dec 11 17:22:39 builderqueue BuilderQueue[964]: terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >'
Dec 11 17:22:39 builderqueue BuilderQueue[964]: what(): remote_endpoint: Transport endpoint is not connected
Dec 11 17:22:39 builderqueue systemd[1]: BuilderQueue.service: Main process exited, code=dumped, status=6/ABRT
Dec 11 17:22:39 builderqueue systemd[1]: BuilderQueue.service: Unit entered failed state.
Dec 11 17:22:39 builderqueue systemd[1]: BuilderQueue.service: Failed with result 'core-dump'.
```
https://code.ornl.gov/olcf/container-builder/-/issues/13
Fix verification from builder bringup script
2018-01-06T21:20:24Z
Simpson, Adam B
*Created by: AdamSimpson*
This should not happen:
```Received build host: Could not find a suitable TLS CA certificate bundle, invalid path: /home/queue/OpenStack.cer:Could not find a suitable TLS CA certificate bundle, invalid path: /home/queue/OpenStack.cer```
https://code.ornl.gov/olcf/container-builder/-/issues/34
Don't use waiting animation when debug output enabled
2018-01-29T22:21:05Z
Simpson, Adam B
Debug output may print during the animation, making a mess of stderr/stdout
https://code.ornl.gov/olcf/container-builder/-/issues/6
Don't allow image output and definition input to be the same name
2018-03-01T12:48:21Z
Simpson, Adam B
*Created by: AdamSimpson*
https://code.ornl.gov/olcf/container-builder/-/issues/31
Client should exit immediately on failure
2017-12-29T13:34:09Z
Simpson, Adam B
```
[atj@titan-ext7]$ container_builder --arch=ppc64le test.img test.def
2017-Dec-17 22:42:39 [SUCCESS] Connecting to BuilderQueue:
2017-Dec-17 22:42:40 [SUCCESS] Requesting Builder:
2017-Dec-17 22:42:40 [SUCCESS] Connecting to Builder:
2017-Dec-17 22:42:40 [INFO] Sending definition: test.def
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
2017-Dec-17 22:42:40 [INFO] Start of Singularity builder output:
ERROR: Unknown container build Singularity recipe format: ./container.def
ABORT: Aborting with RETVAL=255
Cleaning up...
2017-Dec-17 22:42:41 [INFO] Sending finished container: test.img
2017-Dec-17 22:42:41 [INFO] Error receiving headerEnd of file
2017-Dec-17 22:42:41 [INFO] Recieved message hader not of file type
2017-Dec-17 22:42:41 [SUCCESS] Container received: test.img
```
https://code.ornl.gov/olcf/container-builder/-/issues/1
Cleanup zombie build processes
2017-12-20T17:25:18Z
Simpson, Adam B
*Created by: AdamSimpson*
Detaching a boost process leaves behind zombie processes
Singularity builder.img:/root> ps auxf
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 34 0.0 0.0 18480 3480 pts/1 S 02:33 0:00 /bin/bash --norc
root 36 0.0 0.0 38976 3080 pts/1 R+ 02:33 0:00 \_ ps auxf
root 1 0.0 0.0 13968 1800 ? Ss 02:24 0:00 singularity-instance: root [builder]
root 3 0.0 0.1 40192 6748 ? S 02:24 0:00 /usr/local/bin/ContainerBuilderServer
root 4 0.0 0.0 0 0 ? Z 02:26 0:00 \_ [action-suid] <defunct>
root 9 0.0 0.0 0 0 ? Z 02:26 0:00 \_ [action-suid] <defunct>
root 14 0.0 0.0 0 0 ? Z 02:26 0:00 \_ [action-suid] <defunct>
root 19 0.0 0.0 0 0 ? Z 02:26 0:00 \_ [action-suid] <defunct>
root 24 0.0 0.0 0 0 ? Z 02:29 0:00 \_ [action-suid] <defunct>
root 29 0.0 0.0 0 0 ? Z 02:31 0:00 \_ [action-suid] <defunct>
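The `<defunct>` entries above are children that have exited but whose exit status the parent has not yet collected. One common POSIX remedy is to reap finished children with a non-blocking `waitpid`. The sketch below shows that technique in isolation. `reap_finished_children` is a hypothetical helper, and whether it fits the way the server launches builds through Boost.Process is an assumption that would need checking against the actual code.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: collect every child that has already exited, without
// blocking. Returns the number of children reaped.
int reap_finished_children() {
    int reaped = 0;
    int status = 0;
    // With WNOHANG, waitpid returns a pid for each exited child, 0 when
    // children exist but none have exited yet, and -1 when there are none.
    while (waitpid(-1, &status, WNOHANG) > 0) {
        ++reaped;
    }
    return reaped;
}
```

Calling it from a periodic timer or in response to `SIGCHLD` are both plausible places to hook it in.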
https://code.ornl.gov/olcf/container-builder/-/issues/21
checksum files on transfer
2017-12-20T17:25:18Z
Simpson, Adam B
*Created by: AdamSimpson*
Add checksum to header
https://code.ornl.gov/olcf/container-builder/-/issues/26
Build ppc64le containers with qemu
2018-01-06T21:19:53Z
Simpson, Adam B
*Created by: AdamSimpson*