Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New PR for GHA #289

Merged
merged 8 commits into from
Aug 24, 2023
Merged

New PR for GHA #289

merged 8 commits into from
Aug 24, 2023

Conversation

swelborn
Copy link
Collaborator

@swelborn swelborn commented Jun 2, 2023

After discussion with @cjh1, this is a cleaner PR for GHA.

I tried to strike a middle ground:

  1. Added scripts to build locally.
  2. "latest" and the latest commit tag are aliased.
  3. Remove conda in favor of python docker images and pip.

"py3x" is still in the tag name, but it is also "latest" without any python version name. Had to do this to ensure base build maps to stempy build. Let me know what you think about these changes.

- adds dockerfiles
- adds pip requirements, modifies existing requirements (add cmake)
- adds composite github action
- adds dockerignore file
build scripts for stempy, stempy-mpi.

can change variables inside to fit local building.
@swelborn swelborn requested review from cjh1 and psavery June 2, 2023 22:58
Copy link
Collaborator

@psavery psavery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments, but overall, I think it looks much simpler than before, and it looks great!

Comment on lines +58 to +103
run: |
echo "BASE_TARGET=$(if [[ "${{ matrix.mpi }}" == 'ON' ]]; then
echo 'mpi'
else
echo 'base'
fi)" >> $GITHUB_ENV

MPI_TAG=$(if [[ "${{ matrix.mpi }}" == 'ON' ]]; then
echo '-mpi'
else
echo ''
fi)
echo "MPI_TAG=${MPI_TAG}" >> $GITHUB_ENV

DEV_TAG=$(if [[ "${{ matrix.dev }}" == 'dev' ]]; then
echo '-dev'
else
echo ''
fi)
echo "DEV_TAG=${DEV_TAG}" >> $GITHUB_ENV

echo "RELEASE_OR_DEBUG=$(if [[ "${{ matrix.dev }}" == 'dev' ]]; then
echo 'Debug'
else
echo 'Release'
fi)" >> $GITHUB_ENV

COMMIT=$(git rev-parse --short HEAD)
echo "COMMIT=${COMMIT}" >> $GITHUB_ENV

IPYKERNEL_TAG=$(if [[ "${{ matrix.ipykernel }}" == 'ipykernel' ]]; then
echo '-ipykernel'
else
echo ''
fi)

echo "IPYKERNEL_TAG=${IPYKERNEL_TAG}" >> $GITHUB_ENV

BASE_TAG=${{ env.DOCKERHUB_ORG }}/stempy${MPI_TAG}:py${{ env.PYTHON_VERSION_NODOT }}-base${DEV_TAG}
echo "BASE_TAG=${BASE_TAG}" >> $GITHUB_ENV

TAG=${{ env.DOCKERHUB_ORG }}/stempy${MPI_TAG}${IPYKERNEL_TAG}:py${{ env.PYTHON_VERSION_NODOT }}-${COMMIT}${DEV_TAG}
echo "TAG=${TAG}" >> $GITHUB_ENV

LATEST_TAG=${{ env.DOCKERHUB_ORG }}/stempy${MPI_TAG}${IPYKERNEL_TAG}:latest${DEV_TAG}
echo "LATEST_TAG=${LATEST_TAG}" >> $GITHUB_ENV
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have more than a few lines in a run command, you may want to consider placing the code in a script instead, and calling it. But you do not have to.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it is mostly just two actions combined into one, right? What are the benefits for creating a new action for this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In previous iterations I was using this composite action several times throughout the workflow. I can remove it. I do think it makes it a little easier to read, and you could reuse it later. Let me know your opinion!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to leave it for now, unless @cjh1 suggests otherwise.

Comment on lines +41 to +48
- name: Determine if Dockerfile.base changed
id: changed-dockerfile-base
uses: tj-actions/changed-files@v35
with:
files: |
./docker/Dockerfile.base
./docker/apt-packages-common.txt
./docker/apt-packages-dev.txt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the Dockerfile.base take a long time to build? If not, it could have been simpler to just build it every time, rather than caching it. We may actually want to rebuild it occasionally just because upstream dependencies change over time which include things like security updates.

But we can leave this!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dockerfile.base does take a long time to build (~40 minutes). I understand that it is CI, but say you are developing on a fork and want to make a quick change and test at NERSC.

All you have to do is edit this:

on:
  push:
    branches:
      - 'master'
  pull_request:
    branches:
      - 'master'

And commit your change, and CI will take care of building/pushing to your own Dockerhub repo (so long as you have set up these vars and secrets):

DOCKERHUB_USERNAME: ${{ secrets.DOCKERHUB_USERNAME }}
DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
DOCKERHUB_ORG: ${{ vars.DOCKERHUB_ORG }}

I can set up a cron job to build/push base, I don't think that would be too much work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the info! It's probably okay for now unless @cjh1 thinks we should add a cron job.

- forgot mpi4py
- tested and works now:
```
****Statistics for calculating electron thresholds****
number of samples:1200
min sample:0
max sample:1588
mean:0.106727
variance:4.7572
std dev:2.1811
number of bins:832
x-ray threshold n sigma:175
background threshold n sigma:4
optimized mean:-0.302141
optimized std dev:0.223178
background threshold:0.590571
xray threshold:381.8
electron_count_cori.py
Using files in /pscratch/sd/s/swelborn/ncem/138
scan name = data_scan0000000441_*.data
start counting #441
electron_count_cori.py
Using files in /pscratch/sd/s/swelborn/ncem/138
scan name = data_scan0000000441_*.data
start counting #441
electron_count_cori.py
Using files in /pscratch/sd/s/swelborn/ncem/138
scan name = data_scan0000000441_*.data
start counting #441
electron_count_cori.py
Using files in /pscratch/sd/s/swelborn/ncem/138
scan name = data_scan0000000441_*.data
start counting #441
total time = 4.425157070159912
/pscratch/sd/s/swelborn/ncem/138/FOURD_230601_1658_00907_00441.h5
```
@swelborn
Copy link
Collaborator Author

@cjh1 I tested samwelborn:stempy-mpi:latest and it works. All of the distiller notebooks also work with samwelborn:stempy-ipykernel:latest Should be good to go, if you are ok with it in its current state.

@swelborn swelborn merged commit d6b5e1d into OpenChemistry:master Aug 24, 2023
5 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants