This repository has been archived by the owner on Aug 13, 2021. It is now read-only.
Initial containerisation of Luigi (#106)
* Luigi running in a container with local files copied in.
* Create docker-compose for arxlive.
* Add basic .dockerignore file.
* Add an extra level of iteration so find_filepath_from_pathstub doesn't exit early when 1 level from HOME.
* Change relative paths to absolute for arxiv pipeline so calling from outside folder works.
* Add documentation for containerised luigi.
* Tidy up Dockerfile and implement arg for python version.
russwinch committed Jun 18, 2019
1 parent cb9f6ed, commit 3e49fa6
Showing 14 changed files with 162 additions and 15 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
docker-compose* | ||
Dockerfile | ||
.git | ||
docs/* |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
######## | ||
# Python dependencies builder | ||
# | ||
ARG PYTHON_VERSION=3.6 | ||
FROM python:$PYTHON_VERSION-slim AS builder | ||
|
||
WORKDIR /app | ||
# Sets utf-8 encoding for Python et al | ||
ENV LANG=C.UTF-8 | ||
# Turns off writing .pyc files; superfluous on an ephemeral container. | ||
ENV PYTHONDONTWRITEBYTECODE=1 | ||
# Disables stdout/stderr buffering so log output appears immediately | ||
ENV PYTHONUNBUFFERED=1 | ||
|
||
# Ensures that the python and pip executables used | ||
# in the image will be those from our virtualenv. | ||
ENV PATH="/venv/bin:$PATH" | ||
|
||
# Install OS package dependencies. | ||
RUN apt-get update && apt-get install -y git | ||
|
||
# Setup the virtualenv | ||
RUN python -m venv /venv | ||
|
||
# Install Python dependencies | ||
# TODO: Replace with clone from git as below | ||
COPY requirements.txt ./ | ||
RUN pip install --no-cache-dir -r requirements.txt | ||
|
||
# TODO: implement clone from github with arg for branch or tag | ||
# RUN git clone https://github.com/nestauk/nesta.git --branch master --depth 1 --single-branch | ||
# RUN pip install --no-cache-dir -r nesta/requirements.txt | ||
|
||
# Install packages not in requirements. | ||
# TODO: mysql-connector-repackaged doesn't work, investigate switching over. | ||
RUN pip install awscli mysql-connector-python | ||
|
||
|
||
######## | ||
# app container | ||
# | ||
FROM python:$PYTHON_VERSION-slim AS app | ||
|
||
ENV PYTHONDONTWRITEBYTECODE=1 | ||
ENV PYTHONUNBUFFERED=1 | ||
ENV LANG=C.UTF-8 | ||
ENV PIP_DISABLE_PIP_VERSION_CHECK=1 | ||
ENV PATH="/venv/bin:$PATH" | ||
ENV PYTHONPATH=/app | ||
ENV MYSQLDB=/app/nesta/production/config/mysqldb.config | ||
ENV LUIGI_CONFIG_DIR=/app/nesta/production/config | ||
ENV LUIGI_CONFIG_PATH=/app/nesta/production/config/luigi.cfg | ||
|
||
WORKDIR /app | ||
|
||
# Copy in Python environment | ||
COPY --from=builder /venv /venv | ||
|
||
# Copy in the rest of the app from local to pick up configs | ||
# TODO: replace when secrets are implemented | ||
COPY ./ ./ | ||
|
||
RUN mkdir -p /var/log/luigi && \ | ||
    mv docker/run.sh /usr/bin/run.sh && \ | ||
    chmod +x /usr/bin/run.sh | ||
|
||
ENTRYPOINT ["run.sh"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
Containerised Luigi | ||
=================== | ||
|
||
Build | ||
----- | ||
|
||
The build uses a multi-stage Dockerfile to speed up rebuilds after code changes: | ||
1. requirements are pip installed into a virtual environment | ||
2. the environment is copied into the second image along with the codebase | ||
|
||
From the root of the repository: | ||
:code:`docker build -f docker/Dockerfile -t name:tag .` | ||
|
||
where :code:`name` is the name of the created image and :code:`tag` is the chosen tag, | ||
e.g. :code:`arxlive:dev`. Naming the image makes the run step easier than using the generated image ID. | ||
|
||
Rebuilds after code changes should reuse the cached first stage and rebuild only the second stage. If a full rebuild is required, include: | ||
:code:`--no-cache` | ||
|
||
The Python version defaults to 3.6 but can be set during the build by including the flag: | ||
:code:`--build-arg PYTHON_VERSION=3.7` | ||
|
||
Run | ||
--- | ||
|
||
As only one pipeline runs in the container, the :code:`luigid` scheduler is not used. | ||
|
||
There is a :code:`docker-compose` file which mounts your local :code:`~/.aws` folder for AWS credentials, as this folder is outside Docker's build context. | ||
This could be adapted for each pipeline. | ||
|
||
:code:`docker-compose -f docker/docker-compose.yml run luigi --module module_path params` | ||
|
||
where: | ||
|
||
- :code:`docker-compose.yml` is the docker-compose file containing the image: :code:`image_name:tag` from the build | ||
- :code:`module_path` is the full python path to the module | ||
- :code:`params` are any other params to supply as normal, e.g. :code:`--date`, :code:`--production` etc. | ||
|
||
eg :code:`docker-compose -f docker/docker-compose-arxlive-dev.yml run luigi --module nesta.production.routines.arxiv.arxiv_iterative_root_task RootTask --date 2019-04-16` | ||
|
||
Important points | ||
---------------- | ||
|
||
- keep any built images secure, they contain credentials | ||
- you only need to rebuild if code has changed | ||
- as there is no central scheduler there is nothing stopping you from running the task more than once at the same time | ||
- the graphical interface is not enabled without the scheduler | ||
|
||
Debugging | ||
--------- | ||
|
||
If necessary, it's possible to debug inside the container, but the :code:`entrypoint` needs to be overridden with :code:`bash`: | ||
|
||
:code:`docker run --entrypoint /bin/bash -itv ~/.aws:/root/.aws:ro image_name:tag` | ||
|
||
where :code:`image_name:tag` is the image from the build step. | ||
The command also mounts the :code:`.aws` folder. | ||
|
||
Almost nothing is installed other than Python (not even vi!!), so run | ||
|
||
:code:`apt-get update` and then :code:`apt-get install` whatever you need. |
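The build options described in the README above can be combined into a single command. The following is a minimal sketch and not part of the commit: `build_cmd` is a hypothetical helper that only prints the `docker build` invocation (a dry run). It assumes the `PYTHON_VERSION` build argument declared in the Dockerfile and uses the README's `arxlive:dev` example as the default image name.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper (not part of the repository): prints the
# docker build command instead of running it.
build_cmd() {
  local image="${1:-arxlive:dev}"    # name:tag for the built image
  local python_version="${2:-3.6}"   # same default as the Dockerfile ARG
  local no_cache="${3:-}"            # pass "--no-cache" to force a full rebuild

  # PYTHON_VERSION is the ARG name declared at the top of the Dockerfile.
  printf 'docker build -f docker/Dockerfile %s--build-arg PYTHON_VERSION=%s -t %s .\n' \
    "${no_cache:+$no_cache }" "$python_version" "$image"
}

build_cmd arxlive:dev 3.7
```

To run the build rather than print it, the output could be piped to `bash` from the root of the repository.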
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
version: '3' | ||
services: | ||
  luigi: | ||
    image: arxlive:dev | ||
    volumes: | ||
      - ~/.aws:/root/.aws:ro |
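The compose file above bind-mounts `~/.aws` read-only into the container. If that folder is missing on the host, Docker may create an empty directory in its place, and the pipeline would then fail later with a credentials error. A minimal pre-flight sketch follows. `check_aws_dir` is a hypothetical helper, not part of the repository, and it assumes the credentials live in the standard `~/.aws/credentials` file.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the repository).
# Returns 1 if the directory is missing, 2 if it has no credentials file.
check_aws_dir() {
  local dir="${1:-$HOME/.aws}"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 1
  fi
  # Assumption: credentials are stored in the default AWS CLI location.
  if [ ! -f "$dir/credentials" ]; then
    echo "no credentials file in: $dir" >&2
    return 2
  fi
  echo "ok: $dir"
}

check_aws_dir || echo "fix the AWS setup before running docker-compose"
```

The check is deliberately conservative: setups that supply credentials another way (for example through environment variables) would need a different test.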
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
#!/usr/bin/env bash | ||
# uncomment to enable the central scheduler (maybe useful for the graphical interface) | ||
# luigid & | ||
|
||
# pass any arguments straight on to luigi | ||
luigi --local-scheduler "$@" |
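The last line of the script above relies on `"$@"` to forward every argument given to the container unchanged, including arguments that contain spaces. The sketch below illustrates that behaviour without needing Luigi installed. Note the assumption: the `luigi` function here is a stand-in that shadows the real executable and only prints each argument it receives in brackets, and `run` is a hypothetical wrapper mirroring the script's final line.

```shell
#!/usr/bin/env bash
# Stand-in for the real luigi executable (assumption, for illustration only):
# prints each received argument wrapped in brackets.
luigi() {
  printf '[%s]' "$@"
  printf '\n'
}

# Mirrors the final line of run.sh: prepend --local-scheduler and
# forward all remaining arguments exactly as they were passed.
run() {
  luigi --local-scheduler "$@"
}

run --module nesta.production.routines.arxiv.arxiv_iterative_root_task RootTask --date 2019-04-16
run --name "two words"   # prints [--local-scheduler][--name][two words]
```

Using an unquoted `$@` or `$*` instead would split `two words` into two separate arguments, which is why the quoted form is the usual choice in entrypoint scripts.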
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
.. include:: ../../docker/README.rst |