######################
Continuous Integration
######################

.. epigraph::

    Fragalysis Stack development procedures for Kubernetes deployment and
    the role of GitHub `Actions`_ in the Fragalysis Stack *CI* process.

We rely on an external build process (**GitHub Actions**) to build, test and
deploy the Fragalysis Stack container images. **GitHub Actions** are a built-in
feature of GitHub repositories. We have added facilities to *chain* builds
(for one GitHub repository to trigger another) using our custom
`trigger-ci-action`_ GitHub Action. We deploy container images to the cluster
through pre-configured AWX **Job Templates**, using our custom
`trigger-awx-action`_ GitHub Action.

*****************************
Fragalysis Stack Repositories
*****************************

There are four GitHub repositories involved in the build of the stack image::

    fragalysis
    fragalysis-backend
    fragalysis-frontend
    fragalysis-stack

The by-product of each repository is: -

fragalysis
    The output of the ``fragalysis`` repository is a small package of Python
    code, written to `PyPI`_ when the repository is tagged. The package is
    part of the ``fragalysis-backend`` image's Python *requirements* [#f1]_.

fragalysis-backend
    The output of the ``fragalysis-backend`` repository is a container image,
    written to `Docker Hub`_. This image is used as a ``FROM`` image in the
    *Stack* `multi-stage build`_. The backend is based on a **Python**
    "slim-bullseye" image.

fragalysis-frontend
    The output of the ``fragalysis-frontend`` repository is a container image,
    written to `Docker Hub`_. This image is used as a ``FROM`` image in the
    *Stack* `multi-stage build`_. The frontend is based on a **Node**
    "bullseye" image.

fragalysis-stack
    The output of the ``fragalysis-stack`` repository is a container image,
    written to `Docker Hub`_, and is based on the content of both the
    frontend and backend images.

When deployed into a Kubernetes **Namespace** the Fragalysis Stack manifests
itself as a stack **Pod** (running the Django application) along with a
database and a number of other objects, summarised in the following diagram: -

.. image:: ../images/frag-actions/frag-actions.020.png

****************************
Build example (stack master)
****************************

Let's see how **GitHub Actions** work for the Fragalysis Stack by exploring
a simple example, where a user change to a repository's ``staging`` branch
results in the stack being re-built, illustrated by the following diagram.

.. image:: ../images/frag-actions/frag-actions.001.png

The diagram illustrates a *user* making a change (**A**) to the ``staging``
branch of the ``fragalysis-backend`` repository. The following steps occur, in
approximate order: -

1. **GitHub Actions** detect the change and a build takes place that results
   in a backend image being pushed to `Docker Hub`_. The image pushed is
   ``xchem/fragalysis-backend:latest``.
2. At the end of the backend build the **Action** *triggers* a build in the
   remote repository ``fragalysis-stack``. It uses our `trigger-ci-action`_
   Action to do this.
3. The ``fragalysis-stack`` **Action** (triggered by the *backend* changes
   above) runs a build and its image is pushed to Docker Hub. The stack image
   is based on the contents of both the backend and frontend container
   images. The image pushed is ``xchem/fragalysis-stack:latest``.

Importantly, there is only one branch in the stack repository, ``master``.

********************************
More scenarios (here be Dragons)
********************************

That's a simplistic illustration of a *build chain* from one ``staging``
branch rippling through the dependent repositories. But software development
is more complicated than just changes to the ``staging`` branch and, in these
cases, **GitHub Actions** will need some help.

How does the Action know which repos to trigger?
================================================

This is the responsibility of the repository owner. Our `trigger-ci-action`_
Action simplifies the calls to the **GitHub** API, but the owner of each
repository needs to know which repositories to trigger and simply uses the
`trigger-ci-action`_ at a suitable point in their own **workflow** file.

The mechanism is essentially a *push-driven* trigger from the *upstream*
repository to the *downstream* one. A *downstream* repository cannot monitor
*upstream* repositories; the author has to know which repositories depend on
their code.

How does a repo know what container tag to use?
===============================================

By convention, in a CI/CD sense, automated builds on ``staging`` produce
container images tagged ``latest``. The **Action** build can be easily
configured to produce any tag but we tend to use ``latest`` or the tag used
when a repository **Release** or **Tag** is created.

How do I instruct the downstream to use my image?
=================================================

In our example we've assumed the branch being manipulated is ``staging`` and
in this *very simple* workflow we want all the dependent ``staging`` branches
to build, resulting in their own ``latest`` images.

But what if you're working on a defect in the *backend*, on a branch called
``issue-1178.1``? Do you want to trigger a rebuild of the *Stack*'s ``latest``
image from ``fragalysis-backend:latest``? No, you want the stack to use
``fragalysis-backend:issue-1178.1`` as its ``FROM``.

So this is where the `trigger-ci-action`_ Action, its calls to the **GitHub**
REST API, and the **workflow** files in both your *upstream* and *downstream*
repositories become a little more complex...

The *downstream* (Stack) repository's **workflow** file is configured to
expect a ``workflow_dispatch`` event, the variables of which are populated by
the upstream Action's use of its `trigger-ci-action`_.

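
As a rough illustration, a ``workflow_dispatch`` handler of this kind could
be declared as sketched below. This is a hypothetical excerpt that assumes
standard GitHub Actions workflow syntax. The input names and defaults are
the backend variables discussed in this section, while the descriptions are
invented for illustration and are not copied from the stack repository::

    # Hypothetical excerpt of a downstream workflow file (an assumption,
    # not the stack repository's actual file).
    on:
      workflow_dispatch:
        inputs:
          be_namespace:
            description: Docker Hub namespace of the backend image
            required: false
            default: xchem
          be_image_tag:
            description: Tag of the backend image
            required: false
            default: latest

Because each input is optional and carries a default, a trigger that supplies
no values falls back to the standard ``xchem`` and ``latest`` backend image.
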
There are default values, namely: -

* ``BE_NAMESPACE`` and ``BE_IMAGE_TAG`` (defaulting to ``xchem`` and ``latest``)
* ``FE_NAMESPACE`` and ``FE_IMAGE_TAG`` (defaulting to ``xchem`` and ``latest``)

All the *upstream* repository's **workflow** file has to do is ensure that it
*injects* appropriate values for these variables using the
`trigger-ci-action`_. For this example we'd set the variables::

    with:
      ci-inputs: >-
        be_namespace=xchem
        be_image_tag=issue-1178.1

With this setup the triggered build will produce a Stack image based on our
``issue-1178.1`` backend image. Brilliant!

But hold on - the stack will be based on ``issue-1178.1`` while producing its
own ``latest`` image. The *downstream* stack repository's
``workflow_dispatch`` handler also accommodates the variables
``stack_namespace`` and ``stack_version``. If you set these in your trigger
action to ``alanbchristie`` and ``issue-1178.1`` respectively, you can build
a stack image called ``alanbchristie/fragalysis-stack:issue-1178.1``.

Simple ... ish

But what if you forget to set the variable? After all, when you create your
*backend* branch you need to adjust your own GitHub secrets to provide a value
for the variable. If you forget (and you will) you'll end up causing a new
build of ``latest`` in the downstream projects that contains your (probably
untested) patch. Not what others might expect from ``latest``.

What if I want to trigger a non-master downstream branch?
=========================================================

.. epigraph::

    That's a very good question.

What if I have an ``issue-1178.1`` branch in the *upstream* build and I want
to trigger the ``issue-1178.1`` branch in the *downstream* project?

It's solved by the `trigger-ci-action`_ Action, which allows you to pass in a
``ci-ref`` definition so that **GitHub** builds the branch you name rather
than the default ``master``. Brilliant!

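
A sketch of an upstream workflow step that names the downstream branch might
look like the following. Only ``ci-ref`` and ``ci-inputs`` are mentioned in
this guide. The action version, the remaining input names (``ci-owner``,
``ci-repository``, ``ci-user``, ``ci-user-token``, ``ci-name``), the secret
names and the format of the ref are all assumptions that should be checked
against the `trigger-ci-action`_ documentation before use::

    # Hypothetical upstream step; every value here is illustrative.
    - name: Trigger the downstream stack build
      uses: informaticsmatters/trigger-ci-action@v1   # version is assumed
      with:
        ci-owner: xchem                               # assumed input name
        ci-repository: fragalysis-stack               # assumed input name
        ci-ref: refs/heads/issue-1178.1               # branch to build
        ci-user: ${{ secrets.STACK_USER }}            # assumed secret name
        ci-user-token: ${{ secrets.STACK_USER_TOKEN }}  # assumed secret name
        ci-name: build main                           # assumed workflow name
        ci-inputs: >-
          be_namespace=xchem
          be_image_tag=issue-1178.1

Naming the branch explicitly in ``ci-ref`` keeps the choice of downstream
branch visible in the upstream workflow file.
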
If you're clever enough you could even pass this value on to *downstreams* of
the *downstream*, but that doesn't apply in our case and starts to get complex
very quickly.

But what if you forget to set the variable?

Mmmm ... OK ... I see a pattern emerging here.

Basically this is where it all gets rather messy, complex and complicated and
unless you are very, very disciplined in your project organisation and
development you should be treading extremely carefully.

I have a fork of the frontend, how do I...
==========================================

Here we'd like changes in a branch of a fork of one repository to trigger the
build of a branch in the fork of another repository...

**STOP!** It's just getting mind-bendingly complex.

Mmmmm

We're starting to sink deeper into a very complicated world.

**************************
Development Recommendation
**************************

For the main production images for STAGING (latest) and PRODUCTION (tagged)
we...

1. ...utilise **trigger-ci-action** actions in the main ``xchem``
   repositories. The build triggers are used *exclusively* for the automatic
   production of ``latest`` images on the ``master`` branch of the stack.
2. Similarly, GitHub builds tagged images on the main ``xchem`` repositories
   based on the presence of a release (or tag) in the repository.
   ``fragalysis-backend:2023.11.1`` is automatically produced when the owner
   applies the tag ``2023.11.1`` to the ``fragalysis-backend`` repository.

The main stack deployment is therefore automatic, continuous, fast but, above
all, simple.

Individual developers...

3. ...work on branches of the main repositories or (ideally) on branches of
   *forks* of the main repos.
4. No images are automatically produced from changes to branches or forks.
5. Developers are responsible for building their own container images and for
   pushing them to Docker Hub.
   **Tina**, working on branch ``issue-1178.1`` in a *fork* of the
   ``fragalysis-frontend`` repository, is responsible for producing the
   corresponding ``stack`` image by (ideally) also forking and manipulating
   the ``fragalysis-stack`` repository so that it clones her frontend code
   rather than the code from ``xchem/fragalysis-frontend``.
6. In order to deploy their project to Kubernetes (the subject of another
   guide), users may push their container image to any Docker Hub namespace,
   project or tag. **Tina** can push her image as
   ``xyz/stack-tina:issue-1178.1`` if she chooses.

   This works because she will have deployed her project to Kubernetes (now a
   developer responsibility) configured so that her cloud deployment's stack
   runs using the image ``xyz/stack-tina:issue-1178.1`` (rather than the
   default ``xchem/fragalysis-stack:latest``). **Tina** can also select the
   version of the database she wants to use and the URL of the graph
   database. When she's done she destroys the Kubernetes project.

The above places significant responsibility on the developer - they have to
create the images, they have to push them, they have to create the Kubernetes
deployments (subject of another guide) and they have to understand the build
process. But this is a significantly simpler and relatively pain-free route
to supporting unlimited multi-developer deployments compared with what could
be achieved by any automatic system in the timescale available.

After all, if you expect to have 20 or 30 developers all on different forks
and branches, all developing different aspects of the code, an automatic build
system would be enormously complex, fragile and costly to maintain.

********************
Development Examples
********************

To further illustrate the knock-on effect of the above recommendation for
individual developers, i.e. that developers are responsible for their own
container images using repository forks and branches, a few examples follow.

.. epigraph::

    The following relies on the use of standard Docker build arguments and
    the ability to use build-time args in the FROM statement,
    i.e. Docker v17.05 or later.

.. _fe-example:

Developing Front-end (F/E) Code Example
=======================================

Here you're developing front-end code, relying on a published backend image
and the existing stack implementation.

.. image:: ../images/frag-actions/frag-actions.002.png

1. The developer *forks* ``xchem/fragalysis-frontend`` into, say,
   ``alan/fragalysis-frontend`` (**A**)
2. The developer creates a *branch*, e.g. ``1-fix``, and clones it in order
   to make changes (**B**)
3. The developer *clones* ``xchem/fragalysis-stack`` (**C**)
4. When a stack image is to be tested the developer builds the stack
   (locally) using Docker. This could be achieved through the use of a build
   script [#f3]_, where the developer provides a suitable set of
   *build-args*, as shown (**D**).
5. Upon conclusion of development a *pull-request* on the frontend
   repository propagates the changes back to the XChem repo.

The produced *stack*, built from a tagged backend and the code in the
developer's ``1-fix`` branch of their front-end repo fork, can then be pushed
to Docker Hub and the Kubernetes cluster triggered to pull and run the
updated code.

The diagram also illustrates how the XChem ``STAGING/latest`` Fragalysis Stack
is built and deployed (automatically using GitHub). This *official* stack uses
a tagged b/e image (the same version in this example) but its *build args*
(**E**) are such that it uses the ``master`` branch of the ``xchem`` project
as the source of the front-end code [#f4]_.

.. _be-example:

Developing Back-end (B/E) Code Example
======================================

Here you're developing back-end code, relying on the existing front-end and
stack implementation.

.. image:: ../images/frag-actions/frag-actions.003.png

Here, in a less cluttered diagram: -

1. The developer *forks* ``xchem/fragalysis-backend`` into, say,
   ``alan/fragalysis-backend`` (**A**)
2. The developer creates a *branch*, e.g. ``1-fix``, and clones it in order
   to make changes (**B**)
3. The developer *clones* ``xchem/fragalysis-stack`` (**C**)
4. When a stack image is to be tested the developer needs to build their own
   b/e image (**D**) (which they can optionally push to Docker Hub) and then
   build the stack (locally), providing suitable *build-args*, as shown
   (**E**).
5. Upon conclusion of development a *pull-request* on the b/e repository
   propagates the changes back to the XChem repo.

.. _stack-example:

Developing Stack Code Example
=============================

Here you're developing stack code, relying on a published back-end image and
front-end implementation.

.. image:: ../images/frag-actions/frag-actions.004.png

1. The developer *forks* the fragalysis stack repository (say to ``alan``)
   (**A**)
2. The developer creates a *branch*, e.g. ``1-fix``, and clones it in order
   to make changes (**B**)
3. When a stack image needs to be tested the developer builds their own
   stack image, providing suitable *build-args*, as shown (**D**), and
   pushes it to Docker Hub (**C**).
4. Upon conclusion of development a *pull-request* on the stack repository
   propagates the changes back to the XChem repo.

.. _everything-example:

Developing Everything Example
=============================

Here you're developing front-end, back-end and stack code.

.. image:: ../images/frag-actions/frag-actions.005.png

This is essentially a combination of the three prior scenarios.

1. The developer *forks* each repository (say to ``alan``) (**A**)
2. The developer creates a feature *branch* in each *fork* and then clones
   that to make changes (**B**). In the diagram we have branches ``1-fix``,
   ``2-fix`` and ``4-feature`` for the f/e, b/e and stack respectively.
3. When a stack is to be tested the developer first builds their own b/e
   (**C**) using minimal build arguments [#f5]_.
   The developer then builds their own stack from a clone of their code
   branch. Here you can see the stack is configured to use the
   ``alan/fragalysis-backend:2-fix`` image and a clone of the f/e ``1-fix``
   branch.
4. The pushed stack can then be deployed to the Kubernetes cluster.
5. Upon conclusion of development *pull-requests* for the b/e, f/e and stack
   repositories are made in order to propagate the changes back to the XChem
   repos.

.. rubric:: Footnotes

.. [#f1] Publishing to PyPI does not currently result in a trigger of the
         backend. It is something we can contemplate in the new development.

.. [#f3] The build script will help by forcing a pull of the dependent
         backend container image, for example.

.. [#f4] Ideally this would actually be a tag rather than ``master``.

.. [#f5] Automation of the image project from the project fork should be
         possible so the user may not have to specify anything in this case.

.. _actions: https://github.com/features/actions
.. _current: https://github.com/pavol-brunclik-m2ms/fragalysis-frontend/tree/develop
.. _docker hub: https://hub.docker.com/search?q=xchem&type=image
.. _multi-stage build: https://docs.docker.com/build/building/multi-stage
.. _pypi: https://pypi.org/project/fragalysis
.. _trigger-ci-action: https://github.com/InformaticsMatters/trigger-ci-action
.. _trigger-awx-action: https://github.com/InformaticsMatters/trigger-awx-action