Rebase on main

materialsproject · Aug 22, 2023 · c4ac3f2
2 parents c71fb86 + b71cf2a
Showing 85 changed files with 7,543 additions and 467 deletions.
diff --git a/.github/workflows/docs-manual.yml b/.github/workflows/docs-manual.yml
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
@@ -0,0 +1,55 @@
+name: build-docs
+
+on:
+  workflow_dispatch:
+  push:
+    branches: [main]
+
+# set GITHUB_TOKEN permissions to allow deployment to GitHub Pages
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+jobs:
+  build-docs:
+    if: github.repository_owner == 'materialsproject' && github.ref == 'refs/heads/main'
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v3
+        with:
+          ref: ${{ github.event.workflow_run.head_branch }}
+
+      - name: Install pandoc
+        run: sudo apt-get install pandoc
+
+      - uses: actions/setup-python@v4
+        with:
+          python-version: "3.10"
+          cache: pip
+          cache-dependency-path: pyproject.toml
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install .[strict,docs]
+
+      - name: Build
+        run: sphinx-build docs docs_build
+
+      - name: Upload build artifact
+        uses: actions/upload-pages-artifact@v2
+        with:
+          path: ./docs_build
+
+  deploy:
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    needs: build-docs
+    runs-on: ubuntu-latest
+    steps:
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v2
diff --git a/.github/workflows/testing.yml b/.github/workflows/testing.yml
@@ -2,7 +2,7 @@ name: testing
 
 on:
   push:
-
+    branches: [main]
   pull_request:
     branches: [main]
 
@@ -14,7 +14,7 @@ jobs:
 
       - uses: actions/setup-python@v4
         with:
-          python-version: '3.8'
+          python-version: "3.8"
           cache: pip
           cache-dependency-path: pyproject.toml
 
@@ -30,7 +30,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ['3.8', '3.9', '3.10']
+        python-version: ["3.8", "3.9", "3.10"]
 
     steps:
       - uses: actions/checkout@v3
@@ -65,7 +65,7 @@ jobs:
 
       - uses: actions/setup-python@v4
         with:
-          python-version: '3.10'
+          python-version: "3.10"
           cache: pip
           cache-dependency-path: pyproject.toml
 

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -3,7 +3,7 @@ default_language_version:
 exclude: '^.github/'
 repos:
 - repo: https://github.com/charliermarsh/ruff-pre-commit
-  rev: v0.0.275
+  rev: v0.0.284
   hooks:
   - id: ruff
     args: [--fix]
@@ -16,11 +16,11 @@ repos:
   - id: end-of-file-fixer
   - id: trailing-whitespace
 - repo: https://github.com/psf/black
-  rev: 23.3.0
+  rev: 23.7.0
   hooks:
   - id: black
 - repo: https://github.com/asottile/blacken-docs
-  rev: 1.14.0
+  rev: 1.15.0
   hooks:
   - id: blacken-docs
     additional_dependencies: [black]
@@ -46,7 +46,7 @@ repos:
   - id: rst-directive-colons
   - id: rst-inline-touching-normal
 - repo: https://github.com/pre-commit/mirrors-mypy
-  rev: v1.3.0
+  rev: v1.5.0
   hooks:
   - id: mypy
     files: ^src/
@@ -55,7 +55,7 @@ repos:
     - types-pkg_resources==0.1.2
     - types-paramiko
 - repo: https://github.com/codespell-project/codespell
-  rev: v2.2.4
+  rev: v2.2.5
   hooks:
   - id: codespell
     stages: [commit, commit-msg]

diff --git a/docs/conf.py b/docs/conf.py
@@ -11,8 +11,6 @@
 import os
 import sys
 
-# import typing
-# typing.TYPE_CHECKING = True
 from atomate2 import __version__
 
 sys.path.insert(0, os.path.abspath("../../"))
@@ -94,7 +92,7 @@
 """
 }
 language = "en"
-html_extra_path = ["images/ badge.svg"]
+html_extra_path = ["images/badge.svg"]
 html_static_path = ["_static"]
 html_css_files = ["custom.css", "github.css"]
 suppress_warnings = "etoc.toctree"

diff --git a/docs/dev/workflow_tutorial.md b/docs/dev/workflow_tutorial.md
@@ -0,0 +1,63 @@
+# How to develop a new workflow for `atomate2`
+
+## Anatomy of an `atomate2` computational workflow (i.e., what do I need to write?)
+
+Every `atomate2` workflow is an instance of jobflow's `Flow` class, which is a collection of `Job` and/or other `Flow` objects. So your end goal is to produce a `Flow`.
+
+In the context of computational materials science, `Flow` objects are most easily created by a `Maker`, which contains a factory method `make()` that produces a `Flow`, given certain inputs. Typically, the input to `Maker.make()` includes atomic coordinate information in the form of a `pymatgen` `Structure` or `Molecule` object. So the basic signature looks like this:
+
+```python
+class ExampleMaker(Maker):
+    def make(self, coordinates: Structure) -> Flow:
+        # take the input coordinates and return a `Flow`
+        return Flow(...)
+```
+
+The `Maker` class usually contains most of the calculation parameters and other settings that are required to set up the calculation correctly. Much of this logic can be written as normal Python functions and then turned into a `Job` via the `@job` decorator.
+
+One common task encountered in almost any materials science calculation is writing calculation input files to disk so they can be executed by the underlying software (e.g., VASP, Q-Chem, CP2K, etc.). This is preferably done via a `pymatgen` `InputSet` class. `InputSet` is essentially a dict-like container that specifies the files that need to be written, and their contents. Similarly to the way that `Maker` classes generate `Flow`s, `InputSet`s are most easily created by `InputGenerator` classes. `InputGenerator` classes
+have a method `get_input_set()` that typically takes atomic coordinates (e.g., a `Structure` or `Molecule` object) and produces an `InputSet`, e.g.,
+
+```python
+class ExampleInputGenerator(InputGenerator):
+    def get_input_set(self, coordinates: Structure) -> InputSet:
+        # take the input coordinates, determine appropriate
+        # input file contents, and return an `InputSet`
+        return InputSet(...)
+```
+
+`pymatgen` already contains `InputSet` classes for many common codes, so when developing a workflow `Maker` it is convenient to use an `InputGenerator` / `InputSet` pair to prepare your files. This is done in `atomate2` by making the `InputGenerator` a class parameter, e.g.,
+
+**TODO - the code block below needs refinement. Not exactly sure how write_inputs() fits into a `Job`**
+
+```python
+class ExampleMaker(Maker):
+    input_set_generator: ExampleInputGenerator = field(
+        default_factory=ExampleInputGenerator
+    )
+
+    def make(self, coordinates: Structure) -> Flow:
+        # create an `InputSet`
+        input_set = self.input_set_generator.get_input_set(coordinates)
+        # write the input files
+        input_set.write_inputs()
+        return Flow(...)
+```
+
+Finally, most `atomate2` workflows return structured output in the form of "Task Documents". Task documents are instances of `emmet`'s `BaseTaskDocument` class (similar to a Python `@dataclass`) that define schemas for storing calculation outputs. `emmet` already contains calculation schemas for codes utilized by the Materials Project (e.g., VASP, Q-Chem, FEFF) as well as a number of schemas for code-agnostic structural and molecular information (for example, the `MaterialsDoc` is a schema for solid material calculation data). `atomate2` can also interpret output generated by [`cclib`](https://cclib.github.io/), which is able to parse the output of many additional codes.
+
+**TODO - extend code block above to illustrate TaskDoc usage**
+
+In summary, a new `atomate2` workflow consists of the following components:
+ - A `Maker` that actually generates the workflow
+ - One or more `Job` and/or `Flow` classes that define the discrete steps in the workflow
+ - (optionally) an `InputGenerator` that produces a `pymatgen` `InputSet` for writing calculation input files
+ - (optionally) a `TaskDocument` that defines a schema for storing the output data
+
+## Where do I put my code?
+
+Because of the distributed design of the MP Software Ecosystem, writing a complete new workflow may involve making contributions to more than one GitHub repository. The following guidelines should help you understand where to put your contribution.
+
+ - All workflow code (`Job`, `Flow`, `Maker`) belongs in `atomate2`
+ - `InputSet` and `InputGenerator` code belongs in `pymatgen`. However, if you need to create these classes from scratch (i.e., you are working with a code that is not already supported in `pymatgen`), then it is recommended to include them in `atomate2` at first to facilitate rapid iteration. Once mature, they can be moved to `pymatgen` or to a `pymatgen` [addon package](https://pymatgen.org/addons).
+ - `TaskDocument` schemas should generally be developed in `atomate2` alongside the workflow code. We recommend that you first check `emmet` to see if there is an existing schema that matches what you need. If so, you can import it. If not, check [`cclib`](https://cclib.github.io/). `cclib` output can be imported via [`atomate2.common.schemas.TaskDocument`](https://github.com/materialsproject/atomate2/blob/main/src/atomate2/common/schemas/cclib.py). If neither package has what you need, then new schemas should be developed within `atomate2` (or `cclib`).
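
Note on the tutorial above: the `Maker` / `InputGenerator` pattern it describes can be sketched without any of the real libraries. The block below is only an illustration under stated assumptions. Every `Sketch*` class is a hypothetical stand-in that mimics the shape of jobflow's `Maker` / `Flow` and pymatgen's `InputGenerator` / `InputSet`, and the file names and the `functional` setting are arbitrary placeholders. It does not resolve the tutorial's open TODO about where `write_inputs()` belongs.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins: these are NOT the jobflow or pymatgen classes,
# they only imitate how the real ones are wired together.
@dataclass
class SketchInputSet:
    files: dict  # maps file name -> file contents


@dataclass
class SketchInputGenerator:
    functional: str = "PBE"  # arbitrary example setting

    def get_input_set(self, coordinates: list) -> SketchInputSet:
        # derive the file contents from the coordinates and the settings
        body = "\n".join(" ".join(str(x) for x in site) for site in coordinates)
        return SketchInputSet(
            {"settings.in": f"functional = {self.functional}", "coords.in": body}
        )


@dataclass
class SketchFlow:
    jobs: list
    name: str = "sketch flow"


@dataclass
class SketchMaker:
    name: str = "example"
    # default_factory gives every maker instance its own generator
    input_set_generator: SketchInputGenerator = field(
        default_factory=SketchInputGenerator
    )

    def make(self, coordinates: list) -> SketchFlow:
        input_set = self.input_set_generator.get_input_set(coordinates)
        # in this sketch the only "job" is the prepared input set
        return SketchFlow(jobs=[input_set], name=self.name)


flow = SketchMaker().make([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]])
# swapping the generator changes the inputs without touching make()
custom = SketchMaker(input_set_generator=SketchInputGenerator(functional="HSE06"))
```

Keeping the generator as a field rather than hard-coding it inside `make()` is what lets a user change calculation settings by passing a different generator, which appears to be the motivation for the "class parameter" design the tutorial mentions.
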
diff --git a/docs/user/codes/vasp.md b/docs/user/codes/vasp.md
@@ -241,7 +241,7 @@ adjust them if necessary. The default might not be strict enough
 for your specific case.
 ```
 
-## Lobster
+### Lobster
 
 Perform bonding analysis with [LOBSTER](http://cohp.de/) and [LobsterPy](https://github.com/jageo/lobsterpy)
 
@@ -258,8 +258,67 @@ VASP_CMD: <<VASP_CMD>>
 LOBSTER_CMD: <<LOBSTER_CMD>>
 ```
 
+The corresponding flow could, for example, be started with the following code:
+
+```python
+from jobflow import SETTINGS
+from jobflow import run_locally
+from pymatgen.core.structure import Structure
+
+from atomate2.vasp.flows.lobster import VaspLobsterMaker
+from atomate2.vasp.powerups import update_user_incar_settings
+
+structure = Structure(
+    lattice=[[0, 2.13, 2.13], [2.13, 0, 2.13], [2.13, 2.13, 0]],
+    species=["Mg", "O"],
+    coords=[[0, 0, 0], [0.5, 0.5, 0.5]],
+)
+
+lobster = VaspLobsterMaker().make(structure)
+
+# update the INCAR settings
+lobster = update_user_incar_settings(lobster, {"NPAR": 4})
+# run the flow locally
+run_locally(lobster, create_folders=True, store=SETTINGS.JOB_STORE)
+```
+
+It is, however, computationally beneficial to define separate job scripts for the VASP and LOBSTER runs, because the two codes are parallelized differently (MPI vs. OpenMP).
 [FireWorks](https://github.com/materialsproject/fireworks) allows to run the VASP and Lobster jobs with different job scripts. Please check out the [jobflow documentation on FireWorks](https://materialsproject.github.io/jobflow/tutorials/8-fireworks.html#setting-the-manager-configs) for more information.
 
+Specifically, you might want to change the `_fworker` for the LOBSTER runs and define a separate `lobster` worker within FireWorks:
+
+```python
+from fireworks import LaunchPad
+from jobflow.managers.fireworks import flow_to_workflow
+from pymatgen.core.structure import Structure
+
+from atomate2.vasp.flows.lobster import VaspLobsterMaker
+from atomate2.vasp.powerups import update_user_incar_settings
+
+structure = Structure(
+    lattice=[[0, 2.13, 2.13], [2.13, 0, 2.13], [2.13, 2.13, 0]],
+    species=["Mg", "O"],
+    coords=[[0, 0, 0], [0.5, 0.5, 0.5]],
+)
+
+lobster = VaspLobsterMaker().make(structure)
+lobster = update_user_incar_settings(lobster, {"NPAR": 4})
+
+# update the fireworker of the Lobster jobs
+for job, _ in lobster.iterflow():
+    config = {"manager_config": {"_fworker": "worker"}}
+    if "get_lobster" in job.name:
+        config["response_manager_config"] = {"_fworker": "lobster"}
+    job.update_config(config)
+
+# convert the flow to a FireWorks Workflow object
+wf = flow_to_workflow(lobster)
+
+# submit the workflow to the FireWorks launchpad
+lpad = LaunchPad.auto_load()
+lpad.add_wf(wf)
+```
+
 Outputs from the automatic analysis with LobsterPy can easily be extracted from the database and also plotted:
 
 ```python