DOCS: Add example for using individual_axes

Also finalized remaining bits in the docs.
derb12 · Feb 18, 2024 · 21f2262
1 parent 1a668d1

Showing 8 changed files with 155 additions and 99 deletions.
diff --git a/README.rst b/README.rst
@@ -101,7 +101,7 @@ pybaselines requires `Python <https://python.org>`_ version 3.8 or later
 and the following libraries:
 
 * `NumPy <https://numpy.org>`_
-* `SciPy <https://www.scipy.org>`_
+* `SciPy <https://scipy.org>`_
 
 
 All of the required libraries should be automatically installed when
@@ -124,8 +124,8 @@ To use the various functions in pybaselines, simply input the measured
 data and any required parameters. All baseline correction functions in pybaselines
 will output two items: a numpy array of the calculated baseline and a
 dictionary of potentially useful parameters. The main interface for all baseline correction
-algorithms in pybaselines is through the `pybaselines.Baseline` object for one dimensional
-data and `pybaselines.Baseline2D` for two dimensional data.
+algorithms in pybaselines is through the ``Baseline`` object for one dimensional
+data and ``Baseline2D`` for two dimensional data.
 
 For more details on each baseline algorithm, refer to the `algorithms section`_ of
 pybaselines's documentation. For examples of their usage, refer to the `examples section`_.

diff --git a/docs/algorithms_2d/index.rst b/docs/algorithms_2d/index.rst
@@ -2,14 +2,14 @@
 2D Algorithms
 =============
 
-pybaselines extends a subset of the 1D baseline correction algorithms to work with
-2D data. Note that this is only intended for data in which there is some global baseline;
-otherwise, it is more appropriate and usually significantly faster to simply use the 1D
-algorithms on each individual row and/or column in the data, which can be done using
-:meth:`~.Baseline2D.individual_axes`.
+pybaselines extends a subset of the one dimensional (1D) baseline correction algorithms to work
+with two dimensional (2D) data. Note that this is only intended for data in which there is some
+global baseline; otherwise, it is more appropriate and usually significantly faster to simply
+use the 1D algorithms on each individual row and/or column in the data, which can be done using
+:meth:`.Baseline2D.individual_axes` or using :class:`.Baseline` with for-loops.
 
 This section of the documentation is to help provide some context for how the algorithms
-were extended to work with two dimensional data. It will not be as comprehensive as the
+were extended to work with 2D data. It will not be as comprehensive as the
 :doc:`1D Algorithms section <../algorithms/index>`, so to help understand any algorithm,
 it is suggested to start there. Refer to the :doc:`API section <../api/index>` of the
 documentation for the full parameter and reference listing for any algorithm.

diff --git a/docs/algorithms_2d/optimizers_2d.rst b/docs/algorithms_2d/optimizers_2d.rst
@@ -81,6 +81,12 @@ baseline algorithm along each row and/or column of the measured data. This is us
 if the axes of the data are not correlated such that no information is lost by
 fitting each axis separately, or when baselines only exist along one axis.
 
+Note that one limitation of :meth:`~.Baseline2D.individual_axes` is that it does not
+handle array-like `method_kwargs`, such as when different input weights are desired
+for each dataset along the rows and/or columns. However, this is an extremely niche
+situation and can be handled by simply using a for-loop with :class:`.Baseline` to do
+one dimensional baseline correction instead.
+
 .. plot::
    :align: center
    :context: close-figs

diff --git a/docs/installation.rst b/docs/installation.rst
@@ -11,7 +11,7 @@ Dependencies
 pybaselines requires `Python <https://python.org>`_ version 3.8 or later and the following libraries:
 
 * `NumPy <https://numpy.org>`_ (>= 1.20)
-* `SciPy <https://www.scipy.org>`_ (>= 1.5)
+* `SciPy <https://scipy.org>`_ (>= 1.5)
 
 
 All of the required libraries should be automatically installed when
@@ -22,7 +22,7 @@ Optional Dependencies
 
 pybaselines has the following optional dependencies:
 
-* `numba <https://github.com/numba/numba>`_ (>= 0.49):
+* `Numba <https://github.com/numba/numba>`_ (>= 0.49):
   speeds up calculations used by the following functions:
 
     * :meth:`~Baseline.loess`

diff --git a/examples/two_d/plot_along_axes_1d_baseline.py b/examples/two_d/plot_along_axes_1d_baseline.py
@@ -0,0 +1,94 @@
+# -*- coding: utf-8 -*-
+"""
+Using `individual_axes` for 1D Baseline Correction
+--------------------------------------------------
+
+This example will show how to apply one dimensional baseline correction to two
+dimensional data using :meth:`.Baseline2D.individual_axes`. Note that this is valid
+only if each baseline along the axis uses the same inputs; otherwise, the more appropriate
+approach is to use a for-loop with the corresponding :class:`.Baseline` method.
+
+"""
+# sphinx_gallery_thumbnail_number = 4
+
+import matplotlib.pyplot as plt
+import numpy as np
+
+from pybaselines import Baseline2D
+from pybaselines.utils import gaussian
+
+
+def plot_contour_with_projection(X, Z, data):
+    """Plots the contour plot and 3d projection."""
+    fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
+    ax_1 = fig.add_subplot(1, 2, 1)
+    ax_1.contourf(X, Z, data, cmap='coolwarm')
+    ax_2 = fig.add_subplot(1, 2, 2, projection='3d')
+    ax_2.plot_surface(X, Z, data, cmap='coolwarm')
+
+    ax_1.set_xlabel('Raman Shift (cm$^{-1}$)')
+    ax_1.set_ylabel('Temperature ($^o$C)')
+    ax_2.set_xlabel('Raman Shift (cm$^{-1}$)')
+    ax_2.set_ylabel('Temperature ($^o$C)')
+    ax_2.set_zticks([])
+
+
+def plot_1d(x, data):
+    """Plots the data in only one dimension."""
+    plt.figure()
+    # reverse so that data for the highest temperatures is plotted first
+    plt.plot(x, data[::-1].T)
+    plt.xlabel('Raman Shift (cm$^{-1}$)')
+    plt.ylabel('Intensity (Counts)')
+
+
+# %%
+# The data for this example will simulate Raman spectroscopy measurements that
+# were taken while heating a sample. Within the sample, peaks for one specimen
+# disappear as the temperature is raised, which could occur due to a chemical
+# reaction, phase change, decomposition, etc. Further, as the temperature increases,
+# the measured baseline slightly increases.
+len_temperature = 25
+wavenumber = np.linspace(50, 300, 1000)
+temperature = np.linspace(25, 100, len_temperature)
+X, T = np.meshgrid(wavenumber, temperature, indexing='ij')
+noise_generator = np.random.default_rng(0)
+data = []
+for i, t_value in enumerate(temperature):
+    signal = (
+        gaussian(wavenumber, 11 * (1 - i / len_temperature), 90, 3)
+        + gaussian(wavenumber, 12 * (1 - i / len_temperature), 110, 6)
+        + gaussian(wavenumber, 13, 210, 8)
+    )
+    real_baseline = 100 + 0.005 * wavenumber + 0.0001 * (wavenumber - 120)**2 + 0.08 * t_value
+    data.append(signal + real_baseline + noise_generator.normal(scale=0.1, size=wavenumber.size))
+y = np.array(data)
+
+plot_contour_with_projection(X, T, y.T)
+
+# %%
+# When considering the baseline of this data, it is more helpful to plot all measurements
+# only considering the wavenumber dependence.
+plot_1d(wavenumber, y)
+
+# %%
+# While the measured data is two dimensional, each baseline can be considered as
+# only dependent on the wavenumbers and independent of every other measurement along the
+# temperature axis. Thus, individual_axes can be called on just the axis corresponding
+# to the wavenumbers (i.e., axis 1, the columns).
+baseline_fitter = Baseline2D(temperature, wavenumber)
+baseline, params = baseline_fitter.individual_axes(
+    y, axes=1, method='pspline_arpls', method_kwargs={'lam': 1e4}
+)
+
+# %%
+# Looking at the one dimensional representation, each spectrum was correctly baseline
+# corrected.
+plot_1d(wavenumber, y - baseline)
+
+# %%
+# Finally, looking at the two dimensional representation of the data again, the dependence
+# of each peak's intensity on temperature is more easily seen.
+plot_contour_with_projection(X, T, (y - baseline).T)
+
+plt.show()
diff --git a/examples/two_d/plot_whittaker_2d_dof.py b/examples/two_d/plot_whittaker_2d_dof.py
@@ -32,6 +32,16 @@ def mean_squared_error(fit_baseline, real_baseline):
     return ((fit_baseline - real_baseline)**2).mean()
 
 
+def plot_contour_with_projection(X, Z, data, title=''):
+    """Plots the contour plot and 3d projection."""
+    fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
+    fig.suptitle(title)
+    ax_1 = fig.add_subplot(1, 2, 1, projection='3d')
+    ax_1.plot_surface(X, Z, data, cmap='coolwarm')
+    ax_2 = fig.add_subplot(1, 2, 2)
+    ax_2.contourf(X, Z, data, cmap='coolwarm')
+
+
 x = np.linspace(-20, 20, 100)
 z = np.linspace(-20, 30, 100)
 X, Z = np.meshgrid(x, z, indexing='ij')
@@ -52,19 +62,8 @@ def mean_squared_error(fit_baseline, real_baseline):
 # Only the baselines will be plotted in this example since the actual data is irrelevant
 # for this discussion.
 
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('Actual Polynomial Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(X, Z, polynomial_baseline, cmap='coolwarm')
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(X, Z, polynomial_baseline, cmap='coolwarm')
-
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('Actual Sinusoidal Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(X, Z, sine_baseline, cmap='coolwarm')
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(X, Z, sine_baseline, cmap='coolwarm')
+plot_contour_with_projection(X, Z, polynomial_baseline, title='Actual Polynomial Baseline')
+plot_contour_with_projection(X, Z, sine_baseline, title='Actual Sinusoidal Baseline')
 
 # %%
 # The ``lam`` values for fitting the baseline can be kept constant whether using
@@ -83,19 +82,12 @@ def mean_squared_error(fit_baseline, real_baseline):
 print(f'Mean-squared-error, polynomial: {mse_analytical_poly:.5f}')
 print(f'Mean-squared-error, sinusoidal: {mse_analytical_sine:.5f}\n')
 
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('Analytical Polynomial Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(X, Z, analytical_poly_baseline, cmap='coolwarm')
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(X, Z, analytical_poly_baseline, cmap='coolwarm')
-
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('Analytical Sinusoidal Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(X, Z, analytical_sine_baseline, cmap='coolwarm')
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(X, Z, analytical_sine_baseline, cmap='coolwarm')
+plot_contour_with_projection(
+    X, Z, analytical_poly_baseline, title='Analytical Polynomial Baseline'
+)
+plot_contour_with_projection(
+    X, Z, analytical_sine_baseline, title='Analytical Sinusoidal Baseline'
+)
 
 # %%
 # Now, try using eigendecomposition to calculate the same baselines. To start
@@ -119,50 +111,26 @@ def mean_squared_error(fit_baseline, real_baseline):
 print(f'Mean-squared-error, polynomial: {mse_analytical_poly:.5f}')
 print(f'Mean-squared-error, sinusoidal: {mse_analytical_sine:.5f}\n')
 
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('40x40 Eigenvalues Polynomial Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(X, Z, eigenvalue_poly_baseline_1, cmap='coolwarm')
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(X, Z, eigenvalue_poly_baseline_1, cmap='coolwarm')
-
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('40x40 Eigenvalues Sinusoidal Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(X, Z, eigenvalue_sine_baseline_1, cmap='coolwarm')
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(X, Z, eigenvalue_sine_baseline_1, cmap='coolwarm')
+plot_contour_with_projection(
+    X, Z, eigenvalue_poly_baseline_1, title='40x40 Eigenvalues Polynomial Baseline'
+)
+plot_contour_with_projection(
+    X, Z, eigenvalue_sine_baseline_1, title='40x40 Eigenvalues Sinusoidal Baseline'
+)
 
 # %%
 # By using 40 eigenvalues along the rows and 40 along the columns, the error of the fit
 # remains the same as the analytical solution while slightly reducing the computation time.
 # However, the number of eigenvalues being used is more than is actually required to represent
 # the two baselines, which means that the calculation time can be further reduced. Plot the
 # effective degrees of freedom to see which contribute most to the calculation.
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('Effective Degrees of Freedom for\nPolynomial Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(
+plot_contour_with_projection(
     *np.meshgrid(np.arange(num_eigens[0]), np.arange(num_eigens[1]), indexing='ij'),
-    params_3['dof'], cmap='coolwarm'
+    params_3['dof'], title='Effective Degrees of Freedom for Polynomial Baseline'
 )
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(
+plot_contour_with_projection(
     *np.meshgrid(np.arange(num_eigens[0]), np.arange(num_eigens[1]), indexing='ij'),
-    params_3['dof'], cmap='coolwarm'
-)
-
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('Effective Degrees of Freedom for\nSinusoidal Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(
-    *np.meshgrid(np.arange(num_eigens[0]), np.arange(num_eigens[1]), indexing='ij'),
-    params_4['dof'], cmap='coolwarm'
-)
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(
-    *np.meshgrid(np.arange(num_eigens[0]), np.arange(num_eigens[1]), indexing='ij'),
-    params_4['dof'], cmap='coolwarm'
+    params_4['dof'], title='Effective Degrees of Freedom for Sinusoidal Baseline'
 )
 
 # %%
@@ -191,19 +159,12 @@ def mean_squared_error(fit_baseline, real_baseline):
 print(f'Mean-squared-error, polynomial: {mse_analytical_poly:.5f}')
 print(f'Mean-squared-error, sinusoidal: {mse_analytical_sine:.5f}\n')
 
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('10x4 Eigenvalues Polynomial Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(X, Z, eigenvalue_poly_baseline_2, cmap='coolwarm')
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(X, Z, eigenvalue_poly_baseline_2, cmap='coolwarm')
-
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('8x35 Eigenvalues Sinusoidal Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(X, Z, eigenvalue_sine_baseline_2, cmap='coolwarm')
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(X, Z, eigenvalue_sine_baseline_2, cmap='coolwarm')
+plot_contour_with_projection(
+    X, Z, eigenvalue_poly_baseline_2, title='10x4 Eigenvalues Polynomial Baseline'
+)
+plot_contour_with_projection(
+    X, Z, eigenvalue_sine_baseline_2, title='8x35 Eigenvalues Sinusoidal Baseline'
+)
 
 # %%
 # By reducing the number of eigenvalues to represent the baseline, the calculation
@@ -226,23 +187,16 @@ def mean_squared_error(fit_baseline, real_baseline):
 t1 = perf_counter()
 mse_analytical_poly = mean_squared_error(eigenvalue_poly_baseline_3, polynomial_baseline)
 mse_analytical_sine = mean_squared_error(eigenvalue_sine_baseline_3, sine_baseline)
-print(f'3x3 Eigenvalues for polynomial, 5x10 for sinusoidal:\nTime: {t1 - t0:.3f} seconds')
+print(f'3x3 Eigenvalues for polynomial, 5x12 for sinusoidal:\nTime: {t1 - t0:.3f} seconds')
 print(f'Mean-squared-error, polynomial: {mse_analytical_poly:.5f}')
 print(f'Mean-squared-error, sinusoidal: {mse_analytical_sine:.5f}')
 
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('3x3 Eigenvalues Polynomial Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(X, Z, eigenvalue_poly_baseline_3, cmap='coolwarm')
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(X, Z, eigenvalue_poly_baseline_3, cmap='coolwarm')
-
-fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
-fig.suptitle('5x12 Eigenvalues Sinusoidal Baseline')
-ax = fig.add_subplot(1, 2, 2)
-ax.contourf(X, Z, eigenvalue_sine_baseline_3, cmap='coolwarm')
-ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
-ax_2.plot_surface(X, Z, eigenvalue_sine_baseline_3, cmap='coolwarm')
+plot_contour_with_projection(
+    X, Z, eigenvalue_poly_baseline_3, title='3x3 Eigenvalues Polynomial Baseline'
+)
+plot_contour_with_projection(
+    X, Z, eigenvalue_sine_baseline_3, title='5x12 Eigenvalues Sinusoidal Baseline'
+)
 
 plt.show()
 

diff --git a/pybaselines/two_d/__init__.py b/pybaselines/two_d/__init__.py
@@ -51,6 +51,7 @@
 
     * collab_pls (Collaborative Penalized Least Squares)
     * adaptive_minmax (Adaptive MinMax)
+    * individual_axes (1D Baseline Correction Along Individual Axes)
 
 
 @author: Donald Erb

diff --git a/pybaselines/two_d/optimizers.py b/pybaselines/two_d/optimizers.py
@@ -311,7 +311,8 @@ def individual_axes(self, data, axes=(0, 1), method='asls', method_kwargs=None):
         Raises
         ------
         ValueError
-            Raised if `method_kwargs` is a sequence with length greater than `axes`.
+            Raised if `method_kwargs` is a sequence with length greater than `axes` or if
+            `axes` contains duplicate values.
 
         Notes
         -----
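
The two conditions listed under Raises can be pictured with a small stand-alone check. ``check_axes`` below is a hypothetical helper written for this illustration. It is not the validation code in pybaselines/two_d/optimizers.py, and the error messages are invented.

```python
from numbers import Integral


def check_axes(axes, method_kwargs=None):
    """Hypothetical check mirroring the documented ValueError conditions."""
    # Accept a single integer axis as well as a sequence of axes.
    axes = (axes,) if isinstance(axes, Integral) else tuple(axes)
    if len(set(axes)) != len(axes):
        raise ValueError('axes must not contain duplicate values')
    # A single dict is not a sequence of kwargs, so only check lists and tuples.
    if isinstance(method_kwargs, (list, tuple)) and len(method_kwargs) > len(axes):
        raise ValueError('method_kwargs cannot be longer than axes')
    return axes


print(check_axes((0, 1), method_kwargs=[{'lam': 1e4}, {'lam': 1e6}]))
for bad_call in (lambda: check_axes((1, 1)), lambda: check_axes(1, [{}, {}])):
    try:
        bad_call()
    except ValueError as error:
        print(f'ValueError: {error}')
```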