Commit

DOCS: Add example for using individual_axes
Also finalized remaining bits in the docs.
derb12 committed Feb 18, 2024
1 parent 1a668d1 commit 21f2262
Showing 8 changed files with 155 additions and 99 deletions.
6 changes: 3 additions & 3 deletions README.rst
Original file line number Diff line number Diff line change
@@ -101,7 +101,7 @@ pybaselines requires `Python <https://python.org>`_ version 3.8 or later
and the following libraries:

* `NumPy <https://numpy.org>`_
* `SciPy <https://www.scipy.org>`_
* `SciPy <https://scipy.org>`_


All of the required libraries should be automatically installed when
@@ -124,8 +124,8 @@ To use the various functions in pybaselines, simply input the measured
data and any required parameters. All baseline correction functions in pybaselines
will output two items: a numpy array of the calculated baseline and a
dictionary of potentially useful parameters. The main interface for all baseline correction
algorithms in pybaselines is through the `pybaselines.Baseline` object for one dimensional
data and `pybaselines.Baseline2D` for two dimensional data.
algorithms in pybaselines is through the ``Baseline`` object for one dimensional
data and ``Baseline2D`` for two dimensional data.

For more details on each baseline algorithm, refer to the `algorithms section`_ of
pybaselines's documentation. For examples of their usage, refer to the `examples section`_.
12 changes: 6 additions & 6 deletions docs/algorithms_2d/index.rst
@@ -2,14 +2,14 @@
2D Algorithms
=============

pybaselines extends a subset of the 1D baseline correction algorithms to work with
2D data. Note that this is only intended for data in which there is some global baseline;
otherwise, it is more appropriate and usually significantly faster to simply use the 1D
algorithms on each individual row and/or column in the data, which can be done using
:meth:`~.Baseline2D.individual_axes`.
pybaselines extends a subset of the one dimensional (1D) baseline correction algorithms to work
with two dimensional (2D) data. Note that this is only intended for data in which there is some
global baseline; otherwise, it is more appropriate and usually significantly faster to simply
use the 1D algorithms on each individual row and/or column in the data, which can be done using
:meth:`.Baseline2D.individual_axes` or using :class:`.Baseline` with for-loops.

This section of the documentation is to help provide some context for how the algorithms
were extended to work with two dimensional data. It will not be as comprehensive as the
were extended to work with 2D data. It will not be as comprehensive as the
:doc:`1D Algorithms section <../algorithms/index>`, so to help understand any algorithm,
it is suggested to start there. Refer to the :doc:`API section <../api/index>` of the
documentation for the full parameter and reference listing for any algorithm.
6 changes: 6 additions & 0 deletions docs/algorithms_2d/optimizers_2d.rst
@@ -81,6 +81,12 @@ baseline algorithm along each row and/or column of the measured data. This is us
if the axes of the data are not correlated such that no information is lost by
fitting each axis separately, or when baselines only exist along one axis.

Note that one limitation of :meth:`~.Baseline2D.individual_axes` is that it does not
handle array-like `method_kwargs`, such as when different input weights are desired
for each dataset along the rows and/or columns. However, this is an extremely niche
situation, and could be handled by simply using a for-loop to do one dimensional
baseline correction instead.
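
The for-loop workaround described above can be sketched as follows. This is only an illustration under stated assumptions: the helper `fit_rows_with_weights` is hypothetical, and a weighted polynomial fit from NumPy stands in for whichever one dimensional baseline method would actually be called on each row.

```python
import numpy as np


def fit_rows_with_weights(x, data, weights, poly_order=2):
    """Fit a 1D baseline to each row of `data` using that row's own weights.

    Hypothetical helper: the weighted polynomial fit is only a stand-in for
    a real 1D baseline method that accepts per-dataset inputs.
    """
    baselines = np.empty_like(data, dtype=float)
    for i, (row, row_weights) in enumerate(zip(data, weights)):
        # each row gets its own weights, which a single shared
        # method_kwargs dictionary cannot express
        coefs = np.polynomial.polynomial.polyfit(x, row, poly_order, w=row_weights)
        baselines[i] = np.polynomial.polynomial.polyval(x, coefs)
    return baselines


x = np.linspace(0, 10, 200)
true_baseline = 1 + 0.5 * x
peak = 5 * np.exp(-0.5 * ((x - 5) / 0.3)**2)
data = np.array([true_baseline + peak, 2 * true_baseline + peak])

# illustrative weights: exclude a different region around the peak for each row
weights = np.ones_like(data)
weights[0, np.abs(x - 5) < 2] = 0
weights[1, np.abs(x - 5) < 3] = 0

baselines = fit_rows_with_weights(x, data, weights, poly_order=1)
```

Looping explicitly keeps the per-row inputs next to the data they belong to, at the cost of giving up the single call that `individual_axes` provides.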

.. plot::
   :align: center
   :context: close-figs
4 changes: 2 additions & 2 deletions docs/installation.rst
@@ -11,7 +11,7 @@ Dependencies
pybaselines requires `Python <https://python.org>`_ version 3.8 or later and the following libraries:

* `NumPy <https://numpy.org>`_ (>= 1.20)
* `SciPy <https://www.scipy.org>`_ (>= 1.5)
* `SciPy <https://scipy.org>`_ (>= 1.5)


All of the required libraries should be automatically installed when
@@ -22,7 +22,7 @@ Optional Dependencies

pybaselines has the following optional dependencies:

* `numba <https://github.com/numba/numba>`_ (>= 0.49):
* `Numba <https://github.com/numba/numba>`_ (>= 0.49):
speeds up calculations used by the following functions:

* :meth:`~Baseline.loess`
94 changes: 94 additions & 0 deletions examples/two_d/plot_along_axes_1d_baseline.py
@@ -0,0 +1,94 @@
# -*- coding: utf-8 -*-
"""
Using `individual_axes` for 1D Baseline Correction
--------------------------------------------------
This example will show how to apply one dimensional baseline correction to two
dimensional data using :meth:`.Baseline2D.individual_axes`. Note that this is valid
only if each baseline along the axis uses the same inputs; otherwise, the more appropriate
approach is to use a for-loop with the corresponding :class:`.Baseline` method.
"""
# sphinx_gallery_thumbnail_number = 4

import matplotlib.pyplot as plt
import numpy as np

from pybaselines import Baseline2D
from pybaselines.utils import gaussian


def plot_contour_with_projection(X, Z, data):
    """Plots the contour plot and 3d projection."""
    fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
    ax_1 = fig.add_subplot(1, 2, 1)
    ax_1.contourf(X, Z, data, cmap='coolwarm')
    ax_2 = fig.add_subplot(1, 2, 2, projection='3d')
    ax_2.plot_surface(X, Z, data, cmap='coolwarm')

    ax_1.set_xlabel('Raman Shift (cm$^{-1}$)')
    ax_1.set_ylabel('Temperature ($^o$C)')
    ax_2.set_xlabel('Raman Shift (cm$^{-1}$)')
    ax_2.set_ylabel('Temperature ($^o$C)')
    ax_2.set_zticks([])


def plot_1d(x, data):
    """Plots the data in only one dimension."""
    plt.figure()
    # reverse so that data for lowest temperatures is plotted first
    plt.plot(x, data[::-1].T)
    plt.xlabel('Raman Shift (cm$^{-1}$)')
    plt.ylabel('Intensity (Counts)')


# %%
# The data for this example will simulate Raman spectroscopy measurements that
# were taken while heating a sample. Within the sample, peaks for one specimen
# disappear as the temperature is raised, which could occur due to a chemical
# reaction, phase change, decomposition, etc. Further, as the temperature increases,
# the measured baseline slightly increases.
len_temperature = 25
wavenumber = np.linspace(50, 300, 1000)
temperature = np.linspace(25, 100, len_temperature)
X, T = np.meshgrid(wavenumber, temperature, indexing='ij')
noise_generator = np.random.default_rng(0)
data = []
for i, t_value in enumerate(temperature):
    signal = (
        gaussian(wavenumber, 11 * (1 - i / len_temperature), 90, 3)
        + gaussian(wavenumber, 12 * (1 - i / len_temperature), 110, 6)
        + gaussian(wavenumber, 13, 210, 8)
    )
    real_baseline = 100 + 0.005 * wavenumber + 0.0001 * (wavenumber - 120)**2 + 0.08 * t_value
    data.append(signal + real_baseline + noise_generator.normal(scale=0.1, size=wavenumber.size))
y = np.array(data)

plot_contour_with_projection(X, T, y.T)

# %%
# When considering the baseline of this data, it is more helpful to plot all measurements
# only considering the wavenumber dependence.
plot_1d(wavenumber, y)

# %%
# While the measured data is two dimensional, each baseline can be considered as
# only dependent on the wavenumbers and independent of every other measurement along the
# temperature axis. Thus, individual_axes can be called on just the axis corresponding
# to the wavenumbers (i.e., axis 1, the columns).
baseline_fitter = Baseline2D(temperature, wavenumber)
baseline, params = baseline_fitter.individual_axes(
    y, axes=1, method='pspline_arpls', method_kwargs={'lam': 1e4}
)

# %%
# Looking at the one dimensional representation, each spectrum was correctly baseline
# corrected.
plot_1d(wavenumber, y - baseline)

# %%
# Finally, looking at the two dimensional representation of the data again, the dependence
# of the intensity for each peak with temperature is more easily seen.
plot_contour_with_projection(X, T, (y - baseline).T)

plt.show()
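
The array layout that the example above relies on can be checked with NumPy alone. The sketch below rebuilds a noise-free, baseline-free version of the simulated spectra; the local `gaussian` helper is an assumption about the functional form of `pybaselines.utils.gaussian` (height, center, sigma) and is defined here only so the snippet has no dependency beyond NumPy.

```python
import numpy as np


def gaussian(x, height=1.0, center=0.0, sigma=1.0):
    """Gaussian peak; assumed to match the form of pybaselines.utils.gaussian."""
    return height * np.exp(-0.5 * ((x - center) / sigma)**2)


len_temperature = 25
wavenumber = np.linspace(50, 300, 1000)
temperature = np.linspace(25, 100, len_temperature)

# noise-free, baseline-free version of the example's data: one spectrum per temperature
spectra = np.array([
    gaussian(wavenumber, 11 * (1 - i / len_temperature), 90, 3)
    + gaussian(wavenumber, 12 * (1 - i / len_temperature), 110, 6)
    + gaussian(wavenumber, 13, 210, 8)
    for i in range(len_temperature)
])

# rows index temperature (axis 0) and columns index wavenumber (axis 1)
print(spectra.shape)  # (25, 1000)

# intensity at the grid points closest to the fading and the constant peaks
idx_90 = np.abs(wavenumber - 90).argmin()
idx_210 = np.abs(wavenumber - 210).argmin()
fading = spectra[:, idx_90]
constant = spectra[:, idx_210]
```

Because each row is a complete spectrum, fitting along axis 1 treats every temperature independently. In this simulation `fading` decreases with every temperature step while `constant` stays essentially unchanged, which is the trend the corrected contour plot is meant to reveal.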
128 changes: 41 additions & 87 deletions examples/two_d/plot_whittaker_2d_dof.py
@@ -32,6 +32,16 @@ def mean_squared_error(fit_baseline, real_baseline):
    return ((fit_baseline - real_baseline)**2).mean()


def plot_contour_with_projection(X, Z, data, title=''):
    """Plots the contour plot and 3d projection."""
    fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
    fig.suptitle(title)
    ax_1 = fig.add_subplot(1, 2, 1, projection='3d')
    ax_1.plot_surface(X, Z, data, cmap='coolwarm')
    ax_2 = fig.add_subplot(1, 2, 2)
    ax_2.contourf(X, Z, data, cmap='coolwarm')


x = np.linspace(-20, 20, 100)
z = np.linspace(-20, 30, 100)
X, Z = np.meshgrid(x, z, indexing='ij')
@@ -52,19 +62,8 @@ def mean_squared_error(fit_baseline, real_baseline):
# Only the baselines will be plotted in this example since the actual data is irrelevant
# for this discussion.

fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('Actual Polynomial Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(X, Z, polynomial_baseline, cmap='coolwarm')
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(X, Z, polynomial_baseline, cmap='coolwarm')

fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('Actual Sinusoidal Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(X, Z, sine_baseline, cmap='coolwarm')
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(X, Z, sine_baseline, cmap='coolwarm')
plot_contour_with_projection(X, Z, polynomial_baseline, title='Actual Polynomial Baseline')
plot_contour_with_projection(X, Z, sine_baseline, title='Actual Sinusoidal Baseline')

# %%
# The ``lam`` values for fitting the baseline can be kept constant whether using
@@ -83,19 +82,12 @@ def mean_squared_error(fit_baseline, real_baseline):
print(f'Mean-squared-error, polynomial: {mse_analytical_poly:.5f}')
print(f'Mean-squared-error, sinusoidal: {mse_analytical_sine:.5f}\n')

fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('Analytical Polynomial Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(X, Z, analytical_poly_baseline, cmap='coolwarm')
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(X, Z, analytical_poly_baseline, cmap='coolwarm')

fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('Analytical Sinusoidal Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(X, Z, analytical_sine_baseline, cmap='coolwarm')
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(X, Z, analytical_sine_baseline, cmap='coolwarm')
plot_contour_with_projection(
X, Z, analytical_poly_baseline, title='Analytical Polynomial Baseline'
)
plot_contour_with_projection(
X, Z, analytical_sine_baseline, title='Analytical Sinusoidal Baseline'
)

# %%
# Now, try using eigendecomposition to calculate the same baselines. To start
@@ -119,50 +111,26 @@ def mean_squared_error(fit_baseline, real_baseline):
print(f'Mean-squared-error, polynomial: {mse_analytical_poly:.5f}')
print(f'Mean-squared-error, sinusoidal: {mse_analytical_sine:.5f}\n')

fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('40x40 Eigenvalues Polynomial Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(X, Z, eigenvalue_poly_baseline_1, cmap='coolwarm')
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(X, Z, eigenvalue_poly_baseline_1, cmap='coolwarm')

fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('40x40 Eigenvalues Sinusoidal Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(X, Z, eigenvalue_sine_baseline_1, cmap='coolwarm')
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(X, Z, eigenvalue_sine_baseline_1, cmap='coolwarm')
plot_contour_with_projection(
X, Z, eigenvalue_poly_baseline_1, title='40x40 Eigenvalues Polynomial Baseline'
)
plot_contour_with_projection(
X, Z, eigenvalue_sine_baseline_1, title='40x40 Eigenvalues Sinusoidal Baseline'
)

# %%
# By using 40 eigenvalues along the rows and 40 along the columns, the error of the fit
# remains the same as the analytical solution while slightly reducing the computation time.
# However, the number of eigenvalues being used is more than is actually required to represent
# the two baselines, which means that the calculation time can be further reduced. Plot the
# effective degrees of freedom to see which contribute most to the calculation.
fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('Effective Degrees of Freedom for\nPolynomial Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(
plot_contour_with_projection(
*np.meshgrid(np.arange(num_eigens[0]), np.arange(num_eigens[1]), indexing='ij'),
params_3['dof'], cmap='coolwarm'
params_3['dof'], title='Effective Degrees of Freedom for Polynomial Baseline'
)
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(
plot_contour_with_projection(
*np.meshgrid(np.arange(num_eigens[0]), np.arange(num_eigens[1]), indexing='ij'),
params_3['dof'], cmap='coolwarm'
)

fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('Effective Degrees of Freedom for\nSinusoidal Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(
*np.meshgrid(np.arange(num_eigens[0]), np.arange(num_eigens[1]), indexing='ij'),
params_4['dof'], cmap='coolwarm'
)
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(
*np.meshgrid(np.arange(num_eigens[0]), np.arange(num_eigens[1]), indexing='ij'),
params_4['dof'], cmap='coolwarm'
params_4['dof'], title='Effective Degrees of Freedom for Sinusoidal Baseline'
)

# %%
@@ -191,19 +159,12 @@ def mean_squared_error(fit_baseline, real_baseline):
print(f'Mean-squared-error, polynomial: {mse_analytical_poly:.5f}')
print(f'Mean-squared-error, sinusoidal: {mse_analytical_sine:.5f}\n')

fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('10x4 Eigenvalues Polynomial Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(X, Z, eigenvalue_poly_baseline_2, cmap='coolwarm')
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(X, Z, eigenvalue_poly_baseline_2, cmap='coolwarm')

fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('8x35 Eigenvalues Sinusoidal Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(X, Z, eigenvalue_sine_baseline_2, cmap='coolwarm')
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(X, Z, eigenvalue_sine_baseline_2, cmap='coolwarm')
plot_contour_with_projection(
X, Z, eigenvalue_poly_baseline_2, title='10x4 Eigenvalues Polynomial Baseline'
)
plot_contour_with_projection(
X, Z, eigenvalue_sine_baseline_2, title='8x35 Eigenvalues Sinusoidal Baseline'
)

# %%
# By reducing the number of eigenvalues to represent the baseline, the calculation
@@ -226,23 +187,16 @@ def mean_squared_error(fit_baseline, real_baseline):
t1 = perf_counter()
mse_analytical_poly = mean_squared_error(eigenvalue_poly_baseline_3, polynomial_baseline)
mse_analytical_sine = mean_squared_error(eigenvalue_sine_baseline_3, sine_baseline)
print(f'3x3 Eigenvalues for polynomial, 5x10 for sinusoidal:\nTime: {t1 - t0:.3f} seconds')
print(f'3x3 Eigenvalues for polynomial, 5x12 for sinusoidal:\nTime: {t1 - t0:.3f} seconds')
print(f'Mean-squared-error, polynomial: {mse_analytical_poly:.5f}')
print(f'Mean-squared-error, sinusoidal: {mse_analytical_sine:.5f}')

fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('3x3 Eigenvalues Polynomial Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(X, Z, eigenvalue_poly_baseline_3, cmap='coolwarm')
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(X, Z, eigenvalue_poly_baseline_3, cmap='coolwarm')

fig = plt.figure(layout='constrained', figsize=plt.figaspect(0.5))
fig.suptitle('5x12 Eigenvalues Sinusoidal Baseline')
ax = fig.add_subplot(1, 2, 2)
ax.contourf(X, Z, eigenvalue_sine_baseline_3, cmap='coolwarm')
ax_2 = fig.add_subplot(1, 2, 1, projection='3d')
ax_2.plot_surface(X, Z, eigenvalue_sine_baseline_3, cmap='coolwarm')
plot_contour_with_projection(
X, Z, eigenvalue_poly_baseline_3, title='3x3 Eigenvalues Polynomial Baseline'
)
plot_contour_with_projection(
X, Z, eigenvalue_sine_baseline_3, title='5x12 Eigenvalues Sinusoidal Baseline'
)

plt.show()

1 change: 1 addition & 0 deletions pybaselines/two_d/__init__.py
@@ -51,6 +51,7 @@
* collab_pls (Collaborative Penalized Least Squares)
* adaptive_minmax (Adaptive MinMax)
* individual_axes (1D Baseline Correction Along Individual Axes)
@author: Donald Erb
3 changes: 2 additions & 1 deletion pybaselines/two_d/optimizers.py
@@ -311,7 +311,8 @@ def individual_axes(self, data, axes=(0, 1), method='asls', method_kwargs=None):
Raises
------
ValueError
Raised if `method_kwargs` is a sequence with length greater than `axes`.
Raised if `method_kwargs` is a sequence with length greater than `axes` or if
the values in `axes` are duplicates.
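
The two documented error conditions can be illustrated with a small standalone check. This is a hedged sketch, not the validation code used inside pybaselines: the function name `check_axes_inputs`, the error messages, and the choice to reuse a single dictionary for every axis are all assumptions made for illustration.

```python
def check_axes_inputs(axes, method_kwargs=None):
    """Hypothetical validation mirroring the documented ValueError conditions."""
    # allow a single integer axis as well as a sequence of axes
    axes = (axes,) if isinstance(axes, int) else tuple(axes)
    if len(set(axes)) != len(axes):
        raise ValueError('values in axes must be unique')

    if method_kwargs is None:
        method_kwargs = [{} for _ in axes]
    elif isinstance(method_kwargs, dict):
        # assumption: one dictionary is reused for every axis
        method_kwargs = [method_kwargs for _ in axes]
    else:
        method_kwargs = list(method_kwargs)
        if len(method_kwargs) > len(axes):
            raise ValueError('method_kwargs cannot be longer than axes')
        # a shorter sequence is returned unchanged; how the library
        # treats that case is outside the scope of this sketch
    return axes, method_kwargs


print(check_axes_inputs(1, {'lam': 1e4}))  # ((1,), [{'lam': 10000.0}])
```

Rejecting duplicate axes up front avoids silently fitting the same axis twice, which is presumably why that check is documented alongside the length check.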
Notes
-----