Pyatoa v0.3.0 #39

Merged
merged 59 commits into from
Aug 29, 2023


@bch0w bch0w commented Aug 29, 2023

Pyatoa v0.2.0 -> v0.3.0
This PR implements backwards-incompatible changes to Pyatoa.
The overall intention of this PR is to streamline Pyatoa by removing unnecessary abstractions, features and dependencies.

See the CHANGELOG for detailed descriptions; the abridged version is:

  • Removed all data gathering capabilities and related features/utilities/tests from Pyatoa. See PySEP for a data gathering package.
  • Removed internal abstractions for Config and Manager. The public API remains the same, but internal routines have been simplified.

bch0w and others added 30 commits August 2, 2022 11:56
…ing some issue with proj versions on systems that did not already have proj
… if config.save_to_ds is set to False. This is unintended because write() is called explicitly and should force write, whereas save_to_ds is used to stop passive writing during the processing phase.

Also threw in a check for ASDFDataSets opened in read-only during write() which would throw an error
…references. Pyaflowa functionality has been completely shifted to SeisFlows
…nd to match seisflows structure

changed docs conf to point to the correct location
updated changelog with most recent changes
…window type' because they changed name 'Hanning' -> 'Hann' causing errors in ObsPy
…dependency conflicts during requirements installation
bugfixing time offset and waveform misfit naming
…pped from expected behavior. i.e., setting only min period used to set a highpass which is NOT what we want. This has been fixed
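
A minimal sketch of the corrected period-to-filter mapping described in the commit above, assuming periods are given in seconds. `choose_filter` is a hypothetical helper written for illustration and is not Pyatoa's implementation; the returned keyword names mirror those accepted by ObsPy's `Stream.filter` (`freq`, `freqmin`, `freqmax`). Setting only a minimum period caps the highest frequency that passes, so it should map to a lowpass rather than a highpass.

```python
def choose_filter(min_period=None, max_period=None):
    """Map period bounds (seconds) to a filter type and frequency kwargs.

    Hypothetical helper for illustration only.
    Returns (filter_type, kwargs); filter_type is None when no bound is set.
    """
    if min_period is not None and max_period is not None:
        if min_period >= max_period:
            raise ValueError("min_period must be smaller than max_period")
        # Both bounds: keep periods between min_period and max_period
        return "bandpass", {"freqmin": 1 / max_period,
                            "freqmax": 1 / min_period}
    if min_period is not None:
        # Only the shortest period is set: remove higher frequencies
        return "lowpass", {"freq": 1 / min_period}
    if max_period is not None:
        # Only the longest period is set: remove lower frequencies
        return "highpass", {"freq": 1 / max_period}
    # No bounds given: no filtering
    return None, {}


if __name__ == "__main__":
    print(choose_filter(min_period=10))
    print(choose_filter(max_period=100))
    print(choose_filter(min_period=10, max_period=100))
```

With ObsPy, the result could then be applied as `st.filter(filter_type, **kwargs)` whenever `filter_type` is not `None`.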
* stripped away all moment tensor related functionality from the gatherer and plugins

* further removed moment tensor related functionality and moved into pysep

* stripped out I/O read and write functions related to SPECFEM which were redundant with respect to PySEP. replaced internal import statements with imports to pysep

* stripped out all references to FDSN client in Pyatoa and all gathering that queried FDSN. This functionality will be shifted into PySEP

* Removed now defunct docs related to gathering, updated changelog to reflect major changes made in this branch, removed failing tests that relied on removed functionality

* update readme to remove reference to data fetching

* fixed missing comma in setup file
* removed setup.py and .cfg, implemented new pyproject.toml file replacing setup.py. pointed readthedocs to only requirements.txt file in docs directory. removed changelog in preference to set changelog in docs page

* removed requirements.txt file

* fixed missing suffixes in git dependencies of pyproject.toml file

* docs added readme, moved notebooks into own directory, updated convert script. updating docs wording

* updated overview page (slimmed down considerably) and bumped version number in conf

* setting default filter periods to None and removing float requirement on input

* pyflex requires a min period so setting to 1-100 rather than previous 10-30

* moved unused plot scripts out of repo (into simutil), removed unnecessary mgmt plot and moved into manager.plot() function

* updated docs environment and main environment

* update readme to match seisflows readme

* removed manager and config notebooks and converted to a direct rst document
added first glance to replace getting started

* throws in a '*' on synthetic data gathering to deal with specfem3d_globe synthetics which suffix with 'ascii'

* added data discovery rst doc page which is meant to replace the gatherer notebook and page

* removed gatherer notebook and rst page

* renames storage notebook

* manually edited inspector rst doc file to be shorter and remove the notebook dependence. figures will be moved into the gallery

* changed core_func name to misfit, for misfit quantification

* removed inspector notebook in favor of hand edited inspector doc page

* updated naming standards page

* cleaned out the scripts directory which had a lot of scripts that were unfinished or not used. ones that were important were moved into docs

* bugfix inspector raypath plot checking incorrect logic

* fixed bugfix

* removed old image files from notebooks that are no longer required
added an inspector gallery notebook and added rst to gallery page
further cleaned up script repository
added a load example inspector script

* removed logging docs page and notebook, moved a short section into the misfit docs page

* manually edited the storage rst file to be more concise and to move away from the notebook configuration.

* removed storage notebook in favor of new .rst storage file

* added outputs to code cells in inspector doc where relevant, changed ASDFDataSet example reading function to read only to not affect test data

* added inspector figure text into gallery and removed text from inspector rst file

* removing warnings from insp_gallery

* renamed insp gallery notebook to avoid conversion deleting the manually edited version

* condensed changelog to remove 0.3.0 changelog since we haven't even bumped to 0.2.0
removed make figures scripts directory

* added github relevant files including different issue templates
added contributing page modified from PyGMT

* editing contributing document
added cross-referencing into the misfit docs page

* fixing typos misfit doc page

* added cross-referencing to the storage docs page

* finished adding cross-referencing into all docs pages of relevance

* removed unused 'read_seisflows_yaml' from Config class

* major config and preprocessing reworking to allow for data-data misfit: removed unused parameters 'start_pad', 'end_pad' from config, as well as filter corners
removed seisflows_yaml and _par reading functionality from config, not necessary
preprocess function now written more generally, does not take manager as input but rather takes stream and a few other arguments. manager preprocessing function changed to match

* finished updating preprocessing function
cleaned up logging statements, shortened and exchanged some 'info' for 'debug' statements

* working data data example with some TA array data

* renamed some files, cleaned up some docs text

* last minute doc fixes

* update changelog

* fix tests which changed due to a change in default config parameters, removed some unused test data, updated baseline images
* Pyadjoint v0.2.0 updates some API, fixing within Pyatoa tests, Config and Manager

* renaming some internal Config parameters to match with Pyadjoint naming schema. Fixed texts for new Config system in Pyadjoint

* added new test dataset and script to make it. updated tests to reflect new dataset
…ands as input and returns an averaged adjoint source to address #24 (#26)
* added an MPI data processing script and related data

* added docs page with mpi code snippet and explanation

* Revert "added docs page with mpi code snippet and explanation"

This reverts commit 8a86339.

* Revert "Revert "added docs page with mpi code snippet and explanation""

This reverts commit 91dfdb1.

* removing autoapi

* cleaning up mpi example doc page

* divy -> divvy as per NG

* update changelog
added shebang to mpi example script
Updates MPI example script with information about where results are stored, and fix in the code about number of events/stations
* added new test to catch Issue #34, which describes introduction of sub-sample time shifts when resampling data that already has the correct sampling rate

* fixed test filtering above nyquist, low freq example still works
added a flow control statement to skip resampling if sampling rates are already the same to prevent unnecessary resampling of data which can cause sub-sample time shifts

* update changelog and add ridvan to contributors
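
A minimal sketch of the flow-control check described in the commit above (Issue #34): data already at the target sampling rate is returned untouched so that no sub-sample time shift can be introduced. The function names, the `resampler` callback and the tolerance-based comparison are assumptions made for illustration; the commit only states that resampling is skipped when the rates are already the same.

```python
import math


def rates_match(current_rate, target_rate, rel_tol=1e-9):
    """Return True if two sampling rates (Hz) are effectively equal.

    The tolerance is an assumption; a strict equality check is also plausible.
    """
    return math.isclose(current_rate, target_rate, rel_tol=rel_tol)


def resample_if_needed(samples, current_rate, target_rate, resampler):
    """Resample only when the sampling rates differ.

    `resampler` is a hypothetical callable (samples, current_rate,
    target_rate) -> new samples, standing in for the real routine.
    """
    if rates_match(current_rate, target_rate):
        # Skip resampling entirely: the data is returned unchanged
        return samples
    return resampler(samples, current_rate, target_rate)


if __name__ == "__main__":
    def drop_every_other(samples, current_rate, target_rate):
        # Toy stand-in for a real resampling routine
        return samples[::2]

    data = [0.0, 1.0, 2.0, 3.0]
    print(resample_if_needed(data, 20.0, 20.0, drop_every_other))
    print(resample_if_needed(data, 20.0, 10.0, drop_every_other))
```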
* added feature to allow amplitude normalization during the standardization procedure

* GATHERER OVERHAUL:
core.gatherer -> utils.gather, demoted from core class to utility package
utils.gather functions are being stripped down to bare essentials, no more reliance on internal path attributes etc.
manager.gather -> manager.gather_from_dataset, used to get internal data from a dataset
the underlying motivation here is to make data gathering much more explicit because it is currently very stupidly implicit and difficult to track/manipulate

* fully stripped out all unnecessary gathering routines which were just redundant fluff and weird abstractions that did not require their own class. the leftovers are two main functions which are used to read events that are not acceptable in ObsPy, and to gather waveform data directly from SEED structured directories

* moved gather routines into the already existing 'read' utility function

* remove gatherer import from package init

* removed manager.gather_from_dataset function because I realized that load already does this

* fully removed any reference to 'paths' from Config class as this feature of the package was pretty abstract and not really useful other than in a very specific working case

* removed config parameters 'save_to_ds' because gatherer has been cut out
removed 'pyflex_preset' config parameter because this was not a useful setup
reworked config pyflex and pyadjoint config setting to be much simpler as it was sort of abstracted behind functions before, now it's just directly calling the underlying config objects
changed default values for config min and max period to 1-100

* removed lingering references to 'pyflex_preset' Config attribute

* remove lingering calls to 'pyflex_preset'
REMOVED all saving to dataset that occurs during processing, this is saved completely for the 'write' function which is renamed 'write_to_dataset' for clarity
removed 'save' argument in window and measure to reflect point 1

* added choices parameter to Manager write_to_dataset function to allow selectively saving. also by default this function writes config object now

* RESTRUCTURE preprocessing to move default preprocessing directly into the Manager.preprocess function, rather than obscuring it behind a utility function. parameters are set directly in the preprocessing function to be more explicit.
response removal is turned OFF by default
changed some function names and
allowed both st_obs and st_syn to run through response removal and STF convolution depending on their data type

* fully migrated read functions OUT of Pyatoa and into PySEP (or SeisFlows). the intention here is that Pyatoa is simply a misfit quantification package, anything to do with reading files should be left to PySEP

* map maker was fetching lat/lon values from inventory at the channel level but Inventories read from SPECFEM STATIONS files will not have the channel level. reduced this to station level which will also have lat and lon values which should anyways be the same as the channel level

* map maker was trying to access magnitude information which is not always present in an Event object. now allows a logic loop to ignore magnitude information

* removed unnecessary comment mapmaker

* converted pyflex_presets script into a docs page and removed gatherer docs page from index TOC because gatherer has been nixed

* starting to fix tests but many still broken from refactoring

* bump version number 0.3.0 because this PR will have backwards incompatible changes
removed lingering deps. on PySEP by adding back one formatting function that had been copied over to PySEP

* update CHANGELOG with v0.3.0 changes

* updating windows documentation for better formatting

* remove lingering references to gatherer class

* fixed ASDF util tests

* fixed wave maker tests

* fixed Config tests

* changed baseline images for wave maker plot

* fixing manager tests

* removed 'force' argument from flow and flow_multiband Manager functions, and moved kwargs into args to make things more explicit

* overhauled 'flow_multiband' in Manager to mimic behavior of 'flow', which is to return internal attributes 'windows' and 'adjsrcs' which can be used for later misfit assessment
previously this function returned dictionaries of dictionaries which needed to be manipulated, now the function averages all adjoint sources from all period bands, and also collects all windows, and puts them in the same format as the expected 'windows' and 'adjsrcs' attributes so that the results of flow multiband can be accessed the same way as the results of 'flow'

* all tests passing

* small update docs based on recent changes
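
A rough sketch of the multiband behavior described in the 'flow_multiband' commit above: adjoint sources from all period bands are averaged and all windows are collected, so that the combined result has the same per-component layout as a single-band run. The data layout (dictionaries keyed by band label, then by component), the function name `combine_bands`, and the choice to average only over bands that contain a given component are assumptions made for illustration, not Pyatoa's actual data structures.

```python
def combine_bands(adjsrcs_by_band, windows_by_band):
    """Average adjoint sources and pool windows across period bands.

    Hypothetical layout, for illustration only:
      adjsrcs_by_band: {band: {component: [float, ...]}}
      windows_by_band: {band: {component: [window, ...]}}
    Returns (windows, adjsrcs), each keyed by component only.
    """
    summed, counts = {}, {}
    for adjsrcs in adjsrcs_by_band.values():
        for comp, samples in adjsrcs.items():
            if comp not in summed:
                summed[comp] = list(samples)
                counts[comp] = 1
                continue
            if len(samples) != len(summed[comp]):
                raise ValueError(f"length mismatch for component {comp}")
            # Accumulate sample-wise sums for this component
            summed[comp] = [a + b for a, b in zip(summed[comp], samples)]
            counts[comp] += 1

    # Average over the number of bands that contributed to each component
    adjsrcs = {comp: [v / counts[comp] for v in vals]
               for comp, vals in summed.items()}

    # Collect windows from every band under their component
    windows = {}
    for band_windows in windows_by_band.values():
        for comp, comp_windows in band_windows.items():
            windows.setdefault(comp, []).extend(comp_windows)

    return windows, adjsrcs


if __name__ == "__main__":
    adj = {"10-30": {"Z": [1.0, 3.0]},
           "20-40": {"Z": [3.0, 5.0], "N": [2.0, 2.0]}}
    win = {"10-30": {"Z": [(0, 5)]},
           "20-40": {"Z": [(6, 9)], "N": [(1, 2)]}}
    print(combine_bands(adj, win))
```

Returning the pair in the same shape as a single-band result means downstream misfit assessment does not need a separate code path for multiband runs.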
…d notebooks remotely, created new environment_local.yml file for those that need a local conda environment to build docs
…at string only columns were not dropped automatically, needed to set a flag to get this to resume previous behavior, caught by test
@bch0w bch0w merged commit 6eae6c9 into master Aug 29, 2023
@bch0w bch0w deleted the devel branch August 29, 2023 21:47