Releases: cchdo/hydro
ALT merging and CDOM
This release improves support for merging and manipulating CDOM parameters in datasets. This support is new, so expect some bugs. It also fixes a bug where merging alternate parameters would instead update the "non alternate" parameter.
The next release will adopt SPEC0, so this is the last version to support (be tested against) Python 3.10.
v1.0.2.12 (2024-10-29)
- Add support for adding CDOM params/wavelengths
- Add support for merging CDOM in merge_fq
- (Bug) Fix merge_fq putting alternate parameter data in the wrong place
- (Bug) Fix a crash in the COARDS writer on some architectures (x64) when CTDNOBS has fill values
- Fix exception caused by string dtype parameters with all fill values
Full Changelog: v1.0.2.11...v1.0.2.12
Release Friday
Naturally, releasing on a Friday leads to forgetting something... like debug print statements in hot paths. This quick follow-up release removes those debug print statements.
Changes:
- Removed 2 rogue debug print statements from digging into that WOCE bug
String Gotchas
The changes for numpy 2.0 failed to calculate a string length correctly and important things like station ids were being truncated! This fixes that problem and is important enough that I decided to do a release on a Friday before a long weekend, a totally safe and accepted programming norm.
Changes:
- (Major Bug) Fix not calculating the correct string length for string fields that have an inconsistent length (e.g. station)
- (Bug) Fix the legacy woce writer not writing the data block if no quality flags are in the file
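The truncation described above is easy to reproduce with plain NumPy. The following is a minimal sketch, not the library's code: the station ids are made up, and it only shows that a fixed-width string dtype sized too small silently drops characters, while sizing it from the longest value keeps every id intact.

```python
import numpy as np

# Hypothetical station ids of inconsistent length
stations = ["1", "12", "123A"]

# Sizing the fixed-width dtype from the first value only:
# longer entries are silently truncated
too_short = np.array(stations, dtype=f"U{len(stations[0])}")

# Sizing the dtype from the longest value keeps every id intact
max_len = max(len(s) for s in stations)
correct = np.array(stations, dtype=f"U{max_len}")

print(too_short.tolist())  # ['1', '1', '1']
print(correct.tolist())    # ['1', '12', '123A']
```

NumPy raises no error in the first case, which is why this kind of bug can go unnoticed until ids stop matching.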
Speedy Merges
This release focused on the merge code paths and making them faster (we had a CTD file that needed fixing). It also fixed a bug in the legacy COARDS writer.
Changes:
- (New) Added `hydro.core.add_param` and `hydro.core.remove_param` functions
- (Bug) Fix crash in the COARDS writer when the comments are just an empty string
- Add cchdo.auth to cli optional requirements
- Vectorize the merge_fq accessor for greater speed
- use absolute imports throughout the library
- Speedups in string processing (precision extraction) from numpy 2
netCDF4 required for selftest
A technical issue prevented 1.0.2.7 from being published successfully.
- netCDF4 is now required as part of the selftest option when installing
Legacy Accessor Fixes
This release fixes some bugs that would prevent an exchange to COARDS netCDF from generating successfully from a CF/netCDF file. It also includes a CLI tool for testing this functionality on all public CF files at cchdo.
Changes:
- (Bug) fix to_exchange accessor failing for variables with seconds as the unit
- (Bug) fix to_coards accessor failing for variables with seconds as the unit
- Add status-cf-derived command that tests all public CF files at CCHDO going from netCDF to every other supported format
Duplicate Params
This release adds initial support for duplicate parameter names using the "ALT" syntax proposed in cchdo/params#25.
- Support for duplicate parameters
- (Bug) fix to_exchange accessor failing with a Dataset containing CDOM variables
- (Bug) fix for the flag column getting lost when alternate units for the same parameter were present in one file
  If, for example, a file had CTDTMP [ITS-90] and CTDTMP [IPTS-68] and both had CTDTMP_FLAG_W columns, only one of the parameters would get a flag column
- Added "coards" and "woce" file name generation support to the `gen_fname()` accessor
- `to_woce()` now always returns zipfile bytes for ctd data
- Omit the "STAMP" text from generated WOCE files
- (changed) Bump min `cchdo.params` version to 2024.3
Full Changelog: v1.0.2.5...v1.0.2.6
Some CLI TLC
This release has some CLI improvements. A highlight is the ability to convert a generic CSV format, which is still pretty specific in what is expected and not quite documented yet. You can also override the comments with an external file when converting exchange (or csv).
The COARDS legacy output has been rewritten in xarray (from netCDF4-python) in an attempt to speed it up. Even if it is not sped up, the code sustainability and readability improvements make it worth it. See the changelog for some more details (and the CLI `--help` for details there).
- Rewrite the COARDS netCDF output to create xarray objects rather than netCDF datasets directly.
  In some quick testing, this results in about a 3x speed up. This depends more on variable count vs data length, so most of the performance increase is actually in the bottle output
- Fixed a bug in COARDS where the fill value was not being set in the bottom depth variable
- Add `fill_values` and `precision_source` arguments to `read_csv`
- Add string literal types for the `ftype` parameter of `read_csv`
- CLI improvements:
  - made "precision_source" an option rather than a positional argument
  - added a `--comments` option to allow the override of comments from either a string or file path prefixed with @.
  - Add a convert_csv subcommand which takes an additional ftype option to specify (C)TD or (B)ottle
- Removed the `matlab` optional install extra, this previously had a single dependency of "scipy" in it.
  Scipy is used by xarray for netCDF3 output so this dependency has been moved to the `netcdf` optional install extra.
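The `--comments` behavior noted above (a literal string, or a file path prefixed with @) follows a common CLI convention. Below is a rough sketch of how such a value could be resolved. The `resolve_comments` helper is hypothetical and written only for illustration; it is not taken from the cchdo.hydro source.

```python
import tempfile
from pathlib import Path


def resolve_comments(value: str) -> str:
    """Return comment text from a literal string or an @-prefixed file path.

    Hypothetical helper for illustration; not the library's implementation.
    """
    if value.startswith("@"):
        # Everything after the "@" is treated as a path to read
        return Path(value[1:]).read_text()
    return value


# A literal string is used as-is
inline = resolve_comments("Merged bottle salinity")

# An @-prefixed value is read from disk
with tempfile.TemporaryDirectory() as tmp:
    comment_file = Path(tmp) / "comments.txt"
    comment_file.write_text("Comments from a file\n")
    from_file = resolve_comments(f"@{comment_file}")
```

One consequence of this convention is that a literal comment starting with @ cannot be passed directly; whether the real CLI handles that case is not covered by these notes.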
Bump In Params
Trying to be more regular with releases. cchdo.params changed the way it does versioning, and this release adjusts for that.
- (improved) the read_csv method now handles ctd data better, specifically you do not need to include a SAMPNO column if the FileType is CTD.
- Switched linting in pre-commit and CI to use ruff
- (changed) Bump min `cchdo.params` version to 2023.9
It's Progress
Looks like it has been almost a year since the last point release. Since I want to feel good about this, it is because the code base is stable and robust and definitely not for any other reason like the lead developer spending 6 months at sea collecting reference quality hydrographic data. The changes being worked on here are related to automation efforts at CCHDO, a highlight being the porting of the COARDS netCDF and WOCE file generators. That's right, we are committed to providing hydrographic data in your favorite formats and are working on automating the process of making these data available for all our cruises.
- Add `read_csv` method
- (bug) Remove the `C_format` and `C_format_source` attributes for non floating point variables. Integer and string values are exact so do not need any sort of format hint. Including a format string for non floating point values is undefined behavior in the netCDF-C Library and can result in crashing.
- (new) Add `to_coards()` and `to_woce()` accessors to maintain legacy formats at CCHDO.
- (new) All the `to_*` accessors now support a path argument that will accept a writeable binary mode file like object or a filesystem path to write to.
- (new) Add a `compact_profile()` accessor that drops the trailing fill values from a profile
- (new) Add the `file_seperator` and `keep_seperator` arguments to `cchdo.hydro.exchange.read_exchange()`.
  The `keep_seperator` argument defaults to True.
  This is specifically to allow the reading of CTD exchange files that have been concatenated together (rather than zipped).
  Assuming there is nothing after "END_DATA" and you cat a bunch of _ct1.csv files together, they should be readable if "END_DATA" is passed into the `file_seperator` argument.
- (new) Add `--dump-data-counts` option to the exchange status generator which will dump a json document containing an object with nc_var name strings to count integers of how many variables with this name actually contain any data (i.e. are not just entirely fill value).
- Add a `--version` option to the cli interface
- (changed) Export `read_exchange` from the top level `cchdo.hydro` namespace.
- (changed) Bump min `cchdo.params` version to 0.1.21
- (changed) Dropped netCDF4 as required for installation, if netCDF4 isn't installed already you can install with the `cchdo.hydro[netcdf4]` optional.
  - While this might seem like an odd choice for a library that started as one to convert WHP Exchange files to netCDF, netCDF itself is not called until the very end of the conversion process. Internally, everything is an `xarray.Dataset`. This means you can install this library to read exchange files in tricky environments like pyodide or jupyterlite which already tend to have pandas and numpy in them.
- (bug) fix `pressure` variable not having a `_FillValue` attribute
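The concatenated CTD file case from the `file_seperator` note above can be pictured with a small pure-Python sketch. This is an illustration of the idea only, under the assumption that the separator marks the end of each file. The `split_concatenated` function and the sample text are hypothetical and are not how `read_exchange()` is implemented.

```python
def split_concatenated(
    text: str,
    file_seperator: str = "END_DATA",
    keep_seperator: bool = True,
) -> list[str]:
    """Split concatenated file contents into one string per original file.

    Hypothetical helper; argument names mirror the release note spelling.
    """
    chunks = []
    for part in text.split(file_seperator):
        if not part.strip():
            continue  # nothing left after the final separator
        part = part.lstrip("\n")
        if keep_seperator:
            # Put the separator line back so each chunk is a complete file
            part = part + file_seperator + "\n"
        chunks.append(part)
    return chunks


# Two made-up, heavily abbreviated files joined together with `cat`
combined = "CTD,1\nCTDPRS\n1.0\nEND_DATA\nCTD,2\nCTDPRS\n2.0\nEND_DATA\n"

with_sep = split_concatenated(combined)
without_sep = split_concatenated(combined, keep_seperator=False)

print(len(with_sep))  # 2
```

Keeping the separator matters here because "END_DATA" is itself part of a valid exchange file, which is presumably why `keep_seperator` defaults to True.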