
Add a boundary layer module to estimate boundary layer height #3572

Open · wants to merge 5 commits into main

Conversation

ThomasRieutord

Add a boundary layer module to estimate boundary layer height using various classical methods applied to vertical profiles

Description Of Changes

Boundary layer height estimation is a parameter commonly derived from vertical profiles (from radiosondings or model profiles) with key importance in application domains such as air quality. The issue #1515 was expressing the need for a way to estimate boundary layer height with MetPy. This PR suggests a module in calc to do so.

The newly added module boundarylayer.py includes four different ways to estimate the boundary layer height, each linked to references (a brief usage sketch follows the list):

  • from bulk Richardson number
  • from the parcel method
  • from humidity gradient
  • from temperature inversion
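
For illustration only, here is a minimal usage sketch; the function names and argument order are those exercised by the testing script included in this PR, while the idealized sample profile and the import path are made up:

    import numpy as np
    from metpy.calc import boundarylayer  # assumed import path, see "Help needed" below
    from metpy.units import units

    # Made-up idealized profile: well-mixed layer capped by an inversion near 1 km
    height = np.linspace(0., 3000., 31) * units.meter
    potential_temperature = np.where(height.m < 1000., 300., 312.) * units.kelvin
    u = np.full(31, 5.) * units('m/s')
    v = np.zeros(31) * units('m/s')
    relative_humidity = np.where(height.m < 1000., 70., 20.) * units.percent

    blh_ri = boundarylayer.blh_from_richardson_bulk(height, potential_temperature, u, v)
    blh_parcel = boundarylayer.blh_from_parcel(height, potential_temperature)
    blh_rh = boundarylayer.blh_from_humidity_gradient(height, relative_humidity)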

Help needed

The imports in the module do not follow the MetPy standards; they are written as if MetPy were an external module. Can someone help me correct them?

The decorators present on the functions of other modules (such as exporter) are missing. As I'm not familiar with them, help is welcome here too.

The testing program added together with the module only runs the functions in the new module on sample data; the unit tests are not defined yet.

Thanks for your help.

Checklist

@ThomasRieutord ThomasRieutord requested a review from a team as a code owner July 19, 2024 11:44
@ThomasRieutord ThomasRieutord requested review from dopplershift and removed request for a team July 19, 2024 11:44

Contributor

@DWesl DWesl left a comment

Overall a good start, though the documentation could use a description of the assumptions or limitations of each method (e.g., this method gives odd answers at night, that one gives odd answers during the day, and the concept of a boundary layer distinct from the free troposphere breaks down inside a thunderstorm).

This is a solid collection of functions to implement existing methods for soundings isolated in time.

from metpy.units import units


def smooth(val, span):
Contributor

@DWesl DWesl Jul 19, 2024

XArray calls this a rolling mean. So does pandas.
Bottleneck calls this a moving-window mean.
SciPy appears to call the same thing a uniform filter.

These would likely work better in the See Also section than as a change in the name.
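
For reference, a minimal sketch of two of those equivalents (exact edge handling and NaN treatment differ between libraries, so this is illustrative only):

    import numpy as np
    import pandas as pd
    from scipy.ndimage import uniform_filter1d

    val = np.array([1., 2., np.nan, 4., 5., 7., 8.])
    span = 3

    # pandas: centered rolling mean, NaN-aware, window shrinks at the edges
    pandas_mean = pd.Series(val).rolling(span, center=True, min_periods=1).mean().to_numpy()

    # SciPy: uniform (moving-window) filter; not NaN-aware, edge behaviour set by `mode`
    scipy_mean = uniform_filter1d(val, size=span, mode='nearest')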

Author

Thanks for the references! I knew some equivalent functions already existed, but they are not quite the same (xarray works on xarray.Dataset, SciPy has a slightly different strategy at the edges) and, since the function is simple enough, it was less work to write it than to look for an existing one.

Bottleneck's function seems to do exactly what I want, but it is not listed in MetPy's dependencies. Do you think it's worth adding it so I can use their moving-mean function?

Contributor

You'd have to ask one of the maintainers, but, in the meantime, would this be faster?

# Windowed sums via cumulative sums: O(N) regardless of span
cumulative_sums = np.nancumsum(val)
rolling_sums = cumulative_sums[span:] - cumulative_sums[:-span]
# Count the finite values in each window so the mean ignores NaNs
valid_index = np.isfinite(val)
cumulative_count = np.cumsum(valid_index)
rolling_count = cumulative_count[span:] - cumulative_count[:-span]
rolling_means = rolling_sums / rolling_count

You'd need to pre-allocate rolling_means and handle the edges still but it should work. (Alternately, use np.lib.stride_tricks.sliding_window_view with np.nanmean, but the note about it being slow is warranted)

Alternately, use SciPy for the bulk of the calculation, then re-do the edges the way you want.

It would probably be a good idea to check whether this takes enough time that it's worth optimizing before going too far, though (as you may have noticed, I am not good at that).

Author

With the testing data I have it's almost instantaneous and, as I have moved on to other topics, I don't really have anything bigger to try it on quickly. I suggest we leave it that way for now; other users can open another issue if and when they need to speed it up. Would that be alright?

Contributor

@DWesl DWesl Aug 1, 2024

You picked the same name as elsewhere in MetPy, though the edge handling is again different from what you do here (they do not smooth close to the edge) and from SciPy, and they do not use nanmean.

The bigger test data would likely be someone trying to find boundary layer height from model data somewhere, and it looks like your code is designed for a profile at a time, not arrays of profiles (either the N x Z that description implies or the Z x Y x X conventional in model output). It might be a simple matter to add an axis keyword argument to most functions and borrow the logic from the derivative functions to create the indexers (and possibly the vertical axis auto-detection for DataArrays), but that should probably be a follow-up PR in case it isn't.
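
For what it's worth, a rough sketch of applying a single-profile estimator over a (Z, Y, X) model field without touching the function itself; `blh_one_column` is a hypothetical stand-in for one of the functions in this PR:

    import numpy as np

    height = np.linspace(0., 4000., 40)                            # m
    theta = np.where(height < 1000., 300., 310.)                   # K, idealized capped mixed layer
    theta3d = np.broadcast_to(theta[:, None, None], (40, 8, 6))    # pretend (Z, Y, X) model field

    def blh_one_column(theta_column):
        """Stand-in for a per-profile estimator such as the parcel method."""
        above = np.nonzero(theta_column > theta_column[0])[0]
        return height[above[0]] if above.size else np.nan

    # Apply the 1-D estimator along the vertical axis of the gridded field
    blh_field = np.apply_along_axis(blh_one_column, 0, theta3d)    # shape (8, 6)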

Comment on lines 97 to 98
u[0] = 0 * units.meter_per_second
v[0] = 0 * units.meter_per_second
Contributor

Is there a particular reason you force the winds at the first layer to zero without documenting the fact, rather than asserting an additional layer at the ground (below the profiles passed) where the winds are zero to satisfy the boundary conditions?

Author

Not at all, I just didn't think of adding an extra point. Thanks for the suggestion!
However, it raises the question of what value to give the potential temperature at this extra point. I think duplicating the lowest point is the most convenient solution.

The two lines would be replaced by:

    # Add a ground point with null wind to satisfy theoretical boundary conditions
    u = np.insert(u, 0, 0 * units.meter_per_second)
    v = np.insert(v, 0, 0 * units.meter_per_second)
    height = np.insert(height, 0, 0 * units.metre)
    potential_temperature = np.insert(potential_temperature, 0, potential_temperature[0])

Additionally, as the insertion of the ground changes the length of the profile, the returned profile should exclude it to keep the same size as the input ones:

    return bRi[1:]

Does it look OK with these changes?

Contributor

Would you need to copy the point into the array? Could you just add a couple of variables for these? That would avoid the complication with the changing profile size, which would also impact idxfoot.
Something like:

if idxfoot == 0:
    u0 = 0
    v0 = 0
else:
    u0 = u[idxfoot]
    v0 = v[idxfoot]
theta0 = theta[idxfoot]
height0 = height[idxfoot]

or even:

if idxfoot == 0:
    Du = u
    Dv = v
else:
    Du = u - u[idxfoot]
    Dv = v - v[idxfoot]

I'm not sure if height should be treated the same as velocity. I think I remember a displacement height that might complicate things?

I think the main difference between this method and yours would be if the boundary layer was below the lowest passed level.

Author

Your second suggestion:

if idxfoot == 0:
    Du = u
    Dv = v
else:
    Du = u - u[idxfoot]
    Dv = v - v[idxfoot]

looks the best to me. I will go for that one. Thanks!

Regarding the problem that could arise with a BL below the first level, I think the bulk Richardson method is out of scope in that kind of situation anyway, so I don't think we need to address it (rather, refer users to a more appropriate method).
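
For context, the bulk Richardson number being computed in the snippets above is conventionally defined with respect to a reference level $z_0$ as

$$\mathrm{Ri}_b(z) = \frac{g\,(z - z_0)\,\left[\theta(z) - \theta(z_0)\right]}{\theta(z_0)\,\left[(u(z) - u_0)^2 + (v(z) - v_0)^2\right]}$$

(some formulations use virtual potential temperature), and the boundary layer top is then diagnosed as the lowest height where $\mathrm{Ri}_b$ exceeds a critical value, commonly taken around 0.25.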

theta0 = potential_temperature[0]

if any(potential_temperature > theta0):
    iblh = np.where(potential_temperature > theta0)[0][0]
Contributor

So this looks like a potential temperature threshold method. I would prefer "exceeds" over "reaches" in the documentation, given the usual description of the convective boundary layer.

Author

Fine by me. I will also add that it's only suited for unstable boundary layers, as suggested in the main comment. The name of the method varies between authors; "parcel method" is the one I have seen the most, but I can include other names (e.g. "potential temperature threshold method") in the doc.

Contributor

There might be a difference in expectations if the boundary layer is saturated (i.e. fog or fair-weather cumulus), but describing alternate names should avoid that.

"""Calculate atmospheric boundary layer height from the relative
humidity gradient

It is the height where the relative humidity or specific humidity gradient reaches a minimum.
Contributor

As a side note, it should be possible to (ab)use this function with TKE or aerosol backscatter. I take it you considered wavelet methods out-of-scope?

Author

Definitely possible to use it with TKE or aerosol backscatter. I don't mention it because the references I give are related to radiosounding data, while aerosol would come from lidar or ceilometer and TKE usually from model output, but they are very good ways of deriving the BLH (especially TKE, when turbulence is handled well in the model).

The wavelet method is out of scope for this first PR, yes. It would be great to have it in the future but I will not have the time to commit to that. In the meantime, this gradient method + smoothing should be equivalent to the Haar wavelet.

Contributor

> TKE usually from model output, but they are very good ways of deriving the BLH (especially TKE, when turbulence is handled well in the model).

I think I've seen TKE from Doppler lidar. It's probably possible to derive something similar from Doppler radar, but the radar-based boundary-layer methods I've seen have been based on scattering/reflectivity.

> The wavelet method is out of scope for this first PR, yes. It would be great to have it in the future but I will not have the time to commit to that.

Yeah, there's a lot of interesting things to do with boundary layers, and not all of them should be in a first attempt.

> In the meantime, this gradient method + smoothing should be equivalent to the Haar wavelet.

Fairly close, yes.

Author

Sure, Doppler lidars can provide TKE but, as far as I remember, only the most powerful (and expensive) Doppler lidars get any signal from above the boundary layer, which makes it difficult to run such an estimation because the top of the BL might be too close to the end of the profile.

Anyway, the function is there, no harm trying it on TKE profiles! Do you think I should change the name (and the doc) of the function to make it more general? Something like blh_from_concentration_gradient?

Contributor

> Sure, Doppler lidars can provide TKE but, as far as I remember, only the most powerful (and expensive) Doppler lidars get any signal from above the boundary layer, which makes it difficult to run such an estimation because the top of the BL might be too close to the end of the profile.

I think I've seen estimates based on SNR/signal quality in that situation, but that's likely a less frequent use case.

> Anyway, the function is there, no harm trying it on TKE profiles! Do you think I should change the name (and the doc) of the function to make it more general? Something like blh_from_concentration_gradient?

I like blh_from_concentration_gradient. Definitely mention humidity in the docs, since that's probably what people will be looking for.
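
A rough, hypothetical sketch of what such a generalized function might look like (illustrative only, not the PR's actual implementation):

    import numpy as np

    def blh_from_concentration_gradient(height, concentration):
        """Height where a scalar (relative/specific humidity, aerosol backscatter, TKE, ...)
        decreases fastest with height -- a simple gradient-minimum method."""
        gradient = np.gradient(concentration, height)
        return height[np.argmin(gradient)]

    # Idealized moist boundary layer capped by dry air aloft
    z = np.linspace(0., 3000., 61)
    rh = np.where(z < 1200., 80., 30.)
    print(blh_from_concentration_gradient(z, rh))  # approximately 1200 m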

Contains a collection of boundary layer height estimations.


References:
Contributor

I should probably ask the maintainers about one overall References section here with citations everywhere else vs. repeated References sections in each function that needs them.

"""Calculate atmospheric boundary layer height from the inversion of
absolute temperature gradient

It is the height where the temperature gradient (absolute or potential) changes of sign.
Contributor

Out of curiosity, how well does this work for the convective boundary layer with potential temperature? Or, for that matter, with the nocturnal stable boundary layer with either?

I was expecting to see a threshold method on $\frac{d\theta}{dz}$.

Author

From what I remember (my experience with this is now a bit old), this method is more suited for nocturnal stable boundary layers as it will track the end of the stable layer at the surface.

For the convective boundary layer, the parcel method should be preferred to this one, as this method would give underestimated heights with large variability.

The threshold of $\frac{d\theta}{dz}$ would be interesting to try too. I think I have seen it used in some works, just not in the ones I mention in this code.
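
For illustration, a minimal sketch of the sign-change criterion described in the docstring (a hypothetical stand-alone helper, not the PR's exact code):

    import numpy as np

    def blh_from_gradient_sign_change(height, temperature):
        """Lowest height where the vertical temperature gradient changes sign."""
        dtdz = np.gradient(temperature, height)
        change = np.nonzero(np.sign(dtdz[:-1]) != np.sign(dtdz[1:]))[0]
        return height[change[0] + 1] if change.size else np.nan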

Comment on lines 129 to 145
blh = boundarylayer.blh_from_richardson_bulk(SAMPLE_HEIGHT, potential_temperature, SAMPLE_U, SAMPLE_V)
print(f"Estimated boundary layer height with blh_from_richardson_bulk:\t {blh}")

blh = boundarylayer.blh_from_parcel(SAMPLE_HEIGHT, potential_temperature)
print(f"Estimated boundary layer height with blh_from_parcel:\t {blh}")

blh = boundarylayer.blh_from_humidity_gradient(SAMPLE_HEIGHT, SAMPLE_RELATIVE_HUMIDITY)
print(f"Estimated boundary layer height with blh_from_humidity_gradient (relative humidity):\t {blh}")

blh = boundarylayer.blh_from_humidity_gradient(SAMPLE_HEIGHT, specific_humidity)
print(f"Estimated boundary layer height with blh_from_humidity_gradient (specific humidity):\t {blh}")

blh = boundarylayer.blh_from_temperature_inversion(SAMPLE_HEIGHT, SAMPLE_TEMPERATURE)
print(f"Estimated boundary layer height with blh_from_temperature_inversion (absolute temperature):\t {blh}")

blh = boundarylayer.blh_from_temperature_inversion(SAMPLE_HEIGHT, potential_temperature)
print(f"Estimated boundary layer height with blh_from_temperature_inversion (potential temperature):\t {blh}")
Contributor

I think the usual method is to put the calculations in a function, with some asserts to indicate what you expect to be true. With this setup, checking that the different methods give answers within a few hundred meters of each other, or within a few hundred meters of the value you picked out from a Skew-T or tephigram, would likely get your intent across.

I'm pretty sure MetPy has an example sounding around somewhere, for the documentation if not for the tests, so you could create similar tests on that one as well.

Author

Sure, I can write the corresponding pytest unit tests based on these computations. However, I would rather use a standard MetPy profile than my own testing data. Do you know where I could find such sounding data?

Author

Thank you very much for the review, @DWesl!

Contributor

https://github.com/Unidata/MetPy/tree/main/staticdata has a few files matching *_sounding.txt that look promising. Those are used here:

df = pd.read_fwf(get_test_data('may4_sounding.txt', as_file_obj=False),
                 skiprows=5, usecols=[0, 1, 2, 3, 6, 7], names=col_names)

The SkewT tests look like they use p = np.geomspace(1000, 100, 10); T = np.linspace(20, -20, 10) and similar profiles. You could build something similar for a first test (constant potential temperature and water vapor mixing ratio up to some height, then a sharp increase in the former and a sharp decrease in the latter)
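
A minimal sketch of such a pytest test, assuming the new module ends up importable as metpy.calc.boundarylayer and keeps the signatures used in the testing script above (the import path, profile values, and tolerance are assumptions):

    import numpy as np
    import pytest
    from metpy.calc import boundarylayer  # assumed final import path
    from metpy.units import units


    def test_blh_from_parcel_idealized():
        """Well-mixed layer capped by a strong inversion near 1 km."""
        height = np.linspace(0., 3000., 61) * units.meter
        theta = np.where(height.m < 1000., 300., 312.) * units.kelvin
        blh = boundarylayer.blh_from_parcel(height, theta)
        assert blh.m_as('meter') == pytest.approx(1000., abs=200.)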

import numpy as np
import pandas as pd

import metpy.calc as mpcalc

Check notice — Code scanning / CodeQL (test): Module 'metpy.calc' is imported with both 'import' and 'import from'.
# -*-coding:utf-8 -*-
"""Testing program for the MetPy boundary layer module"""

import numpy as np

Check notice — Code scanning / CodeQL (test): Unused import — Import of 'np' is not used.
col_names = ["pressure", "height", "temperature", "dewpoint", "direction", "speed"]

df = pd.read_fwf(
get_test_data("may4_sounding.txt", as_file_obj=False),

Check warning — Code scanning / CodeQL (test): File is not always closed — File is opened but is not closed.
N = len(val)
smoothed_val = deepcopy(val)
for i in range(N):
    smoothed_val[i] = np.nanmean(val[i - min(span, i) : i + min(span, N - i)])
Contributor

Suggested change (original line, then suggested replacement):

    smoothed_val[i] = np.nanmean(val[i - min(span, i) : i + min(span, N - i)])
    smoothed_val[i] = np.nanmean(val[max(i - span, 0) : min(i + span, N)])

Would this be clearer as to the intent?

return smoothed_val


def bulk_richardson_number(
Contributor

Might close #628

Successfully merging this pull request may close these issues: calculate PBL height from soundings.

3 participants