What is the difference between Confidence Intervals obtained in StatAnalysis from MPR and in METviewer aggregation from SL1L2? #2482
-
Hello! The range of the upper and lower values of the confidence interval suggests that StatAnalysis uses each … Best regards,
Replies: 3 comments
-
Alexander,
I see you have a question about the different types of bootstrap confidence intervals computed within METplus. That is a great question!
In general, the MET tools (including Point-Stat, Grid-Stat, and Stat-Analysis) can compute non-parametric confidence intervals by bootstrapping the aggregation of individual matched pairs. Since Point-Stat and Grid-Stat are run once per output time, bootstrapping the MPRs resamples the spatial aggregation at a single time. For example, if you have 1,000 matched pairs for a particular valid time over a particular masking region, Point-Stat builds thousands of replicates of those 1,000 MPRs, each drawn by random sampling with replacement, and computes the stats for each replicate. The bootstrap CIs are then derived from the stats across those replicates.
METviewer, on the other hand, can compute non-parametric confidence intervals by bootstrapping the aggregation of contingency tables or partial sums. Since each table or set of sums is a spatial summary for a single output time, bootstrapping them resamples the temporal aggregation. For example, if you have 30 days of SL1L2 partial sums in a MET database, METviewer builds thousands of replicates of those 30 SL1L2 lines, each drawn by random sampling with replacement, aggregates the resampled SL1L2 lines, and derives statistics for each replicate.
While both are called bootstrapping, the difference in input data makes them two very different things! I would say that in general the latter is more widely used and more useful. In fact, if we could go back in time and redo it all, I would recommend NOT including bootstrap CIs in the output from Point-Stat, Grid-Stat, and Stat-Analysis. In practice, bootstrapping the spatial aggregation is extremely time consuming and generally less useful. That's why we've changed the default number of bootstrap replications in the PointStatConfig_default file to 0 (i.e. …
Since you mention Stat-Analysis, I will note that the details can get murkier there. Let's say, for example, that you've run Point-Stat to generate matched pair output for each output time. Then you run Stat-Analysis to read those MPRs and compute stats separately for each station location (e.g. …
Hope that helps clarify the issue. I'd like to tag @ericgilleland, our resident METplus statistician, on this discussion and ask him to review these details and clarify any points he'd like.
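To make the contrast in this reply concrete, here is a minimal Python sketch of the two flavors of bootstrapping. It is not the MET or METviewer implementation. The data are synthetic, every function name is hypothetical, and the four-value tuple is only a simplified stand-in for an SL1L2 line (the real line type carries more columns). It assumes a plain percentile interval and count-weighted aggregation of the partial sums.

```python
import random

def percentile_ci(reps, alpha=0.05):
    """Percentile interval from a list of bootstrap replicate statistics."""
    s = sorted(reps)
    return s[int(alpha / 2 * (len(s) - 1))], s[int((1 - alpha / 2) * (len(s) - 1))]

def rmse_from_pairs(pairs):
    """RMSE computed directly from (forecast, observation) matched pairs."""
    return (sum((f - o) ** 2 for f, o in pairs) / len(pairs)) ** 0.5

def to_partial_sums(pairs):
    """Simplified stand-in for one SL1L2 line: (total, ffbar, oobar, fobar)."""
    n = len(pairs)
    return (n,
            sum(f * f for f, _ in pairs) / n,
            sum(o * o for _, o in pairs) / n,
            sum(f * o for f, o in pairs) / n)

def rmse_from_sums(lines):
    """Aggregate lines as count-weighted means, then derive RMSE."""
    total = sum(n for n, _, _, _ in lines)
    ffbar = sum(n * ff for n, ff, _, _ in lines) / total
    oobar = sum(n * oo for n, _, oo, _ in lines) / total
    fobar = sum(n * fo for n, _, _, fo in lines) / total
    return max(ffbar - 2.0 * fobar + oobar, 0.0) ** 0.5

def bootstrap_ci(units, statistic, n_rep, rng):
    """Resample `units` with replacement and recompute `statistic` each time."""
    reps = [statistic(rng.choices(units, k=len(units))) for _ in range(n_rep)]
    return percentile_ci(reps)

# Synthetic data (an assumption for illustration): 30 days, 200 pairs per day,
# with an error magnitude that changes from day to day.
rng = random.Random(42)
days = []
for _ in range(30):
    day_sigma = rng.uniform(0.5, 2.0)
    day = []
    for _ in range(200):
        obs = rng.gauss(0.0, 1.0)
        day.append((obs + rng.gauss(0.0, day_sigma), obs))
    days.append(day)

# Spatial flavor: the resampling unit is one matched pair at a single time.
spatial_ci = bootstrap_ci(days[0], rmse_from_pairs, 1000, rng)

# Temporal flavor: the resampling unit is one per-day summary line.
daily_lines = [to_partial_sums(day) for day in days]
temporal_ci = bootstrap_ci(daily_lines, rmse_from_sums, 1000, rng)

print("day-0 RMSE CI from resampled pairs:    ", spatial_ci)
print("30-day RMSE CI from resampled summaries:", temporal_ci)
```

The same `bootstrap_ci` routine is used for both cases. Only the resampling unit changes, which is why the two intervals answer different questions: the first reflects sampling variability across locations at one time, the second reflects variability across days.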
-
I think that John summarized it well. I'm not sure how the resampling is done for the Point-Stat and Grid-Stat tools, but I'm guessing it just uses an iid resampling scheme that does not account for spatial correlation. If so, the resulting CIs will be too narrow, so you should keep that in mind when drawing conclusions from them.
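The effect of ignoring correlation can be illustrated with a small, self-contained sketch. It is an assumption-laden toy, not how any MET tool resamples: it uses a one-dimensional AR(1) series as a stand-in for correlated errors and compares an iid bootstrap with a moving-block bootstrap, a technique swapped in here purely for comparison. The function names, block length, and series parameters are all hypothetical.

```python
import random

def ci_width(reps, alpha=0.05):
    """Width of the percentile interval of the replicate statistics."""
    s = sorted(reps)
    return s[int((1 - alpha / 2) * (len(s) - 1))] - s[int(alpha / 2 * (len(s) - 1))]

def iid_bootstrap_means(x, n_rep, rng):
    """Resample single values independently, ignoring any correlation."""
    n = len(x)
    return [sum(rng.choices(x, k=n)) / n for _ in range(n_rep)]

def block_bootstrap_means(x, block_len, n_rep, rng):
    """Moving-block bootstrap: resample contiguous blocks to keep local correlation."""
    n_blocks = len(x) // block_len
    starts = range(len(x) - block_len + 1)
    reps = []
    for _ in range(n_rep):
        total = 0.0
        for start in rng.choices(starts, k=n_blocks):
            total += sum(x[start:start + block_len])
        reps.append(total / (n_blocks * block_len))
    return reps

# Synthetic, strongly correlated "errors" (AR(1) with coefficient 0.9).
rng = random.Random(0)
phi = 0.9
errors = [0.0]
for _ in range(999):
    errors.append(phi * errors[-1] + rng.gauss(0.0, 1.0))

iid_width = ci_width(iid_bootstrap_means(errors, 500, rng))
block_width = ci_width(block_bootstrap_means(errors, 50, 500, rng))

print("iid bootstrap CI width:  ", iid_width)
print("block bootstrap CI width:", block_width)
```

For positively correlated data like this, the iid interval comes out narrower than the block-based one, because treating each value as independent overstates the effective sample size.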
-
Thank you for the replies! As I understand it, Stat-Analysis and METviewer use different approaches to bootstrapping and cannot replicate each other (Stat-Analysis can use only the MPR line type as input, while METviewer can use only CNT/SL1L2/other statistics-containing data).