Origin/reweighting #185

Open: wants to merge 92 commits into base: master


pittlerf
Contributor

Hi,

I have started adding functionality to handle reweighting for correlation functions.
I use the 'cfrw_boot' label for correlation functions that have been reweighted.
These correlation functions should not be resampled, so I removed the 'cf_orig' label from them.
The next step is to make this consistent with the other functions in hadron.

@urbach
Member

urbach commented Apr 9, 2021

It's not totally clear to me how the reweighting is supposed to work here? I had thought that a function like bootstrap_and_rw.cf was sufficient, with a cf and reweighting factors as input. How does it work here?

Why is it not allowed to resample a reweighted cf?

@urbach
Member

urbach commented Apr 9, 2021

devtools::check output:

❯ checking examples ... ERROR
  Running examples in ‘hadron-Ex.R’ failed
  The error most likely occurred in:
  
  > base::assign(".ptime", proc.time(), pos = "CheckExEnv")
  > ### Name: is_empty.rw
  > ### Title: Checks whether the cf object contains no data
  > ### Aliases: is_empty.rw
  > 
  > ### ** Examples
  > 
  > # The empty rw object must be empty:
  > is_empty.rw(rw())
  Error in is_empty.rw(rw()) : could not find function "is_empty.rw"
  Execution halted

❯ checking examples with --run-donttest ... ERROR
  Running examples in ‘hadron-Ex.R’ failed
  The error most likely occurred in:
  
  > base::assign(".ptime", proc.time(), pos = "CheckExEnv")
  > ### Name: is_empty.rw
  > ### Title: Checks whether the cf object contains no data
  > ### Aliases: is_empty.rw
  > 
  > ### ** Examples
  > 
  > # The empty rw object must be empty:
  > is_empty.rw(rw())
  Error in is_empty.rw(rw()) : could not find function "is_empty.rw"
  Execution halted

❯ checking for missing documentation entries ... WARNING
  Undocumented code objects:
    ‘rw_unit’ ‘samplerw’ ‘samplerw_inverse’
  Undocumented data sets:
    ‘samplerw’ ‘samplerw_inverse’
  All user-level objects in a package should have documentation entries.
  See chapter ‘Writing R documentation files’ in the ‘Writing R
  Extensions’ manual.

❯ checking Rd \usage sections ... WARNING
  Undocumented arguments in documentation object 'read.rw'
    ‘monomial_id’
  
  Undocumented arguments in documentation object 'rw_orig'
    ‘rw’

  Functions with \usage entries need to have the appropriate \alias
  entries, and all their arguments documented.
  The \usage entries must correspond to syntactically valid R code.
  See chapter ‘Writing R documentation files’ in the ‘Writing R
  Extensions’ manual.

❯ checking package dependencies ... NOTE
  Package suggested but not available for checking: ‘rhdf5’

❯ checking DESCRIPTION meta-information ... NOTE
  Package listed in more than one of Depends, Imports, Suggests, Enhances:
    ‘dplyr’
  A package should be listed in only one of these fields.

❯ checking R code for possible problems ... NOTE
  *.rw: no visible binding for global variable ‘cf1’
  *.rw: no visible binding for global variable ‘cf2’
  read.rw: no visible binding for global variable ‘monomialid’
  Undefined global functions or variables:
    cf1 cf2 monomialid

2 errors ✖ | 2 warnings ✖ | 3 notes ✖

@urbach
Member

urbach commented Apr 9, 2021

fixed most of the check problems.

where can I find an example for this? I'm still not convinced all of this is needed...!?

@urbach
Member

urbach commented Apr 9, 2021

this is left:

   read.rw: no visible binding for global variable ‘monomialid’
   Undefined global functions or variables:
     monomialid

which I don't understand yet.
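
For readers hitting the same NOTE: one common cause (an assumption here, not a diagnosis of `read.rw`) is a data-frame column used as a bare name inside a function, for example through `subset()` or dplyr verbs, so that the static code analysis sees a free symbol. Another possibility is a plain mismatch between an argument name such as `monomial_id` and the symbol used in the function body. The sketch below uses a made-up data frame and hypothetical function names, and relies only on `codetools`, which ships with R, to reproduce and then remove the finding.

```r
library(codetools)

# Hypothetical data frame with a column called monomialid.
df <- data.frame(monomialid = c(0L, 0L, 1L), value = c(1.5, 2.5, 3.5))

# Non-standard evaluation: to the static analysis, monomialid looks like
# a free (global) variable, because it is not an argument or a local.
select_nse <- function(x, id) subset(x, monomialid == id)

# Standard evaluation: the column is accessed explicitly through x$...,
# so no free variable is left in the function body.
select_se <- function(x, id) x[x$monomialid == id, , drop = FALSE]

globals_nse <- findGlobals(select_nse, merge = FALSE)$variables
globals_se  <- findGlobals(select_se,  merge = FALSE)$variables

print(globals_nse)
print(globals_se)
```

If the bare column name has to stay, declaring it with `utils::globalVariables("monomialid")` in the package is the usual way to silence the NOTE.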

@urbach
Member

urbach commented Apr 9, 2021

also, the data object will mean we can no longer install for R < 3.5.0

     NB: this package now depends on R (>= 3.5.0)
     WARNING: Added dependency on R >= 3.5.0 because serialized objects in  serialize/load version 3 cannot be read in older versions of R.  File(s) containing such objects: ‘hadron/data/samplerw.RData’  ‘hadron/data/samplerw_inverse.RData’
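
If support for R < 3.5.0 matters, one option (a sketch, not what this PR does) is to write the data files in serialization format version 2, which `save()` supports through its `version` argument. The object and file names below are placeholders, and `tools::checkRdaFiles()` is only used to inspect the result.

```r
# Placeholder object standing in for a package data set.
sample_object <- list(conf.index = 1:5, rw = c(1.02, 0.98, 1.01, 0.99, 1.00))

# Write the same object in both serialization formats to temporary files.
file_v2 <- tempfile(fileext = ".RData")
file_v3 <- tempfile(fileext = ".RData")
save(sample_object, file = file_v2, version = 2)
save(sample_object, file = file_v3, version = 3)

# Report the serialization format version of each file.
info <- tools::checkRdaFiles(c(file_v2, file_v3))
print(info$version)
```

A file written with `version = 2` can be read by older R releases, so it should not trigger the added dependency on R >= 3.5.0.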

@pittlerf
Contributor Author

pittlerf commented Apr 9, 2021

> fixed most of the check problems.
>
> where can I find an example for this? I'm still not convinced all of this is needed...!?

Yes, the reading is actually quite format dependent. In the beta12 project I analysed just the tmLQCD output for the reweighting factors, which looked like the following:

00 00000 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9715302949e+01
00 00001 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9523762274e+01
00 00002 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9776317102e+01
00 00003 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9501797443e+01
00 00004 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9453382954e+01

In the PLNG project I got the reweighting factors from Marco, in an entirely different format.
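
For the whitespace-separated layout quoted above, a minimal reading sketch in base R could look like the following. The column names are placeholders chosen for illustration: the thread does not say what the individual columns mean, so whether the last column is the reweighting factor itself or some function of it is left open here.

```r
# The five sample lines quoted in the comment above.
lines <- c(
  "00 00000 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9715302949e+01",
  "00 00001 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9523762274e+01",
  "00 00002 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9776317102e+01",
  "00 00003 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9501797443e+01",
  "00 00004 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9453382954e+01"
)

# Read the first two columns as text so that leading zeros survive;
# all column names are hypothetical.
raw <- read.table(
  text = lines, header = FALSE,
  colClasses = c("character", "character", rep("numeric", 5)),
  col.names = c("col1", "conf", "par1", "par2", "par3", "par4", "value")
)

# Integer index per line, derived from the zero-padded second column.
raw$conf.index <- as.integer(raw$conf)

print(raw[, c("conf.index", "value")])
```

For a real file, `text = lines` would be replaced by a file path; any other format needs its own reader.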

@urbach
Member

urbach commented Apr 9, 2021 via email

@kostrzewa
Member

> It's not totally clear to me how the reweighting is supposed to work here? I had thought that a function like bootstrap_and_rw.cf was sufficient, with a cf and reweighting factors as input. How does it work here?
>
> Why is it not allowed to resample a reweighted cf?

I understood this to originate from the fact that the normalisation needs to be recomputed (the average of the weights). In other words, the data and the reweighting factors both need to be resampled consistently and separately, such that for each bootstrap resample, the normalisation and the corresponding reweighted data can be generated.

There are of course ways to handle this: reweighted data could be stored unnormalised:

d^{rw}_i = d_i * w_i 

which can be resampled any way one wants. However, when the reweighted data (and resampling thereof) is used, the corresponding normalisations need to be available and correctly applied to the central value and bootstrap samples. In other words, w_i need to be resampled too, giving boot.R values for the normalisation factor. The normalisation factor for the central value is of course just sum_i w_i.

Does the above sound reasonable, and does it correctly describe why one can't "blindly" resample the reweighted data?
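
The procedure described above can be sketched in base R as follows. This is an illustration with synthetic data and made-up variable names, not the hadron implementation, and it uses a plain bootstrap without blocking, so autocorrelations are ignored. The point is that one set of resampling indices is applied to both the unnormalised data `d_i * w_i` and the weights `w_i`, so each bootstrap sample gets its own normalisation.

```r
set.seed(1)

N      <- 200   # number of gauge configurations (made up)
boot.R <- 500   # number of bootstrap samples

# Synthetic data d_i and positive weights w_i, for illustration only.
d <- rnorm(N, mean = 1.0, sd = 0.2)
w <- exp(rnorm(N, mean = 0.0, sd = 0.1))

# Unnormalised reweighted data, d^{rw}_i = d_i * w_i.
d_rw <- d * w

# Central value: normalise by the sum of the weights.
central <- sum(d_rw) / sum(w)

# Draw ONE set of indices per bootstrap sample and use it for both the
# unnormalised data and the weights, so that numerator and normalisation
# are resampled consistently.
boot <- vapply(seq_len(boot.R), function(b) {
  idx <- sample.int(N, size = N, replace = TRUE)
  sum(d_rw[idx]) / sum(w[idx])
}, numeric(1))

cat("central value:  ", central, "\n")
cat("bootstrap error:", sd(boot), "\n")
```

Resampling already normalised values `d_i * w_i / sum_j w_j` instead would freeze the normalisation at its full-ensemble value and miss the fluctuation of the weights in the error estimate.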

@kostrzewa
Member

> Does the above sound reasonable and describe correctly, why one can't "blindly" resample the reweighted data?

Let me add another qualifying remark: we also don't deal with just a single reweighting factor, but sequences of factors which move us along in parameter space. For this, some sort of solution was required (such as supporting the multiplication of two sets of reweighting factors to form a third).
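
As an illustration of combining two sets of reweighting factors into a third, here is a minimal base-R sketch. The data-frame layout, the column names and the function `multiply_rw` are hypothetical and do not reflect the `rw` class in this PR; the assumption is that factors are multiplied configuration by configuration, and that only configurations present in both sets are kept.

```r
# Two hypothetical sets of reweighting factors, stored per configuration.
rw_a <- data.frame(conf.index = c(0L, 4L, 8L, 12L),
                   rw = c(1.02, 0.97, 1.01, 0.99))
rw_b <- data.frame(conf.index = c(4L, 8L, 12L, 16L),
                   rw = c(0.95, 1.03, 1.00, 1.04))

# Multiply two sets of factors on their common configurations.
multiply_rw <- function(a, b) {
  # Inner join on the configuration index; unmatched rows are dropped.
  m <- merge(a, b, by = "conf.index", suffixes = c(".a", ".b"))
  data.frame(conf.index = m$conf.index, rw = m$rw.a * m$rw.b)
}

rw_ab <- multiply_rw(rw_a, rw_b)
print(rw_ab)
```

Matching on the configuration index rather than on row position guards against silently multiplying factors that belong to different configurations.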

@urbach
Member

urbach commented Apr 9, 2021 via email

@urbach
Member

urbach commented Apr 9, 2021 via email

@urbach
Member

urbach commented Apr 11, 2021

> > fixed most of the check problems.
> >
> > where can I find an example for this? I'm still not convinced all of this is needed...!?
>
> Yes, the reading is actually quite format dependent. In the beta12 project I analysed just the output of tmLQCD for the reweighting factors: (that looked like the following):
>
> 00 00000 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9715302949e+01
> 00 00001 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9523762274e+01
> 00 00002 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9776317102e+01
> 00 00003 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9501797443e+01
> 00 00004 0.163251250000 0.163265000000 0.000000000000 0.000000000000 6.9453382954e+01
>
> In the PLNG project I got the reweighting factor from Marco, entirely different format.

I had in mind an example of the whole thing working?

@pittlerf In other words, is there a rmarkdown file which explains how to use this? Are there some tests?

@pittlerf
Contributor Author

> I had in mind an example of the whole thing working?
>
> @pittlerf In other words, is there a rmarkdown file which explains how to use this? Are there some tests?

Hi @urbach, I uploaded a how-to cheat sheet in R Markdown.

@urbach
Member

urbach commented Apr 13, 2021

thanks.
There are still changes requested...

@urbach
Member

urbach commented Apr 27, 2021

check(cran=TRUE) gives

❯ checking package dependencies ... NOTE
  Package suggested but not available for checking: ‘rhdf5’

❯ checking R code for possible problems ... NOTE
  read.rw: no visible binding for global variable ‘monomialid’
  Undefined global functions or variables:
    monomialid

❯ checking Rd line widths ... NOTE
  Rd file 'rw_orig.Rd':
    \examples lines wider than 100 characters:
       rw_factor <- rw_orig( rw=rw_data, conf.index=seq(1,20), max_value= max(rw_data),stochastic_error=rep(0,20))
  
  These lines will be truncated in the PDF manual.

thanks!

@urbach
Member

urbach commented Apr 27, 2021

The comment on rhdf5 is on my side...

@urbach
Member

urbach commented May 26, 2021

hmm?

@pittlerf
Contributor Author

> hmm?

ah, sorry I will do it now.
