Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

comments on initial version #35

Open
5 tasks
behinger opened this issue Jun 5, 2023 · 10 comments
Open
5 tasks

comments on initial version #35

behinger opened this issue Jun 5, 2023 · 10 comments

Comments

@behinger
Copy link
Member

behinger commented Jun 5, 2023

  • get_datahas a units argument, no need to do 10e^6 - fixed in pull lower-cased a lot of things #34
  • in julia it is common to have capital variables only for Types, thus Subject, Path etc should be samll - fixed in pull, same for RunUnfold and CollectEvents and some others-> fixed
  • we could think to provide a flexible interface to the get_data function, by allowing to specify a NamedTuple which can be used not only to define what channels to use, but also what time etc. #40
  • same for fit!(UnfoldModel) - specifying contrasts is not possible, but by making this flexible, we can easily do it
  • it might be nice to have an optional "save" toggle (with Unfold 0.5 saving is quite efficient space wise), after the fit, so instead of returning one DataFrame, we could offer to simply save it, and then have a collect-function similar to DrWatson. This might also make it easier to run the jobs in parallel on a cluster; do you know about DrWatsonSim.jl? could be interesting, but requires more rewriting
  • we could refactor the subject-loop with the actual per-subject job. This would make it easier to offer custom parallelization
  • imho MNE has an auto-detector for files, could be a good fallback?
@behinger
Copy link
Member Author

behinger commented Jun 5, 2023

  • I guess in case we ever use BIDS.jl, it would be nice to expose columsn like "task" in the layout DF
  • It would be nice to enable/disable the logging, or use ProgressMeter.jl
  • it is quite counterintuitive to me, why you need to specify a folder where all subjects csv/tsv files are. Couldnt you include the events-link in the layout=
  • channels::Union{Nothing, String, Integer}=nothing,) the plural implies I could specifz a subset, but right now I can specify a single channel, or all (which is nothing). Maybe rewrite this to:
    channels::Vector{<:Union{String, Integer}}=[] with an isempty replacing the isnothing below?
    Then empty vector would be all channels, or I can specify a Vector of ints / strings.

Disclaimer: not sure about the <:Union, I never know when to add that operator or not :D

@behinger
Copy link
Member Author

behinger commented Jun 5, 2023

  • in load.jl you have some code-doubling, the loading code is twice in there with the only difference an "if", I think you could reconcile that within one code block

@behinger
Copy link
Member Author

behinger commented Jun 5, 2023

  • summarise_events doesnt exist :(
  • new feature: lazy loading of data, that would be helpful in case of huge datasets, maybe MNE already does that though?

@behinger
Copy link
Member Author

behinger commented Jun 5, 2023

  • events.latency is not required by BIDS, but is required for Unfold. we should either document this well, or automatically convert according to the raw.info.["srate"] sampling rate. I used events.latency = events.onset .*500

@behinger
Copy link
Member Author

behinger commented Jun 5, 2023

ok it ran through my test-dataset (8-bit) - nice!

  • ok cool, I didnt see that the function returns the coeftable output. This is ok - but, often we want to use Effects as well, for that we could offer to save the data (but we have to be very clear that we have to explicitly save the modelfit). What we also could do is remove the Xdc designmatrix which results in a low-memory profile (25mb instead of 1.5gb). What do you think? simply return a collection of uf_results witihout the Xdc designmatrices? Then we could define a coeftable(UnfoldCollection) and effects(UnfoldCollection) and we can use all functions as we used to?

@behinger
Copy link
Member Author

behinger commented Jun 5, 2023

  • being able to specify your own preprocessing function could be nice as well? e.g.
function runUnfold(args...;kwargs..., preprocessfunction =preprocess_default)
...
for raw = data.raw
   tmpData = preprocess_default(raw[1];channels=channels, preprocessOptions...) # with preprocessOptions the NamedTuple I proposed earlier? or maybe we simply tell users that if they need advanced features, they should just implement their own preprocessingFunction? Maybe a healthy mix between flexibility (some pre-specified options, e.g.  setting reference, selecting channels) and flexibility?

end
...
end

function preprocess_default(raw;channels=[])
return pyconvert(Array,raw.get_data(picks=channels,units="uV"))
end
  • bfDict is a bit of missnomer, as you can do mass-univariate analyses here as well, but they dont have basisfunctions. how about design as we use it in Unfold.jl? (unfortunately UnfoldSim.jl uses design as well.. :S

@ReneSkukies
Copy link
Member

it is quite counterintuitive to me, why you need to specify a folder where all subjects csv/tsv files are. Couldnt you include the events-link in the layout=

I did this cause currently some of our datasets have a folder where the events files are ordered in this way. Can split it in a fucntion "loading_from_csv" and "events_loading" or something

@ReneSkukies
Copy link
Member

ReneSkukies commented Jun 30, 2023

summarise_events doesnt exist :(

Can you remind me again what the functionality is?
Basically just a DF with unique events and count and such? (what would be useful here?)

@behinger
Copy link
Member Author

it is quite counterintuitive to me, why you need to specify a folder where all subjects csv/tsv files are. Couldnt you include the events-link in the layout=

I did this cause currently some of our datasets have a folder where the events files are ordered in this way. Can split it in a fucntion "loading_from_csv" and "events_loading" or something

I think it would help already, if layout automatically extracts the events as a new column, and you can either provide the events_loading a new file, or the file from the layout, right?

@behinger
Copy link
Member Author

behinger commented Jun 30, 2023

summarise_events doesnt exist :(

Can you remind me again what the functionality is? Basically just a DF with unique events and count and such? (what would be useful here?)

I think I simply got cofused in the Readme.md as it is still included there :-D

I think I envisioned some kind of text-summary of events (e.g. frequency, which events how many times) - potentially also a matrix, which event is followed by which event (and how much overlap? dunno how to do that). Also maybe a range of each variable (or a in-line histogram)
~~ cue some procrastiation music ~~
oookay, and here I have a prototype for this

	using UnicodePlots
	using Term
	using Printf
	using Statistics

h(name,d) = 
TextBox(@sprintf("{bold}%s{/bold} \nμ=%.2f,σ=%.2f\nmin=%.2f\nmax=%.2f",name,mean(d),std(d),minimum(d),maximum(d)),fit=true,) * 
(histogram(d,vertical=true,height=1,grid=false,stats=false,labels=false,border=:none,padding=1,margin=0)|>UnicodePlots.panel)
h("testFactor",d)|>print

grafik

should be easily stackable for multi-event purposes :)

Edit: Number of events would be nice as a number too

EditEdit: This requires UnicodePlots.jl as a requirement

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants