process_data error for negative polarity #1

Open
andrewjkwok opened this issue Jun 12, 2023 · 9 comments

@andrewjkwok
andrewjkwok commented Jun 12, 2023

Hi,

Thank you for this very comprehensive suite of software for metabolomics. I am trying to use it to preprocess the negative polarity raw data from my dataset, but ran into an issue after the initial convert_raw_data step. I was able to generate my mzXML files, but when I feed them into the process_data function, I get the following error:

Error in names(val) <- featureNames(object) : 
  attempt to set an attribute on NULL
Error in massprocesser::process_data(path = "./", polarity = "negative",  : 
  Error in xcms::findChromPeaks.

Looking at the source code (https://rdrr.io/github/tidymass/massprocesser/src/R/process_data.R), I find:

if (is(xdata, class2 = "try-error")) {
        stop("Error in xcms::findChromPeaks.")
}

which suggests to me that there is something wrong with my data class...? This is the traceback (which doesn't seem very helpful):

> traceback()
2: stop("Error in xcms::findChromPeaks.")
1: massprocesser::process_data(path = "./", polarity = "negative", 
       ppm = 15, peakwidth = c(5, 30), snthresh = 5, noise = 500, 
       threads = 6, output_tic = TRUE, output_bpc = TRUE, output_rt_correction_plot = TRUE, 
       min_fraction = 0.5, fill_peaks = FALSE)

I run into no such problem with the positive polarity data. Please let me know what else I could provide to help, and many thanks in advance.
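
The `try-error` check quoted above only shows that the call to xcms::findChromPeaks failed inside process_data; the generic stop() message hides the original condition. A minimal sketch for surfacing the underlying error message by running peak picking directly is given below. It assumes that centWave is the peak-picking method in use and that the negative-mode mzXML files sit in a `./NEG` folder (a hypothetical path); `capture_error` is a hypothetical helper, and the parameter values are copied from the call in the traceback.

```r
# Evaluate an expression and return its error message instead of aborting.
# Returns NULL if the expression succeeds. (Hypothetical helper.)
capture_error <- function(expr) {
  tryCatch({
    force(expr)
    NULL
  }, error = function(e) conditionMessage(e))
}

# Hypothetical location of the negative-mode mzXML files.
mzxml_files <- list.files("./NEG", pattern = "\\.mzXML$",
                          full.names = TRUE, ignore.case = TRUE)

if (length(mzxml_files) > 0 &&
    requireNamespace("xcms", quietly = TRUE) &&
    requireNamespace("MSnbase", quietly = TRUE)) {
  msg <- capture_error({
    # Read the files on disk, then run centWave peak picking directly
    raw <- MSnbase::readMSData(mzxml_files, mode = "onDisk")
    cwp <- xcms::CentWaveParam(ppm = 15, peakwidth = c(5, 30),
                               snthresh = 5, noise = 500)
    xcms::findChromPeaks(raw, param = cwp)
  })
  print(msg)  # NULL means peak picking ran without error
}
```

If the message points at file reading rather than peak detection, the converted files themselves are the more likely culprit.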

@andrewjkwok
Author

Hello - wanted to check whether there was any update on this issue? Many thanks in advance.

@jaspershen
Member

Hi there. It is difficult to find the problem without the code and data you used. I am not sure how many mzXML files you are processing. I recommend running the process_data function again with just 2 or 3 files. If the error persists, you can share the data and code with me so I can try to identify and fix the issue. Thank you.
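
One way to set up such a reduced run is to copy a few files into a scratch directory and point process_data at it. The sketch below is only an illustration: `make_subset_dir` is a hypothetical helper, `./NEG` is a hypothetical source folder, and the folder layout that process_data expects may differ from a flat directory.

```r
# Copy the first `n` mzXML files from `src` into a fresh directory and
# return that directory's path. (Hypothetical helper.)
make_subset_dir <- function(src, n = 3, dest = tempfile("mzxml_subset_")) {
  files <- list.files(src, pattern = "\\.mzXML$", full.names = TRUE,
                      ignore.case = TRUE)
  keep <- files[seq_len(min(n, length(files)))]
  dir.create(dest, recursive = TRUE, showWarnings = FALSE)
  file.copy(keep, dest)
  dest
}

# Usage (arguments taken from the call in the traceback; adjust the
# directory layout to whatever process_data expects):
# subset_dir <- make_subset_dir("./NEG", n = 3)
# massprocesser::process_data(path = subset_dir, polarity = "negative",
#                             ppm = 15, peakwidth = c(5, 30),
#                             snthresh = 5, noise = 500, threads = 6)
```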

@andrewjkwok
Author

Hi, thanks for the reply. I still run into the problem with a more limited set of files (I only have a single mzXML file per sample). Happy to share data and code. Is there an email address I can send a Google Drive link to?

@jaspershen
Member

[email protected]

@andrewjkwok
Author

Fantastic, thanks. Have shared the link with data and script. Please let me know what else might be needed / whether the error can be reproduced on your side.

This is my session info for reference:


R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8    
 [5] LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C             
 [9] LC_ADDRESS=C           LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] massconverter_1.0.3   tictoc_1.2            lubridate_1.9.2       forcats_1.0.0        
 [5] stringr_1.5.0         purrr_1.0.1           readr_2.1.4           tibble_3.2.1         
 [9] tidyverse_2.0.0       metid_1.2.28          metpath_1.0.5         ComplexHeatmap_2.14.0
[13] mixOmics_6.22.0       lattice_0.20-45       MASS_7.3-58           massstat_1.0.4       
[17] tidyr_1.3.0           ggfortify_0.4.16      massqc_1.0.6          masscleaner_1.0.11   
[21] xcms_3.20.0           MSnbase_2.24.2        ProtGenerics_1.30.0   S4Vectors_0.36.2     
[25] mzR_2.32.0            Rcpp_1.0.10           Biobase_2.58.0        BiocGenerics_0.44.0  
[29] BiocParallel_1.32.6   massprocesser_1.0.10  ggplot2_3.4.2         dplyr_1.1.2          
[33] magrittr_2.0.3        masstools_1.0.10      massdataset_1.0.24    tidymass_1.0.8       

loaded via a namespace (and not attached):
  [1] utf8_1.2.3                  tidyselect_1.2.0            robust_0.7-1               
  [4] htmlwidgets_1.6.2           munsell_0.5.0               codetools_0.2-18           
  [7] preprocessCore_1.60.2       future_1.32.0               withr_2.5.0                
 [10] colorspace_2.1-0            knitr_1.43                  rstudioapi_0.14            
 [13] robustbase_0.95-1           mzID_1.36.0                 listenv_0.9.0              
 [16] MatrixGenerics_1.10.0       GenomeInfoDbData_1.2.9      polyclip_1.10-4            
 [19] farver_2.1.1                parallelly_1.36.0           vctrs_0.6.2                
 [22] generics_0.1.3              xfun_0.39                   timechange_0.2.0           
 [25] itertools_0.1-3             randomForest_4.7-1.1        R6_2.5.1                   
 [28] doParallel_1.0.17           GenomeInfoDb_1.34.9         graphlayouts_1.0.0         
 [31] clue_0.3-64                 MsCoreUtils_1.10.0          bitops_1.0-7               
 [34] DelayedArray_0.24.0         scales_1.2.1                ggraph_2.1.0               
 [37] nnet_7.3-18                 gtable_0.3.3                affy_1.76.0                
 [40] globals_0.16.2              tidygraph_1.2.3             rlang_1.1.1                
 [43] GlobalOptions_0.1.2         Rdisop_1.58.0               lazyeval_0.2.2             
 [46] impute_1.72.3               checkmate_2.2.0             BiocManager_1.30.21        
 [49] reshape2_1.4.4              stevedore_0.9.5             backports_1.4.1            
 [52] Hmisc_5.1-0                 MassSpecWavelet_1.64.1      tools_4.2.2                
 [55] affyio_1.68.0               RColorBrewer_1.1-3          proxy_0.4-27               
 [58] plyr_1.8.8                  base64enc_0.1-3             progress_1.2.2             
 [61] zlibbioc_1.44.0             RCurl_1.98-1.12             prettyunits_1.1.1          
 [64] rpart_4.1.16                viridis_0.6.3               pbapply_1.7-0              
 [67] GetoptLong_1.0.5            SummarizedExperiment_1.28.0 ggrepel_0.9.3              
 [70] cluster_2.1.4               furrr_0.3.1                 data.table_1.14.8          
 [73] RSpectra_0.16-1             openxlsx_4.2.5.2            circlize_0.4.15            
 [76] RANN_2.6.1                  pcaMethods_1.90.0           mvtnorm_1.2-2              
 [79] matrixStats_1.0.0           hms_1.1.3                   patchwork_1.1.2            
 [82] evaluate_0.21               XML_3.99-0.14               readxl_1.4.2               
 [85] fastDummies_1.6.3           IRanges_2.32.0              gridExtra_2.3              
 [88] shape_1.4.6                 compiler_4.2.2              ellipse_0.4.5              
 [91] ncdf4_1.21                  crayon_1.5.2                htmltools_0.5.5            
 [94] corpcor_1.6.10              pcaPP_2.0-3                 tzdb_0.4.0                 
 [97] Formula_1.2-5               rrcov_1.7-3                 tweenr_2.0.2               
[100] MsFeatures_1.6.0            Matrix_1.5-1                cli_3.6.1                  
[103] vsn_3.66.0                  parallel_4.2.2              igraph_1.4.3               
[106] GenomicRanges_1.50.2        pkgconfig_2.0.3             fit.models_0.64            
[109] foreign_0.8-82              plotly_4.10.2               MALDIquant_1.22.1          
[112] foreach_1.5.2               rARPACK_0.11-0              ggcorrplot_0.1.4           
[115] missForest_1.5              rngtools_1.5.2              XVector_0.38.0             
[118] doRNG_1.8.6                 digest_0.6.31               Biostrings_2.66.0          
[121] rmarkdown_2.22              cellranger_1.1.0            htmlTable_2.4.1            
[124] curl_5.0.1                  rjson_0.2.21                lifecycle_1.0.3            
[127] jsonlite_1.8.5              viridisLite_0.4.2           limma_3.54.2               
[130] fansi_1.0.4                 pillar_1.9.0                ggsci_3.0.0                
[133] KEGGREST_1.38.0             fastmap_1.1.1               httr_1.4.6                 
[136] DEoptimR_1.0-14             glue_1.6.2                  remotes_2.4.2              
[139] zip_2.3.0                   png_0.1-8                   iterators_1.0.14           
[142] ggforce_0.4.1               class_7.3-20                stringi_1.7.12             
[145] e1071_1.7-13               

@jaspershen
Member

Hi, I just checked the issue, and the error comes from your data, not the package. After conversion from raw data, each of your mzXML files is only around 20 B, which is abnormal. I then ran the massconverter package on my demo raw data with the same code, and it produced normal mzXML files, so the massconverter package also appears to be OK. That points to your raw data as the cause. I recommend trying the msConvert software (which only supports Windows). If it gives you normal mzXML files, that suggests the massconverter package has a bug. If you still can't get normal mzXML files, we can confirm that your raw data has an issue. Please let me know the results when you finish.
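
A quick way to spot abnormally small converted files before running process_data is to list their sizes. This is a minimal base R sketch; `check_mzxml_sizes` is a hypothetical helper, and the 1024-byte threshold is an arbitrary illustrative value, not a setting from massconverter or massprocesser.

```r
# List the mzXML files in `dir` with their sizes and flag suspiciously
# small ones. `min_bytes` is an arbitrary illustrative threshold.
check_mzxml_sizes <- function(dir, min_bytes = 1024) {
  files <- list.files(dir, pattern = "\\.mzXML$", full.names = TRUE,
                      ignore.case = TRUE)
  sizes <- file.size(files)
  data.frame(file = basename(files),
             bytes = sizes,
             too_small = sizes < min_bytes,
             stringsAsFactors = FALSE)
}

# Usage: check_mzxml_sizes("./NEG")   # "./NEG" is a hypothetical folder
```

Any file flagged in the `too_small` column is unlikely to contain usable scans and is worth reconverting before peak picking.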

@andrewjkwok
Author

Thanks, this is helpful. Will try to get msconvert working and will let you know the results over the next few days.

@andrewjkwok
Author

Hi - I can confirm that with msconvert I can produce mzXML files of a pretty large size (380 MB), so I'm guessing the bug is on the side of the massconverter package? I've uploaded 4 test mzXML files to the same Google Drive link for reference.

@andrewjkwok
Author

Hello - just wanted to quickly check whether there was any update on this issue?

2 participants