-
Input
- hg19.fa.gz: The reference genome file.
- DMRs_unDMRs_signal.bed: The first three columns give the genomic location; the fourth column gives the region's classification (0: non-DMRs, 2: hyper-DMRs, 3: hypo-DMRs).
- real_data_mean_chr21.bed: The mean methylation level of CpG sites in the real data.
-
Output
- test.fa: A folder that contains the fasta file for the treatment group.
- control.fa: A folder that contains the fasta file for the control group.
- test.bed: A folder that contains the bed file for the treatment group.
- control.bed: A folder that contains the bed file for the control group.
- total DMRs: The first three columns give the genomic location; the fourth column gives the region's classification (0: non-DMRs, 1: DMRs).
-
Script
- preparation.sh
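
The input file uses three classes (0, 2, 3) while the "total DMRs" output uses two (0, 1). A minimal sketch of that relabeling follows. It assumes that hyper-DMRs (2) and hypo-DMRs (3) both become 1 and that the files are tab-separated; the function name `to_binary_label` is hypothetical, and this is not necessarily how preparation.sh does it.

```python
def to_binary_label(line):
    """Collapse the class column of a 4-column BED line from (0, 2, 3) to (0, 1)."""
    chrom, start, end, label = line.rstrip("\n").split("\t")[:4]
    # Assumption: 2 (hyper-DMR) and 3 (hypo-DMR) both map to 1 (DMR).
    binary = "0" if label == "0" else "1"
    return "\t".join([chrom, start, end, binary])


# Made-up example lines, not taken from DMRs_unDMRs_signal.bed.
example_lines = [
    "chr21\t100\t200\t0",
    "chr21\t300\t400\t2",
    "chr21\t500\t600\t3",
]
relabeled = [to_binary_label(line) for line in example_lines]
for line in relabeled:
    print(line)
```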
-
Input
- control.fa: A folder that contains the fasta file for the control group.
- test.fa: A folder that contains the fasta file for the treatment group.
-
Output
- rrbssim: A folder that contains all simulated files.
- fastqc_initial: A folder that contains the html files generated before trimming.
- fq: A folder that contains the simulated fastq files.
- fastqc_trim: A folder that contains the files generated after trimming.
- bam: A folder that contains the bam files produced by bismark.
- cov: A folder that contains the coverage files produced by bismark.
- bed: A folder that contains the bed files generated from the cov files.
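
The last step above turns bismark coverage files into bed files. A minimal sketch of one such conversion follows. It relies on the bismark coverage layout (chromosome, start, end, methylation percentage, methylated count, unmethylated count, with 1-based positions). The output layout (location, percentage, depth) and the name `cov_line_to_bed` are assumptions for illustration; the bed columns actually written by the pipeline are not stated here.

```python
def cov_line_to_bed(line):
    """Convert one bismark coverage line to a tab-separated BED-style line."""
    chrom, start, end, pct, n_meth, n_unmeth = line.rstrip("\n").split("\t")
    # Read depth at this CpG site is the sum of both counts.
    depth = int(n_meth) + int(n_unmeth)
    # Coverage positions are 1-based; BED start coordinates are 0-based.
    bed_start = int(start) - 1
    # Assumed output columns: chrom, start, end, methylation %, depth.
    return "\t".join([chrom, str(bed_start), end, pct, str(depth)])


# Made-up example line in bismark coverage format.
bed_line = cov_line_to_bed("chr21\t9411552\t9411552\t75\t3\t1")
print(bed_line)
```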
-
Script
- simulation.sh: Generates one test and one control simulated data set.
- Basic usage: sh simulation.sh -o ${outdir} -t ${time}
- run_simulation.sh: Generates n pairs of simulated data sets; it supplements simulation.sh.
- Basic usage: sh run_simulation.sh -o ${outdir} -n ${number}
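
As a rough illustration of how n pairs could be produced from the single-pair script, the sketch below builds one simulation.sh command per repetition. It assumes that `-t` takes the repetition index, which is not defined in this document; the function name `build_simulation_commands` is hypothetical, and run_simulation.sh itself may work differently.

```python
import shlex


def build_simulation_commands(outdir, n):
    """Return one simulation.sh argument list per repetition (1..n)."""
    # Assumption: the value passed to -t is the repetition index.
    return [
        ["sh", "simulation.sh", "-o", outdir, "-t", str(i)]
        for i in range(1, n + 1)
    ]


commands = build_simulation_commands("results", 3)
for cmd in commands:
    # Print the commands instead of running them; pass each list to
    # subprocess.run(cmd, check=True) to actually execute it.
    print(shlex.join(cmd))
```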
-
Input
- *_DMR: The DMRs reported by each tool.
- chr21.bed: A bed file giving the start and end of chromosome 21.
- total DMRs: The first three columns give the genomic location; the fourth column gives the region's classification (0: non-DMRs, 1: DMRs).
-
Output
- *_rate.bed: The overlap rate for each tool.
-
Script
- simulateddata_overlap.sh: The main script for calculating the overlap rate in simulated data.
- simulattedata_overlap.py: The sub-script for calculating the overlap rate in simulated data.
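
One plausible definition of the overlap rate is the fraction of a reference region covered by a tool's reported DMRs. The sketch below computes that for a single region. The definition, the half-open coordinates, and the name `overlap_rate` are assumptions; the scripts above may define the rate differently.

```python
def overlap_rate(region, calls):
    """Fraction of `region` covered by `calls`; intervals are (start, end), half-open."""
    r_start, r_end = region
    covered = 0
    cursor = r_start  # everything left of the cursor is already counted
    for c_start, c_end in sorted(calls):
        lo = max(c_start, cursor)
        hi = min(c_end, r_end)
        if hi > lo:
            covered += hi - lo
            cursor = hi
    return covered / (r_end - r_start)


# Made-up intervals: the first two calls overlap each other.
rate = overlap_rate((100, 200), [(50, 120), (110, 150), (180, 300)])
print(rate)
```

Advancing the cursor keeps overlapping calls from being counted twice.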
-
Input
- *_DMR: The DMRs reported by each tool.
- merge_total.bed: A bed file containing the depth information for the CpG sites shared by all samples.
- total DMRs: The first three columns give the genomic location; the fourth column gives the region's classification (0: non-DMRs, 1: DMRs).
- chr21.bed: A bed file giving the start and end of chromosome 21.
-
Output
- *_rate.bed: The overlap rate for each tool.
- AUC_results: A file containing the AUC values.
-
Script
- calculate_AUC.sh: The main script for calculating the AUC.
- simulationdata_overlap.py: The sub-script that calculates the overlap rate in simulated data.
- calculate_AUC_all.R: The sub-script that calculates the AUC (PART 5: the process of calculating the AUC).
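
For orientation, the AUC can be computed directly from binary labels and per-region scores as the probability that a true DMR scores higher than a non-DMR. The sketch below does this by pairwise comparison. Using the overlap rate as the score and the 0/1 class from "total DMRs" as the label is an assumption, `auc` is a hypothetical name, and calculate_AUC_all.R may use a different method.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # A tie between a positive and a negative counts as half a win.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1: DMR, 0: non-DMR) and scores.
labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.5, 0.8]
value = auc(labels, scores)
print(value)
```

The pairwise form is quadratic in the number of regions; it is meant to show the definition, not to be fast.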
-
Input
- *_rate.bed: The overlap rate for each tool.
-
Output
- kendallW_results: Kendall's coefficient of concordance (Kendall's W).
- MDS_results: The results of multidimensional scaling.
-
Script
- kendall'W.R: The main script for calculating Kendall's W.
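
Kendall's W for m raters and n items is W = 12 S / (m^2 (n^3 - n)), where S is the sum of squared deviations of the per-item rank sums from their mean. The sketch below applies this formula without a tie correction. Treating each tool as a rater and each region as an item is an assumption, the helper names are hypothetical, and kendall'W.R may handle ties or inputs differently.

```python
def to_ranks(values):
    """Rank values from 1 (smallest) to n; assumes there are no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def kendalls_w(rank_rows):
    """Kendall's W for a list of rank lists (one list per rater), no tie correction."""
    m = len(rank_rows)       # number of raters
    n = len(rank_rows[0])    # number of items
    rank_sums = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Made-up rates: three raters that order four items identically.
rates = [
    [0.1, 0.4, 0.6, 0.9],
    [0.2, 0.3, 0.7, 0.8],
    [0.0, 0.5, 0.6, 1.0],
]
w_agree = kendalls_w([to_ranks(row) for row in rates])
# Two raters with exactly opposite orderings.
w_oppose = kendalls_w([[1, 2, 3, 4], [4, 3, 2, 1]])
print(w_agree, w_oppose)
```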