mergeSam.pl -- Merges corresponding SAM files (e.g. all chr1 SAM files) from independent replicates. Each replicate is represented by its own folder. The SAM files should already have undergone duplicate removal (rmDup).
This is part of the full pre-processing:
1. rmDup (removing PCR duplicates from SAM files, including Dr. JH Lee's SAM format)
2. mergeSam (merging SAM files if there are independent replicates)
3. procReads (processing SAM files to get SNV read counts and generate bedgraph files)
USAGE:
perl mergeSam.pl folder_list prefix suffix output_folder
ARGUMENTS:
folder_list     a file containing a line-separated list of folders
                representing multiple biological replicates, e.g.
                    data/rep1
                    data/rep2
                    data/rep3
                    ...
prefix          the common chromosome file prefix in these folders,
                e.g. proc_chr for proc_chr1.rmdup.sam, proc_chr2.rmdup.sam, etc.
suffix          the common file suffix in these folders,
                e.g. rmdup.sam for proc_chr1.rmdup.sam, proc_chr2.rmdup.sam, etc.
output_folder   the folder to which all merged prefix*.suffix files are
                written (it is assumed to exist already)
NOTE:
Files matching folder/prefix*.suffix are enumerated for merging, where "folder" is each entry in folder_list. Note that "/" is inserted between folder and prefix, and "*." between prefix and suffix.
The files are simply merged and not sorted.
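The enumeration and merging described in this note can be illustrated with a minimal Python sketch. It is not the Perl code of mergeSam.pl. The function names are hypothetical, and two behaviours are assumptions: files are grouped by identical file name, and each group is concatenated in folder-list order with no special handling of SAM header lines.

```python
import glob
import os
from collections import defaultdict


def read_folder_list(list_path):
    """Return the non-empty lines of folder_list, one folder per line."""
    with open(list_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def group_matching_files(folders, prefix, suffix):
    """Map each file name to its matching paths, in folder-list order."""
    groups = defaultdict(list)
    for folder in folders:
        # Pattern is folder + "/" + prefix + "*." + suffix, as in the NOTE.
        pattern = f"{folder}/{prefix}*.{suffix}"
        for path in sorted(glob.glob(pattern)):
            # Assumption: files with the same name in different folders
            # (e.g. proc_chr1.rmdup.sam) correspond to each other.
            groups[os.path.basename(path)].append(path)
    return dict(groups)


def merge_groups(groups, output_folder):
    """Concatenate each group into output_folder; reads are not sorted."""
    for name, paths in groups.items():
        with open(os.path.join(output_folder, name), "w") as out:
            for path in paths:
                with open(path) as src:
                    out.writelines(src)
```

Because the reads are only concatenated, any downstream step that needs coordinate-sorted input would have to sort the merged files itself.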
EXAMPLE:
If one wants to merge all "proc_chr*.rmdup.sam" files under data/rep1, data/rep2, data/rep3 (list stored in "rep.lst") and output the merged files to folder "data.merged", mergeSam.pl should be run as follows:
perl mergeSam.pl rep.lst proc_chr rmdup.sam data.merged
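Since the output folder is assumed to exist and the folder list must be a line-separated file, both need to be prepared before the run. The following Python sketch shows one way to do that. The helper name prepare_run is hypothetical and not part of the pipeline; it only builds the command line and does not execute it.

```python
import os


def prepare_run(folders, list_path, output_folder, prefix, suffix,
                script="mergeSam.pl"):
    """Write folder_list, create the output folder, and return the command."""
    # folder_list: one replicate folder per line.
    with open(list_path, "w") as handle:
        for folder in folders:
            handle.write(folder + "\n")
    # mergeSam.pl assumes output_folder exists, so create it beforehand.
    os.makedirs(output_folder, exist_ok=True)
    # Argument order follows USAGE: folder_list prefix suffix output_folder.
    return ["perl", script, list_path, prefix, suffix, output_folder]
```

For the example above, prepare_run(["data/rep1", "data/rep2", "data/rep3"], "rep.lst", "data.merged", "proc_chr", "rmdup.sam") returns the argument list for the same command, which could then be passed to subprocess.run(cmd, check=True) from the directory containing mergeSam.pl.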
This pipeline is free software; you can redistribute it and/or modify it given that the related works and authors are cited and acknowledged.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.
Cyrus Tak-Ming CHAN
Xiao Lab, Department of Integrative Biology & Physiology, UCLA