Dear Galaxy users,
I am trying to map some multiplexed bisulfite PCR data (Illumina) to our 20+ genes of interest. I want to use the “Map with Bowtie for Illumina” tool in Galaxy, so I need to use my own known DNA sequences as the reference instead of a built-in reference genome.
However, I don’t know how to make such a reference index. Should the reference be in FASTQ, FASTA or CTF format? My reference sequences are currently in a Word file. How can I convert them to the required format?
My second question: the sequencing data for each individual sample are split across 8 different FASTQ files. Is it fine to map them individually and then merge the results? Or should I combine them first (which would give a very large file) and then do the mapping? Does the choice change the results?
My last question: since we are looking at CG methylation, there will certainly be mismatches at the CG sites (a site may read as either CG or TG) compared to the reference sequence. I am afraid there may be more than 3 mismatches in a read of about 100 bp. Do you think Bowtie will allow this many mismatches? If not, can you suggest a better way to do the mapping?
Thanks for your attention!
Xiefan