Dear Galaxy users,

I am trying to map some multiplexed bisulfite PCR data (Illumina) to our 20+ genes of interest. I want to use "Map with Bowtie for Illumina" in Galaxy, so I need to replace the reference genome with my own known DNA sequences. However, I don't know how to make such a reference index. Should it be in FASTQ, FASTA or CTF format? My reference sequences are currently in a Word file. How can I convert them to the required format?

My second question: the sequencing data for each individual sample are split across 8 different FASTQ files. Does it matter if I map them individually and then merge the results? Or should I combine them first (which will produce a very large file) and then do the mapping? Does it change the results either way?

My last question: since we are looking at CG methylation, there will certainly be mismatches at the CG sites (a site may read as CG or TG) compared to the reference sequence. I am afraid there may be more than 3 mismatches in a read of about 100 bp. Do you think Bowtie will allow this many mismatches? If not, can you suggest a better way to do the mapping?

Thanks for your attention!!!

Xiefan
Hello Xiefan,

For this type of analysis, using a tool such as Bismark would be best:
http://www.bioinformatics.babraham.ac.uk/projects/bismark/

An open ticket to wrap this tool for Galaxy, which you can follow, is here:
http://bitbucket.org/galaxy/galaxy-central/issue/626/wrap-bismark

For custom genomes in general, help is in our wiki:
http://wiki.g2.bx.psu.edu/Support#Custom_reference_genome

Best,
Jen
Galaxy team

On 5/14/12 9:43 AM, Fang, Xiefan wrote:
Dear Galaxy users,
I am trying to map some multiplexed bisulfite PCR data (Illumina) to our 20+ genes of interest. I want to use "Map with Bowtie for Illumina" in Galaxy, so I need to replace the reference genome with my own known DNA sequences. However, I don't know how to make such a reference index. Should it be in FASTQ, FASTA or CTF format? My reference sequences are currently in a Word file. How can I convert them to the required format?
My second question: the sequencing data for each individual sample are split across 8 different FASTQ files. Does it matter if I map them individually and then merge the results? Or should I combine them first (which will produce a very large file) and then do the mapping? Does it change the results either way?
My last question: since we are looking at CG methylation, there will certainly be mismatches at the CG sites (a site may read as CG or TG) compared to the reference sequence. I am afraid there may be more than 3 mismatches in a read of about 100 bp. Do you think Bowtie will allow this many mismatches? If not, can you suggest a better way to do the mapping?
Thanks for your attention!!!
Xiefan
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
-- Jennifer Jackson http://galaxyproject.org
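Two practical points from the question can be illustrated with a small Python sketch: getting sequences pasted from a word processor into FASTA (the format a custom reference is supplied in), and why bisulfite reads show so many mismatches against an unconverted reference. This is only an illustration under assumptions: the function names, the gene name and the short sequences are made up, and the C-to-T step shows just the forward-strand half of the in-silico conversion idea that bisulfite aligners such as Bismark build on. It is not Bismark's actual code.

```python
import re


def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text.

    Whitespace, digits and other non-letter characters, which often
    survive a copy/paste from a word processor, are stripped.
    """
    lines = []
    for name, seq in records:
        clean = re.sub(r"[^A-Za-z]", "", seq).upper()
        lines.append(">" + name)
        # Wrap the sequence at a fixed line width.
        for i in range(0, len(clean), width):
            lines.append(clean[i:i + width])
    return "\n".join(lines) + "\n"


def c_to_t(seq):
    """In-silico bisulfite conversion of a forward-strand sequence."""
    return seq.upper().replace("C", "T")


def count_mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


if __name__ == "__main__":
    # Hypothetical amplicon as it might be pasted from a document.
    pasted = [("geneA_amplicon", "acgt tacg\n12 ccga")]
    print(to_fasta(pasted), end="")

    reference = "ACGTTACGCCGA"
    # Hypothetical read: unmethylated Cs read as T, one methylated CG kept.
    read = "ATGTTACGTTGA"
    print("raw mismatches:", count_mismatches(read, reference))
    print("after C->T on both:", count_mismatches(c_to_t(read), c_to_t(reference)))
```

Saving the `to_fasta` output as a plain-text file gives a FASTA file that can be uploaded as a custom reference. The second part shows why a plain Bowtie run with a small mismatch limit loses bisulfite reads, while comparing converted read against converted reference does not.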
participants (2): Fang, Xiefan; Jennifer Jackson