I am having a problem splitting a paired-end FASTQ dataset into two separate datasets. To explain the problem clearly, I list below exactly what I did with my dataset:
Step 1) My aim is to compare datasets for differential alternative splicing. I downloaded paired-end datasets in FASTQ format from the NCBI SRA as my original data.
Below is part of the paired-end FASTQ dataset that I downloaded from the NCBI SRA. Does this dataset look OK?
@SRR192532.1.1 HWI-EAS269:1:4:655:110.1 length=35
GTTTTCTGAGTGAGAAAAGGTGTGTTTGGAGTTTG
+SRR192532.1.1 HWI-EAS269:1:4:655:110.1 length=35
I28II;II*2/<5:++,(..*943F@I.('+.35'
@SRR192532.1.2 HWI-EAS269:1:4:655:110.2 length=35
AAAGATGTTAGTGTTTTATACGGAAAGGATATCTC
+SRR192532.1.2 HWI-EAS269:1:4:655:110.2 length=35
9+*9+7@?F1206,IGI+D122&/0++-.>+6/@?
Step 2) Then I ran FASTQ Groomer with the following settings:
a) Input FASTQ quality scores type: Illumina 1.3-1.7
b) Advanced Options: Hide Advanced Options.
Did I choose the right settings for FASTQ Groomer? Should I use the Advanced Options? If yes, what should the Advanced Options settings be?
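One way to sanity-check the quality score type chosen in Step 2 is to look at the ASCII range of the quality characters in the downloaded reads. The following is a minimal Python sketch, not part of Galaxy; the function names are hypothetical. It relies on the fact that the "+64" encodings (Solexa and Illumina 1.3-1.7) do not use characters below ASCII 59 (";"), so any lower character points to the Sanger (Phred+33) encoding. When no such character is present, the sketch reports the result as undetermined rather than guessing.

```python
def quality_ascii_range(quality):
    """Return the lowest and highest ASCII codes in a FASTQ quality string."""
    codes = [ord(ch) for ch in quality]
    return min(codes), max(codes)


def guess_offset(quality):
    """Return 33 if the string can only be Phred+33, otherwise None.

    Characters below ASCII 59 (';') cannot occur in the +64 encodings,
    so their presence indicates Sanger / Phred+33. Without them the
    encoding cannot be decided from this string alone.
    """
    lowest, _ = quality_ascii_range(quality)
    return 33 if lowest < 59 else None


# Quality line of the first downloaded read shown in Step 1.
first_read_quality = "I28II;II*2/<5:++,(..*943F@I.('+.35'"
print(quality_ascii_range(first_read_quality))
print(guess_offset(first_read_quality))
```

A single read is a small sample; running the same check over many reads gives a more reliable picture.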
Below is part of the groomed dataset:
@SRR192532.1.1 HWI-EAS269:1:4:655:110.1 length=35
GTTTTCTGAGTGAGAAAAGGTGTGTTTGGAGTTTG
+SRR192532.1.1 HWI-EAS269:1:4:655:110.1 length=35
*!!**!**!!!!!!!!!!!!!!!!'!*!!!!!!!!
@SRR192532.1.2 HWI-EAS269:1:4:655:110.2 length=35
AAAGATGTTAGTGTTTTATACGGAAAGGATATCTC
+SRR192532.1.2 HWI-EAS269:1:4:655:110.2 length=35
!!!!!!!!'!!!!!*(*!%!!!!!!!!!!!!!!!!
Does the groomed data look right? Is the number representing the member of a pair correct? Here the suffixes are ".1" and ".2"; should they be "/1" and "/2"?
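To make the expected result of the split concrete, the following is a minimal Python sketch of separating the reads by the ".1" / ".2" suffix on the read ID. It is not the Galaxy FASTQ splitter and makes no claim about how that tool identifies mates. The function names are hypothetical, and the sketch assumes four-line FASTQ records in which every read ID (the first word of the header) ends in ".1" or ".2".

```python
import io


def read_records(handle):
    """Yield (header, sequence, plus, quality) tuples from 4-line FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of input (or a blank line)
            break
        sequence = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        yield header, sequence, plus, quality


def split_pairs(handle):
    """Split records into (forward, reverse) lists by read ID suffix."""
    forward, reverse = [], []
    for record in read_records(handle):
        read_id = record[0].split()[0]  # e.g. "@SRR192532.1.1"
        if read_id.endswith(".1"):
            forward.append(record)
        elif read_id.endswith(".2"):
            reverse.append(record)
        else:
            # Assumption violated: this ID carries no mate suffix.
            raise ValueError(f"no .1/.2 suffix on read ID {read_id}")
    return forward, reverse


# The two records shown in Step 1, as an in-memory file.
sample = io.StringIO(
    "@SRR192532.1.1 HWI-EAS269:1:4:655:110.1 length=35\n"
    "GTTTTCTGAGTGAGAAAAGGTGTGTTTGGAGTTTG\n"
    "+SRR192532.1.1 HWI-EAS269:1:4:655:110.1 length=35\n"
    "I28II;II*2/<5:++,(..*943F@I.('+.35'\n"
    "@SRR192532.1.2 HWI-EAS269:1:4:655:110.2 length=35\n"
    "AAAGATGTTAGTGTTTTATACGGAAAGGATATCTC\n"
    "+SRR192532.1.2 HWI-EAS269:1:4:655:110.2 length=35\n"
    "9+*9+7@?F1206,IGI+D122&/0++-.>+6/@?\n"
)
forward_reads, reverse_reads = split_pairs(sample)
print(len(forward_reads), len(reverse_reads))
```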
Step 3) Then I ran FASTQ splitter on the groomed dataset. There are no settings for the splitter, so I selected the correct groomed file and clicked "Execute". Below is the description of the split dataset:
Please help me deal with this problem.
Thanks.
Jianguang Du