Dear Dr. Jennifer,
Many thanks for your reply; it really means a lot to me.
I am still new to NGS and am trying to do the following steps.
Kindly give me the right order for these procedures.
Data description: I have two samples. Each sample consists of
14 FASTQ files (7 forward and 7 reverse). I think this means the
data are paired-end reads from Illumina. I will try the following
workflow to get the best results:
1- Drag two input datasets into the workflow, one for the forward
sequence files and one for the reverse, so that the paired-end
option can be used in the TopHat tool later. When I run this
workflow I will use multiple selection for the 7 forward files to
analyse all of them at the same time.
2- Drag FastQC and link it with the previous step for each input,
to find out whether these files are in Illumina 1.8 format or
older.
3- Drag FASTQ Groomer and link it with the previous step, if the
files are older than version 1.8, to convert them to
.fastqsanger format.
4- Drag Filter FASTQ and link it with the previous step to remove
redundant sequences.
5- Drag FASTQ Trimmer to remove any unwanted ends of the
sequenced reads.
6- Drag Manipulate FASTQ and link it with the previous step
(I don't know why).
The above six steps are done twice, to generate two files as
output that serve as input for the following steps.
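To illustrate the check I have in mind for steps 2 and 3, here is a minimal Python sketch. It is not the FastQC or FASTQ Groomer tool itself, only a heuristic under my own assumptions: plain 4-line FASTQ records, Phred+33 for Sanger / Illumina 1.8+ files, and Phred+64 for older Illumina files. The function name guess_quality_encoding is hypothetical.

```python
def guess_quality_encoding(fastq_lines, max_records=10000):
    """Heuristically guess the quality encoding of a FASTQ stream.

    Assumes plain 4-line FASTQ records (no wrapped sequence lines).
    Returns "phred33", "phred64" or "unknown".
    """
    lowest, highest = None, None
    for index, line in enumerate(fastq_lines):
        if index // 4 >= max_records:
            break
        if index % 4 != 3:  # only the 4th line of each record holds qualities
            continue
        codes = [ord(ch) for ch in line.strip()]
        if not codes:
            continue
        lowest = min(codes) if lowest is None else min(lowest, min(codes))
        highest = max(codes) if highest is None else max(highest, max(codes))
    if lowest is None:
        return "unknown"
    if lowest < 59:
        # Characters below ';' (ASCII 59) cannot occur in the +64 encodings.
        return "phred33"
    if highest > 74:
        # Illumina 1.8+ qualities normally stop at 'J' (ASCII 74).
        return "phred64"
    return "unknown"  # every observed character fits both encodings


# Tiny in-memory example (made-up read, not from my data).
example = ["@read1", "ACGT", "+", "II#!"]
print(guess_quality_encoding(example))  # phred33
```

If the result is "phred64", the file would need FASTQ Groomer before the later steps; if it is "unknown", more reads would have to be inspected.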
7- Drag TopHat for Illumina, set it to accept paired-end files,
and link each file generated by the QC steps to TopHat. This step
aligns and maps the reads to the reference genome.
8- Drag Cufflinks and link it with the aligned BAM file generated
by TopHat to create an assembly.
9- Drag Cuffmerge and link it with the GTF file from Cufflinks.
This step merges all the assemblies generated by Cufflinks.
10- Drag Cuffcompare and link it with the previous step to get
detailed reports on the accuracy of all generated assemblies.
11- Drag MPileup and link it with the TopHat BAM file to generate
a file containing SNP sites.
12- Drag Pileup-to-Interval and link it with the MPileup step to
narrow the output SNPs down to the successful or most accurate
ones. (I don't know what the difference is between this tool and
Filter Pileup.)
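To show what I mean by narrowing down the SNP sites in steps 11 and 12, here is a rough Python sketch. It is not how Pileup-to-Interval or Filter Pileup work internally; it only assumes tab-separated pileup lines with the columns chromosome, position, reference base, depth, read bases and base qualities. The function names and the thresholds min_depth and min_fraction are my own hypothetical choices.

```python
import re


def count_mismatches(read_bases):
    """Count bases that differ from the reference in a pileup read-bases column."""
    # Drop read-start markers ("^" plus one mapping-quality character)
    # and read-end markers ("$").
    cleaned = re.sub(r"\^.", "", read_bases).replace("$", "")
    mismatches = 0
    i = 0
    while i < len(cleaned):
        ch = cleaned[i]
        if ch in "+-":
            # Indel notation: sign, length, then that many bases to skip.
            j = i + 1
            while j < len(cleaned) and cleaned[j].isdigit():
                j += 1
            length = int(cleaned[i + 1:j]) if j > i + 1 else 0
            i = j + length
            continue
        if ch in "ACGTacgt":  # "." and "," mean a match to the reference
            mismatches += 1
        i += 1
    return mismatches


def candidate_sites(pileup_lines, min_depth=4, min_fraction=0.25):
    """Keep positions with enough coverage and enough non-reference bases."""
    sites = []
    for line in pileup_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        chrom, pos, ref, depth = fields[0], int(fields[1]), fields[2], int(fields[3])
        if depth < min_depth:
            continue
        fraction = count_mismatches(fields[4]) / depth
        if fraction >= min_fraction:
            sites.append((chrom, pos, ref, depth, round(fraction, 2)))
    return sites


# Made-up pileup lines for illustration only.
lines = [
    "chr1\t100\tA\t5\t..,Tt\tIIIII",
    "chr1\t101\tC\t4\t.,.,\tIIII",
    "chr1\t102\tG\t3\t^I.+2AT,$a\tIII",
]
print(candidate_sites(lines))  # [('chr1', 100, 'A', 5, 0.4)]
```

Only the first line passes: the second has no non-reference bases and the third is below the assumed minimum depth.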
- I don't know which tools are used to detect copy number
variation (CNV).
I need to know how to separate human sequences from a sample that
may be contaminated with other sequences (is this done at the
alignment stage?).
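If the separation does happen at the alignment stage, I imagine it would look something like the following minimal Python sketch. It assumes the aligner output is available as SAM text aligned against the human reference, and that reads which fail to align are marked as unmapped (FLAG bit 0x4 in the SAM format). The helper name split_by_mapping is hypothetical and is not a Galaxy tool.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def split_by_mapping(sam_lines):
    """Split read names into (mapped, unmapped) lists based on the SAM FLAG."""
    mapped, unmapped = [], []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        name, flag = fields[0], int(fields[1])
        if flag & UNMAPPED_FLAG:
            unmapped.append(name)  # did not align to the human reference
        else:
            mapped.append(name)
    return mapped, unmapped


# Made-up SAM records for illustration only.
records = [
    "@HD\tVN:1.0",
    "read1\t99\tchr1\t100\t50\t4M\t=\t200\t104\tACGT\tIIII",
    "read2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(split_by_mapping(records))  # (['read1'], ['read2'])
```

The unmapped reads would then be the candidates for non-human sequences, although I am not sure this is the recommended approach.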
I need to know the correct order of steps if this order is not
completely right.
Is this what I should do to build a good NGS workflow that gets
all possible information from the dataset?
I am really very sorry for the disturbance. I look forward to
your reply.
Best Regards,
elsayed