Hello,
I performed 6 different runs of Cufflinks using the same
Reference Annotation on 6 different samples. I expected the resulting
tabular output to be constant given the same Reference Annotation was used each
time. Instead, I found that the row order differs between samples.
As a specific example, here are the top rows from the output for one of
the six samples:
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage FPKM FPKM_conf_lo FPKM_conf_hi FPKM_status
NM_001005484 - - NM_001005484 - - chr1:69090-70008 - - 0 0 0 OK
NR_026820 - - NR_026820 - - chr1:34610-36081 - - 0 0 0 OK
NM_001005221 - - NM_001005221 - - chr1:367658-368597 - - 0 0 0 OK
NR_046018 - - NR_046018 - - chr1:11873-14408 - - 0 0 0 OK
Here are the top 5 rows from the output for another of the six samples:
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage FPKM FPKM_conf_lo FPKM_conf_hi FPKM_status
NM_001005484 - - NM_001005484 - - chr1:69090-70008 - - 0 0 0 OK
NR_026820 - - NR_026820 - - chr1:34610-36081 - - 0 0 0 OK
NM_001005221 - - NM_001005221 - - chr1:367658-368597 - - 0 0 0 OK
NR_046018 - - NR_046018 - - chr1:11873-14408 - - 0 0 0 OK
NR_024540 - - NR_024540 - - chr1:14361-29370 - - 15114.7 9501.21 20728.1 OK
Here are the top 5 rows from the output for yet another of the six samples:
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage FPKM FPKM_conf_lo FPKM_conf_hi FPKM_status
NR_026820 - - NR_026820 - - chr1:34610-36081 - - 0 0 0 OK
NM_001005484 - - NM_001005484 - - chr1:69090-70008 - - 0 0 0 OK
NM_001005221 - - NM_001005221 - - chr1:367658-368597 - - 0 0 0 OK
NR_046018 - - NR_046018 - - chr1:11873-14408 - - 0 0 0 OK
NR_024540 - - NR_024540 - - chr1:14361-29370 - - 28376.8 20008.9 36744.6 OK
As a user, when the same Reference Annotation is selected for
every run, I would expect the order of the results by tracking_id to be the
same.
Why is this happening? Why are they not consistent?
If I were to concatenate the above results across samples
without first aligning the rows, my data would be all mixed up.
I bring this to your attention because someone may not align
the rows before concatenating and would therefore run downstream analysis on
incorrectly concatenated data.
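For anyone combining these files in the meantime, here is a minimal sketch of matching rows on tracking_id instead of on row position. It assumes the tracking files are tab-delimited with the header shown above. The function names (read_fpkm, merge_by_tracking_id), the sample labels, and the two-column inputs are illustrative only; the inputs are trimmed from the second and third samples above.

```python
import csv
import io


def read_fpkm(handle):
    """Return {tracking_id: FPKM} from a tab-delimited tracking table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["tracking_id"]: float(row["FPKM"]) for row in reader}


def merge_by_tracking_id(samples):
    """Combine per-sample dicts into {tracking_id: [FPKM per sample]}.

    Rows are matched on tracking_id, so a differing row order is harmless.
    An ID that is absent from a sample gets None for that sample.
    """
    names = list(samples)
    ids = sorted(set().union(*(s.keys() for s in samples.values())))
    return names, {i: [samples[n].get(i) for n in names] for i in ids}


# Trimmed stand-ins for two tracking files (only two of the columns kept).
sample_b = io.StringIO(
    "tracking_id\tFPKM\n"
    "NM_001005484\t0\n"
    "NR_026820\t0\n"
    "NR_024540\t15114.7\n"
)
sample_c = io.StringIO(
    "tracking_id\tFPKM\n"
    "NR_026820\t0\n"
    "NM_001005484\t0\n"
    "NR_024540\t28376.8\n"
)

names, merged = merge_by_tracking_id(
    {"sample_b": read_fpkm(sample_b), "sample_c": read_fpkm(sample_c)}
)

# One line per transcript, one FPKM column per sample.
print("tracking_id", *names, sep="\t")
for tracking_id, values in merged.items():
    print(tracking_id, *values, sep="\t")
```

With real files, the io.StringIO objects would be replaced by open() handles on each sample's tracking file. Sorting the IDs is only one way to get a fixed order; the order of the Reference Annotation itself could be used instead.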
In turn, I would recommend that the row order of the results be kept
constant for a given Reference Annotation.
Thank you in advance for this consideration.
Best,
Kory Johnson
--------------------------------------------
Kory R. Johnson, MS, PhD
Sr. Bioinformatics Scientist
www.kellygovernmentsolutions.com
Providing Contract Services For:
Bioinformatics Section,
Information Technology & Bioinformatics Program,
Division of Intramural Research (DIR),
National Institute of Neurological Disorders & Stroke (NINDS),
National Institutes of Health (NIH),
Bethesda, Maryland
Mailing Address:
NINDS/NIH
Clinical Center (Building 10)
Office 5S223
9000 Rockville Pike
Bethesda, MD 20892
Contact Information:
Phone: 301-402-1956
Fax: 301-480-3563
email: johnsonko@ninds.nih.gov
--------------------------------------------