Hi all,

I have been analyzing RNA-seq data from mouse tissues. The reads are single-end and 51 bp long. I ran TopHat/Cufflinks/Cuffdiff to test for differential gene expression. In the Cuffdiff output, I got very high RPKM values for some miRNAs and other short genes (less than 100 bp). These genes are among those with the highest RPKM, and I think their RPKM values are probably too high to be true.
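For reference, the textbook RPKM definition divides the read count by the gene length (in kb) and by the total mapped reads (in millions), so a short gene needs only a few reads to get a large value. Below is a minimal Python sketch of that formula. The read counts and library size are hypothetical, chosen only for illustration and not taken from my data, and the sketch ignores any further corrections Cufflinks applies (for example, an effective-length adjustment).

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical values for illustration only (not from the Cuffdiff run).
total_mapped = 20_000_000   # assumed library size
reads_on_gene = 100         # same read count assigned to both genes

short_gene = rpkm(reads_on_gene, gene_length_bp=75, total_mapped_reads=total_mapped)
long_gene = rpkm(reads_on_gene, gene_length_bp=3000, total_mapped_reads=total_mapped)

# The ratio depends only on the gene lengths (3000 / 75).
print(f"75 bp gene: {short_gene:.2f} RPKM")
print(f"3 kb gene:  {long_gene:.2f} RPKM")
print(f"ratio:      {short_gene / long_gene:.0f}x")
```

With identical read counts, the 75 bp gene ends up with an RPKM 40 times higher than the 3 kb gene, purely because of the length normalization. With 51 bp reads, a single read already covers most of such a gene.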
Here are the top rows of the Cuffdiff output:

test_id             gene_id             gene     locus                   sample_1    sample_2  status  value_1   value_2  log2(fold_change)  test_stat  p_value  q_value   significant
ENSMUSG00000093077  ENSMUSG00000093077  Mir5105  5:146231229-146302874   Epithelium  Fiber     OK      1.53E+06  445558   -1.78097           -355.367   0.00715  0.016986  yes
ENSMUSG00000093098  ENSMUSG00000093098  Gm22641  7:130162450-133124354   Epithelium  Fiber     OK      87894.1   36474.7  -1.26887           -0.59863   0.4913   0.587174  no
ENSMUSG00000089855  ENSMUSG00000089855  Gm15662  10:105187662-105583874  Epithelium  Fiber     OK      42868.9   21566.5  -0.99114           -20.7066   0.0186   0.039568  yes
ENSMUSG00000092984  ENSMUSG00000092984  Mir5115  2:73012853-73012927     Epithelium  Fiber     OK      21104.8   8317.49  -1.34335           -447.314   0.0001   0.000354  yes
ENSMUSG00000086324  ENSMUSG00000086324  Gm15564  16:35926510-36037131    Epithelium  Fiber     OK      6443.35   3664.15  -0.81433           -1.52095   0.2129   0.301429  no
ENSMUSG00000092981  ENSMUSG00000092981  Mir5125  17:23803186-23824739    Epithelium  Fiber     OK      5974.14   2390.75  -1.32127           -0.34111   0.5746   0.661937  no

I checked some forums, and they said that this is a known drawback of TopHat/Cufflinks/Cuffdiff when dealing with short genes, but I am still not clear about why. Has anyone had the same problem? What can I do in this situation?

Can anyone suggest other good tools to test for (1) differential gene expression only, or (2) both differential gene expression and gene discovery?

Thank you,
Thanh