I got an unexpected result from a simple "Get Data" from the UCSC Table
Browser with Galaxy. I retrieved the mouse mm9 RepeatMasker track, filtered on
repClass = LINE SINE LTR DNA.
If I ask for the output as sequences, I get 1 454 739 sequences, but if I ask for the same
data in BED format I get roughly 3 600 000 items, which is much closer to the
"summary/statistics" of the dataset (item count = 3 493 484). Why is there a
difference between the FASTA file and the BED file?
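
In case it helps anyone reproduce the comparison, counts like these can be checked outside Galaxy with a small script. This is only a minimal sketch: the function names and the tiny in-memory inputs are made up for illustration, and it assumes one FASTA record per ">" header line and one BED feature per non-blank line that is not a comment, "track" or "browser" line.

```python
import io


def count_fasta_records(handle):
    """Count sequences by counting header lines that start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))


def count_bed_features(handle):
    """Count BED features, skipping blank, comment, track and browser lines."""
    count = 0
    for line in handle:
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue
        count += 1
    return count


# Illustrative in-memory data (hypothetical); for real files, pass
# open("some_file.fasta") or open("some_file.bed") instead.
fasta_text = ">seq1\nACGT\nACGT\n>seq2\nGGCC\n"
bed_text = (
    "track name=rmsk\n"
    "chr1\t100\t200\tL1\n"
    "chr1\t300\t400\tB2\n"
    "chr2\t50\t80\tMER\n"
)

n_fasta = count_fasta_records(io.StringIO(fasta_text))
n_bed = count_bed_features(io.StringIO(bed_text))
print("FASTA records:", n_fasta)
print("BED features:", n_bed)
```

Running the same two counts on the real FASTA and BED outputs shows whether the gap is in the files themselves or only in the numbers reported by the interface.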