Hi,
I'm using the fastx_toolkit (v0.0.13) command line scripts.
When using fastx_clipper, I get:
fastx_clipper -a TCGTATGCCGTCTTCTGCTTG -v -c -l 15 -M 5 -i s_8_sequence.fa -o s_8_sequence_clipped.fa
Clipping Adapter: TCGTATGCCGTCTTCTGCTTG
Min. Length: 15
Non-Clipped reads - discarded.
Input: 227673720 reads.
Output: 212647528 reads.
discarded 3527200 too-short reads.
discarded 725608 adapter-only reads.
discarded 10773384 non-clipped reads.
discarded 0 N reads.
The s_8_sequence.fa file is 2.2 GB and the s_8_sequence_clipped.fa file is 1.7 GB, so fastx_clipper seems to be reporting far too many reads in this instance.
I also tried without the -M option, with the same result.
I checked with:
wc -l s_8_sequence.fa
56918430
(dividing this by 2 gives 28,459,215 reads)
wc -l s_8_sequence_clipped.fa
53161882
(dividing this by 2 gives 26,580,941 reads)
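The wc -l check assumes every record is exactly two lines (one header, one sequence line). A minimal sketch of a count that does not depend on that assumption is to count header lines directly; the function name count_fasta_records is made up for illustration and is not part of the fastx_toolkit:

```python
def count_fasta_records(path):
    """Count FASTA records by counting header lines (those starting with '>').

    Unlike dividing `wc -l` by 2, this still works if sequences are
    wrapped over several lines or the file ends with blank lines.
    """
    records = 0
    with open(path, "r") as handle:
        for line in handle:
            if line.startswith(">"):
                records += 1
    return records


# Hypothetical usage with the file names from this post:
#   count_fasta_records("s_8_sequence.fa")
#   count_fasta_records("s_8_sequence_clipped.fa")
```

If these counts agree with the wc -l figures, the two-lines-per-read assumption holds and the files themselves are consistent.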
There has never been such a discrepancy with this tool.
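To put a number on the discrepancy, here is a small sketch that divides the counts reported by fastx_clipper by the counts derived from wc -l. All figures are copied from this post; the helper name inflation_factor is only illustrative. Both ratios come out as exact whole numbers, which may help narrow down where the counting goes wrong:

```python
from fractions import Fraction

# Read counts printed by fastx_clipper -v (from the output above).
reported = {"input": 227673720, "output": 212647528}

# Read counts derived from `wc -l`, assuming two lines per read.
counted = {"input": 56918430 // 2, "output": 53161882 // 2}


def inflation_factor(reported_count, counted_count):
    """Return reported/counted as an exact fraction (no rounding)."""
    return Fraction(reported_count, counted_count)


for name in ("input", "output"):
    print(name, inflation_factor(reported[name], counted[name]))
# input 8
# output 8
```

The three "discarded" figures also sum to 15,026,192, which is 8 times the 1,878,274 reads missing between the two files according to wc -l, so the reported numbers are internally consistent and differ from the line counts by the same factor throughout.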
I'm not sure whether I'm doing something silly this time round, or whether something has changed on my system that is affecting how fastx_clipper counts reads.
Here are a few lines from the input and output:
head -n 6 s_8_sequence.fa
>ILLUMINA-08A740_0000:8:1:1736:1055#0/1
GCGAGCGTAGTTCAATGGTAAAACATCTCCTTGCCAAGGA
>ILLUMINA-08A740_0000:8:1:2219:1057#0/1
CAAGCGTCGGAGGTTTAGTCTTTCGTATGCCGTCTTCTGC
>ILLUMINA-08A740_0000:8:1:2316:1056#0/1
TACCTGGTTGATCCTGCCAGTAGTCGTATGCCGTCTTCTG
head -n 6 s_8_sequence_clipped.fa
>ILLUMINA-08A740_0000:8:1:2219:1057#0/1
CAAGCGTCGGAGGTTTAGTCTT
>ILLUMINA-08A740_0000:8:1:2316:1056#0/1
TACCTGGTTGATCCTGCCAGTAG
>ILLUMINA-08A740_0000:8:1:3041:1059#0/1
GAAGCTGCGGGTTCGAGCCCCGTCAGTCCCGCCA
Any ideas?
Thanks,
Ken