Hi, I'm new to Galaxy and am trying to look at differential expression across several miRNA datasets. The pipeline I'm using is Bowtie for Illumina (paired-end run) > SAM-to-BAM > ? > xls. The references I used with Bowtie are a mature miRNA FASTA and a piRNA FASTA, and the reads are 30 nt in length.
So, my questions are: Is this the proper pipeline? How do I go about converting the BAM into an xls file viewable in Excel?
Thanks!
Hi Calvin,
I am analyzing miRNA differential expression in small RNA sequencing data from mouse tissue using Bowtie > HTSeq > DESeq. As reference sequences I tried both the whole mouse genome and the hairpin miRNAs from miRBase, together with the miRBase annotation of all known miRNAs. Both worked for me. Another option is to try miRDeep2 and Novoalign.
What organism are you working with? Where did you download the piRNA reference sequence?
Let me know what happens.
Thanh
(In reply to Gabriel Calvin <gac4223@gmail.com>, Fri, Oct 11, 2013 at 12:51 PM.)
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
To search Galaxy mailing lists use the unified search at:
You can also try miRDeep_star. It identifies known miRs and discovers possible new miRs. There is a Java GUI and a command-line option. However, you have to index your genome with a script they provide. I used DESeq for the known miRs as well.
Luciano
(In reply to Hoang, Thanh <hoangtv@miamioh.edu>, Oct 11, 2013 12:19 PM.)
The organism is fruit fly. The piRNA reference sequence was obtained from http://www.fruitfly.org/p_disrupt/TE.html as FASTA.FORMAT.v9.4.1. I will check out those programs. Gabriel
Thanks for the responses. It appears these programs require some background in Python or R. Is there a less code-intensive way to convert a SAM or BAM file into a format viewable in Excel? Does Galaxy provide a tool for this?
If it is simply a matter of learning to code, so be it.
(In reply to Gabriel Calvin <gac4223@gmail.com>, Fri, Oct 11, 2013 at 3:20 PM.)
Hi Gabriel,
I believe you can do it with Galaxy. I have never used it for miRNA analysis because most of the miRNAs of the organism I work with are not annotated on its genome.
You will need the total number of reads that map uniquely to each mature miRNA. What I have noticed is that the guide and passenger strands usually differ hugely in expression. To capture that, your annotation file (GTF or GFF) would need a separate entry for each strand; it's okay if you don't have that.
You can probably use HTSeq. It is available in the Galaxy Tool Shed. You can also use it directly at http://galaxy.nbic.nl/ (under NGS: RNA Analysis), or run it on your own computer: http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html
Once you have the counts, you can use DESeq to calculate differential expression. See if you can get the counts first. I have never used the Galaxy DESeq wrapper, but it is in the Tool Shed too. Alternatively, you can install R and the DESeq package on your computer; you might install RStudio as well. I can send you the commented code I used for my analysis if you decide to give it a try.
Best,
Luciano
(In reply to Gabriel Calvin <gac4223@gmail.com>, Mon, Oct 14, 2013 at 1:13 PM.)
--
Luciano Cosme
PhD Candidate
Texas A&M Entomology
Vector Biology Research Group
www.lcosme.com
979 845 1885
cosme@tamu.edu
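For anyone who ends up scripting the counting step described above, here is a minimal Python sketch (not code from this thread). It assumes the reads were aligned directly to a mature miRNA FASTA, so the reference name in SAM column 3 is already the miRNA name and no GTF/GFF is needed. It does not check that a read maps uniquely (that would have to be enforced at the alignment step), and it is a simple stand-in for HTSeq, not a replacement for it. The function names, sample names, and toy records are all made up for illustration.

```python
import csv
from collections import Counter


def count_reads_per_reference(sam_lines):
    """Count mapped reads per reference name (SAM column 3, RNAME)."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4 or rname == "*":
            continue  # unmapped read
        if flag & 0x900:
            continue  # secondary or supplementary alignment
        if flag & 0x1 and not flag & 0x40:
            continue  # paired-end: count the first mate only, once per pair
        counts[rname] += 1
    return counts


def build_count_matrix(per_sample_counts):
    """Return rows (header first) of a miRNA x sample count table."""
    samples = sorted(per_sample_counts)
    features = sorted({name for c in per_sample_counts.values() for name in c})
    rows = [["miRNA"] + samples]
    for feature in features:
        rows.append([feature] + [per_sample_counts[s].get(feature, 0) for s in samples])
    return rows


def write_csv(rows, out_path):
    """Write the table as CSV, which Excel opens directly."""
    with open(out_path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


def toy_sam_line(read_name, flag, reference):
    """Build a minimal 11-column SAM record for the demo (made-up values)."""
    return "\t".join([read_name, str(flag), reference,
                      "1", "255", "4M", "*", "0", "0", "ACGT", "IIII"])


# Two hypothetical samples with hypothetical miRNA names.
control = ["@HD\tVN:1.0",
           toy_sam_line("r1", 0, "miR-A"), toy_sam_line("r2", 16, "miR-A"),
           toy_sam_line("r3", 4, "*"), toy_sam_line("r4", 0, "miR-B")]
treated = [toy_sam_line("r1", 0, "miR-A"), toy_sam_line("r2", 0, "miR-B"),
           toy_sam_line("r3", 0, "miR-B"), toy_sam_line("r4", 256, "miR-B")]

matrix = build_count_matrix({
    "control": count_reads_per_reference(control),
    "treated": count_reads_per_reference(treated),
})
for row in matrix:
    print(row)
```

With real data you would pass an open SAM file instead of the toy lists, for example `count_reads_per_reference(open("sample1.sam"))`, and then call `write_csv(matrix, "counts.csv")`. A table with one row per miRNA and one column per sample is also the shape of input that DESeq expects.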
Hi Gabriel,
SAM format is just tabular data, so you could assign "tabular" as the datatype (use the pencil icon). The only concern would be the size of the files: Excel is easily swamped by very large files.
Converting SAM -> interval is another option, if you just need the coordinates. The result can also be set to "tabular" format for Excel.
For both, make sure the file name ends in ".tab" (not ".tabular"), or possibly just ".txt", once downloaded, so that Excel recognizes it (as far as I know).
Galaxy will perform many of the calculations that Excel does (see the group "Text manipulations" and other tools such as "Group" or those in "Statistics"), but you have likely already seen those.
Best!
Jen
Galaxy team
(In reply to Gabriel Calvin, 10/14/13 11:13 AM.)
-- Jennifer Hillman-Jackson http://galaxyproject.org
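The "SAM is just tabular data" route can also be done outside Galaxy. The following is a rough Python sketch under a few assumptions: the input is an uncompressed SAM file (not BAM), the header lines starting with "@" are not wanted in the spreadsheet, and any optional tags after the 11 mandatory columns can be squeezed into one extra column. The function names and the "TAGS" column label are invented for this example. The row check uses the worksheet limit of 1,048,576 rows in current Excel versions.

```python
import csv

# The 11 mandatory SAM columns, in order.
SAM_COLUMNS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
               "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
EXCEL_MAX_ROWS = 1_048_576  # rows per worksheet in .xlsx


def sam_to_table(sam_lines):
    """Drop '@' header lines and return rows with a column-name header."""
    rows = [SAM_COLUMNS + ["TAGS"]]
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Optional tags vary per read, so join them into a single column.
        rows.append(fields[:11] + [" ".join(fields[11:])])
    return rows


def fits_in_excel(rows):
    """True if the table (header included) fits in one worksheet."""
    return len(rows) <= EXCEL_MAX_ROWS


def write_tab_delimited(rows, out_path):
    """Write tab-delimited text; name the file .txt or .tab for Excel."""
    with open(out_path, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)


# Toy input with made-up values.
example = ["@HD\tVN:1.0\n",
           "r1\t0\tmiR-A\t1\t255\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\tXA:i:0\n"]
table = sam_to_table(example)
print(len(table) - 1, "alignment rows; fits in Excel:", fits_in_excel(table))
```

For a real file, something like `write_tab_delimited(sam_to_table(open("sample1.sam")), "sample1.txt")` would produce the downloadable text file. For a full sequencing run the row check will usually fail, which is one more reason to summarize to per-miRNA counts before opening anything in Excel.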
participants (4)
- Gabriel Calvin
- Hoang, Thanh
- Jennifer Jackson
- Luciano Cosme