Tophat "Mean Inner Distance between Mate Pairs"
by 杨继文
Hi all,
When mapping paired-end RNA-seq reads with TopHat, we need to enter the "Mean Inner Distance between Mate Pairs". In Galaxy, the tool help says:
This is the expected (mean) inner distance between mate pairs. For example, for paired end runs with fragments
selected at 300bp, where each end is 50bp, you should set -r to be 200. There is no default, and this parameter
is required for paired end runs.
I think the fragment size (here 300bp) includes not only the two reads but also the adaptors. So maybe the mean inner distance between mate pairs should be: fragment length - total length of both reads - adaptor length. Am I right, or did I miss something?
Is it necessary to enter an accurate value?
Looking forward to your reply.
Jiwen
10 years, 7 months
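The arithmetic in the question above can be written out as a small sketch. The function name and the `adapter_length` parameter are hypothetical, and the 120bp adaptor total in the second call is an invented figure used only for illustration. The sketch assumes that the quoted fragment size may or may not already include adaptors, which depends on how the library was measured.

```python
def mean_inner_distance(fragment_length, read_length, adapter_length=0):
    """Return the inner distance between two mates of a paired-end fragment.

    fragment_length: size-selected fragment length in bp
    read_length:     length of each mate in bp (both mates assumed equal)
    adapter_length:  total adaptor sequence included in fragment_length
                     (hypothetical parameter; 0 if the size excludes adaptors)
    """
    insert_size = fragment_length - adapter_length
    return insert_size - 2 * read_length


# The example from the Galaxy help text: 300bp fragments, 50bp reads.
print(mean_inner_distance(300, 50))  # 200

# If the 300bp figure included adaptors (120bp total is an assumed value),
# the inner distance would be correspondingly smaller.
print(mean_inner_distance(300, 50, adapter_length=120))  # 80
```

Under these assumptions, the formula proposed in the question matches the help text whenever the adaptor contribution is zero.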
Maq consensus calling changes quality score of the read?
by Antony Jose
Hi,
We used the Generate pileup tool with the consensus base calling option (using
Maq) and default settings. In the output, the quality scores of the bases
were changed. For example, if the input scores of the bases in the SAM file were
'IIJIH', they were changed to '22321'. Is this a glitch, or is this
expected? Thank you.
Antony
10 years, 8 months
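To make the reported change concrete, the two quality strings from the message above can be decoded as a quick check. This sketch assumes Phred+33 (Sanger) encoding for both strings, which the message does not state. It only shows how the characters differ and does not establish whether the tool's behaviour is intended.

```python
def phred33(quals):
    """Decode a quality string, assuming Phred+33 (Sanger) encoding."""
    return [ord(c) - 33 for c in quals]


before = "IIJIH"  # qualities reported for the input SAM file
after = "22321"   # qualities reported in the pileup output

print(phred33(before))  # [40, 40, 41, 40, 39]
print(phred33(after))   # [17, 17, 18, 17, 16]

# Per-base difference between the input and output characters.
shifts = {ord(b) - ord(a) for b, a in zip(before, after)}
print(shifts)  # {23}
```

Every base drops by the same amount and the relative order is preserved, so the output looks like a consistent rescaling of the input scores.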
Download multiple files from history
by Adhemar
Hi,
I'm trying to download multiple files from a given history but I couldn't
figure out how to do it.
Is there a way?
Thanks,
Adhemar
10 years, 9 months
Blast2GO local instance Re: Table with gene count reads
by Luciano Cosme
Howdy,
Thanks Jen, I will try it tomorrow.
I installed Blast2GO from the Tool Shed in my local instance of Galaxy,
and when I try to run it I get the following error:
Index file named 'blast2go.loc' is required by tool but not available.
I was logged in as admin, and the installation did not give me any errors. From
the terminal:
galaxy.util.shed_util DEBUG 2012-03-23 17:23:37,088 Installing repository 'blast2go'
galaxy.util.shed_util DEBUG 2012-03-23 17:23:37,088 Cloning http://testtoolshed.g2.bx.psu.edu/repos/peterjc/blast2go
destination directory: blast2go
requesting all changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 6 changes to 3 files
updating to branch default
3 files updated, 0 files merged, 0 files removed, 0 files unresolved
galaxy.util.shed_util DEBUG 2012-03-23 17:23:42,798 Updating cloned repository to revision "7b53cc52e7ed"
Anyway, I was planning to use it because most of my differentially
expressed genes are unknown. I was hoping to use Blast2GO to at least
cluster them into functional groups. I am not sure whether that is the
best approach for finding out what the function of these genes might be. I also
checked the list of public servers that might have this tool. Berkeley
BOP is listed, but it seems that they no longer run the server, or it was
down when I checked, or the link (http://galaxy.berkeleybop.org/) is broken.
Thank you.
Luciano
On Fri, Mar 23, 2012 at 8:43 AM, Jennifer Jackson <jen(a)bx.psu.edu> wrote:
> Hello Luciano,
>
> There is no single tool to do this operation (although there has been some
> discussion about including one in the Tool Shed), but the same information
> can be obtained by using a combination of existing tools.
>
> First, start by converting both starting datasets to interval format.
>
> mapped reads:
> - for TopHat output, "NGS: SAM Tools -> Convert SAM to interval"
> features:
> - for GFF file (convert to tabular if necessary), subtract "1"
> from the start position's value using tool "Text Manipulation ->
> Compute"
> - cut columns chrom, new start, stop, strand, name, and score from
> this result file using "Text Manipulation -> Cut"
> - set the data type to "interval" using the 'Edit attributes' form
> (pencil icon)
>
> Next, use a tool in the group "Operate on Genomic Intervals" to compare
> these intervals for overlap. The tool "Cluster" with the option "Find" is
> most likely the one you will want to use.
>
> As a final step, summarize the data by feature using the tool "Join,
> Subtract and Group -> Group".
>
> Hopefully this helps,
>
> Best,
>
> Jen
> Galaxy team
>
>
> On 3/19/12 4:36 PM, Luciano Cosme wrote:
>
>> Hi,
>> I was wondering if there is any tool on Galaxy where I can obtain a
>> table with how many reads have been mapped to a given sample and to a
>> given gene (for example, use a Tophat output and use a GFF file to
>> obtain the table). I am using HTSeq to get it (htseq-count). There is
>> also GenomicRanges and easyRNASeq packages in bioconductor.
>> Thank you.
>>
>> Luciano
10 years, 10 months
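The GFF-to-interval step in Jen's quoted reply (subtract 1 from the start, then keep chrom, new start, stop, strand, name, and score) can be sketched outside Galaxy as follows. The function name is hypothetical, and using the GFF attributes column as the "name" field is an assumption, since the reply does not say which column supplies the name. The example record is invented for illustration.

```python
def gff_to_interval(line):
    """Convert one tab-separated GFF record to an interval-style record.

    GFF starts are 1-based, so 1 is subtracted to get a 0-based start.
    Output column order: chrom, start, end, strand, name, score.
    The attributes column is used as the name (an assumption).
    """
    fields = line.rstrip("\n").split("\t")
    chrom, _source, _feature, start, end, score, strand, _frame, attributes = fields[:9]
    return [chrom, str(int(start) - 1), end, strand, attributes, score]


# Invented example record, not taken from any real annotation.
example = 'chr1\tsrc\texon\t101\t200\t.\t+\t.\tgene_id "g1";'
print("\t".join(gff_to_interval(example)))
```

The resulting rows would still need their data type set to "interval" in Galaxy before the "Operate on Genomic Intervals" tools can use them, as described in the reply.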
Training on NGS data analysis
by Aarti Desai
Hi Jennifer,
We are working on developing an NGS data analysis pipeline. Does your institution have a training program where one or two people from my team could be trained in NGS data analysis, particularly de novo genome assembly?
Regards,
Aarti
Dr. Aarti Desai | Domain Specialist - Life Sciences Domain
aarti_desai(a)persistent.co.in | Cell: +91-9673009492 | Tel: +91-20-30236348
Persistent Systems Ltd. | Partners in Innovation | www.persistentsys.com
10 years, 10 months
CummeRbund in Galaxy platform
by Bomba Dam
Hi Galaxy team,
Are there any plans to add the CummeRbund tool to the Galaxy platform?
It would be very nice to have it within Galaxy. This would help people like
us (biologists without much programming knowledge) to do the complete
RNA-seq analysis and visualize the data in graphical form.
Cheers.
Bomba
10 years, 10 months
correct input into the minimum alignment count
by vang0280@umn.edu
Dear all,
I changed the minimum alignment count to: 100, 400, and 1000
minimum alignment count    differentially expressed genes
100                        4,000
400                        2,000
1000 (default)             1,000
I was wondering which minimum alignment count we should go with. It appears that
the higher the minimum alignment count, the fewer differentially expressed genes
are reported.
I was also wondering whether the minimum alignment count refers to the number of reads per
sequence. Is this true?
Also, how are FPKM and the minimum alignment count related?
Thanks,
Bao
10 years, 10 months
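As a rough illustration of the two quantities asked about above, here is a sketch that keeps them separate. FPKM is computed from its standard definition (fragments per kilobase of transcript per million mapped fragments). Treating the minimum alignment count as a raw-count threshold that a locus must reach before it is tested is an assumption about the tool, not something stated in the message. The function names and example numbers are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def is_tested(alignments_at_locus, min_alignment_count):
    """Assumed behaviour: a locus is tested only at or above the threshold."""
    return alignments_at_locus >= min_alignment_count


# Invented numbers: 500 fragments on a 2,000bp transcript, 10 million mapped.
print(fpkm(500, 2000, 10_000_000))  # 25.0

# The same locus passes a threshold of 100 but not one of 1000.
print(is_tested(500, 100))   # True
print(is_tested(500, 1000))  # False
```

Under this assumption, the threshold acts on raw alignment counts while FPKM is a normalized abundance, and a lower threshold lets more loci through to testing, which would be consistent with the trend in the table above.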
How to view the TopHat output?
by Xu, Jianpeng
Hi,
I have installed a local Galaxy instance and tried running some programs. I ran TopHat on RNA-seq data, and there are two output files: accepted_hits and splice_junctions. Can you tell me how to view these result files? Is there a tool in Galaxy that can be used to view them? Can I use IGV?
Thanks a lot.
Jianpeng
10 years, 10 months