1. How do I transfer files using the URL? As a test I wanted to transfer a file to my local installation of Galaxy so I put in the location of the file on my Linux box.
into the "Get Data" URL box. But it seems to think that this is code that I've pasted rather than a link to the file. Will it only work if it is preceded by http?
2. Also, when uploading a file using "Get Data" it seems the file is renamed to dataset_'id'.dat in ~/database/files/000/. Is it possible for the file to keep its name rather than being renamed?
There's this error message in my apache log:
[error] Request exceeded the limit of 10 subrequest nesting levels due to
probable configuration error. Use 'LimitInternalRecursion' to increase the
limit if necessary. Use 'LogLevel debug' to get a backtrace., referer: ...
When I turn the apache debug log on it also shows:
[debug] core.c(3072): r->uri = /galaxy/proxy:
It seems like mod_rewrite is somehow misconfigured, but I couldn't figure
out what it is.
Here is my httpd.conf mod_rewrite configuration:
RewriteRule ^/galaxy$ /galaxy/ [R]
RewriteRule ^/galaxy(.*) http://localhost:8080$1 [P]
Could you please help me to debug this?
In order to figure out the Mean Inner Distance between Mate Pairs of my paired-end RNA-seq datasets, I ran Bowtie (Map with Bowtie for Illumina) with both forward and reverse datasets and mouse mm9 as reference genome. Below I list the Bowtie output for only one pair of reads (I put the fields on the left side):
For the forward read
OPT: XA:i:1 MD:Z:0A35 NM:i:1
For the reverse read
OPT: MD:Z:29A6 NM:i:1
Is the ISIZE the insert size? The difference between POS and MPOS is 145 bp, which is 36 bp shorter than ISIZE (181). My question is: if ISIZE does mean insert size, how should I convert ISIZE into the Mean Inner Distance between Mate Pairs?
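For what it's worth, the numbers quoted above are consistent with ISIZE being measured from the leftmost base of one mate to the rightmost base of the other: 145 + 36 = 181. Under that reading, and assuming both mates are 36 bp (as the MD tags 0A35 and 29A6 suggest), the inner distance would be the ISIZE minus both read lengths. A minimal Python sketch of the arithmetic; the function name and the extra ISIZE values are made up for illustration, and a real estimate would average over many properly paired reads rather than one pair:

```python
from statistics import mean

def inner_distance(isize, read_len_fwd, read_len_rev):
    """Gap between the two mates: outer fragment span minus both reads."""
    # ISIZE is signed in SAM output, so take the absolute value first.
    return abs(isize) - read_len_fwd - read_len_rev

# Single pair from the post: ISIZE 181, both reads 36 bp (assumed from MD tags).
print(inner_distance(181, 36, 36))  # 109

# Hypothetical ISIZE values for several pairs, only to show the averaging step.
example_isizes = [181, -181, 175, 190]
print(mean(inner_distance(i, 36, 36) for i in example_isizes))
```

For the pair above this gives 181 - 36 - 36 = 109 bp.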
I am trying to upload new files to galaxy via FTP using Filezilla.
I have this error message:
530 Sorry, the maximum number of clients (3) for this user are already
What can I do?
On 8/21/12 4:33 AM, i b wrote:
> Thanks Jen,
> useful link. But I did not understand one thing.
> I have the following FPKM in cufflinks for two samples:
> s1 (untreated): 1234106
> s2 (treated): 159713
> cuffdiff of the two samples gives me the following values:
> value_1: 5.4
> value_2: 20.9
> and it is not significant (!). My two questions:
> 1. how is this not significant?
> 2. what is the relation between the high FPKM in cufflinks and the low
> values in cuffdiff? I read the manual: is this part of the statistical
> method adopted? E.g., are these numbers (cuffdiff values) derived from
> the formula adopted?
> thanks a lot,
> On Thu, Aug 16, 2012 at 11:26 PM, Jennifer Jackson <jen(a)bx.psu.edu> wrote:
>> A very similar question came up a few days ago and Jeremy had some good
>> advice for how to approach learning to interpret this data:
>> Galaxy team
>> On 8/15/12 8:49 AM, i b wrote:
>>> Dear all,
>>> in cuffdiff outputs e.g. transcript differential expression, I find for
>>> value_1 value_2 log2(fold_change)
>>> 7.77183 0 -1.79769e+308
>>> value_1 value_2 log2(fold_change)
>>> 0 14.5972 1.79769e+308
>>> for many many rows.
>>> if I sort in excel my data by fold change column (big to small ), all
>>> the rows with -1.79769e+308 or +1.79769e+308 are on the top.
>>> How can I be sure that these on the top are really the most up-regulated
>>> or down-regulated transcripts if I don't know the real value of one of
>>> the two samples (is 0 really zero?)?
>>> I was told that the zero in one of the two samples is a very small
>>> number and Cuffdiff simply writes 0, but it is not absolutely zero,
>>> otherwise it would not be possible to have -1.79769e+308 or
>>> Could you please tell me then how I can extrapolate the highest fold
>>> change (up- and down-regulated)? Or is what is done by sorting by log
>>> fold change correct?
>>> The Galaxy User list should be used for the discussion of
>>> Galaxy analysis and other features on the public server
>>> at usegalaxy.org. Please keep all replies on the list by
>>> using "reply all" in your mail client. For discussion of
>>> local Galaxy instances and the Galaxy source code, please
>>> use the Galaxy Development list:
>>> To manage your subscriptions to this and other Galaxy lists,
>>> please use the interface at:
>> Jennifer Jackson
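On the ±1.79769e+308 values quoted above: that number is the largest finite double-precision value, so it reads as a stand-in for an infinite log2 fold change when one of the two values is 0, not as a measured magnitude. (This interpretation is an inference from the number itself, not from the Cuffdiff documentation.) A minimal Python sketch of separating such rows before ranking by fold change; the rows and the 1e300 cutoff are made up for illustration:

```python
import sys

# Largest finite double; prints 1.7976931348623157e+308.
print(sys.float_info.max)

# Hypothetical (value_1, value_2, log2_fold_change) rows in Cuffdiff style.
rows = [
    (7.77183, 0.0, -1.79769e+308),
    (0.0, 14.5972, 1.79769e+308),
    (5.4, 20.9, 1.95),
    (12.0, 3.0, -2.0),
]

CUTOFF = 1e300  # arbitrary threshold, far above any real log2 ratio

def split_rows(rows):
    """Separate rows with a finite fold change from the sentinel rows."""
    finite = [r for r in rows if abs(r[2]) < CUTOFF]
    sentinel = [r for r in rows if abs(r[2]) >= CUTOFF]
    return finite, sentinel

finite, sentinel = split_rows(rows)
# Rank only the finite rows; sentinel rows need value_1/value_2 inspected directly.
ranked = sorted(finite, key=lambda r: r[2], reverse=True)
print(ranked)
print(sentinel)
```

Sorting the whole column at once puts every sentinel row at the top or bottom regardless of how large the non-zero value is, which is why the sketch ranks the two groups separately.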
I am using the "Count intervals in one file overlapping intervals in
another file" tool (part of the bedTools package) to assess the number of
RNA-seq reads that map back to a specific region. (I am using this on the
Galaxy test server). I am finding that this tool returns many more reads
than it should.
In reading the bedTools manual, it seems like this tool is the "windowBed"
tool and it actually has many more parameters that are not shown on the
Galaxy interface. What are these (hidden) settings for these parameters
that are not shown?
Hopefully this can explain my incorrectly recorded reads.
Thanks in advance.
SAM datasets can be used as tabular data with the tools under Text
Manipulation, Filter and Sort, Join, Subtract and Group, etc. if the headers
are removed and the datatype is changed. Or, you can convert the format to
Interval. For the simplest count directly on the SAM/BAM format itself,
the tool group "NGS: SAM Tools" and "NGS: Picard (beta)" have options.
There is no "score" value from a SAM file - perhaps the reply below was
misinterpreted? The score was derived from a GTF file, the GTF file and
an Interval file were joined, then some summaries were created with the
The Tool Shed has a repository for the DESeq package. The instructions
explain how to prepare the inputs. "NGS: Picard (beta)" also has tools
for assigning read groups if you need to do that.
http://getgalaxy.org
http://toolshed.g2.bx.psu.edu/ Search for: DESeq
Hopefully this helps!
On 8/19/12 6:21 AM, mic(a)mb.au.dk wrote:
> Hi Jennifer,
> I couldn't find htseq or a similar tool in the Galaxy tool sheds (sam2counts does not work), which is a bit problematic if one wants to work with DESeq (fastq->tophat->sam->counts->DESeq).
> Could you please explain in more detail how to convert SAM to counts using available Galaxy tools? It is not clear to me where to find "score" in the interval file produced from SAM, etc.
> <quote author='Jennifer Jackson'>
> Hello Luciano,
> There is no single tool to do this operation (although there has been
> some discussion about including one in the Tool Shed), but the same
> information can be obtained by using a combination of existing tools.
> First, start by converting both starting datasets to interval format.
> mapped reads:
> - for TopHat output, "NGS: SAM Tools -> Convert SAM to interval"
> - for GFF file (convert to tabular if necessary), subtract "1"
> from the start position's value using tool "Text Manipulation ->
> - cut columns chrom, new start, stop, strand, name, and score from
> this result file using "Text Manipulation -> Cut"
> - set the data type to "interval" using the 'Edit attributes' form
> (pencil icon)
> Next, use a tool in the group "Operate on Genomic Intervals" to compare
> these intervals for overlap. The tool "Cluster" with the option "Find"
> is most likely the one you will want to use.
> As a final step, summarize the data by feature using the tool "Join,
> Subtract and Group -> Group".
> Hopefully this helps,
> Galaxy team
> On 3/19/12 4:36 PM, Luciano Cosme wrote:
>> I was wondering if there is any tool on Galaxy where I can obtain a
>> table with how many reads have been mapped to a given sample and to a
>> given gene (for example, use a Tophat output and use a GFF file to
>> obtain the table). I am using HTSeq to get it (htseq-count). There is
>> also GenomicRanges and easyRNASeq packages in bioconductor.
>> Thank you.
> Quoted from:
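The interval workflow quoted in this thread (shift the 1-based GFF start down by one, then count overlapping reads per feature) can be sketched outside Galaxy as well. The Python below is only an illustration of the coordinate logic, not what the Galaxy tools run internally; the feature and read coordinates and all function names are made up, and the reads are assumed to be 0-based, half-open intervals already:

```python
from collections import Counter

def gff_to_interval(chrom, start, end, name):
    """Convert a 1-based, inclusive GFF feature to a 0-based, half-open interval."""
    return (chrom, start - 1, end, name)

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end

def count_reads(features, reads):
    """Count reads overlapping each feature (simple O(n*m) scan)."""
    counts = Counter({name: 0 for _, _, _, name in features})
    for r_chrom, r_start, r_end in reads:
        for f_chrom, f_start, f_end, name in features:
            if r_chrom == f_chrom and overlaps(r_start, r_end, f_start, f_end):
                counts[name] += 1
    return counts

# Hypothetical GFF features (1-based) and mapped reads (0-based intervals).
features = [gff_to_interval("chr1", 101, 200, "geneA"),
            gff_to_interval("chr1", 501, 600, "geneB")]
reads = [("chr1", 90, 126), ("chr1", 150, 186), ("chr1", 199, 235),
         ("chr1", 200, 236), ("chr2", 100, 136)]

print(dict(count_reads(features, reads)))
```

The read starting at 200 is not counted for geneA because the converted feature covers positions 100 to 199; skipping the subtract-one step would shift that boundary by a base.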
I am going to run TopHat on an RNA-seq dataset to observe alternative splicing events. TopHat has a parameter called "Minimum length of read segment". According to "implemented Tophat options", its description is "Each read is cut up into segments, each at least this long. These segments are mapped independently. The default is 25". My reads are 36 bp long; should I change this parameter based on the length of my reads? What value should I enter?
Is it possible to link to compressed files in a Galaxy data library?
We receive all of our NGS data in bz2 or gzip format for obvious
reasons, just wondering if I have to decompress it on the filesystem
before I link to it or not. Thanks!