Data from history not showing up in fastq drop down
by Aarti Desai
Hi All,
We have a galaxy local install. Thanks to Carlos's suggestion, I was able to get the reference genome index to show up in the interface. Now, I am trying to get the data into the galaxy system. I have followed the instructions in the link below to create data libraries.
http://wiki.g2.bx.psu.edu/Admin/Data%20Libraries/Libraries
I have modified the following sections in the universe_wsgi.ini file:
# Add an option to the library upload form which allows administrators to
# upload a directory of files.
library_import_dir = /media/FreeAgent GoFlex Drive_/HDD1/Project
# Add an option to the admin library upload tool allowing admins to paste
# filesystem paths to files and directories in a box, and these paths will be
# added to a library. Set to True to enable. Please note the security
# implication that this will give Galaxy Admins access to anything your Galaxy
# user has access to.
allow_library_path_paste = True
I created a data library and using the "Add dataset" function, I pasted the path of my data directory in the galaxy UI and selected the "link to files without copying into galaxy" option. This picked up all the files that were present in the directory and except for a couple of files, the job seems to have completed successfully. Now I am not sure how to actually analyze this data. I performed the "Import to current history" operation on two paired end fastq files I want to analyze. These show up in the history with the appropriate size. But when I choose the "Map with BWA for Illumina" option, the two fastq files do not show up in the FASTQ file drop down.
These files do show up in the list of input files for FastQC.
I have also restarted the server after importing the data into the history, but the problem persists.
Any input on how to analyze the data in the local Galaxy instance once it has been brought into the Galaxy framework would be highly appreciated.
Thanks for the help.
Regards,
Aarti
Aarti Desai, Ph.D | Domain Specialist - Life Sciences
aarti_desai(a)persistent.co.in | Cell: +91-9673009492 | Tel: +91-20-67036348
Persistent Systems Ltd. | Partners in Innovation | www.persistentsys.com
DISCLAIMER
==========
This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.
9 years, 12 months
Problem with ftp transfer of large bam files
by Hans Matsson
Hi,
I'm using the Galaxy main server in a browser on a Windows 7 PC to get statistics from my sequencing runs. I now have BAM files that are too big to upload from my local hard drive, so I tried an FTP upload to main.g2.bx.psu.edu with a client (FileZilla). The transfer seems to complete, but the files do not appear under Get Data/Upload File, and I get the message "Your FTP upload directory contains no files". I have tried to upload txt, zip, and bam files by FTP, but nothing worked.
Any suggestions?
Many thanks
/Hans
Hans Matsson, PhD
Karolinska Institutet
Department of Biosciences and Nutrition
Novum
Hälsovägen 7-9
SE-141 83 Huddinge, Sweden
Email: Hans.Matsson(a)ki.se
Phone (office): +46-8-524 81143
9 years, 12 months
Re: [galaxy-user] Getting reference index files in local galaxy install
by Aarti Desai
Hi,
We have a local install of galaxy and I'm trying to add the reference index files for bwa using the information provided in the following link
http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup
I have modified the bwa_index.loc file in the ../tool-data directory by adding the path to the index on our server (file attached). However, even after restarting the server, the reference genome does not show up when choosing the "use a built-in index" option. I'm not sure whether the loc file is created correctly or whether any other configuration file needs to be changed. Help in this matter would be greatly appreciated.
Thanks,
Aarti
From: galaxy-user-bounces(a)lists.bx.psu.edu [mailto:galaxy-user-bounces@lists.bx.psu.edu] On Behalf Of Jennifer Jackson
Sent: Thursday, July 05, 2012 1:23 AM
To: Lindsey Kelly
Cc: galaxy-user(a)lists.bx.psu.edu
Subject: Re: [galaxy-user] Initial QC and grooming for Illumina HiSeq2000 paired end RNAseq data
Hello Lindsey,
Yes, you have this correct. The general path would be to:
- join forward and reverse data per run
- run FASTQ Groomer & FastQC
(note: if your data is already in Sanger FASTQ format with Phred+33 quality-scaled
values, the datatype '.fastqsanger' can be directly assigned and the FASTQ Groomer
step skipped. This is likely true if your data is from the latest CASAVA pipeline, but
please double check.)
- discard data as needed based on quality
- split forward and reverse data that passes QC
- concatenate all forward reads from a sample into one FASTQ file
- concatenate all reverse reads from a sample into one FASTQ file.
- for each sample, run TopHat using the two concatenated FASTQ files
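As a rough aid for the "please double check" note in the list above, here is a minimal Python sketch that guesses the quality offset of a FASTQ file from the range of its quality characters. It is only a heuristic and not part of Galaxy or FASTQ Groomer: the function name guess_phred_offset is hypothetical, the thresholds are assumptions based on the usual ASCII ranges (characters below ';' cannot occur with a +64 offset, and Phred+33 data rarely goes above 'J'), and it assumes plain four-line records without wrapped sequence lines.

```python
def guess_phred_offset(fastq_lines):
    """Guess the quality offset (33 or 64) from an iterable of FASTQ lines.

    Returns None when the observed range is ambiguous or no quality
    lines were seen. Assumes four-line records (no wrapped sequences).
    """
    lowest = None
    highest = None
    for index, line in enumerate(fastq_lines):
        # The quality string is the fourth line of every record.
        if index % 4 != 3:
            continue
        quality = line.rstrip("\r\n")
        if not quality:
            continue
        codes = [ord(char) for char in quality]
        lowest = min(codes) if lowest is None else min(lowest, min(codes))
        highest = max(codes) if highest is None else max(highest, max(codes))
    if lowest is None:
        return None
    if lowest < 59:
        # Characters below ';' (ASCII 59) are impossible with a +64 offset.
        return 33
    if lowest >= 64 and highest > 74:
        # Nothing below '@' and values above 'J': most likely a +64 offset.
        return 64
    return None
```

Reading only the first few thousand records is usually enough for a guess like this; a result of None means the file should be checked by other means before assigning a datatype.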
To manipulate paired end data, please see the tools -> NGS: QC and manipulation: FASTQ splitter & FASTQ joiner.
To combine data files head-to-tail from multiple runs into a single FASTQ file, please see the tool -> Text Manipulation: Concatenate datasets.
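For anyone doing the same head-to-tail concatenation outside of Galaxy, the idea can be sketched in a few lines of Python. This is an illustration only, not the implementation of the Concatenate datasets tool; the function name concatenate_fastq and the file names in the usage note are hypothetical.

```python
def concatenate_fastq(input_paths, output_path):
    """Append the given files head-to-tail into output_path, in order."""
    with open(output_path, "wb") as out:
        for path in input_paths:
            last_byte = b""
            with open(path, "rb") as src:
                # Copy in 1 MiB chunks so large files are never held in memory.
                while True:
                    chunk = src.read(1024 * 1024)
                    if not chunk:
                        break
                    out.write(chunk)
                    last_byte = chunk[-1:]
            # Guard against a file without a trailing newline, which would
            # otherwise glue its last record to the next file's first header.
            if last_byte and last_byte != b"\n":
                out.write(b"\n")
```

A call such as concatenate_fastq(["run1_R1.fastq", "run2_R1.fastq"], "sample_R1.fastq") would build the forward file. The reverse files need to be listed in the same run order so that the read pairs stay in step between the two outputs.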
I am not sure of the actual volume of data, but if these start to get large or TopHat errors with a memory problem, a local or cluster instance would be the recommendation: http://getgalaxy.org
For reference:
http://tophat.cbcb.umd.edu/manual.html
http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.html
Hopefully this helps. Others are welcome to post comments/suggestions.
Jen
Galaxy team
On 7/2/12 11:17 AM, Lindsey Kelly wrote:
I am trying to do RNAseq analysis on Paired end data from the Hiseq2000. I have about 50 files for each sample (25 forward and 25 reverse - although each sample has a different number of files).
I think that I need to:
-convert them into FASTQ Sanger format using the FASTQ Groomer tool
-check the quality using the FastQC tool
I don't know how to handle this many files. Do I have to groom and run the QC for each file? Should I join the paired files and run both tools on each pair, or should I combine all of the data for each sample (which I don't know how to do) and then groom and run the QC on all of the reads for the sample?
Thanks in advance for advice
Lindsey
___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org. Please keep all replies on the list by
using "reply all" in your mail client. For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists,
please use the interface at:
http://lists.bx.psu.edu/
--
Jennifer Jackson
http://galaxyproject.org
10 years
Re: [galaxy-user] tool_path
by Fabien Mareuil
Hi,
Thanks for your answer. Here is part of the shell code:
LOCAL_DIR= #{GALAXY_DIR}/galaxy-dist/tools/AnnotateGenes
R_DIR= #{R_PATH}/bin
echo "ChIP:" >$LOG
if [ -r $REG ]; then
echo "1: perl $LOCAL_DIR/geneAnnotation.pl -g $LOCAL_DIR/$GENOME.noIdenticalTransc.txt -tf $CHIPFILE -selG $REG -o $OUTSTAT -lp $LEFTPROM -rightp $RIGHTPROM -enh $ENH -dg $DOWNGENE" >> $LOGTMP
and a part of the xml
<tool id="annotateGenes" name="Annotation of genes with Chip-Seq peaks" version="1.0">
<description> </description>
<command interpreter="bash">
#if $use_reg.use_reg_selector == "no" and $use_control.use_control_selector == "no"
annotateGenes_wrapper.sh -f $inputfile -y $log -l $left -o $outputPNG -r $right -d $DownGene -h $EnhLeft -u $stats -v $input_organism.version
#elif $use_reg.use_reg_selector == "no" and $use_control.use_control_selector == "yes"
annotateGene_wrapper.sh -f $inputfile -y $log -c $controlfile -x $statsControl -o $outputPNG -l $left -r $right -d $DownGene -h $EnhLeft -u $stats -v $input_organism.version
#elif $use_reg.use_reg_selector == "yes" and $use_control.use_control_selector == "no"
annotateGenes_wrapper.sh -y $log -f $inputfile -e $regfile -l $left -o $outputPNG -r $right -d $DownGene -h $EnhLeft -u $stats -v $input_organism.version
#else
annotateGenes_wrapper.sh -f $inputfile -c $controlfile -x $statsControl -l $left -y $log -o $outputPNG -r $right -d $DownGene -h $EnhLeft -u $stats -v $input_organism.version -e $regfile
#end if
These tools are available at: http://nebula.curie.fr/
You can see that the variable LOCAL_DIR holds the path of the tool, so I would
like to know whether it is possible to obtain this information without
hard-coding it.
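One general way to avoid a hard-coded directory, independent of anything Galaxy or the tool shed may offer (this thread does not confirm a Galaxy-side variable), is to let the wrapper work out its own location at run time. The sketch below is in Python rather than bash, and the helper names script_dir and resource_path are hypothetical; it assumes the helper files sit in the same directory as the wrapper script.

```python
import os


def script_dir(script_file):
    """Return the absolute directory that contains the given script file."""
    return os.path.dirname(os.path.abspath(script_file))


def resource_path(script_file, *parts):
    """Build a path to a file stored alongside the script (assumed layout)."""
    return os.path.join(script_dir(script_file), *parts)
```

Inside a wrapper script, resource_path(__file__, "geneAnnotation.pl") would then play the role of $LOCAL_DIR/geneAnnotation.pl without the install location being written into the script.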
Thank you for your answer.
Best Regards,
Fabien Mareuil
> Hello Fabien,
> I don't understand the issue - can you provide a sample tool config that
> includes these hard-coded paths? This initially sounds like an issue with
> the tool configs, not the tool shed, but I may see the problem with your
> clarification.
> Thanks,
> Greg Von Kuster
> On Jul 3, 2012, at 9:58 AM, Fabien Mareuil wrote:
>> Hi,
>> I have read the exchange between you and Florent Angly about "Problem with
>> new tool shed" and I have a problem with Nebula Tools:
>> http://nebula.curie.fr/.
>> At the Pasteur Institute, we have 4 Galaxy instances and I would like to
>> use a local tool shed instance for the Nebula installation.
>> However, Nebula has tools with hard-coded tool paths, and I don't want
>> to hard-code them. Is there a way to use something like
>> ${tool.install_dir} in the XML?
>> Thank you for your answer.
>> Best Regards,
>> Fabien Mareuil
10 years
tool_path
by Fabien Mareuil
Hi,
I have read the exchange between you and Florent Angly about "Problem with
new tool shed" and I have a problem with Nebula Tools:
http://nebula.curie.fr/.
At the Pasteur Institute, we have 4 Galaxy instances and I would like to
use a local tool shed instance for the Nebula installation.
However, Nebula has tools with hard-coded tool paths, and I don't want
to hard-code them. Is there a way to use something like
${tool.install_dir} in the XML?
Thank you for your answer.
Best Regards,
Fabien Mareuil
10 years
cufflinks
by Jennifer Jackson
On 7/1/12 8:30 PM, Paul
> Hello Jennifer,
> I was hoping you could enlighten me about a problem I am currently
> having. I have two rna-seq datasets that I am trying to evaluate
> using cufflinks - I keep getting null sets back with the second set,
> no matter what I do. I am pretty sure that the data are identical in
> nature, just from different conditions. Is there an issue with the
> current cufflinks instance? Or am I screwing up somehow. I am trying
> to evaluate item 149 from my history (which is the filtered and sorted
> set from Bowtie analysis). This should be identical in nature to item
> 113, with the only difference being the read source (same bug,
> different conditions). Both were mapped to the same ref genome (from
> the history) and the same annotation file (again, from the history).
> Any help is much appreciated!
>
> --
> Paul
--
Jennifer Jackson
http://galaxyproject.org
10 years
GMOD Summer School application deadline
by Scott Cain
Hello,
The deadline to apply for the GMOD Summer School is in one week, July
9th. The application is available as a Google Form:
https://docs.google.com/spreadsheet/embeddedform?formkey=dG5hNGFiQ3UwYTV2...
In the GMOD Summer School (August 24-29, 2012) we will cover the
installation, configuration and use of a variety of GMOD tools,
including Chado, GBrowse, JBrowse and Galaxy. For more information on
the course, see the course web page at
http://gmod.org/wiki/2012_GMOD_Summer_School
The course will make heavy use of the Amazon Web Service (aka, the
Cloud) via a grant from Amazon. Enrollment is limited to 24 students,
and the application process is competitive: the last few years we've
received over 75 applications for those 24 spots.
I look forward to seeing you in North Carolina in August!
--
------------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research
10 years
sanitizer for carriage return
by Katrien Bernaerts
Dear,
I am making a Galaxy application with a text area. In the text area, the
user can copy/paste sequences. However, all carriage returns (e.g. after
the comment line) are converted to XX by Galaxy. I found that a sanitizer
can be used for special characters, but I could not figure out how to
configure the sanitizer for a carriage return. Does anyone have an idea how
to handle carriage returns in the user input?
Thanks in advance,
--
Katrien Bernaerts
10 years