I would like to map (e.g. with Bowtie) collapsed sequences (tags) instead of individual sequence reads. Does anyone know if this is possible in Galaxy?
Thank you in advance.
Soetkin Versteyhe, PhD
University of Copenhagen
Faculty of Health Sciences
The Novo Nordisk Foundation
Center for Basic Metabolic Research
2200 København N
PHONE +45 35337116
I have a postdoc opening in my lab that could be an excellent opportunity for members of this list. The project is extremely cool, and will incorporate elements of ecology and systems biology. The position is open now, and I would like to fill it ASAP.
Genomic basis of species diversity and ecosystem functioning
We seek a postdoctoral scholar to collaborate on a project funded by the U.S. National Science Foundation to study the evolutionary basis of species diversity and ecosystem functioning in freshwater green algae. The successful candidate will take intellectual lead on the genomics portion of laboratory and field experiments that will (1) determine how much genetic differentiation is required for species to stably coexist, and (2) determine how mechanisms that allow for coexistence also impact community-level processes such as primary production. The candidate will also pursue his or her own research interests within the broader context of the grant proposal.
The candidate will work in Dr. Charles Delwiche's lab at the University of Maryland – College Park, and will collaborate with researchers in the labs of Drs. Bradley Cardinale (an ecologist at the University of Michigan) and Todd Oakley (a phylogeneticist at UC-Santa Barbara).
The position requires a Ph.D. in the biological sciences, bioinformatics, or a related field. Experience with high-throughput sequencing, sequence analysis, algal/plant biodiversity, or RNA biology is desirable.
This is a three-year position, with an initial one-year appointment and renewals contingent on successful progress in research. The starting salary will be $42,000 with full benefits.
The University of Maryland is located in College Park, a suburb of the Washington, D.C. Metropolitan Area, and provides a vibrant cultural and academic environment with easy access to a vast array of Federal research facilities.
The position is available immediately. To apply formally, send a curriculum vitae, the names of 3 references, and a brief statement of how your research goals fit with research on algal biodiversity, systematics, and evolutionary biology to: aaalgeee(a)gmail.com
Charles F. Delwiche
Professor, Cell Biology and Molecular Genetics
CBMG, 0101J Biosciences Research Building #413
University of Maryland
College Park, MD 20742-4407
http://www.life.umd.edu/labs/delwiche tel: 301-405-8286 fax: 301-314-1248
There is still only you, listening to the music of the wind and lost in the stagnant cries of the miseries lost in the eyes of others laughing at you. How can someone so different be so much the same? - Liu Sola
Please keep all replies on list; this allows the community to assist and to benefit from this correspondence.
SICER requires BED input. To go from BAM to BED:
1.) Convert BAM to SAM
2.) Convert SAM to interval (tool: "Convert SAM to interval")
3.) Convert interval to BED (6+). This can be done implicitly (by selecting the Interval dataset, which will be marked with '(as bed)' in the SICER input box) or explicitly, by clicking on the pencil icon and converting under the section "Convert to new format".
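For readers working outside Galaxy, the coordinate logic behind the SAM-to-BED steps above can be sketched in a few lines of Python. This is only an illustration, not the code the Galaxy converters run: the function names are hypothetical, and it assumes standard tab-separated SAM records (1-based POS, strand taken from flag bit 0x10) and plain BED6 output (0-based, half-open coordinates).

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
_REF_CONSUMING = set("MDN=X")
_CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_length(cigar):
    """Number of reference bases covered by an alignment's CIGAR string."""
    return sum(int(n) for n, op in _CIGAR_RE.findall(cigar) if op in _REF_CONSUMING)


def sam_line_to_bed6(line):
    """Convert one SAM line to a BED6 tuple; return None for headers/unmapped reads.

    Hypothetical helper for illustration only.
    """
    if line.startswith("@"):  # SAM header line
        return None
    fields = line.rstrip("\n").split("\t")
    qname, flag, rname, pos, mapq, cigar = fields[:6]
    flag = int(flag)
    if flag & 0x4 or rname == "*" or cigar == "*":  # unmapped read
        return None
    start = int(pos) - 1  # SAM POS is 1-based; BED start is 0-based
    end = start + reference_length(cigar)  # BED end is exclusive
    strand = "-" if flag & 0x10 else "+"  # 0x10 = read on reverse strand
    return (rname, start, end, qname, int(mapq), strand)


def sam_to_bed6(sam_lines):
    """Yield tab-separated BED6 lines for every mapped alignment in sam_lines."""
    for line in sam_lines:
        record = sam_line_to_bed6(line)
        if record is not None:
            yield "\t".join(str(value) for value in record)
```

The sixth (strand) column is the one SICER needs, which is why a BED file with fewer than six columns is not enough.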
Please let us know if we can provide additional assistance.
Thanks for using Galaxy,
On Nov 29, 2011, at 1:23 PM, Anupam Paliwal wrote:
> Hi Daniel,
> Thanks for your kind attention and advice.
> I have followed the following workflow: I aligned my query sequences to
> the reference genome using Bowtie; the Bowtie aligned SAM file was
> subjected to filter-SAM before converting it to BAM. I then converted the
> BAM file back to SAM before subjecting it to pileup.
> However, I now have the Input format file (after pileup of SAM) but am
> unable to convert it to BAM format to be able to submit it to SICER.
> Please see if you can suggest how to convert the Input files back to BAM.
> I have tried changing directly through edit-attributes, but it shows
>> Hi AP,
>> SICER requires BED formatted input with at least 6 columns (for strand
>> information). You can convert your BAM files into SAM and then into
>> interval and BED format. Once you have your input in the BED (6+) format,
>> you should be able to use these tools. Please let us know if we can
>> provide additional information.
>> Thanks for using Galaxy,
>> On Nov 23, 2011, at 12:26 PM, Anupam Paliwal wrote:
>>> I want to use SICER or Find Peaks for peak calling on GALAXY.
>>> I am using my aligned ChIP-seq tag .BAM files. However for both the tools
>>> the history is unable to pick the Bowtie-aligned SAM to BAM converted
>>> On the other hand, using MACS the same files are working nicely for peak
>>> The Galaxy User list should be used for the discussion of
>>> Galaxy analysis and other features on the public server
>>> at usegalaxy.org. Please keep all replies on the list by
>>> using "reply all" in your mail client. For discussion of
>>> local Galaxy instances and the Galaxy source code, please
>>> use the Galaxy Development list:
>>> To manage your subscriptions to this and other Galaxy lists,
>>> please use the interface at:
Did you try to run Cuffcompare (part of Cufflinks) on your results?
According to the Cufflinks manual (http://cufflinks.cbcb.umd.edu/manual.html):
>Cufflinks includes a program that you can use to help analyze the
transfrags you assemble. The program cuffcompare helps you:
> - Compare your assembled transcripts to a reference annotation
In the Galaxy version of Cuffcompare, I think that you can provide a
reference annotation file using "Use Reference Annotation:", which will be
compared with your Cufflinks results.
It makes a "union" of the transcripts obtained with Cufflinks and the
annotation file (both in *.gtf format). You can then obtain a transcript
identifier for those already annotated.
It also provides a class code for each transcript, which can indicate, for
example, a potential isoform.
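Once Cuffcompare has produced its combined GTF, the class codes and reference identifiers can be pulled out with a short script. The following is a rough sketch rather than an official parser: the function names are hypothetical, and the attribute names `oId`, `class_code` and `nearest_ref` are my assumption about what the combined GTF contains, so please check them against your own file.

```python
import re

# Matches GTF attribute pairs of the form: key "value";
_ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_gtf_attributes(attr_field):
    """Turn the 9th GTF column into a dict of attribute name -> value."""
    return dict(_ATTR_RE.findall(attr_field))


def transcript_annotations(gtf_lines):
    """Map each transcript id to (class_code, nearest_ref).

    Assumes (not verified here) that the original Cufflinks id is stored
    under 'oId'; falls back to 'transcript_id' if it is absent.
    """
    result = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = parse_gtf_attributes(fields[8])
        key = attrs.get("oId", attrs.get("transcript_id"))
        if key is None:
            continue
        result[key] = (attrs.get("class_code"), attrs.get("nearest_ref"))
    return result
```

Calling `transcript_annotations(open("combined.gtf"))` (the file name is only an example) would give a dictionary keyed by the "CUFF.x.y" identifiers, which can then be joined to the FPKM table.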
Hope this helps.
Emilie Chautard, PhD
Ontario Institute for Cancer Research
MaRS Centre, South Tower
101 College Street, Suite 800
Toronto, Ontario, Canada M5G 0A3
> Message: 7
> Date: Thu, 20 Oct 2011 15:12:45 +0200
> From: GANDRILLON OLIVIER <olivier.gandrillon(a)univ-lyon1.fr>
> To: "galaxy-user(a)bx.psu.edu" <galaxy-user(a)bx.psu.edu>
> Subject: [galaxy-user] Names for genes in RNA-Seq analysis
> I am using Galaxy to analyse RNA-seq libraries made from chicken cells.
> I just groomed my sequences, passed them through TopHat and then Cufflinks.
> This worked well and in the end I get a list of genes and their respective
> FPKM values.
> My only problem is that the names of the genes do not appear in the
> listing; they are simply referenced as "CUFF.1, CUFF.2," etc.
> Could you please tell me how I could obtain gene names? (I went through the
> FAQ and could not get the answer).
I have Illumina ChIP-seq data and I want to use the "Draw quality score
Boxplot" tool.
I ran the "quality format converter (ASCII numeric)", but "Draw
quality score Boxplot" fails with the error "An error occurred running this
job:Could not find/open font when opening font "arial", .."
Where is my problem?
thank you so much
We have a 454 metagenomic dataset. We have used barcode splitter to divide the dataset into its constituent amplicons. We have also been using a clustering application (dnaclust) in Galaxy to subdivide the dataset by similarity. My question is: are there Galaxy tools that allow combining, sorting and counting these two outputs? For example, can each cluster, and then each sequence within that cluster, be given an identifier, so that one can then split the output by barcode and summarise the data along the lines of: amplicon/barcode X has X number of sequences within cluster 1, X number of sequences within cluster 2, etc.? Am I making any sense?
This is the sort of problem that sounds like it is solvable in Excel and, indeed, a UK colleague of mine has been doing just this. But is there a straightforward means to do so in Galaxy? It is not obvious to me in the Filtering or Sorting tools.
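To make the tabulation I have in mind concrete, here is a minimal Python sketch. It assumes the two outputs have already been reduced to simple lookups (sequence ID to barcode, and sequence ID to cluster); the function name and input layout are hypothetical and are not what barcode splitter or dnaclust emit directly.

```python
from collections import Counter


def count_by_barcode_and_cluster(seq_to_barcode, seq_to_cluster):
    """Count sequences for every (barcode, cluster) pair.

    Both arguments are dicts keyed by sequence ID (an assumed layout).
    Sequences missing from the cluster table are skipped.
    """
    counts = Counter()
    for seq_id, barcode in seq_to_barcode.items():
        cluster = seq_to_cluster.get(seq_id)
        if cluster is None:
            continue  # sequence was not assigned to any cluster
        counts[(barcode, cluster)] += 1
    return counts


def format_summary(counts):
    """Return sorted, tab-separated lines: barcode, cluster, count."""
    return [
        f"{barcode}\t{cluster}\t{n}"
        for (barcode, cluster), n in sorted(counts.items())
    ]
```

The result is one row per barcode/cluster pair, which is the same table the Excel approach would produce with a pivot.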
A new version of CloudMan for running Galaxy on Amazon cloud has been
released today. Any new cluster will automatically use this version.
Existing clusters will have a link displayed at the top of the CloudMan
console offering to perform an automated update.
The new version brings the following updates/features:
- Added ability to specify a path where Galaxy is installed as part of user
data (using the galaxy_home key). This allows a custom Galaxy application to be
installed and picked up by CloudMan instead of the default one. This works
across cluster invocations as well as for shared clusters. For a complete
list of user data options see http://wiki.g2.bx.psu.edu/Admin/Cloud/UserData
- Use /etc/profile instead of /etc/bash.bashrc for system wide shell logins
- Support for 3.0 Kernel on Ubuntu 11.10 for SGE. Contributed by Brad
- Fix for SGE install after cloud-init has run and changed /etc/hosts
- post_start_service now runs if the script exists in the cluster bucket
even if no URL was provided as part of current user data
- Fix recognition of existing and attached file system volumes on instance
I am running Galaxy locally and it has been performing flawlessly! I
wanted to get more insight about this flag in the FASTQ Quality Trimmer
Maximum number of bases to exclude from the window during aggregation
Does it mean the number of 5' bases to exclude while doing the
trimming step [i.e. the sliding window starts this many bp after the read
start]? I would really appreciate if someone could shed more light on
I want to use SICER or Find Peaks for peak calling on GALAXY.
I am using my aligned ChIP-seq tag .BAM files. However for both the tools
the history is unable to pick the Bowtie-aligned SAM to BAM converted
On the other hand, using MACS the same files are working nicely for peak