April 2013 - galaxy-user - lists.galaxyproject.org

Gain sequences after mapping reads to a reference genome
by zwp112358 28 Apr '13

Hi all,

I would like to know whether mapping reads to a reference genome in Galaxy can generate the query genome sequence. I mapped my Illumina reads to a reference genome with both BWA and Bowtie on the public Galaxy server (https://main.g2.bx.psu.edu/root). As a result, I obtained SAM files, but I can't find how to generate the resulting assembled genome sequence. Does anyone know how to do this? Any reply would be much appreciated.

Best regards,
Weiping

Weiping Zhang, Doctor candidate; School of bioengineering, Jiangnan University; Lihu Roads 1800#, WUXI, Jiangsu; Zip Code: 214122

gff installation failed with easy_install
by Mic 28 Apr '13

Hi,

I have tried to install gff with easy_install, but I got the following error:

$ easy_install --prefix=/home/mic/apps/pymodules -UZ https://github.com/chapmanb/bcbb/tree/master/gff
Downloading https://github.com/chapmanb/bcbb/tree/master/gff
error: Unexpected HTML page found at https://github.com/chapmanb/bcbb/tree/master/gff

How is it possible to install gff?

Thank you in advance.
Mic

GenBank database files
by Mike Dyall-Smith 28 Apr '13

This should be easy (but not for me so far). I want to run local BLAST searches, so I downloaded the premade nr protein BLAST database from GenBank. It is split into 10 .tar.gz files. I've decompressed them all, and now I want to put all the file parts together. Can I simply concatenate all similar files (e.g. all 10 parts of the .phd files)? The Readme mentions the use of an alias file, but I did not find it at all clear. A set of step-by-step decompression and restoration instructions would be useful; I could not find any.

Thanks for any assistance,
Mike DS

Regarding combining several txt files and producing one fastq file
by Jennifer Jackson 28 Apr '13

Hi Yona,

To merge together multiple datasets this way, use the tool "Text Manipulation -> Concatenate datasets tail-to-head". This works on two datasets at a time, so you may need to run it a few times if you have more than that, adding a new file to the master merged file with each run.

Watch out for unintentionally introducing blank lines between the files. To remove any that are present (it doesn't harm a file if none are there), after you have merged all the files together, use the tool "Filter and Sort -> Select" with the option "NOT Matching" and the expression: ^$

Once you are sure that the merged file is correct, you can permanently delete the working files to recover disk space. "FastQC" and/or "FASTQ Groomer" are generally both good at detecting format problems.

http://wiki.galaxyproject.org/Support#Error_from_tools

Good luck with your project,

Jen
Galaxy team

On 4/27/13 8:23 PM, Yona Kim wrote:
> Dear Jennifer
>
> I was wondering if there is a tool in galaxy that combines several txt
> files (which I got from decompressing fastq.tgz file) and produce one
> fastq file from them.
>
> I was searching it in google and read your previous email to somebody
> else and you mentioned about the tool "cat" which seems to be the
> right tool for me to use to combine these txt files in order to
> produce one fastq file.. but I can't find this tool..
>
> any advice?
>
> Thank you very much and I always appreciate your help very much!!
>
> Bests,
>
> Yona Kim

--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org
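For readers working outside Galaxy, the merge-then-remove-blank-lines procedure described above can be sketched in a few lines of Python. This is only an illustration, not what the Galaxy tools run internally: the function name `merge_text_files` and the file names in the usage note are hypothetical, and the blank-line test simply mirrors the `^$` expression by dropping lines that are completely empty.

```python
def merge_text_files(part_paths, merged_path):
    """Concatenate text files in the given order, skipping empty lines.

    Hypothetical helper: it mirrors running "Concatenate datasets"
    repeatedly and then "Select ... NOT Matching ^$" on the result.
    Returns the number of lines written.
    """
    kept = 0
    with open(merged_path, "w") as out:
        for part in part_paths:
            with open(part) as handle:
                for line in handle:
                    # Strip only the line ending, so a line is dropped
                    # only when it is truly empty (the ^$ case).
                    text = line.rstrip("\r\n")
                    if text == "":
                        continue
                    # Always write a newline, so a part that lacks a final
                    # newline does not get glued to the next part.
                    out.write(text + "\n")
                    kept += 1
    return kept
```

A call such as `merge_text_files(["part1.txt", "part2.txt"], "merged.fastq")` would write the parts in order into one file. As with the Galaxy filter, a record that legitimately contains an empty line would be altered, so checking the result with FastQC afterwards is still worthwhile.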

problem with the public Main Galaxy server
by Ianiri, Giuseppe 26 Apr '13

Hi,

Is anybody else experiencing problems with the public Main Galaxy server at http://main.g2.bx.psu.edu? It just does not work anymore. Any suggestions?

Giuseppe Ianiri

Barcode splitter on paired end Illumina reads
by Veranja Liyanapathirana 26 Apr '13

Dear Galaxy team,

I am so sorry for repeatedly posting the same question, but I do need some input on this. Please let me know the best way to use Barcode Splitter on paired-end MiSeq data. The data is already split by the Illumina indexes using MiSeq Reporter; what I want to do is split by some in-house barcodes within each sample. The barcodes are present at both the 5' and 3' ends, and they are the same at both ends.

Please let me know whether the best practice is to:

1. Join reads 1 and 2, barcode split, and then split the two reads apart again, or
2. Barcode split reads 1 and 2 separately, then join them using FASTQ joiner and split again.

Basically, I want to exclude any read pairs where the two same-numbered reads are not assigned to the same barcode.

Thanks.

Kind regards,
Veranja

Veranja Liyanapathirana
Graduate student
Microbiology, CUHK
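The requirement stated above (keep a pair only when both same-numbered reads fall into the same barcode) can be written down as a small Python sketch. It is not a description of how the Galaxy Barcode Splitter works. It assumes, purely for illustration, that each mate begins with the in-house barcode, that matching is exact, and that the pairs are already loaded as `(read_id, read1_sequence, read2_sequence)` tuples; the names `assign_barcode` and `split_concordant_pairs` are hypothetical.

```python
def assign_barcode(sequence, barcodes):
    """Return the name of the barcode the read starts with, or None.

    Assumption: the barcode sits at the very start of the read and
    must match exactly (no mismatches allowed).
    """
    for name, barcode in barcodes.items():
        if sequence.startswith(barcode):
            return name
    return None


def split_concordant_pairs(pairs, barcodes):
    """Bin read pairs by barcode, keeping only concordant pairs.

    pairs: iterable of (read_id, read1_sequence, read2_sequence).
    Returns (bins, discarded): bins maps barcode name to the read ids
    whose two mates agree; discarded lists every other read id.
    """
    bins = {name: [] for name in barcodes}
    discarded = []
    for read_id, seq1, seq2 in pairs:
        b1 = assign_barcode(seq1, barcodes)
        b2 = assign_barcode(seq2, barcodes)
        if b1 is not None and b1 == b2:
            bins[b1].append(read_id)
        else:
            # Unassigned in either mate, or assigned to different barcodes.
            discarded.append(read_id)
    return bins, discarded
```

Checking both mates before binning avoids the join, split, and re-join round trip: a pair is either kept whole in one bin or discarded whole.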

Problems Adding a Custom Build
by Mackenzie Gavery 26 Apr '13

Hi,

I am having two problems that I think are related:

1. I have not been able to visualize (in Trackster) a custom build that I recently added (Trackster says: "Could not load chroms for this dbkey:"). In addition, when I try to Operate on Genomic Intervals using BED files associated with that particular build, I get an error:

Error executing tool: maximum recursion depth exceeded while calling a Python object

2. Now I am trying to add a new custom build to see if there was something wrong with the previous build, and I get an error right after I click on "Add a new Custom Build" in the New Visualization menu (The error has been logged to our team. If you want to contact us about this error, please reference the following GURU MEDITATION: #9e466e8a843e4ce9be878f3223bb0be4).

I am just wondering if anyone is having similar issues, or if this is a temporary problem?

Thanks,
Mackenzie

Join, Subtract and Group PLUS large GOA files
by Colleen Burge 25 Apr '13

Hello all,

I've been using "Join, Subtract and Group" to join my transcriptome/annotation data to GO and GO Slim for some time (on the Main Galaxy). I just updated my GO files because I've run a new data set, and I have been having trouble with the join: it never seems to complete, whereas before it would be done in just a few minutes. It works just fine when joining my "new data" with my "old" GO files (which of course are now out of date), but not with the new GO files from either my collaborator or EBI (specifically the unipro). Could it be a file size limitation?

Thanks,
Colleen

ImportError: Unable to run arch-specific checks
by Shun Liang 25 Apr '13

Dear Galaxy Developers,

I am trying to run Galaxy on ARM architecture. I am able to scramble the eggs locally. However, I am unable to run Galaxy on my machine, as "run.sh" gives an error after some initialization steps:

galaxy@arm01-48:/mnt/ceph/galaxy/galaxy-dist$ ./run.sh
Initializing datatypes_conf.xml from datatypes_conf.xml.sample
Initializing external_service_types_conf.xml from external_service_types_conf.xml.sample
Initializing migrated_tools_conf.xml from migrated_tools_conf.xml.sample
Initializing reports_wsgi.ini from reports_wsgi.ini.sample
Initializing shed_tool_conf.xml from shed_tool_conf.xml.sample
Initializing tool_conf.xml from tool_conf.xml.sample
Initializing shed_tool_data_table_conf.xml from shed_tool_data_table_conf.xml.sample
Initializing tool_data_table_conf.xml from tool_data_table_conf.xml.sample
Initializing tool_sheds_conf.xml from tool_sheds_conf.xml.sample
Initializing data_manager_conf.xml from data_manager_conf.xml.sample
Initializing shed_data_manager_conf.xml from shed_data_manager_conf.xml.sample
Initializing openid_conf.xml from openid_conf.xml.sample
Initializing tool-data/shared/ncbi/builds.txt from builds.txt.sample
Initializing tool-data/shared/ensembl/builds.txt from builds.txt.sample
Initializing tool-data/shared/ucsc/builds.txt from builds.txt.sample
Initializing tool-data/shared/ucsc/publicbuilds.txt from publicbuilds.txt.sample
Initializing tool-data/shared/igv/igv_build_sites.txt from igv_build_sites.txt.sample
Initializing tool-data/shared/rviewer/rviewer_build_sites.txt from rviewer_build_sites.txt.sample
Initializing tool-data/add_scores.loc from add_scores.loc.sample
Initializing tool-data/alignseq.loc from alignseq.loc.sample
Initializing tool-data/all_fasta.loc from all_fasta.loc.sample
Initializing tool-data/annotation_profiler_options.xml from annotation_profiler_options.xml.sample
Initializing tool-data/annotation_profiler_valid_builds.txt from annotation_profiler_valid_builds.txt.sample
Initializing tool-data/bfast_indexes.loc from bfast_indexes.loc.sample
Initializing tool-data/binned_scores.loc from binned_scores.loc.sample
Initializing tool-data/blastdb.loc from blastdb.loc.sample
Initializing tool-data/blastdb_p.loc from blastdb_p.loc.sample
Initializing tool-data/bowtie2_indices.loc from bowtie2_indices.loc.sample
Initializing tool-data/ccat_configurations.loc from ccat_configurations.loc.sample
Initializing tool-data/codingSnps.loc from codingSnps.loc.sample
Initializing tool-data/encode_datasets.loc from encode_datasets.loc.sample
Initializing tool-data/faseq.loc from faseq.loc.sample
Initializing tool-data/funDo.loc from funDo.loc.sample
Initializing tool-data/gatk_annotations.txt from gatk_annotations.txt.sample
Initializing tool-data/gatk_sorted_picard_index.loc from gatk_sorted_picard_index.loc.sample
Initializing tool-data/liftOver.loc from liftOver.loc.sample
Initializing tool-data/maf_index.loc from maf_index.loc.sample
Initializing tool-data/maf_pairwise.loc from maf_pairwise.loc.sample
Initializing tool-data/microbial_data.loc from microbial_data.loc.sample
Initializing tool-data/mosaik_index.loc from mosaik_index.loc.sample
Initializing tool-data/ngs_sim_fasta.loc from ngs_sim_fasta.loc.sample
Initializing tool-data/perm_base_index.loc from perm_base_index.loc.sample
Initializing tool-data/perm_color_index.loc from perm_color_index.loc.sample
Initializing tool-data/phastOdds.loc from phastOdds.loc.sample
Initializing tool-data/picard_index.loc from picard_index.loc.sample
Initializing tool-data/quality_scores.loc from quality_scores.loc.sample
Initializing tool-data/regions.loc from regions.loc.sample
Initializing tool-data/sam_fa_indices.loc from sam_fa_indices.loc.sample
Initializing tool-data/sequence_index_base.loc from sequence_index_base.loc.sample
Initializing tool-data/sequence_index_color.loc from sequence_index_color.loc.sample
Initializing tool-data/sift_db.loc from sift_db.loc.sample
Initializing tool-data/srma_index.loc from srma_index.loc.sample
Initializing tool-data/twobit.loc from twobit.loc.sample
Initializing static/welcome.html from welcome.html.sample
Traceback (most recent call last):
  File "/mnt/ceph/galaxy/galaxy-dist/lib/galaxy/webapps/galaxy/buildapp.py", line 36, in app_factory
    from galaxy.app import UniverseApplication
  File "/mnt/ceph/galaxy/galaxy-dist/lib/galaxy/app.py", line 17, in <module>
    from galaxy.visualization.data_providers.registry import DataProviderRegistry
  File "/mnt/ceph/galaxy/galaxy-dist/lib/galaxy/visualization/data_providers/registry.py", line 2, in <module>
    from galaxy.visualization.data_providers import genome
  File "/mnt/ceph/galaxy/galaxy-dist/lib/galaxy/visualization/data_providers/genome.py", line 13, in <module>
    import numpy
  File "/mnt/ceph/galaxy/galaxy-dist/eggs/numpy-1.6.0-py2.7-linux-armv7l-ucs4.egg/numpy/__init__.py", line 137, in <module>
    import add_newdocs
  File "/mnt/ceph/galaxy/galaxy-dist/eggs/numpy-1.6.0-py2.7-linux-armv7l-ucs4.egg/numpy/add_newdocs.py", line 9, in <module>
    from numpy.lib import add_newdoc
  File "/mnt/ceph/galaxy/galaxy-dist/eggs/numpy-1.6.0-py2.7-linux-armv7l-ucs4.egg/numpy/lib/__init__.py", line 4, in <module>
    from type_check import *
  File "/mnt/ceph/galaxy/galaxy-dist/eggs/numpy-1.6.0-py2.7-linux-armv7l-ucs4.egg/numpy/lib/type_check.py", line 8, in <module>
    import numpy.core.numeric as _nx
  File "/mnt/ceph/galaxy/galaxy-dist/eggs/numpy-1.6.0-py2.7-linux-armv7l-ucs4.egg/numpy/core/__init__.py", line 5, in <module>
    import multiarray
ImportError: /mnt/ceph/galaxy/galaxy-dist/eggs/numpy-1.6.0-py2.7-linux-armv7l-ucs4.egg/numpy/core/multiarray.so: Unable to run arch-specific checks

May I ask if anyone has any idea about this error? Many thanks.

Best regards,
Shun

Teaching using Galaxy
by David Joly 25 Apr '13

Hi everybody!

I am currently creating a "bioinformatics" course for undergraduates (biology students with no knowledge of programming). I would like to use Galaxy as their everyday platform, where they would learn the basics and use the appropriate tools (BLAST and databases, multiple alignment, phylogenetics, dealing with "omics" data, and so on).

Are there any resources available on using Galaxy for teaching undergraduates? Any suggestions of good textbooks? Not a Galaxy textbook of course, but a "bioinformatics textbook" that would be a good companion to help the students understand the basics behind the tools.

Thanks,
DJ
