December 2010 - galaxy-user - lists.galaxyproject.org

MACS problem
by Christopher Scharer 13 Dec '10

Hi,

I recently mapped some ChIP-seq data to the mm8 version of the mouse genome with Bowtie to create a SAM file. However, I can not run MACS on the mapped data because I keep getting the error below. Any suggestions??

Thanks,
Chris

Messages from MACS:

INFO @ Thu, 11 Nov 2010 10:07:30:
# ARGUMENTS LIST:
# name = MACS_in_Galaxy
# format = SAM
# ChIP-seq file = /galaxy/home/g2main/galaxy_main/database/files/001/727/dataset_1727796.dat
# control file = None
# effective genome size = 2.70e+09
# tag size = 25
# band width = 300
# model fold = 32
# pvalue cutoff = 1.00e-05
# Ranges for calculating regional lambda are : peak_region,1000,5000,10000
INFO @ Thu, 11 Nov 2010 10:07:30: #1 read tag files...
INFO @ Thu, 11 Nov 2010 10:07:30: #1 read treatment tags...
INFO @ Thu, 11 Nov 2010 10:07:43: 1000000
INFO @ Thu, 11 Nov 2010 10:07:57: 2000000
INFO @ Thu, 11 Nov 2010 10:08:08: 3000000
INFO @ Thu, 11 Nov 2010 10:08:21: 4000000
INFO @ Thu, 11 Nov 2010 10:08:33: 5000000
Traceback (most recent call last):
  File "/home/g2main/linux2.6-x86_64/bin/macs", line 273, in <module>
    main()
  File "/home/g2main/linux2.6-x86_64/bin/macs", line 57, in main
    (treat, control) = load_tag_files_options (options)
  File "/home/g2main/linux2.6-x86_64/bin/macs", line 252, in load_tag_files_options
    treat = options.build(open2(options.tfile, gzip_flag=options.gzip_flag))
  File "/home/g2main/linux2.6-x86_64/lib/python2.6/MACS/IO/__init__.py", line 1480, in build_fwtrack
    (chromosome,fpos,strand) = self.__fw_parse_line(thisline)
  File "/home/g2main/linux2.6-x86_64/lib/python2.6/MACS/IO/__init__.py", line 1500, in __fw_parse_line
    bwflag = int(thisfields[1])
ValueError: invalid literal for int() with base 10: 'CTCF:7:185:443:687'

--
Chris Scharer, PhD
Post-doctoral Fellow
Laboratory of Dr. Jeremy Boss
Dept of Immunology and Microbiology
Emory University
Atlanta GA 30322
Ph: 404-727-5959
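
The traceback shows MACS calling int() on the second tab-separated field of a line, which in SAM is the FLAG column, and finding the read name 'CTCF:7:185:443:687' there instead. One way to narrow this down is to look for the first alignment line whose second field is not an integer. The following is only a minimal sketch and is not part of MACS or Galaxy: the function name first_bad_flag_line is hypothetical, the sample lines are made up, and it assumes a tab-delimited file with @-prefixed header lines.

```python
import io


def first_bad_flag_line(handle):
    """Return (line_number, line) for the first alignment line whose second
    tab-separated field (the SAM FLAG column) is not an integer, or None
    if every alignment line passes the check."""
    for number, line in enumerate(handle, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            return number, line
        try:
            int(fields[1])
        except ValueError:
            return number, line
    return None


# Made-up sample: the third line has an extra leading column, so the read
# name ends up where the FLAG should be.
sample = io.StringIO(
    "@HD\tVN:1.0\n"
    "read1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "0\tCTCF:7:185:443:687\t16\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
)
result = first_bad_flag_line(sample)
print(result)
```

On a real dataset the handle would come from open() on the SAM file. Looking at the reported line usually makes it clear whether the file has a shifted column or is not in SAM format at all.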

p_ids and cufflinks
by David Matthews 08 Dec '10

Just a thought, I notice that in the ensemble.gtf file the protein ids are listed as follows:

chr11 protein_coding CDS 129060 129388 . - 0 gene_id "ENSG00000230724"; transcript_id "ENST00000382784"; exon_number "1"; gene_name "AC069287.3"; transcript_name "AC069287.3-201"; protein_id "ENSP00000372234"

Is the p_id problem in cufflinks because the ensemble.gtf file uses the word protein_id and not p_id???

Cheers
David

__________________________________
Dr David A. Matthews
Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol. BS8 1TD
U.K.
Tel. +44 117 3312058
D.A.Matthews(a)bristol.ac.uk
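
To check which attribute tags a GTF line actually carries, the attribute column can be split into key/value pairs. This is a minimal sketch under stated assumptions: parse_gtf_attributes is a hypothetical helper, not a Cufflinks or Galaxy function, and it assumes no semicolons occur inside quoted values. It only shows that the line quoted above has a protein_id tag and no p_id tag; it does not settle whether that is the cause of the problem.

```python
def parse_gtf_attributes(attribute_field):
    """Split a GTF attribute column into a dict of tag -> value.
    Assumes 'tag "value";' pairs with no ';' inside the quoted values."""
    attributes = {}
    for part in attribute_field.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        # The tag name is everything up to the first space.
        key, _, value = part.partition(" ")
        attributes[key] = value.strip().strip('"')
    return attributes


# Attribute column copied from the GTF line quoted in the post.
line_attrs = (
    'gene_id "ENSG00000230724"; transcript_id "ENST00000382784"; '
    'exon_number "1"; gene_name "AC069287.3"; '
    'transcript_name "AC069287.3-201"; protein_id "ENSP00000372234"'
)
attrs = parse_gtf_attributes(line_attrs)
print("p_id present:", "p_id" in attrs)
print("protein_id present:", "protein_id" in attrs)
```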

Fwd: Galaxy on the cloud...
by Enis Afgan 08 Dec '10

Hi Richard,

These types of questions are best sent to the general Galaxy user mailing list so they get the most exposure, so I'm forwarding it.

Enis

---------- Forwarded message ----------
From: Richard Bruskiewich <r.bruskiewich(a)irri.org>
Date: Mon, Nov 29, 2010 at 9:08 PM
Subject: Re: Galaxy on the cloud...
To: Enis Afgan <eafgan(a)emory.edu>

Hi Enis,

Sorry to bother you again but I'm not sure who can answer this particular question on the Galaxy team.

I've been following the instructions on http://bitbucket.org/galaxy/galaxy-central/wiki/Config/ProductionServer to create a new remote server instance for Galaxy using NGINX as the proxy server (the latter configured using the instructions at http://bitbucket.org/galaxy/galaxy-central/wiki/Config/nginxProxy).

The instance *almost* works except for the following defect: only Microsoft IE renders the Galaxy home page correctly from the instance. Firefox and Chrome do not render the first page correctly (see the attached image). If I run a local Galaxy (developer's) instance, though, it renders fine in Firefox.

Is there some kind of additional NGINX-specific configuration that needs to be tweaked for this? I don't have much experience with NGINX, but decided to try it out since you folks - Galaxy - are using it on your main web site (so says the production page...).

Thank you in advance for your kind advice (or redirection of this query to a member of your team who knows what to do here...).

Cheers
Richard

Error with workflows
by Florent Angly 08 Dec '10

Hi all,

I attempted to make a small workflow on my Galaxy server. After saving it, I get this error each time I try to access the workflow tab at http://eury.eait.uq.edu.au:8080/workflow:

> Error - <class 'simplejson.decoder.JSONDecodeError'>: No JSON object could be decoded: line 1 column 0 (char 0)
> URL: http://eury.eait.uq.edu.au:8080/workflow
> File '/Users/florent/galaxy-dist/eggs/Paste-1.6-py2.6.egg/paste/exceptions/errormiddleware.py', line 143 in __call__
> File '/Users/florent/galaxy-dist/eggs/Paste-1.6-py2.6.egg/paste/recursive.py', line 80 in __call__
> File '/Users/florent/galaxy-dist/eggs/Paste-1.6-py2.6.egg/paste/httpexceptions.py', line 632 in __call__
> File '/Users/florent/galaxy-dist/lib/galaxy/web/framework/base.py', line 145 in __call__
> File '/Users/florent/galaxy-dist/lib/galaxy/web/controllers/workflow.py', line 114 in index
> File '/Users/florent/galaxy-dist/lib/galaxy/web/framework/__init__.py', line 84 in decorator
> File '/Users/florent/galaxy-dist/lib/galaxy/web/controllers/workflow.py', line 139 in list
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/query.py', line 1267 in all
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/query.py', line 1422 in instances
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/query.py', line 2032 in main
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/mapper.py', line 1729 in _instance
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/mapper.py', line 1614 in populate_state
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/strategies.py', line 762 in execute
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/mapper.py', line 1729 in _instance
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/mapper.py', line 1614 in populate_state
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/strategies.py', line 781 in execute
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/mapper.py', line 1729 in _instance
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/mapper.py', line 1614 in populate_state
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/orm/strategies.py', line 120 in new_execute
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/engine/base.py', line 1348 in __getitem__
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/engine/base.py', line 1620 in _get_col
> File '/Users/florent/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/types.py', line 288 in process
> File '/Users/florent/galaxy-dist/lib/galaxy/model/custom_types.py', line 32 in process_result_value
> File '/opt/local/galaxy-dist/eggs/simplejson-2.1.1-py2.6-macosx-10.6-universal-ucs2.egg/simplejson/decoder.py', line 402 in decode
> File '/opt/local/galaxy-dist/eggs/simplejson-2.1.1-py2.6-macosx-10.6-universal-ucs2.egg/simplejson/decoder.py', line 420 in raw_decode
> JSONDecodeError: No JSON object could be decoded: line 1 column 0 (char 0)

Any idea what is going on?

Thank you,
Florent

Re: [galaxy-user] Urgent :Some Help Regarding the Galaxy
by Guruprasad Ananda 07 Dec '10

Hi Vikas,

*** Please in the future, send all questions to the mailing list instead of individual email addresses; this will ensure that your questions can be answered in a timely manner and so that others can benefit. ***

> The Question to you is that Is it possible to make a annotation pipeline tool for the Prokaryotic Genome is possible using the Galaxy ? Can you suggest me that is it possible as the wsdl of many of the softwares which I have sorted out the best doesnt have the wsdl .

I'm not sure I quite understand your question. Are you asking if you could use the 'Profile Annotations' tool on prokaryotic genomes? or are you asking how to integrate an annotation tool into Galaxy? If you could please clarify the same, we'll be able to help you with this.

Thanks,
Guru.

On Dec 7, 2010, at 9:04 AM, vikas kumar wrote:

>
> The Question to you is that Is it possible to make a annotation pipeline tool for the Prokaryotic Genome is possible using the Galaxy ? Can you suggest me that is it possible as the wsdl of many of the softwares which I have sorted out the best doesnt have the wsdl .
>
> So if can kindly suggest me that such a big pipeline is possible or your views regarding this??
>
> thanking you
> --
> Vikas Kumar
> Project Assistant
> Institute Of Genomics and Integrative Biology
> New Delhi, India
>

Re: [galaxy-user] cuffdiff using the combined gtf files from cuffcompare
by Jennifer Jackson 07 Dec '10

Hello David,

You are correct about the tools, so the problem is most likely with the original GTF file. If gene_id is not assigned there correctly, then the data will not be sorted by gene_id.

Although GTF format is consistent (mostly!) between sources, the actual content can vary. One example is from UCSC - the GTF format from the Table browser will have the transcript name assigned to both the gene_id and the transcript_id tags in the attributes field (f9). Extracting the gene name from the track and swapping it into the GTF file's gene_id attribute tag would be a necessary pre-processing step before using downstream tools that rely on that attribute.

The good news is that you should be able to use Galaxy's Text Manipulation tools to do whatever file processing you need to do, from whatever input source you are using, once you have the data content loaded into your history. Create->save->use a workflow so that you only have to work out tedious file conversions step-by-step one time.

If you need more help, please let us know and share your history: Options -> Share or Publish -> Share with a user "jen(a)bx.pus.edu".

Thanks,
Jen
Galaxy team

ps. It is best to send data questions to the galaxy-user mailing list to help the community learn from each other. I am going to forward this answer there now, since this question has come up a few times recently after the addition of the new tools.

On 12/1/10 7:00 AM, David Matthews wrote:
> Dear Jennifer,
>
> Hope you can help, after using cuffdiff on my data using the combined
> gtf files from cuffcompare I get the usual list of files back. However,
> in the genes tracking file and the genes fpkm files many genes are
> listed more than once. My understanding was that cuffdiff was supposed
> to amalgamate these into one whole number for that gene id, am I doing
> something wrong?
>
> Cheers
> David
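
Outside Galaxy, the gene_id swap described above could be sketched in Python as follows. This is an illustration only, not a Galaxy tool: replace_gene_id and the transcript_to_gene mapping are hypothetical names, the sample line and gene name are made up, and building the mapping (for example from a table exported alongside the track) is assumed to have been done separately.

```python
import re

GENE_ID_PATTERN = re.compile(r'gene_id "[^"]*"')
TRANSCRIPT_ID_PATTERN = re.compile(r'transcript_id "([^"]*)"')


def replace_gene_id(gtf_line, transcript_to_gene):
    """Return the GTF line with gene_id replaced by the gene name mapped
    from its transcript_id; lines without a known transcript are returned
    unchanged (minus the trailing newline)."""
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return "\t".join(fields)
    match = TRANSCRIPT_ID_PATTERN.search(fields[8])
    if match and match.group(1) in transcript_to_gene:
        gene = transcript_to_gene[match.group(1)]
        # A function is used as the replacement so that any backslashes
        # in the gene name are inserted literally.
        fields[8] = GENE_ID_PATTERN.sub(
            lambda _m: 'gene_id "%s"' % gene, fields[8], count=1
        )
    return "\t".join(fields)


# Made-up example in the style described above: the transcript name sits
# in both gene_id and transcript_id.
line = (
    "chr1\texample_track\texon\t100\t200\t0.0\t+\t.\t"
    'gene_id "NM_000001"; transcript_id "NM_000001";'
)
mapping = {"NM_000001": "GENE1"}  # hypothetical transcript -> gene table
fixed = replace_gene_id(line, mapping)
print(fixed)
```

Only column 9 is touched, so coordinates and the transcript_id tag pass through unchanged.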

WS180 genomic sequence not available
by Zhu, Lihua (Julie) 07 Dec '10

Hi,

I am trying to fetch sequences from WS180 ce5. However, it seems that the sequences are not currently available for the specified build. Could you please make them available?

Thanks so much for your help!

Best regards,
Julie

file uploads
by Keith E. Giles 07 Dec '10

I am trying to upload .bam and .bigwig datasets, from GEO, into Galaxy. Galaxy will not recognize either dataset. Any suggestions? I can include the datasets if anyone has any thoughts.

thanks,
keith.

Retrieving genomic sequence using UCSC Table Browser
by Jim Johnson 07 Dec '10

I tried to retrieve a set of 20 bp length genomic sequences using the UCSC Table Browser, using assembly track and providing a set of defined regions. The Table Browser returned large sequence regions that included the requested regions instead of just the requested bases. Is there a setting for UCSC Table Browser that will return just the requested bases? Thanks, JJ

bx-python question about bed_intersect_basewise.py
by Amit Indap 07 Dec '10

Sorry, Galaxy, I hit send by accident in my earlier message. Like I said, I had a question about bed_intersect_basewise.py.

My first bed file has the following interval:

chr22 267 572

and my second bed file has the intervals:

chr22 147 267
chr22 267 387
chr22 387 507
chr22 507 627

When I run the program, I get the answer

chr22 202 572

But I'm trying to see if there is a way I can get the complement, bases in the second file that don't overlap the first. The desired answer would be:

chr22 147 267
chr22 572 627

I know there is a -v option on bed_intersect.py, but I want the equivalent basewise behavior. I was trying to poke around the BinnedBitSet code, and I see there is an invert method, but calling the invert method has some side effects on the rest of the code (either the next_set or next_clear call), and I get a run-time error. The BinnedBitSet code is somewhat of a black box to me, but I'd appreciate any pointers in the right direction.

Thanks for your help,
Amit Indap

--
Amit Indap
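
For reference, the basewise complement asked about here can be computed without bit sets by subtracting one interval list from the other. The sketch below is plain Python and deliberately does not use bx-python or BinnedBitSet; subtract_intervals is a hypothetical helper that assumes half-open BED coordinates and that both lists belong to the same chromosome.

```python
def subtract_intervals(targets, mask):
    """Return the parts of each half-open (start, end) interval in
    `targets` that are not covered by any interval in `mask`.
    Both lists are assumed to be on the same chromosome."""
    mask = sorted(mask)
    remaining = []
    for start, end in sorted(targets):
        cursor = start  # first base of this target not yet accounted for
        for m_start, m_end in mask:
            if m_end <= cursor:
                continue  # mask interval lies entirely before the cursor
            if m_start >= end:
                break  # this and all later mask intervals start past the target
            if m_start > cursor:
                remaining.append((cursor, m_start))  # uncovered gap
            cursor = max(cursor, m_end)
            if cursor >= end:
                break
        if cursor < end:
            remaining.append((cursor, end))
    return remaining


# Intervals from the post: the first file is the mask, the second file
# holds the targets.
first_file = [(267, 572)]
second_file = [(147, 267), (267, 387), (387, 507), (507, 627)]
uncovered = subtract_intervals(second_file, first_file)
for start, end in uncovered:
    print("chr22", start, end)
```

With the intervals from the post this yields chr22 147 267 and chr22 572 627, the desired answer. Adjacent leftover pieces are reported separately rather than merged, which may or may not match what a bit-set based tool would print.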
