This bug irritated me, so I fixed it. Essentially, add_file() in
upload.py is not in on the joke that local dirs come in as relative paths and need the
absolute root path tacked onto them.
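To illustrate (made-up paths here; the real change is in the diff below):

import os

# A relative server_dir path only resolves if the current working
# directory happens to be the Galaxy root.
root_dir = '/home/ted/galaxy-central'        # assumed Galaxy root
path = 'database/sample_import/reads.fastq'  # relative server_dir path

assert not os.path.isabs( path )
# os.path.exists( path ) is False from any other CWD, so add_file()
# bails out.  Joining the root onto the front fixes it:
fixed_path = os.path.join( root_dir, path )
print( fixed_path )  # /home/ted/galaxy-central/database/sample_import/reads.fastq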
Is there a written process on how to submit the fix? I could not find it.
Hi Ted,
Is this the case when library_import_dir in the config file is relative? I've always
used an absolute path there. I suppose it couldn't hurt to make it absolute
programmatically.
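Something like this at config-parsing time, maybe (just a sketch; the helper
name and the galaxy_root argument are made up, not actual Galaxy code):

import os

def resolve_library_import_dir( library_import_dir, galaxy_root ):
    # Hypothetical helper: anchor a relative library_import_dir at the
    # Galaxy root so downstream code always sees an absolute path.
    if not os.path.isabs( library_import_dir ):
        library_import_dir = os.path.join( galaxy_root, library_import_dir )
    return os.path.normpath( library_import_dir )

print( resolve_library_import_dir( 'database/import', '/home/nate/galaxy-central' ) )
# -> /home/nate/galaxy-central/database/import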
--nate
Thanks,
Ted
diff -r 21b645303c02 tools/data_source/upload.py
--- a/tools/data_source/upload.py Thu Dec 22 13:54:33 2011 -0500
+++ b/tools/data_source/upload.py Sat Dec 31 15:29:45 2011 -0800
@@ -74,7 +74,8 @@
id, files_path, path = arg.split( ':', 2 )
rval[int( id )] = ( path, files_path )
return rval
-def add_file( dataset, registry, json_file, output_path ):
+
+def add_file( dataset, registry, json_file, output_path, root_dir ):
data_type = None
line_count = None
converted_path = None
@@ -94,7 +95,10 @@
file_err( 'Unable to fetch %s\n%s' % ( dataset.path, str( e ) ),
dataset, json_file )
return
dataset.path = temp_name
- # See if we have an empty file
+
+ if dataset.type == 'server_dir' and not os.path.isabs( dataset.path ):
+     dataset.path = os.path.join( root_dir, dataset.path )
+
if not os.path.exists( dataset.path ):
file_err( 'Uploaded temporary file (%s) does not exist.' % dataset.path,
dataset, json_file )
return
@@ -384,7 +388,7 @@
files_path = output_paths[int( dataset.dataset_id )][1]
add_composite_file( dataset, registry, json_file, output_path, files_path )
else:
- add_file( dataset, registry, json_file, output_path )
+ add_file( dataset, registry, json_file, output_path, sys.argv[1] )
# clean up paramfile
try:
os.remove( sys.argv[3] )
On Dec 29, 2011, at 1:22 AM, Ted Goldstein wrote:
> Hi there,
> Here are three interrelated issues.
>
> I am trying to use Galaxy with some large cancer genomic datasets here at UCSC and do
> some systems biology. I have petabyte-size data libraries which will
> constantly be in flux at the edges. I would prefer to just have Galaxy read the
> metadata from the file system for large datasets without using the database. Is there a
> convenient API boundary for writing an adapter to the dataset object interface?
>
> In the meantime, I am going to try to just import data using the link. It's great that
> this feature is already in. When I import a couple of modest megabyte-size datasets
> using the "Link to files without copying to Galaxy" option, the status never
> changes from "queued". Is this a bug? Is there a known workaround? I have many
> large datasets.
>
> Also, it takes a long time to expand the dataset name link. (My import experiment
> is a data tree of about a thousand files.) Is this a known bug?
>
> Thanks!
> Ted
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client. To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/