On Apr 4, 2012, at 11:03 AM, Langhorst, Brad wrote:
Maybe a more specific example, including all the files in the directory, the input data, and the specific tool that will do the analysis, would make this clearer.
One particular case is running pplacer (https://github.com/matsen/pplacer) on an input alignment and a reference package. The invocation, taken from our public example (https://github.com/fhcrc/microbiome-demo), resembles:
pplacer -c vaginal_16s.refpkg src/p4z1r36.fasta
The reference package is laid out like so:
$ ls -l vaginal_16s.refpkg
total 7936
-rwxr-xr-x 1 habnabit staff     987 Feb  8 16:46 CONTENTS.json
-rwxr-xr-x 1 habnabit staff    3911 Feb  8 16:46 RAxML_info.bv_refs_aln
-rwxr-xr-x 1 habnabit staff   37984 Feb  8 16:46 RAxML_result.bv_refs_aln
-rwxr-xr-x 1 habnabit staff  514875 Feb  8 16:46 bacteria16S_508_mod5.cm
-rwxr-xr-x 1 habnabit staff   79661 Feb  8 16:46 bv_refdata.csv
-rwxr-xr-x 1 habnabit staff 1912382 Feb  8 16:46 bv_refs.sto
-rwxr-xr-x 1 habnabit staff 1450284 Feb  8 16:46 bv_refs_aln.fasta
-rwxr-xr-x 1 habnabit staff     397 Feb 13 15:33 phylo_model.json
-rwxr-xr-x 1 habnabit staff   41259 Feb  8 16:46 tax_table.csv
Tools, including pplacer, read the CONTENTS.json file, a manifest that describes the other files in the directory. Most tools offer no way to specify those files individually; the only option is to pass the entire reference package directory. We had never had any issues with passing directories around before.
~Aaron
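P.S. In case it helps to see why the directory has to travel as a unit, here is a rough sketch of how a tool might resolve files through the manifest. This is not pplacer's actual code. The function name resolve_refpkg_files is hypothetical, and I'm assuming the manifest has a top-level "files" object mapping logical names to file names inside the directory.

```python
import json
from pathlib import Path


def resolve_refpkg_files(refpkg_dir):
    """Map each logical name in the manifest to a path inside the refpkg.

    Assumption: CONTENTS.json has a top-level "files" object of the form
    {logical_name: file_name}. Hypothetical helper, not part of pplacer.
    """
    refpkg = Path(refpkg_dir)
    # The manifest lives inside the directory itself, so the tool needs
    # the directory path, not the path of any single data file.
    with open(refpkg / "CONTENTS.json") as handle:
        manifest = json.load(handle)
    files = manifest.get("files", {})
    # File names in the manifest are relative to the refpkg directory.
    return {name: refpkg / file_name for name, file_name in files.items()}
```

Calling resolve_refpkg_files("vaginal_16s.refpkg") would then give a tool everything it needs from the one directory argument, which is why the individual files are never named on the command line.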