Wednesday, August 5, 2009

some other changes

I've made a few other changes in the last couple of days:

Modifications to the FGDC Bio Standard (fgdcbio.txt): A number of the terms had short names that didn't match between the hierarchy definition and the standard definition. Because each name appeared multiple times in the hierarchy and only once in the standard, I changed them in the standard. I've tried to make minimal changes to the standard (e.g., by coding around typos) because my copy probably isn't going to end up in wide distribution. But to match terms between the two files, each term must have the same short name in both. Here is a list of changes:
  • placek to placekt
  • taxonpr to taxonpro
  • orien to orienta
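A quick way to spot this kind of mismatch is to compare the two sets of short names. The following is only a sketch: mismatched_short_names() is a hypothetical helper, not a function in my code, and the name lists are typed in by hand rather than read from the hierarchy file and fgdcbio.txt.

```python
def mismatched_short_names(hierarchy_names, standard_names):
    """Return (names only in the hierarchy, names only in the standard)."""
    hierarchy = set(hierarchy_names)  # duplicates in the hierarchy collapse here
    standard = set(standard_names)
    return sorted(hierarchy - standard), sorted(standard - hierarchy)


# Hand-typed sample data: the three names from the list above, as they
# stood before the change, plus one name that already matched.
only_in_hierarchy, only_in_standard = mismatched_short_names(
    ["placekt", "taxonpro", "orienta", "origin", "origin"],
    ["placek", "taxonpr", "orien", "origin"],
)
print(only_in_hierarchy)
print(only_in_standard)
```

Both lists come back empty once every term has the same short name in both files.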
In readTerms(), removed the option to allow term names as keys in the dictionary, because this field is unique in neither FGDC nor EML. Currently, the only allowed keys are xpaths.

Deprecated paths(), dictReader() and dictWriter(). paths() constructs hierarchy paths from fgdc_hierarchy.xml, but writes them into a dictionary. If a term appears in multiple places in the hierarchy, the duplicates are overwritten, which is a major flaw in the function. dictReader() and dictWriter() exist only as helpers to paths().

Added getPaths(), a replacement for paths(), which constructs paths from fgdc_hierarchy.xml and writes them into a list. Have not added support functions analogous to dictReader() and dictWriter() yet.
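To illustrate why a list works where a dictionary keyed by term name does not, here is a minimal sketch. It is not the actual getPaths() code: get_paths() is a stand-in, and the sample XML assumes a hierarchy file in which each term is an element nested inside its parent, which may not match the real layout of fgdc_hierarchy.xml.

```python
import xml.etree.ElementTree as ET


def get_paths(root):
    """Return every root-to-element path, in document order, as a list."""
    paths = []

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.append(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


# Assumed sample hierarchy: "citation" and "origin" each appear twice.
SAMPLE = """
<metadata>
  <idinfo><citation><origin/></citation></idinfo>
  <dataqual><lineage><citation><origin/></citation></lineage></dataqual>
</metadata>
"""

paths = get_paths(ET.fromstring(SAMPLE.strip()))

# Keying by term name keeps only the last path seen for each name,
# which is the overwriting problem described above.
by_name = {p.rsplit("/", 1)[1]: p for p in paths}

print(len(paths), len(by_name))
```

The list keeps both paths ending in origin, while the dictionary keeps only the one under dataqual.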

Added the following functions to produce output compatible with Namrata's program:
  • pathDesc(): Given a path and a dictionary (where path is a key in the dictionary), adds the values of every term in the path into the value of the path's final term.
  • expandDesc(): Applies pathDesc() to an entire dictionary. This is preferable to applying pathDesc() multiple times because the results are order-dependent.
  • dictToXML(): Writes the contents of a dictionary (with paths as keys) into an XML file in the same format as extract(). Requires, as input, an XML file in that same format to supply the information missing from the dictionary (which can only hold two types of information at once).
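The order dependence of pathDesc() is easier to see with a small example. The sketch below is a simplified stand-in, not the real code: it assumes paths are "/"-separated strings, that the dictionary maps each path to a description string, and that values are joined with a space. The helper _joined() is hypothetical.

```python
def _joined(path, terms):
    """Join the values of every prefix of `path` that is a key in `terms`."""
    parts = path.strip("/").split("/")
    prefixes = ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return " ".join(terms[p] for p in prefixes if p in terms)


def path_desc(path, terms):
    """Overwrite terms[path] in place with the joined values along the path."""
    terms[path] = _joined(path, terms)


def expand_desc(terms):
    """Build a new dictionary, reading only from the unmodified original."""
    return {path: _joined(path, terms) for path in terms}


original = {"/a": "A", "/a/b": "B", "/a/b/c": "C"}

# Parent first: the child picks up the parent's already-expanded value.
parent_first = dict(original)
path_desc("/a/b", parent_first)
path_desc("/a/b/c", parent_first)

# Child first: the child sees only the original values.
child_first = dict(original)
path_desc("/a/b/c", child_first)
path_desc("/a/b", child_first)

expanded = expand_desc(original)
print(parent_first["/a/b/c"], "|", child_first["/a/b/c"], "|", expanded["/a/b/c"])
```

Because expand_desc() in this sketch never reads a value it has already rewritten, its result does not depend on the order in which the keys are visited.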
Moved filePrompt() to prep_fns.py.
