Got the XML issue cleared up. When I was writing the function, I forgot to replace characters like '<' and '>' with their XML escape sequences. I added a function to fgdc_extractor.py that takes a string and returns it with every character that needs escaping in XML replaced. The extract() function now calls it before writing anything to file.
I sent the output to Namrata, and she has already confirmed that she can open it. (The new XML file is in the SVN trunk as well, by the way.)
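For anyone who hits the same problem, here is a minimal sketch of this kind of escaping helper. It is not the code in fgdc_extractor.py: the name escape_for_xml is hypothetical, and it assumes the standard library's xml.sax.saxutils.escape is acceptable for the job.

```python
from xml.sax.saxutils import escape


def escape_for_xml(text):
    """Return text with XML-special characters replaced by entities.

    Hypothetical stand-in for the helper described above, not the
    actual function from fgdc_extractor.py.
    """
    # escape() always handles '&', '<' and '>'. The extra mapping also
    # covers quotes, which only matter inside attribute values.
    return escape(text, {'"': "&quot;", "'": "&apos;"})


if __name__ == "__main__":
    print(escape_for_xml('depth < 10 & "raw" > 5'))
```

Escaping has to happen once, on the raw text, right before it is written out. Running the helper twice would turn '&lt;' into '&amp;lt;'.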
More Filter Functions
I added six functions to filters.py. They retrieve the following related words from lemmas and/or synsets of the words in a term description:
- hypernym distances
- instance hypernyms
- member holonyms
- part holonyms
- substance holonyms
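As a rough illustration of the lookups listed above, the sketch below collects the same relations from a single synset. The helper names (related_names, related_for_word) are hypothetical and are not the functions in filters.py. The relation names are the Synset methods as I understand them in NLTK 3.x, where name() is a method rather than an attribute.

```python
# Relation methods assumed to exist on NLTK 3.x Synset objects.
RELATIONS = (
    "instance_hypernyms",
    "member_holonyms",
    "part_holonyms",
    "substance_holonyms",
)


def related_names(synset):
    """Map each relation name to the sorted names of the related synsets.

    Hypothetical helper: works on any object that exposes the methods
    listed in RELATIONS plus hypernym_distances() and name().
    """
    related = {
        rel: sorted(s.name() for s in getattr(synset, rel)())
        for rel in RELATIONS
    }
    # hypernym_distances() yields (synset, distance) pairs, and the
    # synset itself is included at distance 0.
    related["hypernym_distances"] = sorted(
        (s.name(), dist) for s, dist in synset.hypernym_distances()
    )
    return related


def related_for_word(word):
    """Apply related_names() to every synset of a word."""
    # Imported here so related_names() can be used without NLTK installed.
    # Needs the WordNet corpus, e.g. via nltk.download("wordnet").
    from nltk.corpus import wordnet as wn

    return {s.name(): related_names(s) for s in wn.synsets(word)}
```

Calling related_for_word("wheel") should return one entry per synset of the word. Most of the holonym lists come back empty for any given synset, which is normal for WordNet.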
It's interesting that the WordNet online browser distinguishes direct hypernyms and hyponyms, full hypernyms and hyponyms, inherited hypernyms, sister terms, and so on, but the NLTK functions don't make those distinctions. Is there another Python library that has a better representation of WordNet than NLTK?
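In the meantime, some of those browser views can probably be approximated with only the hypernyms() and hyponyms() methods that NLTK synsets expose. The sketch below is a guess at what the browser means by each term, not a verified match for its output, and all three function names are hypothetical.

```python
def direct_hypernyms(synset):
    """Immediate parents only."""
    return list(synset.hypernyms())


def inherited_hypernyms(synset):
    """Every ancestor reached by following hypernyms() repeatedly,
    nearest ancestors first, each listed once."""
    seen = set()
    ordered = []
    queue = list(synset.hypernyms())
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.add(current)
        ordered.append(current)
        queue.extend(current.hypernyms())
    return ordered


def sister_terms(synset):
    """Other children of the synset's direct parents."""
    sisters = []
    for parent in synset.hypernyms():
        for child in parent.hyponyms():
            if child != synset and child not in sisters:
                sisters.append(child)
    return sisters
```

With NLTK installed, these could be called on something like wn.synset("dog.n.01"). NLTK's Synset also has a closure() method that walks a relation transitively, which may make the hand-written loop unnecessary.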