I committed a couple of files to the SVN repository:
1. similarity.py
Contains a function to calculate the cosine similarity between two token lists. These lists can
either be tokenized descriptions, or processed versions thereof.
2. filters.py
Contains a number of functions to process and modify tokenized descriptions, including two
stemming functions, a function to throw out stop words, etc.
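
For anyone who has not looked at the files yet, here is a rough sketch of the idea, not the committed code itself. It assumes token lists are plain Python lists of strings and that similarity is computed on raw term counts. The names `cosine_similarity`, `remove_stop_words`, and `STOP_WORDS` are made up for illustration and may not match what is in similarity.py or filters.py.

```python
import math
from collections import Counter

# Hypothetical, tiny stop word list for illustration only.
STOP_WORDS = frozenset({"a", "an", "the", "of", "and", "to", "in"})


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not stop words (case-insensitive match)."""
    return [t for t in tokens if t.lower() not in stop_words]


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists, using raw term counts."""
    counts_a = Counter(tokens_a)
    counts_b = Counter(tokens_b)

    # Dot product over the tokens that occur in both lists.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)

    # Euclidean norms of the two count vectors.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty list has no direction; treat its similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A filter can be applied to both descriptions first and the result passed to the similarity function, e.g. `cosine_similarity(remove_stop_words(tokens_a), remove_stop_words(tokens_b))`.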