The filter algorithm I used was:
- Augment each description with the WordNet similar-tos of the synsets of every word in that description.
- Add every word's synonyms to the description.
- Throw away all words with length < 6.
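The three steps above can be sketched roughly as follows. This is a minimal illustration, not the actual filter code: `filter_description`, `similar_tos`, `synonyms`, and the toy lookup tables are hypothetical names and made-up data standing in for the real WordNet lookups. It also assumes that synonyms are looked up only for the original words, and that the length cutoff is applied last, to the augmented description.

```python
def filter_description(words, similar_tos, synonyms, min_length=6):
    """Augment a description with related words, then drop short words.

    similar_tos and synonyms are caller-supplied functions (hypothetical
    stand-ins for WordNet lookups) mapping a word to a list of words.
    """
    augmented = list(words)
    # Step 1: add the similar-tos of every original word.
    for word in words:
        augmented.extend(similar_tos(word))
    # Step 2: add the synonyms of every original word (assumption:
    # original words only, not the words added in step 1).
    for word in words:
        augmented.extend(synonyms(word))
    # Step 3: throw away all words with length < min_length.
    return [w for w in augmented if len(w) >= min_length]


# Toy lookup tables, invented for illustration only.
TOY_SIMILAR = {"maximum": ["greatest", "top"]}
TOY_SYNONYMS = {"elevation": ["altitude", "height"]}

result = filter_description(
    ["maximum", "elevation", "in", "meters"],
    lambda w: TOY_SIMILAR.get(w, []),
    lambda w: TOY_SYNONYMS.get(w, []),
)
print(result)
```

Passing the lookups in as functions keeps the sketch independent of any particular WordNet interface.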
Calling correspRank() with the two rankings as input gives this output:
Top Rank: 1
Bottom Rank: 1
Range: 0
Median Rank: 1.0
Top Score: 0.369239797651
Bottom Score: 0.360379846875
Range: 0.00885995077593
Median Score: 0.364809822263
Note that mean and variance are not reported; I haven't written them in yet because my sample sizes so far are too small.
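For reference, summary figures of this kind (top, bottom, range, median) can be computed in a few lines. This is only a sketch, since the internals of correspRank() are not shown here: `summarize` and its `higher_is_better` flag are hypothetical, and it assumes the inputs are one rank and one score per ranking, with rank 1 being the best rank.

```python
from statistics import median


def summarize(values, higher_is_better=True):
    """Return top, bottom, range and median of a list of numbers.

    For scores the top value is the largest; for ranks (where 1 is
    best) pass higher_is_better=False so the top value is the smallest.
    """
    best = max(values) if higher_is_better else min(values)
    worst = min(values) if higher_is_better else max(values)
    return {
        "top": best,
        "bottom": worst,
        "range": abs(best - worst),
        "median": median(values),
    }


# The two ranks and two scores reported in the output above.
rank_summary = summarize([1, 1], higher_is_better=False)
score_summary = summarize([0.369239797651, 0.360379846875])
print(rank_summary)
print(score_summary)
```

With only two values the median is simply their midpoint, which is why the median score sits halfway between the top and bottom scores.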
Let's look at the top 5 hits in both rankings:
MaximumElevationInMeters
- Altitude Maximum
- Altitude System Definition
- Altitude System Definition
- False Easting
- False Easting
MinimumElevationInMeters
- Altitude Minimum
- Altitude System Definition
- Altitude System Definition
- Altitude Maximum
- False Easting
One thing that's not so good about this result is that the filter/score/rank/assess process took about 141 seconds of processor time. Ouch!