Remove flanker and replace PyML with scikit-learn

I was never able to install PyML successfully, and its SourceForge
distribution and lack of Python 3 support convinced me that scikit-learn would
be a fine substitute. Flanker was also difficult for me to install and seemed
to be used only in the tests, so I removed it as well to get to a point
where I could run the tests. As of this commit, only one test is failing
(test_standard_replies with android.eml), and I have not dug into it because
I'm not familiar with the `email` library yet.
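
For reference, a minimal sketch of what the classifier call looks like on the scikit-learn side. The toy training data, the choice of LinearSVC, and the helper name is_signature_line_sketch are assumptions made for illustration only; they are not the model or features talon actually uses. The sketch reshapes the feature vector to 2-D because scikit-learn estimators expect input of shape (n_samples, n_features).

```python
import numpy
from sklearn.svm import LinearSVC

# Hypothetical toy data: the first feature marks a "signature-like" line.
X = numpy.array([[1, 0], [1, 1], [0, 0], [0, 1]])
y = numpy.array([1, 1, 0, 0])

# Assumption: a linear SVM stands in for the original PyML classifier.
classifier = LinearSVC(random_state=0).fit(X, y)


def is_signature_line_sketch(pattern, classifier):
    '''Return True if the classifier labels the feature vector as a signature.'''
    # One sample, n features: reshape the flat pattern into a (1, n) array.
    data = numpy.asarray(pattern).reshape(1, -1)
    # predict() returns class labels (0 or 1 here); take the single result.
    return bool(classifier.predict(data)[0] > 0)


print(is_signature_line_sketch([1, 0], classifier))
print(is_signature_line_sketch([0, 1], classifier))
```

If a signed score is needed rather than a class label, classifier.decision_function(data) is probably the closer counterpart to the old decisionFunc call, though I have not checked PyML's exact semantics.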
Alex Riina
2015-03-08 00:06:01 -05:00
committed by Alex Riina
parent b36287e573
commit f16760c466
12 changed files with 44 additions and 133 deletions


@@ -3,7 +3,7 @@
 import logging
 import regex as re
-from PyML import SparseDataSet
+import numpy
 from talon.signature.learning.featurespace import features, build_pattern
 from talon.utils import get_delimiter
@@ -32,8 +32,8 @@ RE_REVERSE_SIGNATURE = re.compile(r'''
 def is_signature_line(line, sender, classifier):
     '''Checks if the line belongs to signature. Returns True or False.'''
-    data = SparseDataSet([build_pattern(line, features(sender))])
-    return classifier.decisionFunc(data, 0) > 0
+    data = numpy.array(build_pattern(line, features(sender)))
+    return classifier.predict(data) > 0
 def extract(body, sender):