9 Commits

Author SHA1 Message Date
Sergey Obukhov
5bcf7403ad Merge pull request #94 from mailgun/obukhov-sergey-patch-1
Update README.rst
2016-05-31 20:16:13 -07:00
Sergey Obukhov
2d6c092b65 bump version 2016-05-31 18:42:47 -07:00
Sergey Obukhov
6d0689cad6 Update README.rst 2016-05-31 18:39:07 -07:00
Sergey Obukhov
3f80e93ee0 Merge pull request #93 from mailgun/sergey/version-bump
bump
2016-05-31 18:15:28 -07:00
Sergey Obukhov
1b18abab1d bump 2016-05-31 16:53:41 -07:00
Sergey Obukhov
03dd5af5ab Merge pull request #91 from KevinCathcart/patch-1
Support outlook 2007/2010 running in en-us locale
2016-05-31 16:50:35 -07:00
Sergey Obukhov
dfba82b07c Merge pull request #92 from mailgun/obukhov-sergey-kuntzcamera
Update README.rst
2016-05-31 15:42:34 -07:00
Sergey Obukhov
08ca02c87f Update README.rst 2016-05-31 15:14:32 -07:00
Kevin Cathcart
b61f4ec095 Support outlook 2007/2010 running in en-us locale
My American English copy of outlook 2007 is using inches in the reply separator rather than centimeters. The separator is otherwise Identical. What a strange thing to localize. I'm guessing it uses whatever it thinks the preferred units for page margins are.
2016-05-23 17:23:53 -04:00
3 changed files with 6 additions and 3 deletions

View File

@@ -95,7 +95,7 @@ classifiers. The core of machine learning algorithm lays in
apply to a message (``featurespace.py``), how data sets are built
(``dataset.py``), classifiers interface (``classifier.py``).
The data used for training is taken from our personal email
Currently the data used for training is taken from our personal email
conversations and from `ENRON`_ dataset. As a result of applying our set
of features to the dataset we provide files ``classifier`` and
``train.data`` that dont have any personal information but could be

View File

@@ -2,7 +2,7 @@ from setuptools import setup, find_packages
setup(name='talon',
version='1.2.7',
version='1.2.9',
description=("Mailgun library "
"to extract message quotations and signatures."),
long_description=open("README.rst").read(),

View File

@@ -86,9 +86,12 @@ def cut_gmail_quote(html_message):
def cut_microsoft_quote(html_message):
''' Cuts splitter block and all following blocks. '''
splitter = html_message.xpath(
#outlook 2007, 2010
#outlook 2007, 2010 (international)
"//div[@style='border:none;border-top:solid #B5C4DF 1.0pt;"
"padding:3.0pt 0cm 0cm 0cm']|"
#outlook 2007, 2010 (american)
"//div[@style='border:none;border-top:solid #B5C4DF 1.0pt;"
"padding:3.0pt 0in 0in 0in']|"
#windows mail
"//div[@style='padding-top: 5px; "
"border-top-color: rgb(229, 229, 229); "