Commit Graph

228 Commits

Author SHA1 Message Date
Sergey Obukhov
4b953bcddc fixes mailgun/talon#103 keep newlines when parsing html quotations 2016-08-11 20:17:37 -07:00
Sergey Obukhov
315eaa7080 if html stripped off quotations does not have readable text fallback to unparsed html 2016-08-11 19:55:23 -07:00
Sergey Obukhov
5a9bc967f1 Merge pull request #100 from mailgun/sergey/restrict
do not parse html quotations if html is longer than certain threshold
v1.2.14
2016-08-11 16:08:03 -07:00
Sergey Obukhov
a0d7236d0b bump version and add a comment 2016-08-11 15:49:09 -07:00
Sergey Obukhov
21e9a31ffe add test 2016-08-09 17:15:49 -07:00
Sergey Obukhov
4ee46c0a97 do not parse html quotations if html is longer than certain threshold 2016-08-09 17:08:58 -07:00
Sergey Obukhov
10d9a930f9 Merge pull request #99 from mailgun/sergey/capitalized
consider word capitalized only if it is camel case - not all upper case
v1.2.12
2016-07-20 16:47:12 -07:00
Sergey Obukhov
a21ccdb21b consider word capitalized only if it is camel case - not all upper case 2016-07-19 17:37:36 -07:00
Sergey Obukhov
7cdd7a8f35 Merge pull request #98 from mailgun/sergey/1.2.11
version bump
v1.2.11
2016-07-19 16:22:24 -07:00
Sergey Obukhov
01e03a47e0 version bump 2016-07-19 15:51:46 -07:00
Sergey Obukhov
1b9a71551a Merge pull request #97 from umairwaheed/strip-talon
Strip down Talon
2016-07-19 15:46:56 -07:00
Umair Khan
911efd1db4 Move encoding detection inside if condition. 2016-07-19 09:44:40 +05:00
Umair Khan
e61f0a68c4 Add six library to setup.py 2016-07-19 09:40:03 +05:00
Umair Khan
cefbcffd59 Make tests/text_quotations_test.py compatible with Python 3. 2016-07-13 14:45:26 +05:00
Umair Khan
622a98d6d5 Make utils compatible with Python 3. 2016-07-13 13:00:24 +05:00
Umair Khan
7901f5d1dc Convert msg_body into unicode in preprocess. 2016-07-13 11:18:10 +05:00
Umair Khan
555c34d7a8 Make sure html_to_text processes bytes 2016-07-13 11:18:10 +05:00
Umair Khan
dcc0d1de20 Convert msg_body to bytes in extract_from_html 2016-07-13 11:18:06 +05:00
Umair Khan
7bdf4d622b Only encode if str 2016-07-13 08:01:47 +05:00
Umair Khan
4a7207b0d0 Only convert to unicode if str 2016-07-13 08:01:47 +05:00
Umair Khan
ad9c2ca0e8 Upgrade quotations.py 2016-07-13 08:01:44 +05:00
Umair Khan
da998ddb60 Run modernizer on the code. 2016-07-12 17:25:46 +05:00
Umair Khan
07f68815df Allow installation of ML free version.
Add an option to the install script, `--no-ml`, that when given will
install Talon without ML support.

Fixes #96
2016-07-12 15:08:53 +05:00
Sergey Obukhov
35645f9ade Merge pull request #95 from mailgun/sergey/forge
open-sourcing email dataset
v1.2.10
2016-06-10 15:45:29 -07:00
Sergey Obukhov
7c3d91301c open-sourcing email dataset 2016-06-10 14:10:53 -07:00
Sergey Obukhov
5bcf7403ad Merge pull request #94 from mailgun/obukhov-sergey-patch-1
Update README.rst
v1.2.9
2016-05-31 20:16:13 -07:00
Sergey Obukhov
2d6c092b65 bump version 2016-05-31 18:42:47 -07:00
Sergey Obukhov
6d0689cad6 Update README.rst 2016-05-31 18:39:07 -07:00
Sergey Obukhov
3f80e93ee0 Merge pull request #93 from mailgun/sergey/version-bump
bump
v1.2.8
2016-05-31 18:15:28 -07:00
Sergey Obukhov
1b18abab1d bump 2016-05-31 16:53:41 -07:00
Sergey Obukhov
03dd5af5ab Merge pull request #91 from KevinCathcart/patch-1
Support outlook 2007/2010 running in en-us locale
2016-05-31 16:50:35 -07:00
Sergey Obukhov
dfba82b07c Merge pull request #92 from mailgun/obukhov-sergey-kuntzcamera
Update README.rst
2016-05-31 15:42:34 -07:00
Sergey Obukhov
08ca02c87f Update README.rst 2016-05-31 15:14:32 -07:00
Kevin Cathcart
b61f4ec095 Support outlook 2007/2010 running in en-us locale
My American English copy of Outlook 2007 is using inches in the reply separator rather than centimeters. The separator is otherwise identical. What a strange thing to localize. I'm guessing it uses whatever it thinks the preferred units for page margins are.
2016-05-23 17:23:53 -04:00
Sergey Obukhov
9dbe6a494b Merge pull request #90 from mailgun/sergey/89
fixes mailgun/talon#89
v1.2.7
2016-05-17 16:01:56 -07:00
Sergey Obukhov
44e70939d6 fixes mailgun/talon#89 2016-05-17 15:31:01 -07:00
Sergey Obukhov
ab6066eafa Merge pull request #87 from mailgun/sergey/1.2.6
bump up version
v1.2.6
2016-04-07 17:54:12 -07:00
Sergey Obukhov
42258cdd36 bump up version 2016-04-07 17:51:48 -07:00
Sergey Obukhov
d3de9e6893 Merge pull request #86 from dougkeen/master
Fix #85 (exception when stripping gmail quotes)
2016-04-07 17:47:38 -07:00
Doug Keen
333beb94af Fix #85 (exception when stripping gmail quotes) 2016-04-04 14:22:50 -07:00
Sergey Obukhov
f3c0942c49 Merge pull request #80 from mailgun/sergey/12
fixes mailgun/talon#12
v1.2.5
2016-03-04 13:33:46 -08:00
Sergey Obukhov
02adf53ab9 fixes mailgun/talon#12 2016-03-04 13:14:50 -08:00
Sergey Obukhov
3497b5cab4 Merge pull request #79 from mailgun/sergey/version
bump version
v1.2.4
2016-02-29 15:13:51 -08:00
Sergey Obukhov
9c17dca17c bump version 2016-02-29 14:50:52 -08:00
Sergey Obukhov
de342d3177 Merge pull request #78 from defkev/master
Added Zimbra HTML quotation extraction
2016-02-29 14:14:09 -08:00
defkev
743b452daf Added Zimbra HTML quotation extraction 2016-02-21 16:56:52 +01:00
Sergey Obukhov
c762f3c337 Merge pull request #77 from mailgun/sergey/fix-gmail-fwd
fixes mailgun/talon#18
v1.2.3
2016-02-19 19:08:37 -08:00
Sergey Obukhov
31803d41bc fixes mailgun/talon#18 2016-02-19 19:07:10 -08:00
Sergey Obukhov
2ecd9779fc bump up version v1.2.2 2016-02-19 18:32:07 -08:00
Sergey Obukhov
5a7047233e Merge pull request #76 from mailgun/sergey/fix-date-splitter
fixes mailgun/talon#19
2016-02-19 18:28:23 -08:00