Derrick J. Wippler
3083f86c75
Continue with quotation cut even if html cut throws an exception
2020-02-10 11:40:00 -06:00
Derrick J. Wippler
c575beb27d
Test import clean up and pep8
2020-01-30 11:50:41 -06:00
Derrick J. Wippler
1018e88ec1
Now removing namespaces from parsed HTML
2019-05-10 11:16:12 -05:00
Yacine Filali
15e61768f2
Encoding fixes
2017-05-23 16:17:39 -07:00
Yacine Filali
dd0a0f5c4d
Python 2.7 backward compat
2017-05-23 16:10:13 -07:00
Yacine Filali
086f5ba43b
Updated talon for Python 3
2017-05-23 15:39:50 -07:00
Sergey Obukhov
534457e713
protect html_to_text as well
2016-09-14 09:58:41 -07:00
Sergey Obukhov
ea82a9730e
restrict html processing to a certain number of tags
2016-09-14 09:33:30 -07:00
Sergey Obukhov
37c95ff97b
fallback untouched html if we can not parse html tree
2016-08-19 11:38:12 -07:00
Sergey Obukhov
bcf97eccfa
use html5lib to parse html
2016-08-15 19:36:21 -07:00
Sergey Obukhov
69a44b10a1
Merge branch 'master' into sergey/empty-html
2016-08-11 23:58:11 -07:00
Sergey Obukhov
4b953bcddc
fixes mailgun/talon#103 keep newlines when parsing html quotations
2016-08-11 20:17:37 -07:00
Sergey Obukhov
315eaa7080
if html stripped off quotations does not have readable text fallback to unparsed html
2016-08-11 19:55:23 -07:00
Sergey Obukhov
21e9a31ffe
add test
2016-08-09 17:15:49 -07:00
Umair Khan
555c34d7a8
Make sure html_to_text processes bytes
2016-07-13 11:18:10 +05:00
Umair Khan
da998ddb60
Run modernizer on the code.
2016-07-12 17:25:46 +05:00
Sergey Obukhov
44e70939d6
fixes mailgun/talon#89
2016-05-17 15:31:01 -07:00
Doug Keen
333beb94af
Fix #85 (exception when stripping gmail quotes)
2016-04-04 14:22:50 -07:00
Sergey Obukhov
31803d41bc
fixes mailgun/talon#18
2016-02-19 19:07:10 -08:00
Sergey Obukhov
ce65ff8fc8
Merge pull request #71 from clara-labs/ms-2010-issue
First pass at handling issue with ms outlook 2010 with unenclosed quo…
2015-12-18 19:14:13 -08:00
Sergey Obukhov
3d9ae356ea
add more tests, make standard reply tests more relaxed
2015-12-18 18:56:41 -08:00
Carlos Correa
f688d074b5
First pass at handling issue with ms outlook 2010 with unenclosed quoted text.
2015-12-10 19:16:13 -08:00
Sergey Obukhov
41457d8fbd
fixes mailgun/talon#38 mailgun/talon#20
2015-12-05 00:37:02 -08:00
Sergey Obukhov
ae508fe0e5
fixes mailgun/talon#26
2015-09-21 09:51:26 -07:00
Sergey Obukhov
ad09b18f3f
fixes mailgun/talon#52
2015-09-18 04:47:23 -07:00
Sergey Obukhov
cc98befba5
Merge pull request #50 from Easy-D/preserve-regular-blockquotes
Preserve regular blockquotes
2015-09-11 04:49:36 -07:00
Easy-D
ed6b861a47
add failing test that shows how regular blockquotes are removed
2015-07-16 21:24:49 +02:00
Alex Riina
f16760c466
Remove flanker and replace PyML with scikit-learn
I never was actually able to successfully install PyML but the source-forge
distribution and lack of python3 support convinced me that scikit-learn would
be a fine substitute. Flanker was also difficult for me to install and seemed
only to be used in the tests, so I removed it as well to get into a position
where I could run the tests. As of this commit, only one is not passing
(test_standard_replies with android.eml) though I'm not familiar with the `email`
library yet.
2015-07-02 21:49:09 -04:00
Sergey Obukhov
170f11038b
initial commit
2014-07-23 21:12:54 -07:00