Commit Graph

62 Commits

Author SHA1 Message Date
Sergey Obukhov
ae508fe0e5 fixes mailgun/talon#26 2015-09-21 09:51:26 -07:00
Sergey Obukhov
2cb9b5399c bump up version 2015-09-18 05:23:29 -07:00
Sergey Obukhov
134c47f515 Merge pull request #59 from mailgun/sergey/43
fixes mailgun/talon#43
2015-09-18 05:20:51 -07:00
Sergey Obukhov
d328c9d128 fixes mailgun/talon#43 2015-09-18 05:19:59 -07:00
Sergey Obukhov
77b62b0fef Merge pull request #58 from mailgun/sergey/52
fixes mailgun/talon#52
2015-09-18 04:48:50 -07:00
Sergey Obukhov
ad09b18f3f fixes mailgun/talon#52 2015-09-18 04:47:23 -07:00
Sergey Obukhov
b5af9c03a5 bump up version v1.0.7 2015-09-11 10:42:26 -07:00
Sergey Obukhov
176c7e7532 Merge pull request #57 from mailgun/sergey/to_unicode
use precise encoding when converting to unicode
2015-09-11 10:40:52 -07:00
Sergey Obukhov
15976888a0 use precise encoding when converting to unicode 2015-09-11 10:38:28 -07:00
Sergey Obukhov
9bee502903 bump up version 2015-09-11 06:27:12 -07:00
Sergey Obukhov
e3cb8dc3e6 Merge pull request #56 from mailgun/sergey/1000+German+NL
process first 1000 lines for long messages, support for German and Dutch
2015-09-11 06:20:34 -07:00
Sergey Obukhov
385285e5de process first 1000 lines for long messages, support for German and Dutch 2015-09-11 06:17:14 -07:00
Sergey Obukhov
127771dac9 bump up version 2015-09-11 04:51:39 -07:00
Sergey Obukhov
cc98befba5 Merge pull request #50 from Easy-D/preserve-regular-blockquotes
Preserve regular blockquotes
2015-09-11 04:49:36 -07:00
Sergey Obukhov
567549cba4 bump up talon version 2015-09-10 10:47:16 -07:00
Sergey Obukhov
76c4f49be8 Merge pull request #55 from mailgun/sergey/lxml
unpin lxml version
2015-09-10 10:44:59 -07:00
Sergey Obukhov
d9d89dc250 unpin lxml version 2015-09-10 10:44:05 -07:00
Sergey Obukhov
9358db6cee bump up talon version 2015-09-03 11:03:01 -07:00
Sergey Obukhov
08c9d7db03 Merge pull request #45 from AlexRiina/master
Replace PyML with sklearn and clean up dependencies
2015-09-03 10:56:18 -07:00
Easy-D
390b0a6dc9 preserve regular blockquotes 2015-07-16 21:31:41 +02:00
Easy-D
ed6b861a47 add failing test that shows how regular blockquotes are removed 2015-07-16 21:24:49 +02:00
Alex Riina
85c7ee980c add script to regenerate ml model 2015-07-02 21:49:09 -04:00
Oliver Song
7ea773e6a9 Fix iphone test 2015-07-02 21:49:09 -04:00
Scott MacVicar
e3c4ff38fe move test stuff out to its own section 2015-07-02 21:49:09 -04:00
Scott MacVicar
8b1f87b1c0 Get this building and passing tests
Changes:
* add .DS_Store to .gitignore
* Decode base64 encoded emails for tests
* Pick a version of scikit since the pickled classifiers are based on that
* Add missing numpy and scipy dependencies
2015-07-02 21:49:09 -04:00
Alex Riina
c5e4cd9ab4 don't be too restrictive on the test library version 2015-07-02 21:49:09 -04:00
Alex Riina
215e36e9ed allow higher version of regex library 2015-07-02 21:49:09 -04:00
Alex Riina
e3ef622031 remove unused regex 2015-07-02 21:49:09 -04:00
Alex Riina
f16760c466 Remove flanker and replace PyML with scikit-learn
I never was actually able to successfully install PyML but the source-forge
distribution and lack of python3 support convinced me that scikit-learn would
be a fine substitute. Flanker was also difficult for me to install and seemed
only to be used in the tests, so I removed it as well to get into a position
where I could run the tests. As of this commit, only one is not passing
(test_standard_replies with android.eml) though I'm not familiar with the `email`
library yet.
2015-07-02 21:49:09 -04:00
Alex Riina
b36287e573 clean up style and extra imports 2015-07-02 21:49:09 -04:00
Alex Riina
4df7aa284b remove extra imports 2015-07-02 21:49:09 -04:00
Jeremy Schlatter
3a37d8b649 Merge pull request #41 from simonflore/master
New splitter pattern for Dutch mail replies
v1.0.4
2015-04-22 12:17:39 -07:00
Simon
f9f428f4c3 Revert "Change of behavior when msg_body has more than 1000 lines"
This reverts commit 84a83e865e.
2015-04-16 13:26:17 +02:00
Simon
84a83e865e Change of behavior when msg_body has more than 1000 lines 2015-04-16 13:22:18 +02:00
Simon
b4c180b9ff Extra spaces check in RE_ON_DATE_WROTE_SMB regex 2015-04-15 13:55:59 +02:00
Simon
072a440837 Test cases for new patterns 2015-04-15 13:55:17 +02:00
Simon
105d16644d For patterns like this '---- On {date} {name} {mail} wrote ---- ' 2015-04-14 18:52:45 +02:00
Simon
df3338192a Another submission to a dutch variation 2015-04-14 18:49:26 +02:00
Simon
f0ed5d6c07 New splitter pattern for Dutch mail replies 2015-04-14 18:22:48 +02:00
Sergey Obukhov
790463821f Merge pull request #31 from tsheasha/patch-1
Utilising the Constants
2015-03-02 14:48:41 -08:00
Sergey Obukhov
763d3b308e Merge pull request #35 from futuresimple/more_formats
Support some polish and french formats
2015-03-02 14:25:26 -08:00
szymonsobczak
3c9ef4653f some more french formats 2015-02-24 12:18:54 +01:00
szymonsobczak
b16060261a support some polish and french formats 2015-02-24 11:39:12 +01:00
Tarek Sheasha
13dc43e960 Utilising the Constants
Checking for the length of a line to determine if it is possibly a signature or not could be done
in a more generic way by determining the maximum size of the line via a constant. Hence advocating
the spirit of modifying the code in only one place and propagating that change everywhere.

This exact approach has already been used at:
2015-01-21 15:54:57 +01:00
Jeremy Schlatter
3768d7ba31 make a separate test function for each language 2014-12-30 14:41:20 -08:00
Jeremy Schlatter
613d1fc815 Add extra splitter expressions and tests for German and Danish.
Also some refactoring to make it a bit easier to add more languages.
2014-12-23 15:44:04 -08:00
Sergey Obukhov
52505bba8a Update README.rst
Clarified that some signature extraction methods require initializing the lib first.
v1.0.3
2014-09-14 09:03:10 -07:00
Sergey Obukhov
79cd4fcc52 Merge pull request #15 from willemdelbare/master
added extra splitter expressions for Dutch, French, German
2014-09-14 08:38:39 -07:00
Willem Delbare
a4f156b174 added extra splitter expressions for Dutch, French, German 2014-09-13 15:33:08 +02:00
Sergey Obukhov
1789ccf3c8 Merge branch 'master' of github.com:mailgun/talon 2014-07-24 20:37:47 -07:00