Sergey Obukhov
2c416ecc0e
Merge pull request #62 from tgwizard/better-support-for-scandinavian-languages
...
Add better support for Scandinavian languages
2015-10-14 21:48:10 -07:00
Adam Renberg
14e3a0d80b
Add better support for Scandinavian languages
...
This is a port of https://github.com/tictail/claw/pull/6 by @simonflore.
2015-09-21 21:42:01 +02:00
Adam Renberg
fcd9e2716a
Add fix for Apple Mail email format
...
Where they have an initial > on the "date line".
2015-09-21 21:33:57 +02:00
Sergey Obukhov
e4c1c11845
remove print
2015-09-21 09:52:47 -07:00
Sergey Obukhov
ae508fe0e5
fixes mailgun/talon#26
2015-09-21 09:51:26 -07:00
Sergey Obukhov
d328c9d128
fixes mailgun/talon#43
2015-09-18 05:19:59 -07:00
Sergey Obukhov
15976888a0
use precise encoding when converting to unicode
2015-09-11 10:38:28 -07:00
Sergey Obukhov
385285e5de
process first 1000 lines for long messages, support for German and Dutch
2015-09-11 06:17:14 -07:00
Sergey Obukhov
cc98befba5
Merge pull request #50 from Easy-D/preserve-regular-blockquotes
...
Preserve regular blockquotes
2015-09-11 04:49:36 -07:00
Easy-D
390b0a6dc9
preserve regular blockquotes
2015-07-16 21:31:41 +02:00
Scott MacVicar
8b1f87b1c0
Get this building and passing tests
...
Changes:
* add .DS_Store to .gitignore
* Decode base64 encoded emails for tests
* Pick a version of scikit since the pickled classifiers are based on that
* Add missing numpy and scipy dependencies
2015-07-02 21:49:09 -04:00
Alex Riina
215e36e9ed
allow higher version of regex library
2015-07-02 21:49:09 -04:00
Alex Riina
e3ef622031
remove unused regex
2015-07-02 21:49:09 -04:00
Alex Riina
f16760c466
Remove flanker and replace PyML with scikit-learn
...
I was never able to install PyML successfully, and its SourceForge
distribution and lack of Python 3 support convinced me that scikit-learn would
be a fine substitute. Flanker was also difficult for me to install and seemed
to be used only in the tests, so I removed it as well to get into a position
where I could run the tests. As of this commit, only one test is not passing
(test_standard_replies with android.eml), though I'm not familiar with the `email`
library yet.
2015-07-02 21:49:09 -04:00
Alex Riina
b36287e573
clean up style and extra imports
2015-07-02 21:49:09 -04:00
Simon
f9f428f4c3
Revert "Change of behavior when msg_body has more than 1000 lines"
...
This reverts commit 84a83e865e .
2015-04-16 13:26:17 +02:00
Simon
84a83e865e
Change of behavior when msg_body has more than 1000 lines
2015-04-16 13:22:18 +02:00
Simon
b4c180b9ff
Extra spaces check in RE_ON_DATE_WROTE_SMB regex
2015-04-15 13:55:59 +02:00
Simon
105d16644d
For patterns like this '---- On {date} {name} {mail} wrote ---- '
2015-04-14 18:52:45 +02:00
Simon
df3338192a
Another submission to a Dutch variation
2015-04-14 18:49:26 +02:00
Simon
f0ed5d6c07
New splitter pattern for Dutch mail replies
2015-04-14 18:22:48 +02:00
Sergey Obukhov
790463821f
Merge pull request #31 from tsheasha/patch-1
...
Utilising the Constants
2015-03-02 14:48:41 -08:00
szymonsobczak
3c9ef4653f
some more French formats
2015-02-24 12:18:54 +01:00
szymonsobczak
b16060261a
support some Polish and French formats
2015-02-24 11:39:12 +01:00
Tarek Sheasha
13dc43e960
Utilising the Constants
...
Checking the length of a line to determine whether it is possibly a signature could be done
in a more generic way by defining the maximum line size as a constant. This follows
the spirit of modifying the code in only one place and propagating that change everywhere.
This exact approach has already been used at:
2015-01-21 15:54:57 +01:00
Jeremy Schlatter
613d1fc815
Add extra splitter expressions and tests for German and Danish.
...
Also some refactoring to make it a bit easier to add more languages.
2014-12-23 15:44:04 -08:00
Willem Delbare
a4f156b174
added extra splitter expressions for Dutch, French, German
2014-09-13 15:33:08 +02:00
Pascal Borreli
8b78da5977
Fixed typos
2014-07-25 02:40:37 +00:00
Sergey Obukhov
170f11038b
initial commit
2014-07-23 21:12:54 -07:00