Re: Splitting a text file into sentences



Hi,
I have looked at NLTK in Python (and had been hoping a Rubyist would
rewrite it in Ruby). I will go back to NLTK and see if it has a
sentence-splitting algorithm of some sort. And thanks for the tip on
Stanford's Java NLP package. Yes, those abbreviations are pesky, and I
may have to resort to an exceptions list containing the most common ones.
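
For what it's worth, here is a rough sketch of the exceptions-list idea in
plain Python (regex only; this is not how NLTK or the Stanford package do
it). The abbreviation set, the boundary rule, and the name split_sentences
are all assumptions made up for illustration, so treat it as a starting
point rather than a finished splitter.

```python
import re

# Hypothetical exceptions list: abbreviations that end in a period but
# usually do not end a sentence. Extend it for your own corpus.
ABBREVIATIONS = {"mr", "mrs", "ms", "dr", "prof", "st", "vs", "etc", "e.g", "i.e"}

# Candidate boundary: '.', '!' or '?' followed by whitespace and an
# uppercase ASCII letter (an assumed heuristic, not a linguistic rule).
BOUNDARY = re.compile(r"([.!?])\s+(?=[A-Z])")


def split_sentences(text):
    """Split text at candidate boundaries, skipping listed abbreviations."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        end = match.end(1)  # position just past the punctuation mark
        words = text[start:end].split()
        # Last token before the boundary, without brackets/quotes or the period.
        last_word = words[-1].strip("\"'()[]").rstrip(".").lower() if words else ""
        if match.group(1) == "." and last_word in ABBREVIATIONS:
            continue  # e.g. "Dr." is not a sentence end; keep scanning
        sentences.append(text[start:end].strip())
        start = match.end()  # next sentence begins after the whitespace
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    sample = "Dr. Smith went to Washington. He arrived at 5 p.m. on Tuesday. Was it late? No."
    for sentence in split_sentences(sample):
        print(sentence)
```

The known weak spot is an abbreviation that really does end a sentence
("... and so on, etc. The next day ..."), which this sketch will glue to
the following sentence.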
Thanks much,
basi
