CopyPastor

Detecting plagiarism made easy.

Score: 1; Reported for: Exact paragraph match

Possible Plagiarism

Plagiarized on 2019-05-15
by Antbal 0415

Original Post

Original - Posted on 2011-01-01
by Ned Batchelder



            

You did not put your code in the question, but I tried your input on the accepted answer at the link (I am assuming that is the code you used). I found that I had to add a line of code and a set of parentheses to get it to run, but from your question it sounded like the program ran and then failed. When I ran it, it succeeded.
The code listed in the answer:
import nltk.data
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
fp = open("test.txt")
data = fp.read()
print '\n-----\n'.join(tokenizer.tokenize(data))
The code I ran which succeeded:
import nltk.data
nltk.download('punkt')
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
fp = open("test.txt")
data = fp.read()
print('\n-----\n'.join(tokenizer.tokenize(data)))
The program's output:
The Minister must prepare an annual report on the implementation of specific programs.
-----
The report is included in the annual management report of the Ministere de l’Emploi et de la Solidarite sociale.
I would like to mention that for this code, the input must be in a .txt file, and the output goes to the console.
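As an aside, the formatting step can be seen in isolation: the tokenizer returns a plain list of sentences, and the '\n-----\n'.join(...) call just places a separator line between them. A minimal sketch using a hard-coded sentence list in place of the tokenizer's output (so it runs without NLTK or a test.txt file):

```python
# Stand-in for tokenizer.tokenize(data); the real call returns a list
# of sentence strings extracted from the file's contents.
sentences = [
    "The Minister must prepare an annual report on the implementation of specific programs.",
    "The report is included in the annual management report of the Ministere de l'Emploi et de la Solidarite sociale.",
]

# Join the sentences with a '-----' separator line, exactly as the
# answer's print statement does, then write to the console.
output = '\n-----\n'.join(sentences)
print(output)
```

The separator appears between sentences only (not after the last one), which is why the program's output above ends with the second sentence rather than a trailing dashed line.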
If I have missed something or any of my assumptions are wrong, please let me know so I can try to fix it. Adding more information to your question and relying less on links will probably help you get more accurate and relevant answers. For example, there are many ways a program can fail, so an explanation and/or a sample of the actual and expected output can go a long way.
The Natural Language Toolkit ([nltk.org](http://www.nltk.org/)) has what you need. [This group posting](http://mailman.uib.no/public/corpora/2007-October/005426.html) indicates this does it:
import nltk.data
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
fp = open("test.txt")
data = fp.read()
print '\n-----\n'.join(tokenizer.tokenize(data))
(I haven't tried it!)

        