// tagger
MaxentTagger tagger = new MaxentTagger(args[0]);
TokenizerFactory<CoreLabel> ptbTokenizerFactory =
    PTBTokenizer.factory(new CoreLabelTokenFactory(), "untokenizable=noneKeep");
BufferedReader r = new BufferedReader(
    new InputStreamReader(new FileInputStream(args[1]), "utf-8"));
PrintWriter pw = new PrintWriter(new OutputStreamWriter(System.out, "utf-8"));
DocumentPreprocessor documentPreprocessor = new DocumentPreprocessor(r);
documentPreprocessor.setTokenizerFactory(ptbTokenizerFactory);
for (List<HasWord> sentence : documentPreprocessor) {
    List<TaggedWord> tSentence = tagger.tagSentence(sentence);
    pw.println(Sentence.listToString(tSentence, false));
}
It fails with the following exception:

Reading POS tagger model from c:\work\development\workspace\stanfordnlp\sample.txt ...
c:\work\development\workspace\stanfordnlp\sample.txt
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: Error while loading a tagger model (probably missing model file)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:869)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:767)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:298)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:263)
    at phoenix.TokenizerDemo.main(TokenizerDemo.java:42)
Caused by: java.io.StreamCorruptedException: invalid stream header: 416e6f74
    at java.io.ObjectInputStream.readStreamHeader(Unknown Source)
    at java.io.ObjectInputStream.<init>(Unknown Source)
    at edu.stanford.nlp.tagger.maxent.TaggerConfig.readConfig(TaggerConfig.java:748)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:804)
    ... 4 more
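The log should indicate the problem:

Reading POS tagger model from c:\work\development\workspace\stanfordnlp\sample.txt ...

You are incorrectly instantiating the MaxentTagger instance. If you provide a single String argument to the constructor, that String is expected to be the path to a tagger model file. Here the first argument is sample.txt, which appears to be the text you want to tag rather than a model, so the loader fails when it tries to read it.

See the documentation for MaxentTagger for more information.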
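As a side note, the hex value in "invalid stream header: 416e6f74" can be decoded to confirm what the loader was actually reading. The sketch below is only an illustration and is not part of the Stanford API: the class StreamHeaderCheck and both helper methods are hypothetical names. It assumes, as the stack trace suggests, that the model is opened through java.io.ObjectInputStream, which expects the stream to start with the Java serialization magic number 0xACED.

```java
import java.io.ObjectStreamConstants;

// Hypothetical helper for illustration only; not part of Stanford NLP.
class StreamHeaderCheck {

    /** Decodes a hex string such as the one in the exception message into ASCII text. */
    static String decodeHeader(String hex) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 1 < hex.length(); i += 2) {
            // Each pair of hex digits is one byte of the file's first four bytes.
            sb.append((char) Integer.parseInt(hex.substring(i, i + 2), 16));
        }
        return sb.toString();
    }

    /** Returns true if the first two bytes equal the Java serialization magic number. */
    static boolean startsWithSerializationMagic(byte[] firstBytes) {
        if (firstBytes.length < 2) {
            return false;
        }
        int magic = ((firstBytes[0] & 0xFF) << 8) | (firstBytes[1] & 0xFF);
        // STREAM_MAGIC is declared as a short, so mask it before comparing.
        return magic == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF);
    }

    public static void main(String[] args) {
        String header = decodeHeader("416e6f74");
        System.out.println("Header decodes to: " + header);
        System.out.println("Looks like a serialized stream: "
            + startsWithSerializationMagic(header.getBytes()));
    }
}
```

The four bytes decode to readable ASCII letters, which is what you would expect from the start of a plain text file and not from a serialized object stream. Passing the path of an actual tagger model as the first argument, and the text file as the second, should avoid the exception.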