I'm using the Twitter4J package for an information retrieval class and have collected some tweets. For the next part of the assignment, I need to use Lucene to index the tweets. To do this, I thought I would save the tweets as JSON strings to a file and reread them when needed. However, I'm running into an error.
When the file is written, I can see the entire JSON object just fine. The total object is quite large (2500 characters). However, when reading it back from the file, I get an "Unterminated string at xxxx" error. I am using the TwitterObjectFactory methods to both write and read the string. Here is some sample code:
Writing:
public void onStatus(Status status) {
    try {
        String jsonString = TwitterObjectFactory.getRawJSON(status);
        output.write(jsonString + "\n");
        numTweets++;
        if (numTweets > 10) {
            synchronized (lock) {
                lock.notify();
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Reading:
Scanner input = new Scanner(file);
while (input.hasNext()) {
    Status status = TwitterObjectFactory.createStatus(input.nextLine());
    System.out.println(status.getUser().getScreenName());
}
This works some of the time. If I run the program multiple times and collect many tweets, the program crashes after 2-3 tweets have been read from the file, always with the same error. If you'd like to replicate the code, you can follow this example. I've added a synchronized block in order to close the stream after 10 tweets, but it's not necessary to replicate the error.
Can anyone explain what is happening? My guess is that there's something wrong with the way I'm encoding the JSON to the file. I'm using a BufferedWriter wrapping an OutputStreamWriter in order to output in UTF-8 format.
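A writer of that shape can be built roughly as below. This is only a sketch: the class name `TweetWriterSketch`, the helper `openUtf8Writer`, and the temporary file are hypothetical placeholders, not names from the actual program.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class TweetWriterSketch {

    // Hypothetical helper: a BufferedWriter wrapping an OutputStreamWriter
    // that encodes characters as UTF-8.
    static BufferedWriter openUtf8Writer(Path file) throws IOException {
        return new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file.toFile()), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("tweets", ".json");
        // try-with-resources closes (and therefore flushes) the writer.
        try (BufferedWriter output = openUtf8Writer(file)) {
            output.write("{\"text\":\"caf\u00e9\"}" + "\n");
        }
        // Read back with the same charset that was used for writing.
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        System.out.println(lines.get(0));
        Files.deleteIfExists(file);
    }
}
```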
Edit: I do close the stream. Here's the bottom snippet of the code:
twitterStream.addListener(listener);
twitterStream.sample("en");
try {
    synchronized (lock) {
        lock.wait();
    }
} catch (InterruptedException e) {
    e.printStackTrace();
}
twitterStream.clearListeners();
twitterStream.cleanUp();
twitterStream.shutdown();
output.close();
You need to flush the output before you notify the reader. Otherwise, parts of the string stay in the buffer.
public void onStatus(Status status) {
    try {
        String jsonString = TwitterObjectFactory.getRawJSON(status);
        output.write(jsonString + "\n");
        // Push the buffered characters to the file before signalling.
        output.flush();
        numTweets++;
        if (numTweets > 10) {
            synchronized (lock) {
                lock.notify();
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
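To see the buffering effect in isolation, here is a minimal sketch that needs no Twitter4J at all. The class `FlushDemo`, the method `bytesVisible`, and the fake 2500-character "tweet" are made up for illustration, and it assumes Java 11+ for `String.repeat`. It measures how many bytes have reached the file while the writer is still open, once without and once with `flush()`.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FlushDemo {

    // Writes one line and reports how many bytes are in the file
    // while the writer is still open (i.e. before close() runs).
    static long bytesVisible(String line, boolean flush) throws IOException {
        Path file = Files.createTempFile("tweets", ".json");
        try (BufferedWriter output = new BufferedWriter(
                new OutputStreamWriter(Files.newOutputStream(file), StandardCharsets.UTF_8))) {
            output.write(line + "\n");
            if (flush) {
                output.flush();
            }
            // The size is evaluated here, before the writer is closed.
            return Files.size(file);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a raw tweet: 2500 ASCII characters of JSON.
        String fakeTweet = "{\"text\":\"" + "x".repeat(2489) + "\"}";

        System.out.println("without flush: " + bytesVisible(fakeTweet, false) + " bytes on disk");
        System.out.println("with flush:    " + bytesVisible(fakeTweet, true) + " bytes on disk");
    }
}
```

Without the flush, the line is smaller than the writer's buffer, so fewer bytes (typically none) have reached the file at the moment it is inspected. With the flush, the whole line plus its newline is there. Once several tweets are written back to back, the buffer can spill at an arbitrary point in the middle of a JSON string, which is consistent with the unterminated string the parser reports.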