I'm new to the Twitter API, and I'm wondering: if I use the Search API and call it every minute to retrieve 1000 tweets, will I get duplicate tweets in case fewer than 1000 tweets matching my criteria were created, or if I call it more than once a minute?
I hope the question is clear. In case it matters, I use the python-twitter
library, and this is the way I get tweets:
self.api = twitter.Api(consumer_key, consumer_secret, access_key, access_secret)
self.api.VerifyCredentials()
self.api.GetSearch(self.hashtag, per_page=100)
Your search results overlap because the API has no idea what you searched for before. One way to prevent overlap is to use the tweet ID of the last retrieved tweet. Here is a Python 2.7 snippet of code:
import json
import time

maxid = 10000000000000000000
for i in range(0, 10):
    with open('output.json', 'a') as outfile:
        time.sleep(5)  # don't piss off Twitter
        print 'maxid=', maxid, ', twitter loop', i
        results = api.GetSearch('search_term', count=100, max_id=maxid)
        for tweet in results:
            tweet = str(tweet).replace('\n', ' ').replace('\r', ' ')  # remove newlines
            tweet = json.loads(tweet)
            maxid = tweet['id'] - 1  # redefine maxid; subtract 1 since max_id is inclusive
            json.dump(tweet, outfile)
            outfile.write('\n')  # write one tweet per line
This code gives 10 loops of 100 tweets, each batch older than the last ID, which is redefined each time through the loop. It writes to a JSON file (with one tweet per line). I use this code to search the recent past; you can adapt it to get non-overlapping tweets going forward by changing 'max_id' to 'since_id'.
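The since_id idea can be sketched without hitting the network. Below is a minimal sketch of the dedup logic only; the helper names (poll_new_tweets, fake_fetch) are mine, not part of python-twitter, and fake_fetch stands in for a real api.GetSearch(..., since_id=...) call. The invariant: remember the highest tweet ID seen so far, and on each poll keep only tweets with a strictly larger ID.

```python
# Sketch of since_id-style polling: repeated polls never yield duplicates
# because only tweets newer than the last seen ID are accepted.

def poll_new_tweets(fetch, since_id):
    """fetch(since_id) returns a list of tweet dicts (newest first).
    Returns (new_tweets, updated_since_id)."""
    new_tweets = [t for t in fetch(since_id) if t['id'] > since_id]
    if new_tweets:
        # Tweet IDs increase over time, so the max ID marks our new watermark.
        since_id = max(t['id'] for t in new_tweets)
    return new_tweets, since_id

# Fake stand-in for api.GetSearch(term, since_id=...) so the sketch runs offline.
def fake_fetch(since_id, _timeline=({'id': 3}, {'id': 2}, {'id': 1})):
    return [t for t in _timeline if t['id'] > since_id]

batch1, since_id = poll_new_tweets(fake_fetch, 0)         # first poll: all tweets
batch2, since_id = poll_new_tweets(fake_fetch, since_id)  # second poll: nothing new
```

Calling the real API on a timer with this pattern answers the original question: even if fewer than 1000 new tweets exist between calls, you only ever collect each tweet once.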