Python - Nested for loops with large data set

I have a list of sublists, each of which consists of one or more strings. I am comparing each string in one sublist to every string in the other sublists, which means writing two nested for loops. However, my data set has ~5000 sublists, so the program keeps running forever unless I run the code in increments of 500 sublists. How do I change the flow of the program so that I can still look at all j values corresponding to each i, and yet be able to run it for all ~5000 sublists? (wn is the WordNet library.) Here's part of my code:

    from nltk.corpus import wordnet as wn  # assumed: wn is NLTK's WordNet interface

    for i in range(len(somelist)):
        if i == len(somelist) - 1:  # last sublist: nothing left to compare with
            break
        title_former = somelist[i]
        for word in title_former:
            singular = wn.morphy(word)  # convert to singular
            if singular is None:
                pass
            else:
                newwordsyn = getnewwordsyn(word, singular)
                if not newwordsyn:
                    uncounted_words.append(word)
                else:
                    # compare against every word in every later sublist
                    for j in range(i + 1, len(somelist)):
                        title_latter = somelist[j]
                        for word1 in title_latter:
                            singular1 = wn.morphy(word1)
                            if singular1 is None:
                                uncounted_words.append(word1)
                            else:
                                newwordsyn1 = getnewwordsyn(word1, singular1)
                                tempsimilarity = newwordsyn.wup_similarity(newwordsyn1)

Example:

    input  = [['space', 'invaders'], ['draw']]
    output = {('space', 'draw'): 0.5, ('invaders', 'draw'): 0.2}

The output is a dictionary that maps each string-pair tuple to its similarity value. The code snippet above is not complete.

You could try this, but I doubt it will be faster (and you would need to change the distance function):

    import itertools

    def dist(s1, s2):
        # number of differing positions plus the difference in length
        return sum([i != j for i, j in zip(s1, s2)]) + abs(len(s1) - len(s2))

    dict([((k, v), dist(k, v)) for k, v in itertools.product(input1, input2)])
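Independent of the distance function, one thing that may help is that the loop in the question calls wn.morphy and getnewwordsyn again for every inner word on every outer iteration. A rough sketch of resolving each distinct word only once and then walking the sublist pairs with itertools.combinations is below. The names pairwise_similarities, lookup, similarity, toy_lookup and toy_similarity are all hypothetical: lookup stands in for the morphy + getnewwordsyn step and similarity for wup_similarity, and the toy versions exist only so the sketch runs without WordNet.

```python
from itertools import combinations, product


def pairwise_similarities(sublists, lookup, similarity):
    """Return (scores, skipped) for words that sit in different sublists.

    lookup(word)     -> resolved object or None (hypothetical stand-in for
                        wn.morphy + getnewwordsyn from the question)
    similarity(a, b) -> number (hypothetical stand-in for wup_similarity)
    """
    # Resolve every distinct word exactly once instead of once per pair.
    resolved = {}
    for sublist in sublists:
        for word in sublist:
            if word not in resolved:
                resolved[word] = lookup(word)

    # Words that could not be resolved are reported once each.
    skipped = [w for w, obj in resolved.items() if obj is None]

    # Keep only resolvable words, preserving the sublist grouping.
    usable = [[w for w in sub if resolved[w] is not None] for sub in sublists]

    scores = {}
    # combinations() yields each pair of sublists (i < j) exactly once.
    for former, latter in combinations(usable, 2):
        for a, b in product(former, latter):
            if (a, b) not in scores:  # do not rescore a repeated word pair
                scores[(a, b)] = similarity(resolved[a], resolved[b])
    return scores, skipped


# Toy stand-ins (assumptions, not WordNet) so the sketch is runnable.
def toy_lookup(word):
    return word if word.isalpha() else None


def toy_similarity(a, b):
    return 1.0 if a[0] == b[0] else 0.0


scores, skipped = pairwise_similarities(
    [["space", "invaders"], ["draw"], ["42"]], toy_lookup, toy_similarity
)
```

The number of word pairs is still quadratic, so this only removes the repeated lookups and repeated scoring of identical pairs; whether that is enough for ~5000 sublists depends on how expensive the real lookup and similarity calls are.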
