Join the Stack Overflow Community
Stack Overflow is a community of 6.7 million programmers, just like you, helping each other.
Join them; it only takes a minute:
Sign up

I have a list of lists, which looks like

listOfLists = [
    ['a','b','c','d'],
    ['a','b'],
    ['a','c'],
    ['c','c','c','c']  
 ] 

I want to count the number of lists which have a particular element. For Example, my output should be

{'a':3,'b':2,'c':3,'d':1}

As you can see, I don't need the total count of an element. In the case of "c", though its total count is 5, the output is 3 as it occurs only in 3 lists.

I am using a counter to get the counts. The same can be seen below.

line_count_tags = []
for lists in lists_of_lists:
    s = set()
    for element in lists:
         s.add(t)
    lines_count_tags.append(list(s))

count = Counter([count for counts in lines_count_tags for count in counts])

So, when I print count, I get

{'a':3,'c':3,'b':2,'d':1}

I want to know if there's a much better way to accomplish my goal.

share|improve this question
up vote 8 down vote accepted

Use a Counter and convert each list to a set. The set will remove any duplicates from each list so that you don't count duplicate values in the same list:

>>> from collections import Counter

>>> Counter(item for lst in listOfLists for item in set(lst))
Counter({'a': 3, 'b': 2, 'c': 3, 'd': 1})

If you like functional programming you can also feed a chain of set-mapped listOfLists to the Counter:

>>> from collections import Counter
>>> from itertools import chain

>>> Counter(chain.from_iterable(map(set, listOfLists)))
Counter({'a': 3, 'b': 2, 'c': 3, 'd': 1})

Which is totally equivalent (except maybe being a bit faster) to the first approach.

share|improve this answer

I would convert each list as a set before counting in a generator comprehension passed to Counter:

import collections
print(collections.Counter(y for x in listOfLists for y in set(x)))

result:

Counter({'a': 3, 'c': 3, 'b': 2, 'd': 1})

(that's practically what you did, but the above code shorts a lot of loops and temporary list creations)

share|improve this answer

You can do it without a Counter, too:

result = {}
for lis in listOfLists:
    for element in set(lis):
        result[element] = result.get(element, 0) + 1
print result  # {'a': 3, 'c': 3, 'b': 2, 'd': 1}

Not the most elegant, but should be considerably faster.

share|improve this answer

A bit of a stylistic difference on the Counter approach with itertools.chain.from_iterable may look like

Counter(chain.from_iterable(map(set, listOfLists)))

Demo

>>> from itertools import chain
>>> from collections import Counter
>>> Counter(chain.from_iterable(map(set, listOfLists)))
Counter({'a': 3, 'b': 2, 'c': 3, 'd': 1})

Rough benchmark

%timeit Counter(item for lst in listOfLists for item in set(lst))
100000 loops, best of 3: 13.5 µs per loop

%timeit Counter(chain.from_iterable(map(set, listOfLists)))
100000 loops, best of 3: 12.4 µs per loop
share|improve this answer
1  
or chain.from_iterable – Copperfield 4 hours ago
    
I get much faster execution using itertools.chain (~40%!) on CPython 2.7.11. Still, Counter + itertools.chain execute 4 times slower than the raw method I presented. – zwer 4 hours ago
1  
@zwer Eh, depends what input size we are discussing. My solution has more overhead, but if you increase the input size it shall be faster. That's why the benchmarking isn't all too important :) – Mitch 3 hours ago
    
True that, I was just surprised at the stark difference in speed at my place, I'm not used to itertools actually outperforming, well, pretty much anything - they are usually the slower, but easier to read choice :D – zwer 3 hours ago
from collections import Counter
from itertools import chain

Just convert to set and then feed into a Counter.

inp = [
    ['a','b','c','d'],
    ['a','b'],
    ['a','c'],
    ['c','c','c','c']  
 ] 


print(Counter(chain.from_iterable(map(set,inp))))
share|improve this answer

This approach calculates the unique entries in listOfLists using set comprehension, and then counts occurrences in each list using dictionary comprehension

A = {val for s in listOfLists for val in s}
d = {i: sum( i in j for j in listOfLists) for i in A}
print(d) # {'a': 3, 'c': 3, 'b': 2, 'd': 1}

I'll admit it's a little ugly, but it's a possible solution (and a cool use of dictionary comprehension). You could also make this a one-liner by moving the calculation of A right into the dictionary comprehension

share|improve this answer
    
there is no need to cast your set A to a list again or feed the set with a list comprehension, a generation expression is better... actually you can build A as a set comprehension too – Copperfield 4 hours ago
    
@Copperfield Thanks for your suggestion. I've made a change. – nbryans 4 hours ago

Here is another version using loops:

listOfLists = [
    ['a','b','c','d'],
    ['a','b'],
    ['a','c'],
    ['c','c','c','c']
    ]

final = {}
for lst in listOfLists:
    for letter in lst:
        if letter in final:
            final[letter] += 1
        else:
            final[letter] = 1

print final

So create an empty dictionary called final. Then loop through each letter of each list. Create a new key and value = 1 if the letter does not yet exist in final as a key. Otherwise add 1 to the value for that key.

share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.