If you want to calculate a set containing all subsets of set (also called power set) you could either choose an recursive approach or try this iterative approach which is faster than the recursive one.
def get_subsets(fullset): listrep = list(fullset) subsets = [] for i in range(2**len(listrep)): subset = [] for k in range(len(listrep)): if i & 1<<k: subset.append(listrep[k]) subsets.append(subset) return subsets subsets = get_subsets(set([1,2,3,4])) print(subsets) print(len(subsets))
You can also find a shorter version at the end of the article, but to understand the principle the algorithm above is more suitable.
A powerset of a set with \(n\) elements always contains \(2^n\) elements. For example, the power set of \(\{1,2\}\) is \(\{\{\}, \{1\}, \{2\}, \{1,2\}\}\)
By iterating from \(0\) to \(2^n\) we iterate \(2^n\) times and the binary representation of every iteration index can be used to determine whether to include an element from the original set or not.
For example, iteration 0 would be represented by 0…0, so we would not include any element and produce the empty set which is a subset of every set.
Iteration 0…01 would only include element 1 as there is only one 1 in the binary representation of the iteration index. 0…0101 would add the subset containing element 1 and 3, and so on.
The inner loop iterates from \(k=0\) to \(k=n\) and \(i \& 1<<k\) is true when there is a 1 in the binary representation at the index corresponding to element k.
Please note that this algorithm makes use of big integers, so if you want to calculate the power set of a set containing more than 32 elements (or 64 on 64 bit systems) you will run into overflow problems and then should better use a recursive approach.
Here is a more pythonic solution that uses list comprehensions:
def get_subsets(fullset): listrep = list(fullset) n = len(listrep) return [[listrep[k] for k in range(n) if i&1<<k] for i in range(2**n)]
This is really helpful to my assignment! Thanks Simon!
BTW, may I ask what does “i&1 << k" mean?
Hi,
first you have to understand what 1 << k means. “<<” is the bit-shift operator (to the left). This means that if k = 1, 1 is moved 1 position to the left.
k = 1 (decimal) => 0001 (binary) << 1 = 0010 (binary) = 2 (decimal)
k = 2 (decimal) => 0001 (binary) << 2 = 0100 (binary) = 4 (decimal)
and so on…
“&” is the bit-wise and operator. It takes two integers, looks at their binary representation and returns an integer that has 1 in the binary representation at every position where the two input integers both have a 1.
Example:
a = 5 (decimal) = 0101 (binary)
b = 3 (decimal) = 0011 (binary)
=> a & b = 0101 & 0011 = 0001 (binary) = 1 (decimal)
a = 2d = 0010b
b = 6d = 0110b
=> a & b = 0010b & 0110b = 0010b = 2
In this algorithm, the right side of the bit-wise and is always a power of 2. This means that the binary representation has exactly one 1 in it and every other position is 0.
In general: 1 << k = 2^k
So i & 1 << k basically checks if the number i has a 1 at position k in its binary representation. If yes, element k is included in subset i.
I hope this helps you
Another important thing to remember about this is that the left shift (<<) takes precedence over the arithmetic and (&), because the left shift is prioritized as exponentiation (see Simon's remark about 1 << k being equivalent to 2 ** k) and arithmetic and is bit-wise multiplication.
Hi, I have a question about the speed of these algorithms.
Having timed this code, alongside a version that uses recursion, it seems that the recursive method is consistently faster by a significant margin.
Can you explain why this might be?
Hi,
did you implement both versions using Python? Did you time it a few times and take the average?
And how big is the difference?
I have not timed the algorithm yet, and it is also surprising to me that recursion is faster. The runtime of the version in this article should be O(n * 2^n) with n= length of input set. I am not 100% sure what the O for the recursive version will be. Quick google search indicates that it will also be O(n * 2^n).
So maybe the python interpreter is able to do some optimizations on the recursive version but can’t optimize the non-recursive one. One thing that could be slowing down the non-recursive algorithm is the bit-wise operations. But that is just guessing.
Can you share the code that you used and the times that you measured?
Best,
Simon
For the assignment I’m doing, I need to ask for the user and use that input to find powerset. Users are typing string like abc not inputs like 123. Can you help me solve it.
The exact same algorithm can be used for both strings and integer sets. Just convert your string to a list of chars.
When I run this as a sum solver, I only get one result. When using another algorithm, I get more than 5 results.
I cannot figure out why it only provides one result when there are many solutions:
def get_subsets(fullset):
listrep = list(fullset)
subsets = []
for i in range(2**len(listrep)):
subset = []
for k in range(len(listrep)):
if i & 1<<k:
subset.append(listrep[k])
subsets.append(subset)
return subsets
subsets = get_subsets(set([98.31,27.46,8869.56,522.4,4150.1,540.4,30.42,6067.79,98.31,269.2,3463.84,720.93,2978.08,107.36,117.08,107.36,229.77,1444.9,118.66,101.14,1264.58,89.44,429.99,530.14,113.57]))
SumTuples = map(sum,subsets)
list_a = list(subsets)
list_b = list(SumTuples)
Result = list(zip(list_a, list_b))
[t for t in Result if t[1] == 21158.62]