General query regarding removal of duplicate sublists from a list of lists

We need to eliminate the duplicate sublists from a list of lists. I understand the part where the sublists are sorted and turned into tuples, but what does the ‘*’ mean in this case? Does anyone have any insights?


Hi,

The star (*) here unpacks an iterable (a tuple, a list, and so on) into its individual values. For instance, consider this code:

def transpose_list(list_of_lists):
    return [
        list(row)
        for row in zip(*list_of_lists)
    ]

Here, a list of lists is passed to the transpose_list function. Inside it, zip iterates over the sublists in parallel, so it needs each sublist as a separate argument. Unpacking with the star (*) does exactly that: zip(*list_of_lists) is equivalent to zip(lst1, lst2, ...). For example, with the input [[1, 4, 7], [2, 5, 8], [3, 6, 9]], the output is:

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
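
To make the unpacking concrete, here is a minimal sketch. The input `matrix` is an assumed example (the post does not include the call itself), chosen so that the result matches the output above. It compares the starred call with the same call written out by hand.

```python
def transpose_list(list_of_lists):
    # zip(*list_of_lists) passes each sublist to zip as a separate argument
    return [list(row) for row in zip(*list_of_lists)]

# Assumed example input; not part of the original post
matrix = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

unpacked = transpose_list(matrix)
# The same call written out by hand, one argument per sublist
explicit = [list(row) for row in zip(matrix[0], matrix[1], matrix[2])]

print(unpacked)              # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(unpacked == explicit)  # True
```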

Now, for the question you mentioned:

  1. We need to eliminate the duplicate sublists from the list of lists.
  2. To do this, we sort the elements of each sublist, convert each sorted sublist to a tuple, and collect these tuples in a list.

[tuple(sorted(i)) for i in lst]

  3. Then we unpack this list of tuples into a set literal, so that duplicate tuples are removed.

{*[tuple(sorted(i)) for i in lst]}

  4. Lastly, we iterate over the elements of this set, convert each tuple back to a list, and store the lists in the final list.

[list(i) for i in {*[tuple(sorted(i)) for i in lst]}]
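
The steps above can be run one at a time. This is a sketch with an assumed sample `lst`. Note that because each sublist is sorted first, sublists holding the same elements in a different order (such as [1, 2] and [2, 1]) count as duplicates, and because a set is unordered, the order of the result is not guaranteed.

```python
# Assumed sample data; not from the original post
lst = [[1, 2], [2, 1], [3, 4], [1, 2], [4, 3, 5]]

# Sort each sublist and store it as a tuple
as_tuples = [tuple(sorted(i)) for i in lst]
# as_tuples is [(1, 2), (1, 2), (3, 4), (1, 2), (3, 4, 5)]

# Unpack the list into a set literal, which drops duplicate tuples
unique = {*as_tuples}

# Convert each remaining tuple back to a list
result = [list(i) for i in unique]

# Set iteration order is not guaranteed, so sort before printing
print(sorted(result))  # [[1, 2], [3, 4], [3, 4, 5]]
```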

More things to note:

  1. You can avoid the (*) operator by using the built-in set() function. set(), list(), and tuple() each take a single iterable argument, so no unpacking is needed; unpacking is only required with the literal notations { }, [ ], and ( ). Therefore, another way to write this is:

[list(i) for i in set([tuple(sorted(i)) for i in lst])]

  2. A tuple was used instead of a list to hold the sorted elements because set elements must be hashable, and lists, being mutable, are not.

  3. Instead of building a list of tuples first, you can unpack a generator expression straight into the set. This avoids creating the intermediate list:

[list(i) for i in {*(tuple(sorted(i)) for i in lst)}]
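
A short sketch of the notes above, using an assumed sample `lst`. It checks that the `{*...}` form, the `set()` form, and the generator form build the same set, and adds a plain set comprehension for comparison (an alternative not mentioned in the thread). It also shows that putting a list in a set raises a `TypeError`.

```python
# Assumed sample data; not from the original post
lst = [[3, 1], [1, 3], [2, 2], [5]]

unpacked = {*[tuple(sorted(i)) for i in lst]}    # unpack a list into a set literal
via_set = set([tuple(sorted(i)) for i in lst])   # pass the list to set() directly
via_gen = {*(tuple(sorted(i)) for i in lst)}     # unpack a generator expression
comprehension = {tuple(sorted(i)) for i in lst}  # set comprehension, no unpacking

print(unpacked == via_set == via_gen == comprehension)  # True

# Lists are unhashable, so they cannot be set elements
try:
    {[1, 2], [3, 4]}
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```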

I hope this explanation clears up your doubt.


Thanks a lot, this was a massive help.
