Guide: ordered set in Python

Python has no built-in ordered set, but you can use collections.OrderedDict from the Python standard library with just keys (and values as None) for the same purpose.

Update: As of Python 3.7 (and, as an implementation detail, CPython 3.6), the standard dict is guaranteed to preserve insertion order and is more performant than OrderedDict. (For backward compatibility and especially readability, however, you may wish to continue using OrderedDict.)

Here's an example of how to use dict as an ordered set to filter out duplicate items while preserving order, thereby emulating an ordered set. Use the dict class method fromkeys() to create a dict, then simply ask for the keys() back.

>>> keywords = ['foo', 'bar', 'bar', 'foo', 'baz', 'foo']

>>> list(dict.fromkeys(keywords))
['foo', 'bar', 'baz']
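
Beyond de-duplicating a list, a plain dict can also stand in for a mutable ordered set. The following is a minimal sketch, assuming Python 3.7+; the variable name `seen` is only illustrative.

```python
# A plain dict used as an ordered set (order is guaranteed from Python 3.7).
seen = dict.fromkeys(['foo', 'bar', 'bar', 'baz'])

seen['qux'] = None      # add an element (it goes to the end)
seen['foo'] = None      # re-adding an existing element keeps its original position
seen.pop('bar', None)   # discard an element; no error if it is absent

print('baz' in seen)    # membership test
print(list(seen))       # elements in insertion order
```

Note that set operators such as | and & are not available on the dict itself this way; only adding, discarding, membership and ordered iteration are covered.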

Asclepius

answered Dec 6, 2018 at 18:21

There is an ordered set recipe for this, which is referred to from the Python 2 documentation. It runs on Python 2.6 or later and 3.0 or later without any modifications. The interface is almost exactly the same as a normal set, except that initialisation should be done with a list.

OrderedSet([1, 2, 3])

This is a MutableSet, so the signature for .union doesn't match that of set, but since it includes __or__ something similar can easily be added:

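@staticmethod
def union(*sets):
    # Build a new OrderedSet holding the elements of all given sets, in order.
    result = OrderedSet()
    result.update(*sets)
    return result

def update(self, *sets):
    # In-place union; named update() to mirror set.update() and to avoid
    # overriding the static union() above (two methods cannot share a name).
    for other in sets:
        self |= other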

LondonRob

answered Oct 31, 2009 at 10:15

Casebash

Update: This answer is obsolete as of Python 3.7. See jrc's answer above for a better solution. Will keep this answer here only for historical reasons.


An ordered set is functionally a special case of an ordered dictionary.

The keys of a dictionary are unique. Thus, if one disregards the values in an ordered dictionary (e.g. by assigning them None), then one has essentially an ordered set.

As of Python 3.1 and 2.7 there is collections.OrderedDict. The following is an example implementation of an OrderedSet. (Note that only a few methods need to be defined or overridden: collections.OrderedDict and collections.abc.MutableSet do the heavy lifting. On Python 2, MutableSet lives directly in collections.)

import collections
import collections.abc  # MutableSet moved here in Python 3.3

class OrderedSet(collections.OrderedDict, collections.abc.MutableSet):

    def update(self, *args, **kwargs):
        if kwargs:
            raise TypeError("update() takes no keyword arguments")

        for s in args:
            for e in s:
                self.add(e)

    def add(self, elem):
        self[elem] = None

    def discard(self, elem):
        self.pop(elem, None)

    def __le__(self, other):
        return all(e in other for e in self)

    def __lt__(self, other):
        return self <= other and self != other

    def __ge__(self, other):
        return all(e in self for e in other)

    def __gt__(self, other):
        return self >= other and self != other

    def __repr__(self):
        return 'OrderedSet([%s])' % (', '.join(map(repr, self.keys())))

    def __str__(self):
        return '{%s}' % (', '.join(map(repr, self.keys())))
    
    difference = property(lambda self: self.__sub__)
    difference_update = property(lambda self: self.__isub__)
    intersection = property(lambda self: self.__and__)
    intersection_update = property(lambda self: self.__iand__)
    issubset = property(lambda self: self.__le__)
    issuperset = property(lambda self: self.__ge__)
    symmetric_difference = property(lambda self: self.__xor__)
    symmetric_difference_update = property(lambda self: self.__ixor__)
    union = property(lambda self: self.__or__)

answered Oct 31, 2009 at 10:17

Stephan202

Implementations on PyPI

While others have pointed out that there is no built-in implementation of an insertion-order preserving set in Python (yet), I feel that this question is missing an answer that states what can be found on PyPI.

There are the packages:

  • ordered-set (Python based)
  • orderedset (Cython based)
  • collections-extended
  • boltons (under iterutils.IndexedSet, Python-based)
  • oset (last updated in 2012)

Some of these implementations are based on the recipe posted by Raymond Hettinger to ActiveState which is also mentioned in other answers here.

Some differences

  • ordered-set (version 1.1)
      • advantage: O(1) for lookups by index (e.g. my_set[5])
  • oset (version 0.1.3)
      • advantage: O(1) for remove(item)
      • disadvantage: apparently O(n) for lookups by index

Both implementations have O(1) for add(item) and __contains__(item) (item in my_set).
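
To illustrate where an O(1) lookup by index can come from, here is a rough sketch that pairs a list (for order and indexing) with a dict (for membership). The class name `IndexableOrderedSet` is hypothetical, and this is not presented as the actual code of any of the packages above.

```python
class IndexableOrderedSet:
    """Illustrative only: a list holds the order, a dict maps item -> position."""

    def __init__(self, iterable=()):
        self._items = []   # items in insertion order
        self._index = {}   # item -> position in self._items
        for item in iterable:
            self.add(item)

    def add(self, item):
        # O(1) on average: append only if the item is new.
        if item not in self._index:
            self._index[item] = len(self._items)
            self._items.append(item)

    def __contains__(self, item):
        return item in self._index      # O(1) on average

    def __getitem__(self, position):
        return self._items[position]    # O(1) lookup by index

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


s = IndexableOrderedSet("abracadabra")
print(list(s))   # unique letters in first-seen order
print(s[2])      # third distinct letter
```

With this layout, removing an item would mean shifting and re-indexing everything after it, which is O(n). A linked structure would make removal cheap but give up constant-time indexing, which is the kind of trade-off listed above.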

answered Apr 22, 2014 at 16:22

Daniel K

I can do you one better than an OrderedSet: boltons has a pure-Python, 2/3-compatible IndexedSet type that is not only an ordered set, but also supports indexing (as with lists).

Simply pip install boltons (or copy setutils.py into your codebase), import the IndexedSet and:

>>> from boltons.setutils import IndexedSet
>>> x = IndexedSet(list(range(4)) + list(range(8)))
>>> x
IndexedSet([0, 1, 2, 3, 4, 5, 6, 7])
>>> x - set(range(2))
IndexedSet([2, 3, 4, 5, 6, 7])
>>> x[-1]
7
>>> fcr = IndexedSet('freecreditreport.com')
>>> ''.join(fcr[:fcr.index('.')])
'frecditpo'

Everything is unique and retained in order. Full disclosure: I wrote the IndexedSet, but that also means you can bug me if there are any issues. :)

NOhs

answered Feb 7, 2016 at 20:41

If you're using the ordered set to maintain a sorted order, consider using a sorted set implementation from PyPI. The sortedcontainers module provides a SortedSet for just this purpose. Some benefits: pure-Python, fast-as-C implementations, 100% unit test coverage, hours of stress testing.

Installing from PyPI is easy with pip:

pip install sortedcontainers

Note that if you can't pip install, simply pull down the sortedlist.py and sortedset.py files from the open-source repository.

Once installed you can simply:

from sortedcontainers import SortedSet
help(SortedSet)

The sortedcontainers module also maintains a performance comparison with several alternative implementations.

For the comment that asked about Python's bag data type, there's alternatively a SortedList data type which can be used to efficiently implement a bag.
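
If installing a package is not an option and the collection is small, the sorted-and-unique idea above can be approximated with the standard library's bisect module. This is only a sketch: the helper name `add_sorted_unique` is made up here, and it is not how sortedcontainers works internally (each insertion into a plain list is O(n)).

```python
import bisect

def add_sorted_unique(items, value):
    """Insert value into the sorted list `items` unless it is already present."""
    position = bisect.bisect_left(items, value)   # binary search, O(log n)
    if position == len(items) or items[position] != value:
        items.insert(position, value)             # list insert, O(n)

values = []
for v in [5, 1, 3, 1, 5, 2]:
    add_sorted_unique(values, v)
print(values)
```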

answered Sep 23, 2014 at 6:52

GrantJ

As other answers mention, as of Python 3.7 the dict is ordered by definition. Instead of subclassing OrderedDict we can subclass collections.abc.MutableSet or typing.MutableSet, using the dict's keys to store our values.

import typing

T = typing.TypeVar("T")

class OrderedSet(typing.MutableSet[T]):
    """A set that preserves insertion order by internally using a dict."""

    def __init__(self, iterable: typing.Iterable[T]):
        self._d = dict.fromkeys(iterable)

    def add(self, x: T) -> None:
        self._d[x] = None

    def discard(self, x: T) -> None:
        self._d.pop(x, None)

    def __contains__(self, x: object) -> bool:
        return self._d.__contains__(x)

    def __len__(self) -> int:
        return self._d.__len__()

    def __iter__(self) -> typing.Iterator[T]:
        return self._d.__iter__()

    def __str__(self):
        return f"{{{', '.join(str(i) for i in self)}}}"

    def __repr__(self):
        return f"<OrderedSet {self}>"

Then just:

x = OrderedSet([1, 2, -1, "bar"])
x.add(0)
assert list(x) == [1, 2, -1, "bar", 0]
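
One benefit of building on MutableSet is that the set operators come from the abstract base class as mixin methods. The sketch below uses a compact, untyped variant of the class above (with an optional constructor argument, which is an addition made here) so that it runs on its own.

```python
from collections.abc import MutableSet

class OrderedSet(MutableSet):
    """Compact variant of the class above, without type hints."""

    def __init__(self, iterable=()):
        self._d = dict.fromkeys(iterable)

    def add(self, x):
        self._d[x] = None

    def discard(self, x):
        self._d.pop(x, None)

    def __contains__(self, x):
        return x in self._d

    def __len__(self):
        return len(self._d)

    def __iter__(self):
        return iter(self._d)


a = OrderedSet([3, 1, 2])
b = OrderedSet([2, 5, 3])

print(list(a | b))      # union: elements of a, then new elements of b
print(list(a - b))      # difference: elements of a not in b
print(sorted(a & b))    # intersection (sorted here for a stable display)
print(a == {1, 2, 3})   # equality ignores order, as with built-in sets
```

The order of the intersection result depends on how the mixin iterates over its operands, so it is probably best not to rely on it.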

I added this code, with some tests, in a small library, so anyone can just pip install it.

answered May 26, 2020 at 10:09

bustawin

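In case you're already using pandas in your code, its Index object behaves much like an ordered set, as shown in this article.

Examples adapted from the article:

import pandas as pd

indA = pd.Index([1, 3, 5, 7, 9])
indB = pd.Index([2, 3, 5, 7, 11])

# The article used the operators &, |, - and ^. In recent pandas versions these
# act element-wise rather than as set operations, so the named methods are used.
indA.intersection(indB)          # intersection
indA.union(indB)                 # union
indA.difference(indB)            # difference
indA.symmetric_difference(indB)  # symmetric difference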

answered Sep 25, 2015 at 14:13

Berislav Lopac

There's no OrderedSet in the standard library. I put together a cheatsheet of the standard data structures for your reference.

DataStructure = {
    'Collections': {
        'Map': [
            ('dict', 'OrderedDict', 'defaultdict'),
            ('ChainMap', 'types.MappingProxyType')
        ],
        'Set': [('set', 'frozenset'), {'multiset': 'collections.Counter'}]
    },
    'Sequence': {
        'Basic': ['list', 'tuple', 'iterator']
    },
    'Algorithm': {
        'Priority': ['heapq', 'queue.PriorityQueue'],
        'Queue': ['queue.Queue', 'multiprocessing.Queue'],
        'Stack': ['collections.deque', 'queue.LifoQueue']
    },
    'text_sequence': ['str', 'bytes', 'bytearray']
}

fhdrsdg

answered Dec 6, 2017 at 10:50

AbstProcDo

As others have said, OrderedDict is a superset of an ordered set in terms of functionality, but if you need a set for interacting with an API and don't need it to be mutable, OrderedDict.keys() is actually an implementation of collections.abc.Set:

import random
from collections import OrderedDict, abc

a = list(range(0, 100))
random.shuffle(a)

# True
a == list(OrderedDict((i, 0) for i in a).keys())

# True
isinstance(OrderedDict().keys(), abc.Set)   

The caveats are immutability and having to build up the set like a dict, but it's simple and only uses built-ins.
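
The same holds for the keys view of a plain dict on Python 3.7+. As a small sketch (the variable names are illustrative), note in particular what the set operators on a keys view return:

```python
from collections.abc import Set

ordered = dict.fromkeys(['pear', 'apple', 'fig'])   # plain dict, Python 3.7+
keys = ordered.keys()

print(isinstance(keys, Set))        # the view satisfies the Set ABC
print(list(keys))                   # insertion order is preserved
print('fig' in keys)                # membership test, O(1) on average

common = keys & {'fig', 'pear', 'kiwi'}
print(type(common).__name__)        # the operators return a plain set
```

Because the result of an operator like & is an ordinary set, the ordering is only available on the view itself, not on the results of set operations.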

answered Sep 2, 2020 at 2:33

David Ehrmann

The ParallelRegression package provides a setList( ) ordered set class that is more method-complete than the options based on the ActiveState recipe. It supports all methods available for lists and most if not all methods available for sets.

answered Jan 21, 2017 at 22:45

There is a package on PyPI that does this:

pip install ordered-set

Then you can use it:

from ordered_set import OrderedSet

answered Apr 4 at 20:04

Watchdog101

For many purposes simply calling sorted will suffice. For example:

>>> s = set([0, 1, 2, 99, 4, 40, 3, 20, 24, 100, 60])
>>> sorted(s)
[0, 1, 2, 3, 4, 20, 24, 40, 60, 99, 100]

If you are going to use this repeatedly, calling sorted each time incurs overhead, so you might want to save the resulting list once you're done changing the set. If you need to keep the elements both unique and sorted, I agree with the suggestion of using OrderedDict from collections with an arbitrary value such as None.

answered Feb 20, 2013 at 22:52

hwrd

I also had a small list into which I could clearly end up introducing non-unique values.

I searched for the existence of a unique list of some sort, but then realized that testing the existence of the element before adding it works just fine.

if new_element not in my_list:
    my_list.append(new_element)

I don't know if there are caveats to this simple approach, but it solves my problem.
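
One caveat worth noting is that `in` on a list scans it element by element, so each check is O(n) and building a long list this way becomes quadratic. A possible workaround, sketched here under the assumption that the elements are hashable, is to keep a companion set for the membership test; the helper name `append_unique` is made up for this example.

```python
def append_unique(items, seen, new_element):
    """Append new_element to items only if it has not been seen before."""
    if new_element not in seen:     # set lookup, O(1) on average
        seen.add(new_element)
        items.append(new_element)

my_list = []
seen = set()
for element in ['b', 'a', 'b', 'c', 'a']:
    append_unique(my_list, seen, element)
print(my_list)
```

For a small list the original one-line check is perfectly adequate; the extra set only pays off as the list grows.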

answered Jul 16, 2018 at 2:40

Loïc N.