Python:对称差异排序列表(Python: Symmetric Difference Sorted List)

编程入门 行业动态 更新时间:2024-10-24 18:22:37
Python:对称差异排序列表(Python: Symmetric Difference Sorted List)

有没有一种好方法可以在python中获取两个排序列表的对称差异,并返回一个排序列表作为结果。 我当前的版本似乎是一个糟糕的工作(转换为设置,找到对称的差异,转换回列表,然后度假)

使用Numpy的解决方案很好,正在排序的数据类型是整数。

sorted_symdiff(list1,list2): """ Each list is already sorted, this seems inefficient """ s1,s2 = set(list1),set(list2) diff = list(s1.symmetric_difference(s2)) diff.sort() return diff

Is there a good way to take the symmetric difference of two sorted lists in python and return a sorted list as a result. My current version seems like a poor work around (convert to set, find symmetric difference, convert back to list, then resort)

Solutions with Numpy are fine the data type being sorted are ints.

sorted_symdiff(list1,list2): """ Each list is already sorted, this seems inefficient """ s1,s2 = set(list1),set(list2) diff = list(s1.symmetric_difference(s2)) diff.sort() return diff

最满意答案

是的,有办法。 您必须利用两个序列进行排序的事实。 您需要在逐个比较元素的同时遍历两者,并在沿着每个序列前进时构建对称差异。

如果您熟悉大O表示法 ,则以下代码的复杂度为O(m+n) ,其中m = len(seq1) , n = len(seq2)

算法的复杂性为O(log(m+n)*(m+n))因为您需要对结果集进行排序。

警告:

这个答案主要是演示如何利用排序输入的练习。

尽管复杂度更高,但对于大多数输入,其执行时间比使用python内置set方法的原始海报代码慢。 在python中,集合在c代码下实现。 纯蟒蛇将很难击败它。 需要非常大的输入才能看到任何优势(如果有任何优势可见)。 这个算法效率最高,但这并不意味着它更快 - 也不意味着你应该使用它:set内置方法经过优化并经过测试c代码; 它们使代码更易于编写,读取,理解,调试和维护。

码:

def get_symmetric_difference(seq1, seq2): """ computes the symmetric difference of unique elements of seq1 & seq2 as a new sorted list, without mutating the parameters. seq1: a sorted sequence of int seq2: a sorted sequence of int return: a new sorted list containing the symmetric difference of unique elements of seq1 & seq2 """ if not seq1: symmetric_difference = seq2[:] return symmetric_difference if not seq2: symmetric_difference = seq1[:] return symmetric_difference symmetric_difference = [] idx = 0 jdx = 0 last_insert = None last_seen = None while idx < len(seq1) and jdx < len(seq2): s1 = seq1[idx] s2 = seq2[jdx] if s1 == s2: idx += 1 jdx += 1 last_seen = s1 elif s1 < s2: if last_insert != s1 and last_seen != s1: symmetric_difference.append(s1) last_insert = s1 idx += 1 elif s2 < s1: if last_insert != s2 and last_seen != s2: symmetric_difference.append(s2) last_insert = s2 jdx += 1 if len(seq1[idx:]) > len(seq2[jdx:]): for elt in seq1[idx:]: if last_insert != elt and last_seen != elt: symmetric_difference.append(elt) last_insert = elt last_seen = elt else: for elt in seq2[jdx:]: if last_insert != elt and last_seen != elt: symmetric_difference.append(elt) last_insert = elt last_seen = elt return symmetric_difference

测试:

def test_get_symmetric_difference(): seq1 = [] seq2 = [] assert get_symmetric_difference(seq1, seq2) == [] seq1 = [1] seq2 = [] assert get_symmetric_difference(seq1, seq2) == [1] seq1 = [1, 2, 3, 4] seq2 = [-2, -1, 5, 6, 7, 8] assert get_symmetric_difference(seq1, seq2) == [-2, -1, 1, 2, 3, 4, 5, 6, 7, 8] seq1 = [ -1, 1, 2, 3, 4, 6, 9, 22, 34] seq2 = [-2, -1, 5, 6, 7, 8, 19, 22, 43] assert get_symmetric_difference(seq1, seq2) == [-2, 1, 2, 3, 4, 5, 7, 8, 9, 19, 34, 43] seq1 = [-2, -1, 5, 6, 7, 8, 19, 22, 43] seq2 = [ -1, 1, 2, 3, 4, 6, 9, 22, 34] assert get_symmetric_difference(seq1, seq2) == [-2, 1, 2, 3, 4, 5, 7, 8, 9, 19, 34, 43] seq1 = [-2, -1, 0, 5, 22, 34] seq2 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] assert get_symmetric_difference(seq1, seq2) == [0, 1, 2, 3, 4, 5, 6, 9] seq1 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] seq2 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] assert get_symmetric_difference(seq1, seq2) == [] seq1 = [7, 7, 7, 7, 7, 7] seq2 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] assert get_symmetric_difference(seq1, seq2) == [-2, -1, 1, 2, 3, 4, 6, 7, 9, 22, 34] seq1 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] seq2 = [7, 7, 7, 7, 7, 7] assert get_symmetric_difference(seq1, seq2) == [-2, -1, 1, 2, 3, 4, 6, 7, 9, 22, 34] seq1 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] seq2 = [-1, -1, 7, 7, 43, 43, 43] assert get_symmetric_difference(seq1, seq2) == [-2, 1, 2, 3, 4, 6, 7, 9, 22, 34, 43] seq1 = [34, 34, 34, 34] seq2 = [7, 34] assert get_symmetric_difference(seq1, seq2) == [7] seq1 = [7, 34] seq2 = [34, 34, 34, 34] assert get_symmetric_difference(seq1, seq2) == [7] seq1 = [7, 34] seq2 = [7, 7, 7, 7, 7] assert get_symmetric_difference(seq1, seq2) == [34] seq1 = [7, 7, 7, 7, 34] seq2 = [7, 7] assert get_symmetric_difference(seq1, seq2) == [34] print("***all tests pass***") test_get_symmetric_difference()

输出:

***all tests pass***

Yes, there is a way. You must take advantage of the fact that the two sequences are sorted. You need to traverse both while comparing the elements one by one, and constructing the symmetric difference as you progress along each sequence.

If you are familiar with big O notation, the complexity of the following code is O(m+n) where m = len(seq1) and n = len(seq2)

The complexity of your algorithm is O(log(m+n)*(m+n)) because you need to sort the resulting set.

Caveat:

This answer is mostly an exercise to demonstrate how to take advantage of a sorted input.

In spite of a better complexity, for most inputs, its execution times are slower than the original poster's code that uses python builtin set methods. In python, sets are implemented in c code under the hood. Pure python will have a hard time beating that. Very large input would be necessary to see any advantage (if any is at all visible). This algorithm is the most efficient, but that does not mean that it is faster - nor does it mean that you should use it: set builtin methods are optimized and battle tested c code; they make for code that is simpler to write, read, understand, debug, and maintain.

code:

def get_symmetric_difference(seq1, seq2): """ computes the symmetric difference of unique elements of seq1 & seq2 as a new sorted list, without mutating the parameters. seq1: a sorted sequence of int seq2: a sorted sequence of int return: a new sorted list containing the symmetric difference of unique elements of seq1 & seq2 """ if not seq1: symmetric_difference = seq2[:] return symmetric_difference if not seq2: symmetric_difference = seq1[:] return symmetric_difference symmetric_difference = [] idx = 0 jdx = 0 last_insert = None last_seen = None while idx < len(seq1) and jdx < len(seq2): s1 = seq1[idx] s2 = seq2[jdx] if s1 == s2: idx += 1 jdx += 1 last_seen = s1 elif s1 < s2: if last_insert != s1 and last_seen != s1: symmetric_difference.append(s1) last_insert = s1 idx += 1 elif s2 < s1: if last_insert != s2 and last_seen != s2: symmetric_difference.append(s2) last_insert = s2 jdx += 1 if len(seq1[idx:]) > len(seq2[jdx:]): for elt in seq1[idx:]: if last_insert != elt and last_seen != elt: symmetric_difference.append(elt) last_insert = elt last_seen = elt else: for elt in seq2[jdx:]: if last_insert != elt and last_seen != elt: symmetric_difference.append(elt) last_insert = elt last_seen = elt return symmetric_difference

tests:

def test_get_symmetric_difference(): seq1 = [] seq2 = [] assert get_symmetric_difference(seq1, seq2) == [] seq1 = [1] seq2 = [] assert get_symmetric_difference(seq1, seq2) == [1] seq1 = [1, 2, 3, 4] seq2 = [-2, -1, 5, 6, 7, 8] assert get_symmetric_difference(seq1, seq2) == [-2, -1, 1, 2, 3, 4, 5, 6, 7, 8] seq1 = [ -1, 1, 2, 3, 4, 6, 9, 22, 34] seq2 = [-2, -1, 5, 6, 7, 8, 19, 22, 43] assert get_symmetric_difference(seq1, seq2) == [-2, 1, 2, 3, 4, 5, 7, 8, 9, 19, 34, 43] seq1 = [-2, -1, 5, 6, 7, 8, 19, 22, 43] seq2 = [ -1, 1, 2, 3, 4, 6, 9, 22, 34] assert get_symmetric_difference(seq1, seq2) == [-2, 1, 2, 3, 4, 5, 7, 8, 9, 19, 34, 43] seq1 = [-2, -1, 0, 5, 22, 34] seq2 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] assert get_symmetric_difference(seq1, seq2) == [0, 1, 2, 3, 4, 5, 6, 9] seq1 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] seq2 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] assert get_symmetric_difference(seq1, seq2) == [] seq1 = [7, 7, 7, 7, 7, 7] seq2 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] assert get_symmetric_difference(seq1, seq2) == [-2, -1, 1, 2, 3, 4, 6, 7, 9, 22, 34] seq1 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] seq2 = [7, 7, 7, 7, 7, 7] assert get_symmetric_difference(seq1, seq2) == [-2, -1, 1, 2, 3, 4, 6, 7, 9, 22, 34] seq1 = [-2, -1, 1, 2, 3, 4, 6, 9, 22, 34] seq2 = [-1, -1, 7, 7, 43, 43, 43] assert get_symmetric_difference(seq1, seq2) == [-2, 1, 2, 3, 4, 6, 7, 9, 22, 34, 43] seq1 = [34, 34, 34, 34] seq2 = [7, 34] assert get_symmetric_difference(seq1, seq2) == [7] seq1 = [7, 34] seq2 = [34, 34, 34, 34] assert get_symmetric_difference(seq1, seq2) == [7] seq1 = [7, 34] seq2 = [7, 7, 7, 7, 7] assert get_symmetric_difference(seq1, seq2) == [34] seq1 = [7, 7, 7, 7, 34] seq2 = [7, 7] assert get_symmetric_difference(seq1, seq2) == [34] print("***all tests pass***") test_get_symmetric_difference()

output:

***all tests pass***

更多推荐

本文发布于:2023-08-07 20:52:00,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1465954.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:对称   差异   列表   Python   List

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!