Combining values for a large number of overlapping intervals of dictionary keys

Updated: 2024-10-16 22:20:17
Question

I have a dictionary of dictionaries with items like this:

    all = {
        1: {('a', 123, 145): 20, ('a', 155, 170): 12, ('b', 234, 345): 34},
        2: {('a', 121, 135): 10, ('a', 155, 175): 28, ('b', 230, 345): 16},
        3: {('a', 130, 140): 20, ('a', 150, 170): 10, ('b', 234, 345): 30},
        ...
        n: {...}
    }

Edit: I assign the dictionary names arbitrarily, based on the file names the initial data is read from, so I can use any values I like as names. I would like to get the sum of these values for each overlapping region. The output should look like this:

    {('a', 121, 122): 10, ('a', 123, 130): 30, ('a', 131, 135): 50, ('a', 136, 140): 40, ('a', 141, 145): 20, ...}

Edit: Each dictionary contains only non-overlapping intervals, so a given dictionary never holds both ('a',2,10) and ('a',3,12). The intervals do overlap between dictionaries, though, because the start and end positions differ (i.e. the keys are not the same between dictionaries).

I don't have to use dictionaries. Since I created this structure in the first place, I can just as easily put the data into lists, sets, etc. if that makes the task easier, so a solution based on a different data structure works for me as well.

Thanks for your help.

Recommended answer

OK, now I think I get it: basically you have a bunch of overlapping intervals, which you can picture as bars at a certain position with a given thickness. You would draw these bars below each other and see how thick they are together at any given point.

I think it's easiest/fastest to abuse the fact that you have integer positions to do this:

    from collections import defaultdict

    all = {
        1: {('a', 123, 145): 20, ('a', 155, 170): 12, ('b', 234, 345): 34},
        2: {('a', 121, 135): 10, ('a', 155, 175): 28, ('b', 230, 345): 16},
        3: {('a', 130, 140): 20, ('a', 150, 170): 10, ('b', 234, 345): 30},
    }

    summer = defaultdict(int)
    mini, maxi = None, None
    for d in all.values():
        for (name, start, stop), value in d.items():
            # I'm completely ignoring the `name` here, not sure if that's what
            # you want; otherwise just separate the data by name before doing this
            mini = start if mini is None else min(mini, start)
            maxi = stop if maxi is None else max(maxi, stop)
            for i in range(start, stop + 1):
                summer[i] += value

    # now we have the value at each point: very redundant, but very fast so far
    print(summer)

    # now we can find the intervals:
    def get_intervals(points, start, stop):
        cstart = start
        for i in range(start, stop + 1):
            if points[cstart] != points[i]:  # did the value change?
                yield cstart, i - 1, points[cstart]
                cstart = i
        # emit the final run, even if it is a single point
        yield cstart, stop, points[cstart]

    print(list(get_intervals(summer, mini, maxi)))

When using only the 'a' items it gives:

    [(121, 122, 10), (123, 129, 30), (130, 135, 50), (136, 140, 40), (141, 145, 20), (146, 149, 0), (150, 154, 10), (155, 170, 50), (171, 175, 28)]
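
The per-point approach above ignores `name`. As a minimal sketch of what "separating the data first" could look like, the following keeps one point-to-total mapping per name. The helper name `sum_per_name` and the variable `data` are hypothetical (not from the question), and the sketch assumes integer positions with inclusive endpoints:

```python
from collections import defaultdict

def sum_per_name(dicts):
    """Return {name: [(start, stop, total), ...]} with inclusive integer bounds."""
    # one point -> total mapping per name
    points = defaultdict(lambda: defaultdict(int))
    for d in dicts:
        for (name, start, stop), value in d.items():
            for i in range(start, stop + 1):
                points[name][i] += value

    result = {}
    for name, totals in points.items():
        lo, hi = min(totals), max(totals)
        intervals = []
        cstart = lo
        for i in range(lo, hi + 1):
            # .get avoids inserting new keys for uncovered positions
            if totals.get(i, 0) != totals.get(cstart, 0):
                intervals.append((cstart, i - 1, totals.get(cstart, 0)))
                cstart = i
        # close the final run
        intervals.append((cstart, hi, totals.get(cstart, 0)))
        result[name] = intervals
    return result

# sample data copied from the question (named `data` to avoid shadowing the builtin `all`)
data = {
    1: {('a', 123, 145): 20, ('a', 155, 170): 12, ('b', 234, 345): 34},
    2: {('a', 121, 135): 10, ('a', 155, 175): 28, ('b', 230, 345): 16},
    3: {('a', 130, 140): 20, ('a', 150, 170): 10, ('b', 234, 345): 30},
}
result = sum_per_name(data.values())
print(result['a'])
print(result['b'])
```

For the sample data, `result['a']` should match the list shown above for the 'a' items. Memory and run time grow with the length of the intervals, not just their count, so this is only reasonable while the positions stay fairly small.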

Edit: It just hit me how to do this really simply:

    from collections import defaultdict
    from heapq import heappush, heappop

    class Summer(object):
        def __init__(self):
            # it's a priority queue, kind of like a sorted list
            self.hq = []

        def additem(self, start, stop, value):
            # at `start` add it as a positive value
            heappush(self.hq, (start, value))
            # at `stop` subtract that value again
            heappush(self.hq, (stop, -value))

        def intervals(self):
            hq = self.hq
            start, val = heappop(hq)
            while hq:
                point, value = heappop(hq)
                yield start, point, val
                # just maintain the current value and where the interval started
                val += value
                start = point
            assert val == 0

    summers = defaultdict(Summer)
    for d in all.values():
        for (name, start, stop), value in d.items():
            summers[name].additem(start, stop, value)

    for name, s in summers.items():
        print(name, list(s.intervals()))
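
Note that the heap version removes each value exactly at `stop`, so neighbouring intervals share a boundary point, and two events at the same position produce a zero-length interval such as (155, 155, 22). If the endpoints are meant to be inclusive integers, as in the first approach, one possible variant removes the value at `stop + 1` and merges events that fall on the same position. This is only a sketch under that assumption; it uses a plain sort instead of a heap, and the names `sweep`, `data`, `by_name` and `summed` are hypothetical:

```python
from collections import defaultdict

def sweep(intervals):
    """intervals: iterable of (start, stop, value) with inclusive integer bounds.
    Returns a list of (start, stop, total) from the smallest start to the largest stop."""
    delta = defaultdict(int)
    for start, stop, value in intervals:
        delta[start] += value      # value becomes active at `start`
        delta[stop + 1] -= value   # and is no longer active after `stop`

    result = []
    total = 0
    prev = None
    for point in sorted(delta):
        if prev is not None:
            # `total` held on every integer position from prev to point - 1
            if result and result[-1][2] == total:
                # same total as the previous run: extend it instead of splitting
                result[-1] = (result[-1][0], point - 1, total)
            else:
                result.append((prev, point - 1, total))
        total += delta[point]
        prev = point
    return result

# sample data copied from the question
data = {
    1: {('a', 123, 145): 20, ('a', 155, 170): 12, ('b', 234, 345): 34},
    2: {('a', 121, 135): 10, ('a', 155, 175): 28, ('b', 230, 345): 16},
    3: {('a', 130, 140): 20, ('a', 150, 170): 10, ('b', 234, 345): 30},
}

# group the intervals by name before sweeping
by_name = defaultdict(list)
for d in data.values():
    for (name, start, stop), value in d.items():
        by_name[name].append((start, stop, value))

summed = {name: sweep(items) for name, items in by_name.items()}
print(summed['a'])
print(summed['b'])
```

Unlike the per-point version, the work here depends only on the number of intervals (dominated by the sort), not on how long they are, which matters when there are many long intervals.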

Published: 2023-11-29 20:11:32
Article link: https://www.elefans.com/category/jswz/34/1647507.html