Natural sort for Django queries

Updated: 2024-10-22 04:26:24

Problem description

Let's say I have this Django model:

    from django.db import models

    class Question(models.Model):
        question_code = models.CharField(max_length=10)

and I have 15k questions in the database.

I want to sort them by question_code, which is alphanumeric. This is a classic problem and has been discussed in:

  • blog.codinghorror/sorting-for-humans-natural-sort-order/
  • Does Python have a built in function for string natural sort?

I tried the code from the 2nd link (copied below, slightly changed) and noticed that it takes up to 3 seconds to sort the data. To check the function's own performance, I wrote a test that creates a list of 100k random alphanumeric strings. Sorting that list takes only 0.76s. So what's happening?
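A timing test along those lines could look like the sketch below. It is not the asker's actual test: the helper names (alphanum_key, random_code), the string length of 10 and the fixed seed are assumptions made for illustration, and the elapsed time will vary by machine.

```python
import random
import re
import string
import time


def alphanum_key(s):
    """Split a string into text and integer chunks, e.g. 'q10a' -> ['q', 10, 'a']."""
    return [int(c) if c.isdigit() else c for c in re.split('([0-9]+)', s)]


def random_code(rng, length=10):
    """Build one random alphanumeric string (hypothetical test data)."""
    alphabet = string.ascii_uppercase + string.digits
    return ''.join(rng.choice(alphabet) for _ in range(length))


rng = random.Random(0)  # fixed seed so runs are repeatable
codes = [random_code(rng) for _ in range(100_000)]

# Time only the sort, with no database involved.
start = time.perf_counter()
ordered = sorted(codes, key=alphanum_key)
elapsed = time.perf_counter() - start
print(f"sorted {len(ordered)} strings in {elapsed:.2f}s")
```

Because nothing here touches the database, any large gap between this figure and the time seen in the Django view points at how the rows are fetched rather than at the sort itself.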

This is what I think is going on. The function needs the question_code of each question in order to compare them, so calling it to sort 15k values means querying MySQL 15k separate times, and that is why it takes so long. Any ideas? And is there a general solution for natural sorting in Django? Thanks a lot!

    import re

    def natural_sort(l, ascending, key=lambda s: s):
        def get_alphanum_key_func(key):
            # Digit runs become ints so that '10' sorts after '2'.
            convert = lambda text: int(text) if text.isdigit() else text
            return lambda s: [convert(c) for c in re.split('([0-9]+)', key(s))]
        sort_key = get_alphanum_key_func(key)
        # sorted() is ascending by default, so reverse only when ascending is False.
        return sorted(l, key=sort_key, reverse=not ascending)
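To make the effect of the key function concrete, here is a small standalone sketch. The name natural_key and the sample codes are made up for illustration; the splitting logic is the same re.split approach as above.

```python
import re


def natural_key(text):
    """Turn 'Q10' into ['Q', 10, ''] so digit runs compare as numbers."""
    return [int(c) if c.isdigit() else c for c in re.split('([0-9]+)', text)]


codes = ['Q10', 'Q2', 'Q1', 'Q21', 'Q3']

plain = sorted(codes)                       # character-by-character order
natural = sorted(codes, key=natural_key)    # human-friendly order

print(plain)    # ['Q1', 'Q10', 'Q2', 'Q21', 'Q3']
print(natural)  # ['Q1', 'Q2', 'Q3', 'Q10', 'Q21']
```

The key is computed once per item, so the cost of the sort depends on how cheaply each question_code can be read, not on the comparison itself.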

Recommended answer

As far as I'm aware there isn't a generic Django solution to this. You can reduce your memory usage and limit your db queries by building an id/question_code lookup structure:

    from natsort import natsorted

    question_code_lookup = Question.objects.values('id', 'question_code')
    ordered_question_codes = natsorted(question_code_lookup, key=lambda i: i['question_code'])

Assuming you want to page the results, you can then slice ordered_question_codes, perform another query to retrieve just the questions you need, and order them according to their position in that slice:

    # get the first 20 questions
    ordered_question_codes = ordered_question_codes[:20]
    question_ids = [q['id'] for q in ordered_question_codes]
    questions = Question.objects.filter(id__in=question_ids)

    # put them back into question code order
    id_to_pos = dict(zip(question_ids, range(len(question_ids))))
    questions = sorted(questions, key=lambda x: id_to_pos[x.id])
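The same slice-then-reorder idea can be tried without a database. In the sketch below plain dicts stand in for the rows returned by values('id', 'question_code'), and the helper names (natural_key, page_ids, reorder) are hypothetical; in a real view the fetched list would come from the id__in query instead.

```python
import re


def natural_key(text):
    """Split into text and integer chunks so 'Q10' sorts after 'Q2'."""
    return [int(c) if c.isdigit() else c for c in re.split('([0-9]+)', text)]


# Hypothetical rows standing in for the id/question_code lookup structure.
rows = [
    {'id': 1, 'question_code': 'Q10'},
    {'id': 2, 'question_code': 'Q2'},
    {'id': 3, 'question_code': 'Q1'},
    {'id': 4, 'question_code': 'Q21'},
]


def page_ids(rows, page, per_page):
    """Return the ids for one page, in natural question_code order."""
    ordered = sorted(rows, key=lambda r: natural_key(r['question_code']))
    start = (page - 1) * per_page
    return [r['id'] for r in ordered[start:start + per_page]]


def reorder(fetched, ids):
    """Put fetched rows back into the order given by ids."""
    pos = {pk: i for i, pk in enumerate(ids)}
    return sorted(fetched, key=lambda r: pos[r['id']])


ids = page_ids(rows, page=1, per_page=2)
# Simulates a fetch whose result order does not match the requested id order.
fetched = [r for r in rows if r['id'] in ids]
first_page = reorder(fetched, ids)
print([r['question_code'] for r in first_page])  # ['Q1', 'Q2']
```

Only the small id/question_code pairs are sorted in full; the complete rows are loaded one page at a time.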

If the lookup structure still uses too much memory, or takes too long to sort, then you'll have to come up with something more advanced. This approach certainly wouldn't scale well to a huge dataset.
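One possible "more advanced" direction, which is not part of the answer above, is to precompute a sortable key when a question is saved and let the database order by it. The sketch below shows only the key function; the name padded_sort_key, the width of 10, and the idea of storing the result in an extra indexed column (say, a hypothetical sort_key field used with order_by('sort_key')) are all assumptions. It only matches natural order when no digit run is longer than width and the codes share the same non-digit layout.

```python
import re


def padded_sort_key(code, width=10):
    """Left-pad every run of digits with zeros so that plain string
    ordering of the result agrees with numeric ordering of the digits."""
    return re.sub(r'[0-9]+', lambda m: m.group().zfill(width), code)


codes = ['Q10', 'Q2', 'Q1', 'Q21']

print(padded_sort_key('Q2', width=4))        # Q0002
print(sorted(codes, key=padded_sort_key))    # ['Q1', 'Q2', 'Q10', 'Q21']
```

The trade-off is extra storage and the need to keep the stored key in sync whenever question_code changes.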

Published: 2023-10-14 22:03:01
Article link: https://www.elefans.com/category/jswz/34/1492343.html