Efficient Django QuerySet regex

Updated: 2024-10-23 09:24:00

Problem description

I have a model like this:

    from django.db import models

    class CampaignPermittedURL(models.Model):
        hostname = models.CharField(max_length=255)
        path = models.CharField(max_length=255, blank=True)

I will frequently be handed a URL, which I can urlsplit into a hostname and a path. What I'd like is for the end user to be able to enter a hostname (yahoo) and, optionally, a path (weddings).
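As a minimal sketch of that splitting step, the helper below breaks a URL into hostname labels and path segments. The name `split_url` is hypothetical and not part of the original post. It assumes scheme-less input such as `www.yahoo/weddings/newyork` should be treated as host plus path, so it prepends `//`; without that prefix, `urlsplit` places the whole string in `path` and leaves `hostname` as `None`.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into (hostname labels, path segments).

    Hypothetical helper for illustration only.
    """
    # urlsplit only recognises a network location after "//".
    if "//" not in url:
        url = "//" + url
    parts = urlsplit(url)
    host_labels = parts.hostname.split(".") if parts.hostname else []
    # Drop the empty strings produced by leading or trailing slashes.
    path_segments = [segment for segment in parts.path.split("/") if segment]
    return host_labels, path_segments


host_labels, path_segments = split_url("www.yahoo/weddings/newyork")
print(host_labels)    # ['www', 'yahoo']
print(path_segments)  # ['weddings', 'newyork']
```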

I'd like to detect when a URL does not "match" that hostname/path combination, like so:

  • success: www.yahoo/weddings/newyork
  • success: yahoo/weddings
  • failure: cnn
  • failure: cnn/weddings
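One way to read the examples above is as suffix matching on hostname labels and prefix matching on path segments. The plain-Python sketch below encodes that reading. The function names are hypothetical, and the rule itself is an assumption inferred from the four examples rather than something the post states outright.

```python
def host_matches(url_host, permitted_host):
    """True if permitted_host is empty or a label-aligned suffix of url_host."""
    if not permitted_host:
        return True
    url_labels = url_host.lower().split(".")
    permitted_labels = permitted_host.lower().split(".")
    return url_labels[-len(permitted_labels):] == permitted_labels


def path_matches(url_path, permitted_path):
    """True if permitted_path is empty or a segment-aligned prefix of url_path."""
    url_segments = [s for s in url_path.split("/") if s]
    permitted_segments = [s for s in permitted_path.split("/") if s]
    return url_segments[:len(permitted_segments)] == permitted_segments


def is_permitted(url_host, url_path, permitted_host, permitted_path=""):
    """Combine both checks for one permitted (hostname, path) entry."""
    return host_matches(url_host, permitted_host) and path_matches(url_path, permitted_path)


# The permitted entry from the question: hostname "yahoo", path "weddings".
print(is_permitted("www.yahoo", "/weddings/newyork", "yahoo", "weddings"))  # True
print(is_permitted("cnn", "/weddings", "yahoo", "weddings"))                # False
```

Comparing whole labels and segments, rather than raw substrings, keeps a host such as `notyahoo` from matching `yahoo`.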

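I think the best approach would be:

    from urllib.parse import urlsplit

    # The leading "//" makes urlsplit treat "www.yahoo" as the hostname.
    url = urlsplit("//www.yahoo/weddings/newyork")
    # split hostname on . and path on /
    matches = CampaignPermittedURL.objects.filter(
        hostname__regex=r'(com|yahoo|www.yahoo)',
        path__regex=r'(weddings|weddings/newyork)',
    )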

Does anybody have a better idea? I am using PostgreSQL and would otherwise try Django Full Text Search, but I'm not sure whether that's worth it or whether it fits my needs any better than this. Are there other methods that are equally fast?

Keep in mind that my method has the URL passed to it and that the CampaignPermittedURL table may hold several hundred records. I am looking for extensible, maintainable solutions first and foremost, but the solution also needs to be efficient, since this will be scaled to several hundred calls a second.

I'm also fine with using another back-end (Sphinx?), but I am most concerned about staying with standard Django to the highest degree possible.

Recommended answer

I ended up constructing a "verbose" regex and using the ORM as specified in the question. This should be quite fast while not departing from Django:

    # >>> url.hostname.split(".")
    # ["bakery", "yahoo", "com"]
    host_list = url.hostname.split(".")

    # Build a regex like
    # r"^$|^[.]?com$|^[.]?yahoo[.]com$|^[.]?bakery[.]yahoo[.]com$"
    # "[.]" matches a literal dot without needing a backslash escape.
    host_list.reverse()
    suffix = r""
    host_regex = r"^$"
    for host in host_list:
        suffix = r"[.]" + host + suffix
        # suffix[3:] drops the leading "[.]"
        host_regex = host_regex + r"|^[.]?" + suffix[3:] + r"$"

    # If the table holds no rows at all, bypass the filter.
    if CampaignRequiredURL.objects.exists():
        if not CampaignRequiredURL.objects.filter(hostname__iregex=host_regex).exists():
            pass  # Do something based on a hit.
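The regex-building loop above can also be wrapped in a small function and checked locally before it is handed to the ORM. The sketch below is one possible arrangement, not the answerer's exact code: `build_host_regex` is a hypothetical name, and it assumes hostname labels contain only letters, digits, and hyphens, so no escaping is applied. Python's `re` module is used here only to exercise the pattern; PostgreSQL evaluates `iregex` with its own engine, although the constructs involved (anchors, alternation, `[.]`, `?`) are basic ones.

```python
import re


def build_host_regex(hostname):
    """Build an alternation matching "" and every dot-aligned suffix of hostname.

    Hypothetical helper; assumes labels need no regex escaping.
    """
    labels = hostname.split(".")
    alternatives = ["^$"]
    # Walk from the shortest suffix ("com") to the full hostname.
    for start in range(len(labels) - 1, -1, -1):
        suffix = "[.]".join(labels[start:])
        alternatives.append("^[.]?" + suffix + "$")
    return "|".join(alternatives)


host_regex = build_host_regex("bakery.yahoo.com")
print(host_regex)
# ^$|^[.]?com$|^[.]?yahoo[.]com$|^[.]?bakery[.]yahoo[.]com$

# Stored hostnames that should and should not match this URL's hostname.
for stored in ["", "yahoo.com", ".yahoo.com", "cnn.com"]:
    print(repr(stored), bool(re.search(host_regex, stored, re.IGNORECASE)))
```

The resulting string would then be passed as `hostname__iregex=host_regex`, as in the answer's code.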

Published: 2023-11-29 23:16:45
Article link: https://www.elefans.com/category/jswz/34/1647929.html