Extract a number from a string, after a certain character

Updated: 2024-06-14 16:57:17

python

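Ayyyy, I need some help. I have the following strings, always in "char,num" format:

    s = "abcdef,12"
    v = "gbhjjj,699"

I want to get just the digits after the comma. How do I do that without splitting the string with the comma as a delimiter?

I tried s[-2:] and v[-3:], which work, but how do I make it work without knowing the number of digits?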

Best answer

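Assuming:

- You know there is a comma in the string, so you don't have to search the entire string to find out whether there is one.
- You know the pattern is 'many_not_digits,few_digits', so there is a big imbalance between the sizes of the parts on either side of the comma.
- You can get to the end of the string without walking it, which you can in Python because string indexing is constant time.

Then you could start from the end and walk backwards looking for the comma, which would be less overall work for your examples than walking from the left looking for the comma.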
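The backwards walk described above could be wrapped in a small function. This is only a sketch: the name digits_after_comma is made up for illustration, and the ValueError for a missing comma is an added safety check that the stated assumptions would make unnecessary.

```python
def digits_after_comma(text: str) -> str:
    """Return everything after the last comma, scanning from the right."""
    i = len(text) - 1              # jump straight to the last character
    while i >= 0 and text[i] != ',':
        i -= 1                     # step left until the comma is found
    if i < 0:
        # Hypothetical guard, not part of the original approach:
        # the answer assumes a comma is always present.
        raise ValueError("no comma in string")
    return text[i + 1:]


print(digits_after_comma("abcdef,12"))   # 12
print(digits_after_comma("gbhjjj,699"))  # 699
```

The loop only touches the digit part plus the comma, so its cost depends on the number of digits rather than on the length of the text before the comma.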

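Doing work in Python code is way slower than using Python engine code written in C, right? So would it really be faster?

Make a string "aaaaa....,12" and use the timeit module to compare each approach: split, or right-walk. By default, timeit runs the code a million times. Extend the length of "aaaaaaaaaaaaaaaa....,12" to make it extreme.

How do they compare?

- String split: 1400 "a"s, run a million times, took 1 second.
- String split: 4000 "a"s, run a million times, took 2 seconds.
- Right walk: 1400 "a"s, run a million times, took 0.4 seconds.
- Right walk: 999,999 "a"s, run a million times, took ... 0.4 seconds.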

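    from timeit import timeit

    # Approach 1: split on the comma and keep the last field.
    _split = """num = x.split(',')[-1]"""

    # Approach 2: walk backwards from the end until the comma is found.
    _rwalk = """
    i = -1
    while x[i] != ',':
        i -= 1
    num = x[i+1:]
    """

    print(timeit(_split, setup='x="a"*1400 + ",12"'))
    print(timeit(_rwalk, setup='x="a"*999999 + ",12"'))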

e.g.

    1.0063155219977489  # "aaa...,12" for 1400 chars, string split
    0.4027107510046335  # "aaa...,12" for 999999 chars, rwalked. Faster.

Try it online at repl.it
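For a like-for-like comparison, both approaches can also be timed on the same input. This is a minimal sketch, assuming a 100,000-character prefix and 200 runs; both numbers are arbitrary choices for illustration and are not taken from the measurements above. The helper names by_split and by_rwalk are hypothetical, and timings vary by machine, so none are shown.

```python
from timeit import timeit

x = "a" * 100_000 + ",12"  # assumed test input, not from the original post


def by_split(s):
    # Split on every comma and keep the last field.
    return s.split(',')[-1]


def by_rwalk(s):
    # Walk backwards from the end until the comma is found.
    i = -1
    while s[i] != ',':
        i -= 1
    return s[i + 1:]


# Both approaches should agree before their speed is compared.
assert by_split(x) == by_rwalk(x) == "12"

split_time = timeit(lambda: by_split(x), number=200)
rwalk_time = timeit(lambda: by_rwalk(x), number=200)
print(f"split: {split_time:.6f}s  rwalk: {rwalk_time:.6f}s")
```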

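I don't think this is algorithmically better than O(n), but under the assumptions above you have more knowledge than str.split() has, and can leverage that to skip walking most of the string and beat it in practice. The longer the text part and the shorter the digit part, the more you benefit.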
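The same "search from the right" idea is also available through built-in string methods, which in CPython run in C rather than in a Python-level loop. The sketch below is offered as an alternative and is not part of the original answer; no timings were measured for it here.

```python
s = "abcdef,12"
v = "gbhjjj,699"

# str.rfind returns the index of the last comma, or -1 if there is none.
print(s[s.rfind(',') + 1:])   # 12

# str.rpartition splits once at the last comma: (head, separator, tail).
print(v.rpartition(',')[2])   # 699
```

Note that if the comma is missing, rfind returns -1 and the slice yields the whole string, so this also relies on the assumption that a comma is always present.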

Published: 2023-04-12 20:03:00

Article link: https://www.elefans.com/category/dzcp/03dcfec91faa0e3aa2755029d235d41e.html