Extract all numbers from a single string in R

Updated: 2024-10-21 23:22:56

Question

Suppose you have a string:

strLine <- "The transactions (on your account) were as follows: 0 3,000 (500) 0 2.25 (1,200)"

Is there a function that strips the numbers out into an array/vector, producing the following required result:

result <- c(0, 3000, -500, 0, 2.25, -1200)?

That is,

result[3] = -500

Note that the numbers are presented in accounting form, so negative numbers appear between (). You can also assume that only numbers appear to the right of the first occurrence of a number. I am not that good with regexp, so I would appreciate help if one is required. I also don't want to assume the string is always the same, so I am looking to strip out all words (and any special characters) before the location of the first number.

Recommended answer

library(stringr)
x <- str_extract_all(strLine, "\\(?[0-9,.]+\\)?")[[1]]
x
# [1] "0" "3,000" "(500)" "0" "2.25" "(1,200)"
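One caveat with the pattern above: the character class [0-9,.] also matches a lone comma or full stop, so punctuation in the leading text could be picked up as a token. The sketch below is not part of the original answer. It uses base R (regmatches and gregexpr) and a stricter pattern that requires at least one digit; the sample string and the variable names are made up for illustration.

```r
# Illustrative input with punctuation before the first number (an assumption)
s <- "Totals, as agreed. Items: 0 3,000 (500) 2.25"

# Pattern from the answer: also matches the stray "," and "."
loose <- regmatches(s, gregexpr("\\(?[0-9,.]+\\)?", s))[[1]]

# Stricter variant: optional "(", a leading digit, digits/commas,
# an optional decimal part, then an optional ")"
strict <- regmatches(s, gregexpr("\\(?[0-9][0-9,]*(\\.[0-9]+)?\\)?", s))[[1]]

loose
# [1] ","     "."     "0"     "3,000" "(500)" "2.25"
strict
# [1] "0"     "3,000" "(500)" "2.25"
```

For the example string in the question both patterns return the same tokens, because the text before the first number contains no commas or full stops.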

Change the parentheses to negative signs:

x <- gsub("\\((.+)\\)", "-\\1", x)
x
# [1] "0" "3,000" "-500" "0" "2.25" "-1,200"
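The substitution works through a capture group: (.+) captures whatever sits between the parentheses, and \\1 in the replacement puts it back after a minus sign. As a small variation (an assumption, not something the original answer does), the pattern can be anchored with ^ and $ so that only tokens wrapped entirely in parentheses are rewritten:

```r
# Tokens as they would come out of the extraction step
tokens <- c("0", "3,000", "(500)", "2.25", "(1,200)")

# ^ and $ anchor the match to the whole token; \\1 is the captured content
negated <- sub("^\\((.+)\\)$", "-\\1", tokens)

negated
# [1] "0"      "3,000"  "-500"   "2.25"   "-1,200"
```

The result is still a character vector; the thousands separators have to be removed before it can be converted to numbers.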

Then use as.numeric() or taRifx::destring to finish up (the next version of destring will support negatives by default, so the keep option won't be necessary):

library(taRifx)
destring(x, keep = "0-9.-")
# [1] 0 3000 -500 0 2.25 -1200

Or:

as.numeric(gsub(",", "", x))
# [1] 0 3000 -500 0 2.25 -1200
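The steps in this answer can also be combined into one function that needs no extra packages. This is a minimal sketch under some assumptions: the function name parse_accounting is hypothetical, the extraction uses base R's regmatches and gregexpr instead of stringr, and the pattern requires a leading digit, which is slightly stricter than the one in the answer.

```r
# Hypothetical helper (not from the original answer): extract accounting-style
# numbers from a single string using base R only.
parse_accounting <- function(s) {
  # Tokens: optional "(", digits with commas, optional decimals, optional ")"
  tokens <- regmatches(s, gregexpr("\\(?[0-9][0-9,]*(\\.[0-9]+)?\\)?", s))[[1]]
  # A token wrapped in parentheses is a negative amount
  negative <- grepl("^\\(.*\\)$", tokens)
  # Drop parentheses and thousands separators, then convert to numeric
  values <- as.numeric(gsub("[(),]", "", tokens))
  values[negative] <- -values[negative]
  values
}

strLine <- "The transactions (on your account) were as follows: 0 3,000 (500) 0 2.25 (1,200)"
result <- parse_accounting(strLine)
result[3]
# [1] -500
```

Because the pattern only matches tokens that contain a digit, the words and the "(on your account)" text before the first number are skipped without a separate stripping step.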


Published: 2023-06-13 17:41:51

Article link: https://www.elefans.com/category/jswz/34/686694.html