我正在尝试编写SQL查询,该查询从给定的字符串中提取所有数字序列.例如,像这样的字符串:
I'm trying to write SQL query which pull out all number sequences from given string. For example from string like:
'<rej:idnymu s7fn we2dfgdg="dfgrgdfgfZszik">278579133</rej:> <rej:idnymu k3op d5dfgdg="dfgrgdfgfZszik">118545112</rej:> <rej:idnymu s1rd s9dfgdg="dfgrgdfgfZszik">271238461</rej:>'我想提取类似的内容
278579133, 118545112, 271238461最重要和最难的事情是忽略出现单个数字的情况,例如:"s7fn"或"we2d".
The most important thing and the most hard to do is to ignore cases when single number occurs like: "s7fn" or "we2d".
到目前为止,我已经能够进行如下查询:
So far i was able to do query like this:
SELECT Regexp_replace( '<rej:idnymu s7fn we2dfgdg="dfgrgdfgfZszik">278579133</rej:> <rej:idnymu k3op d5dfgdg="dfgrgdfgfZszik">118545112</rej:> <rej:idnymu s1rd s9dfgdg="dfgrgdfgfZszik">271238461</rej:>' , '([^0-9])', '') RESULT FROM dual;但是不知道如何分隔数字序列,更重要的是如何摆脱我之前提到的单个数字.
But do not have any idea how to separeta number sequences and more important how to get rid of single numbers what i mentioned earlier.
有人可以建议如何实现这一目标吗?
Could someone suggest how to achieve this?
更新
整个字符串或多或少是这样的:
Whole string looks more or less like this:
'<efwewfsdf/> <asdf> <qwe> <rej:aaa ns2=rnsjvnsvjrlntjnbsnbltdbk">64-0203-0875/4</rej:aaa> <rej:bbb ns2=bmtbrbpnrtusrnbdbd k">06396</rej:bbb> <rej:ccc ns2=vmkfbkmbkfmgkmfgkbgfk">4567650</rej:ccc> <rej:ddd ns2=zxczxczczxczxczxk">402341198</rej:ddd> <rej:eee ns2=asdasdasdazxck">723112</rej:eee> <rej:eee ns2=adasdasdawawqweqwek">342253</rej:eee> <rej:eee ns2=bgbgfnnhnhnhnhnhnk">656345</rej:eee> <rej:fff ns2=ewretyrtyghnk">352</rej:fff> <rej:ggg ns2="vzxczdvdbdfgs">asd.Gdsa</rej:ggg> <rej:hhh ns2="zcsasdqwfwgerbyn">W1_22</rej:hhh> </qwe> </asdf>'通常它要更长一些,我从blob获得它.查询如下:
normally it is much longer and i am obtaining it from blob. Query looks like this:
SELECT REGEXP_REPLACE((REGEXP_SUBSTR(utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 300)) || utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 2300)), '<rej:aaa[^<]*')), '.*>', '') aaa, REGEXP_REPLACE((REGEXP_SUBSTR(utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 300)) || utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 2300)), '<rej:bbb[^<]*')), '.*>', '') bbb, REGEXP_REPLACE((REGEXP_SUBSTR(utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 300)) || utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 2300)), '<rej:ccc[^<]*')), '.*>', '') ccc, REGEXP_REPLACE((REGEXP_SUBSTR(utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 300)) || utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 2300)), '<rej:ddd[^<]*')), '.*>', '') ddd, REGEXP_REPLACE((REGEXP_SUBSTR(utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 300)) || utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 2300)), '<rej:eee[^<]*')), '.*>', '') eee, REGEXP_REPLACE((REGEXP_SUBSTR(utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 300)) || utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 2300)), '<rej:eee[^<]*')), '.*>', '') eee, REGEXP_REPLACE((REGEXP_SUBSTR(utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 300)) || utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 2300)), '<rej:eee[^<]*')), '.*>', '') eee, REGEXP_REPLACE((REGEXP_SUBSTR(utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 300)) || utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 2300)), '<rej:fff[^<]*')), '.*>', '') fff, REGEXP_REPLACE((REGEXP_SUBSTR(utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 300)) || utl_raw.cast_to_varchar2(dbms_lob.substr(tn.blob_value, 2000, 2300)), '<rej:ggg[^<]*')), '.*>', '') ggg, tn.blob_value FROM table_name tn问题是"rej:eee"标签中的3个是相同的,我不知道如何区分它们.另外,字符串具有1-5个"rej:eee"标签,因此它不是固定数字.
The problem is that 3 of "rej:eee" tags are identical and i do not know how to distinguish them. Also strings has 1-5 "rej:eee" tags so it is not fixed number.
希望它现在更加透明,对我的英语不好对不起.
Hope it is more transparent now and sorry for my bad english.
谢谢 Rav
推荐答案尝试一下:
with t(d) as ( select '<efwewfsdf/> <asdf> <qwe> <rej:aaa ns2=rnsjvnsvjrlntjnbsnbltdbk">64-0203-0875/4</rej:aaa> <rej:bbb ns2=bmtbrbpnrtusrnbdbd k">06396</rej:bbb> <rej:ccc ns2=vmkfbkmbkfmgkmfgkbgfk">4567650</rej:ccc> <rej:ddd ns2=zxczxczczxczxczxk">402341198</rej:ddd> <rej:eee ns2=asdasdasdazxck">723112</rej:eee> <rej:eee ns2=adasdasdawawqweqwek">342253</rej:eee> <rej:eee ns2=bgbgfnnhnhnhnhnhnk">656345</rej:eee> <rej:fff ns2=ewretyrtyghnk">352</rej:fff> <rej:ggg ns2="vzxczdvdbdfgs">asd.Gdsa</rej:ggg> <rej:hhh ns2="zcsasdqwfwgerbyn">W1_22</rej:hhh> </qwe> </asdf>' from dual ), t1(d) as ( select SUBSTR(d, REGEXP_INSTR(d, '<[A-Za-z]{3}:', 1, level), REGEXP_INSTR(d, '</rej:[A-Za-z]{3}>', 1, level) - REGEXP_INSTR(d, '<[A-Za-z]{3}:', 1, level)) from t connect by rownum < LENGTH(d) - LENGTH(REGEXP_REPLACE(d, ':[A-Za-z]{3}>', '')) ) select REGEXP_REPLACE(d, '^<rej:.*">', '') ff from t1 FF -------------- 64-0203-0875/4 06396 4567650 402341198 723112 342253 656345 352 asd.Gdsa W1_22更多推荐
如何使用Oracle正则表达式从字符串中提取所有数字序列?
发布评论