Regex on markup after XPath?

Updated: 2024-10-23 19:33:07

Problem description

I have been searching for a solution to my problem for a while and have been experimenting on regex101, but I cannot find one.

The problem I am facing is that I have to select a substring from a number of different inputs, so I wanted to use regular expressions to extract the wanted data from these strings. The regular expression for each string will come from its own configuration, since the strings differ.

The string below was obtained with an XPath, //body/div/table/tbody/tr/td/p[5], but I cannot dig any deeper with it to retrieve the right data. Or can I?

The string I am currently using as an example is the following:
<strong>Kontaktdaten des Absenders:</strong> <strong>Name:</strong> Wanted data <strong>Telefon:</strong> <a dir='ltr' href='tel:XXXXXXXXX' x-apple-data-detectors='true' x-apple-data-detectors-type='telephone' x-apple-data-detectors-result='3'>XXXXXXXXX</a>
From this string I am trying to get the "Wanted data".

My regular expression so far is the following:
(?<=<\/strong> )(.*)(?= )
But this returns the whole remainder of the string:
<strong>Name:</strong> Wanted data <strong>Telefon:</strong> <a dir='ltr' href='tel:XXXXXXXXX' x-apple-data-detectors='true' x-apple-data-detectors-type='telephone' x-apple-data-detectors-result='3'>XXXXXXXXX</a>
I thought I could solve this with a repeated group:
((:?(?<=<\/strong> )(.*)(?= ))+)
But this returns the same output as without the repeat group.
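
A short Python sketch of why the first pattern captures so much, assuming the lookahead is a literal space and the input still carries its <strong> tags. The sample string is a shortened, hypothetical stand-in for the real one. The greedy (.*) runs to the end of the input and backs off only as far as the last space, which would also leave nothing for a repetition around it to match.

```python
import re

# Hypothetical, shortened stand-in for the string selected by the XPath.
SAMPLE = (
    "<strong>Kontaktdaten des Absenders:</strong> "
    "<strong>Name:</strong> Wanted data "
    "<strong>Telefon:</strong> "
    "<a dir='ltr' href='tel:XXXXXXXXX'>XXXXXXXXX</a> "
)

# The pattern from the question, unchanged.
PATTERN = re.compile(r"(?<=<\/strong> )(.*)(?= )")

match = PATTERN.search(SAMPLE)
# The lookbehind is first satisfied right after the first "</strong> ".
# The greedy (.*) then extends to the last position that is followed by a space.
captured = match.group(1)
print(captured)
```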

I know I could build a for { } loop around this regex to get the same output, but since this is the only regular expression that would need it (and adding the loop would mean changing the code for all the other data as well), I was wondering whether this can be done within the regular expression itself.

Thank you for your support so far.
Solution
Regex is the wrong tool for parsing markup. You have a proper XML parsing tool, XPath, in hand. Finish the job with it:

This XPath,
strong[.='Name:']/following-sibling::text()[1]
when appended to your original XPath,
//body/div/table/tbody/tr/td/p[5]/strong[.='Name:']/following-sibling::text()[1]
will finish the job of selecting the text node immediately following the Name: label, as requested, with no regex hacks over markup required.
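
As a rough illustration of the same idea in Python, here is a minimal sketch using only the standard library. The sample fragment and the helper name text_after_label are assumptions made for this example. ElementTree's XPath subset does not support the following-sibling axis or text(), so the sketch reads the matched element's .tail attribute, which holds the text immediately after its closing tag. A full XPath 1.0 engine could evaluate the expression above directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment standing in for the <p> node selected by the original XPath.
SAMPLE = (
    "<p><strong>Kontaktdaten des Absenders:</strong> "
    "<strong>Name:</strong> Wanted data "
    "<strong>Telefon:</strong> "
    "<a dir='ltr' href='tel:XXXXXXXXX'>XXXXXXXXX</a> </p>"
)


def text_after_label(p, label):
    """Return the text directly following <strong>label</strong>, or None."""
    # strong[.='...'] selects the child <strong> whose full text equals the label.
    strong = p.find(f"strong[.='{label}']")
    if strong is None or strong.tail is None:
        return None
    # .tail is the text between this element's end tag and the next sibling.
    return strong.tail.strip()


p = ET.fromstring(SAMPLE)
wanted = text_after_label(p, "Name:")
print(wanted)
```

Because the label is passed in as a parameter, the same helper could serve other fields by changing only the configured label, with no pattern matching over the markup.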

Published: 2023-11-29 23:18:17

Article link: https://www.elefans.com/category/jswz/34/1647934.html
