Using XPATH to search text containing &nbsp;

Updated: 2024-10-21 11:46:20

I use XPather Browser to check my XPATH expressions on an HTML page.

My end goal is to use these expressions in Selenium for the testing of my user interfaces.

I have an HTML file with content similar to this:

I want to select a node with a text containing the string "&nbsp;".

With a normal string like "abc" there is no problem. I use an XPATH similar to //td[text()="abc"].

When I try an XPATH like //td[text()="&nbsp;"], it returns nothing. Is there a special rule concerning texts with "&"?

Best answer

It seems that OpenQA, the people behind Selenium, have already addressed this problem. They defined some variables to explicitly match whitespace. In my case, I need to use an XPATH similar to //td[text()="${nbsp}"].
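For context on why the plain-space predicate returns nothing: once the page is parsed, `&nbsp;` is the single character U+00A0, which is not the ordinary space U+0020. The sketch below is only an illustration under stated assumptions. The sample row is hypothetical, and it uses Python's standard-library ElementTree, whose limited XPath subset supports `[.='text']` rather than `text()=` (the two agree here because each cell holds a single text node).

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical sample row (an assumption, not the original page).
SAMPLE = "<table><tr><td>abc</td><td>&nbsp;</td></tr></table>"

# &nbsp; is not a predefined XML entity, so decode it to U+00A0 first,
# which mirrors what an HTML parser does when it builds the DOM.
root = ET.fromstring(html.unescape(SAMPLE))


def count_cells(text):
    """Count <td> elements whose complete text equals `text`."""
    return len(root.findall(f".//td[.='{text}']"))


abc_hits = count_cells("abc")
space_hits = count_cells(" ")        # ordinary space, U+0020
nbsp_hits = count_cells("\u00a0")    # non-breaking space, U+00A0
print(abc_hits, space_hits, nbsp_hits)
```

Only the predicate that carries the real U+00A0 character finds the second cell, which is what `${nbsp}` supplies in a Selenese locator.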

I reproduce here the text from OpenQA concerning this issue (found here):

HTML automatically normalizes whitespace within elements, ignoring leading/trailing spaces and converting extra spaces, tabs and newlines into a single space. When Selenium reads text out of the page, it attempts to duplicate this behavior, so you can ignore all the tabs and newlines in your HTML and do assertions based on how the text looks in the browser when rendered. We do this by replacing all non-visible whitespace (including the non-breaking space "&nbsp;") with a single space. All visible newlines (<br>, <p>, and <pre> formatted new lines) should be preserved.

We use the same normalization logic on the text of HTML Selenese test case tables. This has a number of advantages. First, you don't need to look at the HTML source of the page to figure out what your assertions should be; "&nbsp;" symbols are invisible to the end user, and so you shouldn't have to worry about them when writing Selenese tests. (You don't need to put "&nbsp;" markers in your test case to assertText on a field that contains "&nbsp;".) You may also put extra newlines and spaces in your Selenese <td> tags; since we use the same normalization logic on the test case as we do on the text, we can ensure that assertions and the extracted text will match exactly.
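The normalization described above can be approximated in a few lines. This is a rough sketch, not Selenium's actual code: the function name and sample strings are made up for illustration, and it ignores the visible newlines (<br>, <p>, <pre>) that the real logic is said to preserve.

```python
import re


def normalize_whitespace(text):
    """Collapse whitespace runs (including U+00A0) to one space and trim both ends."""
    return re.sub(r"[\s\u00a0]+", " ", text).strip()


# Text as it might sit in the page, with non-breaking spaces, tabs and newlines.
page_text = "\n  hello\u00a0\u00a0world\t\n"
# Text as it might be typed in a Selenese test table cell.
expected = "hello   world"

normalized_page = normalize_whitespace(page_text)
normalized_expected = normalize_whitespace(expected)
print(repr(normalized_page), repr(normalized_expected))
```

Because both sides go through the same function, the assertion text and the extracted page text compare equal even though their raw whitespace differs.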

This creates a bit of a problem on those rare occasions when you really want/need to insert extra whitespace in your test case. For example, you may need to type text in a field like this: "foo   ". But if you simply write <td>foo   </td> in your Selenese test case, we'll replace your extra spaces with just one space.

This problem has a simple workaround. We've defined a variable in Selenese, ${space}, whose value is a single space. You can use ${space} to insert a space that won't be automatically trimmed, like this: <td>foo${space}${space}${space}</td>. We've also included a variable ${nbsp}, that you can use to insert a non-breaking space.

Note that XPaths do not normalize whitespace the way we do. If you need to write an XPath like //div[text()="hello world"] but the HTML of the link is really "hello&nbsp;world", you'll need to insert a real "&nbsp;" into your Selenese test case to get it to match, like this: //div[text()="hello${nbsp}world"].
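To make the effect of `${space}` and `${nbsp}` concrete, here is a small sketch of placeholder expansion. It is an assumption-laden illustration rather than Selenium's implementation: the `VARIABLES` table and `expand_variables` helper are hypothetical names, and the only behavior taken from the text above is that `${space}` stands for one ordinary space and `${nbsp}` for one non-breaking space.

```python
import re

# Assumed values, following the description quoted above.
VARIABLES = {"space": " ", "nbsp": "\u00a0"}


def expand_variables(cell):
    """Replace ${name} placeholders; unknown names are left untouched."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda match: VARIABLES.get(match.group(1), match.group(0)),
        cell,
    )


# Trailing spaces survive because they are inserted after any trimming step.
padded = expand_variables("foo${space}${space}${space}")
# The locator ends up carrying a real U+00A0 between the two words.
locator = expand_variables('//div[text()="hello${nbsp}world"]')
print(repr(padded), repr(locator))
```

The expanded locator contains the actual U+00A0 character, so it can match page text that XPath itself never normalizes.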

Published: 2023-07-07 21:39:00

Article link: https://www.elefans.com/category/jswz/34/1068514.html