What is the regular expression that validates the set of strings that are exactly the same as xsd:token and as xsd:string?

Updated: 2024-10-24 16:33:00

Problem description

I want to write an XSD that restricts the content of valid XML elements of type xsd:token such that, at validation, they would be indistinguishable from the same content wrapped in xsd:string.

That is, they do not contain carriage return (#xD), line feed (#xA), or tab (#x9) characters, do not begin or end with a space (#x20) character, and do not include a sequence of two or more adjacent space characters.

I think the regular expression to use is this:

\S+( \S+)*

(one or more non-whitespace characters, optionally followed by groups of a single space and one or more non-whitespace characters, so the string always closes with a non-whitespace character)

This works in various regex testing tools, but I can't seem to confirm it using oXygen XML Editor: XML instances whose strings contain double spaces, leading or trailing spaces, tabs, or line breaks still pass validation.

Here is the XSD implementation:

<xs:simpleType name="Tokenized500Type">
  <xs:restriction base="xs:token">
    <xs:maxLength value="500"/>
    <xs:minLength value="1"/>
    <xs:pattern value="\S+( \S+)*"/>
  </xs:restriction>
</xs:simpleType>

Is there some feature of XML, XSD, or oXygen XML Editor that prevents this from working?

Recommended answer

Your original ([^\s])+( [^\s]+)*([^\s])* regex contains some redundant patterns: it matches and captures each iteration of 1+ non-whitespace characters, then matches 0+ sequences of a space followed by 1+ non-whitespace characters, and then again tries to match and capture each iteration of a non-whitespace character.

You may use a similar, but shorter, expression:

\S+( \S+)*

Since XML Schema regexes are anchored by default, the expression matches:

\S+ - one or more characters other than whitespace, where whitespace means space, \t (tab), \n (newline), and \r (carriage return)
( \S+)* - zero or more sequences of a single space followed by 1+ non-whitespace characters
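The behaviour described above can be checked outside a schema processor. The following is a minimal sketch in Python, not part of the original answer. It assumes that XSD's \S is equivalent to [^ \t\n\r] (Python's own \S is broader, since it also excludes other Unicode whitespace), and it uses re.fullmatch to imitate the implicit anchoring of XML Schema patterns. The name is_token_like is hypothetical.

```python
import re

# XSD's \S means "anything except space, tab, newline, carriage return".
# Python's \S also excludes other Unicode whitespace, so spell the class out.
NON_WS = r"[^ \t\n\r]"
TOKEN_LIKE = re.compile(rf"{NON_WS}+( {NON_WS}+)*")


def is_token_like(value: str) -> bool:
    """Return True if the whole value matches (XSD patterns are anchored)."""
    return TOKEN_LIKE.fullmatch(value) is not None


if __name__ == "__main__":
    # Print each sample next to the verdict of the anchored pattern.
    for sample in ["a b c", " a", "a ", "a  b", "a\tb", "a\nb", ""]:
        print(repr(sample), is_token_like(sample))
```

Only the first sample should be accepted; the others fail because of a leading space, a trailing space, a double space, a tab, a line feed, and an empty value respectively.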

This expression disallows consecutive spaces as well as spaces in the leading or trailing position.

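Here is how the regex should be used. Note that the base type is xs:string, not xs:token: xs:token has its whiteSpace facet set to collapse, so the processor normalizes the value before the pattern is checked, and the extra whitespace never reaches the regex.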

<xs:simpleType name="Tokenized500Type">
  <xs:restriction base="xs:string">
    <xs:pattern value="\S+( \S+)*"/>
    <xs:maxLength value="500"/>
    <xs:minLength value="1"/>
  </xs:restriction>
</xs:simpleType>
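One difference from the schema in the question is the base type: xs:string here, xs:token there. A rough simulation may help show why that matters. The sketch below is a simplified model under stated assumptions, not how oXygen or any particular validator is implemented: it imitates the whiteSpace facet (preserve for xs:string, collapse for xs:token) being applied before the pattern check, and it ignores the minLength and maxLength facets. The function names collapse and validate are hypothetical.

```python
import re

NON_WS = r"[^ \t\n\r]"
PATTERN = re.compile(rf"{NON_WS}+( {NON_WS}+)*")


def collapse(value: str) -> str:
    """Model whiteSpace="collapse": map tab/LF/CR to spaces,
    squeeze runs of spaces into one, and trim both ends."""
    replaced = re.sub(r"[\t\n\r]", " ", value)
    return re.sub(r" +", " ", replaced).strip(" ")


def validate(value: str, base: str) -> bool:
    """Apply the base type's whitespace handling first, then the anchored pattern."""
    # xs:string preserves whitespace, so the raw value is checked as written.
    normalized = collapse(value) if base == "xs:token" else value
    return PATTERN.fullmatch(normalized) is not None


if __name__ == "__main__":
    raw = "  two  spaces\tand a tab "
    print(validate(raw, "xs:token"))   # True: value is collapsed before the check
    print(validate(raw, "xs:string"))  # False: the pattern sees the raw whitespace
```

In this model, any non-empty value that contains at least one non-whitespace character passes once xs:token has collapsed it, which matches the symptom reported in the question.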

Published: 2023-04-30 04:49:08

Article link: https://www.elefans.com/category/jswz/34/1390125.html