Problem description
I want to write an XSD that restricts the content of valid XML elements of type xsd:token so that, at validation, they are indistinguishable from the same content wrapped in xsd:string.
That is, they must not contain carriage return (#xD), line feed (#xA), or tab (#x9) characters, must not begin or end with a space (#x20) character, and must not include a sequence of two or more adjacent spaces.
I think the regular expression to use is this:
\S+( \S+)*
(some non-whitespace characters, optionally followed by groups of a single space plus one or more non-whitespace characters, so the value always ends with non-whitespace)
This works in various regex testing tools, but I can't get it to work in oXygen XML Editor: XML instances whose strings contain double spaces, leading or trailing spaces, tabs, or line breaks still pass validation.
Here is the XSD implementation:
<xs:simpleType name="Tokenized500Type">
  <xs:restriction base="xs:token">
    <xs:maxLength value="500"/>
    <xs:minLength value="1"/>
    <xs:pattern value="\S+( \S+)*"/>
  </xs:restriction>
</xs:simpleType>
Is there some feature of XML, XSD, or oXygen XML Editor that prevents this from working?
Recommended answer
Your original regex, ([^\s])+( [^\s]+)*([^\s])*, contains some redundant patterns: it matches and captures each iteration of 1+ non-whitespace characters, then matches 0+ sequences of a space followed by 1+ non-whitespace characters, and then tries again to match and capture each remaining non-whitespace character.
You may use a similar but shorter pattern:
\S+( \S+)*
Since XML Schema regexes are implicitly anchored at both ends, the expression must match the whole value:
\S+ - one or more characters other than whitespace, that is, other than space, \t (tab), \n (newline), and \r (carriage return)
( \S+)* - zero or more sequences of a single space followed by 1+ non-whitespace characters.
This expression disallows consecutive spaces as well as leading and trailing spaces.
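To sanity-check the pattern outside an XML editor, here is a minimal Python sketch. It is only an approximation of XML Schema regex semantics: the XSD \S class is spelled out as [^ \t\n\r] (Python's own \S also excludes other Unicode whitespace), and re.fullmatch stands in for the implicit anchoring. The name is_token_like is made up for this illustration.

```python
import re

# XSD's \S is "anything except space, tab, newline, carriage return".
# It is written out explicitly because Python's \S is broader.
NON_WS = r"[^ \t\n\r]"
TOKEN_LIKE = re.compile(rf"{NON_WS}+( {NON_WS}+)*")

def is_token_like(text: str) -> bool:
    """Return True if text matches the pattern as a whole (hypothetical helper)."""
    # fullmatch mimics the implicit anchoring of XML Schema patterns
    return TOKEN_LIKE.fullmatch(text) is not None

if __name__ == "__main__":
    samples = ["a b c", " a", "a ", "a  b", "a\tb", "a\nb", ""]
    for sample in samples:
        print(repr(sample), is_token_like(sample))
```

Only the first sample, with single spaces between non-whitespace runs, is accepted; the others fail on a leading space, trailing space, double space, tab, line feed, or empty value.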
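Here is how the regex should be used. Note that the base type is xs:string rather than xs:token: xs:token has whiteSpace set to collapse, so the value is normalized before the pattern is checked and the offending whitespace is already gone by the time the pattern is applied.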
<xs:simpleType name="Tokenized500Type">
  <xs:restriction base="xs:string">
    <xs:pattern value="\S+( \S+)*"/>
    <xs:maxLength value="500"/>
    <xs:minLength value="1"/>
  </xs:restriction>
</xs:simpleType>
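The effect of the base type can be imitated roughly in Python. In XML Schema, xs:token uses whiteSpace="collapse" while xs:string uses whiteSpace="preserve", and facets such as pattern are checked after that normalization. The sketch below simulates this by hand; collapse and valid_as are hypothetical helpers, not part of any XML library, and the regex is again a stand-in for the XSD pattern.

```python
import re

# Stand-in for the XSD pattern \S+( \S+)* with \S written out explicitly
PATTERN = re.compile(r"[^ \t\n\r]+( [^ \t\n\r]+)*")

def collapse(text: str) -> str:
    """Imitate whiteSpace="collapse": tab/LF/CR become spaces,
    runs of spaces shrink to one, and the ends are trimmed."""
    replaced = re.sub(r"[\t\n\r]", " ", text)
    return re.sub(r" +", " ", replaced).strip(" ")

def valid_as(base: str, text: str) -> bool:
    """Check text against the pattern as if restricting the given base type.
    "token" normalizes first; anything else is treated like xs:string."""
    value = collapse(text) if base == "token" else text
    return PATTERN.fullmatch(value) is not None

if __name__ == "__main__":
    messy = "  a \t b  "
    print(valid_as("token", messy))   # normalized to "a b" before matching
    print(valid_as("string", messy))  # matched as written
```

With a token-like base the messy value is cleaned up before the pattern sees it, which matches the behaviour reported in the question; with a string-like base the same value is rejected.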