Finding specific text wrapped around a node with regular expressions

Updated: 2024-10-26 14:35:47

I'm very new to XSLT and I'm not a programmer, so sorry for my possibly stupid question.

I need to find certain citations which can look like this:

BSK StPO-<emphasis role="smallcaps">Burger,</emphasis> Art. 4 N 5

The text node that contains the citation can sit inside different parent elements, e.g. para or footnote.

Then I want to wrap the whole citation in a refid element, using parts of the citation as the id.

<refid multi-idref="K_BSK_STPO-JSTPO_StPO_Art4_5"> BSK STGB I-<span class="smallcaps">Burger,</span> Art. 4 N 5 </refid>

The problem is the emphasis element: I can't find a way "around" it. I found an answer to a similar question and tried to apply it to my problem, but without success: the part of the script shown here doesn't find any citations.

Here is the relevant part of my code. $DokumentName refers to a globally defined parameter. The Roman numeral in the citation is optional:

<xsl:template match="text()[matches(., 'BSK\s+(\p{L}{2,5})\s+(I|II|III|IV|V|VI|VII)?\p{P}')]">
  <xsl:variable name="vCur" select="."/>
  <xsl:variable name="pContent" select="string(.)"/>
  <xsl:analyze-string select="$pContent" regex="BSK\s+(\p{{L}}{{2,5}})\s+(I|II|III|IV|V|VI|VII)?\p{{P}}" flags="i">
    <xsl:matching-substring>
      <xsl:variable name="figureToTargetId">
        <xsl:choose>
          <xsl:when test="matches(., 'BSK\s+(\p{L}{2,5})\s+(I|II|III|IV|V|VI|VII)?\p{P}')">                  
            <xsl:analyze-string select="." regex="(\p{{L}}{{2,5}})\s+(I|II|III|IV|V|VI|VII)">
              <xsl:matching-substring>
                <xsl:value-of select="concat($DokumentName, '_', regex-group(1), regex-group(2))"/>
              </xsl:matching-substring>
            </xsl:analyze-string>
          </xsl:when>
          <xsl:otherwise>
            <xsl:analyze-string select="." regex="(\p{{L}}{{2,5}})">
              <xsl:matching-substring>
                <xsl:value-of select="concat($DokumentName, '_', regex-group(1))"/>
              </xsl:matching-substring>
            </xsl:analyze-string>
          </xsl:otherwise>
        </xsl:choose>   
      </xsl:variable>
      <xsl:variable name="figureFromTargetId">
        <xsl:if test="matches($vCur, 'BSK\s+(\p{L}{2,5})\s+(I|II|III|IV|V|VI|VII)?\p{P}')">
          <xsl:analyze-string select="string($vCur/following-sibling::emphasis[1]/following-sibling::*[1])" regex=",?Art\.\s+(d+)\s+N\s+(d+)">
            <xsl:matching-substring>
              <xsl:value-of
                select="concat('_Art', regex-group(1), '_', regex-group(2))"/>
            </xsl:matching-substring>
          </xsl:analyze-string>
        </xsl:if>
      </xsl:variable>
      <xsl:element name="ref-multi-id">
        <xsl:attribute name="multi-idref">
          <xsl:value-of select="concat($figureToTargetId, $figureToTargetId)"/>
        </xsl:attribute>
        <xsl:value-of select="."/>
        <xsl:if test="matches($vCur, 'BSK\s+(\p{L}{2,5})\s+(I|II|III|IV|V|VI|VII)?\p{P}')">
          <xsl:apply-templates select="$vCur/following-sibling::emphasis[1]" mode="copy-style"/>
          <xsl:value-of select="$vCur/following-sibling::emphasis[1]/following-sibling::*[1][matches(.,',?Art\.\s+(d+)\s+N\s+(d+)')]"/>
        </xsl:if>
      </xsl:element>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:value-of select="."/>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>

<xsl:template match="emphasis[@role='smallcaps']" mode="copy-style">
  <xsl:element name="span">
    <xsl:attribute name="class">
      <xsl:value-of select="@role"/>
    </xsl:attribute>
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>
 

Any help would be really appreciated!

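Best answer

Your regular expression does not match the string you show, because it requires whitespace before the hyphen even when there is no Roman numeral. It looks like

BSK\s+(\p{L}{2,5})\s+(I|II|III|IV|V|VI|VII)?\p{P}

should be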

BSK\s+(\p{L}{2,5})(\s+(I|II|III|IV|V|VI|VII))?\p{P}
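
To see the difference outside of XSLT, here is a minimal Python sketch that runs both patterns against sample strings. It is only an illustration: Python's re module does not support \p{L} or \p{P}, so the ASCII classes LETTER and PUNCT below are stand-ins assumed for this example, and the names original and corrected do not come from the stylesheet.

```python
import re

# Python's re has no \p{L} / \p{P}; these ASCII classes are stand-ins (an assumption).
LETTER = r"[A-Za-z]"
PUNCT = r"[-.,;:]"
ROMAN = r"(?:I|II|III|IV|V|VI|VII)"

# Original pattern: the whitespace is mandatory even when the Roman numeral is absent.
original = re.compile(rf"BSK\s+({LETTER}{{2,5}})\s+({ROMAN})?{PUNCT}")

# Corrected pattern: whitespace and Roman numeral are optional together.
corrected = re.compile(rf"BSK\s+({LETTER}{{2,5}})(\s+({ROMAN}))?{PUNCT}")

for sample in ("BSK StPO-", "BSK StGB I-"):
    print(sample, bool(original.search(sample)), bool(corrected.search(sample)))
```

With these stand-in classes, only the corrected pattern finds "BSK StPO-", while both find "BSK StGB I-", which has a Roman numeral and therefore the whitespace the original pattern insists on.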

This is the final working code. I had simply forgotten to add the mode attribute. I also had to take a few more citation variants into account and get rid of redundant nodes and node parts.

<xsl:template match="text()[matches(., 'BSK\s+(\p{L}{2,5})(\s+(I|II|III|IV|V|VI|VII))?\p{P}')]" mode="copy-style">
  <xsl:variable name="vCur" select="."/>
  <xsl:variable name="pContent" select="string(.)"/>
  <xsl:analyze-string select="$pContent" regex="BSK\s+(\p{{L}}{{2,5}})(\s+(I|II|III|IV|V|VI|VII))?\p{{P}}" flags="i">
    <xsl:matching-substring>
      <xsl:variable name="figureToTargetId">
        <xsl:choose>
          <xsl:when test="matches(., '(BSK)\s+(\p{L}{2,5})\s+(I|II|III|IV|V|VI|VII)\p{P}')">
            <xsl:analyze-string select="." regex="(\p{{L}}{{2,5}})\s+(I|II|III|IV|V|VI|VII)">
              <xsl:matching-substring>
                <xsl:value-of select="concat('K_BSK_', regex-group(1), regex-group(2), '_', regex-group(1))"/>
              </xsl:matching-substring>
            </xsl:analyze-string>
          </xsl:when>
          <xsl:when test="matches(., '(BSK)\s+(StPO)\p{P}') ">
            <xsl:analyze-string select="." regex="(BSK)\s+(\p{{L}}{{2,5}})">
              <xsl:matching-substring>
                <xsl:value-of select="concat('K_BSK_STPO-JSTPO_', regex-group(2))"/>
              </xsl:matching-substring>
            </xsl:analyze-string>
          </xsl:when>
          <xsl:when test="matches(., '(BSK)\s+(JStPO)\p{P}') ">
            <xsl:analyze-string select="." regex="(BSK)\s+(\p{{L}}{{2,5}})">
              <xsl:matching-substring>
                <xsl:value-of select="concat('K_BSK_STPO-JSTPO_', regex-group(2))"/>
              </xsl:matching-substring>
            </xsl:analyze-string>
          </xsl:when>
          <xsl:when test="matches(., 'BSK\s+(\p{L}{2,5})\p{P}') ">
            <xsl:analyze-string select="." regex="BSK\s+(\p{{L}}{{2,5}})">
              <xsl:matching-substring>
                <xsl:value-of select="concat('K_BSK_', regex-group(1), '_', regex-group(1))"/>
              </xsl:matching-substring>
            </xsl:analyze-string>
          </xsl:when>
        </xsl:choose>
      </xsl:variable>
      <xsl:variable name="figureFromTargetId">
        <xsl:if test="matches($vCur, 'BSK\s+(\p{L}{2,5})(\s+(I|II|III|IV|V|VI|VII))?\p{P}')">
          <xsl:analyze-string select="string($vCur/following-sibling::emphasis[1]/following-sibling::text()[1])" regex="^,?(\s+Vor)?\s+Art\.(\s+|\p{{Zs}})(\p{{N}}{{1,4}})\s+N(\s+|\p{{Zs}})(\p{{N}}{{1,4}})">
            <xsl:matching-substring>
              <xsl:choose>
                <xsl:when test="contains(., 'Vor')">
                  <xsl:value-of select="concat('_VorArt', regex-group(3), '_', regex-group(5))"/>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:value-of select="concat('_Art', regex-group(3), '_', regex-group(5))"/>
                </xsl:otherwise>
              </xsl:choose>
            </xsl:matching-substring>
          </xsl:analyze-string>
        </xsl:if>
      </xsl:variable>
      <xsl:element name="ref-multi-id">
        <xsl:attribute name="multi-idref">
          <xsl:value-of select="concat($figureToTargetId, $figureFromTargetId)"/>
        </xsl:attribute>
        <xsl:value-of select="."/>
        <xsl:if test="matches($vCur, 'BSK\s+(\p{L}{2,5})(\s+(I|II|III|IV|V|VI|VII))?\p{P}')">
          <xsl:apply-templates select="$vCur/following-sibling::emphasis[1]" mode="match"/>
        </xsl:if>
        <xsl:analyze-string select="string($vCur/following-sibling::emphasis[1]/following-sibling::text()[1])" regex="(^,?(\s+Vor)?\s+Art\.(\s+|\p{{Zs}})(\p{{N}}{{1,4}})\s+N(\s+|\p{{Zs}})(\p{{N}}{{1,4}}))">
          <xsl:matching-substring>
            <xsl:value-of select="regex-group(1)"/>
          </xsl:matching-substring>
        </xsl:analyze-string>
      </xsl:element>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:copy-of select="."/>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>

<xsl:template match="emphasis[@role='smallcaps'][not(preceding-sibling::node()[1][self::text() and matches(., 'BSK\s+(\p{L}{2,5})(\s+(I|II|III|IV|V|VI|VII))?\p{P}')])]" mode="copy-style">
  <xsl:element name="span">
    <xsl:attribute name="class">
      <xsl:value-of select="@role"/>
    </xsl:attribute>
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>

<xsl:template match="emphasis[@role='smallcaps'][preceding-sibling::node()[1][self::text() and matches(., 'BSK\s+(\p{L}{2,5})(\s+(I|II|III|IV|V|VI|VII))?\p{P}')]]" mode="match">
  <xsl:element name="span">
    <xsl:attribute name="class">
      <xsl:value-of select="@role"/>
    </xsl:attribute>
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>

<xsl:template mode="copy-style" match="text()[matches(., '^,?(\s+Vor)?\s+Art\.(\s+|\p{Zs})(\p{N}{1,4})\s+N(\s+|\p{Zs})(\p{N}{1,4})') and preceding-sibling::emphasis[@role='smallcaps'][1] and matches(preceding-sibling::emphasis[1]/preceding-sibling::text()[1], 'BSK\s+(\p{L}{2,5})(\s+(I|II|III|IV|V|VI|VII))?\p{P}')]">
  <xsl:variable name="pContent" select="string(.)"/>
  <xsl:analyze-string select="$pContent" regex="^,?(\s+Vor)?\s+Art\.(\s+|\p{{Zs}})(\p{{N}}{{1,4}})\s+N(\s+|\p{{Zs}})(\p{{N}}{{1,4}})">
    <xsl:matching-substring/>
    <xsl:non-matching-substring>
      <xsl:copy-of select="."/>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>
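
As a rough illustration of how the second half of the id is derived from the text node that follows the emphasis element, the Python sketch below mirrors the Art./N pattern used above. The function name article_suffix is hypothetical, \d stands in for \p{N}, and Python's \s is assumed to cover the \p{Zs} alternative. It also checks the optional Vor group directly instead of calling contains(), so treat it as a sketch of the idea rather than a port of the stylesheet.

```python
import re

# Mirrors the XSLT pattern; \d replaces \p{N} and \s also covers no-break spaces (assumptions).
ART_PATTERN = re.compile(r"^,?(\s+Vor)?\s+Art\.\s+(\d{1,4})\s+N\s+(\d{1,4})")


def article_suffix(tail: str) -> str:
    """Build '_Art<article>_<note>' (or '_VorArt...') from the text after the author name."""
    m = ART_PATTERN.match(tail)
    if m is None:
        return ""  # the text node does not start with a citation tail
    prefix = "_VorArt" if m.group(1) else "_Art"
    return f"{prefix}{m.group(2)}_{m.group(3)}"


# Combine with the first half of the id, as concat() does in the stylesheet.
print("K_BSK_STPO-JSTPO_StPO" + article_suffix(" Art. 4 N 5"))
```

Appending the suffix for " Art. 4 N 5" to the prefix K_BSK_STPO-JSTPO_StPO yields the id from the question, K_BSK_STPO-JSTPO_StPO_Art4_5.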

Published: 2023-07-22 20:47:00
Link: https://www.elefans.com/category/jswz/34/1223301.html