bash 4: Generic access to substring (n) of string by arbitrary delimiter?

Updated: 2024-10-23 05:53:13

Let's assume I have the following string: x="number 1;number 2;number 3".

Access to the first substring is successful via ${x%%";"*}; access to the last substring is via ${x##*";"}:

    $ x="number 1;number 2;number 3"
    $ echo "front : ${x%%";"*}"   # front-most part
    front : number 1
    $ echo "back : ${x##*";"}"    # back-most part
    back : number 3
    $

How do I access the middle part (e.g., number 2)? Is there a better way to do this if I have many more parts than just three? In other words: is there a generic way of accessing substring no. n of string yyy, delimited by string xxx, where xxx is an arbitrary string/delimiter?


I have read "How do I split a string on a delimiter in Bash?", but I specifically do not want to iterate over the string; I want to access a given substring directly.

This specifically does not ask for a split into arrays, but into substrings.

Best answer

With a fixed index:

    x="number 1;number 2;number 3"

    # Split input into fields by ';' and read the 2nd field into $f2
    # Note the need for the *2nd* `unused`, otherwise f2 would
    # receive the 2nd field *plus the remainder of the line*.
    IFS=';' read -r unused f2 unused <<<"$x"

    echo "$f2"

Generically, using an array:

    x="number 1;number 2;number 3"

    # Split input into fields by ';' and read all resulting fields
    # into an *array* (-a).
    IFS=';' read -r -a fields <<<"$x"

    # Access the desired field (array indices are 0-based, so 1 is the 2nd field).
    ndx=1
    echo "${fields[ndx]}"
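If direct access by index is needed in several places, the array approach can be wrapped in a small function. The following is only a sketch: the name nth_field and its argument order are made up for illustration and are not part of the original answer, and it inherits the single-character-separator limitation of IFS.

```bash
#!/usr/bin/env bash

# nth_field STRING SEP N   (hypothetical helper, for illustration only)
# Prints field number N (0-based) of STRING, split on the single character SEP.
nth_field() {
  local str=$1 sep=$2 n=$3
  local -a fields
  # The IFS assignment applies only to this one read command.
  IFS="$sep" read -r -a fields <<<"$str"
  printf '%s\n' "${fields[n]}"
}

x="number 1;number 2;number 3"
middle=$(nth_field "$x" ';' 1)
echo "$middle"   # number 2
```

An out-of-range index simply yields an empty line, since bash expands an unset array element to the empty string.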

Constraints:

Using IFS, the special variable specifying the Internal Field Separator characters, invariably means:

Only single, literal characters can act as field separators.

However, you can specify multiple characters, in which case any of them is treated as a separator.

The default separator characters are $' \t\n' (space, tab, and newline), and runs of them (multiple contiguous instances) are always considered a single separator; e.g., 'a   b' has 2 fields, because the multiple spaces count as a single separator.

By contrast, with any other character, each character in a run is considered separately, so adjacent separators delimit empty fields; e.g., 'a;;b' has 3 fields: each ; is its own separator, so there's an empty field between the two ; characters.
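These rules can be checked by counting the fields that read produces. The sketch below uses a hypothetical helper, count_fields, which stores its result in a global variable named nfields; both names are assumptions made for this example.

```bash
#!/usr/bin/env bash

# count_fields STRING SEPCHARS   (hypothetical helper, for illustration only)
# Splits STRING on the characters in SEPCHARS and stores the number of
# resulting fields in the global variable nfields.
count_fields() {
  local -a fields
  IFS="$2" read -r -a fields <<<"$1"
  nfields=${#fields[@]}
}

count_fields 'a   b' $' \t\n'; echo "$nfields"   # 2: a run of spaces is one separator
count_fields 'a;;b'  ';';      echo "$nfields"   # 3: each ; separates, middle field is empty
count_fields 'a;b,c' ';,';     echo "$nfields"   # 3: either ; or , acts as a separator
count_fields 'a;,b'  ';,';     echo "$nfields"   # 3: ';,' is two separators, not one string
```

The last call shows why IFS cannot express a multi-character delimiter: the two characters are treated as independent separators, which leaves an empty field between them.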

The read -r -a ... <<<... technique generally works well, as long as:

- the input is single-line
- you're not concerned about a trailing empty field getting discarded
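Both limitations are easy to reproduce. The snippet below is a minimal sketch with made-up sample inputs and array names (trailing, multiline):

```bash
#!/usr/bin/env bash

# Trailing empty field: 'a;b;' ends with an empty field, but read drops it.
IFS=';' read -r -a trailing <<<'a;b;'
echo "${#trailing[@]}"     # 2, not 3

# Multi-line input: read stops at the first newline, so 'c' is never read.
IFS=';' read -r -a multiline <<<$'a;b\nc'
echo "${#multiline[@]}"    # 2
echo "${multiline[1]}"     # b
```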

If you need a fully generic, robust solution that addresses the issues above, use the following variation, which is explained in an answer by @gniourf_gniourf:

    sep=';'
    IFS="$sep" read -r -d '' -a fields < <(printf "%s${sep}\0" "$x")

Note the need to use -d '' to read multi-line input all at once, and the need to terminate the input with another separator instance to preserve a trailing empty field; the trailing \0 is needed to ensure that read's exit code is 0.
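To see the effect, the same command can be run on an input that has both problems at once. This is only a sketch: the sample value of x and the variable status are invented for the demonstration. Because sep is interpolated into the printf format string, it also assumes the separator is neither % nor a backslash.

```bash
#!/usr/bin/env bash

sep=';'
x=$'a;b\nc;'   # contains a newline and ends with an empty field

IFS="$sep" read -r -d '' -a fields < <(printf "%s${sep}\0" "$x")
status=$?

echo "exit status: $status"          # 0, thanks to the trailing \0
echo "field count: ${#fields[@]}"    # 3: 'a', 'b<newline>c', and the empty field
printf '[%s]\n' "${fields[@]}"
```

The newline stays inside the second field because it is not part of IFS here, and the extra separator appended by printf is what keeps the final empty field from being dropped.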

Published: 2023-07-25 19:41:00
Article link: https://www.elefans.com/category/jswz/34/1265193.html