bash 4: Generic access to substring (n) of string by arbitrary delimiter?

Updated: 2024-10-23 05:53:13

Let's assume I have the following string: x="number 1;number 2;number 3".

Access to the first substring is successful via ${x%%";"*}, and access to the last substring is via ${x##*";"}:

```shell
$ x="number 1;number 2;number 3"
$ echo "front : ${x%%";"*}"   # front-most part
front : number 1
$ echo "back : ${x##*";"}"    # back-most part
back : number 3
```

How do I access the middle part (e.g., number 2)? Is there a better way to do this if I have many more parts than just three? In other words: is there a generic way of accessing substring number n of string yyy, delimited by string xxx, where xxx is an arbitrary string/delimiter?

I have read "How do I split a string on a delimiter in Bash?", but I specifically do not want to iterate over the string; I want to access a given substring directly.

This question specifically does not ask for a split into arrays, but into substrings.

Best answer

With a fixed index:

```shell
x="number 1;number 2;number 3"

# Split input into fields by ';' and read the 2nd field into $f2.
# Note the need for the *2nd* `unused`; otherwise f2 would
# receive the 2nd field *plus the remainder of the line*.
IFS=';' read -r unused f2 unused <<<"$x"

echo "$f2"
```

Generically, using an array:

```shell
x="number 1;number 2;number 3"

# Split input into fields by ';' and read all resulting fields
# into an *array* (-a).
IFS=';' read -r -a fields <<<"$x"

# Access the desired field (array indices are 0-based).
ndx=1
echo "${fields[ndx]}"
```
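The array approach can be wrapped in a small function so that a field is fetched by index in one call. This is only a minimal sketch: `nth_field` and its argument order are hypothetical names chosen for illustration, the delimiter is assumed to be a single character, and the input is assumed to be a single line.

```shell
#!/usr/bin/env bash
# nth_field is a hypothetical helper, not part of the original answer.
# Usage: nth_field <single-char delimiter> <0-based index> <string>
nth_field() {
  local sep=$1 ndx=$2 str=$3
  local -a fields
  # Prefixing IFS to `read` limits the IFS change to this one command.
  IFS="$sep" read -r -a fields <<<"$str"
  printf '%s\n' "${fields[ndx]}"
}

x="number 1;number 2;number 3"
second=$(nth_field ';' 1 "$x")
echo "$second"   # prints: number 2
```

An out-of-range index simply yields an empty line here; a caller that needs to distinguish that case would have to check the array length itself.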

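Constraints:

Using IFS, the special variable specifying the Internal Field Separator characters, invariably means:

Only single, literal characters can act as field separators.

However, you can specify multiple characters, in which case any one of them is treated as a separator.

The default separator characters are $' \t\n' (space, tab, and newline), and runs of them (multiple contiguous instances) are always considered a single separator; e.g., 'a   b' has 2 fields, because the multiple spaces count as a single separator.

By contrast, with any other character, the characters in a run are considered separately and thus delimit empty fields; e.g., 'a;;b' has 3 fields: each ; is its own separator, so there is an empty field between the two ; characters.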
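The field-counting rules above can be checked directly. The following sketch assumes IFS still has its default value when the first `read` runs; the array names are arbitrary.

```shell
#!/usr/bin/env bash
# Default IFS (space, tab, newline): a run of spaces is one separator.
read -r -a ws_fields <<<'a   b'
echo "${#ws_fields[@]}"      # prints 2

# Non-whitespace separator: every ';' separates, so ';;' encloses an empty field.
IFS=';' read -r -a sc_fields <<<'a;;b'
echo "${#sc_fields[@]}"      # prints 3

# Several characters in IFS: any one of them acts as a separator.
IFS=';,' read -r -a mixed_fields <<<'a;b,c'
echo "${#mixed_fields[@]}"   # prints 3
```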

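The read -r -a ... <<<... technique generally works well, as long as the input is single-line and you are not concerned about a trailing empty field getting discarded.

If you need a fully generic, robust solution that addresses the issues above, use the following variation, which is explained in an answer by @gniourf_gniourf:

```shell
sep=';'
IFS="$sep" read -r -d '' -a fields < <(printf "%s${sep}\0" "$x")
```

Note the need to use -d '' to read multi-line input all at once, and the need to terminate the input with another separator instance to preserve a trailing empty field; the trailing \0 is needed to ensure that read's exit code is 0.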
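To see the difference between the two variants, the sketch below feeds both the same inputs: one with a trailing separator and one spanning two lines. The sample strings and array names are made up for illustration.

```shell
#!/usr/bin/env bash
sep=';'
x='a;b;'   # the trailing ';' implies an empty 3rd field

# Plain variant: the trailing empty field is dropped.
IFS="$sep" read -r -a plain <<<"$x"
echo "${#plain[@]}"    # prints 2

# Robust variant: the extra separator preserves the trailing empty field.
IFS="$sep" read -r -d '' -a robust < <(printf "%s${sep}\0" "$x")
echo "${#robust[@]}"   # prints 3

# Multi-line input: the plain variant stops at the first newline.
y=$'line 1\nline 2;c'
IFS="$sep" read -r -a plain_ml <<<"$y"
IFS="$sep" read -r -d '' -a robust_ml < <(printf "%s${sep}\0" "$y")
echo "${#plain_ml[@]}"    # prints 1
echo "${#robust_ml[@]}"   # prints 2
```

In the robust case the first field of `robust_ml` contains the embedded newline, so it should be quoted whenever it is expanded.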

Published: 2023-07-25 19:41:00

Article link: https://www.elefans.com/category/jswz/34/1265193.html