Using str and String interchangeably

Updated: 2024-10-28 08:20:02
Problem description

Suppose I'm trying to write a fancy zero-copy parser in Rust using &str, but sometimes I need to modify the text (e.g. to implement variable substitution). I really want to do something like this:

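    fn main() {
        let mut v: Vec<&str> = "Hello there $world!".split_whitespace().collect();

        for t in v.iter_mut() {
            if t.contains("$world") {
                // Does not compile: the String returned by replace() is a
                // temporary that is dropped at the end of this statement.
                *t = &t.replace("$world", "Earth");
            }
        }

        println!("{:?}", &v);
    }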

But of course the String returned by t.replace() doesn't live long enough. Is there a nice way around this? Perhaps there is a type which means "ideally a &str, but if necessary a String"? Or maybe there is a way to use lifetime annotations to tell the compiler that the returned String should be kept alive until the end of main() (or have the same lifetime as v)?

Recommended answer

Rust has exactly what you want in the form of the Cow (clone-on-write) type. A Cow<str> is either a Cow::Borrowed(&str) or a Cow::Owned(String), and it dereferences to str in both cases.

    use std::borrow::Cow;

    fn main() {
        let mut v: Vec<_> = "Hello there $world!"
            .split_whitespace()
            .map(|s| Cow::Borrowed(s))
            .collect();

        for t in v.iter_mut() {
            if t.contains("$world") {
                // The right-hand side is evaluated first, then to_mut()
                // hands out a mutable reference to an owned String.
                *t.to_mut() = t.replace("$world", "Earth");
            }
        }

        println!("{:?}", &v);
    }
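The same idea can be packaged as a function that returns a Cow, so callers get a borrowed slice when nothing needs replacing and an owned String only when a substitution actually happens. The following is a minimal sketch under that assumption; the helper name `substitute` and its parameters are hypothetical and not part of the original answer.

```rust
use std::borrow::Cow;

/// Hypothetical helper: returns the token unchanged (borrowed) unless it
/// contains `var`, in which case an owned, substituted copy is returned.
fn substitute<'a>(token: &'a str, var: &str, value: &str) -> Cow<'a, str> {
    if token.contains(var) {
        Cow::Owned(token.replace(var, value))
    } else {
        Cow::Borrowed(token)
    }
}

fn main() {
    let input = "Hello there $world!";
    let v: Vec<Cow<str>> = input
        .split_whitespace()
        .map(|t| substitute(t, "$world", "Earth"))
        .collect();

    // Only the last token needed a substitution, so only it is owned.
    assert!(matches!(v[0], Cow::Borrowed("Hello")));
    assert!(matches!(v[2], Cow::Owned(_)));
    println!("{:?}", v);
}
```

Tokens that need no substitution stay zero-copy slices of `input`, which is the behaviour the question asks for.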

As @sellibitze correctly notes, calling to_mut() on a Cow::Borrowed first clones the borrowed value into a new String, which costs a heap allocation just to store a value that is overwritten immediately. If you are sure you only have borrowed strings, you can assign a new Cow::Owned instead:

    *t = Cow::Owned(t.replace("$world", "Earth"));
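To make the difference concrete, here is a small sketch that applies both variants to a borrowed Cow. The helper names `replace_via_to_mut` and `replace_via_owned` are hypothetical and exist only for this comparison; both produce the same text, but the first performs an extra clone of the borrowed data.

```rust
use std::borrow::Cow;

/// Hypothetical helper: to_mut() clones a borrowed value into a fresh
/// String first, and that copy is then overwritten by the assignment.
fn replace_via_to_mut(t: &mut Cow<str>) {
    *t.to_mut() = t.replace("$world", "Earth");
}

/// Hypothetical helper: assigning a new Cow::Owned skips the
/// intermediate clone of the borrowed value.
fn replace_via_owned(t: &mut Cow<str>) {
    *t = Cow::Owned(t.replace("$world", "Earth"));
}

fn main() {
    let mut a: Cow<str> = Cow::Borrowed("$world!");
    let mut b: Cow<str> = Cow::Borrowed("$world!");

    replace_via_to_mut(&mut a);
    replace_via_owned(&mut b);

    // Both end up as owned strings with identical contents.
    assert!(matches!(a, Cow::Owned(_)));
    assert!(matches!(b, Cow::Owned(_)));
    assert_eq!(a, b);
}
```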

If the Vec contains Cow::Owned elements, this still throws away their existing allocation. You can prevent that by using the following very fragile and unsafe code inside your for loop. It manipulates the bytes of the UTF-8 string directly and relies on the fact that, once the $ sign is removed, the replacement happens to be exactly the same number of bytes as the text it overwrites.

    let mut last_pos = 0; // so we don't start at the beginning every time
    while let Some(pos) = t[last_pos..].find("$world") {
        let p = pos + last_pos; // find always starts at last_pos
        last_pos = p + 5; // resume right after the 5 bytes of "Earth"
        unsafe {
            let s = t.to_mut().as_mut_vec(); // operating on Vec is easier
            s.remove(p); // remove $ sign
            // overwrite "world" with "Earth", byte by byte
            for (c, sc) in "Earth".bytes().zip(&mut s[p..]) {
                *sc = c;
            }
        }
    }

Note that this is tailored exactly to the "$world" -> "Earth" mapping. Any other mapping requires careful consideration inside the unsafe code.
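For mappings of arbitrary length, a safe alternative is possible. This is not the technique from the answer above: the sketch below swaps the byte-level unsafe code for String::replace_range from the standard library, which edits an owned String in place and may only need to reallocate when the text grows beyond the current capacity. The helper name `replace_in_place` is hypothetical.

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces every occurrence of `from` with `to`
/// in place, without `unsafe`.
fn replace_in_place(t: &mut Cow<str>, from: &str, to: &str) {
    if from.is_empty() {
        return; // an empty pattern would match forever
    }
    let mut last_pos = 0; // resume searching after the previous replacement
    while let Some(pos) = t[last_pos..].find(from) {
        let p = last_pos + pos; // absolute byte offset of the match
        // to_mut() clones only if the Cow is still borrowed; an owned
        // String is edited directly.
        t.to_mut().replace_range(p..p + from.len(), to);
        last_pos = p + to.len();
    }
}

fn main() {
    let mut t: Cow<str> = Cow::Borrowed("$world, meet $world!");
    replace_in_place(&mut t, "$world", "Earth");
    assert_eq!(&*t, "Earth, meet Earth!");
}
```

Because the search resumes after the inserted text, a replacement that itself contains the pattern is not expanded again.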

Published: 2023-10-07 20:03:59
Article link: https://www.elefans.com/category/jswz/34/1470400.html