Convert custom markdown to HTML?

Updated: 2024-10-08 14:49:39

Challenge: Our users have access to a "contentEditable" DIV into which a JS library inserts HTML. Here's how we thought the HTML should show up in the contentEditable:

[data-user="12345" data-userId="678910"] John Smith [/] ...Blablabla some other text...

We hand this HTML over to PHP, where we run strip_tags(). This should give us:

[data-user="12345" data-userId="678910"]John Smith[/] ...Blablabla some other text...

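Question: When rendering the text on the page, is there a secure and reliable way to convert the custom markdown above into the following (before handing it to Handlebars.js)?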

John Smith ...Blablabla some other text...

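Why: This ensures that the user-generated content is handled safely, while keeping the user-generated markdown "pretty" in the contentEditable (the "stylish-blue-button" class).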

If you have any suggestions that would make this whole process simpler, we're open to changing our markdown's format.

Thank you so much!

Best answer

You could use a regex like this:

$string = ' [data-user="12345" data-userId="678910"] John Smith [/] ...Blablabla some other text...';
echo preg_replace('~\[(data-user="\d+")\h+(data-userId="\d+")\]\s*(.+?)\s*\[/\]\s*(.*)~s', '$3$4', trim(strip_tags($string)));

Here's a regex101 demo explaining exactly what that regex is doing. If you have any particular questions, please ask.

Output:

John Smith...Blablabla some other text...

A few quick regex notes:

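* is a quantifier meaning zero or more of the preceding token.
+ is a quantifier meaning one or more of the preceding token (in other words, at least one is required).
A ? placed after a quantifier, as in .+?, makes it lazy, so it matches as little as possible.
\s is a whitespace character.
\h is a horizontal whitespace character (such as a space or a tab).
. is any single character (including newlines here, because of the s modifier).
\d is a single digit (0-9).
() are capturing groups. They capture into $1, $2, etc., in the order in which they appear.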

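One more note on that regex: \[/\] is read as a literal [/]. The backslashes escape the [ and ], which would otherwise create a character class (meaning only the / character would be allowed there).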
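To make the capture-group numbering concrete, here is a minimal sketch that runs the same pattern through preg_match() and prints each group. The input is the sample string from the question, assumed to be already stripped of tags and trimmed; the variable names are only illustrative.

```php
<?php
// Sample input from the question, as it looks after strip_tags() and trim().
$input = '[data-user="12345" data-userId="678910"] John Smith [/] ...Blablabla some other text...';

// Same pattern as the single-instance example above.
$pattern = '~\[(data-user="\d+")\h+(data-userId="\d+")\]\s*(.+?)\s*\[/\]\s*(.*)~s';

if (preg_match($pattern, $input, $m)) {
    // $m[0] holds the whole match; $m[1]..$m[4] hold the capture groups in order.
    echo $m[1], "\n"; // data-user="12345"
    echo $m[2], "\n"; // data-userId="678910"
    echo $m[3], "\n"; // John Smith
    echo $m[4], "\n"; // ...Blablabla some other text...
}
```

Groups 1 and 2 are captured but never used in the replacement, which is why '$3$4' keeps only the name and the trailing text.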

Multiple instances:

$string = ' [data-user="12345" data-userId="678910"] John Smith [/] ...Blablabla some other text... [data-user="12345" data-userId="678910"] John Smith [/] ...Blablabla some other text... [data-user="12345" data-userId="678910"] John Smith [/] ...Blablabla some other text...';
echo preg_replace('~\s*\[(data-user="\d+")\h+(data-userId="\d+")\]\s*(.+?)\s*\[/\]\s*~s', '$3', trim(strip_tags($string)));

Output:

John Smith...Blablabla some other text...John Smith...Blablabla some other text...John Smith...Blablabla some other text...
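The output above has no spaces around the names because the leading and trailing \s* in the multi-instance pattern swallow the whitespace on both sides of each marker. The sketch below shows this on a shorter, made-up string (the names Ann and Bob and the IDs are hypothetical) and uses the optional by-reference count argument of preg_replace() to confirm how many markers were replaced.

```php
<?php
// Hypothetical shortened input with two markers.
$string = '[data-user="1" data-userId="2"] Ann [/] first [data-user="3" data-userId="4"] Bob [/] second';

// Same pattern as the multi-instance example above.
$pattern = '~\s*\[(data-user="\d+")\h+(data-userId="\d+")\]\s*(.+?)\s*\[/\]\s*~s';

// A limit of -1 means "replace every match"; $count receives the number of replacements.
$result = preg_replace($pattern, '$3', $string, -1, $count);

echo $result, "\n"; // AnnfirstBobsecond
echo $count, "\n";  // 2
```

If the surrounding spaces should be kept, one option is to drop the leading and trailing \s* from the pattern.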

For looser IDs, just change the \d+ to [a-zA-Z0-9 ]+.

So:

echo preg_replace('~\s*\[(data-user="\d+")\h+(data-userId="[a-zA-Z0-9 ]+")\]\s*(.+?)\s*\[/\]\s*~s', '$3', trim(strip_tags($string)));
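The examples above only strip the markers. If the goal is to turn each marker into HTML, a possible approach is sketched below: escape the whole string first, then rebuild a known-safe element around each match. The function name renderMentions and the span markup are assumptions for illustration; the target markup from the question did not survive on this page, and only the "stylish-blue-button" class name comes from the question.

```php
<?php
/**
 * Hypothetical helper (not part of the original answer): converts
 * [data-user="..." data-userId="..."]Name[/] markers into <span> elements.
 */
function renderMentions(string $raw): string
{
    // Escape everything up front, so any text that survives is inert in HTML.
    $escaped = htmlspecialchars(trim(strip_tags($raw)), ENT_QUOTES, 'UTF-8');

    // After escaping, the double quotes inside the marker appear as &quot;.
    $pattern = '~\[data-user=&quot;(\d+)&quot;\h+data-userId=&quot;(\d+)&quot;\]\s*(.+?)\s*\[/\]~s';

    $html = preg_replace_callback($pattern, function (array $m): string {
        // $m[1] and $m[2] can only contain digits; $m[3] was already escaped above.
        return sprintf(
            '<span class="stylish-blue-button" data-user="%s" data-userId="%s">%s</span>',
            $m[1],
            $m[2],
            $m[3]
        );
    }, $escaped);

    // preg_replace_callback() returns null on error; fall back to the escaped text.
    return $html ?? $escaped;
}

echo renderMentions('[data-user="12345" data-userId="678910"] John Smith [/] <b>some</b> & other text');
// <span class="stylish-blue-button" data-user="12345" data-userId="678910">John Smith</span> some &amp; other text
```

Because the result is already HTML, it would presumably need to go through a Handlebars triple-stash expression ({{{ }}}) rather than the default {{ }}, which escapes HTML.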

Published: 2023-08-02 04:15:00

Source: https://www.elefans.com/category/jswz/34/1369106.html