Note the different outputs from different versions of the same tag:
<?php // striptags.php
$data = '<br>Each<br/>New<br />Line';
$new = strip_tags($data, '<br>');
var_dump($new); // OUTPUTS string(21) "<br>EachNew<br />Line"
?>
<?php // striptags.php
$data = '<br>Each<br/>New<br />Line';
$new = strip_tags($data, '<br/>');
var_dump($new); // OUTPUTS string(16) "Each<br/>NewLine"
?>
<?php // striptags.php
$data = '<br>Each<br/>New<br />Line';
$new = strip_tags($data, '<br />');
var_dump($new); // OUTPUTS string(11) "EachNewLine"
?>
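Since the result can depend on which spelling of the tag is allowed, one way to sidestep the difference is to normalize every <br> variant to a single form before calling strip_tags(). The following is a minimal sketch, not an official recipe; the helper name strip_tags_keep_br() is hypothetical, and it assumes <br> is the only tag that needs to survive.

```php
<?php
// Hypothetical helper (not part of PHP): rewrite <br>, <br/>, <br /> and
// upper-case variants to the plain form "<br>", then allow only that form.
function strip_tags_keep_br($data)
{
    $normalized = preg_replace('~<br\s*/?>~i', '<br>', $data);
    return strip_tags($normalized, '<br>');
}

var_dump(strip_tags_keep_br('<br>Each<br/>New<br />Line'));
// string(23) "<br>Each<br>New<br>Line"
```

Normalizing first means the allowed-tags argument only ever has to name one spelling.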
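PHP strip_tags() Function
Definition and Usage
The strip_tags() function strips HTML, XML, and PHP tags from a string.
Syntax
strip_tags(string,allow)
Parameter | Description |
---|---|
string | Required. The string to check. |
allow | Optional. The tags to allow. These tags will not be removed. |
Tips and Notes
Note: This function always strips HTML comments. This cannot be changed with the allow parameter.
Examples
Example 1
<?php
echo strip_tags("Hello <b>world!</b>");
?>
Output:
Hello world!
Example 2
<?php
echo strip_tags("Hello <b><i>world!</i></b>", "<b>");
?>
Output:
Hello <b>world!</b>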
Source: http://www.w3school.com.cn/php/func_string_strip_tags.asp
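strip_tags
(PHP 4, PHP 5)
strip_tags - Strip HTML and PHP tags from a string
Description
string strip_tags ( string $str [, string $allowable_tags ] )
This function tries to return the given string str with all NULL bytes, HTML tags, and PHP tags stripped. It uses the same tag-stripping state machine as the fgetss() function.
Parameters
str
The input string.
allowable_tags
The optional second parameter specifies tags that should not be stripped.
Note:
HTML comments and PHP tags are also stripped. This is hardcoded and cannot be changed with allowable_tags.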
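A short sketch of this note; the input string is made up for illustration. Comments are dropped whether or not any tags are allowed:

```php
<?php
// Illustrative input only: a comment sits between two pieces of text.
$text = 'Hello<!-- hidden remark --> <b>world</b>';

// No allowed tags: the comment and the <b> tags are removed.
echo strip_tags($text), "\n";          // Hello world

// <b> is allowed, but the comment is still removed.
echo strip_tags($text, '<b>'), "\n";   // Hello <b>world</b>
```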
Return Values
Returns the stripped string.
Changelog
Version | Description |
---|---|
5.0.0 | strip_tags() is now binary safe. |
4.3.0 | HTML comments are now always stripped. |
Examples
Example #1 strip_tags() example
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text);
echo "\n";
// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>
The above example will output:
Test paragraph. Other text
<p>Test paragraph.</p> <a href="#fragment">Other text</a>
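Notes
Warning
Because strip_tags() does not actually validate the HTML, partial or broken tags can result in the removal of more data than expected.
Warning
This function does not modify any attributes on the tags that you allow using allowable_tags, including the style and onmouseover attributes, which a malicious user may abuse in submitted content that is then shown to other users.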
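One possible mitigation, sketched under stated assumptions: after strip_tags() has removed the disallowed tags, rewrite each surviving tag to its bare form so that no attribute is left. The helper name strip_tags_and_attributes() is hypothetical, and the regular expression assumes attribute values never contain a literal ">". It also drops harmless attributes such as href, so it only fits cases where allowed tags need no attributes at all.

```php
<?php
// Hypothetical helper: keep only the tags in $allowed, then rewrite every
// surviving tag to its bare form so no attribute (style, onmouseover, ...)
// reaches the output. Assumes attribute values contain no literal ">".
function strip_tags_and_attributes($html, $allowed)
{
    $kept = strip_tags($html, $allowed);
    return preg_replace('~<(/?[a-zA-Z][a-zA-Z0-9]*)\b[^>]*>~', '<$1>', $kept);
}

$input = '<p style="color:red" onmouseover="steal()">Hi <a href="#x">there</a></p>';
echo strip_tags_and_attributes($input, '<p>'), "\n";
// <p>Hi there</p>
```

For anything security-sensitive, a dedicated HTML sanitizer is likely a safer choice than a regular expression.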
User Contributed Notes
CEO at CarPool2Camp dot org
4 years ago
admin at automapit dot com
7 years ago
<?php
function html2txt($document){
    // The style pattern must run before the generic tag pattern; otherwise the
    // <style> tags are already gone and the CSS between them is left behind.
    $search = array('@<script[^>]*?>.*?</script>@si',  // Strip out javascript
                    '@<style[^>]*?>.*?</style>@si',    // Strip style tags and their content (no U flag: it would make .*? greedy)
                    '@<[\/\!]*?[^<>]*?>@si',           // Strip out HTML tags
                    '@<![\s\S]*?--[ \t\n\r]*>@'        // Strip multi-line comments including CDATA
    );
    $text = preg_replace($search, '', $document);
    return $text;
}
?>
This function turns HTML into text: it strips tags, comments spanning multiple lines (including CDATA), and anything else that gets in its way.
It's a Frankenstein function I made from bits picked up on my travels through the web; thanks to the many who have unwittingly contributed!
mariusz.tarnaski at wp dot pl
4 years ago
Hi. I made a function that removes the HTML tags along with their contents:
Function:
<?php
function strip_tags_content($text, $tags = '', $invert = FALSE) {
    preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
    $tags = array_unique($tags[1]);
    if(is_array($tags) && count($tags) > 0) {
        if($invert == FALSE) {
            return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
        }
        else {
            return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
        }
    }
    elseif($invert == FALSE) {
        return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    return $text;
}
?>
Sample text:
$text = '<b>sample</b> text with <div>tags</div>';
Result for strip_tags($text):
sample text with tags
Result for strip_tags_content($text):
text with
Result for strip_tags_content($text, '<b>'):
<b>sample</b> text with
Result for strip_tags_content($text, '<b>', TRUE);
text with <div>tags</div>
I hope someone finds it useful :)
bzplan at web dot de
1 year ago
Take HTML code like this:
<?php
$html = '
<div>
<p style="color:blue;">color is blue</p><p>size is <span style="font-size:200%;">huge</span></p>
<p>material is wood</p>
</div>
';
?>
with <?php $str = strip_tags($html); ?>
... the result is:
$str = 'color is bluesize is huge
material is wood';
Notice: the words 'blue' and 'size' run together :(
and the line breaks are still in the new string $str.
If you need a space between the words (and no line breaks),
use my function: <?php $str = rip_tags($html); ?>
... the result is:
$str = 'color is blue size is huge material is wood';
The function:
<?php
// --------------------------------------------------------------
function rip_tags($string) {
    // ----- remove HTML TAGs -----
    $string = preg_replace ('/<[^>]*>/', ' ', $string);
    // ----- remove control characters -----
    $string = str_replace("\r", '', $string);  // --- remove (replace with empty string)
    $string = str_replace("\n", ' ', $string); // --- replace with space
    $string = str_replace("\t", ' ', $string); // --- replace with space
    // ----- remove multiple spaces -----
    $string = trim(preg_replace('/ {2,}/', ' ', $string));
    return $string;
}
// --------------------------------------------------------------
?>
The KEY is replacing each match of the regex pattern '/<[^>]*>/' with a space,
instead of calling strip_tags(),
... and then removing control characters and multiple spaces
:)
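A variant of the same idea that still relies on strip_tags() can be sketched as follows. The function name strip_tags_spaced() is hypothetical, and the sketch assumes every "<" in the input belongs to a tag: it puts a space in front of each "<" so that adjacent blocks cannot fuse, strips the tags, and then collapses whitespace.

```php
<?php
// Hypothetical variant of the idea above: put a space in front of every "<"
// so adjacent blocks cannot fuse, let strip_tags() remove the tags, then
// collapse all whitespace runs (including line breaks) to single spaces.
function strip_tags_spaced($html)
{
    $spaced = str_replace('<', ' <', $html);
    return trim(preg_replace('/\s+/', ' ', strip_tags($spaced)));
}

echo strip_tags_spaced("<p>color is blue</p><p>size is <span>huge</span></p>\n<p>material is wood</p>"), "\n";
// color is blue size is huge material is wood
```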
tom at cowin dot us
3 years ago
With most web based user input of more than a line of text, it seems I get 90% 'paste from Word'. I've developed this fn over time to try to strip all of this cruft out. A few things I do here are application specific, but if it helps you - great, if you can improve on it or have a better way - please - post it...
<?php
function strip_word_html($text, $allowed_tags = '<b><i><sup><sub><em><strong><u><br>')
{
    //replace MS special characters first
    $search = array('/‘/u', '/’/u', '/“/u', '/”/u', '/—/u');
    $replace = array('\'', '\'', '"', '"', '-');
    $text = preg_replace($search, $replace, $text);
    //make sure _all_ html entities are converted to the plain ascii equivalents - it appears
    //in some MS headers, some html entities are encoded and some aren't
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    //try to strip out any C style comments first, since these, embedded in html comments, seem to
    //prevent strip_tags from removing html comments (MS Word introduced combination)
    //(preg_replace is used because the mb_ereg functions do not take pattern delimiters)
    if(mb_stripos($text, '/*') !== FALSE){
        $text = preg_replace('#/\*.*?\*/#su', '', $text);
    }
    //introduce a space into any arithmetic expressions that could be caught by strip_tags
    //so that they won't be stripped: '<1' becomes '< 1' (note: somewhat application specific)
    $text = preg_replace(array('/<([0-9]+)/'), array('< $1'), $text);
    $text = strip_tags($text, $allowed_tags);
    //eliminate extraneous whitespace from start and end of line, or anywhere there are two or more spaces, convert it to one
    $text = preg_replace(array('/^\s\s+/', '/\s\s+$/', '/\s\s+/u'), array('', '', ' '), $text);
    //strip out inline css and simplify style tags
    $search = array('#<(strong|b)[^>]*>(.*?)</(strong|b)>#isu', '#<(em|i)[^>]*>(.*?)</(em|i)>#isu', '#<u[^>]*>(.*?)</u>#isu');
    $replace = array('<b>$2</b>', '<i>$2</i>', '<u>$1</u>');
    $text = preg_replace($search, $replace, $text);
    //on some of the newer MS Word exports, where you get conditionals of the form 'if gte mso 9', etc., it appears
    //that whatever is in one of the html comments prevents strip_tags from eradicating the html comment that contains
    //some MS Style Definitions - this last bit gets rid of any leftover comments
    //(the match is lazy so that text between two separate comments is kept)
    $num_matches = preg_match_all("/\<!--/u", $text, $matches);
    if($num_matches){
        $text = preg_replace('/\<!--(.)*?--\>/isu', '', $text);
    }
    return $text;
}
?>
cesar at nixar dot org
7 years ago
Here is a recursive function for strip_tags, like the one shown on the stripslashes manual page.
<?php
function strip_tags_deep($value)
{
    return is_array($value) ?
        array_map('strip_tags_deep', $value) :
        strip_tags($value);
}
// Example
$array = array('<b>Foo</b>', '<i>Bar</i>', array('<b>Foo</b>', '<i>Bar</i>'));
$array = strip_tags_deep($array);
// Output
print_r($array);
?>
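The same traversal can also be sketched with array_walk_recursive(), which visits every leaf of a nested array. The function name strip_tags_in_place() and the sample array are made up for illustration; unlike the version above, this sketch modifies the array in place and skips non-string values.

```php
<?php
// Sketch only: strip tags from every string leaf of a nested array, in place.
// Non-string leaves (ints, null, ...) are left untouched.
function strip_tags_in_place(array &$data)
{
    array_walk_recursive($data, function (&$value) {
        if (is_string($value)) {
            $value = strip_tags($value);
        }
    });
}

$form = array('name' => '<b>Foo</b>', 'tags' => array('<i>Bar</i>', 42));
strip_tags_in_place($form);
print_r($form);
```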
kai at froghh dot de
4 years ago
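A function that decides whether < is the start of a tag or a less-than / less-than-or-equal sign:
<?php
function lt_replace($str){
    // "/", "!" and "?" are excluded so closing tags, comments and <?...?> blocks are left alone
    return preg_replace("/<([^[:alpha:]\/!?])/", '&lt;\\1', $str);
}
?>
It's to be used before strip_tags().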
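The escaping step can also be sketched with a negative lookahead, which additionally catches a "<" at the very end of the string. The helper name escape_stray_lt() and the sample input are hypothetical.

```php
<?php
// Hypothetical helper: escape every "<" that cannot start a tag, i.e. one
// that is not followed by a letter, "/", "!" or "?".
function escape_stray_lt($str)
{
    return preg_replace('~<(?![a-zA-Z/!?])~', '&lt;', $str);
}

$input = 'if x <5 or y <= 3 then <b>stop</b>';
$safe  = escape_stray_lt($input);

echo $safe, "\n";              // if x &lt;5 or y &lt;= 3 then <b>stop</b>
echo strip_tags($safe), "\n";  // if x &lt;5 or y &lt;= 3 then stop
```

A comparison such as "x <y", where the "<" is followed by a letter, still looks like a tag to both versions, so this is a heuristic and not a parser.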
salavert at~ akelos
7 years ago
<?php
/**
 * Works like PHP function strip_tags, but it only removes selected tags.
 * Example:
 * strip_selected_tags('<b>Person:</b> <strong>Salavert</strong>', 'strong') => <b>Person:</b> Salavert
 */
function strip_selected_tags($text, $tags = array())
{
    $args = func_get_args();
    $text = array_shift($args);
    $tags = func_num_args() > 2 ? array_diff($args,array($text)) : (array)$tags;
    foreach ($tags as $tag){
        // \b keeps 'b' from also matching <br>; preg_quote guards against regex metacharacters
        $name = preg_quote($tag, '/');
        if(preg_match_all('/<'.$name.'\b[^>]*>(.*)<\/'.$name.'>/iU', $text, $found)){
            $text = str_replace($found[0],$found[1],$text);
        }
    }
    return $text;
}
?>
Hope you find it useful,
Jose Salavert
brettz9 AAT yah
4 years ago
strip_tags() also works on the shortened <?...?> syntax, and thus will also remove XML processing instructions.
mshaffer
1 month ago
Below was a note on "strip_tags" page that got removed off of PHP.net ... I found this note useful, and use the code in parsing before "stripping tags" ... I don't know why in the world you would delete this one, but keep others ... your review system is a bit disturbing ...
On your page you have a warning about how data may be lost, but you delete a user-contributed comment that helps prevent that?
======================
aleksey at favor dot com dot ua 24-Feb-2011 01:06
strip_tags destroys all of the HTML after a tag with invalid attributes, like <img src="/images/image.jpg""> (look, there is an odd quote before >).
So I wrote a function which fixes unsafe attributes and replaces odd " and ' quotes with &quot; and &#039;.
<?php
function fix_unsafe_attributes($s) {
    $out = '';
    while (preg_match('/<([A-Za-z])[^>]*?>/', $s, $i, PREG_OFFSET_CAPTURE)) { // find where the tag begins
        $i = $i[1][1]+1;
        $out.= substr($s, 0, $i);
        $s = substr($s, $i);
        // scan attributes and find odd " and '
        while (((($i1 = strpos($s, '"')) || 1) && (($i2 = strpos($s, '\'')) || 1)) && ($i1 !== false || $i2 !== false) &&
               (($i = (int)(($i1 !== false) && ($i2 !== false) ? ($i1 < $i2 ? $i1 : $i2) : ($i1 == false ? $i2 : $i1))) !== false) &&
               ((($c = strpos($s, '>')) === false) || ($i < $c))) {
            $c = $s[$i]; // square brackets: the $s{$i} offset syntax is not accepted by newer PHP versions
            if (($i < 1) || ($s[$i-1] != '=')) {
                $out.= substr($s, 0, $i).($s[$i] == '"' ? '&quot;' : '&#039;'); // replace odd " and '
                $s = substr($s, $i+1);
            }else {
                $i++;
                $out.= substr($s, 0, $i);
                $s = substr($s, $i);
                if (($i = strpos($s, $c)) !== false) {
                    $i++;
                    $out.= substr($s, 0, $i);
                    $s = substr($s, $i);
                }
            }
        }
    }
    return $out.$s;
}
?>
Maybe this function could be rewritten as a simple regular expression, but I had no luck doing that quickly.
Source: http://php.net/manual/zh/function.strip-tags.php