PHP strip_tags() function

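Definition and usage

The strip_tags() function strips HTML, XML, and PHP tags from a string.

Syntax

strip_tags(string,allow)

Parameter   Description
string      Required. The string to check.
allow       Optional. The tags to allow. These tags are not removed.

Tips and notes

Note: This function always strips HTML comments. This cannot be changed with the allow parameter.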
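
As a small illustration of the note above, the sketch below allows <p> and shows that the HTML comment is removed anyway. The sample string is made up for this example.

```php
<?php
// Hypothetical sample input: one allowed tag plus an HTML comment.
$input = '<p>Visible text</p><!-- hidden comment -->';

// Allow <p>; the comment is stripped regardless of the allow list.
$result = strip_tags($input, '<p>');

echo $result, "\n";
```

The paragraph tags survive because they are on the allow list, while the comment has no allow-list equivalent.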

Examples

Example 1

<?php
echo strip_tags("Hello <b>world!</b>");
?>

Output:

Hello world!

Example 2

<?php
echo strip_tags("Hello <b><i>world!</i></b>","<b>");
?>

Output:

Hello <b>world!</b>

Source: http://www.w3school.com.cn/php/func_string_strip_tags.asp

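strip_tags

(PHP 4, PHP 5)

strip_tags: Strip HTML and PHP tags from a string

Description

string strip_tags ( string $str [, string $allowable_tags ] )

This function tries to return a string with all NULL bytes, HTML and PHP tags stripped from the given str. It uses the same tag-stripping state machine as the fgetss() function.

Parameters

str

The input string.

allowable_tags

The optional second parameter specifies the tags that should not be stripped.

Note:

HTML comments and PHP tags are also stripped. This is hardcoded and cannot be changed with allowable_tags.

Return Values

Returns the stripped string.

Changelog

Version   Description
5.0.0     strip_tags() is now binary safe.
4.3.0     HTML comments are now always stripped.

Examples

Example #1 strip_tags() example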

<?php
$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text);
echo "\n";

// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>

The above example will output:

Test paragraph. Other text
<p>Test paragraph.</p> <a href="#fragment">Other text</a>

Notes

Warning

Because strip_tags() does not actually validate the HTML, partial or broken tags can result in the removal of more text or data than expected.
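
A minimal sketch of the warning above, using a made-up input in which a tag is opened but never closed. Under that assumption, everything after the unterminated "<b" is expected to be treated as part of the tag and dropped.

```php
<?php
// Hypothetical input: the "<b" tag is never closed with ">".
$input = 'Total: <b 42 items in stock';

$result = strip_tags($input);

// The text after the unterminated "<b" is swallowed along with the tag.
var_dump($result);
```

If input may contain a bare "<" like this, it is worth escaping or repairing it before calling strip_tags().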

Warning

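This function does not modify any attributes on the tags that you allow using allowable_tags, including the style and onmouseover attributes, which a mischievous user may abuse when posting text that will be shown to other users.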
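
The following sketch illustrates the warning above with a made-up piece of user input. It only demonstrates the behaviour and is not a sanitizer.

```php
<?php
// Hypothetical user-submitted markup with an inline event handler.
$input = '<p onmouseover="alert(1)">Hover me</p><script>steal()</script>';

// <p> is allowed, so the tag and all of its attributes pass through unchanged.
$result = strip_tags($input, '<p>');

echo $result, "\n";
```

The <script> tags themselves are removed, but the text between them stays, and the onmouseover attribute on the allowed <p> tag is left untouched.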

Last updated: Fri, 01 Nov 2013

User Contributed Notes: strip_tags (10 notes)
CEO at CarPool2Camp dot org
4 years ago
Note the different outputs from different versions of the same tag:

<?php // striptags.php
// Outputs are those reported by the note's author; they may differ between PHP versions.
$data = '<br>Each<br/>New<br />Line';

$new = strip_tags($data, '<br>');
var_dump($new); // OUTPUTS string(21) "<br>EachNew<br />Line"

$new = strip_tags($data, '<br/>');
var_dump($new); // OUTPUTS string(16) "Each<br/>NewLine"

$new = strip_tags($data, '<br />');
var_dump($new); // OUTPUTS string(11) "EachNewLine"
?>
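
One possible way around the inconsistency above is to rewrite every <br> variant to a plain <br> before calling strip_tags(), so that a single allow-list entry covers them all. This is only a sketch; the helper name normalize_br is made up here and is not part of PHP.

```php
<?php
// Hypothetical helper (not part of PHP): rewrite <br>, <br/> and <br /> as "<br>".
function normalize_br($html)
{
    return preg_replace('#<br\s*/?>#i', '<br>', $html);
}

$data = '<br>Each<br/>New<br />Line';
$new  = strip_tags(normalize_br($data), '<br>');

var_dump($new);
```

After normalization all three line breaks are written the same way, so they are all kept by the '<br>' allow list.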
admin at automapit dot com
7 years ago
<?php
function html2txt($document) {
    // Patterns run in order, so <script> and <style> blocks are removed
    // together with their contents before the generic tag pattern runs.
    $search = array(
        '@<script[^>]*?>.*?</script>@si',  // Strip out javascript
        '@<style[^>]*?>.*?</style>@si',    // Strip style tags properly
        '@<![\s\S]*?--[ \t\n\r]*>@',       // Strip multi-line comments including CDATA
        '@<[\/\!]*?[^<>]*?>@si'            // Strip out HTML tags
    );
    $text = preg_replace($search, '', $document);
    return $text;
}
?>

This function turns HTML into text... strips tags, comments spanning multiple lines including CDATA, and anything else that gets in its way.

It's a Frankenstein function I made from bits picked up on my travels through the web, thanks to the many who have unwittingly contributed!
mariusz.tarnaski at wp dot pl
4 years ago
Hi. I made a function that removes the HTML tags along with their contents:

Function:
<?php
function strip_tags_content($text, $tags = '', $invert = FALSE) {

  preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
  $tags = array_unique($tags[1]);

  if (is_array($tags) AND count($tags) > 0) {
    if ($invert == FALSE) {
      return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    else {
      return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
    }
  }
  elseif ($invert == FALSE) {
    return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
  }
  return $text;
}
?>

Sample text:
$text = '<b>sample</b> text with <div>tags</div>';

Result for strip_tags($text):
sample text with tags

Result for strip_tags_content($text):
 text with

Result for strip_tags_content($text, '<b>'):
<b>sample</b> text with

Result for strip_tags_content($text, '<b>', TRUE);
 text with <div>tags</div>

I hope someone finds it useful :)
bzplan at web dot de
1 year ago
HTML code like this:

<?php
$html = '
<div>
<p style="color:blue;">color is blue</p><p>size is <span style="font-size:200%;">huge</span></p>
<p>material is wood</p>
</div>
';
?>

with <?php $str = strip_tags($html); ?>
... the result is:

$str = 'color is bluesize is huge
material is wood';

Notice: the words 'blue' and 'size' run together :(
and the line breaks are still in the new string $str.

If you need a space between the words (and no line breaks),
use my function: <?php $str = rip_tags($html); ?>
... the result is:

$str = 'color is blue size is huge material is wood';

The function:

<?php
// --------------------------------------------------------------

function rip_tags($string) {

    // ----- remove HTML TAGs -----
    $string = preg_replace('/<[^>]*>/', ' ', $string);

    // ----- remove control characters -----
    $string = str_replace("\r", '', $string);    // --- replace with empty string
    $string = str_replace("\n", ' ', $string);   // --- replace with space
    $string = str_replace("\t", ' ', $string);   // --- replace with space

    // ----- remove multiple spaces -----
    $string = trim(preg_replace('/ {2,}/', ' ', $string));

    return $string;
}

// --------------------------------------------------------------
?>

The KEY is the regex pattern '/<[^>]*>/', used instead of strip_tags(),
... then remove control characters and multiple spaces
:)
tom at cowin dot us
3 years ago
With most web based user input of more than a line of text, it seems I get 90% 'paste from Word'. I've developed this fn over time to try to strip all of this cruft out. A few things I do here are application specific, but if it helps you - great, if you can improve on it or have a better way - please - post it...

<?php

function strip_word_html($text, $allowed_tags = '<b><i><sup><sub><em><strong><u><br>')
{
    mb_regex_encoding('UTF-8');

    // replace MS special characters first
    $search  = array('/&lsquo;/u', '/&rsquo;/u', '/&ldquo;/u', '/&rdquo;/u', '/&mdash;/u');
    $replace = array('\'', '\'', '"', '"', '-');
    $text    = preg_replace($search, $replace, $text);

    // make sure _all_ html entities are converted to the plain ascii equivalents - it appears
    // in some MS headers, some html entities are encoded and some aren't
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');

    // try to strip out any C style comments first, since these, embedded in html comments, seem to
    // prevent strip_tags from removing html comments (MS Word introduced combination)
    if (mb_stripos($text, '/*') !== FALSE) {
        // mb_eregi_replace() patterns take no delimiters; the 'm' option lets . match newlines
        $text = mb_eregi_replace('/\*.*?\*/', '', $text, 'm');
    }

    // introduce a space into any arithmetic expressions that could be caught by strip_tags so that
    // they won't be stripped: '<1' becomes '< 1' (note: somewhat application specific)
    $text = preg_replace(array('/<([0-9]+)/'), array('< $1'), $text);

    $text = strip_tags($text, $allowed_tags);

    // eliminate extraneous whitespace from start and end of line, or anywhere there are two or more spaces, convert it to one
    $text = preg_replace(array('/^\s\s+/', '/\s\s+$/', '/\s\s+/u'), array('', '', ' '), $text);

    // strip out inline css and simplify style tags
    $search  = array('#<(strong|b)[^>]*>(.*?)</(strong|b)>#isu', '#<(em|i)[^>]*>(.*?)</(em|i)>#isu', '#<u[^>]*>(.*?)</u>#isu');
    $replace = array('<b>$2</b>', '<i>$2</i>', '<u>$1</u>');
    $text    = preg_replace($search, $replace, $text);

    // on some of the newer MS Word exports, where you get conditionals of the form 'if gte mso 9', etc., it appears
    // that whatever is in one of the html comments prevents strip_tags from eradicating the html comment that contains
    // some MS Style Definitions - this last bit gets rid of any leftover comments
    $num_matches = preg_match_all("/\<!--/u", $text, $matches);
    if ($num_matches) {
        $text = preg_replace('/\<!--(.)*--\>/isu', '', $text);
    }
    return $text;
}

?>
cesar at nixar dot org
7 years ago
Here is a recursive function for strip_tags like the one shown in the stripslashes manual page.

<?php
function strip_tags_deep($value)
{
    return is_array($value) ?
        array_map('strip_tags_deep', $value) :
        strip_tags($value);
}

// Example
$array = array('<b>Foo</b>', '<i>Bar</i>', array('<b>Foo</b>', '<i>Bar</i>'));
$array = strip_tags_deep($array);

// Output
print_r($array);
?>
kai at froghh dot de
4 years ago
A function that decides whether < is the start of a tag or a less-than / less-than-or-equal sign:

<?php
function lt_replace($str) {
    return preg_replace("/<([^[:alpha:]])/", '&lt;\\1', $str);
}
?>

It is meant to be used before strip_tags().
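
Note that the pattern above also matches "</", so closing tags would be escaped as well. A hedged variant is sketched below: it escapes "<" only when the next character cannot start a tag, closing tag, comment or processing instruction. The name escape_bare_lt and the sample input are made up for this sketch.

```php
<?php
// Hypothetical variant of the helper above: "<" is escaped only when it is not
// followed by a letter, "/", "!" or "?".
function escape_bare_lt($str)
{
    return preg_replace('/<(?![a-z\/!?])/i', '&lt;', $str);
}

$input  = 'if a <5 then <b>stop</b>';
$result = strip_tags(escape_bare_lt($input));

echo $result, "\n";
```

Here the comparison "<5" survives as "&lt;5", while the real <b> tags are still recognised and removed by strip_tags().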
salavert at~ akelos
7 years ago
<?php
/**
 * Works like PHP function strip_tags, but it only removes selected tags.
 * Example:
 *     strip_selected_tags('<b>Person:</b> <strong>Salavert</strong>', 'strong') => <b>Person:</b> Salavert
 */
function strip_selected_tags($text, $tags = array())
{
    $args = func_get_args();
    $text = array_shift($args);
    $tags = func_num_args() > 2 ? array_diff($args, array($text)) : (array)$tags;
    foreach ($tags as $tag) {
        if (preg_match_all('/<'.$tag.'[^>]*>(.*)<\/'.$tag.'>/iU', $text, $found)) {
            $text = str_replace($found[0], $found[1], $text);
        }
    }

    return $text;
}
?>

Hope you find it useful,

Jose Salavert
brettz9 AAT yah
4 years ago
Works on shortened <?...?> syntax and thus also will remove XML processing instructions.
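
A small sketch of that observation, using a made-up string that contains an XML declaration in the middle of some text:

```php
<?php
// Hypothetical input containing an XML declaration.
$input = 'before<?xml version="1.0"?>after';

$result = strip_tags($input);

var_dump($result);
```

The declaration is treated like a tag and removed, leaving only the surrounding text.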
mshaffer
1 month ago
Below was a note on "strip_tags" page that got removed off of PHP.net ... I found this note useful, and use the code in parsing before "stripping tags" ... I don't know why in the world you would delete this one, but keep others ... your review system is a bit disturbing ...

On your page you have a warning about how data may be lost, but you delete a user-contributed comment that helps prevent that?

======================

aleksey at favor dot com dot ua 24-Feb-2011 01:06

strip_tags destroys all of the HTML that follows a tag with invalid attributes, such as <img src="/images/image.jpg""> (note the stray quote before >).

So I wrote a function that fixes unsafe attributes and replaces stray " and ' quotes with &quot; and &#39;.

<?php
function fix_unsafe_attributes($s) {
  $out = '';
  while (preg_match('/<([A-Za-z])[^>]*?>/', $s, $i, PREG_OFFSET_CAPTURE)) { // find where the tag begins
    $i = $i[1][1]+1;
    $out.= substr($s, 0, $i);
    $s = substr($s, $i);

    // scan attributes and find odd " and '
    while (((($i1 = strpos($s, '"')) || 1) && (($i2 = strpos($s, '\'')) || 1)) && ($i1 !== false || $i2 !== false) &&
           (($i = (int)(($i1 !== false) && ($i2 !== false) ? ($i1 < $i2 ? $i1 : $i2) : ($i1 == false ? $i2 : $i1))) !== false) &&
           ((($c = strpos($s, '>')) === false) || ($i < $c))) {

      $c = $s[$i];
      if (($i < 1) || ($s[$i-1] != '=')) {
        $out.= substr($s, 0, $i).($s[$i] == '"' ? '&quot;' : '&#39;'); // replace odd " and '
        $s = substr($s, $i+1);
      } else {
        $i++;
        $out.= substr($s, 0, $i);
        $s = substr($s, $i);

        if (($i = strpos($s, $c)) !== false) {
          $i++;
          $out.= substr($s, 0, $i);
          $s = substr($s, $i);
        }
      }
    }
  }
  return $out.$s;
}
?>

Maybe this function could be rewritten as a simple regular expression, but I had no luck doing so quickly.

Source: http://php.net/manual/zh/function.strip-tags.php