Sunday, 2013-11-03 17:57 - shiping1

PHP strip_tags() function

Definition and usage

The strip_tags() function strips HTML, XML, and PHP tags from a string.

Syntax

strip_tags(string,allow)

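Parameter	Description
string	Required. The string to check.
allow	Optional. The tags to allow. These tags will not be removed.

Tips and notes

Note: This function always strips HTML comments. This cannot be changed with the allow parameter.

Examples

Example 1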

<?php
echo strip_tags("Hello <b>world!</b>");
?>

Output:

Hello world!

Example 2

<?php
echo strip_tags("Hello <b><i>world!</i></b>","<b>");
?>

Output (the <i> tags are removed; the allowed <b> tags remain):

Hello <b>world!</b>

Source: http://www.w3school.com.cn/php/func_string_strip_tags.asp

strip_tags

(PHP 4, PHP 5)

strip_tags - Strip HTML and PHP tags from a string

Description

string strip_tags ( string $str [, string $allowable_tags ] )

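This function tries to return the given string str with all NUL bytes, HTML tags, and PHP tags stripped. It uses the same tag-stripping state machine as the fgetss() function.

Parameters

str: The input string.
allowable_tags: An optional second parameter that lists the tags which should not be stripped.
Note:
HTML comments and PHP tags are also stripped. This is hardcoded and cannot be changed with allowable_tags.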
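
As a minimal sketch of the note above (the sample string is made up for illustration), the following shows an HTML comment and a PHP block both being removed even though only <b> is allowed:

```php
<?php
// Hypothetical input: an HTML comment, a PHP block, and one allowed tag.
$input = 'A<!-- hidden -->B<?php echo 1; ?>C<b>D</b>';

// Only <b> is allowed; the comment and the PHP block are still stripped.
$result = strip_tags($input, '<b>');

echo $result, "\n"; // ABC<b>D</b>
```

Listing the comment or PHP delimiters in allowable_tags would not change this, since the removal is hardcoded.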

Return values

Returns the stripped string.

Changelog

Version	Description
5.0.0	strip_tags() is now binary safe.
4.3.0	HTML comments are now always stripped.

Examples

Example #1 strip_tags() example

<?php
$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text);
echo "\n";

// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>

The above example will output:

Test paragraph. Other text
<p>Test paragraph.</p> <a href="#fragment">Other text</a>

Notes

Warning

Because strip_tags() does not actually validate the HTML, partial or broken tags can result in the removal of more text than expected.
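
To illustrate the warning, here is a small sketch with a made-up input in which "<" is a comparison operator and not the start of a tag. Because no closing ">" follows, the rest of the string is treated as an unterminated tag and dropped. Escaping with htmlspecialchars() is shown only as one possible alternative when the goal is safe display and not tag removal:

```php
<?php
// Hypothetical user input where "<" is a comparison, not a tag.
$input = 'if x<y then stop';

// "<y then stop" looks like an unterminated tag, so it is dropped.
$stripped = strip_tags($input);

// Escaping keeps all of the text and makes it safe to print inside HTML.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n"; // if x
echo $escaped, "\n";  // if x&lt;y then stop
```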

Warning

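This function does not modify any attributes on the tags that you allow using allowable_tags, including the style and onmouseover attributes, which a malicious user may abuse when posting text that will be shown to other users.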
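
The sketch below shows the behaviour this warning describes: an allowed <p> tag keeps its onmouseover attribute. The drop_attributes() helper is hypothetical (it is not part of PHP) and uses a deliberately simple regex that would break on attribute values containing ">", so a real application would more likely rely on a proper HTML sanitizer:

```php
<?php
// Hypothetical submitted markup with an inline event handler.
$input = '<p onmouseover="alert(1)">hi</p><script>x</script>';

// <p> is allowed, so it survives together with its attributes.
$kept = strip_tags($input, '<p>');

// Illustrative helper (an assumption, not a PHP built-in): drop everything
// after the tag name in opening tags. Closing tags are left untouched.
function drop_attributes($html)
{
    return preg_replace('/<(\w+)[^>]*>/', '<$1>', $html);
}

echo $kept, "\n";                  // <p onmouseover="alert(1)">hi</p>x
echo drop_attributes($kept), "\n"; // <p>hi</p>x
```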

See also

htmlspecialchars() - Convert special characters to HTML entities

Last updated: Fri, 01 Nov 2013

User Contributed Notes: strip_tags (10 notes)

CEO at CarPool2Camp dot org

4 years ago

Note the different outputs from different versions of the same tag:

<?php // striptags.php
$data = '<br>Each<br/>New<br />Line';

$new = strip_tags($data, '<br>');
var_dump($new); // OUTPUTS string(21) "<br>EachNew<br />Line"

$new = strip_tags($data, '<br/>');
var_dump($new); // OUTPUTS string(16) "Each<br/>NewLine"

$new = strip_tags($data, '<br />');
var_dump($new); // OUTPUTS string(11) "EachNewLine"
?>

admin at automapit dot com

7 years ago

<?php
function html2txt($document){
// Patterns are applied in order, so whole script/style elements and comments
// must be removed before the generic tag pattern runs.
$search = array('@<script[^>]*?>.*?</script>@si',  // Strip out javascript
               '@<style[^>]*?>.*?</style>@si',     // Strip style tags properly
               '@<![\s\S]*?--[ \t\n\r]*>@',        // Strip multi-line comments including CDATA
               '@<[\/\!]*?[^<>]*?>@si'             // Strip out HTML tags
);
$text = preg_replace($search, '', $document);
return $text;
}
?>

This function turns HTML into text... strips tags, comments spanning multiple lines including CDATA, and anything else that gets in its way.

It's a frankenstein function I made from bits picked up on my travels through the web, thanks to the many who have unwittingly contributed!

mariusz.tarnaski at wp dot pl

4 years ago

Hi. I made a function that removes the HTML tags along with their contents:

Function:
<?php 
function strip_tags_content($text, $tags = '', $invert = FALSE) {

  preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
  $tags = array_unique($tags[1]);
   
  if(is_array($tags) AND count($tags) > 0) {
    if($invert == FALSE) {
      return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    else {
      return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
    }
  }
  elseif($invert == FALSE) {
    return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
  }
  return $text;
} 
?> 

Sample text:
$text = '<b>sample</b> text with <div>tags</div>';

Result for strip_tags($text):
sample text with tags

Result for strip_tags_content($text):
 text with

Result for strip_tags_content($text, '<b>'):
<b>sample</b> text with

Result for strip_tags_content($text, '<b>', TRUE):
 text with <div>tags</div>

I hope someone finds it useful :)

bzplan at web dot de

1 year ago

Given HTML like this:

<?php
$html = '
<div>
<p style="color:blue;">color is blue</p><p>size is <span style="font-size:200%;">huge</span></p>
<p>material is wood</p>
</div>
'; 
?>

with <?php $str = strip_tags($html); ?>
... the result is:

$str = 'color is bluesize is huge
material is wood';

Notice that the words 'blue' and 'size' run together :(
and the line breaks are still present in the new string $str.

If you need a space between the words (and no line breaks),
use my function: <?php $str = rip_tags($html); ?>
... the result is:

$str = 'color is blue size is huge material is wood';

The function:

<?php
// -------------------------------------------------------------- 

function rip_tags($string) {
   
    // ----- remove HTML TAGs -----
    $string = preg_replace ('/<[^>]*>/', ' ', $string);
   
    // ----- remove control characters -----
    $string = str_replace("\r", '', $string);    // --- remove (replace with empty string)
    $string = str_replace("\n", ' ', $string);   // --- replace with space
    $string = str_replace("\t", ' ', $string);   // --- replace with space
   
    // ----- remove multiple spaces -----
    $string = trim(preg_replace('/ {2,}/', ' ', $string));
   
    return $string;

}

// -------------------------------------------------------------- 
?>

The key is to replace each tag with a space using the regex pattern '/<[^>]*>/'
instead of calling strip_tags(),
then to remove control characters and collapse multiple spaces
:)

tom at cowin dot us

3 years ago

With most web-based user input of more than a line of text, it seems that 90% of what I get is pasted from Word. I've developed this function over time to try to strip all of this cruft out. A few things I do here are application specific, but if it helps you, great. If you can improve on it or have a better way, please post it...

<?php

    function strip_word_html($text, $allowed_tags = '<b><i><sup><sub><em><strong><u><br>')
    {
        mb_regex_encoding('UTF-8');
        //replace MS special characters first
        $search = array('/&lsquo;/u', '/&rsquo;/u', '/&ldquo;/u', '/&rdquo;/u', '/&mdash;/u');
        $replace = array('\'', '\'', '"', '"', '-');
        $text = preg_replace($search, $replace, $text);
        //make sure _all_ html entities are converted to the plain ascii equivalents - it appears
        //in some MS headers, some html entities are encoded and some aren't
        $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
        //try to strip out any C style comments first, since these, embedded in html comments, seem to
        //prevent strip_tags from removing html comments (MS Word introduced combination)
        if(mb_stripos($text, '/*') !== FALSE){
            //use PCRE here: mb_eregi_replace() patterns take no #...# delimiters
            $text = preg_replace('#/\*.*?\*/#s', '', $text);
        }
        //introduce a space into any arithmetic expressions that could be caught by strip_tags so that they won't be
        //stripped: '<1' becomes '< 1' (note: somewhat application specific)
        $text = preg_replace(array('/<([0-9]+)/'), array('< $1'), $text);
        $text = strip_tags($text, $allowed_tags);
        //eliminate extraneous whitespace from start and end of line, or anywhere there are two or more spaces, convert it to one
        $text = preg_replace(array('/^\s\s+/', '/\s\s+$/', '/\s\s+/u'), array('', '', ' '), $text);
        //strip out inline css and simplify style tags
        $search = array('#<(strong|b)[^>]*>(.*?)</(strong|b)>#isu', '#<(em|i)[^>]*>(.*?)</(em|i)>#isu', '#<u[^>]*>(.*?)</u>#isu');
        $replace = array('<b>$2</b>', '<i>$2</i>', '<u>$1</u>');
        $text = preg_replace($search, $replace, $text);
        //on some of the ?newer MS Word exports, where you get conditionals of the form 'if gte mso 9', etc., it appears
        //that whatever is in one of the html comments prevents strip_tags from eradicating the html comment that contains
        //some MS Style Definitions - this last bit gets rid of any leftover comments
        $num_matches = preg_match_all("/\<!--/u", $text, $matches);
        if($num_matches){
              //lazy match so that text between two separate comments is kept
              $text = preg_replace('/\<!--.*?--\>/isu', '', $text);
        }
        return $text;
    } 
?>

cesar at nixar dot org

7 years ago

Here is a recursive function for strip_tags, like the one shown on the stripslashes manual page.

<?php
function strip_tags_deep($value)
{
  return is_array($value) ?
    array_map('strip_tags_deep', $value) :
    strip_tags($value);
}

// Example
$array = array('<b>Foo</b>', '<i>Bar</i>', array('<b>Foo</b>', '<i>Bar</i>'));
$array = strip_tags_deep($array);

// Output
print_r($array);
?>

kai at froghh dot de

4 years ago

A function that decides whether < is the start of a tag or a less-than / less-than-or-equal sign:

<?php 
function lt_replace($str){
    // "/", "!" and "?" are excluded so closing tags, comments and PHP/XML tags are not escaped
    return preg_replace("/<([^[:alpha:]\/!?])/", '&lt;\\1', $str);
} 
?> 

It's meant to be used before strip_tags().

salavert at~ akelos

7 years ago

<?php
       /**
    * Works like PHP function strip_tags, but it only removes selected tags.
    * Example:
    *     strip_selected_tags('<b>Person:</b> <strong>Salavert</strong>', 'strong') => <b>Person:</b> Salavert
    */

    function strip_selected_tags($text, $tags = array())
    {
        $args = func_get_args();
        $text = array_shift($args);
        $tags = func_num_args() > 2 ? array_diff($args,array($text))  : (array)$tags;
        foreach ($tags as $tag){
            // \b stops a tag name such as "b" from also matching <br> or <body>
            if(preg_match_all('/<'.$tag.'\b[^>]*>(.*)<\/'.$tag.'>/iU', $text, $found)){
                $text = str_replace($found[0],$found[1],$text);
          }
        }

        return $text;
    }

?>

Hope you find it useful,

Jose Salavert

brettz9 AAT yah

4 years ago

strip_tags() also works on the shortened <?...?> syntax, and thus will also remove XML processing instructions.
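
A minimal sketch of that observation, using a made-up XML snippet: the XML declaration at the start is removed along with the element tags, leaving only the text content.

```php
<?php
// Hypothetical XML snippet that starts with an XML declaration.
$xml = '<?xml version="1.0"?><root>text</root>';

// Both the XML declaration and the <root> element tags are stripped.
$result = strip_tags($xml);

echo $result, "\n"; // text
```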

mshaffer

1 month ago

Below was a note on "strip_tags" page that got removed off of PHP.net ... I found this note useful, and use the code in parsing before "stripping tags" ... I don't know why in the world you would delete this one, but keep others ... your review system is a bit disturbing ...

On your page you have a warning about how data may be lost, but you delete a user-contributed comment that helps prevent that?

======================

aleksey at favor dot com dot ua 24-Feb-2011 01:06

strip_tags() destroys all of the HTML that follows a tag with invalid attributes, such as <img src="/images/image.jpg""> (note the stray quote before the >).

So I wrote a function that fixes unsafe attributes and replaces stray " and ' quotes with &quot; and &#39;.

<?php 
function fix_unsafe_attributes($s) {
  $out = ''; // start with an empty string so the .= appends below stay string operations
  while (preg_match('/<([A-Za-z])[^>]*?>/', $s, $i, PREG_OFFSET_CAPTURE)) { // find where the tag begins
    $i = $i[1][1]+1;
    $out.= substr($s, 0, $i);
    $s = substr($s, $i);

    // scan attributes and find odd " and '
    while (((($i1 = strpos($s, '"')) || 1) && (($i2 = strpos($s, '\'')) || 1)) && ($i1 !== false || $i2 !== false) &&
           (($i = (int)(($i1 !== false) && ($i2 !== false) ? ($i1 < $i2 ? $i1 : $i2) : ($i1 == false ? $i2 : $i1))) !== false) &&
           ((($c = strpos($s, '>')) === false) || ($i < $c))) {

      // square brackets: the $s{$i} offset syntax was removed in PHP 8.0
      $c = $s[$i];
      if (($i < 1) || ($s[$i-1] != '=')) {
        $out.= substr($s, 0, $i).($s[$i] == '"' ? '&quot;' : '&#39;'); // replace odd " and '
        $s = substr($s, $i+1);
      }else {
        $i++;
        $out.= substr($s, 0, $i);
        $s = substr($s, $i);

        if (($i = strpos($s, $c)) !== false) {
          $i++;
          $out.= substr($s, 0, $i);
          $s = substr($s, $i);
        }
      }
    }
  }
  return $out.$s;
} 
?>

Maybe this function could be rewritten as a simple regular expression, but I had no luck doing that quickly.

Source: http://php.net/manual/zh/function.strip-tags.php
