PHP string truncation: a collection of snippets

1. A Chinese-character truncation function that supports both UTF-8 and GB2312

  


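<?php
/*
 * Chinese-character truncation for both UTF-8 and GB2312 strings.
 * cut_str(string, length to keep, start offset, encoding);
 * The encoding defaults to UTF-8.
 * The start offset defaults to 0.
 */
function cut_str($string, $sublen, $start = 0, $code = 'UTF-8')
{
    if ($code == 'UTF-8') {
        // One alternative per UTF-8 sequence length (1 to 4 bytes)
        $pa = "/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|\xe0[\xa0-\xbf][\x80-\xbf]|[\xe1-\xef][\x80-\xbf][\x80-\xbf]|\xf0[\x90-\xbf][\x80-\xbf][\x80-\xbf]|[\xf1-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]/";
        preg_match_all($pa, $string, $t_string);
        // In this branch $start and $sublen count characters
        if (count($t_string[0]) - $start > $sublen) return join('', array_slice($t_string[0], $start, $sublen))."…";
        return join('', array_slice($t_string[0], $start, $sublen));
    } else {
        // In this branch $start and $sublen count double-byte units, so convert them to bytes
        $start = $start * 2;
        $sublen = $sublen * 2;
        $strlen = strlen($string);
        $tmpstr = '';
        for ($i = 0; $i < $strlen; $i++) {
            if ($i >= $start && $i < ($start + $sublen)) {
                if (ord(substr($string, $i, 1)) > 129) {
                    // Lead byte of a double-byte character: copy both bytes
                    $tmpstr .= substr($string, $i, 2);
                } else {
                    $tmpstr .= substr($string, $i, 1);
                }
            }
            // Skip the trail byte of a double-byte character
            if (ord(substr($string, $i, 1)) > 129) $i++;
        }
        if (strlen($tmpstr) < $strlen) $tmpstr .= "…";
        return $tmpstr;
    }
}

$str = "abcd需要截取的字符串";
// The 'gb2312' branch expects GB2312 bytes, so this file must be saved in that encoding
echo cut_str($str, 8, 0, 'gb2312');
?>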
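Where the mbstring extension is available, the same contract can be sketched with `mb_substr()` and `mb_strlen()`. The name `cut_str_mb` is hypothetical and not part of the original listing. Note that this sketch counts characters in every encoding, whereas the GB2312 branch of `cut_str()` above doubles `$start` and `$sublen` into byte offsets. It also appends an ASCII `...` so the suffix is valid in either encoding.

```php
<?php
// Hypothetical helper (an assumption, not from the original article).
// Requires the mbstring extension; $code must be an encoding name mbstring recognizes.
function cut_str_mb(string $string, int $sublen, int $start = 0, string $code = 'UTF-8'): string
{
    // mb_substr() works in characters, so no byte arithmetic is needed.
    $cut = mb_substr($string, $start, $sublen, $code);

    // Append a suffix only when characters remain after the slice.
    if (mb_strlen($string, $code) - $start > $sublen) {
        $cut .= '...';
    }
    return $cut;
}

// This file is assumed to be saved as UTF-8.
echo cut_str_mb('abcd需要截取的字符串', 8), "\n"; // abcd需要截取...
echo cut_str_mb('abcd', 8), "\n";                 // abcd
```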

2. Truncating a UTF-8 multibyte string

  


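<?php
// Truncate a UTF-8 string: skip $from characters, then keep up to $len characters
function utf8Substr($str, $from, $len)
{
    // Cast to int so that only numbers are interpolated into the pattern
    $from = (int) $from;
    $len = (int) $len;
    // A character is one ASCII byte, or a lead byte followed by continuation bytes
    return preg_replace('#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$from.'}'.
        '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$len.'}).*#s',
        '$1', $str);
}
?>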
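A short usage sketch may help show how the two counted groups in the pattern behave: the first group skips `$from` characters, the second captures up to `$len` characters, and the trailing `.*` discards the rest. The function body is copied from the listing above so that the sketch runs on its own; the sample string is an assumption chosen for illustration.

```php
<?php
// Copy of utf8Substr() from the listing above, repeated so this sketch is self-contained.
function utf8Substr($str, $from, $len)
{
    return preg_replace('#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$from.'}'.
        '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$len.'}).*#s',
        '$1', $str);
}

// Assumed sample: 3 ASCII characters followed by 5 Chinese characters (file saved as UTF-8).
$s = 'PHP字符串截取';

echo utf8Substr($s, 0, 5), "\n";  // PHP字符
echo utf8Substr($s, 3, 3), "\n";  // 字符串
echo utf8Substr($s, 6, 10), "\n"; // 截取 (fewer than 10 characters remain)
```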

3. Truncating a GB2312 Chinese string

  


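<?php
// Truncate a Chinese (GB2312) string; $start and $len are byte offsets
function mysubstr($str, $start, $len) {
    $tmpstr = "";
    // Never read past the end of the string
    $end = min($start + $len, strlen($str));
    // Walk from byte 0 so that double-byte characters stay aligned
    for ($i = 0; $i < $end; $i++) {
        if (ord(substr($str, $i, 1)) > 0xa0) {
            // Lead byte of a double-byte character: take both bytes
            if ($i >= $start) $tmpstr .= substr($str, $i, 2);
            $i++;
        } else {
            if ($i >= $start) $tmpstr .= substr($str, $i, 1);
        }
    }
    return $tmpstr;
}
?>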
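The reason for testing each byte against `0xa0` is that plain `substr()` counts bytes and can stop in the middle of a double-byte character. The following is a minimal sketch of that problem and of one way to avoid it. The helper `gb_safe_cut` is hypothetical (it is not the `mysubstr()` above) and, as an assumption of this sketch, it never exceeds the byte budget. The GB2312 bytes are written as hex escapes so the file encoding does not matter.

```php
<?php
// "中文ab" encoded as GB2312: 中 = D6 D0, 文 = CE C4.
$gb = "\xD6\xD0\xCE\xC4ab";

// Plain substr(): 3 bytes end in the middle of the second character.
echo bin2hex(substr($gb, 0, 3)), "\n"; // d6d0ce

// Hypothetical helper: keep whole characters while staying within $maxBytes.
function gb_safe_cut(string $s, int $maxBytes): string
{
    $out = '';
    $n = strlen($s);
    for ($i = 0; $i < $n; ) {
        // A byte above 0xA0 starts a double-byte character.
        $width = (ord($s[$i]) > 0xA0 && $i + 1 < $n) ? 2 : 1;
        if (strlen($out) + $width > $maxBytes) {
            break;
        }
        $out .= substr($s, $i, $width);
        $i += $width;
    }
    return $out;
}

echo bin2hex(gb_safe_cut($gb, 3)), "\n"; // d6d0
echo bin2hex(gb_safe_cut($gb, 5)), "\n"; // d6d0cec461
```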

4. BugFree's string truncation function

  


<?php
/**
 * @package BugFree
 * @version $Id: FunctionsMain.inc.php,v 1.32 2005/09/24 11:38:37 wwccss Exp $
 *
 * Return part of a string (enhances the function substr()).
 *
 * @param string $String the string to cut.
 * @param int    $Length the approximate length of the returned string, in bytes.
 * @param bool   $Append whether to append "…": false|true
 * @return string the cut string.
 */
function sysSubStr($String, $Length, $Append = false)
{
    if (strlen($String) <= $Length) {
        return $String;
    } else {
        $StringLast = array();
        $I = 0;
        while ($I < $Length) {
            $StringTMP = substr($String, $I, 1);
            if (ord($StringTMP) >= 224) {
                // Lead byte of a 3-byte UTF-8 character (4-byte characters are not handled)
                $StringTMP = substr($String, $I, 3);
                $I = $I + 3;
            } elseif (ord($StringTMP) >= 192) {
                // Lead byte of a 2-byte UTF-8 character
                $StringTMP = substr($String, $I, 2);
                $I = $I + 2;
            } else {
                $I = $I + 1;
            }
            $StringLast[] = $StringTMP;
        }
        $StringLast = implode("", $StringLast);
        if ($Append) {
            $StringLast .= "…";
        }
        return $StringLast;
    }
}

$String = "http://www.glzy8.com — 简单、精彩、通用";
$Length = 18;
$Append = false;
echo sysSubStr($String, $Length, $Append);
?>
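The thresholds 224 and 192 in `sysSubStr()` come from the way a UTF-8 lead byte encodes the length of its sequence. The sketch below isolates that mapping. `utf8_seq_len` is a hypothetical helper, not part of BugFree, and unlike the listing above it also recognizes 4-byte sequences. The sample characters assume the file is saved as UTF-8.

```php
<?php
// Hypothetical helper: byte length of the UTF-8 sequence that starts with $leadByte.
function utf8_seq_len(int $leadByte): int
{
    if ($leadByte >= 0xF0) return 4; // 240 and above: 4-byte sequence
    if ($leadByte >= 0xE0) return 3; // 224 to 239: 3-byte sequence
    if ($leadByte >= 0xC0) return 2; // 192 to 223: 2-byte sequence
    return 1;                        // ASCII; continuation bytes are not expected here
}

foreach (['a', 'é', '简', '😀'] as $ch) {
    $lead = ord($ch[0]);
    printf("%s: lead byte %d, %d byte(s)\n", $ch, $lead, utf8_seq_len($lead));
}
// a: lead byte 97, 1 byte(s)
// é: lead byte 195, 2 byte(s)
// 简: lead byte 231, 3 byte(s)
// 😀: lead byte 240, 4 byte(s)
```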

5. Truncation code from dedecms

This code is taken directly from dedecms; adapt it as needed.

  


// Chinese truncation 2, byte-counting mode
// This function must be used for request content
function cn_substrR($str, $slen, $startdd = 0)
{
    $str = cn_substr(stripslashes($str), $slen, $startdd);
    return addslashes($str);
}

// Chinese truncation 2, byte-counting mode
function cn_substr($str, $slen, $startdd = 0)
{
    global $cfg_soft_lang;
    if ($cfg_soft_lang == 'utf-8') {
        return cn_substr_utf8($str, $slen, $startdd);
    }
    $restr = '';
    $c = '';
    $str_len = strlen($str);
    if ($str_len < $startdd + 1) {
        return '';
    }
    if ($str_len < $startdd + $slen || $slen == 0) {
        $slen = $str_len - $startdd;
    }
    $enddd = $startdd + $slen - 1;
    for ($i = 0; $i < $str_len; $i++) {
        if ($startdd == 0) {
            $restr .= $c;
        } else if ($i > $startdd) {
            $restr .= $c;
        }
        if (ord($str[$i]) > 0x80) {
            // Double-byte character: read both bytes
            if ($str_len > $i + 1) {
                $c = $str[$i].$str[$i + 1];
            }
            $i++;
        } else {
            $c = $str[$i];
        }
        if ($i >= $enddd) {
            if (strlen($restr) + strlen($c) > $slen) {
                break;
            } else {
                $restr .= $c;
                break;
            }
        }
    }
    return $restr;
}

// UTF-8 Chinese truncation, byte-counting mode
function cn_substr_utf8($str, $length, $start = 0)
{
    if (strlen($str) < $start + 1) {
        return '';
    }
    // Split the string into whole UTF-8 characters
    preg_match_all("/./su", $str, $ar);
    $str = '';
    $tstr = '';
    // For compatibility with MySQL versions below 4.1 and to match the database's varchar, truncation here is by bytes
    for ($i = 0; isset($ar[0][$i]); $i++) {
        if (strlen($tstr) < $start) {
            $tstr .= $ar[0][$i];
        } else {
            if (strlen($str) < $length + strlen($ar[0][$i])) {
                $str .= $ar[0][$i];
            } else {
                break;
            }
        }
    }
    return $str;
}
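`cn_substr_utf8()` relies on `preg_match_all("/./su", ...)` to split the input into whole characters before it counts bytes. A minimal sketch of what that call produces is shown here; the sample string is an assumption for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
$sample = 'a中b';

// With the "u" modifier "." matches one whole UTF-8 character;
// the "s" modifier lets it match newlines as well.
preg_match_all('/./su', $sample, $ar);

foreach ($ar[0] as $ch) {
    echo $ch, ' => ', strlen($ch), " byte(s)\n";
}
// a => 1 byte(s)
// 中 => 3 byte(s)
// b => 1 byte(s)
```

Because each array element is a complete character, appending elements one at a time can never split a multibyte sequence, even though the length checks are done with `strlen()`.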

6. String truncation code from phpcms:

  


function str_cut($string, $length, $dot = '...')
{
    $strlen = strlen($string);
    if ($strlen <= $length) return $string;
    // Decode common HTML entities before counting
    $string = str_replace(array('&nbsp;', '&amp;', '&quot;', '&#039;', '&ldquo;', '&rdquo;', '&mdash;', '&lt;', '&gt;', '&middot;', '&hellip;'), array(' ', '&', '"', "'", '“', '”', '—', '<', '>', '·', '…'), $string);
    // The decoded string may be shorter, so measure it again
    $strlen = strlen($string);
    $strcut = '';
    if (strtolower(CHARSET) == 'utf-8') {
        // $n: byte offset, $tn: byte length of the last character, $noc: display width so far
        $n = $tn = $noc = 0;
        while ($n < $strlen) {
            $t = ord($string[$n]);
            if ($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
                $tn = 1; $n++; $noc++;
            } elseif (194 <= $t && $t <= 223) {
                $tn = 2; $n += 2; $noc += 2;
            } elseif (224 <= $t && $t <= 239) {
                $tn = 3; $n += 3; $noc += 2;
            } elseif (240 <= $t && $t <= 247) {
                $tn = 4; $n += 4; $noc += 2;
            } elseif (248 <= $t && $t <= 251) {
                $tn = 5; $n += 5; $noc += 2;
            } elseif ($t == 252 || $t == 253) {
                $tn = 6; $n += 6; $noc += 2;
            } else {
                $n++;
            }
            if ($noc >= $length) break;
        }
        // Drop the last character if it pushed the width past the limit
        if ($noc > $length) $n -= $tn;
        $strcut = substr($string, 0, $n);
    } else {
        $dotlen = strlen($dot);
        $maxi = $length - $dotlen - 1;
        for ($i = 0; $i < $maxi; $i++) {
            $strcut .= ord($string[$i]) > 127 ? $string[$i].$string[++$i] : $string[$i];
        }
    }
    // Re-encode the characters that are special in HTML
    $strcut = str_replace(array('&', '"', "'", '<', '>'), array('&amp;', '&quot;', '&#039;', '&lt;', '&gt;'), $strcut);
    return $strcut.$dot;
}
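`str_cut()` wraps the truncation between two `str_replace()` calls that decode and then re-encode HTML entities. The sketch below illustrates why that ordering matters. It uses PHP's built-in `htmlspecialchars_decode()` and `htmlspecialchars()` in place of the replacement tables, which is a substitution made for brevity, and the sample string is an assumption.

```php
<?php
$raw = 'A &amp; B'; // assumed sample: HTML-escaped text

// Cutting the escaped text by bytes can stop inside an entity.
echo substr($raw, 0, 4), "\n"; // A &a

// Decode first, cut the plain text, then encode again.
$decoded = htmlspecialchars_decode($raw, ENT_QUOTES); // "A & B"
$safe = htmlspecialchars(substr($decoded, 0, 4), ENT_QUOTES);
echo $safe, "\n"; // prints "A &amp; " (with a trailing space)
```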