
PHP: Removing Invalid UTF-8 Characters

Posted: 2015-09-15 14:17:23
// Reject overlong 2-byte sequences, as well as characters at or above U+10000, and replace with ?
$some_string = preg_replace('/[\x00-\x08\x10\x0B\x0C\x0E-\x19\x7F]'.
 '|[\x00-\x7F][\x80-\xBF]+'.
 '|([\xC0\xC1]|[\xF0-\xFF])[\x80-\xBF]*'.
 '|[\xC2-\xDF]((?![\x80-\xBF])|[\x80-\xBF]{2,})'.
 '|[\xE0-\xEF](([\x80-\xBF](?![\x80-\xBF]))|(?![\x80-\xBF]{2})|[\x80-\xBF]{3,})/S',
 '?', $some_string );

// Reject overlong 3-byte sequences and UTF-16 surrogates, and replace with ?
$some_string = preg_replace('/\xE0[\x80-\x9F][\x80-\xBF]'.
 '|\xED[\xA0-\xBF][\x80-\xBF]/S', '?', $some_string );
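
To make the behavior of the two passes easier to check, here is a minimal sketch that wraps them in a single function and runs it on a few hand-built byte strings. The function name `replace_invalid_utf8` and the exception on failure are assumptions for illustration and are not part of the original snippet; the patterns themselves are copied unchanged.

```php
<?php
// Hypothetical helper (name not from the original post): applies the two
// preg_replace passes in order and returns the cleaned string.
function replace_invalid_utf8(string $input): string
{
    // Pass 1: control bytes, stray continuation bytes, overlong 2-byte forms,
    // lead bytes 0xF0-0xFF, and malformed 2-byte and 3-byte sequences.
    $pass1 = '/[\x00-\x08\x10\x0B\x0C\x0E-\x19\x7F]' .
        '|[\x00-\x7F][\x80-\xBF]+' .
        '|([\xC0\xC1]|[\xF0-\xFF])[\x80-\xBF]*' .
        '|[\xC2-\xDF]((?![\x80-\xBF])|[\x80-\xBF]{2,})' .
        '|[\xE0-\xEF](([\x80-\xBF](?![\x80-\xBF]))|(?![\x80-\xBF]{2})|[\x80-\xBF]{3,})/S';

    // Pass 2: overlong 3-byte sequences and UTF-16 surrogates.
    $pass2 = '/\xE0[\x80-\x9F][\x80-\xBF]' .
        '|\xED[\xA0-\xBF][\x80-\xBF]/S';

    $output = preg_replace($pass1, '?', $input);
    if ($output !== null) {
        $output = preg_replace($pass2, '?', $output);
    }
    if ($output === null) {
        // preg_replace returns null on a PCRE error (assumed handling).
        throw new RuntimeException('preg_replace failed');
    }
    return $output;
}

// Valid 2-byte and 3-byte sequences pass through unchanged.
var_dump(replace_invalid_utf8("caf\xC3\xA9") === "caf\xC3\xA9");   // bool(true)
var_dump(replace_invalid_utf8("\xE2\x82\xAC") === "\xE2\x82\xAC"); // bool(true)

echo replace_invalid_utf8("x\xC0\xAF"), "\n";        // x?  (overlong 2-byte form)
echo replace_invalid_utf8("\xED\xA0\x80"), "\n";     // ?   (UTF-16 surrogate)
echo replace_invalid_utf8("\xF0\x9F\x98\x80"), "\n"; // ?   (4-byte sequence)
echo replace_invalid_utf8("a\x80b"), "\n";           // ?b  (stray continuation byte)
```

Two properties of the patterns are worth noting. The first pass replaces every sequence that starts with a byte in 0xF0-0xFF, so well-formed 4-byte characters such as emoji also become `?`. In addition, the `[\x00-\x7F][\x80-\xBF]+` alternative consumes the ASCII byte that precedes a stray continuation byte, which is why `"a\x80b"` becomes `"?b"` rather than `"a?b"`.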

 


Source: http://www.cnblogs.com/chenshuo/p/4809950.html
