This is a PHP regular expression that matches every link in an article's content.
$str = "";
$reg = '/<a[\s\S]*?(href)\s*=\s*(?(?=["\'])((["\'])(?<href>[^"\']*))|(?<src>[^\s>]+))[\s\S]*?>/i';
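As a minimal sketch of how a pattern like this might be applied, the example below passes it to preg_match_all and collects the link targets. The helper name extract_links and the sample HTML are assumptions made for illustration. The pattern is wrapped in / delimiters with the i flag so that PHP's PCRE functions accept it.

```php
<?php
// Hypothetical helper (not from the original article): returns every link
// target found in $html, using the pattern above wrapped in / delimiters.
function extract_links(string $html): array
{
    $reg = '/<a[\s\S]*?(href)\s*=\s*(?(?=["\'])((["\'])(?<href>[^"\']*))|(?<src>[^\s>]+))[\s\S]*?>/i';
    preg_match_all($reg, $html, $matches, PREG_SET_ORDER);

    $links = [];
    foreach ($matches as $m) {
        // Quoted values land in the "href" group, unquoted ones in "src".
        if (isset($m['href']) && $m['href'] !== '') {
            $links[] = $m['href'];
        } elseif (isset($m['src']) && $m['src'] !== '') {
            $links[] = $m['src'];
        }
    }
    return $links;
}

// Sample input, invented for this example.
$str = '<p><a href="http://www.phpfensi.com/">home</a> <a class="x" href=/about.html>about</a></p>';
print_r(extract_links($str));
```

With the sample input, the first link is captured by the href group and the second, which has no quotes around its value, by the src group. A regular expression is a quick approximation here; for untrusted or irregular HTML a real parser such as DOMDocument is usually the safer choice.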
The next example uses a regular expression to extract the domain name from a piece of content. The code is as follows:
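function get_domain($url) {
    // First try: a label followed by a known TLD, optionally followed by .cn or .hk.
    $pattern = '/[\w-]+\.(com|net|org|gov|cc|biz|info|cn)(\.(cn|hk))*/';
    preg_match($pattern, $url, $matches);
    if (count($matches) > 0) {
        return $matches[0];
    } else {
        // Fallback: work from the host component of the URL.
        $rs = parse_url($url);
        $main_url = isset($rs["host"]) ? $rs["host"] : "";
        // An IPv4 address survives the ip2long()/long2ip() round trip unchanged.
        if (!strcmp(long2ip(sprintf("%u", ip2long($main_url))), $main_url)) {
            return $main_url;
        } else {
            $arr = explode(".", $main_url);
            $count = count($arr);
            if ($count < 2) {
                // No dot in the host, so there is nothing to trim.
                return $main_url;
            }
            // Keep three labels when the second-to-last one is a generic label such as com or net.
            $endArr = array("com", "net", "org", "3322");
            if ($count >= 3 && in_array($arr[$count-2], $endArr)) {
                $domain = $arr[$count-3] . "." . $arr[$count-2] . "." . $arr[$count-1];
            } else {
                $domain = $arr[$count-2] . "." . $arr[$count-1];
            }
            return $domain;
        }
    }
}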
Usage example:
$str = "jfkdlajfdafdjak;www.phpfensi.com";
echo get_domain($str);
The result is phpfensi.com
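To make the regular-expression branch of get_domain easier to reason about, the sketch below runs the same pattern on its own against a few inputs. The helper name match_domain and the sample URLs are assumptions for illustration. A null result marks an input that get_domain would hand to its parse_url fallback.

```php
<?php
// Hypothetical helper: applies only the regular-expression branch of get_domain.
function match_domain(string $url): ?string
{
    $pattern = '/[\w-]+\.(com|net|org|gov|cc|biz|info|cn)(\.(cn|hk))*/';
    return preg_match($pattern, $url, $m) === 1 ? $m[0] : null;
}

// Sample inputs, invented for this example.
$samples = [
    'jfkdlajfdafdjak;www.phpfensi.com',     // phpfensi.com
    'http://www.example.com.cn/index.html', // example.com.cn
    'http://192.168.1.1/admin',             // null: no listed TLD, falls through
    'http://www.community.org/',            // www.com: the TLD list is not anchored
];
foreach ($samples as $s) {
    var_dump(match_domain($s));
}
```

The last sample shows a limitation worth knowing about: because nothing requires the TLD to end at a label boundary, the "com" at the start of "community" is accepted as a TLD. One possible remedy, if that matters for your data, is to follow the TLD group with a negative lookahead such as (?![\w-]).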