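This scraper is a very simple program. In my view it is not suited to collecting large amounts of data, although a single page is no problem, because fopen performs poorly on remote files and does not work well for multi-threaded fetching. The author wrote it just for fun. The code is as follows: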
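- // Fetch the raw content of a URL; remote URLs require allow_url_fopen to be enabled.
- private function fetchbyurl($url){
-     $handle = fopen($url, 'r');
-     if ($handle === false) {
-         return '';
-     }
-     $content = '';
-     // fgets returns false at end of file or on error, which ends the loop.
-     while (($line = fgets($handle, 10000)) !== false){
-         $content .= $line;
-     }
-     fclose($handle);
-     return $content;
- }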
- // Convert GBK-encoded content to UTF-8.
- private function utf8_iconv($content){
-     return iconv('GBK', 'UTF-8', $content);
- }
- // Return every substring that sits between $start and $end.
- private function strCutAll($str,$start,$end){
-     $content = explode($start,$str);
-     $matchs = array();
-     $sum = count($content);
-     // Skip index 0: it holds the text before the first $start.
-     for( $i = 1;$i < $sum;$i++ ){
-         $tmp = explode($end,$content[$i]);
-         $matchs[] = $tmp[0];
-         unset($tmp);
-     }
-     return $matchs;
- }
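- // Return the first substring between $start and $end, or '' if either marker is missing.
- private function strCut($str, $start, $end){
-     $content = strstr( $str, $start );
-     if ($content === false) return '';
-     // Search for $end only after $start, so an $end that also occurs inside $start is not matched.
-     $stop = strpos( $content, $end, strlen( $start ) );
-     if ($stop === false) return '';
-     return substr( $content, strlen( $start ), $stop - strlen( $start ) );
- }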
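To make the behavior of the two cutting helpers concrete, here is a minimal standalone sketch. It is not the author's class: `cut_first` and `cut_all` are hypothetical free-function names that mirror `strCut` and `strCutAll`, and the `$html` string is made-up sample markup standing in for a fetched page, so the example runs without any network access.

```php
<?php
// Hypothetical standalone counterpart of strCut: the first substring
// between $start and $end, or '' when either marker is missing.
function cut_first(string $str, string $start, string $end): string
{
    $from = strpos($str, $start);
    if ($from === false) {
        return '';
    }
    $from += strlen($start);
    $to = strpos($str, $end, $from);
    if ($to === false) {
        return '';
    }
    return substr($str, $from, $to - $from);
}

// Hypothetical standalone counterpart of strCutAll: every substring
// that follows a $start marker, cut at the next $end marker.
function cut_all(string $str, string $start, string $end): array
{
    $parts = explode($start, $str);
    $matches = [];
    // Index 0 is the text before the first $start, so skip it.
    for ($i = 1, $n = count($parts); $i < $n; $i++) {
        $matches[] = explode($end, $parts[$i])[0];
    }
    return $matches;
}

// Made-up sample markup (an assumption, not taken from any real page).
$html = '<ul><li>first</li><li>second</li></ul><title>Demo</title>';

$title = cut_first($html, '<title>', '</title>');
$items = cut_all($html, '<li>', '</li>');

echo $title, "\n";               // Demo
echo implode(',', $items), "\n"; // first,second
```

Plain string splitting like this is fragile against markup changes, which fits the single-page, just-for-fun scope described above; a DOM parser would be the sturdier choice for anything larger.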
To test it, the example code is as follows:
- header("content-Type: text/html; charset=utf-8");