Chapter 39. Using remote files

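As long as the allow_url_fopen option is enabled in php.ini, you can use HTTP and FTP URLs with most functions that take a filename as a parameter. URLs can also be used with the include(), include_once(), require() and require_once() statements. See Appendix L for more information about the protocols supported by PHP.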
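A script that depends on remote file access can check the setting at runtime before it tries to open a URL. The following is a minimal sketch: the helper name url_fopen_enabled() is made up for illustration, and it assumes a PHP build that includes the filter extension.

```php
<?php
// Hypothetical helper: report whether allow_url_fopen is enabled.
function url_fopen_enabled()
{
    // ini_get() returns the raw setting as a string ("1", "", "On", ...),
    // or false if the option is unknown, so normalise it to a boolean.
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

if (!url_fopen_enabled()) {
    echo "allow_url_fopen is disabled; remote URLs cannot be opened.\n";
}
```

Checking up front gives a clearer message than the warning that fopen() raises when URL wrappers are switched off.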

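Note: In PHP 4.0.3 and earlier, URL wrappers are only available if PHP was configured at compile time with the --enable-url-fopen-wrapper option.

Note: Windows versions of PHP earlier than PHP 4.3 do not support remote file access for the following functions: include(), include_once(), require(), require_once(), and the imagecreatefromXXX functions described in Reference LV, Image Functions.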

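For example, you can use this to open a file on a remote web server, parse the output for the data you want, and then use that data in a database query, or simply output it in a style matching the rest of your website.

Example 39-1. Getting the title of a remote page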

<?php
$file = fopen("http://www.example.com/", "r");
if (!$file) {
    echo "<p>Unable to open remote file.\n";
    exit;
}
while (!feof($file)) {
    $line = fgets($file, 1024);
    /* This only works if the title and its tags are on one line */
    /* preg_match() with the i flag is used because eregi() was removed in PHP 7 */
    if (preg_match("#<title>(.*)</title>#i", $line, $out)) {
        $title = $out[1];
        break;
    }
}
fclose($file);
?>
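The line-by-line loop misses a title that is split across lines. One way around that, sketched here under the assumption that the page is small enough to hold in memory, is to read the whole document into a string and match with the s flag. The helper name extract_title() is hypothetical, and the sample HTML stands in for a real response.

```php
<?php
// Hypothetical helper: return the contents of the first <title> element, or null.
function extract_title($html)
{
    // i: case-insensitive tag names; s: let "." match newlines;
    // [^>]* tolerates attributes; .*? stops at the first closing tag.
    if (preg_match('#<title[^>]*>(.*?)</title>#is', $html, $out)) {
        return trim($out[1]);
    }
    return null;
}

// Sample input with the title spread over several lines.
$html = "<html><head><TITLE>\n  Example Domain\n</TITLE></head></html>";
echo extract_title($html), "\n"; // prints "Example Domain"
```

With allow_url_fopen enabled, the same helper could be fed the result of file_get_contents("http://www.example.com/").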

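You can also write to files on an FTP server, provided that you have connected as a user with the correct access rights. You can only create new files this way; if you try to overwrite a file that already exists, the fopen() call will fail.

To connect as a user other than "anonymous", you need to specify the username (and possibly the password) within the URL, such as "ftp://user:password@ftp.example.com/path/to/file". (You can use the same syntax to access files over HTTP when they require Basic authentication.)

Example 39-2. Storing data on a remote server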
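Usernames and passwords that contain characters such as "@" or ":" would be ambiguous inside a URL. A minimal sketch of building such a URL safely follows; the helper name build_auth_url() and the sample credentials are invented for illustration, and it assumes the wrapper URL-decodes the user and password parts, which is worth verifying for the PHP version in use.

```php
<?php
// Hypothetical helper: build a URL with embedded credentials.
function build_auth_url($scheme, $user, $password, $host, $path)
{
    // rawurlencode() escapes characters such as "@", ":" and "/" that
    // would otherwise be read as URL delimiters.
    return $scheme . '://' . rawurlencode($user) . ':' . rawurlencode($password)
         . '@' . $host . $path;
}

echo build_auth_url('ftp', 'user', 'p@ss:word', 'ftp.example.com', '/path/to/file'), "\n";
// prints "ftp://user:p%40ss%3Aword@ftp.example.com/path/to/file"
```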

<?php
$file = fopen("ftp://ftp.example.com/incoming/outputfile", "w");
if (!$file) {
    echo "<p>Unable to open remote file for writing.\n";
    exit;
}
/* Write the data here. */
fwrite($file, $_SERVER['HTTP_USER_AGENT'] . "\n");
fclose($file);
?>

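Note: You might get the idea from the example above that you can use this technique to write to a remote log file. As mentioned above, however, a URL opened with fopen() can only be written to as a new file; if the remote file already exists, the fopen() call will fail. To do something like distributed logging, take a look at syslog().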
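As a rough illustration of the syslog() route, the sketch below logs the same user-agent value as Example 39-2. The identifier "remote-file-demo" and the helper format_log_line() are assumptions made for this example; whether the message ends up on a remote host depends on how the system's syslog daemon is configured, not on PHP.

```php
<?php
// Hypothetical helper: turn a user-agent string into a single log line.
function format_log_line($userAgent)
{
    // Collapse CR/LF so that a crafted header cannot inject extra log lines.
    return 'User agent: ' . preg_replace('/[\r\n]+/', ' ', $userAgent);
}

$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : 'unknown';

// "remote-file-demo" is an arbitrary identifier chosen for this example.
openlog('remote-file-demo', LOG_PID, LOG_USER);
syslog(LOG_INFO, format_log_line($agent));
closelog();
```

Unlike the FTP approach, each call appends a new entry, so there is no "file already exists" failure to work around.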


User Contributed Notes
geoffrey at nevra dot net
07-May-2006 06:53
Really, you should not send headers terminated by \n; per the RFC, an HTTP server is not required to support that.

Instead, send \r\n, which is what the protocol specifies. Also, that regular expression would match anywhere in the response, so match something like /^Content-Length: \d+$/i against each header line instead. (Header lines are separated by the regular expression /(\r\n|[\r\n])/, so preg_split on that. Remember to use the appropriate flags, I can't be arsed to look them up.)
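The per-line matching described in this note could look roughly like the sketch below. The function name find_content_length() and the sample header block are made up for illustration; PREG_SPLIT_NO_EMPTY is one reasonable choice of flag, not necessarily the one the author had in mind.

```php
<?php
// Hypothetical helper: find the Content-Length value in a raw header block.
function find_content_length($rawHeaders)
{
    // Split on \r\n first, falling back to a bare \r or \n.
    $lines = preg_split('/\r\n|[\r\n]/', $rawHeaders, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($lines as $line) {
        // Anchored at both ends, so the header name cannot match mid-line.
        if (preg_match('/^Content-Length:\s*(\d+)\s*$/i', $line, $m)) {
            return (int) $m[1];
        }
    }
    return null;
}

// The X-Note line contains "Content-Length:" but must not be picked up.
$headers = "HTTP/1.0 200 OK\r\nX-Note: Content-Length: 1\r\ncontent-length: 1270\r\n";
var_dump(find_content_length($headers)); // int(1270)
```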
heck at fas dot harvard dot edu
14-Sep-2004 03:06
The previous post is part right, part wrong. It's part right because it's true that the php script will run on the remote server, if it's capable of interpreting php scripts. You can see this by creating this script on a remote machine:
<?php
echo system("hostname");
?>
Then include that in a php file on your local machine. When you view it in a browser, you'll see the hostname of the remote machine.

However, that does not mean there are no security worries here. Just try replacing the previous script with this one:
<?php
echo "<?php system(\"hostname\"); ?>";
?>
I'm guessing you can figure out what that's gonna do.

So yes, remote includes can be a major security problem.
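One common way to keep user input from ever reaching include() as a URL is to look the requested name up in a fixed allowlist. This is only a sketch under assumed names: resolve_page(), the page keys and the file paths are all hypothetical. Note also that PHP 5.2 and later gate URL includes behind a separate allow_url_include setting, which is off by default.

```php
<?php
// Hypothetical helper: map a requested page name to a local file, or null.
function resolve_page($page, array $allowed)
{
    // Only names present in the allowlist are accepted, so a URL or a
    // path such as "../../etc/passwd" can never reach include().
    return isset($allowed[$page]) ? $allowed[$page] : null;
}

// Assumed layout: page names mapped to files shipped with the application.
$allowed = array(
    'home'  => 'pages/home.php',
    'about' => 'pages/about.php',
);

var_dump(resolve_page('about', $allowed));
var_dump(resolve_page('http://evil.example/x.php', $allowed));
```

A caller would then include the returned path only when the result is not null.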
geoffrey at nevra dot net
05-Aug-2003 08:25
OK, here is the story:

I was trying to download remote images, finding URLs through Apache indexes with regexps and fopen()ing them to get the data. It didn't work. I thought about binary considerations. Putting the 'b' in the second argument of fopen() didn't help much; my browser still didn't want to display the images. I finally understood by watching the data I was getting from the remote host: it was an HTML page! Hey, I didn't know Apache sent HTML pages when requesting images, did you?
The right way is then to send an HTTP request via fsockopen(). Here comes my second problem: getting rid of the headers with explode("\n\n", $buffer);. The right way is to get the value of the Content-Length field and use it in substr($buffer, -$content_length);

Finally, here is my own function to download these files:

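<?php
function http_get($url)
{
    $url_stuff = parse_url($url);
    $port = isset($url_stuff['port']) ? $url_stuff['port'] : 80;
    // parse_url() omits 'path' for URLs such as http://www.example.com
    $path = isset($url_stuff['path']) ? $url_stuff['path'] : '/';

    $fp = fsockopen($url_stuff['host'], $port);
    if (!$fp) {
        return false;
    }

    // HTTP header lines end with \r\n; an empty line closes the header block
    $query  = 'GET ' . $path . " HTTP/1.0\r\n";
    $query .= 'Host: ' . $url_stuff['host'] . "\r\n";
    $query .= "\r\n";

    fwrite($fp, $query);

    // Read the whole response (headers and body) into one buffer
    $buffer = '';
    while (!feof($fp)) {
        $buffer .= fread($fp, 1024);
    }
    fclose($fp);

    if (!preg_match('/Content-Length: ([0-9]+)/', $buffer, $parts)) {
        return false;
    }
    // The body is the last Content-Length bytes; substr($buffer, -0)
    // would return the whole buffer, so treat a zero length separately
    $length = (int) $parts[1];
    return $length > 0 ? substr($buffer, -$length) : '';
}
?>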

Oh, maybe you'll say I could have parsed the page to get rid of the HTML stuff, but I wanted to experience HTTP a little ;)
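The function above strips the headers by counting Content-Length bytes back from the end of the buffer. An alternative, sketched here and not the note author's method, is to split at the first empty line, which also works when the server sends no Content-Length header. The helper name split_http_response() and the sample response are made up for illustration, and the sketch assumes the server uses \r\n line endings.

```php
<?php
// Hypothetical helper: split a raw HTTP response into headers and body.
function split_http_response($raw)
{
    // The limit of 2 keeps any later blank lines inside the body intact.
    $parts = explode("\r\n\r\n", $raw, 2);
    if (count($parts) < 2) {
        return null; // no header/body separator found
    }
    return array('headers' => $parts[0], 'body' => $parts[1]);
}

// Sample response whose body itself contains a blank line.
$raw = "HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nline one\r\n\r\nline two";
$response = split_http_response($raw);
echo $response['body'], "\n";
```

Because explode() is binary-safe, the same split works for image data as well as text.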