字符串

string 是一系列字符。在 PHP 中,字符和字节一样,也就是说,一共有 256 种不同字符的可能性。这也暗示 PHP 对 Unicode 没有本地支持。请参阅函数 utf8_encode()utf8_decode() 以了解有关 Unicode 支持。

注: 一个字符串变得非常巨大也没有问题,PHP 没有给字符串的大小强加实现范围,所以完全没有理由担心长字符串。

语法

字符串可以用三种字面上的方法定义。

单引号

指定一个简单字符串的最简单的方法是用单引号(字符 ')括起来。

要表示一个单引号,需要用反斜线(\)转义,和很多其它语言一样。如果在单引号之前或字符串结尾需要出现一个反斜线,需要用两个反斜线表示。注意如果试图转义任何其它字符,反斜线本身也会被显示出来!所以通常不需要转义反斜线本身。

注: 在 PHP 3 中,此情况下将发出一个 E_NOTICE 级的警告。

注: 和其他两种语法不同,单引号字符串中出现的变量和转义序列不会被变量的值替代。

<?php
echo 'this is a simple string';

echo
'You can also have embedded newlines in
strings this way as it is
okay to do'
;

// Outputs: Arnold once said: "I'll be back"
echo 'Arnold once said: "I\'ll be back"';

// Outputs: You deleted C:\*.*?
echo 'You deleted C:\\*.*?';

// Outputs: You deleted C:\*.*?
echo 'You deleted C:\*.*?';

// Outputs: This will not expand: \n a newline
echo 'This will not expand: \n a newline';

// Outputs: Variables do not $expand $either
echo 'Variables do not $expand $either';
?>

双引号

如果用双引号(")括起字符串,PHP 懂得更多特殊字符的转义序列:

表格 11-1. 转义字符

序列含义
\n换行(LF 或 ASCII 字符 0x0A(10))
\r回车(CR 或 ASCII 字符 0x0D(13))
\t水平制表符(HT 或 ASCII 字符 0x09(9))
\\反斜线
\$美元符号
\"双引号
\[0-7]{1,3} 此正则表达式序列匹配一个用八进制符号表示的字符
\x[0-9A-Fa-f]{1,2} 此正则表达式序列匹配一个用十六进制符号表示的字符

此外,如果试图转义任何其它字符,反斜线本身也会被显示出来!

双引号字符串最重要的一点是其中的变量名会被变量值替代。细节参见字符串解析

定界符

另一种给字符串定界的方法使用定界符语法(“<<<”)。应该在 <<< 之后提供一个标识符,然后是字符串,然后是同样的标识符结束字符串。

结束标识符必须从行的第一列开始。同样,标识符也必须遵循 PHP 中其它任何标签的命名规则:只能包含字母数字下划线,而且必须以下划线或非数字字符开始。

警告

很重要的一点必须指出,结束标识符所在的行不能包含任何其它字符,可能除了一个分号(;)之外。这尤其意味着该标识符不能被缩进,而且在分号之前和之后都不能有任何空格或制表符。同样重要的是要意识到在结束标识符之前的第一个字符必须是你的操作系统中定义的换行符。例如在 Macintosh 系统中是 \r

如果破坏了这条规则使得结束标识符不“干净”,则它不会被视为结束标识符,PHP 将继续寻找下去。如果在这种情况下找不到合适的结束标识符,将会导致一个在脚本最后一行出现的语法错误。

不能用定界符语法初始化类成员。用其它字符串语法替代。

例子 11-3. 非法的例子

<?php
class foo {
    
public $bar = <<<EOT
bar
EOT;
}
?>

定界符文本表现的就和双引号字符串一样,只是没有双引号。这意味着在定界符文本中不需要转义引号,不过仍然可以用以上列出来的转义代码。变量会被展开,但当在定界符文本中表达复杂变量时和字符串一样同样也要注意。

例子 11-4. 定界符字符串例子

<?php
$str
= <<<EOD
Example of string
spanning multiple lines
using heredoc syntax.
EOD;

/* More complex example, with variables. */
class foo
{
    var
$foo;
    var
$bar;

    function
foo()
    {
        
$this->foo = 'Foo';
        
$this->bar = array('Bar1', 'Bar2', 'Bar3');
    }
}

$foo = new foo();
$name = 'MyName';

echo <<<EOT
My name is "$name". I am printing some $foo->foo.
Now, I am printing some
{$foo->bar[1]}.
This should print a capital 'A':
\x41
EOT;
?>

注: 定界符支持是 PHP 4 中加入的。

变量解析

当用双引号或者定界符指定字符串时,其中的变量会被解析。

有两种语法,一种简单的和一种复杂的。简单语法最通用和方便,它提供了解析变量,数组值,或者对象属性的方法。

复杂语法是 PHP 4 引进的,可以用花括号括起一个表达式。

简单语法

如果遇到美元符号($),解析器会尽可能多地取得后面的字符以组成一个合法的变量名。如果想明示指定名字的结束,用花括号把变量名括起来。

<?php
$beer
= 'Heineken';
echo
"$beer's taste is great"; // works, "'" is an invalid character for varnames
echo "He drank some $beers";   // won't work, 's' is a valid character for varnames
echo "He drank some ${beer}s"; // works
echo "He drank some {$beer}s"; // works
?>

同样也可以解析数组索引或者对象属性。对于数组索引,右方括号(])标志着索引的结束。对象属性则和简单变量适用同样的规则,尽管对于对象属性没有像变量那样的小技巧。

<?php
// These examples are specific to using arrays inside of strings.
// When outside of a string, always quote your array string keys
// and do not use {braces} when outside of strings either.

// Let's show all errors
error_reporting(E_ALL);

$fruits = array('strawberry' => 'red', 'banana' => 'yellow');
// Works but note that this works differently outside string-quotes
echo "A banana is $fruits[banana].";
// Works
echo "A banana is {$fruits['banana']}.";

// Works but PHP looks for a constant named banana first
// as described below.
echo "A banana is {$fruits[banana]}.";

// Won't work, use braces.  This results in a parse error.
echo "A banana is $fruits['banana'].";

// Works
echo "A banana is " . $fruits['banana'] . ".";

// Works

echo "This square is $square->width meters broad.";
// Won't work. For a solution, see the complex syntax.
echo "This square is $square->width00 centimeters broad.";
?>

对于任何更复杂的情况,应该使用复杂语法。

复杂(花括号)语法

不是因为语法复杂而称其为复杂,而是因为用此方法可以包含复杂的表达式。

事实上,用此语法可以在字符串中包含任何在名字空间的值。仅仅用和在字符串之外同样的方法写一个表达式,然后用 { 和 } 把它包含进来。因为不能转义“{”,此语法仅在 $ 紧跟在 { 后面时被识别(用“{\$”或者“\{$”来得到一个字面上的“{$”)。用一些例子可以更清晰:

<?php
// Let's show all errors
error_reporting(E_ALL);

$great = 'fantastic';

// 不行,输出为:This is { fantastic}
echo "This is { $great}";

// 可以,输出为:This is fantastic
echo "This is {$great}";
echo
"This is ${great}";

// Works
echo "This square is {$square->width}00 centimeters broad.";

// Works
echo "This works: {$arr[4][3]}";
// This is wrong for the same reason as $foo[bar] is wrong
// outside a string.  In otherwords, it will still work but
// because PHP first looks for a constant named foo, it will
// throw an error of level E_NOTICE (undefined constant).
echo "This is wrong: {$arr[foo][3]}";
// Works.  When using multi-dimensional arrays, always use
// braces around arrays when inside of strings
echo "This works: {$arr['foo'][3]}";

// Works.
echo "This works: " . $arr['foo'][3];

echo
"You can even write {$obj->values[3]->name}";

echo
"This is the value of the var named $name: {${$name}}";
?>

访问和修改字符串中的字符

字符串中的字符可以通过在字符串之后用花括号指定所要字符从零开始的偏移量来访问和修改。

注: 为了向下兼容,仍然可以用方括号。不过此语法自 PHP 4 起已过时。

例子 11-5. 一些字符串例子

<?php
// Get the first character of a string
$str = 'This is a test.';
$first = $str{0};
// Get the third character of a string
$third = $str{2};

// Get the last character of a string.
$str = 'This is still a test.';
$last = $str{strlen($str)-1};

// Modify the last character of a string
$str = 'Look at the sea';
$str{strlen($str)-1} = 'e';

?>

实用函数及操作符

字符串可以用“.”(点)运算符连接。注意这里不能用“+”(加)运算符。更多信息参见字符串运算符

有很多实用函数来改变字符串。

普通函数见字符串函数一节,高级搜索和替换见正则表达式函数(两种风格:PerlPOSIX 扩展)。

还有 URL 字符串函数,以及加密/解密字符串的函数(mcryptmhash)。

最后,如果还是找不到想要的函数,参见字符类型函数

字符串转换

可以用 (string) 标记或者 strval() 函数将一个值转换为字符串。当某表达式需要字符串时,字符串的转换会在表达式范围内自动完成。例如当使用 echo() 或者 print() 函数时,或者将一个变量值与一个字符串进行比较的时候。阅读手册中有关类型类型戏法中的部分有助于更清楚一些。参见 settype()

布尔值 TRUE 将被转换为字符串 "1",而值 FALSE 将被表示为 ""(即空字符串)。这样就可以随意地在布尔值和字符串之间进行比较。

整数或浮点数数值在转换成字符串时,字符串由表示这些数值的数字字符组成(浮点数还包含有指数部分)。

数组将被转换成字符串 "Array",因此无法通过 echo() 或者 print() 函数来输出数组的内容。请参考下文以获取更多提示。

对象将被转换成字符串 "Object"。如果因为调试需要,需要将对象的成员变量打印出来,请阅读下文。如果希望得到该对象所依附的类的名称,请使用函数 get_class()。自 PHP 5 起,如果合适可以用 __toString() 方法。

资源类型总是以 "Resource id #1" 的格式被转换成字符串,其中 1 是 PHP 在运行时给资源指定的唯一标识。如果希望获取资源的类型,请使用函数 get_resource_type()

NULL 将被转换成空字符串。

正如以上所示,将数组、对象或者资源打印出来,并不能提供任何关于这些值本身的有用的信息。请参阅函数 print_r()var_dump(),对于调试来说,这些是更好的打印值的方法。

可以将 PHP 的值转换为字符串以永久地储存它们。这种方法被称为序列化,可以用函数 serialize() 来完成该操作。如果在安装 PHP 时建立了 WDDX 支持,还可以将 PHP 的值序列化为 XML 结构。

字符串转换为数值

当一个字符串被当作数字来求值时,根据以下规则来决定结果的类型和值。

如果包括“.”,“e”或“E”其中任何一个字符的话,字符串被当作 float 来求值。否则就被当作整数。

该值由字符串最前面的部分决定。如果字符串以合法的数字数据开始,就用该数字作为其值,否则其值为 0(零)。合法数字数据由可选的正负号开始,后面跟着一个或多个数字(可选地包括十进制分数),后面跟着可选的指数。指数是一个“e”或者“E”后面跟着一个或多个数字。

<?php
$foo
= 1 + "10.5";                // $foo is float (11.5)
$foo = 1 + "-1.3e3";              // $foo is float (-1299)
$foo = 1 + "bob-1.3e3";           // $foo is integer (1)
$foo = 1 + "bob3";                // $foo is integer (1)
$foo = 1 + "10 Small Pigs";       // $foo is integer (11)
$foo = 4 + "10.2 Little Piggies"; // $foo is float (14.2)
$foo = "10.0 pigs " + 1;          // $foo is float (11)
$foo = "10.0 pigs " + 1.0;        // $foo is float (11)
?>

此转换的更多信息见 Unix 手册中关于 strtod(3) 的部分。

如果想测试本节中的任何例子,可以拷贝和粘贴这些例子并且加上下面这一行自己看看会发生什么:

<?php
echo "\$foo==$foo; type is " . gettype ($foo) . "<br />\n";
?>

不要指望在将一个字符转换成整型时能够得到该字符的编码(可能也会在 C 中这么做)。如果希望在字符编码和字符之间转换,请使用 ord()chr() 函数。


add a note add a note User Contributed Notes
Wade
05-Oct-2006 08:12
If you're after performance in your PHP, then doing
<?php $var = "string with a $variable reference" ?>
is measurably slower than than
<?php $var = 'string with a '.$variable.' reference' ?>
kevin at metalaxe dot com
20-Aug-2006 04:18
When editing with Dreamweaver, the syntax higlighting will be messed up if you use heredoc syntax. I found that the color syntaxing can be taken care by editing the php syntax file for DW.

Open Macromedia\Dreamweaver 8\Configuration\CodeColoring\PHP.xml and look for:

<blockStart doctypes="PHP_MySQL" scheme="customText"><![CDATA[<?=]]></blockStart>
<
blockEnd><![CDATA[?>]]></blockEnd>

Directly after that add:

<blockStart doctypes="PHP_MySQL" scheme="customText"><![CDATA[EOF;]]></blockStart>
<blockEnd><![CDATA[<<<EOF]]></blockEnd>

Now, DW will treat <<<EOF as if it was the end of the php syntax block (?>) and EOF; as it was the start (<?php or <?). I have yet to figure out how to make it work dynamically with any heredoc identifier. DW uses some regex in that file but it appears as if some are disabled. If anyone else happens to figure it out please share! :D
bishop
29-Mar-2006 04:58
You may use heredoc syntax to comment out large blocks of code, as follows:
<?php
<<<_EOC
    // end-of-line comment will be masked... so will regular PHP:
   echo ($test == 'foo' ? 'bar' : 'baz');
   /* c-style comment will be masked, as will other heredocs (not using the same marker) */
   echo <<<EOHTML
This is text you'll never see!       
EOHTML;
   function defintion($params)
{
       echo 'foo';
  
}
   class definition extends nothing   
{
       function definition($param)
{
         echo 'do nothing';
      
}     
  
}

   how about syntax errors?; = gone, I bet.
_EOC;
?>

Useful for debugging when C-style just won't do.  Also useful if you wish to embed Perl-like Plain Old Documentation; extraction between POD markers is left as an exercise for the reader.

Note there is a performance penalty for this method, as PHP must still parse and variable substitute the string.
adam at obledesign dot com
22-Jan-2006 04:51
If you really want to get constants inside your string you can do something like this (PHP5 only).

<?php

class GetConstant {
 
public function __get($constant_name) {
  return (
defined($constant_name) ? constant($constant_name) : NULL);
 }
}

$C = new GetConstant();

define('FOO','bar');

echo(
"I want a candy {$C->FOO}.");

?>
csaba at alum dot mit dot edu
31-Dec-2005 07:15
Multiline strings:
In most of the strings on this doc page, the heredoc form (<<<) is not needed.  The heredoc form is useful when you also want to embed, without escaping, quotes of the same type as the starting quote.  The single quote form is particularly useful when you want to specify multiline code for future evaluation.

<?php
$code
= '
  $output = "Mutliline output";
  $out = "$output within multiline code:
$ needs escaping within \$out,
and double quotes (\") need escaping within \$out,
but single quotes (\') need escaping everywhere.";
  // Next two lines work with php.exe
  /* on Windows systems */
  $oWSH = new COM("WScript.Shell");
  $oWSH->Popup($out, 4, \'IO Demo\', 131120);
'
;
print
"<pre>$code</pre>";
call_user_func (create_function('', $code));
?>

Happy New Year,
Csaba Gabor from Vienna
wilkinson98 at hotmail dot com
17-Dec-2005 09:53
You can't depend on hex in strings to be converted.

<?php

// prints 17
echo 1 + '0x10';   

// prints 0
echo (int) '0x10';
                    
?>
webmaster at rephunter dot net
01-Dec-2005 12:57
Use caution when you need white space at the end of a heredoc. Not only is the mandatory final newline before the terminating symbol stripped, but an immediately preceding newline or space character is also stripped.

For example, in the following, the final space character (indicated by \s -- that is, the "\s" is not literally in the text, but is only used to indicate the space character) is stripped:

$string = <<<EOT
this is a string with a terminating space\s
EOT;

In the following, there will only be a single newline at the end of the string, even though two are shown in the text:

$string = <<<EOT
this is a string that must be
followed by a single newline

EOT;
DELETETHIS dot php at dfackrell dot mailshell dot com
02-Nov-2005 12:05
Just some quick observations on variable interpolation:

Because PHP looks for {? to start a complex variable expression in a double-quoted string, you can call object methods, but not class methods or unbound functions.

This works:

<?php
class a {
   function
b() {
       return
"World";
   }
}
$c = new a;
echo
"Hello {$c->b()}.\n"
?>

While this does not:

<?php
function b() {
   return
"World";
}
echo
"Hello {b()}\n";
?>

Also, it appears that you can almost without limitation perform other processing within the argument list, but not outside it.  For example:

<?
$true
= true;
define("HW", "Hello World");
echo
"{$true && HW}";
?>

gives: Parse error: parse error, unexpected T_BOOLEAN_AND, expecting '}' in - on line 3

There may still be some way to kludge the syntax to allow constants and unbound function calls inside a double-quoted string, but it isn't readily apparent to me at the moment, and I'm not sure I'd prefer the workaround over breaking out of the string at this point.
lelon at lelon dot net
28-Oct-2004 03:01
You can use the complex syntax to put the value of both object properties AND object methods inside a string.  For example...
<?php
class Test {
  
public $one = 1;
  
public function two() {
       return
2;
   }
}
$test = new Test();
echo
"foo {$test->one} bar {$test->two()}";
?>
Will output "foo 1 bar 2".

However, you cannot do this for all values in your namespace.  Class constants and static properties/methods will not work because the complex syntax looks for the '$'.
<?php
class Test {
   const
ONE = 1;
}
echo
"foo {Test::ONE} bar";
?>
This will output "foo {Test::one} bar".  Constants and static properties require you to break up the string.
Jonathan Lozinski
07-Aug-2004 03:03
A note on the heredoc stuff.

If you're editing with VI/VIM and possible other syntax highlighting editors, then using certain words is the way forward.  if you use <<<HTML for example, then the text will be hightlighted for HTML!!

I just found this out and used sed to alter all EOF to HTML.

JAVASCRIPT also works, and possibly others.  The only thing about <<<JAVASCRIPT is that you can't add the <script> tags..,  so use HTML instead, which will correctly highlight all JavaScript too..

You can also use EOHTML, EOSQL, and EOJAVASCRIPT.
www.feisar.de
28-Apr-2004 10:49
watch out when comparing strings that are numbers. this example:

<?php

$x1
= '111111111111111111';
$x2 = '111111111111111112';

echo (
$x1 == $x2) ? "true\n" : "false\n";

?>

will output "true", although the strings are different. With large integer-strings, it seems that PHP compares only the integer values, not the strings. Even strval() will not work here.

To be on the safe side, use:

$x1 === $x2
atnak at chejz dot com
12-Apr-2004 06:53
Here is a possible gotcha related to oddness involved with accessing strings by character past the end of the string:

$string = 'a';

var_dump($string[2]);  // string(0) ""
var_dump($string[7]);  // string(0) ""
$string[7] === '';  // TRUE

It appears that anything past the end of the string gives an empty string..  However, when E_NOTICE is on, the above examples will throw the message:

Notice:  Uninitialized string offset:  N in FILE on line LINE

This message cannot be specifically masked with @$string[7], as is possible when $string itself is unset.

isset($string[7]);  // FALSE
$string[7] === NULL;  // FALSE

Even though it seems like a not-NULL value of type string, it is still considered unset.
dandrake
20-Jan-2004 07:41
By the way, the example with the "\n" sequence will insert a new line in the html code, while the output will be decided by the HTML syntax. That's why, if you use

<?
 
echo "Hello \n World";
?>

the browser will receive the HTML code on 2 lines
but his output on the page will be shown on one line only.
To diplay on 2 lines simply use:

<?
 
echo "Hello <br>World";
?>

like in HTML.
philip at cornado dot com
12-Apr-2003 08:37
Note that in PHP versions 4.3.0 and 4.3.1, the following provides a bogus E_NOTICE (this is a known bug):

echo "$somearray['bar']";

This is accessing an array inside a string using a quoted key and no {braces}.  Reading the documention shows all the correct ways to do this but the above will output nothing on most systems (most have E_NOTICE off) so users may be confused.  In PHP 4.3.2, the above will again yield a parse error.
04-Mar-2003 02:04
Regarding "String access by character":

Apparently if you edit a specific character in a string, causing the string to be non-continuous, blank spaces will be added in the empty spots.

echo '<pre>';
$str = '0123';
echo "$str\n";
$str[4] = '4';
echo "$str\n";
$str[6] = '6';
echo "$str\n";

This will output:
0123
01234
01234 6
Notice the blank space where 5 should be.
vallo at cs dot helsinki dot fi
04-Nov-2002 09:41
Even if the correct way to handle variables is determined from the context, some things just doesn't work without doing some preparation.

I spent several hours figuring out why I couldn't index a character out of a string after doing some math with it just before. The reason was that PHP thought the string was an integer!

$reference = $base + $userid;
.. looping commands ..
$chartohandle = $reference{$last_char - $i};

Above doesn't work. Reason: last operation with $reference is to store a product of an addition -> integer variable. $reference .=""; (string catenation) had to be added before I got it to work:

$reference = $base + $userid;
$reference .= "";
.. looping commands ..
$chartohandle = $reference{$last_char - $i};

Et voil! Nice stream of single characters.
guidod at gmx dot de
23-Jul-2002 02:26
PHP's double-quoted strings are inherently ill-featured - they will be a problem especially with computed code like in /e-evals with preg_replace.

bash and perl follow the widely accepted rule that all backslashes will escape the nextfollowing char, and nonalpha-chars will always get printed there as themselves whereas (the unescaped chars might have special meaning in regex). Anyway, it is a great way to just escape all nonalpha chars that you uncertain about whether they have special meaning in some places, and ye'll be sure they will get printed literal.

Furthermore, note that \{ sequence is not mentioned in the  escape-char table! You'll get to know about it only "complex (curly) syntax". This can even more be a problem with evals, as they behave rather flaky like it _cannot_ be accomodated for computed code. Try all variants of `echo "hello \{\$world}"` removing one or more of the chars in the \{\$ part - have fun!
philip at cornado dot com
11-Aug-2000 04:38
A nice "Introduction to PHP Strings" tutorial:
  http://www.zend.com/zend/tut/using-strings.php