mb_encode_numericentity

(PHP 4 >= 4.0.6, PHP 5)

mb_encode_numericentity -- Encode characters as HTML numeric character references

Description

string mb_encode_numericentity ( string str, array convmap [, string encoding] )

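mb_encode_numericentity() converts the characters in string str whose codes fall within the ranges given in convmap from character codes to HTML numeric character references. It returns the converted string.

convmap is an array that specifies the code ranges to convert.

encoding is the character encoding of str. If it is omitted, the internal character encoding is used.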

Example 1. convmap example

$convmap = array (
int start_code1, int end_code1, int offset1, int mask1,
int start_code2, int end_code2, int offset2, int mask2,
........
int start_codeN, int end_codeN, int offsetN, int maskN );
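// Specify Unicode code points for start_codeN and end_codeN.
// For each character whose code point lies in [start_codeN, end_codeN],
// offsetN is added and the result is bitwise ANDed with maskN; the resulting
// value is written out as a numeric character reference.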
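The rule described in the comments above can be traced by hand for a single code point. The following is a minimal sketch in plain PHP, not the extension's actual implementation: the helper name sketch_entity_for() is hypothetical, and it assumes that the first matching range is the one applied.

```php
<?php
// Hypothetical helper: mirrors the documented convmap rule for one code point.
function sketch_entity_for(int $codePoint, array $convmap): ?string
{
    // Walk the map four integers at a time: start, end, offset, mask.
    for ($i = 0; $i + 3 < count($convmap); $i += 4) {
        [$start, $end, $offset, $mask] = array_slice($convmap, $i, 4);
        if ($codePoint >= $start && $codePoint <= $end) {
            // Add the offset, apply the mask, and emit a decimal reference.
            return '&#' . (($codePoint + $offset) & $mask) . ';';
        }
    }
    return null; // Outside every range: the character would be left unchanged.
}

echo sketch_entity_for(0xE9, [0x80, 0xff, 0, 0xff]), "\n";              // &#233;
echo sketch_entity_for(0xE000, [0xe000, 0xe03e, 0x1040, 0xffff]), "\n"; // &#61504;
```

In the second call, 0xe000 + 0x1040 = 0xf040, and 0xf040 & 0xffff is still 0xf040, which is 61504 in decimal.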

Example 2. mb_encode_numericentity() example

<?php
/* Convert the upper half of ISO-8859-1 (0x80-0xff) to HTML numeric character references */
$convmap = array(0x80, 0xff, 0, 0xff);
$str = mb_encode_numericentity($str, $convmap, "ISO-8859-1");

/* Convert user-defined SJIS-win codes in blocks 95-104 to numeric
   character references */
$convmap = array(
       0xe000, 0xe03e, 0x1040, 0xffff,
       0xe03f, 0xe0bb, 0x1041, 0xffff,
       0xe0bc, 0xe0fa, 0x1084, 0xffff,
       0xe0fb, 0xe177, 0x1085, 0xffff,
       0xe178, 0xe1b6, 0x10c8, 0xffff,
       0xe1b7, 0xe233, 0x10c9, 0xffff,
       0xe234, 0xe272, 0x110c, 0xffff,
       0xe273, 0xe2ef, 0x110d, 0xffff,
       0xe2f0, 0xe32e, 0x1150, 0xffff,
       0xe32f, 0xe3ab, 0x1151, 0xffff );
$str = mb_encode_numericentity($str, $convmap, "sjis-win");
?>

See also mb_decode_numericentity().
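As an illustration of how the two functions pair up, here is a small round-trip sketch. It assumes the mbstring extension is loaded and PHP 7 or later (for the \u{} escape); the single catch-all range and the variable names are choices made for this example, not something the manual prescribes.

```php
<?php
// Assumes the mbstring extension is loaded.
// One range: everything above ASCII, no offset, mask wide enough for any code point.
$convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];

$original = "Caf\u{E9}";  // "Café" as UTF-8 (the \u{} escape needs PHP 7+)
$encoded  = mb_encode_numericentity($original, $convmap, 'UTF-8');
$decoded  = mb_decode_numericentity($encoded, $convmap, 'UTF-8');

echo $encoded, "\n";              // Caf&#233;
var_dump($decoded === $original); // bool(true)
```

Passing the same convmap to both calls is what makes the round trip symmetric here, since the offset is 0 and the mask leaves the code point untouched.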


User Contributed Notes
dan at boxuk dot com
27-Feb-2003 01:48
We were experiencing difficulties with PHP/Sablotron on Solaris: when HTML character references are placed into an XSL transformation whose output is set to UTF-8, they are converted back into UTF-8 encoded characters. This was a problem for non-Unicode storage. Using a bit of code from http://homepage.mac.com/marko/, the following function converts the string back to character references:

function utf2html ($utf2html_string)
{
   $f = 0xffff;
   $convmap = array(
/* <!ENTITY % HTMLlat1 PUBLIC "-//W3C//ENTITIES Latin 1//EN//HTML">
   %HTMLlat1; */
     160,  255, 0, $f,
/* <!ENTITY % HTMLsymbol PUBLIC "-//W3C//ENTITIES Symbols//EN//HTML">
   %HTMLsymbol; */
     402,  402, 0, $f,  913,  929, 0, $f,  931,  937, 0, $f,
     945,  969, 0, $f,  977,  978, 0, $f,  982,  982, 0, $f,
   8226, 8226, 0, $f, 8230, 8230, 0, $f, 8242, 8243, 0, $f,
   8254, 8254, 0, $f, 8260, 8260, 0, $f, 8465, 8465, 0, $f,
   8472, 8472, 0, $f, 8476, 8476, 0, $f, 8482, 8482, 0, $f,
   8501, 8501, 0, $f, 8592, 8596, 0, $f, 8629, 8629, 0, $f,
   8656, 8660, 0, $f, 8704, 8704, 0, $f, 8706, 8707, 0, $f,
   8709, 8709, 0, $f, 8711, 8713, 0, $f, 8715, 8715, 0, $f,
   8719, 8719, 0, $f, 8721, 8722, 0, $f, 8727, 8727, 0, $f,
   8730, 8730, 0, $f, 8733, 8734, 0, $f, 8736, 8736, 0, $f,
   8743, 8747, 0, $f, 8756, 8756, 0, $f, 8764, 8764, 0, $f,
   8773, 8773, 0, $f, 8776, 8776, 0, $f, 8800, 8801, 0, $f,
   8804, 8805, 0, $f, 8834, 8836, 0, $f, 8838, 8839, 0, $f,
   8853, 8853, 0, $f, 8855, 8855, 0, $f, 8869, 8869, 0, $f,
   8901, 8901, 0, $f, 8968, 8971, 0, $f, 9001, 9002, 0, $f,
   9674, 9674, 0, $f, 9824, 9824, 0, $f, 9827, 9827, 0, $f,
   9829, 9830, 0, $f,
/* <!ENTITY % HTMLspecial PUBLIC "-//W3C//ENTITIES Special//EN//HTML">
   %HTMLspecial; */
/* These ones are excluded to enable HTML: 34, 38, 60, 62 */
     338,  339, 0, $f,  352,  353, 0, $f,  376,  376, 0, $f,
     710,  710, 0, $f,  732,  732, 0, $f, 8194, 8195, 0, $f,
   8201, 8201, 0, $f, 8204, 8207, 0, $f, 8211, 8212, 0, $f,
   8216, 8217, 0, $f, 8218, 8218, 0, $f, 8220, 8222, 0, $f,
   8224, 8225, 0, $f, 8240, 8240, 0, $f, 8249, 8250, 0, $f,
   8364, 8364, 0, $f);

   return mb_encode_numericentity($utf2html_string, $convmap, "UTF-8");
}
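
If the goal is only to make a UTF-8 string safe for non-Unicode storage while leaving markup intact, a single catch-all range may be enough. The sketch below is a hypothetical alternative, not the note author's method: the function name sketch_non_ascii_to_refs() is invented for this example, and it assumes the mbstring extension and PHP 7 or later.

```php
<?php
// Hypothetical alternative to utf2html(): one catch-all range instead of a list.
function sketch_non_ascii_to_refs(string $utf8): string
{
    // Everything above ASCII is converted; <, >, & and " are below 0x80
    // and therefore pass through untouched.
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8, $convmap, 'UTF-8');
}

echo sketch_non_ascii_to_refs("<b>Price: \u{20AC}5</b>"), "\n";
// <b>Price: &#8364;5</b>
```

The trade-off is scope: the explicit list above converts only the code points that have named HTML entities, whereas the catch-all range converts every non-ASCII character.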