mb_encode_mimeheader

(PHP 4 >= 4.0.6, PHP 5)

mb_encode_mimeheader -- Encode string for MIME header

Description

string mb_encode_mimeheader ( string str [, string charset [, string transfer_encoding [, string linefeed]]] )

mb_encode_mimeheader() encodes a given string str by the MIME header encoding scheme. Returns a converted version of the string represented in ASCII.

charset specifies the name of the character set in which str is represented. The default value is determined by the current NLS setting (mbstring.language).

transfer_encoding specifies the scheme of MIME encoding. It should be either "B" (Base64) or "Q" (Quoted-Printable). Falls back to "B" if not given.

linefeed specifies the EOL (end-of-line) marker with which mb_encode_mimeheader() performs line-folding (an RFC term for the act of breaking a line longer than a certain length into multiple lines; the length is currently hard-coded to 74 characters). Falls back to "\r\n" (CRLF) if not given.

Example 1. mb_encode_mimeheader() example

<?php
$name = ""; // kanji
$mbox = "kru";
$doma = "gtinn.mon";
$addr = mb_encode_mimeheader($name, "UTF-7", "Q") . " <" . $mbox . "@" . $doma . ">";
echo $addr;
?>

Note: This function isn't designed to break lines at higher-level contextual break points (word boundaries, etc.). This behaviour may clutter up the original string with unexpected spaces.
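
The parameters described above can be tried out with a short sketch like the following. It assumes the source text is UTF-8 and uses made-up sample strings; the exact folded output can vary between PHP versions, so nothing beyond the general shape should be relied on.

```php
<?php
// Assumption: the script and its string literals are UTF-8.
mb_internal_encoding('UTF-8');

// Made-up sample text containing non-ASCII characters.
$text = 'Äpfel';

// The two transfer encodings: "B" (Base64) and "Q" (Quoted-Printable).
$b = mb_encode_mimeheader($text, 'UTF-8', 'B');
$q = mb_encode_mimeheader($text, 'UTF-8', 'Q');

// A value long enough to be folded, with "\n" passed as the EOL marker
// in place of the default "\r\n".
$long   = str_repeat($text, 20);
$folded = mb_encode_mimeheader($long, 'UTF-8', 'B', "\n");

echo $b, "\n", $q, "\n", $folded, "\n";
```

Both results are plain ASCII; mb_decode_mimeheader() should turn $b back into the original text as long as the internal encoding is still UTF-8.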

See also mb_decode_mimeheader().


User Contributed Notes
chappy at citromail dot hu
05-Jun-2006 07:33
I found a bad function.

<?php
function encodeHeader($input, $charset = 'ISO-8859-2')
{
   preg_match_all('/(\\w*[\\x80-\\xFF]+\\w*)/', $input, $matches);
   foreach ($matches[1] as $value) {
      $replacement = preg_replace('/([\\x80-\\xFF])/e', '"=" . strtoupper(dechex(ord("\\1")))', $value);
      $input = str_replace($value, '=?' . $charset . '?Q?' . $replacement . '?=', $input);
   }
   return $input;
}
?>

This function should be used:

<?php
function encodeHeader($input, $charset = 'ISO-8859-2')
{
   $m = preg_match_all('/(\w*[\x80-\xFF]+\w*)/', $input, $matches);
   if ($m) $input = mb_encode_mimeheader($input, $charset, 'Q');
   return $input;
}
?>
stormflyCUT at hyh dot pl
05-May-2006 07:41
A tip for anyone who uses national characters and has problems with UTF-8, for example in a mail subject: before you call mb_encode_mimeheader with UTF-8, set mb_internal_encoding('UTF-8').
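
A minimal sketch of that advice, assuming the script file itself is saved as UTF-8; the subject text is only a made-up sample:

```php
<?php
// Assumption: this file and $subject are UTF-8.
mb_internal_encoding('UTF-8');

// Made-up sample subject with national characters.
$subject = 'Zażółć gęślą jaźń';

$encodedSubject = mb_encode_mimeheader($subject, 'UTF-8', 'B');

// $encodedSubject can then be used as the subject argument of mail().
echo 'Subject: ', $encodedSubject, "\n";
```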
paravoid
02-Jan-2006 09:58
If mb_ version doesn't work for you in MIME-B mode:

function encode_mimeheader($string, $charset=null, $linefeed="\r\n") {
   if (!$charset)
       $charset = mb_internal_encoding();

   $start = "=?$charset?B?";
   $end = "?=";
   $encoded = '';

   /* Each line must have length <= 75, including $start and $end */
   $length = 75 - strlen($start) - strlen($end);
   /* Average multi-byte ratio */
   $ratio = mb_strlen($string, $charset) / strlen($string);
   /* Base64 has a 4:3 ratio */
   $magic = $avglength = floor(3 * $length * $ratio / 4);

   for ($i=0; $i <= mb_strlen($string, $charset); $i+=$magic) {
       $magic = $avglength;
       $offset = 0;
       /* Recalculate magic for each line to be 100% sure */
       do {
           $magic -= $offset;
           $chunk = mb_substr($string, $i, $magic, $charset);
           $chunk = base64_encode($chunk);
           $offset++;
       } while (strlen($chunk) > $length);
       if ($chunk)
           $encoded .= ' '.$start.$chunk.$end.$linefeed;
   }
   /* Chomp the first space and the last linefeed */
   $encoded = substr($encoded, 1, -strlen($linefeed));

   return $encoded;
}
nigrez at nius dot waw dot pl
14-Dec-2005 07:42
True, the function is broken (PHP 5.1, encoding from UTF-8 with the pl_PL charset). Below is a version about 15% faster than the proposed _mb_mime_encode. Its signature is also more like the other mb_* functions, and it doesn't trigger any errors, warnings or notices.

<?php

function mb_mime_header($string, $encoding=null, $linefeed="\r\n") {
  if(!$encoding) $encoding = mb_internal_encoding();
  $encoded = '';

  // pass $encoding so the length is counted in characters of that charset
  while($length = mb_strlen($string, $encoding)) {
    $encoded .= "=?$encoding?B?"
             . base64_encode(mb_substr($string,0,24,$encoding))
             . "?=$linefeed";

    $string = mb_substr($string,24,$length,$encoding);
  }

  return $encoded;
}

?>
gullevek at gullevek dot org
07-Nov-2005 09:29
My first post was around 2003, and mb_encode_mimeheader is still broken. It is *NOT* usable with longer subjects, and mostly unusable with anything other than Japanese.

The function from iwakura at junx dot org does not work for me either; it also produces some garbage.

I updated my old function (the one I posted in 2003) and tested it with overlong subjects in UTF-8, ISO-2022-JP (Japanese), GB2312 (Simplified Chinese) and EUC-KR (Korean), and I got readable results in Thunderbird, Mail.app, Outlook, etc.

<?php

function _mb_mime_encode($string, $encoding)
{
   $pos = 0;
   // after 36 single bytes characters if then comes MB, it is broken
   // but I trimmed it down to 24, to stay 100% < 76 chars per line
   $split = 24;
   // initialise the result so that appending does not raise a notice
   $_string = '';
   while ($pos < mb_strlen($string, $encoding))
   {
      $output = mb_strimwidth($string, $pos, $split, "", $encoding);
      $pos += mb_strlen($output, $encoding);
      $_string_encoded = "=?".$encoding."?B?".base64_encode($output)."?=";
      if ($_string)
         $_string .= "\r\n";
      $_string .= $_string_encoded;
   }
   $string = $_string;
   return $string;
}

?>
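
To check the "< 76 chars per line" goal for any of these workarounds, a small test helper along the following lines may be useful. The function name is hypothetical and the header value is built by hand purely for illustration; the 75-character figure is the RFC 2047 limit for a single encoded-word.

```php
<?php
/**
 * Hypothetical helper (not part of mbstring): returns the length of the
 * longest line in a folded header value.
 */
function longest_header_line($header)
{
    // Split on CRLF, CR or LF so that any EOL marker is handled.
    $lines = preg_split('/\r\n|\r|\n/', $header);
    return max(array_map('strlen', $lines));
}

// Hand-built sample: two encoded-words of 24 bytes each, one per line.
$header = "=?UTF-8?B?" . base64_encode(str_repeat('a', 24)) . "?=\r\n"
        . "=?UTF-8?B?" . base64_encode(str_repeat('b', 24)) . "?=";

// true when every line stays within the 75-character limit
var_dump(longest_header_line($header) <= 75);
```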
chappy at citromail dot hu
29-Oct-2005 02:14
In countries that use non-US-ASCII characters, this is a very good example for sending mail:

mb_internal_encoding('iso-8859-2');
setlocale(LC_CTYPE, 'hu_HU');

function encode($str,$charset){
   $str=mb_encode_mimeheader(trim($str),$charset, 'Q', "\n\t");
   return $str;
}

print encode('the text with spec. chars: ő Ű Ő ű, ','iso-8859-2');

It creates a 7-bit string.
iwakura at junx dot org
16-Sep-2005 02:35
I think mb_encode_mimeheader still has a bug. Here is sample code:

function mb_encode_mimeheader2($string, $encoding = "ISO-2022-JP") {
   $string_array = array();
   $pos = 0;
   $row = 0;
   $mode = 0;
  
   while ($pos < mb_strlen($string)) {
       $word = mb_strimwidth($string, $pos, 1);
       if (!$word) {
           $word = mb_strimwidth($string, $pos, 2);
       }
       if (mb_ereg_match("[ -~]", $word)) {    // ascii
           if ($mode != 1) {
               $row++;
               $mode = 1;
               $string_array[$row] = NULL;
           }
       } else {                                // multibyte
           if ($mode != 2) {
               $row++;
               $mode = 2;
               $string_array[$row] = NULL;
           }
       }
       $string_array[$row] .= $word;
       $pos++;
   }
  
   //echo "<pre>";
   //print_r($string_array);
   //echo "</pre>";
  
   foreach ($string_array as $key => $value) {
       $value = mb_convert_encoding($value, $encoding);
       $string_array[$key] = mb_encode_mimeheader($value, $encoding);
   }
  
   //echo "<pre>";
   //print_r($string_array);
   //echo "</pre>";
  
   return implode("", $string_array);
}

It is not the best, but it works.
mortoray at ecircle-ag dot com
15-Mar-2005 05:19
At least for Q encoding, this function is unsafe and does not encode correctly. Raw characters which appear as RFC2047 sequences are simply left as is.

Ex:

mb_encode_mimeheader( '=?iso-8859-1?q?this=20is=20some=20text?=' );

returns '=?iso-8859-1?q?this=20is=20some=20text?='

The exact same string, which is obviously not the encoding for the source string.  That is, mb_encode_mimeheader does not do any type of escaping.

That is, the following condition is not always true:
   mb_decode_mimeheader( mb_encode_mimeheader( $text ) ) == $text
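
One way to guard against this, sketched under the assumption that callers only need to detect such input before encoding, is to test for anything shaped like an RFC 2047 encoded-word. The helper name below is hypothetical and not part of mbstring.

```php
<?php
/**
 * Hypothetical helper: reports whether $text contains something shaped
 * like an RFC 2047 encoded-word, i.e. =?charset?B-or-Q?data?=
 */
function contains_encoded_word($text)
{
    return preg_match('/=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=/', $text) === 1;
}

$raw = '=?iso-8859-1?q?this=20is=20some=20text?=';

var_dump(contains_encoded_word($raw));          // bool(true)
var_dump(contains_encoded_word('plain text'));  // bool(false)
```

When the check returns true, the caller can decide how to handle the text (for example, by encoding it with its own routine) instead of relying on the round trip.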
gullevek at gullevek dot org
30-Jul-2003 03:02
Read this FIRST: http://bugs.php.net/bug.php?id=23192 because mb_encode_mimeheader is BUGGY!

A workaround for the bug that breaks multibyte characters in overly long ISO-2022-JP subjects:

$pos=0;
$split=36; // after 36 single bytes characters, if then comes MB, it is broken
while ($pos<mb_strlen($string,$encoding))
{
  $output=mb_strimwidth($string,$pos,$split,"",$encoding);
  $pos+=mb_strlen($output,$encoding);
  $_string.=(($_string)?' ':'').mb_encode_mimeheader($output,$encoding);
}
$string=$_string;

It is not the best, but it works.
masataka
12-Apr-2003 10:46
The second parameter, charset, is the character encoding name, but on PHP 4.3.1 the default must be UTF-8.