View Issue Details

ID: 0008577
Project: mantisbt
Category: email
View Status: public
Last Update: 2018-02-22 19:38
Reporter: janusz
Assigned To: giallu
Priority: normal
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Product Version: 1.1.0rc2
Target Version: 1.1.3
Fixed in Version: 1.2.0a1
Summary: 0008577: Long utf-8 encoded subject fails (phpmailer bug)
Description

phpmailer bug:
http://sourceforge.net/tracker/index.php?func=detail&aid=1190849&group_id=26031&atid=385707

"If a subject line contains non US Ascii characters and
the encoding is set to UTF-8, phpMailer occasionally
splits the subject line in the middle of a multibyte
character, causing the encoded representation to appear
in the email client."

A port of the patch (http://sourceforge.net/tracker/index.php?func=detail&aid=1828199&group_id=26031&atid=385709) to the Mantis version of phpmailer is attached.
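
The failure mode quoted above can be reproduced outside phpmailer. The following is a minimal sketch, not code from phpmailer or from the attached patch: the sample string and the cut position are made up for illustration. It splits a UTF-8 string at a fixed byte offset with substr() and checks whether each piece is still valid UTF-8.

```php
<?php
// Hypothetical sample: the UTF-8 bytes of the Russian word "Привет"
// (6 characters, 2 bytes each, 12 bytes in total).
$subject = "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";

// A byte-based split at an odd offset lands inside a two-byte character.
$head = substr($subject, 0, 5);
$tail = substr($subject, 5);

// preg_match() with the /u modifier returns false when the subject
// string is not valid UTF-8, so it serves as a simple validity check.
$headValid = (preg_match('//u', $head) === 1);
$tailValid = (preg_match('//u', $tail) === 1);

var_dump(strlen($subject), $headValid, $tailValid);
```

Each half ends or begins with an incomplete character. Encoding each half separately is the kind of output a mail client cannot decode.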

Tags: No tags attached.
Attached Files
encoding.patch (7,276 bytes)   
--- class.phpmailer.php.orig	2005-06-29 04:23:45.000000000 +0200
+++ class.phpmailer.php	2007-11-13 19:14:18.000000000 +0100
@@ -656,6 +656,9 @@
      */
     function WrapText($message, $length, $qp_mode = false) {
         $soft_break = ($qp_mode) ? sprintf(" =%s", $this->LE) : $this->LE;
+        // If utf-8 encoding is used, we will need to make sure we don't
+        // split multibyte characters when we wrap
+        $is_utf8 = (strtolower($this->CharSet) == "utf-8");
 
         $message = $this->FixEOL($message);
         if (substr($message, -1) == $this->LE)
@@ -678,9 +681,11 @@
                     if ($space_left > 20)
                     {
                         $len = $space_left;
-                        if (substr($word, $len - 1, 1) == "=")
+                        if ($is_utf8) {
+                            $len = $this->UTF8CharBoundary($word, $len);
+                        } elseif (substr($word, $len - 1, 1) == "=") {
                           $len--;
-                        elseif (substr($word, $len - 2, 1) == "=")
+                        } elseif (substr($word, $len - 2, 1) == "=")
                           $len -= 2;
                         $part = substr($word, 0, $len);
                         $word = substr($word, $len);
@@ -696,9 +701,11 @@
                 while (strlen($word) > 0)
                 {
                     $len = $length;
-                    if (substr($word, $len - 1, 1) == "=")
+                    if ($is_utf8) {
+                        $len = $this->UTF8CharBoundary($word, $len);
+                    } elseif (substr($word, $len - 1, 1) == "=") {
                         $len--;
-                    elseif (substr($word, $len - 2, 1) == "=")
+                    } elseif (substr($word, $len - 2, 1) == "=")
                         $len -= 2;
                     $part = substr($word, 0, $len);
                     $word = substr($word, $len);
@@ -727,6 +734,65 @@
         return $message;
     }
     
+     /**
+     * Finds last character boundary prior to maxLength in a utf-8 
+     * quoted (printable) encoded string.
+     * Original written by Colin Brown.  
+     *
+     * @access private
+     *
+     * @param string $encodedText utf-8 QP text
+     * @param int    $maxLength   find last character boundary prior to this length
+     *
+     * @return int
+     */
+    function UTF8CharBoundary($encodedText, $maxLength)
+    {
+        $foundSplitPos = false;
+        $lookBack = 3;
+
+        while (!$foundSplitPos)
+        {
+            $lastChunk = substr($encodedText, $maxLength - $lookBack, $lookBack);
+            $encodedCharPos = strpos($lastChunk, "=");
+        
+            if ($encodedCharPos !== false) {
+                // Found start of encoded character byte within $lookBack block.
+                // Check the encoded byte value (the 2 chars after the '=')
+                $hex = substr($encodedText, $maxLength - $lookBack + $encodedCharPos + 1, 2);
+                $dec = hexdec($hex);
+                if ($dec < 128)
+                {
+                    // Single byte character.
+                    
+                    // If the encoded char was found at pos 0, it will fit
+                    // otherwise reduce maxLength to start of the encoded char
+                    $maxLength = ($encodedCharPos == 0) ? $maxLength :
+                                 $maxLength - ($lookBack - $encodedCharPos);
+                    $foundSplitPos = true;
+                }
+                elseif ($dec >= 192)
+                {
+                    // First byte of a multi byte character
+                    
+                    // Reduce maxLength to split at start of character
+                    $maxLength = $maxLength - ($lookBack - $encodedCharPos);
+                    $foundSplitPos = true;    
+                }
+                elseif ($dec < 192)
+                {
+                    // Middle byte of a multi byte character, look further back
+                    $lookBack += 3;
+                }
+            } else {
+                // No encoded character found
+                $foundSplitPos = true;    
+            }
+        }
+
+        return $maxLength;
+    }
+
     /**
      * Set the body wrapping.
      * @access private
@@ -1166,9 +1232,15 @@
       // Try to select the encoding which should produce the shortest output
       if (strlen($str)/3 < $x) {
         $encoding = 'B';
-        $encoded = base64_encode($str);
-        $maxlen -= $maxlen % 4;
-        $encoded = trim(chunk_split($encoded, $maxlen, "\n"));
+        if (function_exists('mb_strlen') && $this->HasMultiBytes($str)) {
+            // Use a custom function which correctly encodes and wraps long
+            // multibyte strings without breaking lines within a character 
+            $encoded = $this->Base64EncodeWrapMB($str);
+        } else {
+            $encoded = base64_encode($str);
+            $maxlen -= $maxlen % 4;
+            $encoded = trim(chunk_split($encoded, $maxlen, "\n"));
+        }
       } else {
         $encoding = 'Q';
         $encoded = $this->EncodeQ($str, $position);
@@ -1182,6 +1254,74 @@
       return $encoded;
     }
     
+
+    /**
+     * Checks if a string contains multibyte characters.
+     * 
+     * @access private
+     *
+     * @param string $str text to check for multibyte characters
+     *
+     * @return bool
+     */
+    function HasMultiBytes($str)
+    {
+      if (function_exists('mb_strlen')) {
+        return (strlen($str) > mb_strlen($str, $this->CharSet));
+      } else {
+        // Assume no multibytes (we can't handle without mbstring functions anyway)
+        return False;
+      }
+    }
+
+    /**
+     * Correctly encodes and wraps long multibyte strings for mail headers
+     * without breaking lines within a character.
+     * 
+     * Adapted from a function by paravoid at http://uk.php.net/manual/en/function.mb-encode-mimeheader.php
+     * 
+     * @access private
+     *
+     * @param string $str multi-byte text to wrap encode
+     *
+     * @return string
+     */
+    function Base64EncodeWrapMB($str)
+    {
+        $start = "=?".$this->CharSet."?B?";
+        $end = "?=";
+        $encoded = "";
+     
+        $mb_length = mb_strlen($str, $this->CharSet);
+        // Each line must have length <= 75, including $start and $end
+        $length = 75 - strlen($start) - strlen($end);
+        // Average multi-byte ratio
+        $ratio = $mb_length / strlen($str);
+        // Base64 has a 4:3 ratio
+        $offset = $avgLength = floor($length * $ratio * .75);
+     
+        for ($i = 0; $i < $mb_length; $i += $offset)
+        {
+            $lookBack = 0;
+ 
+            do {
+                $offset = $avgLength - $lookBack;
+                $chunk = mb_substr($str, $i, $offset, $this->CharSet);
+                $chunk = base64_encode($chunk);
+                $lookBack++;
+            }
+            while (strlen($chunk) > $length);
+ 
+            $encoded .= $chunk . $this->LE;
+        }
+ 
+        // Chomp the last linefeed
+        $encoded = substr($encoded, 0, -strlen($this->LE));
+     
+        return $encoded;
+    }
+	
+
     /**
      * Encode string to quoted-printable.  
      * @access private

Relationships

has duplicate 0008033 (closed, giallu): PHPMailer: wrong subject header encoding (when using utf-8)
related to 0009091 (closed, dregad): Reopen 0008577 (Long utf-8 encoded subject fails)
related to 0009440 (closed, dregad): E-mail notification headers broken in Russian locale UTF-8

Activities

janusz

2007-12-20 07:31

reporter   ~0016471

Any chance of fixing this in 1.1.1?

vboctor

2007-12-20 12:13

manager   ~0016481

I set the target for 1.1.1. However, I would suggest that you follow up with PHPMailer to release a version with this fix so that we can deploy an official version. I would hate to apply my own patches to their code (we try to avoid that when we can). There has been more activity from PHPMailer recently, hence, hopefully they will include this in a release soon.

However, a good starting point would be to attach a modified version of the file to this issue. Hence, users can just use it to overwrite a file in their installation without having to deal with patches.

ViFF

2008-01-17 10:21

reporter   ~0016700

On my installation I replaced substr(,,) with mb_substr(,,, VariableWithCurrentLanguage). This function truncates the email strings correctly.
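
The substitution described above can be sketched as follows. This is an illustration only, assuming the mbstring extension is loaded and that the fourth argument of mb_substr() is the character encoding (here the literal 'UTF-8'). The sample string is hypothetical, and the snippet does not show where the replacement was made in the actual installation.

```php
<?php
// Hypothetical sample: the UTF-8 bytes of "Привет" (6 characters, 12 bytes).
$subject = "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";

// substr() counts bytes: 3 bytes is one and a half characters.
$byBytes = substr($subject, 0, 3);

// mb_substr() counts characters: 3 characters is 6 bytes here.
$byChars = mb_substr($subject, 0, 3, 'UTF-8');

// mb_check_encoding() reports whether a string is valid in the given encoding.
$byBytesValid = mb_check_encoding($byBytes, 'UTF-8');
$byCharsValid = mb_check_encoding($byChars, 'UTF-8');

var_dump($byBytesValid, $byCharsValid, strlen($byChars));
```

Cutting by characters keeps every piece decodable, but the byte length of a piece then depends on its content, so a line-length limit has to be checked after encoding.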

combr

2008-02-12 08:59

reporter   ~0017024

Last edited: 2008-02-12 09:34

I upgraded from 1.0.8 with the cp1251 codepage to 1.1.1 with the utf8 codepage and encountered the same error (in Russian subjects).

I see that this bug was (maybe :) fixed in phpmailer 2.0.0rc1, but that is not a release
(http://sourceforge.net/tracker/index.php?func=detail&aid=1190849&group_id=26031&atid=385707), and mantis 1.1.1 includes phpmailer 1.73.

Can I fix it? Does the patch in this ticket work, or is there a better method?

giallu

2008-02-12 09:30

reporter   ~0017025

I updated the phpmailer class to the released 2.0 a couple days ago in SVN trunk. It would be nice if you could test it and see if the situation is better.

combr

2008-02-14 02:33

reporter   ~0017041

I'm using PHP5, but the 2.0 release announcement says:
"PHPMailer v2.0.0 released for PHP4!"

Did you update SVN for PHP4?

Also, I can now say that "encoding.patch" from here works OK.

giallu

2008-02-16 19:03

reporter   ~0017079

Yes. I updated to 2.0 because 2.1 (which works with PHP 5 exclusively) is still in beta (whatever this means to phpmailer developers).

Thanks for testing

cus

2008-02-21 16:28

reporter   ~0017148

Unfortunately phpmailer 2.0.0 does not contain the "encoding.patch" that was mentioned in the previous comment, so this bug is probably not fixed yet.

I created a new patch against 2.0.0 which also fixes a minor problem with the previous patch:
http://sourceforge.net/tracker/index.php?func=detail&aid=1899080&group_id=26031&atid=385709

xocolate

2008-03-14 10:29

reporter   ~0017352

I have the same problem.

Applying this patch did not solve it.

Installing the php-mbstring package fixed it for me.
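
One possible explanation, consistent with the attached encoding.patch, is that the patch only takes its multibyte-aware Base64 path when mb_strlen() exists. Without mbstring, it falls back to the original base64_encode() plus chunk_split() code. Whether that was the cause in this particular installation is an assumption. A quick way to check for the extension, as a sketch:

```php
<?php
// The attached patch guards its multibyte-aware wrapping with
// function_exists('mb_strlen'), so this reports which path would be taken.
$hasMbstring = function_exists('mb_strlen');

echo $hasMbstring
    ? "mbstring available: multibyte-aware wrapping can be used\n"
    : "mbstring missing: plain base64_encode + chunk_split is used\n";
```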