View Issue Details

ID: 0010718
Project: mantisbt
Category: other
View Status: public
Last Update: 2010-09-19 03:11
Reporter: nigel5
Assigned To:
Priority: high
Severity: crash
Reproducibility: always
Status: closed
Resolution: not fixable
Platform: i86
OS: Centos
OS Version: 5
Product Version: 1.2.0rc1
Summary: 0010718: preg_replace() warnings
Description

Even though I have set "display_errors" off in my PHP ini file I still get the following error:

SYSTEM WARNING: preg_replace() [function.preg-replace]: Compilation failed: PCRE does not support \L, \l, \N, \P, \p, \U, \u, or \X at offset 98

This happens in various places in the application: it appears 4 times on the plugins page, in every project description, and again on "My View".

I have attached some screenshots.

Steps To Reproduce

Just load the page and they are there.

Additional Information

I read somewhere that this happens if you are passing in a blank string, but certainly in the case of the project descriptions the processed string is not blank.

Tags: No tags attached.

Activities

nigel5

2009-07-14 01:43

reporter  

screenshots.zip (209,639 bytes)
jreese

2009-07-14 08:44

reporter   ~0022493

Last edited: 2009-07-14 08:44

I cannot reproduce this. What version of PHP are you using, and what OS is your server running? The minimum PHP version required for MantisBT 1.2.x is PHP 5.1, and I suspect that you have an older version...

nigel5

2009-07-14 09:06

reporter   ~0022494

I am running PHP version 5.1.6 under Apache 2.0.59 on CentOS.

I am not sure which version of PCRE I'm supposed to be using, but 4.5 is installed according to my PHP info.

dhx

2009-07-15 01:13

reporter   ~0022502

Last edited: 2009-07-15 01:17

I can confirm this. It's a problem with CentOS/RHEL where they're using an outdated/crippled version of PCRE/PHP that doesn't like Unicode regex. Possibly it isn't a CentOS/RHEL problem, but a problem with how it was compiled on the server I'm using (it is a shared server out of my control).

I've attached a patch for you nigel5... it is what I use, but will most likely be broken for Unicode characters.

John: is there any chance this regex can be modified to not require Unicode matching?
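
A quick way to tell whether a given PHP build is affected is to ask PCRE itself. The sketch below is only an illustration, not MantisBT code: the function name `pcre_supports_unicode_properties()` is hypothetical, and the `PCRE_VERSION` constant is not defined on older PHP releases, hence the `defined()` guard.

```php
<?php
// Hypothetical helper (not part of MantisBT): reports whether this PHP
// build's PCRE accepts the /u modifier together with the \pL property escape.
function pcre_supports_unicode_properties() {
	// On builds without Unicode support the pattern fails to compile and
	// preg_match() returns false; @ hides the resulting warning.
	return @preg_match( '/^\pL$/u', 'a' ) === 1;
}

// PCRE_VERSION is missing on older PHP releases, so guard the lookup.
$t_pcre_version = defined( 'PCRE_VERSION' ) ? PCRE_VERSION : 'unknown';

echo 'PCRE version: ', $t_pcre_version, "\n";
echo 'Unicode properties: ', pcre_supports_unicode_properties() ? 'yes' : 'no', "\n";
```

If the second line prints "no", the `\pL` class and the `u` modifier in `core/string_api.php` can be expected to trigger the warning from this report.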

dhx

2009-07-15 01:15

reporter  

10718-workaround.diff (1,073 bytes)   
--- a/core/string_api.php	2009-07-15 14:50:44.000000000 +1000
+++ b/core/string_api.php	2009-07-15 14:26:39.000000000 +1000
@@ -457,7 +457,7 @@
 		$t_url_hex = '%[[:digit:]A-Fa-f]{2}';
 
 		# valid set of characters that may occur in url scheme. Note: - should be first (A-F != -AF).
-		$t_url_valid_chars = '-_.,!~*\';\/?%^\\\\:@&={\|}+$#[:alnum:]\pL';
+		$t_url_valid_chars = '-_.,!~*\';\/?%^\\\\:@&={\|}+$#[:alnum:]\w';
 
 		$t_url_chars = "(?:${t_url_hex}|[${t_url_valid_chars}\(\)\[\]])";
 		$t_url_chars2 = "(?:${t_url_hex}|[${t_url_valid_chars}])";
@@ -467,7 +467,7 @@
 		$t_url_part1 = "${t_url_chars}";
 		$t_url_part2 = "(?:\(${t_url_chars_in_parens}*\)|\[${t_url_chars_in_brackets}*\]|${t_url_chars2})";
 
-		$s_url_regex = "/(([[:alpha:]][-+.[:alnum:]]*):\/\/(${t_url_part1}*?${t_url_part2}+))/sue";
+		$s_url_regex = "/(([[:alpha:]][-+.[:alnum:]]*):\/\/(${t_url_part1}*?${t_url_part2}+))/se";
 	}
 
 	$p_string = preg_replace( $s_url_regex, "'<a href=\"'.rtrim('\\1','.').'\">\\1</a> [<a href=\"'.rtrim('\\1','.').'\" target=\"_blank\">^</a>]'", $p_string );

nigel5

2009-07-15 03:10

reporter   ~0022503

Thanks dhx, that rocks.

While URLs aren't Unicode, the processing could be running through Unicode text. In UTF-8, for example, aren't only the extended characters 2 bytes? So while searching for URLs, they won't need Unicode matching?

Wild guess? Thanks though.
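
On the byte-length question: in UTF-8 only ASCII characters occupy a single byte, and every other character takes two to four bytes. Without the `u` modifier PCRE treats the subject as a sequence of bytes, so a single non-ASCII letter looks like several characters to the pattern. The sketch below illustrates this, assuming a PCRE build with Unicode support; the function name `describe_char()` and the sample characters are made up for the example.

```php
<?php
// Hypothetical helper: byte length of a character, and whether "exactly one
// character" (/^.$/) matches it in byte mode and in UTF-8 mode.
function describe_char( $p_char ) {
	return array(
		'bytes'     => strlen( $p_char ),              // strlen() counts bytes
		'byte_mode' => preg_match( '/^.$/', $p_char ),  // . matches one byte
		'utf8_mode' => preg_match( '/^.$/u', $p_char ), // . matches one code point
	);
}

// Sample characters written as explicit UTF-8 byte sequences.
$t_samples = array(
	'a'         => 'a',
	'u-umlaut'  => "\xC3\xBC",
	'euro sign' => "\xE2\x82\xAC",
	'emoji'     => "\xF0\x9F\x98\x80",
);

foreach( $t_samples as $t_name => $t_char ) {
	$t_info = describe_char( $t_char );
	printf( "%-10s bytes=%d byte-mode=%d utf8-mode=%d\n",
		$t_name, $t_info['bytes'], $t_info['byte_mode'], $t_info['utf8_mode'] );
}
```

Only the ASCII sample is a single character in byte mode, which is why a pattern that should accept non-ASCII letters in URLs needs UTF-8 mode.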

jreese

2009-10-06 08:55

reporter   ~0023082

Well, URLs can use Unicode characters these days, so these regexes need to check for Unicode characters in URLs for users with non-ASCII domain names or paths. So no, there's no "correct" way to remove Unicode processing from these regexes, although removing the 'u' modifier should suffice for those with RHEL installs.

jreese

2009-10-06 10:00

reporter   ~0023083

I'm tabling this issue, as I feel it's a problem with Red Hat, not PHP or MantisBT. It would be impractical either to remove Unicode support for URLs altogether or to devise some random method of determining how and when to use Unicode regexes. Hopefully RHEL/CentOS will release a new version soon that has a proper PHP build for once...
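
For readers stuck on such a build, one possible shape of the "determine when to use Unicode regexes" approach that this comment considers impractical is sketched below. It is an illustration under stated assumptions, not something MantisBT does: the function name `url_pattern_for_this_build()` is hypothetical, and the pattern is a much-simplified stand-in for `$s_url_regex`.

```php
<?php
// Hypothetical sketch (not MantisBT code): choose the letter class and
// modifiers according to what the local PCRE build can compile.
function url_pattern_for_this_build() {
	// Probe the build; without Unicode support preg_match() returns false.
	$t_unicode_ok = @preg_match( '/\pL/u', 'a' ) === 1;

	$t_letters   = $t_unicode_ok ? '\pL' : '\w';
	$t_modifiers = $t_unicode_ok ? 'su' : 's';

	// Simplified: scheme, "://", then a run of allowed URL characters.
	return '/[[:alpha:]][-+.[:alnum:]]*:\/\/[-_.\/[:alnum:]' . $t_letters . ']+/' . $t_modifiers;
}

$t_text = 'See http://example.com/docs for details.';
if( preg_match( url_pattern_for_this_build(), $t_text, $t_matches ) ) {
	echo $t_matches[0], "\n";
}
```

On a build without Unicode support the fallback branch mirrors the attached workaround, with the same trade-off: non-ASCII letters in URLs will most likely not be matched.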

giallu

2010-04-02 12:31

reporter   ~0025000

I've been using Mantis on a CentOS 5 server for ages and have never seen such an issue with PHP regexps in Mantis.
Additionally, I'm pretty sure I'm using /u modifiers in some regexps of mine. Maybe you just need to apply the available updates.