View Issue Details

ID: 0008353
Project: mantisbt
Category: tagging
View Status: public
Last Update: 2008-08-11 09:41
Reporter: matthy
Assigned To: giallu
Priority: normal
Severity: tweak
Reproducibility: always
Status: closed
Resolution: fixed
Product Version: 1.1.0a4
Fixed in Version: 1.2.0a2
Summary: 0008353: Handling accentuated tags
Description

Mantis doesn't accept accented tags like "règles" or "évolution"...
Here is my suggestion, which should handle every kind of special character (at least in European languages): in core/tag_api.php, replace tag_name_is_valid with these two functions:

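// Validate a tag name after its accented letters have been reduced to plain ones.
function tag_name_is_valid( $p_name, &$p_matches, $p_prefix="" ) {
    // First character: ASCII letter or digit. Rest: letters, digits, high bytes, "-", "_", "." and space.
    $t_pattern = "/^$p_prefix([a-zA-Z0-9][a-zA-Z0-9\xc0-\xdd\xe0-\xff-_. ]*)$/";
    return preg_match( $t_pattern, replace_accents($p_name), $p_matches );
}
// Replace accented letters with their base letter (for example "è" with "e").
function replace_accents($str) {
  // Encode special characters as named entities, e.g. "è" becomes "&egrave;"
  $str = htmlentities($str, ENT_COMPAT, "UTF-8");
  // Keep only the base letter of each accent entity, e.g. "&egrave;" becomes "e"
  $str = preg_replace('/&([a-zA-Z])(uml|acute|grave|circ|tilde|cedil|ring);/','$1',$str);
  // Decode any entities that are left
  return html_entity_decode($str);
}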

Of course, if you have a better piece of code, feel free to tell me, I'll gladly fix mine ;-)
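
To illustrate what the proposal above does, here is a minimal sketch. The function names replace_accents_sketch and tag_name_is_valid_sketch are hypothetical copies made for this example, the $p_matches argument is dropped for brevity, and the charset is passed explicitly to html_entity_decode, which the original code does not do. It is an illustration under those assumptions, not the code that was eventually committed.

```php
<?php
// Hypothetical copy of the proposed helper, with the charset made explicit.
function replace_accents_sketch( $p_str ) {
    // "règles" becomes "r&egrave;gles"
    $t_str = htmlentities( $p_str, ENT_COMPAT, 'UTF-8' );
    // "&egrave;" becomes "e": only the base letter of the entity name is kept
    $t_str = preg_replace( '/&([a-zA-Z])(uml|acute|grave|circ|tilde|cedil|ring);/', '$1', $t_str );
    // Decode whatever entities remain back to UTF-8
    return html_entity_decode( $t_str, ENT_COMPAT, 'UTF-8' );
}

// Hypothetical copy of the proposed validator, without the $p_matches argument.
function tag_name_is_valid_sketch( $p_name, $p_prefix = '' ) {
    $t_pattern = "/^$p_prefix([a-zA-Z0-9][a-zA-Z0-9\xc0-\xdd\xe0-\xff-_. ]*)$/";
    return preg_match( $t_pattern, replace_accents_sketch( $p_name ) );
}

echo replace_accents_sketch( 'règles' ), "\n";      // regles
echo replace_accents_sketch( 'évolution' ), "\n";   // evolution
var_dump( tag_name_is_valid_sketch( 'évolution' ) ); // int(1)
var_dump( tag_name_is_valid_sketch( 'тег' ) );       // int(0)
```

Note that the tag is only validated against the accent-free form. A Cyrillic tag such as "тег" has no accent entities to strip and does not start with an ASCII letter or digit, so this approach still rejects it.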

Tags: localization, port to stable

Relationships

has duplicate 0008649 (closed, jreese): Tags in Russian - can't be created!

Activities

jreese

2007-09-27 14:57

reporter   ~0015745

Forgive me for my ignorance when it comes to unicode/utf/whatever. I would like to support this, but I'm from the US, where international character support is not really taken into account, even in top-tier university classes. I'm not familiar with how unicode/utf works across international borders and such, so bear with me please.

What specifically are these changes doing? Is the ...\xc0-\xdd\xe0-\xff... specifying unicode characters that are language-independent? Will these characters appear correctly to users browsing with a different language than that of the person who entered the tag name?

And can you give me a one or two sentence description of the purpose for replace_accents()? Is it simply stripping out all the accents and replacing them with generic letters, or is it doing something more complicated than that?

matthy

2007-10-01 04:34

reporter   ~0015779

Last edited: 2007-10-01 04:47

Well, I tried different methods to get this result (replacing all accents with plain letters so the tag is allowed)... First with replace_accents() (which I found in the PHP manual), but it didn't always work; there were still some unhandled characters...
So I added some hex values in preg_match, and it worked. Maybe it is because accents are sometimes encoded in UTF, and sometimes in Unicode...? I really don't know :-\

Since then, I've modified other pieces of code to adapt Mantis to French use, for example in the email stuff... I couldn't find a way to handle accents in raw text emails, so I switched everything to HTML and did some replacements (é -> &eacute; etc.); only the email subject couldn't be handled that way, so I used the same method as above (replace_accents(), etc.).
I also changed a lot of things in string_french.txt, replacing accents with HTML entities (&eacute; etc.), because SOMETIMES (that's what's very weird: not always!) it generated bad characters on the webpage.

This is surely not very scientific, but my company was in a hurry to set up Mantis in time for a client, so I did it fast :) If you want, I can give you my current code, so you can see where I had to handle accents.

jreese

2007-10-01 10:36

reporter   ~0015781

If you could make a patch against the CVS Head (I'm assuming that's what you're using due to the Product Build listed), that would be greatly appreciated, and would make it much easier to test your changes, etc.

matthy

2007-10-01 10:54

reporter   ~0015782

Last edited: 2007-10-01 10:57

Errr... I used WinCVS to download the latest CVS build in early September, but I don't know how to select only some of my modifications (and not all of them; I changed a lot of stuff to meet our client's specifications) and put them in a patch :-\
Maybe I could send you an email?

Edit: I changed core/tag_api.php, csv_export.php, email_api.php and custom_strings_inc.php (to replace the string_french accents without modifying the original file).

jreese

2007-10-02 10:35

reporter   ~0015791

If you can't do a diff/patch, then just zip up the affected files and attach that here.

ave

2007-12-17 05:30

reporter   ~0016455

Last edited: 2007-12-17 05:52

What happens if we modify the tag_name_is_valid function as follows?

function tag_name_is_valid( $p_name, &$p_matches, $p_prefix="" ) {
    // Accept anything that does not start with a period or a space.
    $t_pattern = "/^$p_prefix([^. ].*)$/";
    return preg_match( $t_pattern, $p_name, $p_matches );
}

I believe it can handle any characters because Mantis 1.1.x uses UTF-8.
If there is no particular reason to avoid non-ASCII characters, why don't you just allow them and let us test, please?

I have modified the tag_api.php locally as above, and it seems to work OK with add/remove/search operations.

--
Sorry if the reason is obvious, I have just taken a quick look at the requirement page and source code but couldn't find any.
I really would like to use the tagging feature, but couldn't force users to enter English tags, you know...

jreese

2007-12-17 10:01

reporter   ~0016457

Well, we do want to match spaces, underscores, periods, etc., but we don't want to match commas, plus/minus signs, and such. I also don't like the idea of just allowing everything else, but I can try to look into it.
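
The constraint described above can be sketched as a blacklist check: accept any characters, including non-ASCII ones, and reject only the ones that have a special meaning. The function name tag_name_is_valid_blacklist is hypothetical, and the rejected set (comma, plus, minus, plus a leading space or period) is an assumption taken from this comment and the pattern suggested earlier. It is not the rule that MantisBT ended up shipping.

```php
<?php
// Hypothetical blacklist validator. The rejected characters are an
// assumption based on this thread, not the shipped MantisBT rule.
function tag_name_is_valid_blacklist( $p_name ) {
    // Reject empty names and names that start with a space or a period.
    if( $p_name === '' || $p_name[0] === ' ' || $p_name[0] === '.' ) {
        return false;
    }
    // Reject the separator (comma) and the filter operators (plus, minus)
    // anywhere in the name. Every other byte is accepted as-is.
    return preg_match( '/[,+\-]/', $p_name ) === 0;
}

var_dump( tag_name_is_valid_blacklist( 'règles' ) );         // bool(true)
var_dump( tag_name_is_valid_blacklist( 'тег' ) );            // bool(true)
var_dump( tag_name_is_valid_blacklist( 'port to stable' ) ); // bool(true)
var_dump( tag_name_is_valid_blacklist( 'a,b' ) );            // bool(false)
var_dump( tag_name_is_valid_blacklist( 'c++' ) );            // bool(false)
```

Because the rejected characters are all ASCII, the check works byte by byte on UTF-8 input without needing to know anything about accents or alphabets.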

ave

2007-12-17 10:36

reporter   ~0016458

Thanks! If you need to test something, I would be glad to help :)

Kirill

2008-01-17 09:59

reporter   ~0016699

When will Mantis tags support UTF-8 non-Latin characters?

konstbel

2008-02-04 09:05

reporter   ~0016936

Last edited: 2008-02-04 09:24

I tried to insert a tag in Russian and got "Create permission denied."
Is this the same case as above?
(Mantis 1.1.1)

I do not understand what you mean by "we do want to match spaces, underscores, periods, etc, but we don't want to match commas, plus/minus signs, and such". Match when? During the uniqueness check? When searching? Filtering?

ave

2008-02-04 22:11

reporter   ~0016950

Hi konstbel,

> I tried to insert a tag in Russian and got "Create permission denied."
> Is this the same case as above?

I believe it's the same case.

> Match when?

When validating a user-entered tag string (replacing the word 'match' with 'allow' might make more sense).
Those special characters should not be in the tag string itself.

--
I encourage you to comment on this issue if you need the improvement because that may change the priority, but you should not expect an answer to the question 'When?'.

konstbel

2008-02-05 04:42

reporter   ~0016952

Hi, Ave,
Why do we need to validate the tag string?
Why not allow tags like "This_is my super++ tag!"? I thought this was just the user's choice. Maybe disallow only the comma, since you use it as a separator.

By the way, I can't find a page for managing tags (edit, delete). Is there one?

ave

2008-02-05 05:09

reporter   ~0016954

For example, '+' is used when filtering issues. Check wiki for details.

http://www.mantisbt.org/wiki/doku.php/mantisbt:tagging_requirements?s=tags

> I can't find a page for managing tags (edit, delete). Is there one?

Clicking one of the attached tags brings you to the 'Tag Details' screen.
If you have sufficient privileges, you'll find a button to edit/delete the tag.

konstbel

2008-02-05 05:44

reporter   ~0016955

So, "as designed" you have to disallow spaces, +, - and commas. OK, we could live with that.
But disallowing local characters is VERY inconvenient. I suspect the reason is the strtoupper() function, which fails on localized strings, isn't it?

> Clicking one of the attached tags brings you to the 'Tag Details' screen

So, there is no page where I could review all tags together?
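
The strtoupper() behaviour suspected above can be seen in a small sketch. Nothing in this thread confirms that strtoupper() is the actual reason non-ASCII tags are rejected. The example only shows how the byte-oriented function differs from its multibyte counterpart on UTF-8 input, and it assumes the mbstring extension is enabled.

```php
<?php
// Assumes the mbstring extension is enabled.
$t_tag = 'évolution';

// Byte-oriented: in the default C locale (and always on PHP 8.2 and later),
// only the ASCII letters are converted and the "é" is left untouched.
echo strtoupper( $t_tag ), "\n";

// Multibyte-aware: the leading "é" is converted as well.
echo mb_strtoupper( $t_tag, 'UTF-8' ), "\n"; // ÉVOLUTION
```

Any code that compares tag names case-insensitively through strtoupper() would therefore treat "évolution" and "Évolution" as different names, while mb_strtoupper() would not.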

jreese

2008-02-07 23:13

reporter   ~0016982

Giallu has mentioned that he has a patch for this issue. As I don't know enough about unicode/utf-8, I'm transferring ownership to him.

giallu

2008-07-29 06:48

reporter   ~0018903

Fixed with the commit at:

http://mantisbt.svn.sourceforge.net/mantisbt/?rev=5437&view=rev

Related Changesets

MantisBT: master 077682f1

2008-07-28 12:55

giallu


Fix 8353: Handling accentuated tags.

This is done by changing the whitelisting method to a blacklist
Also added proper form submission when some tags are invalid

git-svn-id: http://mantisbt.svn.sourceforge.net/svnroot/mantisbt/trunk@5437 f5dc347c-c33d-0410-90a0-b07cc1902cb9
Affected Issues
0008353
mod - core/tag_api.php
mod - tag_attach.php