A weird case of special characters combination in search input

How Do I?

Tansel
32 posts
9 years ago

I came across this weird case in the search part of my work. I have no idea if it is PHP and/or MySQL related.

It happens only with the ğ and Ğ characters of the Turkish alphabet. When these characters are used in the search input field in any of the combinations below, with only a and A:

search term input  ->  returned keywords variable

 ağa   ->    ğ
 Ağa   ->    ğ
 ağA   ->    ğ
 AğA   ->    ğ
 aĞa   ->    Ğ
 AĞa   ->    Ğ
 aĞA   ->    Ğ
 AĞA   ->    Ğ
 aĞ    ->    Ğ
 ağ    ->    ğ
 Ağ    ->    ğ
 AĞ    ->    Ğ
 ğA    ->    ğ
 ğa    ->    ğ
 Ğa    ->    Ğ
 ĞA    ->    Ğ

then the returned keywords variable on the results page becomes only ğ or Ğ.

In short, if the ğ or Ğ character appears in the search terms next to a or A, every a and A is simply deleted.

I tried all other combinations for this character. It only happens when ğ or Ğ is preceded or followed by a single a or A.

All my settings are utf-8 and I didn’t have any other problem with such characters so far.

Any idea how to fix it?

       
Derek Jones
7,561 posts
9 years ago

Try this for me, S-Cube: open system/ee/EllisLab/Addons/search/mod.search.php and, on line 199, change:

$part = preg_replace("/\b".preg_quote($badword, '/')."\b/i","", $part);

to

$part = preg_replace("/\b".preg_quote($badword, '/')."\b/iu","", $part);

Notice the u added as a regex modifier. I think what may be happening is that, without it, the pattern is matched byte by byte, so ğ, Ğ, etc. are treated as non-word characters that create word boundaries, and the a's are then removed as stop words. In other words, instead of ağa, I believe it’s picking up your input as a ğ a. Let me know if that fixes it.
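For anyone who wants to reproduce the effect outside ExpressionEngine, here is a minimal sketch. The helper `strip_stop_words()` and the one-item stop-word list are hypothetical and only reuse the pattern shape from the line quoted above; they are not the search module's actual code. It assumes the file is saved as UTF-8 and that PHP is running with its default C locale.

```php
<?php
// Hypothetical helper, not ExpressionEngine's real code: removes each stop
// word from $terms with the same \b...\b pattern shape as the quoted line.
function strip_stop_words(string $terms, array $stopWords, bool $utf8): string
{
    // 'i' = case-insensitive; 'u' = treat pattern and subject as UTF-8.
    $modifiers = $utf8 ? 'iu' : 'i';

    foreach ($stopWords as $badword) {
        $pattern = '/\b' . preg_quote($badword, '/') . '\b/' . $modifiers;
        $terms = preg_replace($pattern, '', $terms);
    }

    return $terms;
}

$stopWords = ['a']; // assumed stop-word list, for illustration only

// Without 'u': the two bytes that encode "ğ" are not word characters, so
// each "a" stands alone between word boundaries and should be stripped.
var_dump(strip_stop_words('ağa', $stopWords, false));

// With 'u': "ğ" counts as a letter, so "ağa" is one word and should survive.
var_dump(strip_stop_words('ağa', $stopWords, true));
```

Under those assumptions, the first call should reproduce the lone ğ from the table in the original post, and the second should leave the term untouched.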

       
Tansel
32 posts
9 years ago

That fixed it, Derek. Thanks…

       
Tansel
32 posts
9 years ago

This item in EE 2.10.3’s changelog gave me a big smile 😊

Fixed a bug where stop word removal in the search module was not UTF-8 compatible. Zaro Ağa is no longer Zaro Ğ.

Thanks…

       
Derek Jones
7,561 posts
9 years ago

😊 Thanks for letting us know the issue so we could fix it!

       
