Skip to content

Instantly share code, notes, and snippets.

@jgdoncel
Last active April 19, 2024 12:20
Show Gist options
  • Save jgdoncel/bc20b39b8cd612c4a26dfcaf3bb14dd8 to your computer and use it in GitHub Desktop.
Save jgdoncel/bc20b39b8cd612c4a26dfcaf3bb14dd8 to your computer and use it in GitHub Desktop.
MySQL Function to remove accents and special characters
DROP FUNCTION IF EXISTS fn_remove_accents;
DELIMITER |
CREATE FUNCTION fn_remove_accents( textvalue VARCHAR(10000) ) RETURNS VARCHAR(10000)
BEGIN
SET @textvalue = textvalue COLLATE utf8_general_ci;;
-- ACCENTS
SET @withaccents = 'ŠšŽžÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝŸÞàáâãäåæçèéêëìíîïñòóôõöøùúûüýÿþƒ';
SET @withoutaccents = 'SsZzAAAAAAACEEEEIIIINOOOOOOUUUUYYBaaaaaaaceeeeiiiinoooooouuuuyybf';
SET @count = LENGTH(@withaccents);
WHILE @count > 0 DO
SET @textvalue = REPLACE(@textvalue, SUBSTRING(@withaccents, @count, 1), SUBSTRING(@withoutaccents, @count, 1));
SET @count = @count - 1;
END WHILE;
-- SPECIAL CHARS
SET @special = '«»’”“!@#$%¨&()_+=§¹²³£¢¬"`´{[^~}]<,>.:;?/°ºª+|\''';
SET @count = LENGTH(@special);
WHILE @count > 0 do
SET @textvalue = REPLACE(@textvalue, SUBSTRING(@special, @count, 1), '');
SET @count = @count - 1;
END WHILE;
RETURN @textvalue;
END
|
DELIMITER ;
@fakir22
Copy link

fakir22 commented Jun 20, 2018

Nice work thanks ! 👍

Therefore, I was facing the following error when executing a query using your function :

#1270 - Illegal mix of collations (utf8_general_ci,IMPLICIT), (utf8_unicode_ci,IMPLICIT), (utf8_unicode_ci,IMPLICIT) for operation 'replace'

I've managed to resolve my issue by editing the definition line 7 to

SET @textvalue = textvalue COLLATE utf8_unicode_ci;

Hope this helps someone.

@mkalognomos
Copy link

mkalognomos commented Sep 13, 2018

I 've add some more characters.

Replace this:
SET @special = '!@#$%¨&()_+=§¹²³£¢¬"`´{[^~}]<,>.:;?/°ºª+|\''';

With this:
SET @special = '«»’”“!@#$%¨&()_+=§¹²³£¢¬"`´{[^~}]<,>.:;?/°ºª+|\''';

Enjoy :)

@leocavalcante
Copy link

Thank you very much.
And add to the @fakir22 comment, the collate should be collate of your database, in my case I have to:
SET @textvalue = textvalue COLLATE utf8_general_ci;

@Amkoulel
Copy link

Thank you for all in advance for your help.
I have to remove accents and special characters, exactly like above, not only for MySQL DB but for any DB system: Oracle/SQL Server/Posgre/...
Is this SET @textvalue = textvalue COLLATE utf8_general_ci; is enough to solve my problem?

@jgdoncel
Copy link
Author

I

Thank you for all in advance for your help.
I have to remove accents and special characters, exactly like above, not only for MySQL DB but for any DB system: Oracle/SQL Server/Posgre/...
Is this SET @textvalue = textvalue COLLATE utf8_general_ci; is enough to solve my problem?

The line:

SET @textvalue = textvalue COLLATE utf8_general_ci;

Is the way in MySQL to assign a value (the parameter) to a variable (@textvalue) for future modifications, therefore it doesn't remove any special character by itself. You need the whole function for that.

For other DBS you should create a similar function since the sintax would be different. Anyway, this is a quick and dirt solution to manipulate text that is already in the DB. For a "cross-db" solution perhaps a better approach would be cast the text before insterting in the db.

@Amkoulel
Copy link

Dear all,
Please @jgdoncel, below the CAST of the text renturned by the function. My question, is that's correct for cross-tab removing accents characters?
DROP FUNCTION IF EXISTS fn_remove_accents;
CREATE FUNCTION fn_remove_accents( textvalue NVARCHAR(max) ) RETURNS NVARCHAR(max)
BEGIN
DECLARE @textvalue NVARCHAR (max) = N'ŠšŽžÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝŸÞàáâãäåæçèéêëìíîïñòóôõöøùúûüýÿþƒ'
SET @textvalue = SELECT CAST (
REPLACE(@textvalue, 'ŠšŽžÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝŸÞàáâãäåæçèéêëìíîïñòóôõöøùúûüýÿþƒ', 'SsZzAAAAAAACEEEEIIIINOOOOOOUUUUYYBaaaaaaaceeeeiiiinoooooouuuuyybf')
)
RETURN @textvalue;
END ;

@cbichis
Copy link

cbichis commented Apr 22, 2020

Couple more accents to remove (from Romanian, most are not included on your list):

Ă with A
ă with a
Ș with S
ș with s
Ț with T
ț with t

@galvisibarra53
Copy link

thank you very much.
It helped me a lot, although I corrected a few lines to make it work for me, i attach it modified

DROP FUNCTION IF EXISTS fn_remove_accents;
DELIMITER |
CREATE FUNCTION fn_remove_accents( textvalue VARCHAR(10000) ) RETURNS VARCHAR(10000)

BEGIN

 SET @textvalue = textvalue ;

-- ACCENTS
SET @withaccents = 'ŠšŽžÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝŸÞàáâãäåæçèéêëìíîïñòóôõöøùúûüýÿþƒ';
SET @withoutaccents = 'SsZzAAAAAAACEEEEIIIINOOOOOOUUUUYYBaaaaaaaceeeeiiiinoooooouuuuyybf';
SET @count = LENGTH(@withaccents);

WHILE @count > 0 DO
    SET @textvalue = REPLACE(@textvalue, SUBSTRING(@withaccents, @count, 1), SUBSTRING(@withoutaccents, @count, 1));
    SET @count = @count - 1;
END WHILE;

-- SPECIAL CHARS
SET @special = '«»’”“!@#$%¨&()_+=§¹²³£¢¬"`´{[^~}]<,>.:;?/°ºª+|\'';
SET @count = LENGTH(@special);

WHILE @count > 0 do
    SET @textvalue = REPLACE(@textvalue, SUBSTRING(@special, @count, 1), '');
    SET @count = @count - 1;
END WHILE;
RETURN @textvalue;

END
|
DELIMITER ;

@tuandph
Copy link

tuandph commented Jan 5, 2022

First of all, thanks for good job jgdoncel.
for Vietnamese try this function. I modified a little bit @withaccents & @withoutaccents value to map with Tiếng Việt.

`DROP FUNCTION IF EXISTS fn_remove_accents;
DELIMITER |
CREATE FUNCTION fn_remove_accents( textvalue VARCHAR(10000) ) RETURNS VARCHAR(10000)

BEGIN

SET @textvalue = textvalue ;

-- ACCENTS
SET @withaccents = 'ăâđêôơưàảãạáằẳẵặắầẩẫậấèẻẽẹéềểễệếìỉĩịíòỏõọóồổỗộốờởỡợớùủũụúừửữựứỳỷỹỵýĂÂĐÊÔƠƯÀẢÃẠÁẰẲẴẶẮẦẨẪẬẤÈẺẼẸÉỀỂỄỆẾÌỈĨỊÍÒỎÕỌÓỒỔỖỘỐỜỞỠỢỚÙỦŨỤÚỪỬỮỰỨỲỶỸỴÝ';
SET @withoutaccents = 'aadeoouaaaaaaaaaaaaaaaeeeeeeeeeeiiiiiooooooooooooooouuuuuuuuuuyyyyyAADEOOUAAAAAAAAAAAAAAAEEEEEEEEEEIIIIIOOOOOOOOOOOOOOOUUUUUUUUUUYYYYY';
SET @count = LENGTH(@withaccents);

WHILE @count > 0 DO
SET @textvalue = REPLACE(@textvalue, SUBSTRING(@withaccents, @count, 1), SUBSTRING(@withoutaccents, @count, 1));
SET @count = @count - 1;
END WHILE;

-- SPECIAL CHARS
SET @special = '«»’”“!@#$%¨&()_+=§¹²³£¢¬"`´{[^~}]<,>.:;?/°ºª+|'';
SET @count = LENGTH(@special);

WHILE @count > 0 do
SET @textvalue = REPLACE(@textvalue, SUBSTRING(@special, @count, 1), '');
SET @count = @count - 1;
END WHILE;
RETURN @textvalue;
END
|
DELIMITER ;
`

@devicemxl
Copy link

I use in Azure Data Studio the next (only for symbols):

DROP FUNCTION IF EXISTS REMOVE_SYMBOLS;

CREATE FUNCTION REMOVE_SYMBOLS( text_value TEXT(100) )
RETURNS TEXT(100) DETERMINISTIC

BEGIN
SET @text_value = text_value COLLATE utf8_general_ci;
-- SPECIAL CHARS
-- Use the escape character \ before the single quote'
SET @special = '«»’”“!@#$%¨&()_+=§¹²³£¢¬"`´{[^~}]<,>.:;?/°ºª#+.-*|'';
SET @count = LENGTH(@special);

WHILE @count > 0 do
    SET @text_value = REPLACE(@text_value, SUBSTRING(@special, @count, 1), '');
    SET @count = @count - 1;
END WHILE;
RETURN @text_value;

END

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment