@tkrajina
Created January 31, 2016 08:21
Golang remove accents
package main

import (
	"fmt"
	"unicode"

	"golang.org/x/text/transform"
	"golang.org/x/text/unicode/norm"
)

func isMn(r rune) bool {
	return unicode.Is(unicode.Mn, r) // Mn: nonspacing marks
}

func main() {
	s := "Yoùr Śtring šđč枊ĐČĆŽ Ötzi's Nationalität èàì"
	b := make([]byte, len(s))
	t := transform.Chain(norm.NFD, transform.RemoveFunc(isMn), norm.NFC)
	_, _, e := t.Transform(b, []byte(s), true)
	if e != nil {
		panic(e)
	}
	fmt.Println(string(b))
}
@mh-cbon

mh-cbon commented Oct 23, 2017

nDst, _, e := t.Transform(b, []byte(s), true)
if e != nil {
        panic(e)
}

fmt.Println(string(b[:nDst]))

Otherwise there were some trailing \x00 bytes.

@glebtv

glebtv commented Jan 17, 2018

I had to increase the destination buffer size: the string is sometimes longer after this transform, which leads to a "transform: short destination buffer" error.

@micheltlutz

micheltlutz commented Mar 27, 2018

I'm using this to rename files, but I'm having trouble because it removes the "." from the file name.
Any suggestions?

@guillaq

guillaq commented Feb 6, 2021

A little late to the party, but it looks like transform.RemoveFunc is now deprecated. You can use runes.Remove instead:

transform.Chain(norm.NFD, runes.Remove(runes.In(unicode.Mn)), norm.NFC)

https://play.golang.org/p/ZcSR45sdlCh

@estebanbacl

Sorry, but the Ñ is not an accented character in Spanish.

@conur-floki

This works, but it treats the Ñ as an accented character, so it does not work for Spanish.
