@ogun
Last active March 27, 2021 13:00
MongoDB v3 & v4 case insensitivity and diacritic insensitivity bug for Turkish. https://jira.mongodb.org/browse/SERVER-26658
// Creation script
db.tr_TR.drop()
db.tr_TR.createIndex({val: "text"}, {default_language: "turkish"})
db.tr_TR.insert({ _id: "lower_dotless", val : "quıt" })
db.tr_TR.insert({ _id: "lower_withdot", val : "quit" })
db.tr_TR.insert({ _id: "upper_dotless", val : "QUIT" })
db.tr_TR.insert({ _id: "upper_withdot", val : "QUİT" })
// Query
db.tr_TR.find({$text: {$search: "quit", $language: "tr", $caseSensitive: false, $diacriticSensitive: false}})
// Expected results: all four documents should match
// > { _id: "lower_dotless", val : "quıt" }
// > { _id: "lower_withdot", val : "quit" }
// > { _id: "upper_dotless", val : "QUIT" }
// > { _id: "upper_withdot", val : "QUİT" }
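
The expected results above treat all four spellings of the letter (i, ı, I, İ) as equivalent. A minimal sketch of that folding done on the client side is shown below. The helper name `foldTurkish` is hypothetical and not part of MongoDB, and this is one possible workaround under that assumption, not a fix for the server issue.

```javascript
// Hypothetical helper (not a MongoDB API): folds case and the Turkish
// dotted/dotless "i" variants into plain ASCII "i".
function foldTurkish(text) {
  return text
    // Map U+0130 (İ), U+0131 (ı) and ASCII "I" to "i" BEFORE lowercasing.
    // The default toLowerCase() would turn "İ" into "i" followed by a
    // combining dot (U+0307), which would not compare equal to "i".
    .replace(/[\u0130\u0131I]/g, "i")
    .toLowerCase();
}

// The four values from the creation script all fold to the same string.
const values = ["qu\u0131t", "quit", "QUIT", "QU\u0130T"];
const folded = values.map(foldTurkish);
console.log(folded.every((v) => v === "quit"));
```

One way to use this would be to store the folded value in an extra field next to `val` and apply the same folding to the search term before querying. Note that this deliberately conflates ı and i, which suits diacritic-insensitive matching but is not a correct Turkish case mapping on its own.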
ogun commented Mar 27, 2021

This is a known bug, and MongoDB still has no plans to fix it.
https://jira.mongodb.org/browse/SERVER-26658
