@kevinmorio
Created December 15, 2020 14:52
ZotFile user wildcards for title and original date
{
  "T": {
    "field": "title",
    "operations": [
      {
        "function": "replace",
        "regex": "\\s*[.:;?!|].*",
        "replacement": ""
      },
      {
        "function": "replace",
        "regex": " \\- ",
        "replacement": " ",
        "flags": "g"
      },
      {
        "function": "replace",
        "regex": "<i>|<\/i>|<b>|<\/b>|<sup>|<\/sup>|<sub>|<\/sub>|<span[^>]+>|<\/span>",
        "replacement": "",
        "flags": "g"
      },
      {
        "function": "replace",
        "regex": "[\\\/()~!@#$%^&*{«»„““”‘’|….,;`^<>'}+:?!®©]*",
        "replacement": "",
        "flags": "g"
      },
      {
        "function": "trim"
      }
    ]
  },
  "O": {
    "field": "extra",
    "operations": [
      {
        "function": "replace",
        "regex": "[\\s\\S]*original-date: (\\d{4})[\\s\\S]*|[\\s\\S].*[\\s\\S]",
        "replacement": "$1"
      }
    ]
  }
}
@kevinmorio
Author

The %T wildcard does the following:

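  1. Truncate the title at the first ., :, ;, ?, !, or |, dropping that character and any whitespace directly before it.
  2. Replace " - " (a hyphen surrounded by spaces) with a single space.
  3. Remove formatting tags (<i>, <b>, <sup>, <sub>, <span>).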
  4. Remove punctuation (keeps hyphens in compound words).
  5. Trim whitespace at the beginning and end.
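
The five steps above can be tried outside Zotero with a small Python sketch. This is only an approximation for experimenting with titles: the function name emulate_title_wildcard is made up for this example, the patterns are hand-translated from the JSON to Python's re syntax, and ZotFile itself evaluates the regexes in JavaScript, so edge cases may differ.

```python
import re

# Patterns hand-translated from the %T operations (assumption: Python's `re`
# behaves like the JavaScript regex engine for these simple patterns).
T_PATTERNS = [
    # 1. Cut the title at the first of . : ; ? ! | (plus whitespace before it).
    (r"\s*[.:;?!|].*", ""),
    # 2. Replace a hyphen surrounded by spaces with a single space.
    (r" - ", " "),
    # 3. Drop the formatting tags Zotero allows in titles.
    (r"<i>|</i>|<b>|</b>|<sup>|</sup>|<sub>|</sub>|<span[^>]+>|</span>", ""),
    # 4. Drop punctuation; the hyphen is not in the set, so compound words
    #    survive. Duplicate characters from the original class are omitted.
    (r"[/()~!@#$%^&*{«»„“”‘’|….,;`<>'}+:?®©]+", ""),
]


def emulate_title_wildcard(title: str) -> str:
    """Apply the %T operations in order, then trim (step 5)."""
    for pattern, replacement in T_PATTERNS:
        # re.sub replaces every match; for pattern 1 that is the same as a
        # single replacement, because the first match runs to the end.
        title = re.sub(pattern, replacement, title)
    return title.strip()


print(emulate_title_wildcard("Deep Learning: A Primer"))
# -> Deep Learning
print(emulate_title_wildcard("Self-Organizing Maps - An <i>Overview</i> (2nd Edition)"))
# -> Self-Organizing Maps An Overview 2nd Edition
```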

The %O wildcard does the following:

  1. Extract the year of the original-date entry in the extra field if present. Otherwise return the empty string.

The %O wildcard can be used with a fallback like {%O|%y} which expands to the original date if it is present and the regular date otherwise.
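
As a rough illustration of the intended behaviour (not ZotFile's own code), the extraction and the fallback can be sketched in Python. The function names original_year and year_with_fallback are hypothetical, and the fallback simply models "use %O if it is non-empty, otherwise %y" as described above; it does not prove how ZotFile evaluates {%O|%y}.

```python
import re


def original_year(extra: str) -> str:
    """Return the four-digit year of an 'original-date: YYYY...' entry, or ''."""
    match = re.search(r"original-date: (\d{4})", extra)
    return match.group(1) if match else ""


def year_with_fallback(extra: str, item_year: str) -> str:
    """Model of {%O|%y}: prefer the original year, else the regular year."""
    return original_year(extra) or item_year


extra_field = "original-date: 1867-09-14\nDOI: 10.1000/example"
print(year_with_fallback(extra_field, "2020"))           # -> 1867
print(year_with_fallback("DOI: 10.1000/example", "2020"))  # -> 2020
```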

To use the wildcards, remove all newlines from the JSON, open the config editor in Zotero (Preferences -> Advanced -> Config Editor), and paste the result as the value of the extensions.zotfile.wildcards.user property.
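
Removing the newlines by hand is error-prone, so one option is to let a JSON library do it. The sketch below is a suggestion rather than part of the original workflow; the helper name to_single_line is made up. It parses the text, which also catches syntax errors, and writes it back on a single line. The output is equivalent JSON, although escapes may be normalised (for example \/ becomes /).

```python
import json


def to_single_line(wildcards_json: str) -> str:
    """Parse the wildcard definition and re-serialise it without newlines."""
    parsed = json.loads(wildcards_json)  # raises ValueError on invalid JSON
    # ensure_ascii=False keeps characters such as « and » readable.
    return json.dumps(parsed, ensure_ascii=False, separators=(",", ":"))


example = """{
  "O": {
    "field": "extra"
  }
}"""
print(to_single_line(example))
# -> {"O":{"field":"extra"}}
```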

@foice

foice commented Sep 14, 2021

Where is the | fallback documented? In my tests, if both %O and %y give results, both appear in the output.

@rnuske

rnuske commented Oct 2, 2021

You have a fine collection of operations here. I also needed to replace various dashes and used the following successfully:

{
  "function": "replace",
  "regex": "[\\u2010-\\u2015\\u002D\\u2212\\uFE58\\uFE63\\uFF0D]",
  "replacement": "-"
}
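
To check which characters that class covers, the same pattern can be run in Python, whose re module accepts the same \uXXXX escapes. This is only a test sketch; the function name normalize_dashes is hypothetical, and it replaces every match, which may or may not be what ZotFile does when no flags are given.

```python
import re

# Same character class as in the operation above: the Unicode hyphens and
# dashes U+2010..U+2015, the ASCII hyphen-minus, the minus sign, and the
# small and fullwidth hyphen/dash variants.
DASHES = re.compile(r"[\u2010-\u2015\u002D\u2212\uFE58\uFE63\uFF0D]")


def normalize_dashes(text: str) -> str:
    """Replace every dash-like character with an ASCII hyphen-minus."""
    return DASHES.sub("-", text)


print(normalize_dashes("Self\u2010Organizing Maps \u2013 1990\u20142000"))
# -> Self-Organizing Maps - 1990-2000
```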
