@acsr
Last active June 22, 2023 08:25
Two Python-based Automator workflows that convert selected text containing tags between two tag listing formats: the newline-separated Zotero tag list that Zutilo copies to the clipboard, and a Logseq properties list of tags and page links. The Logseq-to-Zotero direction tries to recognize all simple tags in a single line starting with "tags:: " or in a bunch of Logseq blocks (it fails on some).
#!/usr/bin/env python3
"""Logseq_tags_to_Zotero.workflow.py
Functions to convert a selected text containing Logseq tags to a different tag listing format (Zotero).
Currently aiming at bidirectional converting between Zotero and Logseq.
This is Work-in-Progress and may contain traces of nuts!
@Copyright 2023 by Armin Stross-Radschinski, ACSR industrialdesign developer@acsr.de
Licence: MIT, do what you like
How to install:
- Open Automator and create a new Automator "Service" workflow (gearwheel icon)
- In the search field for actions type "shell" and add a new step "execute shell script"
    - In the dropdown for Shell: select "/usr/local/bin/python3"
    - Replace the Python3 code with this gist
- For the settings at the top select
    - Workflow receives: "Text" in "all programs"
    - Input 1: None
    - Select: "Output replaces selected Text"
    - Choose a color and icon of your choice
- Save the workflow as "Logseq_tags_to_Zotero"
    - This will save the workflow as "Logseq_tags_to_Zotero.workflow" in your user's "~/Library/Services" directory
- Add a suitable keyboard shortcut in the "System Preferences -> Keyboard -> Shortcuts -> Services -> Text" section
    - In newer OS versions there will be a suggestion to open this during the initial save
    - We use "Control-Command-Alt-L"
- You can modify the Python code to your liking and share the finished *.workflow with the world
- To test the Python code, we suggest using a Jupyter Notebook.
    - The easiest way to install a Jupyter IDE from scratch is using Microsoft Visual Studio Code
    - Add the VSCode Python and Jupyter Notebook extensions
    - Create a new Logseq_tags_to_Zotero.ipynb file in your workspace and paste this code
    - To make the input/output handling work in the notebook, you need to uncomment some lines and comment out others to make the code aware of the new environment, e.g. providing proper sample data is important.
    - When you are ready, paste the modified code back and revert the input/output handling
Have fun
There is a companion script "Zotero_tags_to_Logseq.workflow.py" for the reverse transformation.
V 0.1.1 20230622_083116-acsr (adding install and license in the docstring)
20230519-20230622 by Armin Stross-Radschinski, acsr.github@dev.acsr.de for acsr/evenios
"""
import sys
import re
def Zotero_tags_to_Logseq(multiline_string):
    """Converts a newline delimited list of (Zotero) tags to a unique set
    of tags in logseq format. Tags are prefixed with a # and tags
    containing spaces are wrapped in double square brackets [[]].
    This does not preserve order!"""
    #print(type(multiline_string))
    #tags_list=multiline_string.splitlines()
    # expects an iterable of lines (e.g. sys.stdin or a list), not a single string
    tags_list=multiline_string
    #print(tags_list)
    #print(type(tags_list))
    logseq_tags_set=set()
    for item in tags_list:
        # strip the trailing newline of each line
        item=re.sub(r'\n', '', item)
        # skip blank lines, which would otherwise produce a bare "#"
        if not item.strip():
            continue
        if " " in item:
            item=f'[[{item}]]'
        item=f'#{item}'
        logseq_tags_set.add(item)
    #print(logseq_tags_set)
    logseq_tags=", ".join(logseq_tags_set)
    return logseq_tags
def Logseq_tags_to_Zotero(text):
    """Converts a line of mixed Logseq tags to a unique set
    of tags in Zotero newline delimited format. Tags markup prefixed with a #
    and pagelink type tags wrapped in double square brackets [[]] is removed as
    well as commas and a potential tags:: property prefix.
    This does not work with multiline input!"""
    tags=extract_logseq_tags(text)
    unique_tags_set=set(tags)
    zotero_tags="\n".join(unique_tags_set)
    return zotero_tags
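def extract_logseq_hashtags(text):
    """Get all hashtags in text without the #"""
    # the regular expression (raw string, so \w is not read as a string escape)
    regex=r"#(\w+)"
    # extracting the hashtags; str() also accepts the list of lines read from stdin
    hashtag_list=re.findall(regex, str(text))
    return hashtag_list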
def extract_logseq_property_tags(text):
    """Get all tags in property tags:: not starting with #"""
    if len(text)==1 and text[0].lower().startswith("tags:: "):
        tags=text[0]
        #print(f'startswith: tags::')
        #print(f'text[0] = {tags}')
        #print(type(tags))
        # remove the property prefix, case-insensitively like the check above
        tags=re.sub(r"^tags:: ", '', tags, flags=re.IGNORECASE)
        tags=re.sub(r"\n", '', tags)
        tags_list=tags.split(",")
        stripped_list=[]
        pattern=r'[\[\]#]'
        for tag in tags_list:
            tag=tag.strip()
            tag=re.sub(pattern, '', tag)
            # skip empty entries, e.g. after a trailing comma
            if tag:
                stripped_list.append(tag)
        tags_list=stripped_list
        #print(f'stripped tags_list: {tags_list}')
        #print(type(tags_list))
    else:
        tags_list=[]
    return tags_list
def extract_logseq_pagelinks(text):
    """Get all pagelinks in text without the # and [[]]"""
    text="\n".join(text)
    # require both opening brackets so a single "[" cannot start a match
    page_links=re.findall(r'\[\[.*?\]\]', str(text))
    pattern=r'[\[\]#]'
    tags=[]
    for page_link in page_links:
        tag=re.sub(pattern, '', str(page_link))
        tags.append(tag)
    return tags
def extract_logseq_tags(text):
    """Get all hashtags and pagelinks in text"""
    # get all simple hashtags like #tag (a #[[page link]] tag is handled below)
    hashtag_list=extract_logseq_hashtags(text)
    #print(f'hashtag_list: {hashtag_list}')
    #print(type(hashtag_list))
    # get all tags in property tags:: not starting with #
    tags_list=extract_logseq_property_tags(text)
    #print(f'tags_list: {tags_list}')
    #print(type(tags_list))
    # get all PageLinks like [[]] and PageLinksTags like #[[]]
    pagelink_list=extract_logseq_pagelinks(text)
    #print(f'pagelink_list: {pagelink_list}')
    #print(type(pagelink_list))
    result=hashtag_list + tags_list + pagelink_list
    #print(f'extract_logseq_tags result: {result}')
    return result
# zotero_tags_stdin= ['Blender\n', 'Documentation\n', 'Git\n', 'Migration\n', 'Sphinx\n', 'Subversion\n', '3D Visualization\n']
# Zotero tags to Logseq Test
# print(Zotero_tags_to_Logseq(zotero_tags_stdin))
# Logseq tags Property line Test data
#test = "tags:: TinyMCE, HTML, PrettifyCode, CSS, JS, #Javascript, Word, [[Microsoft Word]], #[[Microsoft Office]]"
#input_selection=['tags:: TinyMCE, HTML, PrettifyCode, #CSS, JS, #Javascript, Word, [[Page Link]], #[[Microsoft Word]], #[[Microsoft Office]]\n']
# Logseq tags Property line to_Zotero Test
#result = Logseq_tags_to_Zotero(input_selection)
#print(f'\nresult:\n{result}')
#print(type(result))
# Uncomment to select functionality
# Zotero_tags_to_Logseq
# print(Zotero_tags_to_Logseq(sys.stdin))
# Logseq_tags_to_Zotero
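# named input_selection so the builtin input() is not shadowed
input_selection = list(sys.stdin)
#print(f'input_selection: {input_selection}')
#print(type(input_selection))
print(Logseq_tags_to_Zotero(input_selection))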
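The docstrings in the script above note that the conversion does not preserve order and does not work with multiline input. A minimal sketch of one way to lift both limits follows. It is not part of the gist: the names `TAG_PATTERN` and `logseq_to_zotero_ordered` are hypothetical, and it assumes that on a `tags:: ` property line every comma-separated entry is a tag, while on any other line only `#tag`, `[[page link]]` and `#[[page link]]` markup counts.

```python
import re

# Hypothetical pattern: a [[page link]] (optionally prefixed with #) or a simple #hashtag
TAG_PATTERN = re.compile(r"#?\[\[(.+?)\]\]|#(\w+)")


def logseq_to_zotero_ordered(text):
    """Return newline-delimited tags in first-seen order (illustrative helper)."""
    tags = []
    for line in text.splitlines():
        line = line.strip()
        if line.lower().startswith("tags:: "):
            # property line: treat every comma-separated entry as a tag
            entries = line[len("tags:: "):].split(",")
            tags.extend(re.sub(r"[\[\]#]", "", entry).strip() for entry in entries)
        else:
            # ordinary block: only marked-up tags are collected
            for page, hashtag in TAG_PATTERN.findall(line):
                tags.append(page or hashtag)
    # dict.fromkeys drops duplicates but keeps the first occurrence in order
    return "\n".join(dict.fromkeys(tag for tag in tags if tag))


sample = (
    "tags:: TinyMCE, #CSS, [[Page Link]], #[[Microsoft Word]], TinyMCE\n"
    "- some block with #Git and [[Page Link]]"
)
print(logseq_to_zotero_ordered(sample))
```

Using `dict.fromkeys` instead of `set` is the only change needed for stable ordering; the per-line loop is what allows several Logseq blocks to be selected at once.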
#!/usr/bin/env python3
"""Zotero_tags_to_Logseq.workflow.py
Functions to convert a selected text containing Zotero tags to a different tag listing format (Logseq).
Currently aiming at bidirectional converting between Zotero and Logseq.
This is Work-in-Progress and may contain traces of nuts!
@Copyright 2023 by Armin Stross-Radschinski, ACSR industrialdesign developer@acsr.de
Licence: MIT, do what you like
How to install:
- Open Automator and create a new Automator "Service" workflow (gearwheel icon)
- In the search field for actions type "shell" and add a new step "execute shell script"
    - In the dropdown for Shell: select "/usr/local/bin/python3"
    - Replace the Python3 code with this gist
- For the settings at the top select
    - Workflow receives: "Text" in "all programs"
    - Input 1: None
    - Select: "Output replaces selected Text"
    - Choose a color and icon of your choice
- Save the workflow as "Zotero_tags_to_Logseq"
    - This will save the workflow as "Zotero_tags_to_Logseq.workflow" in your user's "~/Library/Services" directory
- Add a suitable keyboard shortcut in the "System Preferences -> Keyboard -> Shortcuts -> Services -> Text" section
    - In newer OS versions there will be a suggestion to open this during the initial save
    - We use "Control-Command-Alt-Z"
- You can modify the Python code to your liking and share the finished *.workflow with the world
- To test the Python code, we suggest using a Jupyter Notebook.
    - The easiest way to install a Jupyter IDE from scratch is using Microsoft Visual Studio Code
    - Add the VSCode Python and Jupyter Notebook extensions
    - Create a new Zotero_tags_to_Logseq.ipynb file in your workspace and paste this code
    - To make the input/output handling work in the notebook, you need to uncomment some lines and comment out others to make the code aware of the new environment, e.g. providing proper sample data is important.
    - When you are ready, paste the modified code back and revert the input/output handling
Have fun
There is a companion script "Logseq_tags_to_Zotero.workflow.py" for the reverse transformation.
V 0.1.1 20230622_083116-acsr (adding install and license in the docstring)
20230519-20230622 by Armin Stross-Radschinski, acsr.github@dev.acsr.de for acsr/evenios
"""
import sys
import re
def Zotero_tags_to_Logseq(multiline_string):
    """Converts a newline delimited list of (Zotero) tags to a unique set
    of tags in logseq format. Tags are prefixed with a # and tags
    containing spaces are wrapped in double square brackets [[]].
    This does not preserve order!"""
    #print(type(multiline_string))
    #tags_list=multiline_string.splitlines()
    # expects an iterable of lines (e.g. sys.stdin or a list), not a single string
    tags_list=multiline_string
    #print(tags_list)
    #print(type(tags_list))
    logseq_tags_set=set()
    for item in tags_list:
        # strip the trailing newline of each line
        item=re.sub(r'\n', '', item)
        # skip blank lines, which would otherwise produce a bare "#"
        if not item.strip():
            continue
        if " " in item:
            item=f'[[{item}]]'
        item=f'#{item}'
        logseq_tags_set.add(item)
    #print(logseq_tags_set)
    logseq_tags=", ".join(logseq_tags_set)
    return logseq_tags
def Logseq_tags_to_Zotero(text):
    """Converts a line of mixed Logseq tags to a unique set
    of tags in Zotero newline delimited format. Tags markup prefixed with a #
    and pagelink type tags wrapped in double square brackets [[]] is removed as
    well as commas and a potential tags:: property prefix.
    This does not work with multiline input!"""
    tags=extract_logseq_tags(text)
    unique_tags_set=set(tags)
    zotero_tags="\n".join(unique_tags_set)
    return zotero_tags
def extract_logseq_hashtags(text):
    """Get all hashtags in text without the #"""
    # the regular expression (raw string, so \w is not read as a string escape)
    regex=r"#(\w+)"
    # extracting the hashtags; text must be a single string here
    hashtag_list=re.findall(regex, text)
    return hashtag_list
def extract_logseq_pagelinks(text):
    """Get all pagelinks in text without the # and [[]]"""
    # require both opening brackets so a single "[" cannot start a match
    page_links=re.findall(r'\[\[.*?\]\]', text)
    pattern=r'[\[\]#]'
    tags=[]
    for page_link in page_links:
        tag=re.sub(pattern, '', page_link)
        tags.append(tag)
    return tags
def extract_logseq_tags(text):
    """Get all hashtags and pagelinks in text"""
    # get all simple hashtags like #tag (a #[[page link]] tag is handled below)
    hashtag_list=extract_logseq_hashtags(text)
    # get all PageLinks like [[]] and PageLinksTags like #[[]]
    pagelink_list=extract_logseq_pagelinks(text)
    return hashtag_list + pagelink_list
# zotero_tags_stdin= ['Blender\n', 'Documentation\n', 'Git\n', 'Migration\n', 'Sphinx\n', 'Subversion\n', '3D Visualization\n']
# Zotero tags to Logseq Test
# print(Zotero_tags_to_Logseq(zotero_tags_stdin))
# Logseq tags Property line Test data
# test = "tags:: TinyMCE, HTML, PrettifyCode, CSS, JS, Javascript, Word, [[Microsoft Word]], [[Microsoft Office]]"
# Logseq tags Property line to_Zotero Test
# print(Logseq_tags_to_Zotero(test))
# Uncomment to select functionality
# Zotero_tags_to_Logseq
print(Zotero_tags_to_Logseq(sys.stdin))
# Logseq_tags_to_Zotero (the extract functions in this file expect one string, hence .read())
# print(Logseq_tags_to_Zotero(sys.stdin.read()))
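Because `Zotero_tags_to_Logseq` above collects tags in a `set`, the order of the Zotero list is lost. The following is a minimal sketch of an order-preserving variant, not code from the gist. The function name `zotero_to_logseq_ordered` is hypothetical, and it additionally assumes that surrounding whitespace and blank lines should be ignored.

```python
def zotero_to_logseq_ordered(lines):
    """Convert an iterable of Zotero tag lines to a Logseq tag list,
    keeping the first-seen order (illustrative helper)."""
    seen = {}  # dicts keep insertion order, so keys double as an ordered set
    for line in lines:
        tag = line.strip()
        if not tag:
            continue  # ignore blank lines
        # tags containing spaces need the #[[...]] page link form
        logseq_tag = f"#[[{tag}]]" if " " in tag else f"#{tag}"
        seen.setdefault(logseq_tag, None)
    return ", ".join(seen)


zotero_tags = ["Blender\n", "Git\n", "\n", "3D Visualization\n", "Git\n"]
print(zotero_to_logseq_ordered(zotero_tags))
```

The function accepts any iterable of lines, so it can be called with `sys.stdin` in the Automator workflow or with a plain list while testing.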
acsr commented Jun 22, 2023

Updated the description. More details are still needed on known issues and on how to set up the code to work in a VSCode Jupyter Notebook. The hint to use VSCode is meant to lower the entry barrier for newbies. If you are experienced, go your own way and ignore flaws in the code, or suggest improvements.

Next Steps:

  • Improve the instructions on how to set up the code to work in a VSCode Jupyter Notebook
  • Create a full repo
  • Include released workflows for download without fiddling in Automator
  • Write a blogpost and record a video/gif
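For the notebook setup mentioned in the first item, one possible approach is to keep the stdin handling in a single helper and feed it a fake stream while testing, so no lines need to be commented in or out. This is only a sketch under that assumption: `read_selection` is a hypothetical name that does not appear in the scripts, and `io.StringIO` from the standard library stands in for the text Automator would pipe in.

```python
import io
import sys


def read_selection(stream=None):
    """Return the selected text as a list of lines.

    Falls back to sys.stdin, which is what the Automator workflow provides;
    in a notebook, pass any file-like object instead (illustrative helper).
    """
    if stream is None:
        stream = sys.stdin
    return list(stream)


# In a notebook cell: simulate the selection instead of reading real stdin
sample_stream = io.StringIO("Blender\nDocumentation\n3D Visualization\n")
lines = read_selection(sample_stream)
print(lines)
```

In the workflow itself the call would simply be `read_selection()` with no argument, and the returned list can be passed to the conversion function unchanged.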
