@rosiel
Last active March 28, 2024 17:31
move-publisher-to-own-field.py
task: get_data_from_view # See Z-View to export
host: "http://islandora.dev"
view_path: '/nodes-for-publisher'
username: admin
password: password
content_type: islandora_object
export_csv_file_path: ./islandora_export.csv
# If export_csv_field_list is not present, all fields will be exported.
# node_id and title are always included.
export_csv_field_list: ['field_linked_agent', 'field_publisher']
export_csv_term_mode: name
input_dir: '.'
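Before running the script on the export, it can help to confirm that the CSV produced by this configuration has the columns the script reads. The following is a minimal sketch, not part of the original gist; the helper name `missing_columns` and the sample data are hypothetical.

```python
import csv
import io

# Columns the publisher script relies on. node_id is always exported by Workbench;
# the other two come from export_csv_field_list in the configuration above.
REQUIRED = ("node_id", "field_linked_agent", "field_publisher")


def missing_columns(csv_file):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    return [name for name in REQUIRED if name not in header]


# Hypothetical export that lacks the field_publisher column.
sample = io.StringIO("node_id,title,field_linked_agent\n1,Test,relators:pbl:person:X\n")
print(missing_columns(sample))  # ['field_publisher']
```

To check a real export, pass an open file instead, for example `open('islandora_export.csv', newline='')`.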
#!/usr/local/bin/python3
# vim: set expandtab:
# vim: tabstop=4:
# vim: ai:
# vim: shiftwidth=4:
import csv
import os
import re
# This python file demonstrates how you can update your Islandora data from using the Linked Agent field
# for publishers (with relators:pbl) to using a separate field_publisher which is a text field.
# The input file should be a csv generated by workbench with columns `field_linked_agent` and `field_publisher`.
# Make sure you selected `export_csv_term_mode: name` in your `get_data_from_view` task config so your names are
# actual names, not taxonomy term IDs.
filename = 'islandora_export.csv'
outputfilename = os.path.splitext(filename)[0] + '-modified' + os.path.splitext(filename)[1]
delfilename = os.path.splitext(filename)[0] + '-todelete' + os.path.splitext(filename)[1]
# Initialize input file.
# Initialize input file. newline='' is what the csv module recommends for files it reads or writes.
with open(filename, 'r', newline='') as f:
    reader = csv.DictReader(f, delimiter=',')
    # Initialize output file.
    with open(outputfilename, 'w', newline='') as out:
        writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
        writer.writeheader()
        # Initialize a second output file for nodes where the only contributors are publishers, since Workbench won't delete them.
        with open(delfilename, 'w', newline='') as delfile:
            delwriter = csv.DictWriter(delfile, fieldnames=['node_id', 'field_linked_agent'], extrasaction='ignore')
            delwriter.writeheader()
            # Loop over each row in the input.
            for row in reader:
                # Check if there's a publisher in this row.
                if 'relators:pbl' in row['field_linked_agent']:
                    # Split multi-valued field into individual values like relators:cre:person:Smith, Jane
                    relators = row['field_linked_agent'].split('|')
                    # Initialize a list of publishers in case there is more than one.
                    publishers = []
                    # Loop through each of the multiple values in this row.
                    for i in range(len(relators)):
                        #print(relators[i]) # Debug
                        # Check if this value is a publisher.
                        if 'relators:pbl:' in relators[i]:
                            # Remove the start of the string up to the 3rd colon (using regex).
                            # count=1 stops after the first match, so names that contain colons stay intact.
                            publisher = re.sub(r'[^:]*:[^:]*:[^:]*:', '', relators[i], count=1)
                            # Add the remaining value to our plain text publishers list.
                            publishers.append(publisher)
                            # Null out this value, to be removed when we're done looping. Otherwise it messes up the looping.
                            relators[i] = ''
                    # Filter out the empty elements.
                    if '' in relators:
                        relators = [j for j in relators if j != '']
                    # Join multiple values together with pipe (|) character and set them back into the row array.
                    row['field_linked_agent'] = '|'.join(relators)
                    row['field_publisher'] = '|'.join(publishers)
                    # Write the row. Rows without a publisher to move are skipped, so they are never written or updated.
                    writer.writerow(row)
                    # Write to the delete file if there are no relators left.
                    if len(row['field_linked_agent']) == 0:
                        delwriter.writerow(row)
# When done, you can run a Workbench "update" task for the -modified.csv file with `update_mode: replace`,
# and a second "update" task for the -todelete.csv file (make sure you use this file!!) with `update_mode: delete`.
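The per-row transformation in the script can also be tried on its own, without any files. This is a minimal sketch and a variation rather than the script's exact code: the function name `move_publishers` is hypothetical, it uses `str.split` with a maximum of three splits instead of the regex, and it matches on `startswith` rather than a substring test.

```python
PUBLISHER_PREFIX = "relators:pbl:"


def move_publishers(linked_agent):
    """Split a pipe-delimited field_linked_agent value into
    (remaining linked agents, publisher names), both pipe-delimited."""
    remaining = []
    publishers = []
    for value in linked_agent.split("|"):
        if value.startswith(PUBLISHER_PREFIX):
            # Split on the first three colons only, so a name containing colons survives.
            parts = value.split(":", 3)
            # Fall back to the whole value if it has fewer segments than expected.
            publishers.append(parts[3] if len(parts) == 4 else value)
        elif value:
            remaining.append(value)
    return "|".join(remaining), "|".join(publishers)


# Hypothetical sample value with one creator and one publisher.
sample = "relators:cre:person:Smith, Jane|relators:pbl:corporate_body:Acme Press"
print(move_publishers(sample))
# ('relators:cre:person:Smith, Jane', 'Acme Press')
```

When the first element of the result is empty, the node has no linked agents left, which is the case the script writes to the -todelete file.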
task: update
host: "http://islandora.dev"
username: admin
password: password
input_csv: islandora_export-modified.csv
content_type: islandora_object
update_mode: replace
ignore_csv_columns: ['title']
input_dir: .
# Second update task (a separate configuration file), for the -todelete.csv file.
task: update
host: "http://islandora.dev"
username: admin
password: password
input_csv: islandora_export-todelete.csv
content_type: islandora_object
update_mode: delete
ignore_csv_columns: ['title']
input_dir: .
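Before running the two update tasks, it may be worth checking that the two CSV files agree with each other. The sketch below is an optional sanity check, not part of the original gist, and the function name `nodes_needing_delete` is hypothetical. It lists the node IDs in the -modified file whose `field_linked_agent` ended up empty. These should be the same node IDs that appear in the -todelete file.

```python
import csv
import io


def nodes_needing_delete(modified_csv):
    """Return node IDs whose field_linked_agent is empty in the -modified file."""
    return [
        row["node_id"]
        for row in csv.DictReader(modified_csv)
        if not row["field_linked_agent"]
    ]


# Hypothetical contents of islandora_export-modified.csv.
modified = io.StringIO(
    "node_id,title,field_linked_agent,field_publisher\n"
    "1,A,relators:cre:person:Smith,Acme Press\n"
    "2,B,,Other Press\n"
)
print(nodes_needing_delete(modified))  # ['2']
```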
uuid: 40654746-b231-498e-9e32-3136c9c7eab6
langcode: en
status: true
dependencies:
config:
- field.storage.node.field_linked_agent
- field.storage.node.field_publisher
- node.type.islandora_object
module:
- controlled_access_terms
- node
- rest
- serialization
- user
id: nodes_for_publisher
label: 'Nodes for publisher'
module: views
description: 'for moving publishers to publisher field'
tag: ''
base_table: node_field_data
base_field: nid
display:
default:
id: default
display_title: Default
display_plugin: default
position: 0
display_options:
title: 'Nodes for publisher'
fields:
nid:
id: nid
table: node_field_data
field: nid
relationship: none
group_type: group
admin_label: ''
entity_type: node
entity_field: nid
plugin_id: field
label: 'node ID'
exclude: false
alter:
alter_text: false
text: ''
make_link: false
path: ''
absolute: false
external: false
replace_spaces: false
path_case: none
trim_whitespace: false
alt: ''
rel: ''
link_class: ''
prefix: ''
suffix: ''
target: ''
nl2br: false
max_length: 0
word_boundary: true
ellipsis: true
more_link: false
more_link_text: ''
more_link_path: ''
strip_tags: false
trim: false
preserve_tags: ''
html: false
element_type: ''
element_class: ''
element_label_type: ''
element_label_class: ''
element_label_colon: false
element_wrapper_type: ''
element_wrapper_class: ''
element_default_classes: true
empty: ''
hide_empty: false
empty_zero: false
hide_alter_empty: true
click_sort_column: value
type: number_integer
settings:
thousand_separator: ''
prefix_suffix: true
group_column: value
group_columns: { }
group_rows: true
delta_limit: 0
delta_offset: 0
delta_reversed: false
delta_first_last: false
multi_type: separator
separator: ', '
field_api_classes: false
title:
id: title
table: node_field_data
field: title
relationship: none
group_type: group
admin_label: ''
entity_type: node
entity_field: title
plugin_id: field
label: Title
exclude: false
alter:
alter_text: false
text: ''
make_link: false
path: ''
absolute: false
external: false
replace_spaces: false
path_case: none
trim_whitespace: false
alt: ''
rel: ''
link_class: ''
prefix: ''
suffix: ''
target: ''
nl2br: false
max_length: 0
word_boundary: false
ellipsis: false
more_link: false
more_link_text: ''
more_link_path: ''
strip_tags: false
trim: false
preserve_tags: ''
html: false
element_type: ''
element_class: ''
element_label_type: ''
element_label_class: ''
element_label_colon: true
element_wrapper_type: ''
element_wrapper_class: ''
element_default_classes: true
empty: ''
hide_empty: false
empty_zero: false
hide_alter_empty: true
click_sort_column: value
type: string
settings:
link_to_entity: true
group_column: value
group_columns: { }
group_rows: true
delta_limit: 0
delta_offset: 0
delta_reversed: false
delta_first_last: false
multi_type: separator
separator: ', '
field_api_classes: false
field_linked_agent:
id: field_linked_agent
table: node__field_linked_agent
field: field_linked_agent
relationship: none
group_type: group
admin_label: ''
plugin_id: field
label: 'Contributors (field_linked_agent)'
exclude: false
alter:
alter_text: false
text: ''
make_link: false
path: ''
absolute: false
external: false
replace_spaces: false
path_case: none
trim_whitespace: false
alt: ''
rel: ''
link_class: ''
prefix: ''
suffix: ''
target: ''
nl2br: false
max_length: 0
word_boundary: true
ellipsis: true
more_link: false
more_link_text: ''
more_link_path: ''
strip_tags: false
trim: false
preserve_tags: ''
html: false
element_type: ''
element_class: ''
element_label_type: ''
element_label_class: ''
element_label_colon: false
element_wrapper_type: ''
element_wrapper_class: ''
element_default_classes: true
empty: ''
hide_empty: false
empty_zero: false
hide_alter_empty: true
click_sort_column: target_id
type: typed_relation_default
settings:
link: true
group_column: ''
group_columns: { }
group_rows: true
delta_limit: 0
delta_offset: 0
delta_reversed: false
delta_first_last: false
multi_type: separator
separator: ', '
field_api_classes: false
field_publisher:
id: field_publisher
table: node__field_publisher
field: field_publisher
relationship: none
group_type: group
admin_label: ''
plugin_id: field
label: Publisher
exclude: false
alter:
alter_text: false
text: ''
make_link: false
path: ''
absolute: false
external: false
replace_spaces: false
path_case: none
trim_whitespace: false
alt: ''
rel: ''
link_class: ''
prefix: ''
suffix: ''
target: ''
nl2br: false
max_length: 0
word_boundary: true
ellipsis: true
more_link: false
more_link_text: ''
more_link_path: ''
strip_tags: false
trim: false
preserve_tags: ''
html: false
element_type: ''
element_class: ''
element_label_type: ''
element_label_class: ''
element_label_colon: true
element_wrapper_type: ''
element_wrapper_class: ''
element_default_classes: true
empty: ''
hide_empty: false
empty_zero: false
hide_alter_empty: true
click_sort_column: value
type: string
settings:
link_to_entity: false
group_column: value
group_columns: { }
group_rows: true
delta_limit: 0
delta_offset: 0
delta_reversed: false
delta_first_last: false
multi_type: separator
separator: ', '
field_api_classes: false
pager:
type: full
options:
offset: 0
items_per_page: 10
total_pages: null
id: 0
tags:
next: ››
previous: ‹‹
first: '« First'
last: 'Last »'
expose:
items_per_page: false
items_per_page_label: 'Items per page'
items_per_page_options: '5, 10, 25, 50'
items_per_page_options_all: false
items_per_page_options_all_label: '- All -'
offset: false
offset_label: Offset
quantity: 9
exposed_form:
type: basic
options:
submit_button: Apply
reset_button: false
reset_button_label: Reset
exposed_sorts_label: 'Sort by'
expose_sort_order: true
sort_asc_label: Asc
sort_desc_label: Desc
access:
type: perm
options:
perm: 'access content'
cache:
type: tag
options: { }
empty: { }
sorts:
created:
id: created
table: node_field_data
field: created
relationship: none
group_type: group
admin_label: ''
entity_type: node
entity_field: created
plugin_id: date
order: DESC
expose:
label: ''
field_identifier: ''
exposed: false
granularity: second
arguments: { }
filters:
status:
id: status
table: node_field_data
field: status
entity_type: node
entity_field: status
plugin_id: boolean
value: '1'
group: 1
expose:
operator: ''
type:
id: type
table: node_field_data
field: type
entity_type: node
entity_field: type
plugin_id: bundle
value:
islandora_object: islandora_object
field_publisher_value:
id: field_publisher_value
table: node__field_publisher
field: field_publisher_value
relationship: none
group_type: group
admin_label: ''
plugin_id: string
operator: 'not empty'
value: ''
group: 1
exposed: false
expose:
operator_id: ''
label: ''
description: ''
use_operator: false
operator: ''
operator_limit_selection: false
operator_list: { }
identifier: ''
required: false
remember: false
multiple: false
remember_roles:
authenticated: authenticated
placeholder: ''
is_grouped: false
group_info:
label: ''
description: ''
identifier: ''
optional: true
widget: select
multiple: false
remember: false
default_group: All
default_group_multiple: { }
group_items: { }
style:
type: table
options:
grouping: { }
row_class: ''
default_row_class: true
columns:
nid: nid
title: title
field_linked_agent: field_linked_agent
default: '-1'
info:
nid:
sortable: false
default_sort_order: asc
align: ''
separator: ''
empty_column: false
responsive: ''
title:
sortable: false
default_sort_order: asc
align: ''
separator: ''
empty_column: false
responsive: ''
field_linked_agent:
align: ''
separator: ''
empty_column: false
responsive: ''
override: true
sticky: false
summary: ''
empty_table: false
caption: ''
description: ''
row:
type: fields
options:
default_field_elements: true
inline: { }
separator: ''
hide_empty: false
query:
type: views_query
options:
query_comment: ''
disable_sql_rewrite: false
distinct: true
replica: false
query_tags: { }
relationships: { }
header:
result:
id: result
table: views
field: result
relationship: none
group_type: group
admin_label: ''
plugin_id: result
empty: false
content: 'Displaying @start - @end of @total'
footer: { }
display_extenders: { }
cache_metadata:
max-age: -1
contexts:
- 'languages:language_content'
- 'languages:language_interface'
- url.query_args
- 'user.node_grants:view'
- user.permissions
tags:
- 'config:field.storage.node.field_linked_agent'
- 'config:field.storage.node.field_publisher'
page_1:
id: page_1
display_title: Page
display_plugin: page
position: 1
display_options:
display_extenders:
matomo:
enabled: false
keyword_gets: ''
keyword_behavior: first
keyword_concat_separator: ' '
category_behavior: none
category_gets: ''
category_concat_separator: ' '
category_fallback: ''
category_facets: { }
category_facets_concat_separator: ', '
path: nodes-for-publisher-page
cache_metadata:
max-age: -1
contexts:
- 'languages:language_content'
- 'languages:language_interface'
- url.query_args
- 'user.node_grants:view'
- user.permissions
tags:
- 'config:field.storage.node.field_linked_agent'
- 'config:field.storage.node.field_publisher'
rest_export_1:
id: rest_export_1
display_title: 'REST export'
display_plugin: rest_export
position: 2
display_options:
pager:
type: full
options:
offset: 0
items_per_page: 10
total_pages: null
id: 0
tags:
next: 'Next ›'
previous: '‹ Previous'
first: '« First'
last: 'Last »'
expose:
items_per_page: false
items_per_page_label: 'Items per page'
items_per_page_options: '5, 10, 25, 50'
items_per_page_options_all: false
items_per_page_options_all_label: '- All -'
offset: false
offset_label: Offset
quantity: 9
style:
type: serializer
options:
uses_fields: false
formats:
json: json
display_extenders:
matomo:
enabled: false
keyword_gets: ''
keyword_behavior: first
keyword_concat_separator: ' '
category_behavior: none
category_gets: ''
category_concat_separator: ' '
category_fallback: ''
category_facets: { }
category_facets_concat_separator: ', '
path: nodes-for-publisher
auth:
- basic_auth
- cookie
cache_metadata:
max-age: -1
contexts:
- 'languages:language_content'
- 'languages:language_interface'
- request_format
- url.query_args
- 'user.node_grants:view'
- user.permissions
tags:
- 'config:field.storage.node.field_linked_agent'
- 'config:field.storage.node.field_publisher'
## THIS REQUIRES VIEWS BULK OPERATIONS (remove this line)
uuid: fff435bc-ed98-4841-8c94-0fe58e98e005
langcode: en
status: true
dependencies:
config:
- taxonomy.vocabulary.corporate_body
- taxonomy.vocabulary.family
- taxonomy.vocabulary.person
- taxonomy.vocabulary.subject
module:
- node
- taxonomy
- user
- views_bulk_operations
id: unused_taxonomy_terms
label: 'Unused Taxonomy Terms'
module: views
description: ''
tag: ''
base_table: taxonomy_term_field_data
base_field: tid
display:
default:
id: default
display_title: Default
display_plugin: default
position: 0
display_options:
title: 'Unused Taxonomy Terms'
fields:
views_bulk_operations_bulk_form:
id: views_bulk_operations_bulk_form
table: views
field: views_bulk_operations_bulk_form
relationship: none
group_type: group
admin_label: ''
plugin_id: views_bulk_operations_bulk_form
label: 'Views bulk operations'
exclude: false
alter:
alter_text: false
text: ''
make_link: false
path: ''
absolute: false
external: false
replace_spaces: false
path_case: none
trim_whitespace: false
alt: ''
rel: ''
link_class: ''
prefix: ''
suffix: ''
target: ''
nl2br: false
max_length: 0
word_boundary: true
ellipsis: true
more_link: false
more_link_text: ''
more_link_path: ''
strip_tags: false
trim: false
preserve_tags: ''
html: false
element_type: ''
element_class: ''
element_label_type: ''
element_label_class: ''
element_label_colon: true
element_wrapper_type: ''
element_wrapper_class: ''
element_default_classes: true
empty: ''
hide_empty: false
empty_zero: false
hide_alter_empty: true
batch: true
batch_size: 10
form_step: true
ajax_loader: false
buttons: false
action_title: Action
clear_on_exposed: true
force_selection_info: false
selected_actions:
1:
action_id: views_bulk_operations_delete_entity
name:
id: name
table: taxonomy_term_field_data
field: name
relationship: none
group_type: group
admin_label: ''
entity_type: taxonomy_term
entity_field: name
plugin_id: term_name
label: Name
exclude: false
alter:
alter_text: false
make_link: false
absolute: false
word_boundary: false
ellipsis: false
strip_tags: false
trim: false
html: false
element_type: ''
element_class: ''
element_label_type: ''
element_label_class: ''
element_label_colon: true
element_wrapper_type: ''
element_wrapper_class: ''
element_default_classes: true
empty: ''
hide_empty: false
empty_zero: false
hide_alter_empty: true
click_sort_column: value
type: string
settings:
link_to_entity: true
group_column: value
group_columns: { }
group_rows: true
delta_limit: 0
delta_offset: 0
delta_reversed: false
delta_first_last: false
multi_type: separator
separator: ', '
field_api_classes: false
convert_spaces: false
pager:
type: mini
options:
offset: 0
items_per_page: 30
total_pages: null
id: 0
tags:
next: ››
previous: ‹‹
expose:
items_per_page: false
items_per_page_label: 'Items per page'
items_per_page_options: '5, 10, 25, 50'
items_per_page_options_all: false
items_per_page_options_all_label: '- All -'
offset: false
offset_label: Offset
exposed_form:
type: basic
options:
submit_button: Apply
reset_button: false
reset_button_label: Reset
exposed_sorts_label: 'Sort by'
expose_sort_order: true
sort_asc_label: Asc
sort_desc_label: Desc
access:
type: perm
options:
perm: 'access content'
cache:
type: tag
options: { }
empty: { }
sorts: { }
arguments: { }
filters:
nid:
id: nid
table: node_field_data
field: nid
relationship: nid
group_type: group
admin_label: ''
entity_type: node
entity_field: nid
plugin_id: numeric
operator: empty
value:
min: ''
max: ''
value: ''
group: 1
exposed: false
expose:
operator_id: ''
label: ''
description: ''
use_operator: false
operator: ''
operator_limit_selection: false
operator_list: { }
identifier: ''
required: false
remember: false
multiple: false
remember_roles:
authenticated: authenticated
min_placeholder: ''
max_placeholder: ''
placeholder: ''
is_grouped: false
group_info:
label: ''
description: ''
identifier: ''
optional: true
widget: select
multiple: false
remember: false
default_group: All
default_group_multiple: { }
group_items: { }
vid:
id: vid
table: taxonomy_term_field_data
field: vid
relationship: none
group_type: group
admin_label: ''
entity_type: taxonomy_term
entity_field: vid
plugin_id: bundle
operator: in
value:
corporate_body: corporate_body
family: family
person: person
subject: subject
group: 1
exposed: false
expose:
operator_id: ''
label: ''
description: ''
use_operator: false
operator: ''
operator_limit_selection: false
operator_list: { }
identifier: ''
required: false
remember: false
multiple: false
remember_roles:
authenticated: authenticated
reduce: false
is_grouped: false
group_info:
label: ''
description: ''
identifier: ''
optional: true
widget: select
multiple: false
remember: false
default_group: All
default_group_multiple: { }
group_items: { }
style:
type: table
row:
type: fields
query:
type: views_query
options:
query_comment: ''
disable_sql_rewrite: false
distinct: false
replica: false
query_tags: { }
relationships:
nid:
id: nid
table: taxonomy_index
field: nid
relationship: none
group_type: group
admin_label: node
plugin_id: standard
required: false
header:
result:
id: result
table: views
field: result
relationship: none
group_type: group
admin_label: ''
plugin_id: result
empty: false
content: 'Displaying @start - @end of @total'
footer: { }
display_extenders: { }
cache_metadata:
max-age: 0
contexts:
- 'languages:language_content'
- 'languages:language_interface'
- url.query_args
- user.permissions
tags: { }
page_1:
id: page_1
display_title: Page
display_plugin: page
position: 1
display_options:
display_extenders: { }
path: unused-taxonomy-terms
cache_metadata:
max-age: 0
contexts:
- 'languages:language_content'
- 'languages:language_interface'
- url.query_args
- user.permissions
tags: { }
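As a rough illustration of what the "Unused Taxonomy Terms" view above selects, the sketch below runs a comparable query against an in-memory SQLite database. The table and column names follow the view config (`taxonomy_term_field_data`, `taxonomy_index`, `tid`, `vid`, `nid`), but the sample rows are made up, and the SQL that Drupal actually generates is an assumption here and will differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxonomy_term_field_data (tid INTEGER, vid TEXT, name TEXT);
CREATE TABLE taxonomy_index (nid INTEGER, tid INTEGER);
INSERT INTO taxonomy_term_field_data VALUES
  (1, 'person', 'Smith, Jane'),
  (2, 'corporate_body', 'Acme Press'),
  (3, 'genre', 'Unused genre');
-- Only term 1 is referenced by a node.
INSERT INTO taxonomy_index VALUES (10, 1);
""")

# Optional relationship to taxonomy_index (a LEFT JOIN), then keep terms with
# no node and restrict to the four vocabularies named in the view's vid filter.
unused = conn.execute("""
SELECT t.tid, t.name
FROM taxonomy_term_field_data AS t
LEFT JOIN taxonomy_index AS ti ON ti.tid = t.tid
WHERE ti.nid IS NULL
  AND t.vid IN ('corporate_body', 'family', 'person', 'subject')
ORDER BY t.tid
""").fetchall()
print(unused)  # [(2, 'Acme Press')]
```

Term 3 is unused as well, but it sits outside the four filtered vocabularies, so the view would not offer it for bulk deletion.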