@kaplun
kaplun / y
Last active August 29, 2015 14:08 — forked from dset0x/y
---
name: check_mandatory_fields
check: checker/mandatory.py
arguments:
  fields: ["001%%_", "005%%_"]
---
name: check_utf8
check: checker/utf8.py
options:
  consider_deleted_records: true
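A minimal sketch of what a rule such as `check_mandatory_fields` might verify, not the actual `checker/mandatory.py` plugin. It assumes, hypothetically, that a record is a plain dict mapping MARC tags to lists of field instances, and that only the three-character tag at the start of each pattern (`001` in `001%%_`) matters; the indicator and subfield positions are ignored here. The name `missing_mandatory` is invented for illustration.

```python
def missing_mandatory(record, patterns):
    """Return the patterns whose 3-character tag is absent from the record.

    `record` is a hypothetical dict of MARC tag -> list of field instances.
    Only the tag part of each pattern is inspected; indicators and
    subfield codes are ignored in this simplified version.
    """
    return [p for p in patterns if not record.get(p[:3])]


# Example record with a control number (001) but no timestamp field (005).
record = {"001": ["12345"], "245": [{"a": "A title"}]}
missing = missing_mandatory(record, ["001%%_", "005%%_"])
print(missing)
```

A real checker would presumably also honour the wildcard positions in the pattern, which this sketch deliberately leaves out.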
@kaplun
kaplun / gist:dfba55a5ba376fecb47e
Last active September 21, 2015 14:41
Commits from INSPIRE not yet in legacy
BibCatalog: exception logging fix
BibCatalog: fix username check
BibCatalog: no username and password fix
BibCatalog: proper ticket stealing
BibCatalog: REST-based RT reimplementation
BibCatalog RT4 attachment handling
BibCheck: allow filtering by subfield contents
BibCheck: collection-based filters and timestamps
BibCheck: missing subfield code
BibCheck: new creation_date check
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Utility script to build the Invenio Status Dashboard."""
from __future__ import print_function
import sys
@kaplun
kaplun / marc_repeatable.py
Created May 3, 2016 14:03
Analysis tool to get statistics on repeatable fields and subfields of an Invenio instance
#!/usr/bin/env python
import sys
from invenio.search_engine import get_collection_reclist, get_record
from invenio.intbitset import intbitset
from click import progressbar
collection = sys.argv[1]
recids = list(get_collection_reclist(collection))
recids.reverse()
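The preview above stops before the counting logic, so the following is only a sketch of how repeatable fields and subfields could be tallied, not the gist's actual code. It assumes a simplified record layout (a dict mapping each tag to a list of field instances, each instance being a list of `(code, value)` pairs) in place of the richer structure Invenio's `get_record` returns. The name `repeat_stats` is hypothetical.

```python
from collections import Counter


def repeat_stats(records):
    """Count repeated tags (per record) and repeated subfield codes (per field)."""
    repeated_tags = Counter()
    repeated_subfields = Counter()
    for record in records:
        for tag, instances in record.items():
            # A tag is "repeated" when a record holds more than one instance of it.
            if len(instances) > 1:
                repeated_tags[tag] += 1
            for subfields in instances:
                codes = Counter(code for code, _ in subfields)
                for code, count in codes.items():
                    # A subfield is "repeated" when its code occurs twice in one field.
                    if count > 1:
                        repeated_subfields[tag + code] += 1
    return repeated_tags, repeated_subfields


# Two toy records in the simplified layout described above.
records = [
    {"100": [[("a", "Doe, J.")]],
     "700": [[("a", "Roe, R.")], [("a", "Poe, E.")]]},
    {"245": [[("a", "Title")]],
     "650": [[("a", "x"), ("a", "y")]]},
]
tags, subfields = repeat_stats(records)
print(tags, subfields)
```

Tags are counted once per record and subfield codes once per field instance, so the totals indicate how widespread repetition is, not how many extra copies exist.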
@kaplun
kaplun / marc_status.py
Created May 3, 2016 14:04
Analysis tool to find the current MARC usage of an Invenio installation (to understand common values, formats and outliers)
#!/usr/bin/env python
import sys
from invenio.dbquery import run_sql
from invenio.search_engine import get_tag_name
from invenio.search_engine import get_collection_reclist
from invenio.intbitset import intbitset
collection = sys.argv[1]
@kaplun
kaplun / gist:8f768b5104feabc1a8621f3cfb94167a
Created May 12, 2016 14:04
%prun of marc_create_record
12868110 function calls (12126723 primitive calls) in 922.833 seconds
Ordered by: internal time
     ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
       7523  919.223    0.122  921.309    0.122  bd1xx.py:32(authors)
    9697202    1.284    0.000    1.284    0.000  {method 'append' of 'list' objects}
742212/1230    0.525    0.000    0.790    0.001  utils.py:97(strip_empty_values)
    1411183    0.266    0.000    0.266    0.000  {isinstance}
       8689    0.224    0.000    0.232    0.000  utils.py:115(remove_duplicates_from_list)
1765648 function calls (1600321 primitive calls) in 2.545 seconds
Ordered by: internal time
     ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
         32    1.210    0.038    1.450    0.045  bd1xx.py:79(authors)
     117066    0.204    0.000    0.204    0.000  collections.py:71(__setitem__)
        100    0.182    0.002    0.739    0.007  utils.py:241(create_record)
      11368    0.159    0.000    0.481    0.000  utils.py:29(__new__)
 164760/118    0.127    0.000    0.204    0.002  utils.py:97(strip_empty_values)
14401047 function calls (13647928 primitive calls) in 879.393 seconds
Ordered by: internal time
     ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
       7523  875.234    0.116  876.720    0.117  bd1xx.py:79(authors)
    9993772    0.802    0.000    0.802    0.000  {method 'append' of 'list' objects}
742396/1230    0.517    0.000    0.785    0.001  utils.py:97(strip_empty_values)
    1677726    0.320    0.000    0.320    0.000  {isinstance}
22636/11368    0.315    0.000    0.995    0.000  utils.py:141(__new__)
There are only 'skip'ped commits left to test.
The first bad commit could be any of:
1a37c54216e3ae3256a6321cb49cee13bc1d9993
47cb256ce2c5acaa272d49860a023e980db8763b
648a7d088c42cf59a4610b914aea21f45a9fc6db
b4b12e82e6cc9a3be4698d0abfff5c4288efe8c5
c9bafb85820cb5dce4474bab7b1043d17aca2c45
feb3732d275d556c5f77950e1947686a5a010f71
517928cf1e8c99800f375a98594b237aabc77d54
ef709cd37c143e736c1cde0a327432da34b220ab
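The profile listings in this gist come from IPython's `%prun`. A comparable report can be produced outside IPython with the standard-library `cProfile` and `pstats` modules, as in this sketch. The profiled function `slow_sum` is a made-up stand-in, not `marc_create_record`.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Stand-in workload; replace with the call you want to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Collect profiling data around the call of interest.
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100000)
profiler.disable()

# Sort by internal time ("tottime") and keep the top five entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by `tottime` gives the "Ordered by: internal time" view used above, which highlights functions that are slow in their own body, not through their callees.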
@kaplun
kaplun / 0_reuse_code.js
Created July 7, 2016 07:29
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console