Rushiraj Nenuji rushirajnenuji

  • University of California Santa Barbara
  • Santa Barbara, California
@rushirajnenuji
rushirajnenuji / metricsReport.json
Last active March 19, 2018 18:12
Sample Dataset Master Report from DataONE to MDC Hub
{
"report_header": {
"report_name": "Dataset Master Report",
"report_id": "DSR1",
"release": "RD1",
"report-filters": [
{
"Name": "Begin-Date",
"Value": "2018-02-01"
},
rushirajnenuji / citationsMetadata.json
Last active June 19, 2018 16:40
Sample metadata report from DOI resolve end-point
{
"status": "ok",
"message-type": "work",
"message-version": "1.0.0",
"message": {
"indexed": {
"date-parts": [
[
2018,
5,
rushirajnenuji / citationsReport.json
Last active June 19, 2018 16:40
Sample Citation record obtained from Crossref
{
"LinkPublicationDate": "2017-05-31T03:01:02Z",
"LinkProvider": [
{
"Name": "crossref"
}
],
"RelationshipType": {
"Name": "References"
},
rushirajnenuji / metrics-service.rst
Created August 20, 2018 19:34
DataONE Metrics Service implementation

Setup and Operation of Elasticsearch Event Index

Log events travel a long path to get into the Elasticsearch index:

  1. An event occurs on an MN or CN
  2. The log aggregation process running on a CN collects the event
  3. Log aggregation processes the event, augmenting it and pushing it into Solr
  4. A Python script copies events from the Solr index to log files on disk
  5. Filebeat watches the log files and sends entries to Logstash
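
Step 4 in the list above can be sketched as follows. This is a minimal illustration, not the script used by the service: it assumes events have already been fetched from Solr as dictionaries, and it writes one JSON object per line so that a file watcher such as Filebeat can pick up new entries as they are appended. The function name `append_events` and the event field names used in the usage note are hypothetical.

```python
import json
from pathlib import Path
from typing import Iterable, Mapping


def append_events(events: Iterable[Mapping], log_path: Path) -> int:
    """Append each event to log_path as one JSON line.

    Returns the number of events written. Fetching the events from
    Solr is out of scope here; `events` is any iterable of dicts.
    """
    written = 0
    # Open in append mode so earlier entries are preserved and a
    # file watcher only sees the newly added lines.
    with log_path.open("a", encoding="utf-8") as handle:
        for event in events:
            handle.write(json.dumps(event, sort_keys=True) + "\n")
            written += 1
    return written
```

A call such as `append_events([{"id": "1", "event": "read"}], Path("events.log"))` would append a single line to `events.log`. Writing one self-contained JSON object per line keeps each log entry independently parseable by the downstream stages.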
rushirajnenuji / metrics-service.rst
Last active August 20, 2018 19:58
DataONE Metrics Service implementation
rushirajnenuji / August18.json
Created October 19, 2018 17:37
ESS_DIVE August 2018 logs
{
"took": 58,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"skipped": 0,
"failed": 0
},
"hits": {
rushirajnenuji / MetricsAnalyzer.py
Created October 29, 2018 18:28
Analyzing top N datasets for a given time frame from ES
"""
Metrics Analyzer module
Implemented as a falcon web application, https://falcon.readthedocs.io/en/stable/
"""
import json
import falcon
from urllib.parse import urlparse
rushirajnenuji / sample_report.json
Created November 2, 2018 16:51
Gulf of Alaska MN report from Nov 01, 2013 to Dec 01, 2013
{
"report-header": {
"report-name": "Dataset Master Report",
"report-id": "dsr",
"release": "rd1",
"reporting-period": {
"begin-date": "2013-11-01",
"end-date": "2013-12-01"
},
"created": "2018-11-02",
rushirajnenuji / Kremen-dataset.json
Created November 2, 2018 21:13
DataONE Kremen dataset with 5M views
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"skipped": 0,
"failed": 0
},
"hits": {
rushirajnenuji / pid_resolution.txt
Created December 20, 2018 20:57
Dataset testing the logic
List of datasets with increasing counts as the version progresses
Start
1.
- (V4) https://handy-owl.nceas.ucsb.edu/metacatui/view/doi:10.5063/F1MW2FC2
- (V3) https://handy-owl.nceas.ucsb.edu/metacatui/view/urn:uuid:e27ba7ef-2758-4a07-9bc5-419dac6f61d7
- (V2) https://handy-owl.nceas.ucsb.edu/metacatui/view/doi:10.5063/F1C53J27
- (V1) https://handy-owl.nceas.ucsb.edu/metacatui/view/urn:uuid:c93cac3c-789e-4c44-9052-5c95e416d5e1