Alex Punnen alexcpn

@alexcpn
alexcpn / line_segment_intersction.md
Created March 23, 2023 14:48
Explanation of line segment intersection for two points
from transformers import T5Tokenizer, T5ForConditionalGeneration
import numpy as np
import torch

class FlaxDataCollatorForT5MLM:
    """
    From https://github.com/huggingface/transformers/blob/main/examples/flax/language-modeling/run_t5_mlm_flax.py
    """
    def __init__(self, tokenizer, noise_density, mean_noise_span_length) -> None:
        self.tokenizer = tokenizer
from transformers import T5Tokenizer
import numpy as np

class FlaxDataCollatorForT5MLM:
    """
    From https://github.com/huggingface/transformers/blob/main/examples/flax/language-modeling/run_t5_mlm_flax.py
    """
    def __init__(self, tokenizer, noise_density, mean_noise_span_length) -> None:
        self.tokenizer = tokenizer
        self.noise_density = noise_density
loki:
  auth_enabled: false
  commonConfig:
    path_prefix: /var/loki
    replication_factor: 1
  compactor:
    apply_retention_interval: 1h
    compaction_interval: 5m
    retention_delete_worker_count: 500
    retention_enabled: true
@alexcpn
alexcpn / bert.py
Created January 27, 2023 12:58
Sentence classification with transformer model Bert
'''
Adapted and extended from
https://github.com/huggingface/transformers/issues/1950#issuecomment-558679189
'''
import pandas as pd
from transformers import BertTokenizer, BertModel
from sklearn.metrics.pairwise import cosine_similarity
import torch
@alexcpn
alexcpn / wrongly_classified.py
Created October 20, 2022 13:32
List out wrongly and rightly classified classes
#---------------------------------------------------------------------------------------------
# Populate the Confusion Matrix
#---------------------------------------------------------------------------------------------
for key, val in wrong_per_class.items():  # key is the category, val is a list of wrongly predicted classes
    summed_wrong_classes = Counter(val).most_common()
    print(f"**To Predict {categories[key]}")
    for ele in summed_wrong_classes:
        print(f" --Predicted {categories[ele[0]]} count={ele[1]}")
        confusion_matrix[key][ele[0]] = ele[1]
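The preview above uses `wrong_per_class`, `categories`, and `confusion_matrix` without showing how they are built. A minimal self-contained sketch of the same idea follows. It assumes `wrong_per_class` maps a true-class index to the list of class indices that were wrongly predicted for it; the category names and counts are made up for illustration and do not come from the gist.

```python
from collections import Counter

# Hypothetical inputs, not taken from the gist
categories = ["cat", "dog", "bird"]
# true class index -> list of wrongly predicted class indices
wrong_per_class = {0: [1, 1, 2], 1: [0], 2: []}

n = len(categories)
# Square matrix: rows are true classes, columns are predicted classes
confusion_matrix = [[0] * n for _ in range(n)]

for key, val in wrong_per_class.items():
    # most_common() yields (predicted_class, count) pairs, largest count first
    for predicted, count in Counter(val).most_common():
        confusion_matrix[key][predicted] = count

for label, row in zip(categories, confusion_matrix):
    print(label, row)
```

Only the off-diagonal cells are filled here, because the input holds misclassifications only. Correct predictions would go on the diagonal.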
@alexcpn
alexcpn / bind_to_released_pv.md
Created August 2, 2022 11:26
Kubernetes: How to Get the Data back for a StorageClass with reclaimPolicy as Retain

How to Get the Data back for a StorageClass with reclaimPolicy as Retain

Testing Using Rook-Ceph Storage Class with External Ceph

(Should work with any storage class)

Step 1: Create a Storage Class with reclaimPolicy as Retain
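The preview stops before showing the manifest for Step 1. As a rough sketch only, the snippet below builds a StorageClass manifest with `reclaimPolicy: Retain` and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The class name, the provisioner string, and the `parameters` values are assumptions for illustration; a real Rook-Ceph class needs further CSI parameters (secrets, image format, and so on) that are not shown in this preview.

```python
import json


def storage_class_manifest(name, provisioner, parameters=None):
    """Build a StorageClass manifest whose volumes are retained after the PVC is deleted."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        "reclaimPolicy": "Retain",  # keep the PV and its data when the claim is deleted
        "parameters": parameters or {},
    }


manifest = storage_class_manifest(
    name="rook-ceph-block-retain",             # hypothetical name
    provisioner="rook-ceph.rbd.csi.ceph.com",  # assumed Rook RBD CSI provisioner
    parameters={"clusterID": "rook-ceph", "pool": "replicapool"},  # placeholder values
)
print(json.dumps(manifest, indent=2))
```

The output can be saved to a file and applied with `kubectl apply -f <file>`. With `Retain`, deleting the PVC leaves the PV in the `Released` state instead of deleting the underlying volume.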

@alexcpn
alexcpn / Input_ssylog.csv
Last active May 4, 2022 11:24
TFIDF Rank
Mar 31 09:31:48 1freeipa-f476c9ffb-8gtc4 ns-slapd[2762]: [31/Mar/2022:09:31:48.659318010 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 11855 (rc: 32)
Mar 31 09:31:48 2freeipa-f476c9ffb-8gtc4 ns-slapd[2762]: [31/Mar/2022:09:31:48.751333697 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 11856 (rc: 32)
3DDDDfreeipa-f476c9ffb-8gtc4 ns-slapd[2762]: [31/Mar/2022:09:31:49.888424524 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 11864 (rc: 32)
Mar 31 09:31:48 4freeipa-f476c9ffb-8gtc4 ns-slapd[2762]: [31/Mar/2022:09:31:48.783381048 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 11857 (rc: 32)
Mar 31 09:31:48 5freeipa-f476c9ffb-8gtc4 ns-slapd[2762]: [31/Mar/2022:09:31:48.826483871 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 11858 (rc: 32)
Mar 31 09:31:49 freeipa-f476c9ffb-8gtc4 ns-slapd[2762]: [31/Mar/2022:09:31:49.156622971 +0000] DSRetroclPlugin - de
@alexcpn
alexcpn / tfidf_vectorizer .py
Created April 28, 2022 10:54
TF-IDF Vectorizer
# Using TfidfVectorizer
# https://melaniewalsh.github.io/Intro-Cultural-Analytics/05-Text-Analysis/03-TF-IDF-Scikit-Learn.html
import sys
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Keep only tokens that contain at least one letter, and drop English stop words
tfidf_vectorizer = TfidfVectorizer(token_pattern=u'(?ui)\\b\\w*[a-z]+\\w*\\b', stop_words='english')
df = read_syslog(sys.argv[1])  # read_syslog is not shown in this preview
tfidf_vector = tfidf_vectorizer.fit_transform(df['y_org'])
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
feature_names = tfidf_vectorizer.get_feature_names_out()
print(feature_names)
tfidf_df = pd.DataFrame(tfidf_vector.toarray(), index=df['ds_org'], columns=feature_names)
# Add a row holding, for each term, the number of documents that contain it
tfidf_df.loc['00_Document Frequency'] = (tfidf_df > 0).sum()
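The last line of the snippet counts, for each term, how many documents give it a non-zero TF-IDF weight, which is the term's document frequency. A minimal sketch of that count without scikit-learn is shown below. It assumes plain whitespace tokenization rather than the gist's `token_pattern`, and the log messages are invented for illustration.

```python
from collections import Counter

# Hypothetical log messages, not taken from the gist's input file
documents = [
    "could not delete change record",
    "delete change record failed",
    "starting osd",
]

document_frequency = Counter()
for doc in documents:
    # set() ensures a term repeated within one document is counted once
    document_frequency.update(set(doc.split()))

for term, count in sorted(document_frequency.items()):
    print(term, count)
```

Terms that appear in nearly every log line get a high document frequency and therefore a low IDF weight, which is why this row is useful for spotting uninformative tokens.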
```
^Croot@balamurugan-VirtualBox:~# kubectl logs rook-ceph-osd-8-ccb58986d-5bm22 -n rook-ceph -f
debug 2022-04-13T08:55:05.330+0000 7f580cf7af40 0 set uid:gid to 167:167 (ceph:ceph)
debug 2022-04-13T08:55:05.330+0000 7f580cf7af40 0 ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable), process ceph-osd, pid 1
debug 2022-04-13T08:55:05.330+0000 7f580cf7af40 0 pidfile_write: ignore empty --pid-file
debug 2022-04-13T08:55:05.331+0000 7f580cf7af40 1 bdev create path /var/lib/ceph/osd/ceph-8/block type kernel
debug 2022-04-13T08:55:05.331+0000 7f580cf7af40 1 bdev(0x56533512c000 /var/lib/ceph/osd/ceph-8/block) open path /var/lib/ceph/osd/ceph-8/block
debug 2022-04-13T08:55:05.331+0000 7f580cf7af40 -1 bdev(0x56533512c000 /var/lib/ceph/osd/ceph-8/block) _aio_start io_setup(2) failed with EAGAIN; try increasing /proc/sys/fs/aio-max-nr
debug 2022-04-13T08:55:05.332+0000 7f580cf7af40 0 starting osd.8 osd_data /var/lib/ceph/osd/ceph-8 /var/lib/ceph/osd/ceph-8/journal
debug 2022-04-13T08:55:05.3