@donbr
donbr / arize-phoenix-iceberg.md
Last active December 6, 2024 17:18
Integrating Arize Phoenix and Apache Iceberg for Local Telemetry Data Management and Querying

Title: Integrating Arize Phoenix and Apache Iceberg for Local Telemetry Data Management and Querying

Authors: Don Branson


Abstract

In modern data observability workflows, capturing and managing telemetry data is crucial for debugging and improving machine learning systems. This paper demonstrates the integration of Arize Phoenix, an open-source observability platform, with Apache Iceberg, a high-performance table format for data lakes, to create a scalable and efficient local telemetry data management and querying system. We present a step-by-step implementation for capturing telemetry data as Parquet files using Arize Phoenix on a local system and using Apache Iceberg to enable schema evolution, time travel, and efficient queries. This solution bridges the gap between data observability and data lake management for machine learning monitoring.
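To make the two Iceberg features named in the abstract more concrete, here is a toy, pure-Python model of snapshot-based time travel and additive schema evolution. It is only a sketch of the idea: it does not use Phoenix, Parquet, or any Iceberg API, and the `SnapshotTable` class, its methods, and the column names (`span_id`, `latency_ms`, `model`) are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotTable:
    """Toy append-only table that keeps one immutable snapshot per commit."""

    columns: list = field(default_factory=list)
    # Each snapshot is a (columns, rows) pair frozen at commit time.
    snapshots: list = field(default_factory=list)

    def append(self, rows):
        """Commit a snapshot holding all earlier rows plus `rows`; return its id."""
        previous = self.snapshots[-1][1] if self.snapshots else ()
        new_rows = previous + tuple(dict(row) for row in rows)
        self.snapshots.append((tuple(self.columns), new_rows))
        return len(self.snapshots) - 1

    def add_column(self, name):
        """Additive schema evolution: existing rows are not rewritten."""
        if name not in self.columns:
            self.columns.append(name)

    def scan(self, snapshot_id=None):
        """Read the table as of a snapshot (latest by default).

        Columns a row never had are returned as None.
        """
        if not self.snapshots:
            return []
        index = -1 if snapshot_id is None else snapshot_id
        columns, rows = self.snapshots[index]
        return [{name: row.get(name) for name in columns} for row in rows]


table = SnapshotTable(columns=["span_id", "latency_ms"])
first = table.append([{"span_id": "a", "latency_ms": 12}])
table.add_column("model")
second = table.append([{"span_id": "b", "latency_ms": 30, "model": "m1"}])

old_view = table.scan(first)  # the table as it looked after the first commit
new_view = table.scan()       # the latest snapshot, with the evolved schema
```

Reading `first` returns only the original row with the original two columns, while the latest snapshot returns both rows and fills `model` with `None` for the row written before the column existed. A real table format tracks this through metadata files rather than in-memory tuples.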

donbr / presto-iceberg-hive-minio-synth-prototype.md
Last active December 5, 2024 22:07
Presto Iceberg Hive Minio - a mock prototype

A Hypothetical Medium Post...

I created a cool prototype for airline data last year but misplaced the code, so I recently created this "synthetic" prototype stringing together a few excerpts from old ChatGPT messages and simple prompt engineering.

I haven't found the data source yet, but going through the process has been such a gift. I knew these concepts were becoming essential again, and this helped me both remember key concepts AND generate excitement around new opportunities.

Reminds me of Hypothetical Document Embeddings... apparently HyDE can help me retrieve stuff from long-term memory too.

Feel free to have a laugh at my expense!

donbr / ollama-wsl-ollama-service-error.md
Created November 25, 2024 22:04
Resolving Ollama Service "Permission Denied" Error

FAQ: Resolving Ollama Service "Permission Denied" Error

Issue

When starting the Ollama service, it fails with the following error message:

Error: could not create directory mkdir /usr/share/ollama: permission denied
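Before changing anything, it can help to confirm that the failure really is a filesystem permission problem for the user running the service. The following is a minimal diagnostic sketch using only the Python standard library; the helper names `nearest_existing_parent` and `can_create_directory` are hypothetical, and the check is an approximation (it ignores things like read-only mounts or security modules).

```python
import os
from pathlib import Path


def nearest_existing_parent(path):
    """Return the closest ancestor of `path` (or `path` itself) that exists."""
    current = Path(path)
    while not current.exists():
        if current.parent == current:  # reached the filesystem root
            break
        current = current.parent
    return current


def can_create_directory(path):
    """Estimate whether the current user could create or write to `path`.

    If the directory already exists, check that it is writable. Otherwise
    check that the nearest existing ancestor is a writable directory,
    since that is where mkdir would have to create the first new entry.
    """
    target = Path(path)
    if target.is_dir():
        return os.access(target, os.W_OK | os.X_OK)
    parent = nearest_existing_parent(target)
    return parent.is_dir() and os.access(parent, os.W_OK | os.X_OK)
```

Running `can_create_directory("/usr/share/ollama")` as the same user that starts the service shows whether that user is expected to be able to create the directory named in the error.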
donbr / emerging-standards-drug-repurposing-bioinformatics.md
Last active November 12, 2024 15:02
Emerging Standards and Trends in Drug Repurposing Bioinformatics: Explainable AI, Data Privacy, and Cloud-Based Advancements in 2024

Abstract

Traditional drug discovery is a costly and time-consuming process. Drug repurposing offers a promising alternative by seeking new therapeutic applications for existing drugs. In 2024, bioinformatics has become essential to drug repurposing, integrating multi-omics data, computational models, and knowledge bases to accelerate this process. This report explores the current bioinformatics standards in drug repurposing, highlighting the growing importance of explainable AI, data privacy, cloud computing, and standardized ontologies. We examine key players in this field, including government agencies, philanthropic organizations, and industry stakeholders, and detail the vital tools propelling data integration and workflow efficiency.

1. Introduction

Drug repurposing presents a paradigm shift in drug discovery, aiming to find new therapeutic uses for existing dru

donbr / gtpd_init.ipynb
Last active November 12, 2024 03:19
gtpd_init.ipynb
donbr / langchain_ollama_image_summarization.ipynb
Created January 23, 2024 22:57
Ollama models - Image Summarization
donbr / network-pharma-string-stitch-clustering.md
Last active October 3, 2024 01:20
Network Pharmacology: Harnessing STRING, STITCH, and Clustering for Cellular Biology

Slide 1: Title & Introduction

Title:

Network Pharmacology: Harnessing STRING, STITCH, and Clustering for Cellular Biology


Introduction:

donbr / non-small-cell-lung-cancer-string-db.md
Last active September 26, 2024 16:06
Investigating Potential Drug Targets for Non-small Cell Lung Cancer through STRING Database Analysis

Investigating Potential Drug Targets for Non-small Cell Lung Cancer through STITCH Database Analysis

Purpose

The aim of this study is to identify potential drug targets and repurposing opportunities by analyzing interactions between key proteins involved in non-small cell lung cancer (NSCLC) pathways, focusing on the inhibition of aberrant signaling in cancer therapy.

Hypothesis Formulation

  • Initial Hypothesis: Targeting key nodes in the Ras, PI3K-Akt, Cell Cycle, p53, and Retinoid signaling pathways may provide effective therapeutic strategies for NSCLC by inhibiting cancer cell proliferation and survival.
donbr / a-string-stitch-apis.md
Last active September 23, 2024 00:23
Leveraging STRING and STITCH APIs for White Paper Insights: A Structured Approach

This document provides a structured approach to utilizing the STRING and STITCH APIs for retrieving protein-protein and chemical-protein interaction data, enriching the findings of your white paper on drug response assessment in lung cancer, particularly in the context of TP53 and RB1 pathways.

Understanding the APIs

  • STRING (Search Tool for the Retrieval of Interacting Genes/Proteins):
    STRING is a database and web resource dedicated to protein-protein interactions, including both known and predicted interactions. It integrates data from multiple sources, such as experimental repositories, computational prediction methods, and public text collections.

  • STITCH (Search Tool for Interactions of Chemicals):
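As a rough illustration of how a protein-protein interaction query might be assembled, the sketch below builds a STRING request URL and parses a tab-separated response. The base URL, the `network` method name, the `identifiers` and `species` parameters, and the carriage-return separator follow my understanding of the public STRING REST API and should be checked against its current documentation; the helper names (`build_string_url`, `parse_tsv`, `fetch_rows`) are hypothetical. STITCH is not covered here.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the public STRING REST API.
STRING_API_BASE = "https://string-db.org/api"


def build_string_url(method, identifiers, species=9606, output_format="tsv"):
    """Build a STRING API URL for `method` (for example "network").

    Multiple identifiers are joined with a carriage return, which
    urlencode turns into %0D. Species 9606 is the NCBI taxon for human.
    """
    params = {"identifiers": "\r".join(identifiers), "species": species}
    return f"{STRING_API_BASE}/{output_format}/{method}?{urlencode(params)}"


def parse_tsv(text):
    """Turn a tab-separated response body into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def fetch_rows(url, timeout=30):
    """Download a TSV response and parse it (performs a network call)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_tsv(response.read().decode("utf-8"))


# Build (but do not send) a query for the two genes named in the text.
url = build_string_url("network", ["TP53", "RB1"])
```

Calling `fetch_rows(url)` would then return one dict per interaction row, keyed by whatever column names the service sends back.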

WikiPathway: Retinoblastoma gene in cancer

PMID      Proteins  Year  Title
30773851  24        2019  Retinoblastoma mutation predicts poor outcomes in advanced non small cell lung cancer.
36399634  28        2023  Retinoblastoma Expression and Targeting by CDK4/6 Inhibitors in Small Cell Lung Cancer.
38034873  38        2023  Ropivacaine inhibits the malignant behavior of lung cancer cells by regulating retinoblastoma-binding protein 4.
17804741  3         2007  Retinoblastoma deficiency increases chemosensitivity in lung cancer.
25162518  31        2014  Retinoblastoma binding protein 2 (RBP2) promotes HIF-1alpha-VEGF-induced angiogenesis of non-small cell lung cancer via the Akt pathway.