Skip to content

Instantly share code, notes, and snippets.

@BroaderImpact
Last active February 19, 2023 19:04
Show Gist options
  • Save BroaderImpact/99c482459b00ce7f161d5191b9a2339e to your computer and use it in GitHub Desktop.
Save BroaderImpact/99c482459b00ce7f161d5191b9a2339e to your computer and use it in GitHub Desktop.
Lipidomic Analysis Workflow Using LipidR
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Lipidomics is a complex analytical endeavor. The goal of identifying differential expression *in silico* often becomes a proverbial \"fishing expedition\". To inject precision into the label-free mass spectrometric analysis of lipids, the following pipeline was developed. \n",
"\n",
"First, .raw files from the liquid chromatography-mass spectrometry (LCMS) were converted into mzML files. A first pass of these files was performed with SIMLipid, a proprietary software that relies upon the [LipidMAPS](https://www.lipidmaps.org/) database. Files were then converted to .csv for data cleaning and batch analysis. The homogeneous distribution of matrix-cluster signals were identified for each of the matrices employed, followed by the identification of homogeneous distribution of endogenous molecular signals in positive and negative ion mode. The inclusion of both external (ESTD) and internal (ISTD) standards allows for confirmation of lipid abundance using normalization algorithms. Subsequently, principal component analysis (PCA) was performed."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install and load the necessary packages\n",
"install.packages(\"lipidr\")\n",
"library(lipidr)\n",
"library(tidyverse)\n",
"library(ggplot2)\n",
"\n",
"# Load and preprocess the data\n",
"data <- read.csv(\"lipidomics_data.csv\", header = TRUE) # replace \"lipidomics_data.csv\" with the name of your data file\n",
"data_filtered <- filter_lipids(data, min_intensity = 1000, max_missing = 0.5) # filter out low-intensity lipids and lipids with >50% missing values\n",
"data_normalized <- normalize_lipids(data_filtered, method = \"TIC\") # normalize the data using Total Ion Current (TIC)\n",
"\n",
"# Exploratory data analysis\n",
"summary(data_normalized) # summarize the normalized data\n",
"hist(data_normalized[,1]) # plot a histogram of the normalized intensities for the first lipid\n",
"boxplot(data_normalized) # plot boxplots of the normalized intensities for all lipids\n",
"\n",
"# Multivariate analysis\n",
"pca <- prcomp(data_normalized, scale. = TRUE) # perform PCA on the normalized data\n",
"biplot(pca, col = \"black\", cex = 0.8) # plot a biplot of the PCA results\n",
"\n",
"# Data visualization\n",
"ggplot(data_normalized, aes(x = LipidClass, y = Intensity, fill = SampleType)) + geom_boxplot() # plot a boxplot of the normalized intensities by lipid class and sample type\n",
"ggplot(data_normalized, aes(x = LipidClass, y = Intensity, color = SampleType)) + geom_jitter() # plot a jitter plot of the normalized intensities by lipid class and sample type\n",
"ggplot(data_normalized, aes(x = LipidClass, y = Intensity, fill = SampleType)) + geom_violin() # plot a violin plot of the normalized intensities by lipid class and sample type"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Briefly, following attribution of signals to LipidMAPS identities, means and deviations thereof were calculated with relative abundance values. Covariant matrices were constructed, with several iterations of eigenvector calculations and values of covariance for each. Finally, data z-scores were projected onto new basis. Resulting graph shown."
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment