@canavandl
Last active December 29, 2015 23:49

Efficiency Determination at Temperature of Nanomaterials

The efficiency of our materials at application temperatures is a metric of high importance at PLT. To understand their efficiency, nicknamed PLQY for photoluminescent quantum yield, we temperature-cycle the materials to characterize how PLQY changes with temperature.

This script calculates the sample PLQY at designated 'temperatures of merit' from the raw measurement files.

The measurement files from the instrument are saved in the TDMS format, a binary file format from National Instruments that is similar in structure to MongoDB's BSON.

Temperature Cycle Measurement Batching

The temperature analysis is generally performed upon the completion of a temperature cycle of a set of film samples, though several measurement cycles may occur before running.

The file sorting portion of the program groups measurement files by batch ID and measurement date. Assumptions are that each object is only measured once in a single day and that no measurements occur through the day change (midnight in the local timezone). This may change in the future and need to be reimplemented.
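The grouping rule above can be sketched in a few lines. This is a minimal illustration, not the production code: the record tuples, the file names, and the fixed UTC-8 offset (standing in for the real US/Pacific timezone, which also observes daylight saving) are all assumptions made for the example.

```python
from collections import OrderedDict
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-8 offset stands in for the lab's local timezone.
LOCAL_TZ = timezone(timedelta(hours=-8))


def group_by_batch_and_date(records):
    """Group (filename, batch_id, naive UTC datetime) records by
    (batch_id, local measurement date), preserving first-seen order."""
    groups = OrderedDict()
    for filename, batch_id, utc_dt in records:
        # Interpret the naive timestamp as UTC, then shift to local time
        # before taking the date, so late-evening files are not split off.
        local_dt = utc_dt.replace(tzinfo=timezone.utc).astimezone(LOCAL_TZ)
        groups.setdefault((batch_id, local_dt.date()), []).append(filename)
    return groups


# Hypothetical records: (filename, batch ID, UTC timestamp from the file)
records = [
    ("a1.tdms", "B-001", datetime(2013, 6, 17, 18, 0)),  # 10:00 local, June 17
    ("a2.tdms", "B-001", datetime(2013, 6, 18, 2, 0)),   # 18:00 local, June 17
    ("a3.tdms", "B-001", datetime(2013, 6, 18, 18, 0)),  # 10:00 local, June 18
    ("b1.tdms", "B-002", datetime(2013, 6, 17, 19, 0)),  # 11:00 local, June 17
]
groups = group_by_batch_and_date(records)
for (batch_id, day), files in groups.items():
    print(batch_id, day.isoformat(), files)
```

Note that a2.tdms carries a UTC date of June 18 but lands in the June 17 group once converted, which is the batching issue the local-time conversion is meant to avoid.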

Film Temperature Correction

The films themselves and the puck that they sit on are composed of highly insulative materials - thus the 'effective' temperature of the films and particles in film may deviate from the drive temperature and vary from film to film. Therefore, it is necessary to calculate a corrected film temperature for each PLQY measurement.

The Varshni equation describes the change in the bandgap energy Eg of a semiconductor material with respect to temperature (https://en.wikipedia.org/wiki/Band_gap).

We use a linear model, a simplification of Varshni's empirical expression where B -> 0 (B is a material-dependent property), which closely approximates our system, to calculate the 'actual' film temperature from the emission centroid.
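Varshni's empirical relation is usually written Eg(T) = Eg(0) - A*T^2/(T + B); when B is negligible it reduces to a straight line in T. The sketch below shows the resulting correction under that assumption: fit centroid = m*T + b against the drive temperature, then invert the line to read a temperature back from each measured centroid. The helper names and the numbers are hypothetical, and the fit is done on centroid wavelength, as the analysis script does, rather than on energy.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = m * xs + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def corrected_temperature(centroid_nm, m, b):
    """Invert the fitted line: temperature implied by an emission centroid."""
    return (centroid_nm - b) / m


# Hypothetical measurements: drive temperature (K) and emission centroid (nm)
drive_temp_k = [303.15, 333.15, 363.15, 393.15]
centroid_nm = [620.0, 623.2, 625.8, 629.0]

m, b = fit_line(drive_temp_k, centroid_nm)
corrected_k = [corrected_temperature(wl, m, b) for wl in centroid_nm]
for drive, corrected in zip(drive_temp_k, corrected_k):
    print("drive %.2f K -> corrected %.2f K" % (drive, corrected))
```

Each film's corrected temperature differs slightly from the drive temperature, while the average over the set is unchanged; the slope m is what the script stores as the peak shift.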

Sample Set Averaging

Temperature measurements appear especially 'noisy', so PLT measures multiple films made from the same sample batch in order to accurately characterize the sample.

We employ a locally weighted moving average algorithm (LOWESS) to average the set of films within a single batch and return a representative curve. (As the measurements are taken over an inconsistent time and temperature span, a simpler boxcar moving average or similar doesn't work.)
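To illustrate why weighting by distance in temperature helps with unevenly spaced points, here is a deliberately simplified sketch: a tricube-weighted local mean. It is not the full LOWESS procedure (which fits a local line and adds robustness iterations); the function names, the span value, and the data are assumptions for the example.

```python
def tricube(u):
    """Tricube kernel: 1 at u = 0, falling smoothly to 0 at |u| >= 1."""
    u = min(abs(u), 1.0)
    return (1.0 - u ** 3) ** 3


def local_weighted_mean(xs, ys, x0, span):
    """Tricube-weighted mean of ys, using only points within `span` of x0.
    Returns None when no point falls inside the window."""
    weights = [tricube((x - x0) / span) for x in xs]
    total = sum(weights)
    if total == 0.0:
        return None
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical, unevenly spaced measurements (temperature in K, PLQY)
xs = [300.0, 301.0, 330.0, 331.0, 390.0]
ys = [0.90, 0.92, 0.80, 0.82, 0.60]

smoothed = local_weighted_mean(xs, ys, 330.5, 10.0)
print(smoothed)
```

A boxcar average over a fixed number of neighbours would mix the 330 K points with the distant 300 K or 390 K points; weighting by distance in temperature keeps each estimate local regardless of how the samples are spaced.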

Interpolation to Calculate PLQY values

In order to compare distinct film batches, PLT has decided on a set of 'temperatures of merit' at which to contrast samples. Currently these are 30C, 60C, 100C and 120C - although these are highly likely to change. As films are not always measured at exactly each temperature, these values are created by interpolation of the 'representative' PLQY values.
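A minimal sketch of the interpolation step, assuming a smoothed curve that is already sorted by temperature. The helper name and the data are hypothetical; temperatures are given in Celsius here for readability, whereas the analysis script works in kelvin. A temperature of merit outside the measured range yields None rather than an extrapolated value.

```python
from bisect import bisect_left

TEMPS_OF_MERIT_C = [30.0, 60.0, 100.0, 120.0]


def interpolate_or_none(xs, ys, x):
    """Linear interpolation of (xs, ys) at x; xs must be sorted ascending.
    Returns None when x lies outside the measured range."""
    if x < xs[0] or x > xs[-1]:
        return None
    i = bisect_left(xs, x)
    if xs[i] == x:
        return ys[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical smoothed curve: film temperature (C) and representative PLQY
temps_c = [28.0, 55.0, 65.0, 95.0, 110.0]
plqy = [0.90, 0.84, 0.80, 0.70, 0.64]

results = [interpolate_or_none(temps_c, plqy, t) for t in TEMPS_OF_MERIT_C]
for t, value in zip(TEMPS_OF_MERIT_C, results):
    print(t, value)
```

In this example the curve stops at 110C, so the 120C entry comes back as None and can be stored as a missing value.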

Writing to LIMS

@todo - document better

"""
This module implements the Lowess function for nonparametric regression.
Functions:
lowess Fit a smooth nonparametric regression curve to a scatterplot.
For more information, see
William S. Cleveland: "Robust locally weighted regression and smoothing
scatterplots", Journal of the American Statistical Association, December 1979,
volume 74, number 368, pp. 829-836.
William S. Cleveland and Susan J. Devlin: "Locally weighted regression: An
approach to regression analysis by local fitting", Journal of the American
Statistical Association, September 1988, volume 83, number 403, pp. 596-610.
"""
from math import ceil
import numpy as np
from scipy import linalg
def lowess(x, y, f=2./3., iter=3):
    """lowess(x, y, f=2./3., iter=3) -> yest
    Lowess smoother: Robust locally weighted regression.
    The lowess function fits a nonparametric regression curve to a scatterplot.
    The arrays x and y contain an equal number of elements; each pair
    (x[i], y[i]) defines a data point in the scatterplot. The function returns
    the estimated (smooth) values of y.
    The smoothing span is given by f. A larger value for f will result in a
    smoother curve. The number of robustifying iterations is given by iter. The
    function will run faster with a smaller number of iterations."""
    n = len(x)
    r = int(ceil(f*n))
    h = [np.sort(np.abs(x - x[i]))[r] for i in range(n)]
    w = np.clip(np.abs((x[:,None] - x[None,:]) / h), 0.0, 1.0)
    w = (1 - w**3)**3
    yest = np.zeros(n)
    delta = np.ones(n)
    for iteration in range(iter):
        for i in range(n):
            weights = delta * w[:,i]
            b = np.array([np.sum(weights*y), np.sum(weights*y*x)])
            A = np.array([[np.sum(weights), np.sum(weights*x)],
                          [np.sum(weights*x), np.sum(weights*x*x)]])
            beta = linalg.solve(A, b)
            yest[i] = beta[0] + beta[1]*x[i]
        residuals = y - yest
        s = np.median(np.abs(residuals))
        delta = np.clip(residuals / (6.0 * s), -1, 1)
        delta = (1 - delta**2)**2
    return yest
if __name__ == '__main__':
    import math
    n = 100
    x = np.linspace(0, 2 * math.pi, n)
    y = np.sin(x) + 0.3*np.random.randn(n)
    f = 0.25
    yest = lowess(x, y, f=f, iter=3)
    import pylab as pl
    pl.clf()
    pl.plot(x, y, label='y noisy')
    pl.plot(x, yest, label='y pred')
    pl.legend()
    pl.show()
"""TempDep TDMS measurement file library for PLT temperature test data
The following functions parse and assemble temperature test data using
the Pandas Python library.
"""
#-----------------------------------------------------------------------------
# Created on June 17, 2013
#
# @author: lcanavan
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Imports
#-----------------------------------------------------------------------------
# Stdlib imports
from collections import OrderedDict
import datetime as datetime
# Third-party imports
from nptdms import TdmsFile
import pandas as pd
import pytz
#-----------------------------------------------------------------------------
# Globals and constants
#-----------------------------------------------------------------------------
LOCAL_TZ = 'US/Pacific'
#-----------------------------------------------------------------------------
# Classes and functions
#-----------------------------------------------------------------------------
def get_tdms_properties(raw_file,group):
    '''
    Takes a temperature measurement TDMS file path and a TDMS group name as
    inputs and returns the TDMS group properties as a dictionary.
    '''
    return TdmsFile(raw_file).object(group).properties

def get_batchID(raw_file,group):
    '''
    Takes a temperature measurement TDMS file path and a TDMS group name as
    inputs and returns the Batch ID for the corresponding measured film.
    '''
    return TdmsFile(raw_file).object(group).properties['BatchID']

def get_measurementdatetime(raw_file,group):
    '''
    Takes a temperature measurement TDMS file path and a TDMS group name as
    inputs and returns the timestamp for the corresponding measured film.
    This timestamp is converted to the local US/Pacific timezone (the Test
    Department saves files as UTC, which causes batching issues).
    '''
    utc_dt = TdmsFile(raw_file).object(group).properties['DateTime']
    return utc_dt.replace(tzinfo=pytz.utc).astimezone(pytz.timezone(LOCAL_TZ))

def get_sampleID(raw_file,group):
    '''
    Takes a temperature measurement TDMS file path and a TDMS group name as
    inputs and returns the Sample ID for the corresponding measured film.
    '''
    return TdmsFile(raw_file).object(group).properties['SampleID']
def get_summary_frame(raw_list,group):
    '''
    Takes a list of temperature measurement TDMS file paths and a TDMS group
    name as inputs and returns a Pandas DataFrame object that contains all of
    the TDMS property data, sorted by DateTime.
    Please note that this function does not return the channel (spectral) data.
    @todo reimplement this using calculated PLQY and centroid values, not on-
    device values. The instrument calculations are suspect...
    '''
    dataframe_set = []
    for filename in raw_list:
        raw = TdmsFile(filename)
        raw_data = pd.DataFrame(raw.object(group).properties, index = [filename])
        dataframe_set.append(raw_data)
    new_frame = pd.concat(dataframe_set)
    return new_frame.sort(columns=['DateTime'])
def get_clean_frame(raw_data):
    '''
    Takes a Pandas DataFrame "SummaryFrame" object as an input and returns a new
    Pandas DataFrame "SummaryFrame" object where all of the measured film temperatures
    are above 0 deg C. It was noted that occasionally such 'bad' data points are
    created by the test instrument.
    This function may be expanded later to remove other erroneous
    measurements.
    '''
    return raw_data[raw_data['Film_Temp_C'] > 0]
def sort_rawfiles_by_batchid(tdms_files,group):
    '''
    Takes a list of temperature measurement TDMS files and a TDMS group
    name and sorts the files by film Batch ID and measurement date.
    This function returns a dictionary of lists, where the
    keys are film (batch ID, measuredate) tuples and the values are
    lists of TDMS-format measurement files corresponding to the key.
    The measuredate part was added to avoid bundling samples measured
    on consecutive days.
    '''
    unique_batchIDs_bin = OrderedDict()
    for item in tdms_files:
        currentID = get_batchID(item,group)
        currentDate = get_measurementdatetime(item,group).date()
        if (currentID, currentDate) not in unique_batchIDs_bin:
            unique_batchIDs_bin[(currentID, currentDate)]=[]
        unique_batchIDs_bin[(currentID, currentDate)].append(item)
    return unique_batchIDs_bin
def sort_rawfiles_by_sampleid(tdms_files,group):
    '''
    Takes a list of temperature measurement TDMS files and a TDMS group
    name and sorts the files by film Sample ID.
    This function returns a dictionary of lists, where the
    keys are film Sample IDs and the values are lists of TDMS-format
    measurement files corresponding to the key Sample ID.
    '''
    unique_sampleIDs_bin = OrderedDict()
    for item in tdms_files:
        currentID = get_sampleID(item,group)
        if currentID not in unique_sampleIDs_bin:
            unique_sampleIDs_bin[currentID]=[]
        unique_sampleIDs_bin[currentID].append(item)
    return unique_sampleIDs_bin
if __name__ == '__main__':
    # This module only defines helper functions; no main() exists to call.
    pass
"""PLT Temperature Test Data Analysis
The main function takes raw temperature measurement TDMS files and writes
the calculated RT, 60C, 100C and 120C PLQY values, along with other relevant
parameters to LIMS
"""
#-----------------------------------------------------------------------------
# Created on June 17, 2013
#
# @author: lcanavan
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Imports
#-----------------------------------------------------------------------------
# Stdlib imports
import os
import glob
# Third-party imports
import pandas as pd
import numpy as np
from scipy.optimize import curve_fit
from scipy.interpolate import interp1d
from django.core.management.base import BaseCommand
from django.forms.models import model_to_dict
# Module imports
import tdms_scrape
import lowess
from sample_inventory.models import ReactionManager
from sample_characterization.models import UpdatedTemperatureTestResult
#-----------------------------------------------------------------------------
# Globals and constants
#-----------------------------------------------------------------------------
TDMS_GROUP = 'Tempdep_Data'
REL_TEMPS = [303.15, 333.15, 373.15, 393.15]
HEATING_MEASUREMENTS = 3
DELETE_FILES_AFTER_ANALYSIS = False
#-----------------------------------------------------------------------------
# Classes and functions
#-----------------------------------------------------------------------------
def linear_band_gap_energy(T,m,b):
    '''
    Linear function model for simplified Varshni's empirical expression where
    B -> 0. Takes floats of temperature, slope and intercept as inputs and
    returns a float of the estimated band gap energy wavelength.
    '''
    return m*T+b

def emission_corrected_temp(wl,m,b):
    '''
    Inverse of linear_band_gap_energy function. Takes floats of band gap
    energy wavelength, slope and intercept and returns a float of the
    estimated emission corrected temperature.
    '''
    return (wl-b)/m

def get_emission_corrected_temp(FilmTempK,BandGapEnergy):
    '''
    Takes 1D arrays of film temperature and calculated band gap energy
    and returns a 1D array of the corrected film temperature along with
    the curve_fit outputs popt (fitted slope and intercept) and pcov
    (covariance matrix of the fitted parameters)
    '''
    popt, pcov = curve_fit(linear_band_gap_energy,FilmTempK,BandGapEnergy)
    CorrectedFilmTempK = emission_corrected_temp(BandGapEnergy,
                                                 popt[0],
                                                 popt[1]
                                                 )
    return CorrectedFilmTempK, popt, pcov
def getmovingaverage(FilmTempK,PLQY,*args,**kwargs):
    '''
    Takes 1D arrays of corrected film temp (x) and PLQY (y) and
    utilizes Cleveland's locally weighted regression (lowess)
    to return a smoothed PLQY curve
    '''
    return lowess.lowess(FilmTempK,PLQY,*args,**kwargs)

def get_interpolated_fit(FilmTempK,PLQY,PLQYAVG):
    '''
    Takes 1D arrays of film temperatures and unsmoothed and smoothed
    PLQY measures and returns a function object of the interpolated
    PLQY curve, along with the coefficient of determination (R^2) of the
    smoothed curve against the raw PLQY values (quality of fit)
    '''
    interpfxn = interp1d(FilmTempK,PLQYAVG,kind='linear')
    raw_mean = PLQY.sum()/len(PLQY)
    SStot = ((PLQY-raw_mean)**2).sum()
    SSres = ((PLQY-interpfxn(FilmTempK))**2).sum()
    return interpfxn, 1-(SSres/SStot)

def fit_interpolated_temps(interpfxn, REL_TEMPS):
    '''
    Takes a list of float temperatures of interest (global REL_TEMPS) and the
    derived interpolation function (interpfxn) from get_interpolated_fit
    and returns a list of floats of the interpolated PLQY values at the
    REL_TEMPS values. Temperatures outside the measured range yield None.
    '''
    interpolated_PLQYlist = []
    for temp in REL_TEMPS:
        try:
            interpolated_PLQYlist.append(interpfxn(temp))
        except ValueError:
            interpolated_PLQYlist.append(None)
    return interpolated_PLQYlist
class Command(BaseCommand):
    args = 'File path to TDMS files'
    help = 'Performs film temperature test analysis'

    def handle(self,*args,**options):
        '''
        Performs the daily temperature test calculations
        Takes a filepath of the daily TDMS files as the single argument
        '''
        # Directory Level Stuff
        PATH = args[0]
        clean_raw_filelist = glob.glob(PATH+'*.tdms')
        if len(clean_raw_filelist) == 0:
            print 'No files to analyze'
            return
        print 'TDMS files sourced from: %s' %PATH
        print 'Total Number of TDMS Files: %d' %len(clean_raw_filelist)
        # Creates sorted-by-batchID object for calculations
        sorted_raw_filelist = tdms_scrape.sort_rawfiles_by_batchid(clean_raw_filelist, TDMS_GROUP)
        for batchID_set in sorted_raw_filelist.values():
            try:
                # Initial get/cleaning of dataframe
                batchID_frame = tdms_scrape.get_summary_frame(batchID_set,TDMS_GROUP)
                batchID_frame['Film_Temp_K'] = batchID_frame['Film_Temp_C'] + 273.15
                batchID_frame = tdms_scrape.get_clean_frame(batchID_frame)
                # Uniques = number of films per sample
                uniques = batchID_frame['SampleID'].unique()
                # Instantiates TemperatureTestResult model
                filmresult = UpdatedTemperatureTestResult(SampleLabel = batchID_frame['BatchID'][0],
                                                          MeasureDate = batchID_frame['DateTime'][0],
                                                          )
                filmresult.InitialRTPLQY = batchID_frame['PLQY'][:len(uniques)-1].mean()
                filmresult.PLPeakWL = batchID_frame['PLPeakWL'][:len(uniques)-1].mean()
                filmresult.PLCentroidWL = batchID_frame['WeightedMeanWL'][:len(uniques)-1].mean()
                filmresult.PLPeakFWHM = batchID_frame['PLPeakFWHM'][:len(uniques)-1].mean()
                filmresult.PLPeakDifference = filmresult.PLCentroidWL - filmresult.PLPeakWL
                filmresult.SampleTimeStamp = batchID_frame['Sample_Time_s'][0]
                # Removes film measurements taken during heating cycle
                batchID_frame = batchID_frame[len(uniques)*HEATING_MEASUREMENTS:]
                print 'Analyzing Batch ID: %s' %filmresult.SampleLabel
                # Does film temp correction using centroid WL
                correctedfilmtempk, popt, pcov = get_emission_corrected_temp(batchID_frame['Film_Temp_K'],
                                                                             batchID_frame['WeightedMeanWL']
                                                                             )
                batchID_frame['Corrected_Film_Temp_K'] = correctedfilmtempk
                filmresult.PLPeakShift = popt[0]
                # Sort dataframe by emission corrected temp for interpolation
                sorted_batchID_frame = batchID_frame.sort(columns = ['Corrected_Film_Temp_K'])
                # Lowess smoothing (applied locally-weighted moving average to film temps)
                sorted_batchID_frame['PLQY_AVG'] = getmovingaverage(sorted_batchID_frame['Corrected_Film_Temp_K'],
                                                                    sorted_batchID_frame['PLQY'],
                                                                    f=3./5.,
                                                                    iter=3
                                                                    )
                # Do interpolation fitting
                interpfxn, interperror = get_interpolated_fit(sorted_batchID_frame['Corrected_Film_Temp_K'],
                                                              sorted_batchID_frame['PLQY'],
                                                              sorted_batchID_frame['PLQY_AVG']
                                                              )
                filmresult.FitQuality = interperror
                # Add interpolated PLQYs at REL_TEMPS to TemperatureTestResult Model
                interptemps = fit_interpolated_temps(interpfxn, REL_TEMPS)
                filmresult.PLQY25C = interptemps[0]
                filmresult.PLQY60C = interptemps[1]
                filmresult.PLQY100C = interptemps[2]
                filmresult.PLQY120C = interptemps[3]
                try:
                    filmresult.PLQYHysteresis = filmresult.PLQY25C - filmresult.InitialRTPLQY
                except:
                    filmresult.PLQYHysteresis = None
                filmresult.PLPeakBroadening = 0
                # Attempt to match BatchID in TDMS file (SampleLabel) to ProductID in ReactionManager Model
                try:
                    filmresult.ProductID = ReactionManager.objects.get(ReactionID = filmresult.SampleLabel.replace('-I','').upper())
                except:
                    filmresult.ProductID = ReactionManager.objects.get(pk=240)
                # Gets QuerySet of previous measurements of TDMS file films
                previous_meas = UpdatedTemperatureTestResult.objects.filter(SampleLabel = filmresult.SampleLabel.replace('-I',''))
                # If no previous measurements, assumes TDMS file is first measurement
                if len(previous_meas) > 0:
                    first_meas = previous_meas[0]
                    filmresult.ExposureTime = (filmresult.SampleTimeStamp - first_meas.SampleTimeStamp)/3600.0
                    filmresult.FirstMeasurement = False
                else:
                    filmresult.ExposureTime = 0
                    filmresult.FirstMeasurement = True
                # TemperatureTestResult validation and commit
                filmresult.full_clean()
                filmresult.save()
                # Clear Daily folder of analyzed files
                if DELETE_FILES_AFTER_ANALYSIS:
                    for file in batchID_set:
                        os.remove(os.path.join(PATH, file))
            except:
                pass
        if DELETE_FILES_AFTER_ANALYSIS:
            print 'Directory has been cleared: %s' %PATH
        print 'Analysis Complete'
        return