@brickZA
Last active August 25, 2016 15:13
Experimentation with hdbextractor python library

The Python library seems to be built as part of hdb++/hdbextractor/cpp.

Initial test files taken from

svn://svn.code.sf.net/p/tango-cs/code/archiving/hdb++/hdbextractor/cpp/tags/release-0.94.0/hdbxtest

Assuming you are using HDB++ with the new MySQL schema, that the database host, user, password and name are specified in db++config.dat, and that the hdbextractor install prefix is /usr/local, execute

export PYTHONPATH=/usr/local/lib/python2.7/dist-packages/hdbextractor
./hdbximpl.py db++config.dat mkat/proxies/anc/air_pressure "2013-01-19 19:07:10" "2017-01-19 19:08:10"

to extract the history of the air_pressure attribute on the mkat/proxies/anc device from the start time 2013-01-19 19:07:10 to the end time 2017-01-19 19:08:10.

Note that, as a proof of concept, the original hdbximpl.py was edited to collect the results into a list of values rather than printing them. That list (res_list) should be easy to write out to CSV using the csv module. The setup routine is also hardcoded to assume the new HDB++ MySQL schema (rather than the old HDB schema), because the original determined the schema by parsing the database name and was getting it wrong.
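
A minimal sketch of that CSV export, assuming res_list holds rows of [attribute name, timestamp, quality, value] as built at the end of hdbximpl.py. The function name write_res_list, the header names and the sample rows are illustrative only and are not part of hdbextractor.

```python
import csv
import io


def write_res_list(res_list, out):
    """Write rows of [attribute, timestamp, quality, value] to a file-like object."""
    writer = csv.writer(out, lineterminator="\n")
    # Header names are an assumption; hdbximpl.py does not define any.
    writer.writerow(["attribute", "timestamp", "quality", "value"])
    writer.writerows(res_list)


# Made-up sample rows, only to show the shape of the data.
sample = [
    ["mkat/proxies/anc/air_pressure", "2016-08-25 15:00:00", 0, "1013.2"],
    ["mkat/proxies/anc/air_pressure", "2016-08-25 15:00:10", 0, "1013.4"],
]
buf = io.StringIO()
write_res_list(sample, buf)
csv_text = buf.getvalue()
```

The sketch is written for Python 3. To write to disk there, pass a file opened with open("out.csv", "w", newline=""); under Python 2.7, which the PYTHONPATH above targets, the csv module expects the file to be opened in "wb" mode instead.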

# This file is parsed by the ConfigurationParser class.
# Lines are ignored after the '#' character.
# key - value pairs must be separated by a '=' character.
# White spaces and tabs are ignored.
#
#
dbuser = hdbreader
dbpass = R3aderP4ss
dbhost = monctl.devn.camlab.kat.ac.za
dbname = hdb
# The FillFromThePastMode configures how the lack of data in the desired time window is managed.
# Possible values (if not specified, the default is None):
#
# None: nothing is done if the window does not contain data or if valid data starts late.
# KeepWindow: look in the past for the most recent valid data and make it the first value of the window.
# The timestamp is changed to the start date/time of the time window.
# WidenWindow: look in the past for the most recent valid data and put it in as the first result,
# preserving its original timestamp.
#
FillFromThePastMode = WidenWindow
# The FillFromThePastThresholdPercent instructs the extractor to look in the past for the most recent
# valid value, if FillFromThePastMode is not set to None.
# The value is a floating point number representing the percentage of the whole time interval that the
# first data in the desired window must fall in so that fill from the past is unnecessary.
# For example, suppose you requested a time interval of 10 hours, i.e. stop - start = 10 hours.
# Then, if FillFromThePastThresholdPercent is 10.0 and the first valid data occurs within the first hour,
# there is no need to fill from the past. Otherwise FillFromThePastMode determines how the first value is
# brought into the data set at the beginning of the window.
# Default: 5%
#
FillFromThePastThresholdPercent = 5
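
The parsing rules stated at the top of the file (text after '#' is ignored, key and value are separated by '=', spaces and tabs are ignored) can be sketched in a few lines of Python. This is not the library's ConfigurationParser; parse_config is a hypothetical helper, and it assumes that only whitespace around keys and values is dropped.

```python
def parse_config(text):
    """Parse 'key = value' lines, ignoring '#' comments and surrounding whitespace."""
    settings = {}
    for line in text.splitlines():
        # Everything after the first '#' is treated as a comment.
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


example = """
# comment line
dbuser = reader   # trailing comment
dbhost=localhost
FillFromThePastMode = WidenWindow
"""
settings = parse_config(example)
```

One consequence of these rules, if the real parser follows them literally, is that a value containing '#' (for instance a password) would be cut off at that character.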
# This file is parsed by the ConfigurationParser class.
# Lines are ignored after the '#' character.
# key - value pairs must be separated by a '=' character.
# White spaces and tabs are ignored.
#
#
dbuser = hdbbrowser
dbpass =hdbbrowser
dbhost = fcsproxy
dbname=hdb
# The FillFromThePastMode configures how the lack of data in the desired time window is managed.
# # Possible values (if not specified, the default is None):
# #
# # None: nothing is done if the window does not contain data or if valid data starts late.
# # KeepWindow: look in the past for the most recent valid data and make it the first value of the window.
# # The timestamp is changed to the start date/time of the time window.
# # WidenWindow: look in the past for the most recent valid data and put it in as the first result,
# # preserving its original timestamp.
# #
FillFromThePastMode = WidenWindow
# #
# # The FillFromThePastThresholdPercent instructs the extractor to look in the past for the most recent
# # valid value, if FillFromThePastMode is not set to None.
# # The value is a floating point number representing the percentage of the whole time interval that the
# # first data in the desired window must fall in so that fill from the past is unnecessary.
# # For example, suppose you requested a time interval of 10 hours, i.e. stop - start = 10 hours.
# # Then, if FillFromThePastThresholdPercent is 10.0 and the first valid data occurs within the first hour,
# # there is no need to fill from the past. Otherwise FillFromThePastMode determines how the first value is
# # brought into the data set at the beginning of the window.
# # brought into the data set at the beginning of the window.
# # Default: 5%
# #
FillFromThePastThresholdPercent = 5
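
The threshold rule described in the comments above can be worked through as a small calculation. This is only my reading of those comments, not the extractor's code: needs_fill_from_past is a hypothetical name, and treating data that lands exactly on the threshold as "early enough" is an assumption.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"


def needs_fill_from_past(start, stop, first_data, threshold_percent=5.0):
    """Return True when the first valid sample arrives after the leading
    threshold_percent of the [start, stop] window."""
    window_seconds = (stop - start).total_seconds()
    limit = start + timedelta(seconds=window_seconds * threshold_percent / 100.0)
    return first_data > limit


start = datetime.strptime("2013-01-19 09:00:00", FMT)
stop = datetime.strptime("2013-01-19 19:00:00", FMT)   # 10 hour window
early = datetime.strptime("2013-01-19 09:30:00", FMT)  # inside the first hour
late = datetime.strptime("2013-01-19 11:00:00", FMT)   # two hours in

fill_for_early = needs_fill_from_past(start, stop, early, 10.0)
fill_for_late = needs_fill_from_past(start, stop, late, 10.0)
```

With a 10 hour window and a threshold of 10.0 the limit is one hour after the start, so the early sample needs no fill and the late one does. When a fill is needed, FillFromThePastMode decides whether the recovered value keeps its original timestamp (WidenWindow) or is moved to the window start (KeepWindow).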
#! /usr/bin/env python
import sys
import logging

from hdbextractor import *


class Hdbximpl(HdbExtractorListener):
    """Listener that accumulates extracted values in self.valuelist."""

    def __init__(self, dbu, dbh, dbn, dbp):
        HdbExtractorListener.__init__(self)
        self.dbuser = dbu
        self.dbhost = dbh
        self.dbname = dbn
        self.dbpass = dbp
        self.inttest = 0
        self.valuelist = []
        self.ex = Hdbextractor(self)
        # The original chose the schema from the database name, which picked
        # the wrong one here, so the new HDB++ MySQL schema is hardcoded.
        # if dbn == "hdbpp":
        #     db_type = Hdbextractor.HDBPPMYSQL
        # else:
        #     db_type = Hdbextractor.HDBMYSQL
        db_type = Hdbextractor.HDBPPMYSQL
        res = self.ex.connect(db_type, dbh, dbn, dbu, dbp, 3306)
        print("Connected: " + str(res))

    def onSourceProgressUpdate(self, name, percent):
        print(name + " data extraction: " + str(percent) + "%\n")
        self.extractData()

    def onExtractionFinished(self, totalrows, elapsed):
        print("extraction completed: got " + str(totalrows) + " rows in " + str(elapsed) + "s\n")

    def onSourceExtractionFinished(self, name, totalrows, elapsed):
        print(name + " data extraction completed in " + str(elapsed) + "s [" + str(totalrows) + " rows]\n")
        self.extractData()

    def addToTest(self, inpu):
        inpu = inpu + 1
        return inpu

    def getHdbExtractor(self):
        return self.ex

    def extractData(self):
        # The result of get() is indexed at [1] to obtain the updated value list.
        ret = self.ex.get(self.valuelist)[1]
        # print("extractData: got", ret)
        # print("______ VALUELIST _____")
        self.valuelist = ret
        # print(self.valuelist)

    def getValueList(self):
        return self.valuelist

    def getData(self, sources, start_date, stop_date):
        res = self.ex.getData(sources, start_date, stop_date)
        if not res:
            for s in sources:
                print("Error fetching data: " + s + " -> " + self.ex.getErrorMessage())

# scalar_double_ro: ./hdbxtest db++config.dat "lh/radiation_protection/blm_mscr_lh.02/BlmData" "2013-01-10 11:14:10.000000" "2013-01-11 11:14:10"
#
# scalar_double_rw: ./hdbxtest db++config.dat kg13/mod/llrf_kg13.01/RfReverse "2011-11-06 21:05:43.000000" "2011-11-07 21:05:43.000000"
#
# scalar_double_rw: ./hdbxtest db++config.dat kg10/mod/llrf_kg10.01/RfReverse "2011-11-06 21:05:43.000000" "2011-11-07 21:05:43.000000"
#
# array_double_ro: ./hdbxtest db++config.dat f/radiation_protection/blm_master_uh.01/BlmIntData "2013-01-18 01:38:20" "2013-01-19 11:38:20"
#
# HDB, spectrum, same start-stop date will fetch the first available data in the past
# ./hdbxtest dbconfig.dat f/radiation_protection/blm_master_uh.01/BlmIntData "2013-05-29 19:09:10" "2013-05-29 19:09:10"
#
# HDBPP, scalar, same start-stop date will fetch the first available data in the past
# ./hdbxtest db++config.dat kg10/mod/llrf_kg10.01/RfReverse "2011-11-07 21:05:43.000000" "2011-11-07 21:05:43.000000"
# HDB, two sources, short time window, to use to check data filling.
# ./hdbxtest dbconfig.dat f/radiation_protection/blm_master_linac.01/BlmIntData f/radiation_protection/blm_master_uh.01/BlmIntData "2013-01-19 19:07:10" "2013-01-19 19:08:10"

if __name__ == "__main__":
    logging.basicConfig(
        format="%(asctime)s - %(name)s - %(levelname)s - %(filename)s "
               ": %(lineno)d - %(message)s", level=logging.INFO)
    argc = len(sys.argv)
    # Expect: config file, one or more sources, start date, stop date.
    if argc < 5:
        sys.exit("Usage: " + sys.argv[0] + " configfile.dat domain/family/member/attribute "
                 "\"2014-07-20 10:00:00\" \"2014-07-20 12:00:00\"")
    conf = sys.argv[1]
    att = sys.argv[2]
    logging.info("argc: {}, conf: {}, att: {}".format(argc, conf, att))
    settings = HdbXSettings()
    settings.loadFromFile(conf)
    # Everything between the config file and the two dates is a source.
    sources = []
    for i in range(2, argc - 2):
        print("+ adding source " + sys.argv[i])
        sources.append(sys.argv[i])
    d1 = sys.argv[argc - 2]
    d2 = sys.argv[argc - 1]
    hdbximpl = Hdbximpl(settings.get("dbuser"), settings.get("dbhost"),
                        settings.get("dbname"), settings.get("dbpass"))
    logging.info('Got hdbximpl')
    hdbximpl.getHdbExtractor().setHdbXSettings(settings)
    logging.info('Before getData')
    hdbximpl.getData(sources, d1, d2)
    logging.info('After getData')
    valuelist = hdbximpl.getValueList()
    # print(valuelist)
    print("There are " + str(len(valuelist)) + " values")
    siever = DataSiever()
    siever.divide(valuelist)
    siever.fill()
    srcs = siever.getSources()
    print("Siever has " + str(siever.getSize()) + " sources")
    # Generate a list that could be exported to csv as attribute name,
    # timestamp, quality, value
    res_list = []
    for i in range(0, siever.getSize()):
        values = siever.getData(srcs[i])
        # XVariantPrinter().printValueList(values, 2)
        for val in values:
            res_list.append([srcs[i],
                             val.getTimestamp(),
                             val.getQuality(),
                             val.convertToString()])
    # import IPython ; IPython.embed()
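
The command-line convention shared by hdbximpl.py and the C++ test program (config file first, the two dates last, every argument in between a source) can be isolated in a small helper. The sketch below does not touch hdbextractor at all; split_args is a hypothetical name.

```python
def split_args(argv):
    """Split argv into (config file, list of sources, start date, stop date)."""
    if len(argv) < 5:
        raise ValueError(
            "usage: %s config.dat source [source ...] start stop" % argv[0])
    # argv[1] is the config file, the last two items are the dates,
    # and whatever sits between them is one or more source names.
    return argv[1], argv[2:-2], argv[-2], argv[-1]


conf, sources, start, stop = split_args([
    "./hdbximpl.py", "db++config.dat",
    "mkat/proxies/anc/air_pressure",
    "2013-01-19 19:07:10", "2017-01-19 19:08:10",
])
```

Slicing with argv[2:-2] is equivalent to the range(2, argc - 2) loop in the script and makes the "at least five arguments" requirement explicit.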
TEMPLATE = app
CONFIG += console
CONFIG -= app_bundle
CONFIG -= qt
QMAKE_CXXFLAGS += -std=gnu++98
SOURCES += main.cpp \
    myhdbextractorimpl.cpp
INCLUDEPATH += ../src ../src/db
LIBS += -L../src/.libs -lhdbextractor++
HEADERS += \
    myhdbextractorimpl.h
OTHER_FILES += \
    hdbxtest_example_usages.txt
scalar_double_ro: ./hdbxtest db++config.dat "lh/radiation_protection/blm_mscr_lh.02/BlmData" "2013-01-10 11:14:10.000000" "2013-01-11 11:14:10"
scalar_double_rw: ./hdbxtest db++config.dat kg13/mod/llrf_kg13.01/RfReverse "2011-11-06 21:05:43.000000" "2011-11-07 21:05:43.000000"
scalar_double_rw: ./hdbxtest db++config.dat kg10/mod/llrf_kg10.01/RfReverse "2011-11-06 21:05:43.000000" "2011-11-07 21:05:43.000000"
array_double_ro: ./hdbxtest db++config.dat f/radiation_protection/blm_master_uh.01/BlmIntData "2013-01-18 01:38:20" "2013-01-19 11:38:20"
HDB, spectrum, same start-stop date will fetch the first available data in the past
./hdbxtest dbconfig.dat f/radiation_protection/blm_master_uh.01/BlmIntData "2013-05-29 19:09:10" "2013-05-29 19:09:10"
HDBPP, scalar, same start-stop date will fetch the first available data in the past
./hdbxtest db++config.dat kg10/mod/llrf_kg10.01/RfReverse "2011-11-07 21:05:43.000000" "2011-11-07 21:05:43.000000"
HDB, two sources, short time window, to use to check data filling.
./hdbxtest dbconfig.dat f/radiation_protection/blm_master_linac.01/BlmIntData f/radiation_protection/blm_master_uh.01/BlmIntData "2013-01-19 19:07:10" "2013-01-19 19:08:10"
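
The usage lines above pass dates either as "YYYY-MM-DD HH:MM:SS" or with a trailing ".ffffff" microsecond part, and some of them use the same start and stop date on purpose. A caller might want to check its arguments before handing them to the extractor. The following is a sketch under those assumptions; parse_timestamp and check_window are hypothetical helpers, and the formats the library itself accepts may differ.

```python
from datetime import datetime


def parse_timestamp(text):
    """Parse 'YYYY-MM-DD HH:MM:SS' with an optional '.ffffff' suffix."""
    for fmt in ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognised timestamp: %r" % text)


def check_window(start_text, stop_text):
    """Return (start, stop) as datetimes; equal dates are allowed."""
    start = parse_timestamp(start_text)
    stop = parse_timestamp(stop_text)
    if stop < start:
        raise ValueError("stop date precedes start date")
    return start, stop


window = check_window("2013-01-10 11:14:10.000000", "2013-01-11 11:14:10")
```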
#include <stdio.h>
#include <stdlib.h>
#include "../src/utils/datasiever.h"
#include "myhdbextractorimpl.h"
#include "hdbxsettings.h"
#include "../src/hdbextractor.h"
#include "../src/configurationparser.h"
#include "../src/utils/xvariantprinter.h"
#include <map>

using namespace std;

int main(int argc, char **argv)
{
    if(argc < 5)
    {
        printf("\e[1;31mUsage\e[0m \"%s configfile.dat domain/family/member/attribute \"2014-07-20 10:00:00\" \"2014-07-20 12:00:00\"\n",
               argv[0]);
        exit(EXIT_FAILURE);
    }
    else
    {
        const char* start_date = argv[argc - 2];
        const char* stop_date = argv[argc - 1];
        std::vector<std::string> sources;
        for(int i = 2; i < argc - 2; i++)
            sources.push_back(std::string(argv[i]));
        HdbXSettings *qc = new HdbXSettings();
        qc->loadFromFile(argv[1]);
        MyHdbExtractorImpl *hdbxi = new MyHdbExtractorImpl(qc->get("dbuser").c_str(),
            qc->get("dbpass").c_str(), qc->get("dbhost").c_str(), qc->get("dbname").c_str());
        hdbxi->getHdbExtractor()->setHdbXSettings(qc);
        hdbxi->getData(sources, start_date, stop_date);
        const std::vector<XVariant> & valuelist = hdbxi->getValuelistRef();
        DataSiever siever;
        siever.divide(valuelist);
        siever.fill();
        std::vector<std::string> srcs = siever.getSources();
        for(size_t i = 0; i < srcs.size(); i++)
        {
            std::vector<XVariant > values = siever.getData(srcs.at(i));
            XVariantPrinter().printValueList(values, 2);
        }
        printf("main.cpp: deleting hdbxsettings\n");
        delete qc;
        printf("main.cpp: deleting MyHdbExtractorImpl\n");
        delete hdbxi;
    }
    return 0;
}
#include "myhdbextractorimpl.h"
#include <stdio.h>
#include <string.h>
#include "../src/hdbextractor.h"
#include "../src/db/xvariant.h"

MyHdbExtractorImpl::MyHdbExtractorImpl(const char *dbuser, const char *dbpass,
                                       const char *dbhost, const char *dbnam)
{
    printf("\033[0;37mtrying to connect to host: \"%s\" db name: \"%s\" user: \"%s\"\033[0m\t", dbhost, dbnam, dbuser);
    mExtractor = new Hdbextractor(this);
    /* The schema is guessed from the database name; anything other than "hdbpp" is treated as old HDB. */
    Hdbextractor::DbType type = Hdbextractor::HDBMYSQL;
    if(strcmp(dbnam, "hdb") == 0)
        type = Hdbextractor::HDBMYSQL;
    else if(strcmp(dbnam, "hdbpp") == 0)
        type = Hdbextractor::HDBPPMYSQL;
    bool res = mExtractor->connect(type, dbhost, dbnam, dbuser, dbpass);
    if(res)
    {
        printf("\e[1;32mOK\e[0m\n");
        mExtractor->setUpdateProgressPercent(10);
    }
    else {
        printf("\e[1;31merror connecting to host: %s\e[0m\n", dbhost);
    }
}

MyHdbExtractorImpl::~MyHdbExtractorImpl()
{
    printf("DELETING mExtractor\n");
    delete mExtractor;
}

void MyHdbExtractorImpl::getData(std::vector<std::string> sources, const char* start_date, const char *stop_date)
{
    bool res = mExtractor->getData(sources, start_date, stop_date);
    if(!res)
    {
        for(size_t i = 0; i < sources.size(); i++)
            printf("\e[1;31merror fetching data: %s: %s\e[0m\n", sources[i].c_str(), mExtractor->getErrorMessage());
    }
}

void MyHdbExtractorImpl::onSourceExtractionFinished(const char *name, int totalRows, double elapsed)
{
    printf("\"%s\" data extraction completed in %.2fs [%d rows]\n", name, elapsed, totalRows);
}

/** \brief this method is invoked when data extraction is fully accomplished.
 *
 */
void MyHdbExtractorImpl::onExtractionFinished(int totalRows, double elapsed)
{
    printf("extraction completed: got %d rows in %fs\n", totalRows, elapsed);
    extractData();
}

/** \brief this method is invoked according to the percentage value configured in setUpdateProgressPercent.
 *
 * \note By default, if percentage is less than or equal to 0, onSourceProgressUpdate is not invoked and the results
 * are available when onExtractionFinished is invoked.
 *
 * @see onExtractionFinished
 */
void MyHdbExtractorImpl::onSourceProgressUpdate(const char *name , double percent)
{
    printf("\"%s\" data extraction: %.2f%%\n", name, percent);
    extractData();
}

void MyHdbExtractorImpl::extractData()
{
    mExtractor->get(d_valuelist);
}

const std::vector<XVariant> &MyHdbExtractorImpl::getValuelistRef() const
{
    return d_valuelist;
}
#ifndef MYHDBEXTRACTORIMPL_H
#define MYHDBEXTRACTORIMPL_H

#include "../src/hdbextractorlistener.h"

class Hdbextractor;

#include <string>
#include <vector>
#include <xvariant.h>

/** \brief an <em>example</em> of an implementation of the HdbExtractorListener
 *
 */
class MyHdbExtractorImpl : public HdbExtractorListener
{
public:
    MyHdbExtractorImpl(const char *dbuser, const char *dbpass,
                       const char *dbhost, const char *dbnam);
    virtual ~MyHdbExtractorImpl();

    void getData(std::vector<std::string> sources, const char* start_date, const char *stop_date);

    virtual void onSourceProgressUpdate(const char *name, double percent);
    virtual void onExtractionFinished(int totalRows, double elapsed);
    virtual void onSourceExtractionFinished(const char* name, int totalRows, double elapsed);

    Hdbextractor* getHdbExtractor() const { return mExtractor; }
    void extractData();
    const std::vector<XVariant> &getValuelistRef() const;

private:
    Hdbextractor *mExtractor;
    std::vector<XVariant> d_valuelist;
};

#endif // MYHDBEXTRACTORIMPL_H