Kirill Feoktistov kfeoktistoff

@kfeoktistoff
kfeoktistoff / shuffle_function.sql
Last active July 14, 2016 11:28
PL/pgSQL function that shuffles all values in a column. The table must have a unique 'id' field.
CREATE OR REPLACE FUNCTION shuffle(table_name TEXT, field_name TEXT) RETURNS void AS $$
DECLARE entry RECORD;
DECLARE entry_to_swap RECORD;
BEGIN
    FOR entry IN execute format('select * from %s', table_name) LOOP
        drop table if exists tmp_entry_to_swap;
        execute format('create temp table tmp_entry_to_swap as (select * from %s order by random() limit 1)', table_name);
        select * into entry_to_swap from tmp_entry_to_swap;
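The preview above is cut off, so the rest of the function is not shown here. As a rough illustration of the same goal (shuffling one column's values across rows that are keyed by a unique 'id'), here is a minimal Python sketch using the standard sqlite3 module. It is a different technique from the per-row temp-table swap the preview starts: it reads the column once, permutes it in memory, and writes it back. The function name `shuffle_column` and the `rng` parameter are hypothetical.

```python
import random
import sqlite3


def shuffle_column(conn, table_name, field_name, rng=random):
    """Permute the values of one column across all rows of a table.

    Assumptions: the table has a unique 'id' column, and table_name and
    field_name come from a trusted source (identifiers cannot be bound as
    SQL parameters, so they are interpolated into the statement text).
    """
    rows = conn.execute(
        f'SELECT id, "{field_name}" FROM "{table_name}" ORDER BY id'
    ).fetchall()
    ids = [row[0] for row in rows]
    values = [row[1] for row in rows]

    # Shuffle only the values; the ids keep their order, so each id
    # receives a value that previously belonged to some (random) row.
    rng.shuffle(values)

    conn.executemany(
        f'UPDATE "{table_name}" SET "{field_name}" = ? WHERE id = ?',
        zip(values, ids),
    )
    conn.commit()
```

Passing a seeded `random.Random` instance as `rng` makes the permutation reproducible, which can help when the shuffled data is used for testing.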
kfeoktistoff / blx_parser.py
Created February 24, 2016 16:33
Parses an SAP BO business layer (blx) file and saves folders, dimensions and measures to a CSV file. To get the blx file, add .zip to the end of the business layer file name and extract the blx file from the archive. The path of the extracted file should be used as inputFile.
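import xml.etree.ElementTree as ET
# ElementTree keys namespaced attributes as '{namespace-uri}name', never 'xsi:type'.
# Assumes the file binds the xsi prefix to the standard XML Schema instance URI.
XSI_TYPE = '{http://www.w3.org/2001/XMLSchema-instance}type'
inputFile = open('C:/tmp/input.blx', 'r')
outputFile = open('C:/tmp/output.csv', 'w+')
separator = ','
def isDir(elem):
    return elem.attrib.get(XSI_TYPE) == 'business:Folder'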
def isDimension(elem):
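A note on looking up xsi:type with ElementTree: the parser expands attribute prefixes, so a namespaced attribute is stored under a `{namespace-uri}name` key rather than under its prefixed spelling. The sketch below shows this on a made-up sample document; the element names, the `business:Dimension` value, and the helper `is_dir` are assumptions for illustration and are not taken from a real blx file.

```python
import xml.etree.ElementTree as ET

# Standard XML Schema instance namespace, conventionally bound to "xsi".
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
XSI_TYPE = "{" + XSI_NS + "}type"

# Hypothetical sample; a real blx file has a different structure.
SAMPLE = """<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <item xsi:type="business:Folder" name="Sales"/>
  <item xsi:type="business:Dimension" name="Region"/>
</root>"""


def is_dir(elem):
    """Return True if the element is typed as a folder."""
    # The attribute value is plain text, so its 'business:' prefix is kept.
    return elem.attrib.get(XSI_TYPE) == "business:Folder"


root = ET.fromstring(SAMPLE)
folders = [item.get("name") for item in root if is_dir(item)]

# The prefixed spelling is not a key in the parsed attribute dict.
prefixed_lookup = root[0].attrib.get("xsi:type")
```

Here `folders` holds the names of the folder items only, and `prefixed_lookup` is `None` because the key was expanded during parsing.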
kfeoktistoff / best.R
Created June 21, 2014 19:13
Programming assignment 3 for Coursera "R Programming" course by Johns Hopkins University
best <- function(state, outcome) {
  ## Read outcome data
  ## Check that state and outcome are valid
  ## Return hospital name in that state with lowest 30-day death
  ## rate
  source("sortHospitalsByOutcome.R")
  head(sortHospitalsByOutcome(state, outcome), 1)
}
kfeoktistoff / complete.R
Created June 18, 2014 20:22
The zip file contains 332 comma-separated-value (CSV) files containing pollution monitoring data for fine particulate matter (PM) air pollution at 332 locations in the United States. Each file contains data from a single monitor and the ID number for each monitor is contained in the file name. For example, data for monitor 200 is contained in th…
## Write a function that reads a directory full of files and reports the number of completely observed cases in each data file.
## The function should return a data frame where the first column is the name of the file and the second column is the number
## of complete cases. A prototype of this function follows
complete <- function(directory, id = 1:332) {
  ## 'directory' is a character vector of length 1 indicating
  ## the location of the CSV files
  ## 'id' is an integer vector indicating the monitor ID numbers
  ## to be used
  ## Return a data frame of the form:
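The prototype above is cut off before its body. To illustrate what "number of completely observed cases" means, here is a minimal Python sketch of the same counting step. It makes several assumptions that the preview does not confirm: files are named with a zero-padded three-digit monitor ID (for example `001.csv`), missing values are written as `NA` or left empty, and the result is a list of `(id, count)` pairs rather than a data frame.

```python
import csv
import os


def complete(directory, ids=range(1, 333)):
    """Count rows with no missing values in each monitor's CSV file.

    Assumed file naming: '<directory>/<id as 3 digits>.csv'.
    Assumed missing-value markers: 'NA' or an empty field.
    """
    results = []
    for monitor_id in ids:
        path = os.path.join(directory, f"{monitor_id:03d}.csv")
        with open(path, newline="") as handle:
            reader = csv.DictReader(handle)
            # A row is a complete case when every field holds a value.
            nobs = sum(
                1
                for row in reader
                if all(value not in (None, "", "NA") for value in row.values())
            )
        results.append((monitor_id, nobs))
    return results
```

A call such as `complete("specdata", [1, 2])` (the directory name is hypothetical) would return one pair per requested monitor, in the order the IDs were given.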
1
Take a look at the 'iris' dataset that comes with R. The data can be loaded with the code:
library(datasets)
data(iris)
A description of the dataset can be found by running
?iris
kfeoktistoff / rprog-quiz1.txt
Last active March 9, 2024 06:26
R Programming: Quiz1
1
The R language is a dialect of which of the following programming languages?
S
2
The definition of free software consists of four freedoms (freedoms 0 through 3). Which of the following is NOT one of the freedoms that are part of the definition?
The freedom to prevent users from using the software for undesirable purposes.
3
In R the following are all atomic data types EXCEPT
kfeoktistoff / LargeStringScanner.java
Last active August 29, 2015 14:02
Using java.util.Scanner with strings longer than 1024 characters
/**
java.util.Scanner is a great tool for parsing, but it has some disadvantages. One of them is
an unchangeable buffer of length 1024. This means that strings longer than 1024 characters are not
handled correctly: in fact, only the first 1024 symbols are scanned. Also, the java.util.Scanner class is final,
so overriding its methods is not available.
Here is a draft implementation of a solution in which the whole text is split into N parts at the lexeme nearest to
the n*1024th symbol, and each part is scanned separately. In this example, each found lexeme is replaced with its upper-case form
and enclosed in "<>".
*/
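The Java implementation itself is not shown in this preview. The splitting idea described in the comment can be sketched in Python as follows. This is a simplification under stated assumptions, not the gist's code: lexemes are taken to be separated by spaces, each cut is made at the last space at or before the chunk limit (rather than at the lexeme nearest to it), and a single lexeme longer than the chunk size is cut hard. The names `split_on_boundaries` and `mark_lexeme` are hypothetical.

```python
import re


def split_on_boundaries(text, chunk_size=1024):
    """Split text into parts of at most chunk_size characters.

    Cuts are placed on a space so that no space-separated lexeme is
    divided, unless a lexeme alone exceeds chunk_size.
    """
    parts = []
    start = 0
    while len(text) - start > chunk_size:
        limit = start + chunk_size
        # Last space in text[start:limit + 1]; -1 if there is none.
        cut = text.rfind(" ", start, limit + 1)
        if cut <= start:
            cut = limit  # no usable space: fall back to a hard cut
        parts.append(text[start:cut])
        start = cut
    parts.append(text[start:])
    return parts


def mark_lexeme(text, lexeme, chunk_size=1024):
    """Replace each whole-word occurrence of lexeme with '<LEXEME>'.

    Each part is processed separately, mirroring the part-by-part scan.
    """
    pattern = re.compile(r"\b" + re.escape(lexeme) + r"\b")
    replacement = "<" + lexeme.upper() + ">"
    return "".join(
        pattern.sub(lambda match: replacement, part)
        for part in split_on_boundaries(text, chunk_size)
    )
```

Joining the parts returned by `split_on_boundaries` reproduces the original text, so processing them one by one and concatenating the results gives the same output as processing the whole string, as long as no lexeme is hit by a hard cut.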