primaryobjects/1-readme.md

## 1-readme.md

      
### Programming By Example

The following program is a basic proof-of-concept implementation of Programming by Example, the program synthesis technique used by Flash Fill in Microsoft Excel.

#### The Data Set

The data-set "features.csv" consists of extracted features from input/output examples, as a user would provide prior to beginning program synthesis. For example, to produce a program that extracts the first character of every input string, the user might give examples of strings as input with lengths of 1, 2, or 5 characters and an output example of a single character. We can guess the most likely program to select as a solution might be "firstCharacter". Similarly, if the input consists of numbers, we can guess the most likely program to select as a solution might be "addition".
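
As a rough illustration of how one row of "features.csv" might be derived from a single input/output example, here is a minimal sketch in base R. The helper `extract_features` is hypothetical and not part of the original gist, and it assumes the length columns measure the longest value in characters, which is only one plausible reading of the data.

```r
# Hypothetical helper (not in the original gist): build one features.csv-style
# row from a single input/output example.
extract_features <- function(inputs, output, program = NA_character_) {
  # Label a vector "numeric" or "character", matching the CSV's type columns.
  type_of <- function(x) if (is.numeric(x)) "numeric" else "character"
  data.frame(
    inputType    = type_of(inputs),
    numInputs    = length(inputs),
    # Assumption: length is the longest value, measured in characters.
    inputLength  = max(nchar(as.character(inputs))),
    outputType   = type_of(output),
    numOutputs   = length(output),
    outputLength = max(nchar(as.character(output))),
    program      = program,
    stringsAsFactors = FALSE
  )
}

# Two character inputs of length 2, joined into one output of length 4.
row <- extract_features(c("ab", "cd"), "abcd", program = "concat")
print(row)
```

Under these assumptions the example reproduces the `character, 2, 2, character, 1, 4, concat` row of the data-set.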

#### Large Database of Programs

In a true Programming by Example system, the data-set would consist of a massive number of stored programs and their corresponding features. However, for this proof-of-concept, we include a very limited set. This example also omits the creation of the feature set, which would involve logical reasoning over the input/output examples, prior to performing neural-guided heuristics for selecting a solution. That is, knowing to extract the first character versus the second character, or some other substring, requires much more feature transformation than is provided in this simple example.
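
To make the idea of stored programs concrete, the sketch below pairs each label in the data-set with an R function and runs whichever one is selected. The `programs` list, the `run_program` helper, and the function bodies are all assumptions made for illustration; the original gist only stores the labels.

```r
# Hypothetical program store: each label from features.csv maps to a function.
# The bodies are illustrative guesses at what each label means.
programs <- list(
  firstCharacter = function(inputs) substr(inputs[1], 1, 1),
  concat         = function(inputs) paste(inputs, collapse = ""),
  addition       = function(inputs) sum(inputs)
)

# Look up the selected program by label and run it on the user's inputs.
run_program <- function(label, inputs) {
  if (!label %in% names(programs)) stop("Unknown program: ", label)
  programs[[label]](inputs)
}

print(run_program("firstCharacter", "hello"))
print(run_program("concat", c("ab", "cd")))
print(run_program("addition", c(3, 4)))
```

A real system would hold far more entries than this, but the lookup step would stay the same: the model predicts a label, and the label selects the code to execute.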

  
## features.csv

          
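| inputType | numInputs | inputLength | outputType | numOutputs | outputLength | program |
|-----------|-----------|-------------|------------|------------|--------------|---------|
| character | 1 | 1 | character | 1 | 1 | firstCharacter |
| character | 1 | 2 | character | 1 | 1 | firstCharacter |
| character | 1 | 5 | character | 1 | 1 | firstCharacter |
| character | 2 | 1 | character | 1 | 2 | concat |
| character | 2 | 2 | character | 1 | 4 | concat |
| character | 2 | 5 | character | 1 | 10 | concat |
| numeric | 2 | 1 | numeric | 1 | 1 | addition |
| numeric | 3 | 1 | numeric | 1 | 1 | addition |
| character | 1 | 1 | character | 1 | 1 | firstCharacter |
| numeric | 2 | 1 | numeric | 1 | 1 | addition |
| character | 1 | 15 | character | 1 | 1 | firstCharacter |
| character | 4 | 3 | character | 1 | 12 | concat |
| numeric | 8 | 8 | numeric | 1 | 1 | addition |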

## output.txt
# weights:  24 (14 variable)

initial  value 8.788898
iter  10 value 0.054007
final  value 0.000072
converged

[1] "All predictions correct!"

results          addition concat firstCharacter
  addition              2      0              0
  concat                0      1              0
  firstCharacter        0      0              2

> results
[1] firstCharacter addition       firstCharacter concat         addition

## programmingByExample.R
# Very basic Programming by Example implementation with a machine learning model based on input/output features.
library(nnet)

# Load a data-set of features based on input/output characteristics.
df <- read.csv('features.csv')

# Hold out rows 9 onward as the test set; the first 8 rows train the model.
test <- df[9:nrow(df),]

# Multinomial logistic regression over all feature columns.
fit <- multinom(program ~ ., data = df[1:8,])

# Predict the solution program for each input/output set.
results <- predict(fit, newdata=test)

# Confirm results.
print(ifelse(all(results == test$program), 'All predictions correct!', 'Some predictions failed.'))
print(table(results, test$program))
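
The script above reports only the predicted class. As a possible extension, the sketch below asks the same kind of model for class probabilities on a single unseen example, using the `type = "probs"` option of `predict` for `multinom` fits. The training rows are copied inline from the first eight rows of features.csv so the block runs on its own; the `new_example` values are made up for illustration.

```r
library(nnet)

# Training rows copied from the first eight rows of features.csv.
train <- data.frame(
  inputType    = c(rep("character", 6), rep("numeric", 2)),
  numInputs    = c(1, 1, 1, 2, 2, 2, 2, 3),
  inputLength  = c(1, 2, 5, 1, 2, 5, 1, 1),
  outputType   = c(rep("character", 6), rep("numeric", 2)),
  numOutputs   = rep(1, 8),
  outputLength = c(1, 1, 1, 2, 4, 10, 1, 1),
  program      = factor(c(rep("firstCharacter", 3), rep("concat", 3),
                          rep("addition", 2)))
)

# Same model as the script above; trace = FALSE silences the iteration log.
fit <- multinom(program ~ ., data = train, trace = FALSE)

# Made-up example: two character inputs of length 3, one output of length 6.
new_example <- data.frame(inputType = "character", numInputs = 2,
                          inputLength = 3, outputType = "character",
                          numOutputs = 1, outputLength = 6)

# One probability per candidate program, then the single most likely label.
probs <- predict(fit, newdata = new_example, type = "probs")
print(round(probs, 3))
print(predict(fit, newdata = new_example))
```

Inspecting the probabilities rather than only the top label shows how confident the model is, which matters with a training set this small.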