@Bloofer
Created January 8, 2018 06:32
Deckard config
#############################################################
# Configuration file for clone detection.
#
############################################################
# Parameters that commonly need to be changed:
# - FILE_PATTERN : selects the source files for the language being analyzed
# - SRC_DIR : the root directory containing the source files
# - DECKARD_DIR : the home directory of DECKARD
# - clone detection parameters: cf. DECKARD's paper
# -- MIN_TOKENS
# -- STRIDE
# -- SIMILARITY
#
# java, c, or php?
FILE_PATTERN='*.java' # used for the 'find' command
# where are the source files?
SRC_DIR='src'
# where is Deckard?
DECKARD_DIR='/home/yang/Deckard'
# clone detection parameters; refer to DECKARD's paper.
MIN_TOKENS='50 100' # can be a sequence of integers
STRIDE='2 0' # can be a sequence of integers
SIMILARITY='1.0 0.95' # can be a sequence of values <= 1
#DISTANCE='0 0.70711 1.58114 2.236'
###########################################################
# Where to store result files?
#
# where to output generated vectors?
VECTOR_DIR='vectors'
# where to output detected clone clusters?
CLUSTER_DIR='clusters'
# where to output timing/debugging info?
TIME_DIR='times'
##########################################################
# Where are the programs we need?
#
# where is the vector generator?
VGEN_EXEC="$DECKARD_DIR/src/main"
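# pick the vector generator that matches the source language
case "$FILE_PATTERN" in
    *.java )
        VGEN_EXEC="$VGEN_EXEC/jvecgen" ;;
    *.php )
        VGEN_EXEC="$VGEN_EXEC/phpvecgen" ;;
    *.c | *.h )
        VGEN_EXEC="$VGEN_EXEC/cvecgen" ;;
    * )
        # report the problem on stderr; the placeholder path below will not exist
        echo "Error: invalid FILE_PATTERN: $FILE_PATTERN" >&2
        VGEN_EXEC="$VGEN_EXEC/invalidvecgen" ;;
esac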
# how to divide the vectors into groups?
GROUPING_EXEC="$DECKARD_DIR/src/vgen/vgrouping/runvectorsort"
# where is the lsh for vector clustering?
CLUSTER_EXEC="$DECKARD_DIR/src/lsh/bin/enumBuckets"
# how to post process clone groups?
POSTPRO_EXEC="$DECKARD_DIR/scripts/clonedetect/post_process_groupfile"
# how to transform source code into HTML?
SRC2HTM_EXEC=source-highlight
SRC2HTM_OPTS=--line-number-ref
############################################################
# For parallel processing
#
# the maximal number of processes to be used (by xargs)
# - 0 means as many as possible (up to the limit xargs allows)
MAX_PROCS=8
##################################################################
# Some additional, internal parameters; can be ignored
#
# the maximal vector size for the first group; not really useful
GROUPING_S='50' # should be a single value
#GROUPING_D
#GROUPING_C
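##################################################################
# Example: how the sequence-valued parameters might be consumed
#
# MIN_TOKENS, STRIDE, and SIMILARITY may each hold a space-separated
# sequence. The sketch below is a hypothetical illustration, not
# DECKARD's own driver script: it assumes a caller treats each setting
# as a list and visits every combination. The function name
# list_combinations is made up for this example, and the values are
# copied from the settings above.

```shell
#!/bin/sh
# Values copied from the configuration above.
MIN_TOKENS='50 100'
STRIDE='2 0'
SIMILARITY='1.0 0.95'

# Hypothetical helper: print one line per parameter combination.
# The variables are left unquoted on purpose so that the shell
# splits each space-separated sequence into separate items.
list_combinations() {
    for t in $MIN_TOKENS; do
        for s in $STRIDE; do
            for sim in $SIMILARITY; do
                printf 'min_tokens=%s stride=%s similarity=%s\n' "$t" "$s" "$sim"
            done
        done
    done
}

list_combinations
```

# With the values above this visits 2 x 2 x 2 = 8 combinations, so
# each extra value in any sequence multiplies the number of runs.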