@trevorlinton
Last active August 29, 2015 14:17
Generate code statistics (Java/JSF/Scala)

Generating Pressure Stats

These statistics can be useful (alongside other static analysis tools) for seeing how much pressure a Java or Scala class is under. For JSF, the same approach measures how often each XHTML file is included. Each measurement is a simple bash script that prints JSON:

java_class_use.sh

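#!/bin/bash
# Counts how often each declared Java class name appears in the source tree.
if [ -z "$1" ]; then
  echo "usage: java_class_use.sh directory" >&2
  exit 1
fi

SEARCHDIR=$1
SEP=""

echo "{\"java_results\":{"
# Collect every public/private class name declared in a .java file (hidden paths are skipped).
for i in $(find "$SEARCHDIR" -not -path '*/\.*' -type f -name "*.java" -exec perl -ne 'while(/\s*(public|private)\s+class\s+(\w+)(\s+extends\s+\w+)?(\s+implements\s+\w+(\s*,\s*\w+)*)?\s*\{/g) { print "$2\n";}' {} \; | sort -u); do
  # Count whole-word occurrences across all .java files (the declaration itself included).
  RES=$(find "$SEARCHDIR" -not -path '*/\.*' -type f -name "*.java" -exec grep -ow "$i" {} \; | wc -l | tr -d ' ')
  # Print the separator before every entry except the first, so there is no trailing comma.
  printf '%s\t"%s":"%s"' "$SEP" "$i" "$RES"
  SEP=$',\n'
done
printf '\n}}\n'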

The output can then be collected and analyzed over time. These numbers are not good measures of QUALITY; they measure PRESSURE: the POTENTIAL number of dependencies on a class and how often it is used.
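As a rough sketch of what analyzing the output could look like, the snippet below ranks the entries of one generated file by count. The function name top_pressure and the default of 10 entries are assumptions for illustration, not part of the original scripts; it only relies on each entry sitting on its own line in the "Name":"count" form the scripts print.

```shell
# top_pressure FILE [N]: hypothetical helper, not part of the original scripts.
# Prints the N most referenced names (default 10) as "count name" lines.
top_pressure() {
  local file=$1 n=${2:-10}
  # Keep only lines shaped like "Name":"123" and rewrite them as "123 Name".
  sed -n 's/^[[:space:]]*"\([^"]*\)":"[[:space:]]*\([0-9][0-9]*\)".*$/\2 \1/p' "$file" |
    sort -k1,1nr -k2,2 |
    head -n "$n"
}
```

Running top_pressure java_class_use.json 20 on snapshots taken at different dates would give a simple way to see which classes are gaining or losing pressure.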

We can also do the same for Scala:

scala_class_use.sh

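#!/bin/bash
# Counts how often each declared Scala class name appears in the source tree.
if [ -z "$1" ]; then
  echo "usage: scala_class_use.sh directory" >&2
  exit 1
fi

SEARCHDIR=$1
SEP=""

echo "{\"scala_results\":{"
# Collect declared class names; Scala mixes in traits with "with", it has no "implements".
# Still naive: classes with constructor parameters or type parameters are not matched.
for i in $(find "$SEARCHDIR" -not -path '*/\.*' -type f -name "*.scala" -exec perl -ne 'while(/\bclass\s+(\w+)(\s+extends\s+\w+)?(\s+with\s+\w+)*\s*\{/g) { print "$1\n";}' {} \; | sort -u); do
  # Count whole-word occurrences across all .scala files (the declaration itself included).
  RES=$(find "$SEARCHDIR" -not -path '*/\.*' -type f -name "*.scala" -exec grep -ow "$i" {} \; | wc -l | tr -d ' ')
  # Print the separator before every entry except the first, so there is no trailing comma.
  printf '%s\t"%s":"%s"' "$SEP" "$i" "$RES"
  SEP=$',\n'
done
printf '\n}}\n'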

We can also turn our heads to JSF and Faces and see how often XHTML files are included.

xhtml_include_use.sh

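#!/bin/bash
# Counts how often each XHTML file name is mentioned in other XHTML files.
if [ -z "$1" ]; then
  echo "usage: xhtml_include_use.sh directory" >&2
  exit 1
fi

SEARCHDIR=$1
SEP=""

echo "{\"xhtml_results\":{"
# Collect the base name of every .xhtml file (hidden paths are skipped).
for i in $(find "$SEARCHDIR" -not -path '*/\.*' -type f -name "*.xhtml" -exec basename {} \; | sort -u); do
  # -F treats the file name literally, so the "." is not a regex wildcard.
  RES=$(find "$SEARCHDIR" -not -path '*/\.*' -type f -name "*.xhtml" -exec grep -oF -e "$i" {} \; | wc -l | tr -d ' ')
  # Print the separator before every entry except the first, so there is no trailing comma.
  printf '%s\t"%s":"%s"' "$SEP" "$i" "$RES"
  SEP=$',\n'
done
printf '\n}}\n'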

Generating Java Class Dependencies

We can also go beyond usage counts and list the actual class dependencies that exist (note: we're still using a naive regex approach; it's not perfect, but the only other option I can think of is a more sophisticated AST parser).

java_class_deps.sh
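#!/bin/bash
# Lists, for each declared Java class, the files that mention its name.
if [ -z "$1" ]; then
  echo "usage: java_class_deps.sh directory" >&2
  exit 1
fi

SEARCHDIR=$1
SEP=""

echo "{\"java_deps_results\":{"
for i in $(find "$SEARCHDIR" -not -path '*/\.*' -type f -name "*.java" -exec perl -ne 'while(/\s*(public|private)\s+class\s+(\w+)(\s+extends\s+\w+)?(\s+implements\s+\w+(\s*,\s*\w+)*)?\s*\{/g) { print "$2\n";}' {} \; | sort -u); do
  # Files containing the class name as a whole word (the declaring file included).
  RES=$(find "$SEARCHDIR" -not -path '*/\.*' -type f -name "*.java" -exec grep -lw "$i" {} \;)
  printf '%s\t"%s":{' "$SEP" "$i"
  SEP=$',\n'
  FSEP=$'\n'
  for n in $RES; do
    BN=$(basename "$n" .java)
    # Separators go before each entry so neither level ends with a trailing comma.
    printf '%s\t\t"%s":"%s"' "$FSEP" "$BN" "$n"
    FSEP=$',\n'
  done
  printf '\n\t}'
done
printf '\n}}\n'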

Making it fast

Unfortunately, as you might notice, this has an O(n^2) runtime: for every class definition found, every file is searched again for references. Rather than implementing a more sophisticated AST parser to cut the time down, we'll just go with BRUTE FORCE! We can do this by creating a ramdisk, cloning into it, and then searching there.
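One way to soften the quadratic cost without an AST parser, sketched here as an assumption rather than something the scripts above do, is to write the class names to a file once and let a single grep pass count all of them. The function name count_names_once and its arguments are hypothetical. Unlike the scripts above, this sketch does not skip hidden directories, and names with zero hits are simply absent from the output.

```shell
# count_names_once NAMES_FILE DIR: hypothetical single-pass counter.
# NAMES_FILE holds one class name per line; DIR is the tree to search.
count_names_once() {
  local names_file=$1 dir=$2
  # -F: names are literal strings, -w: whole words only, -o: one output line per hit,
  # -h: omit file names so identical hits can be grouped by uniq -c.
  grep -r -h -o -w -F -f "$names_file" --include='*.java' "$dir" |
    sort | uniq -c |
    awk '{ printf "\t\"%s\":\"%s\",\n", $2, $1 }'
}
```

Either way, the ramdisk approach speeds up whatever searching remains.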

This is OS X specific, but similar options exist for Linux (a tmpfs mount, for example) and Windows (ImDisk).
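On Linux, a rough equivalent that needs no root access is to work under /dev/shm, which most distributions mount as a RAM-backed tmpfs. This is an assumption about the host rather than something covered above; the function name make_scratch_dir is hypothetical, and it falls back to the regular temp directory when /dev/shm is missing or not writable.

```shell
# make_scratch_dir: hypothetical helper that prints the path of a new scratch directory.
make_scratch_dir() {
  local base=${TMPDIR:-/tmp}
  # Prefer the RAM-backed tmpfs if it exists and is writable.
  if [ -d /dev/shm ] && [ -w /dev/shm ]; then
    base=/dev/shm
  fi
  mktemp -d "$base/blastradius.XXXXXX"
}
```

The returned path would take the place of /Volumes/BlastRadius, and removing the directory afterwards releases the memory.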

In addition, we only need the tip of the master branch of each git repo, not the entire history and all branches. We can make this considerably faster by cloning (and thus searching) only the master branch.

makeitfast.sh
#!/bin/sh
# Create a 2GB ramdisk and format it as an HFS+ volume named BlastRadius.
RAMDEV=`hdiutil attach -nomount ram://4194304`
diskutil erasevolume HFS+ 'BlastRadius' $RAMDEV

# Shallow-clone only the tip of master.
git clone --depth 1 --branch master https://github.com/somename/somerepo /Volumes/BlastRadius/somerepo

# ... repeat for each repo ...

./java_class_deps.sh /Volumes/BlastRadius/ > java_class_deps.json
./java_class_use.sh /Volumes/BlastRadius/ > java_class_use.json
./scala_class_use.sh /Volumes/BlastRadius/ > scala_class_use.json
./xhtml_include_use.sh /Volumes/BlastRadius/ > xhtml_include_use.json

# Detach the device so the memory is released; unmounting alone leaves the ramdisk allocated.
hdiutil detach $RAMDEV

Note that this script requires manually adding the git repos you wish to analyze. It's also limited to 2GB; you can increase this by raising the ram://XXXXX value (it must be a multiple of 2048 to avoid page faults!).
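The ram:// number is a count of 512-byte sectors, so one megabyte is 2048 sectors and 4194304 corresponds to 2048 MB. A small helper, hypothetical and only for illustration, makes the arithmetic explicit:

```shell
# ram_sectors MB: print the ram:// sector count for a ramdisk of MB megabytes.
# Assumes 512-byte sectors, i.e. 2048 sectors per megabyte.
ram_sectors() {
  echo $(( $1 * 2048 ))
}
```

For a 4GB disk, for example, the attach line would become hdiutil attach -nomount ram://$(ram_sectors 4096).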
