chrispsn/k_file_analytics.md


      
    Analytics on k code

Mildly interesting results from string analysis of a handful of k files. If you have more public k files or suggestions for further analysis, comment below!
The snippets below were written for the latest Shakti at the time of writing.
Corpus:

kparc's Advent of Code answers from 2015
ngn's Advent of Code answers from 2015 and 2019
ngn's Project Euler answers

Verb char frequency

Unfortunately, this doesn't distinguish between the monadic, dyadic, and other uses of each character.
verbFreqs: |^#'= "~!@#$%^&*-_=+|,<.>?" #

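Results are pretty consistent across files: + , * and # are at or near the top. Sorts (< > ^) probably rank near the bottom because there are many ways to do them. ? is near the bottom too. % barely registers: it is absent from both of ngn's Advent of Code files and listed last in the other two. This probably reflects the nature of the input set (Advent of Code and Project Euler puzzles).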
/ ngn-2019-aoc.k 
,|130
+|120
*|112
#| 96
|| 78
!| 76
-| 71
@| 60
.| 46
&| 42
=| 41
$| 41
^| 22
~| 21
<| 21
>| 16
_| 15
?| 15

/ ngn-2015-aoc.k 
+|58
,|42
*|37
#|35
-|29
||28
&|25
=|22
_|19
.|17
@|16
?|16
$|16
!|16
~| 9
^| 9
<| 7
>| 4

/ kparc-2015-aoc.k 
+|97
,|63
*|55
#|40
&|39
_|35
=|34
-|34
||32
@|23
!|23
$|20
.|19
<|16
~|15
?|14
>|12
^| 7
%| 7

/ ngn-euler.k
+|313
*|253
!|196
#|190
-|188
,|180
||120
@|109
&|101
=| 95
_| 89
$| 55
.| 50
~| 46
>| 44
<| 42
^| 39
?| 30
%| 17
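
For readers without a Shakti build to hand, a similar tally can be approximated in Python. This is a minimal sketch rather than the code that produced the numbers above: `char_freqs` and `VERB_CHARS` are names invented for illustration, and the sketch assumes the analysis is a plain count of raw characters, so characters inside strings and comments are counted too.

```python
from collections import Counter

# Same character set as the k snippet (assumed to be a plain membership filter).
VERB_CHARS = "~!@#$%^&*-_=+|,<.>?"

def char_freqs(source: str, chars: str = VERB_CHARS) -> list[tuple[str, int]]:
    """Count occurrences of each character in `chars`, most frequent first.

    Characters that never occur in `source` are omitted from the result.
    """
    wanted = set(chars)
    counts = Counter(c for c in source if c in wanted)
    return counts.most_common()

if __name__ == "__main__":
    # Hypothetical one-line k program used only as sample input.
    sample = "f:{+/x*x}; g:{(#x),|/x}"
    for ch, n in char_freqs(sample):
        print(f"{ch}|{n}")
```

Passing a different `chars` string (for example `"/\\'"` or `"([{"`) gives the adverb and bracket tallies from the later sections. Ties may come out in a different order from the k output.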

Adverb char frequency

Doesn't yet distinguish between the c and c: forms of each adverb (for example, / versus /:).
adverbFreqs: |^#'= "/\\'" #

/ ngn-2019-aoc.k 
/|172
'|104
\| 78

/ ngn-2015-aoc.k 
/|85
'|56
\|38

/ kparc-2015-aoc.k 
/|115
'| 72
\| 27

/ ngn-euler.k
/|397
'|227
\|107
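
One naive way to separate the c and c: forms is to treat an adverb character followed directly by a colon as a single token. The Python sketch below does only that; `adverb_form_freqs` is a hypothetical helper, not part of the original analysis, and because it is purely textual it will also match adverb characters inside strings and comments.

```python
import re
from collections import Counter

# An adverb character, optionally followed by ":" (e.g. "/" versus "/:").
ADVERB = re.compile(r"[/\\']:?")

def adverb_form_freqs(source: str) -> Counter:
    """Tally adverb characters, keeping the "c" and "c:" forms as separate keys."""
    return Counter(m.group(0) for m in ADVERB.finditer(source))

if __name__ == "__main__":
    # Hypothetical sample input, not taken from the corpus.
    for form, n in adverb_form_freqs("a,'b -':c +/d e\\:f").most_common():
        print(f"{form}|{n}")
```

Summing the two forms of each character recovers the combined counts of the kind shown above.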

Bracket frequency: ( vs [ vs {


( for verbose lists or controlling evaluation order
[ for m-expression-style function calls, explicit lambda argument lists, or (in some dialects) dict or table literals
{ for lambdas

bracketCount: #'= "([{" #

These proportions will probably shift once Shakti is used more widely, since Shakti uses [ for dict and table literals.
/ ngn-2019-aoc.k 
(|151
[| 98
{| 77

/ ngn-2015-aoc.k
(|78
[|25
{|33

/ kparc-2015-aoc.k
(|89
[|46
{|52

/ ngn-euler.k
(|275
[|144
{|150

Top 25 most common char pairs

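Until Shakti gets the ability to catch (and therefore filter out) parse errors, this will generate a lot of noise. Even then, different dialects will have different parsing rules. Four-character keys such as 0a20 appear to be hex for pairs that contain a non-printable character (0a is a newline, 20 a space).
+/ ranks high in every file, as do other reductions such as |/ and ,/. *| makes the top 25 only in ngn's 2019 answers. 1_ was surprising! (Benford's Law?)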
pairFreqs: 25#|^#'=2'

/ ngn 2019
0;  |53
0a20|42
x;  |36
:{  |35
a;  |31
+/  |31
00  |30
*|  |29
a:  |27
:(  |26
10  |25
];  |24
0:  |24
    |24
\n  |23
!0  |23
0a61|23
x}  |22
-1  |22
{(  |21
[x  |21
;(  |21
$[  |21
}/  |20
,/  |20

/ ngn 2015
+/  |23
a:  |18
x)  |17
0:  |16
0a61|16
;(  |15
x;  |14
a)  |14
/(  |14
-1  |14
a;  |13
:"  |13
290a|13
 2  |13
}'  |12
|/  |12
1_  |12
1+  |12
"\  |12
0a20|12
{(  |10
x}  |10
\x  |10
:(  |10
2#  |10

/ kparc 2015
+/  |42
x)  |33
x:  |31
:{  |28
`p  |27
0a0a|26
 1  |24
0   |22
x}  |21
),  |21
_x  |19
00  |19
x;  |18
};  |16
|/  |16
1_  |16
}'  |15
x   |15
10  |15
/x  |15
;x  |14
:x  |14
1;  |14
1+  |14
0:  |14

/ ngn euler
    |643
00  |261
10  |139
0a50|101
P   |100
 1  | 92
0a20| 88
+/  | 86
1+  | 81
-1  | 70
1;  | 67
x}  | 65
 2  | 62
x;  | 58
0;  | 57
:1  | 56
 /  | 55
0   | 51
s   | 49
,/  | 49
:{  | 47
300a| 46
x)  | 45
 3  | 45
pr  | 43
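
The pair count can be sketched in Python in the same spirit. This assumes the pairs are overlapping two-character windows over the raw text, which is a guess about what the k snippet does, and it renders pairs containing a non-printable character as hex to mimic keys such as 0a20. `pair_freqs` and `show` are illustrative names only.

```python
from collections import Counter

def pair_freqs(source: str, top: int = 25) -> list[tuple[str, int]]:
    """Count overlapping two-character windows, most frequent first."""
    pairs = (source[i:i + 2] for i in range(len(source) - 1))
    return Counter(pairs).most_common(top)

def show(pair: str) -> str:
    """Render a pair for display, as hex if it contains a non-printable character."""
    if pair.isprintable():
        return pair
    return pair.encode("utf-8").hex()

if __name__ == "__main__":
    # Hypothetical sample input, not taken from the corpus.
    sample = "a:+/x\n b:+/y\n"
    for pair, n in pair_freqs(sample, top=5):
        print(f"{show(pair):4}|{n}")
```

As with the k version, nothing here parses the code, so pairs that straddle token boundaries, strings, or comments are counted like any other.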