Skip to content

Instantly share code, notes, and snippets.

@jackrusher
Last active February 22, 2016 02:30
Show Gist options
  • Save jackrusher/57525bef17382c9a7207 to your computer and use it in GitHub Desktop.
Save jackrusher/57525bef17382c9a7207 to your computer and use it in GitHub Desktop.
;; This is the sort of thing that makes me think you'd enjoy using one
;; of the better Common Lisp implementations for certain kinds of
;; work. I'm using the latest version of SBCL here.
;; Ok, contrived example, but easy to imagine the code that it should
;; generate...
(defun add (x y)
(+ x y))
(disassemble #'add)
; disassembly for ADD
; Size: 41 bytes. Origin: #x1004CB7393
; 93: 498B4C2460 MOV RCX, [R12+96] ; thread.binding-stack-pointer
; no-arg-parsing entry point
; 98: 48894DF8 MOV [RBP-8], RCX
; 9C: 488BD6 MOV RDX, RSI
; 9F: 488BFB MOV RDI, RBX
; A2: 41BBD0010020 MOV R11D, 536871376 ; GENERIC-+
; A8: 41FFD3 CALL R11
; AB: 488B5DE8 MOV RBX, [RBP-24]
; AF: 488B75F0 MOV RSI, [RBP-16]
; B3: 488BE5 MOV RSP, RBP
; B6: F8 CLC
; B7: 5D POP RBP
; B8: C3 RET
; B9: 0F0B10 BREAK 16 ; Invalid argument count trap
;; ... not terrible, especially for a runtime that provides arbitrary
;; precision arithmetic and traps errors in a civilized way, but with
;; safety and generic #'+ that figures out arity and handles different
;; types we pay a price in performance relative to the asm we'd write.
;; There are a number of different ways to speed this up: declaring a
;; greater speed to safety preference, annotating types. Here's an
;; optimized version with that sort of ugly cruft added to get really
;; good performance specifically for 32-bit integers.
(defun add (x y)
(declare
(type (unsigned-byte 32) x y)
(optimize (speed 3) (safety 0)))
(the (unsigned-byte 32) (+ x y)))
(disassemble #'add)
; disassembly for ADD
; Size: 12 bytes. Origin: #x1004CEFD82
; 2: 4801F9 ADD RCX, RDI ; no-arg-parsing entry point
; 5: 488BD1 MOV RDX, RCX
; 8: 488BE5 MOV RSP, RBP
; B: F8 CLC
; C: 5D POP RBP
; D: C3 RET
;; ... I can definitely live with that.
;; We can write macros that produce code with these annotation to
;; compile our own high-level DSLs into the assembly we would have
;; written. Common Lisp has its warts, but it really is amazing when
;; it comes building whatever abstractions you prefer while retaining
;; control over low-level code generation.
;; I'm not sure what Common Lisp you're using, ot why you'd want to
;; use a counter to repeatedly poke some data into a value cell, but
;; but with SBCL's compiler I get:
(defun test ()
(let ((x 0))
(declare
(optimize (speed 3) (safety 0)))
(type (unsigned-byte 32) x)
(loop for i to 999999999 do (incf x))))
(disassemble #'test)
; disassembly for TEST
; Size: 58 bytes. Origin: #x1004AA1614
; 14: 498B442460 MOV RAX, [R12+96] ; thread.binding-stack-pointer
; no-arg-parsing entry point
; 19: 488945F8 MOV [RBP-8], RAX
; 1D: 31D2 XOR EDX, EDX
; 1F: 31C0 XOR EAX, EAX
; 21: EB15 JMP L1
; 23: 660F1F840000000000 NOP
; 2C: 0F1F4000 NOP
; 30: L0: 4883C202 ADD RDX, 2
; 34: 4883C002 ADD RAX, 2
; 38: L1: 483DFE933577 CMP RAX, 1999999998
; 3E: 7EF0 JLE L0
; 40: BA17001020 MOV EDX, 537919511
; 45: 488BE5 MOV RSP, RBP
; 48: F8 CLC
; 49: 5D POP RBP
; 4A: C3 RET
; 4B: 0F0B10 BREAK 16 ; Invalid argument count trap
;; ... which runs in 0.000586 seconds of system time at 99.67% CPU utilization:
;; 811,693,353 processor cycles
;; 0 bytes consed
;; ... note that this last bit means no memory was allocated during the function call and loop.
@Ismael-VC
Copy link

@jackrusher how about Julia with some LispSyntax (work in progress)? 😄

Note: Julia uses machine integer arithmetic by default with no automatic promotion to arbitrary precision numbers, but that may change in the future:

ismaelvc@richit ~ % julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _' |  |
  | | |_| | | | (_| |  |  Version 0.4.3 (2016-01-12 21:37 UTC)
 _/ |\__'_|_|_|\__'_|  |  
|__/                   |  x86_64-unknown-linux-gnu

julia> using LispREPL    # then type ) the prompt will change in place to lisp>

lisp> (@code_native (+ 1 1))    ; (Int64, Int64)
        .text
Filename: int.jl
Source line: 8
        pushq   %rbp
        movq    %rsp, %rbp
Source line: 8
        addq    %rsi, %rdi
        movq    %rdi, %rax
        popq    %rbp
        ret

lisp> (@code_llvm (+ 1 1))

define i64 @"julia_+_21570"(i64, i64) {
top:
  %2 = add i64 %1, %0
  ret i64 %2
}

lisp> (@code_native (+ 1 1d))    ; (Int64, Float64)
        .text
Filename: promotion.jl
Source line: 167
        pushq   %rbp
        movq    %rsp, %rbp
Source line: 167
        cvtsi2sdq       %rdi, %xmm1
        addsd   %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        popq    %rbp
        ret

lisp> (@code_native (+ 1d 1d))    ; (Float64, Float64)
        .text
Filename: float.jl
Source line: 208
        pushq   %rbp
        movq    %rsp, %rbp
Source line: 208
        addsd   %xmm1, %xmm0
        popq    %rbp
        ret

lisp> (@code_native (+ 1d (big 1d)))    ; (Float64, BigFloat)
        .text
Filename: mpfr.jl
Source line: 235
        pushq   %rbp
        movq    %rsp, %rbp
Source line: 235
        movabsq $139793251432704, %rax  # imm = 0x7F242717F900
        callq   *%rax
        popq    %rbp
        ret

lisp> (@code_native (+ 1 (big 1)))    ; (Int64, BigInt)
        .text
Filename: gmp.jl
Source line: 321
        pushq   %rbp
        movq    %rsp, %rbp
        movq    %rsi, %rax
        movq    %rdi, %rsi
        testq   %rsi, %rsi
        js      L34
Source line: 321
        movabsq $139793251432752, %rcx  # imm = 0x7F242717F930
        jmpq    L47
L34:    negq    %rsi
        movabsq $139793251432768, %rcx  # imm = 0x7F242717F940
L47:    movq    %rax, %rdi
        callq   *%rcx
        popq    %rbp
        ret



julia>

@Ismael-VC
Copy link

Julia

(using the LispREPL)

lisp> (defn test []
        (do
          (def x 0)
          (for [i (range 1 1000000000)]
            (def x (+ x 1)))
          (@show x)))
test (generic function with 1 method)

lisp> (@time (test))    ; JIT warmup
x = 1000000000
  0.008133 seconds (2.02 k allocations: 100.414 KB)
1000000000

lisp> (@time (test))
x = 1000000000
  0.000078 seconds (25 allocations: 976 bytes)
1000000000

lisp> (@code_llvm (test))

define i64 @julia_test_21611() {
top:
  %0 = alloca [6 x %jl_value_t*], align 8
  %.sub = getelementptr inbounds [6 x %jl_value_t*]* %0, i64 0, i64 0
  %1 = getelementptr [6 x %jl_value_t*]* %0, i64 0, i64 2
  %2 = getelementptr [6 x %jl_value_t*]* %0, i64 0, i64 4
  store %jl_value_t* inttoptr (i64 8 to %jl_value_t*), %jl_value_t** %.sub, align 8
  %3 = getelementptr [6 x %jl_value_t*]* %0, i64 0, i64 1
  %4 = load %jl_value_t*** @jl_pgcstack, align 8
  %.c = bitcast %jl_value_t** %4 to %jl_value_t*
  store %jl_value_t* %.c, %jl_value_t** %3, align 8
  store %jl_value_t** %.sub, %jl_value_t*** @jl_pgcstack, align 8
  store %jl_value_t* null, %jl_value_t** %1, align 8
  %5 = getelementptr [6 x %jl_value_t*]* %0, i64 0, i64 3
  store %jl_value_t* null, %jl_value_t** %5, align 8
  store %jl_value_t* null, %jl_value_t** %2, align 8
  %6 = getelementptr [6 x %jl_value_t*]* %0, i64 0, i64 5
  store %jl_value_t* null, %jl_value_t** %6, align 8
  %7 = load %jl_value_t** inttoptr (i64 140018150460440 to %jl_value_t**), align 8
  store %jl_value_t* %7, %jl_value_t** %2, align 8
  %8 = call %jl_value_t* @julia_getindex1788(%jl_value_t* inttoptr (i64 140018154010064 to %jl_value_t*), %jl_value_t** %2, i32 1)
  store %jl_value_t* %8, %jl_value_t** %1, align 8
  %9 = getelementptr inbounds %jl_value_t* %8, i64 1
  %10 = bitcast %jl_value_t* %9 to i64*
  %11 = load i64* %10, align 8
  %12 = call %jl_value_t* @jl_gc_allocobj(i64 48)
  %13 = getelementptr inbounds %jl_value_t* %12, i64 -1, i32 0
  store %jl_value_t* inttoptr (i64 140018134741360 to %jl_value_t*), %jl_value_t** %13, align 8
  %14 = getelementptr inbounds %jl_value_t* %12, i64 0, i32 0
  store %jl_value_t* %8, %jl_value_t** %14, align 8
  store %jl_value_t* %12, %jl_value_t** %2, align 8
  %15 = load %jl_value_t** @jl_true, align 8
  %16 = getelementptr %jl_value_t* %12, i64 1
  %17 = bitcast %jl_value_t* %16 to i8*
  %18 = bitcast %jl_value_t* %15 to i8*
  %19 = load i8* %18, align 16
  store i8 %19, i8* %17, align 8
  %20 = load %jl_value_t** @jl_true, align 8
  %21 = bitcast %jl_value_t* %12 to i8*
  %22 = getelementptr i8* %21, i64 9
  %23 = bitcast %jl_value_t* %20 to i8*
  %24 = load i8* %23, align 16
  store i8 %24, i8* %22, align 1
  %25 = load %jl_value_t** @jl_true, align 8
  %26 = getelementptr i8* %21, i64 10
  %27 = bitcast %jl_value_t* %25 to i8*
  %28 = load i8* %27, align 16
  store i8 %28, i8* %26, align 2
  %29 = load %jl_value_t** @jl_false, align 8
  %30 = getelementptr i8* %21, i64 11
  %31 = bitcast %jl_value_t* %29 to i8*
  %32 = load i8* %31, align 16
  store i8 %32, i8* %30, align 1
  %33 = getelementptr inbounds %jl_value_t* %12, i64 2, i32 0
  %.c4 = inttoptr i64 %11 to %jl_value_t*
  store %jl_value_t* %.c4, %jl_value_t** %33, align 16
  %34 = load i64* inttoptr (i64 140018177782208 to i64*), align 64
  %35 = getelementptr inbounds %jl_value_t* %12, i64 3, i32 0
  %.c5 = inttoptr i64 %34 to %jl_value_t*
  store %jl_value_t* %.c5, %jl_value_t** %35, align 8
  %36 = load i64* inttoptr (i64 140018132803712 to i64*), align 128
  %37 = getelementptr inbounds %jl_value_t* %12, i64 4, i32 0
  %.c6 = inttoptr i64 %36 to %jl_value_t*
  store %jl_value_t* %.c6, %jl_value_t** %37, align 16
  %38 = load i64* inttoptr (i64 140018132803616 to i64*), align 32
  %39 = getelementptr inbounds %jl_value_t* %12, i64 5, i32 0
  %.c7 = inttoptr i64 %38 to %jl_value_t*
  store %jl_value_t* %.c7, %jl_value_t** %39, align 8
  store %jl_value_t* %12, %jl_value_t** %5, align 8
  %40 = call %jl_value_t* @julia_showall4784(%jl_value_t* %12, i64 1000000000)
  store %jl_value_t* inttoptr (i64 140018208272512 to %jl_value_t*), %jl_value_t** %2, align 8
  store %jl_value_t* %12, %jl_value_t** %6, align 8
  %41 = call %jl_value_t* @julia_takebuf_string(%jl_value_t* inttoptr (i64 140018152780912 to %jl_value_t*), %jl_value_t** %6, i32 1)
  store %jl_value_t* %41, %jl_value_t** %6, align 8
  %42 = call %jl_value_t* @jl_apply_generic(%jl_value_t* inttoptr (i64 140018154603088 to %jl_value_t*), %jl_value_t** %2, i32 2)
  %43 = load %jl_value_t** %3, align 8
  %44 = getelementptr inbounds %jl_value_t* %43, i64 0, i32 0
  store %jl_value_t** %44, %jl_value_t*** @jl_pgcstack, align 8
  ret i64 1000000000
}

lisp> (@code_native (test))
        .text
Filename: show.jl
Source line: 127
        pushq   %rbp
        movq    %rsp, %rbp
Source line: 127
        pushq   %r15
        pushq   %r14
        pushq   %r13
        pushq   %r12
        pushq   %rbx
        subq    $56, %rsp
        movq    $8, -88(%rbp)
        movabsq $jl_pgcstack, %r12
        movq    (%r12), %rax
        movq    %rax, -80(%rbp)
        leaq    -88(%rbp), %rax
        movq    %rax, (%r12)
        xorps   %xmm0, %xmm0
        movups  %xmm0, -72(%rbp)
        movups  %xmm0, -56(%rbp)
        movabsq $140018150460440, %rax  # imm = 0x7F58841ECC18
        movq    (%rax), %rcx
Source line: 0
        leaq    -56(%rbp), %r14
Source line: 127
        movabsq $140026750891312, %rax  # imm = 0x7F5A84BEF530
        movq    %rcx, -56(%rbp)
        movabsq $140018154010064, %rdi  # imm = 0x7F588454F5D0
        movq    %r14, %rsi
        movl    $1, %edx
        callq   *%rax
        movq    %rax, %r15
        movabsq $jl_gc_allocobj, %rax
        movq    %r15, -72(%rbp)
        movq    8(%r15), %r13
        movl    $48, %edi
        callq   *%rax
        movq    %rax, %rbx
        movabsq $jl_true, %rax
        movq    (%rax), %rdx
        movabsq $140018134741360, %rcx  # imm = 0x7F58832EF170
        movabsq $jl_false, %rax
        movq    (%rax), %rsi
        movabsq $140026752408992, %r8   # imm = 0x7F5A84D61DA0
        movq    %rcx, -8(%rbx)
        movabsq $140018132803616, %rcx  # imm = 0x7F5883116020
        movq    %r15, (%rbx)
        movq    %rbx, -56(%rbp)
        movb    (%rdx), %al
        movb    %al, 8(%rbx)
        movb    (%rdx), %al
        movb    %al, 9(%rbx)
        movb    (%rdx), %al
        movabsq $140018132803712, %rdx  # imm = 0x7F5883116080
        movb    %al, 10(%rbx)
        movb    (%rsi), %al
        movabsq $140018177782208, %rsi  # imm = 0x7F5885BFB1C0
        movb    %al, 11(%rbx)
        movq    %r13, 16(%rbx)
        movq    (%rsi), %rax
        movq    %rax, 24(%rbx)
        movq    (%rdx), %rax
        movq    %rax, 32(%rbx)
        movq    (%rcx), %rax
        movq    %rax, 40(%rbx)
        movq    %rbx, -64(%rbp)
        movq    %rbx, %rdi
        movl    $1000000000, %esi       # imm = 0x3B9ACA00
        callq   *%r8
        leaq    -48(%rbp), %rsi
        movabsq $140026751044992, %rax  # imm = 0x7F5A84C14D80
        movabsq $140018208272512, %rcx  # imm = 0x7F588790F080
        movq    %rcx, -56(%rbp)
        movq    %rbx, -48(%rbp)
        movabsq $140018152780912, %rdi  # imm = 0x7F5884423470
        movl    $1, %edx
        callq   *%rax
        movabsq $jl_apply_generic, %rcx
        movq    %rax, -48(%rbp)
        movabsq $140018154603088, %rdi  # imm = 0x7F58845E0250
        movq    %r14, %rsi
        movl    $2, %edx
        callq   *%rcx
        movq    -80(%rbp), %rax
        movq    %rax, (%r12)
        movl    $1000000000, %eax       # imm = 0x3B9ACA00
        addq    $56, %rsp
        popq    %rbx
        popq    %r12
        popq    %r13
        popq    %r14
        popq    %r15
        popq    %rbp
        ret

Common Lisp:

How would you do this efficiently?

[1]> (defun test ()
       (setq x 0)
         (loop for i to 999999999
           do (setq x (+ x 1)))
         (print x)))
TEST
[2]> (disassemble #'test)
WARNING: in TEST : X is neither declared nor bound,
         it will be treated as if it were declared SPECIAL.
WARNING: in TEST : X is neither declared nor bound,
         it will be treated as if it were declared SPECIAL.
WARNING: in TEST : X is neither declared nor bound,
         it will be treated as if it were declared SPECIAL.
WARNING: in TEST : X is neither declared nor bound,
         it will be treated as if it were declared SPECIAL.

Disassembly of function TEST
(CONST 0) = 0
(CONST 1) = X
(CONST 2) = 999999999
0 required arguments
0 optional arguments
Sin parámetro de resto
Sin parámetros clave
reads special variable: X
writes special variable: X
18 byte-code instructions:
0     (CONST 0)                           ; 0
1     (SETVALUE 1)                        ; X
3     (PUSH)
4     (JMP L14)
6     L6
6     (GETVALUE&PUSH 1)                   ; X
8     (CALLS2 177)                        ; 1+
10    (SETVALUE 1)                        ; X
12    (LOAD&INC&STORE 0)
14    L14
14    (LOAD&PUSH 0)
15    (CONST&PUSH 2)                      ; 999999999
16    (CALLSR&JMPIFNOT 1 50 L6)           ; >
20    (SKIP 1)
22    (GETVALUE&PUSH 1)                   ; X
24    (PUSH-UNBOUND 1)
26    (CALLS1 142)                        ; PRINT
28    (SKIP&RET 1)
NIL
[3]> (time (test))    ; had to put it to "sleep" after 10 minutes :(

What am I doing wrong?

Edit:

I read more about it and used let:

[5]> (defun test ()
       (let ((x 0))
         (loop for i to 999999999
           do (setf x (+ x 1)))
         (print x)))
TEST
[6]> (disassemble #'test)

Disassembly of function TEST
(CONST 0) = 0
(CONST 1) = 999999999
0 required arguments
0 optional arguments
Sin parámetro de resto
Sin parámetros clave
15 byte-code instructions:
0     (CONST&PUSH 0)                      ; 0
1     (CONST&PUSH 0)                      ; 0
2     (JMP L8)
4     L4
4     (LOAD&INC&STORE 1)
6     (LOAD&INC&STORE 0)
8     L8
8     (LOAD&PUSH 0)
9     (CONST&PUSH 1)                      ; 999999999
10    (CALLSR&JMPIFNOT 1 50 L4)           ; >
14    (SKIP 1)
16    (LOAD&PUSH 0)
17    (PUSH-UNBOUND 1)
19    (CALLS1 142)                        ; PRINT
21    (SKIP&RET 2)
NIL
[7]> (time (test))
1000000000
Real time: 766.20514 sec.
Run time: 766.13995 sec.
Space: 440 Bytes
1000000000

Python

(I tried to use the Hy lang) but I didn't understand how to use loop and recur even for something so trivial, well I have to admit I am hungry and sleepy)

In [66]: def test():
   ....:     x = 0
   ....:     for i in range(1000000000):
   ....:         x += 1
   ....:     print(x)
   ....:

In [67]: %timeit -n 1 -r 1 test()
1000000000
1 loop, best of 1: 1min 7s per loop

@Ismael-VC
Copy link

@jackrusher it doesn't work for me in neither lisp implementation:

Clisp

Break 1 [3]> (defun test ()
  (let ((x 0))
    (declare
      (optimize (speed 3) (safety 0)))
      (type (unsigned-byte 32) x)
    (loop for i to 999999999 do (incf x))))
TEST

Break 1 [3]> (time (test))
*** - EVAL: undefined function TYPE
The following restarts are available:
USE-VALUE      :R1      Input a value to be used instead of (FDEFINITION 'TYPE).
RETRY          :R2      Retry
STORE-VALUE    :R3      Input a new value for (FDEFINITION 'TYPE).
ABORT          :R4      Abort debug loop

Break 2 [4]> (disassemble #'test)
WARNING: in TEST : Function TYPE is not defined
WARNING: in TEST : Function UNSIGNED-BYTE is not defined

Disassembly of function TEST
(CONST 0) = 0
(CONST 1) = 32
(CONST 2) = UNSIGNED-BYTE
(CONST 3) = TYPE
(CONST 4) = 999999999
0 required arguments
0 optional arguments
No rest parameter
No keyword parameters
16 byte-code instructions:
0     (CONST&PUSH 0)                      ; 0
1     (CONST&PUSH 1)                      ; 32
2     (CALL1&PUSH 2)                      ; UNSIGNED-BYTE
4     (LOAD&PUSH 1)
5     (CALL2 3)                           ; TYPE
7     (CONST&PUSH 0)                      ; 0
8     (JMP L14)
10    L10
10    (LOAD&INC&STORE 1)
12    (LOAD&INC&STORE 0)
14    L14
14    (LOAD&PUSH 0)
15    (CONST&PUSH 4)                      ; 999999999
16    (CALLSR&JMPIFNOT 1 50 L10)          ; >
20    (NIL)
21    (SKIP&RET

CBSL

* (defun test () 
  (let ((x 0))
    (declare
      (optimize (speed 3) (safety 0)))
      (type (unsigned-byte 32) x)
    (loop for i to 999999999 do (incf x))))
; in: DEFUN TEST
;     (TYPE (UNSIGNED-BYTE 32) X)
; 
; caught WARNING:
;   The function TYPE is undefined, and its name is reserved by ANSI CL so that
;   even if it were defined later, the code doing so would not be portable.

;     (UNSIGNED-BYTE 32)
; 
; caught WARNING:
;   The function UNSIGNED-BYTE is undefined, and its name is reserved by ANSI CL so
;   that even if it were defined later, the code doing so would not be portable.
; 
; compilation unit finished
;   Undefined functions:
;     TYPE UNSIGNED-BYTE
;   caught 2 WARNING conditions

TEST

* (time (test))

debugger invoked on a UNDEFINED-FUNCTION in thread
#<THREAD "main thread" RUNNING {1002AAC483}>:
  The function COMMON-LISP:UNSIGNED-BYTE is undefined.

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit debugger, returning to top level.

("undefined function")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment