- Note: This is a continuation from Part 2 [1].
[1] Experiments with libarchive read blocking: Part 2
- Possible advantage: We might not need to modify libarchive itself!
- What is a coroutine? [1]
- What are the implementations for C? [2]
- So in C, to implement a coroutine there are 2 steps:
- Step 1: Obtain a second call stack; "In order to implement general-purpose coroutines, a second call stack must be obtained, which is a feature not directly supported by the C language."
- The problem with this step is "A reliable (albeit platform-specific) way to achieve this is to use a small amount of inline assembly to explicitly manipulate the stack pointer during initial creation of the coroutine."
- This means all implementations are likely platform specific because they all use platform specific assembly language to manipulate the stack.
- Step 2: Use setjmp() and longjmp() functions; "Once a second call stack has been obtained with one of the methods listed above, the setjmp and longjmp functions in the standard C library can then be used to implement the switches between coroutines."
[1] Coroutines
[2] Coroutines: C Implementations
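As a hedged illustration of Step 2 only, here is a minimal sketch of the control transfer that setjmp() and longjmp() provide. It jumps back into a frame that is still active, which is the use the C standard defines; it does not obtain a second stack, so it is not a coroutine switch. The names `resume_point`, `jump_back()` and `count_jumps()` are made up for this example.

```c
#include <setjmp.h>

static jmp_buf resume_point;

/* Transfer control back to the setjmp() call site; never returns. */
static void jump_back(int value) {
    longjmp(resume_point, value);
}

/* Bounce between setjmp() and longjmp() `limit` times and return
 * how many times setjmp() returned because of a longjmp(). */
int count_jumps(int limit) {
    /* volatile: the value is changed between setjmp() and longjmp(),
     * so it must not be cached in a register. */
    volatile int count = 0;

    if (setjmp(resume_point) != 0) {
        count++;                 /* we arrived here via longjmp() */
    }
    if (count < limit) {
        jump_back(count + 1);    /* non-zero value becomes setjmp()'s result */
    }
    return count;
}
```

Calling `count_jumps(3)` from `main()` returns 3: setjmp() returns 0 once, then three more times via longjmp().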
- I'm interested in a portable coroutine implementation for C.
- Preferably -- for my purposes -- the implementation works on x86_64 and aarch64 Linux.
- Ideally there should be no assembly language, i.e. no assembly language to support if it needs debugging later :-)
- Ideally 'portable' does not mean a larger #if .. #else .. #endif block for each supported platform, i.e. less complicated debugging later :-)
- I examine the implementations listed in [1].
- In the process of examining implementations, I came across this interesting paper [2] regarding 'The Signal Stack Trick' for creating new stacks via the POSIX sigaltstack() function.
- Portable Coroutine Library does not look very portable in its source code [3].
- libcoroutine contains assembly language source code [4].
- libconcurrency works on Windows (not Linux?) via fibers [5].
- libcoro contains assembly language source code for some implementations on the same platform, but also seems like an overwhelming melting pot of different implementations [6].
- libdill has issues compiling on ARM and goes way beyond providing an API for coroutines, perhaps being more of a 'Golang for C' [7]?
- libaco has no support for ARM yet [8].
- libco contains assembly language source code [9].
- There seem to be no implementations for C which are obvious candidates to try with libarchive with my selection criteria :-(
[1] Implementations for C
[2] PDF: Portable Multithreading: The Signal Stack Trick For User-Space Thread Creation
[3] Portable Coroutine Library
[4] libcoroutine
[5] libconcurrency
[6] libcoro
[7] libdill
[8] libaco
[9] libco
- One technique (proposed here [1]) is to not use different stacks, and therefore assembly language is not required to manipulate the stack.
- This is achieved by pre-allocating all the stack space for coroutines using C (instead of assembly language) on the regular stack.
- It works as follows:
- The main coroutine / the regular thread of execution already has its own stack, and this can be configured to be larger if need be.
- At run-time, the main coroutine uses alloca() to reserve an arbitrary amount of local stack space for each coroutine used.
- Note: The space allocated by alloca() is never used by the main coroutine which created it.
- The newly created coroutine uses the space reserved by alloca() as its stack.
- This means you must be very careful to reserve more stack space than the coroutine ever needs, otherwise stack collision / corruption will occur.
- In theory this technique should work on any platform where alloca() is available (it is widely supported, although it is not part of ISO C or POSIX), regardless of whether the stack grows up or down.
[1] Coroutines in less than 20 lines of standard C
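To make the reservation idea concrete, here is a minimal sketch. It is not the code from [1], and the names `probe()` and `reserve_and_measure()` are made up. It reserves a block with alloca() and measures how far the next callee's frame has moved as a result. It assumes GCC or Clang (for the `noinline` attribute), a platform that provides `<alloca.h>`, and a build without optimization.

```c
#include <alloca.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: report the address of a local in this frame. */
__attribute__((noinline)) static void probe(uintptr_t *out) {
    volatile char marker = 0;
    *out = (uintptr_t)&marker;
}

/* Reserve `bytes` (must be >= 1) on the current stack and return how far
 * the callee's frame moved. The absolute difference is used, so the
 * result does not depend on the stack direction. */
size_t reserve_and_measure(size_t bytes) {
    uintptr_t before, after;

    probe(&before);                          /* callee frame before reserving */
    volatile char *reserved = alloca(bytes);
    reserved[0] = 0;                         /* touch both ends so the */
    reserved[bytes - 1] = 0;                 /* reservation is really made */
    probe(&after);                           /* callee frame after reserving */

    return before > after ? (size_t)(before - after)
                          : (size_t)(after - before);
}
```

With a 10,000 byte reservation the callee's frame should move by at least 10,000 bytes; that gap is the space a new coroutine would use as its stack.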
- It was very difficult to get this code [1] working.
- The code likes to seg fault a lot, and it's not hardened to prevent and/or show why failures occur.
- So I created a longer but more instrumented version of the code, cogo.c, which serves for both comprehension and sanity checking at run-time.
- I learned the following by hacking the code:
- Often the code runs to the end without a seg fault and appears to work, but it has subtly not worked.
- For example, I had a case where there was no seg fault but piping stdout no longer worked, although there were no errors.
- Another example is control being passed to another coroutine, but the wrong one.
- Another example is control being passed to another coroutine whose stack is wrong, without necessarily causing a seg fault.
- The code is only consistently reliable if there are no statements between setjmp() and return, e.g. if (setjmp(here)) { /* unreliable if code here! */ return; }.
- I could never get any gcc optimized build to work consistently and reliably over time, which kind of makes sense.
- Therefore, I made the compile fail if the gcc command line for cogo.c does not use -O0, i.e. no optimization.
- Interestingly, another way to force -O0 is via #pragma GCC optimize ("O0") but for me this did not consistently avoid seg faults :-(
[1] Coroutines in less than 20 lines of standard C
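One way to make a build fail when optimization is enabled is a preprocessor guard like the sketch below. This is an assumption about how such a check could be written, not necessarily how cogo.c does it. It relies on the `__OPTIMIZE__` macro, which GCC and Clang predefine for optimizing compilations and leave undefined at -O0.

```c
/* Refuse to build with optimization enabled.
 * __OPTIMIZE__ is predefined by GCC and Clang when optimizing
 * (e.g. -O1, -O2) and is not defined at -O0. */
#ifdef __OPTIMIZE__
#error "this file must be compiled with -O0 (no optimization)"
#endif
```

Placed at the top of the source file, this stops the compile as soon as an optimization level is used on the command line.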
- The following output shows instrumentation which enables easier comprehension of what is happening at run-time.
- The code is not tested on platforms with a stack in the opposite direction, but in theory it shouldn't take much to get it working.
- We use Linux calls to get the total size of the stack.
- The code has a bunch of calls to co_sanity_check() which check that the stack pointer is in the correct range for the current coroutine ID.
- With the sanity checks, and compiling without optimization, the code appears to 'just work' as expected :-)
- Note: The code attempts to avoid outputting pointers as much as possible.
- Note: The data of the coroutines is kept in a simple array which can be indexed by the coroutine ID.
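The three checks described above could look roughly like the sketch below. None of this is taken from cogo.c: the function names are made up, and using getrlimit(RLIMIT_STACK) for the stack size is my assumption about which Linux call would fit. It also assumes GCC or Clang and a build without optimization.

```c
#include <stdint.h>
#include <sys/resource.h>

/* Hypothetical helper: report the address of a local in this frame. */
__attribute__((noinline)) static void probe(uintptr_t *out) {
    volatile char marker = 0;
    *out = (uintptr_t)&marker;
}

/* -1 if a callee's frame sits at a lower address than the caller's
 * (stack grows down), +1 otherwise. */
__attribute__((noinline)) int stack_direction(void) {
    volatile char caller_marker = 0;
    uintptr_t callee;
    probe(&callee);
    return callee < (uintptr_t)&caller_marker ? -1 : +1;
}

/* Soft limit of the stack size in bytes, or 0 if the call fails. */
unsigned long long stack_size_limit(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0) return 0;
    return (unsigned long long)rl.rlim_cur;
}

/* Sanity check for a downward-growing stack: the partition starts at
 * end1 (higher address) and ends at end2 (lower address). */
int sp_in_partition(uintptr_t sp, uintptr_t end1, uintptr_t end2) {
    return sp <= end1 && sp > end2;
}
```

Using the offsets from the first log, a stack pointer at -10,256 falls inside coroutine 1's partition (end1 at -10,196, end2 at -20,196), while coroutine 0's -112 does not.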
$ truncate --size=0 cogo.log ; gcc -O0 -fstack-protector-all -o cogo cogo.c && ./cogo | tee cogo.log ; ll cogo.log
- 0: Algorithm:
- 0: - Coroutine 0 launches and passes control to up to n new corountines, each of which has a fixed stack size.
- 0: - Each created coroutine enters an endless loop, which transfers control to a random other coroutine.
- 0: - If control is randomly passed to coroutine 0, a new coroutine is launched, or process exit if all launched
- 0: - Note: Instrumented version of https://fanf.livejournal.com/105413.html
- 0: - Note: Stack space for each new coroutine is reserved using alloca().
- 0: - Note: After switching coroutine control, the new stack pointer is sanity checked to be in expected range.
- 0: - Note: The first digit, after the dash on each output line, shows the coroutine ID with control.
- 0: - Note: The output lines with 'longjmp() ... voodoo' show when coroutine control is switched via longjmp().
- 0: sizeof(jmp_buf)=200
- 0: co_stack_direction_get(stack_addr_caller=0x7fffe47f5c78) {} = -1 AKA down // stack_addr_callee=0x7fffe47f5c28 diff=-80
- 0: co_stack_size_get() {} = 8,388,608 bytes downward @ stack_addr_main=0x7fffe47f5cac
- 0: launching 3 corountines in addition to coroutine 0
- 0: co_launch_internal(0 -> 1) // co_stack_corountine_partitions=1 size_of_all_partitions_so_far=1*10,000=10,000 co_stack_addr_main@-196 end1@-10,196 end2@-20,196
- 0: co_launch_internal(0 -> 1) // setjmp(0) for jump point a
- 0: co_launch_internal(0 -> 1) // thread_main() AKA co_main()
- 1: co_sanity_check() // co_stack_addr_main@-10,256 end1@-10,196 end2@-20,196
- 1: co_main() // jumping to random corountine 1 of main+1 so far
- 1: co_switch_internal(1 -> 1) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 1) // longjmp(1)
... voodoo ...
- 1: co_switch() continued via jump point b
- 1: co_sanity_check() // co_stack_addr_main@-10,336 end1@-10,196 end2@-20,196
- 1: co_main() // co_switch() returned!
- 1: co_main() // jumping to random corountine 1 of main+1 so far
- 1: co_switch_internal(1 -> 1) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 1) // longjmp(1)
... voodoo ...
- 1: co_switch() continued via jump point b
- 1: co_sanity_check() // co_stack_addr_main@-10,336 end1@-10,196 end2@-20,196
- 1: co_main() // co_switch() returned!
- 1: co_main() // jumping to random corountine 0 of main+1 so far
- 1: co_switch_internal(1 -> 0) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 0) // longjmp(0)
... voodoo ...
- 0: co_launch() continued via jump point a
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 0: co_launch() returned!
- 0: co_launch_internal(0 -> 2) // co_stack_corountine_partitions=2 size_of_all_partitions_so_far=2*10,000=20,000 co_stack_addr_main@-196 end1@-20,196 end2@-30,196
- 0: co_launch_internal(0 -> 2) // setjmp(0) for jump point a
- 0: co_launch_internal(0 -> 2) // thread_main() AKA co_main()
- 2: co_sanity_check() // co_stack_addr_main@-20,256 end1@-20,196 end2@-30,196
- 2: co_main() // jumping to random corountine 2 of main+2 so far
- 2: co_switch_internal(2 -> 2) // setjmp(2) for jump point b
- 2: co_switch_internal(2 -> 2) // longjmp(2)
... voodoo ...
- 2: co_switch() continued via jump point b
- 2: co_sanity_check() // co_stack_addr_main@-20,336 end1@-20,196 end2@-30,196
- 2: co_main() // co_switch() returned!
- 2: co_main() // jumping to random corountine 1 of main+2 so far
- 2: co_switch_internal(2 -> 1) // setjmp(2) for jump point b
- 2: co_switch_internal(2 -> 1) // longjmp(1)
... voodoo ...
- 1: co_switch() continued via jump point b
- 1: co_sanity_check() // co_stack_addr_main@-10,336 end1@-10,196 end2@-20,196
- 1: co_main() // co_switch() returned!
- 1: co_main() // jumping to random corountine 1 of main+2 so far
- 1: co_switch_internal(1 -> 1) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 1) // longjmp(1)
... voodoo ...
- 1: co_switch() continued via jump point b
- 1: co_sanity_check() // co_stack_addr_main@-10,336 end1@-10,196 end2@-20,196
- 1: co_main() // co_switch() returned!
- 1: co_main() // jumping to random corountine 0 of main+2 so far
- 1: co_switch_internal(1 -> 0) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 0) // longjmp(0)
... voodoo ...
- 0: co_launch() continued via jump point a
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 0: co_launch() returned!
- 0: co_launch_internal(0 -> 3) // co_stack_corountine_partitions=3 size_of_all_partitions_so_far=3*10,000=30,000 co_stack_addr_main@-196 end1@-30,196 end2@-40,196
- 0: co_launch_internal(0 -> 3) // setjmp(0) for jump point a
- 0: co_launch_internal(0 -> 3) // thread_main() AKA co_main()
- 3: co_sanity_check() // co_stack_addr_main@-30,256 end1@-30,196 end2@-40,196
- 3: co_main() // jumping to random corountine 3 of main+3 so far
- 3: co_switch_internal(3 -> 3) // setjmp(3) for jump point b
- 3: co_switch_internal(3 -> 3) // longjmp(3)
... voodoo ...
- 3: co_switch() continued via jump point b
- 3: co_sanity_check() // co_stack_addr_main@-30,336 end1@-30,196 end2@-40,196
- 3: co_main() // co_switch() returned!
- 3: co_main() // jumping to random corountine 0 of main+3 so far
- 3: co_switch_internal(3 -> 0) // setjmp(3) for jump point b
- 3: co_switch_internal(3 -> 0) // longjmp(0)
... voodoo ...
- 0: co_launch() continued via jump point a
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 0: co_launch() returned!
- 0: launched all corountines
-rw-rw-r-- 1 simon simon 5374 Nov 4 17:10 cogo.log
- Modify the algorithm so that all coroutine stack space is initially pre-allocated.
- We do this by pre-launching all coroutines, but not passing control to them.
- After pre-launching, coroutine 0 continues in sub_main() and only returns when the process intends to exit, at which point all stack space allocated for pre-launched coroutines gets destroyed.
- This has the advantage that coroutine 0 can continue to use the rest of the stack without 'bumping into' stack space allocated for pre-launched coroutines.
- Hopefully in this way, a monolithic legacy process using libarchive can continue to run in coroutine 0 without worrying how much stack space it uses.
- But it can also pre-launch n coroutines with fixed, pre-determined stack sizes for co-operative operation, such as the libarchive read callback using co_switch() to produce a result similar to an async read?
- The disadvantage of this method is that the maximum number of coroutines must be known at process start, since their 'stacks' are permanently allocated on the regular stack until the end of the process:
+ main() { <- stack top
<- allocate stack bytes for coroutine 1
...
<- allocate stack bytes for coroutine 1+n
+ sub_main() { <- regular stack continues...
... <- regular code continues...
}
} <- coroutine stacks destroyed here
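The layout in the diagram can be sketched as follows. This is a simplification, not the code from cogo-2-pre-launch.c: it reserves all partitions with a single alloca() call instead of one per launch, it never switches to a coroutine, and the names `co_partitions`, `sub_main()` and `prelaunch_then_run()` as well as the sizes are illustrative assumptions. It only shows that code running in sub_main() ends up outside every reserved partition.

```c
#include <alloca.h>
#include <stddef.h>
#include <stdint.h>

#define CO_COUNT 3            /* number of pre-launched coroutines */
#define CO_STACK_BYTES 10000  /* fixed stack size per coroutine */

struct co_partition { uintptr_t lo; uintptr_t hi; };     /* [lo, hi) */
static struct co_partition co_partitions[CO_COUNT + 1];  /* index 0 unused */

/* Stand-in for the real program: returns 1 if its own frame lies
 * outside every reserved partition, 0 otherwise. */
__attribute__((noinline)) static int sub_main(void) {
    volatile char marker = 0;
    uintptr_t here = (uintptr_t)&marker;
    for (int id = 1; id <= CO_COUNT; id++) {
        if (here >= co_partitions[id].lo && here < co_partitions[id].hi)
            return 0;
    }
    return 1;
}

/* Reserve all coroutine stacks up front, record their bounds in an
 * array indexed by coroutine ID, then continue in sub_main(). The
 * reservation lives until this function returns. */
int prelaunch_then_run(void) {
    size_t total = (size_t)CO_COUNT * CO_STACK_BYTES;
    volatile char *region = alloca(total);
    region[0] = 0;
    region[total - 1] = 0;

    for (int id = 1; id <= CO_COUNT; id++) {
        co_partitions[id].lo = (uintptr_t)region + (size_t)(id - 1) * CO_STACK_BYTES;
        co_partitions[id].hi = co_partitions[id].lo + CO_STACK_BYTES;
    }
    return sub_main();
}
```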
$ cp cogo.c cogo-2-pre-launch.c
$ # hack code for pre-launching coroutines!
$ truncate --size=0 cogo-2-pre-launch.log ; gcc -O0 -fstack-protector-all -o cogo-2-pre-launch cogo-2-pre-launch.c && ./cogo-2-pre-launch | tee cogo-2-pre-launch.log ; ll cogo-2-pre-launch.log
- 0: Algorithm:
- 0: - Coroutine 0 pre-launches n new corountines, without passing control to them, each of which has a fixed stack size.
- 0: - After pre-launch coroutine 0 continues in sub_main() which becomes the new main()
- 0: - sub_main() switches control to coroutine 1.
- 0: - Each created coroutine enters an endless loop, which transfers control to a random other coroutine.
- 0: - If control is randomly passed to coroutine 0, the process exits
- 0: - Note: Instrumented version of https://fanf.livejournal.com/105413.html
- 0: - Note: Stack space for each new coroutine is reserved using alloca().
- 0: - Note: After switching coroutine control, the new stack pointer is sanity checked to be in expected range.
- 0: - Note: The first digit, after the dash on each output line, shows the coroutine ID with control.
- 0: - Note: The output lines with 'longjmp() ... voodoo' show when coroutine control is switched via longjmp().
- 0: sizeof(jmp_buf)=200
- 0: co_stack_direction_get(stack_addr_caller=0x7ffd6ece9eb8) {} = -1 AKA down // stack_addr_callee=0x7ffd6ece9e68 diff=-80
- 0: co_stack_size_get() {} = 8,388,608 bytes downward @ stack_addr_main=0x7ffd6ece9eec
- 0: launching 3 corountines in addition to coroutine 0
- 0: co_launch_internal(0 -> 1) // co_stack_corountine_partitions=1 size_of_all_partitions_so_far=1*10,000=10,000 co_stack_addr_main@-196 end1@-10,196 end2@-20,196
- 0: co_launch_internal(0 -> 1) // setjmp(0) for jump point a
- 0: co_launch_internal(0 -> 1) // thread_main() AKA co_main()
- 1: co_sanity_check() // co_stack_addr_main@-10,256 end1@-10,196 end2@-20,196
- 1: co_switch_internal(1 -> 0) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 0) // longjmp(0)
... voodoo ...
- 0: co_launch() continued via jump point a
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 0: co_launch() returned!
- 0: co_launch_internal(0 -> 2) // co_stack_corountine_partitions=2 size_of_all_partitions_so_far=2*10,000=20,000 co_stack_addr_main@-196 end1@-20,196 end2@-30,196
- 0: co_launch_internal(0 -> 2) // setjmp(0) for jump point a
- 0: co_launch_internal(0 -> 2) // thread_main() AKA co_main()
- 2: co_sanity_check() // co_stack_addr_main@-20,256 end1@-20,196 end2@-30,196
- 2: co_switch_internal(2 -> 0) // setjmp(2) for jump point b
- 2: co_switch_internal(2 -> 0) // longjmp(0)
... voodoo ...
- 0: co_launch() continued via jump point a
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 0: co_launch() returned!
- 0: co_launch_internal(0 -> 3) // co_stack_corountine_partitions=3 size_of_all_partitions_so_far=3*10,000=30,000 co_stack_addr_main@-196 end1@-30,196 end2@-40,196
- 0: co_launch_internal(0 -> 3) // setjmp(0) for jump point a
- 0: co_launch_internal(0 -> 3) // thread_main() AKA co_main()
- 3: co_sanity_check() // co_stack_addr_main@-30,256 end1@-30,196 end2@-40,196
- 3: co_switch_internal(3 -> 0) // setjmp(3) for jump point b
- 3: co_switch_internal(3 -> 0) // longjmp(0)
... voodoo ...
- 0: co_launch() continued via jump point a
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 0: co_launch() returned!
- 0: launched all corountines
- 0: co_launch_internal(0 -> 0) // co_stack_corountine_partitions=4 size_of_all_partitions_so_far=4*10,000=40,000 co_stack_addr_main@-196 end1@-0 end2@-8,388,608 <-- coroutine 0 stack skips past all others!
- 0: co_launch_internal(0 -> 0) // thread_main() AKA sub_main()
- 0: co_sanity_check() // co_stack_addr_main@-40,256 end1@-0 end2@-8,388,608
- 0: sub_main() // continuing main() with stack position skipped past launched coroutine stacks
- 0: co_sanity_check() // co_stack_addr_main@-40,288 end1@-0 end2@-8,388,608
- 0: co_switch_internal(0 -> 1) // setjmp(0) for jump point b
- 0: co_switch_internal(0 -> 1) // longjmp(1)
... voodoo ...
- 1: co_switch() continued via jump point b
- 1: co_sanity_check() // co_stack_addr_main@-10,336 end1@-10,196 end2@-20,196
- 1: co_main() // jumping to random corountine 1 of main+3 so far
- 1: co_switch_internal(1 -> 1) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 1) // longjmp(1)
... voodoo ...
- 1: co_switch() continued via jump point b
- 1: co_sanity_check() // co_stack_addr_main@-10,336 end1@-10,196 end2@-20,196
- 1: co_main() // co_switch() returned!
- 1: co_main() // jumping to random corountine 3 of main+3 so far
- 1: co_switch_internal(1 -> 3) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 3) // longjmp(3)
... voodoo ...
- 3: co_switch() continued via jump point b
- 3: co_sanity_check() // co_stack_addr_main@-30,336 end1@-30,196 end2@-40,196
- 3: co_main() // jumping to random corountine 2 of main+3 so far
- 3: co_switch_internal(3 -> 2) // setjmp(3) for jump point b
- 3: co_switch_internal(3 -> 2) // longjmp(2)
... voodoo ...
- 2: co_switch() continued via jump point b
- 2: co_sanity_check() // co_stack_addr_main@-20,336 end1@-20,196 end2@-30,196
- 2: co_main() // jumping to random corountine 2 of main+3 so far
- 2: co_switch_internal(2 -> 2) // setjmp(2) for jump point b
- 2: co_switch_internal(2 -> 2) // longjmp(2)
... voodoo ...
- 2: co_switch() continued via jump point b
- 2: co_sanity_check() // co_stack_addr_main@-20,336 end1@-20,196 end2@-30,196
- 2: co_main() // co_switch() returned!
- 2: co_main() // jumping to random corountine 3 of main+3 so far
- 2: co_switch_internal(2 -> 3) // setjmp(2) for jump point b
- 2: co_switch_internal(2 -> 3) // longjmp(3)
... voodoo ...
- 3: co_switch() continued via jump point b
- 3: co_sanity_check() // co_stack_addr_main@-30,336 end1@-30,196 end2@-40,196
- 3: co_main() // co_switch() returned!
- 3: co_main() // jumping to random corountine 1 of main+3 so far
- 3: co_switch_internal(3 -> 1) // setjmp(3) for jump point b
- 3: co_switch_internal(3 -> 1) // longjmp(1)
... voodoo ...
- 1: co_switch() continued via jump point b
- 1: co_sanity_check() // co_stack_addr_main@-10,336 end1@-10,196 end2@-20,196
- 1: co_main() // co_switch() returned!
- 1: co_main() // jumping to random corountine 1 of main+3 so far
- 1: co_switch_internal(1 -> 1) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 1) // longjmp(1)
... voodoo ...
- 1: co_switch() continued via jump point b
- 1: co_sanity_check() // co_stack_addr_main@-10,336 end1@-10,196 end2@-20,196
- 1: co_main() // co_switch() returned!
- 1: co_main() // jumping to random corountine 3 of main+3 so far
- 1: co_switch_internal(1 -> 3) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 3) // longjmp(3)
... voodoo ...
- 3: co_switch() continued via jump point b
- 3: co_sanity_check() // co_stack_addr_main@-30,336 end1@-30,196 end2@-40,196
- 3: co_main() // co_switch() returned!
- 3: co_main() // jumping to random corountine 0 of main+3 so far
- 3: co_switch_internal(3 -> 0) // setjmp(3) for jump point b
- 3: co_switch_internal(3 -> 0) // longjmp(0)
... voodoo ...
- 0: co_switch() continued via jump point b
- 0: co_sanity_check() // co_stack_addr_main@-40,336 end1@-0 end2@-8,388,608
- 0: sub_main() finishing
- 0: co_launch_internal() // coroutine 0 exiting; process shutting down shortly?
- 0: back from sub_main(); process exiting
-rw-rw-r-- 1 simon simon 7133 Nov 5 17:45 cogo-2-pre-launch.log
- Modify the code so that a dummy libarchive with a dummy blocking read callback is called.
- Make a dummy libarchive read function which has its own dummy read callback, and needs n dummy read callbacks to complete.
- Call the dummy read functions from 2 coroutines and design the callbacks to only complete in a specific order, simulating blocking reads.
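A dummy read function of this kind might be shaped like the sketch below. The signatures are hypothetical and not copied from cogo-3-dummy-libarchive.c. Each callback is reached through a random number of recursive calls to simulate libarchive's stack use. Here the callback just counts invocations, whereas in the experiment it would switch coroutines to simulate blocking.

```c
#include <stdlib.h>

/* Hypothetical callback type: called once per simulated blocking read. */
typedef void (*dummy_read_callback)(int instance, int callback_no, void *user);

/* Recurse `depth` levels to use some stack, then invoke the callback. */
static void recurse_then_callback(int depth, int instance, int callback_no,
                                  dummy_read_callback cb, void *user) {
    volatile char padding[64];   /* make each frame use some stack */
    padding[0] = (char)depth;
    if (depth > 0)
        recurse_then_callback(depth - 1, instance, callback_no, cb, user);
    else
        cb(instance, callback_no, user);
}

/* Dummy read: needs `callbacks_needed` callbacks to complete, each at a
 * random call depth between 0 and max_depth. Returns the callback count. */
int dummy_libarchive_read(int instance, int callbacks_needed, int max_depth,
                          dummy_read_callback cb, void *user) {
    for (int n = 1; n <= callbacks_needed; n++) {
        int depth = rand() % (max_depth + 1);
        recurse_then_callback(depth, instance, n, cb, user);
    }
    return callbacks_needed;
}

/* Example callback: count how often it was invoked. */
static void count_callback(int instance, int callback_no, void *user) {
    (void)instance;
    (void)callback_no;
    ++*(int *)user;
}
```

For example, `dummy_libarchive_read(1, 2, 4, count_callback, &calls)` invokes the callback twice, whatever depths rand() picks.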
$ cp cogo-2-pre-launch.c cogo-3-dummy-libarchive.c
$ # hack code for dummy libarchive!
$ truncate --size=0 cogo-3-dummy-libarchive.log ; gcc -O0 -fstack-protector-all -o cogo-3-dummy-libarchive cogo-3-dummy-libarchive.c && ./cogo-3-dummy-libarchive | tee cogo-3-dummy-libarchive.log ; ll cogo-3-dummy-libarchive.log ; cat cogo-3-dummy-libarchive.log | egrep "(instance_init|// order|callback.*instance.*leave)"
- 0: main() // enter
- 0: Algorithm:
- 0: - Coroutine 0 pre-launches n new corountines, without passing control to them, each of which has a fixed stack size.
- 0: - After pre-launch coroutine 0 continues in sub_main() which becomes the new main()
- 0: - sub_main() switches control to each new coroutine one by one, running until the first callback in invoked.
- 0: - Each callback is invoked after a random number of recursive function calls, simulating libarchive stack use.
- 0: - At this point all new coroutines have run until calling their blocking callback.
- 0: - sub_main() determines the order in which the blocking callbacks will continue.
- 0: - sub_main() switches control to each new coroutine in the predetermined order to unblock each callback
- 0: - Note: Instrumented version of https://fanf.livejournal.com/105413.html modified for pre-launching.
- 0: - Note: Stack space for each new coroutine is reserved using alloca().
- 0: - Note: Stack space for exiting coroutine 0 can grow beyond co_stack_size_per_coroutine.
- 0: - Note: After switching coroutine control, the new stack pointer is sanity checked to be in expected range.
- 0: - Note: The first digit, after the dash on each output line, shows the coroutine ID with control.
- 0: - Note: The output lines with 'longjmp() ... voodoo' show when coroutine control is switched via longjmp().
- 0: sizeof(jmp_buf)=200
- 0: co_stack_direction_get(stack_addr_caller=0x7ffe54615758) {} = -1 AKA down // stack_addr_callee=0x7ffe54615708 diff=-80
- 0: co_stack_size_get() {} = 8,388,608 bytes downward @ stack_addr_main=0x7ffe5461578c
- 0: launching 2 corountines in addition to coroutine 0 <-- milestone 1
- 0: co_launch_internal(0 -> 1) // co_stack_corountine_partitions=1 size_of_all_partitions_so_far=1*10,000=10,000 co_stack_addr_main@-196 end1@-10,196 end2@-20,196
- 0: co_launch_internal(0 -> 1) // setjmp(0) for jump point a
- 0: co_launch_internal(0 -> 1) // thread_main() AKA co_main()
- 1: co_sanity_check() // co_stack_addr_main@-10,256 end1@-10,196 end2@-20,196
- 1: co_switch_internal(1 -> 0) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 0) // longjmp(0)
... voodoo ...
- 0: co_launch() continued via jump point a
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 0: co_launch() returned!
- 0: co_launch_internal(0 -> 2) // co_stack_corountine_partitions=2 size_of_all_partitions_so_far=2*10,000=20,000 co_stack_addr_main@-196 end1@-20,196 end2@-30,196
- 0: co_launch_internal(0 -> 2) // setjmp(0) for jump point a
- 0: co_launch_internal(0 -> 2) // thread_main() AKA co_main()
- 2: co_sanity_check() // co_stack_addr_main@-20,256 end1@-20,196 end2@-30,196
- 2: co_switch_internal(2 -> 0) // setjmp(2) for jump point b
- 2: co_switch_internal(2 -> 0) // longjmp(0)
... voodoo ...
- 0: co_launch() continued via jump point a
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 0: co_launch() returned!
- 0: launched all corountines <-- milestone 2
- 0: co_launch_internal(0 -> 0) // co_stack_corountine_partitions=3 size_of_all_partitions_so_far=3*10,000=30,000 co_stack_addr_main@-196 end1@-0 end2@-8,388,608 <-- coroutine 0 stack skips past all others!
- 0: co_launch_internal(0 -> 0) // thread_main() AKA sub_main()
- 0: co_sanity_check() // co_stack_addr_main@-30,256 end1@-0 end2@-8,388,608
- 0: sub_main() // enter; continuing main() with stack position skipped past launched coroutine stacks
- 0: co_sanity_check() // co_stack_addr_main@-30,288 end1@-0 end2@-8,388,608
- 0: sub_main() // pass control to coroutines until first callback entered <-- milestone 3
- 0: sub_main() // pass control to coroutine 1 until first callback entered
- 0: co_switch_internal(0 -> 1) // setjmp(0) for jump point b
- 0: co_switch_internal(0 -> 1) // longjmp(1)
... voodoo ...
- 1: co_switch() continued via jump point b
- 1: co_sanity_check() // co_stack_addr_main@-10,336 end1@-10,196 end2@-20,196
- 1: dummy_libarchive_instance_init()
- 1: dummy_libarchive_random_stack_depth(2) // enter
- 1: dummy_libarchive_random_stack_depth(1) // enter
- 1: dummy_libarchive_random_stack_depth(0) // enter
- 1: dummy_libarchive_callback() // callback 1 for instance 1; enter
- 1: co_switch_internal(1 -> 0) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 0) // longjmp(0)
... voodoo ...
- 0: co_switch() continued via jump point b
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
- 0: sub_main() // pass control to coroutine 2 until first callback entered
- 0: co_switch_internal(0 -> 2) // setjmp(0) for jump point b
- 0: co_switch_internal(0 -> 2) // longjmp(2)
... voodoo ...
- 2: co_switch() continued via jump point b
- 2: co_sanity_check() // co_stack_addr_main@-20,336 end1@-20,196 end2@-30,196
- 2: dummy_libarchive_instance_init()
- 2: dummy_libarchive_random_stack_depth(1) // enter
- 2: dummy_libarchive_random_stack_depth(0) // enter
- 2: dummy_libarchive_callback() // callback 1 for instance 2; enter
- 2: co_switch_internal(2 -> 0) // setjmp(2) for jump point b
- 2: co_switch_internal(2 -> 0) // longjmp(0)
... voodoo ...
- 0: co_switch() continued via jump point b
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
- 0: dummy_libarchive_callback_order_init() {} // order 2 1 1 2 <-- milestone 4
- 0: sub_main() // pass control to coroutine callbacks in desired order <-- milestone 5
- 0: sub_main() // pass control to coroutine 2 to complete its callback
- 0: co_switch_internal(0 -> 2) // setjmp(0) for jump point b
- 0: co_switch_internal(0 -> 2) // longjmp(2)
... voodoo ...
- 2: co_switch() continued via jump point b
- 2: co_sanity_check() // co_stack_addr_main@-20,464 end1@-20,196 end2@-30,196
- 2: dummy_libarchive_callback() // callback 1 for instance 2; leave
- 2: dummy_libarchive_random_stack_depth(0) // leave
- 2: dummy_libarchive_random_stack_depth(1) // leave
- 2: dummy_libarchive_random_stack_depth(4) // enter
- 2: dummy_libarchive_random_stack_depth(3) // enter
- 2: dummy_libarchive_random_stack_depth(2) // enter
- 2: dummy_libarchive_random_stack_depth(1) // enter
- 2: dummy_libarchive_random_stack_depth(0) // enter
- 2: dummy_libarchive_callback() // callback 2 for instance 2; enter
- 2: co_switch_internal(2 -> 0) // setjmp(2) for jump point b
- 2: co_switch_internal(2 -> 0) // longjmp(0)
... voodoo ...
- 0: co_switch() continued via jump point b
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
- 0: sub_main() // pass control to coroutine 1 to complete its callback
- 0: co_switch_internal(0 -> 1) // setjmp(0) for jump point b
- 0: co_switch_internal(0 -> 1) // longjmp(1)
... voodoo ...
- 1: co_switch() continued via jump point b
- 1: co_sanity_check() // co_stack_addr_main@-10,512 end1@-10,196 end2@-20,196
- 1: dummy_libarchive_callback() // callback 1 for instance 1; leave
- 1: dummy_libarchive_random_stack_depth(0) // leave
- 1: dummy_libarchive_random_stack_depth(1) // leave
- 1: dummy_libarchive_random_stack_depth(2) // leave
- 1: dummy_libarchive_random_stack_depth(3) // enter
- 1: dummy_libarchive_random_stack_depth(2) // enter
- 1: dummy_libarchive_random_stack_depth(1) // enter
- 1: dummy_libarchive_random_stack_depth(0) // enter
- 1: dummy_libarchive_callback() // callback 2 for instance 1; enter
- 1: co_switch_internal(1 -> 0) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 0) // longjmp(0)
... voodoo ...
- 0: co_switch() continued via jump point b
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
- 0: sub_main() // pass control to coroutine 1 to complete its callback
- 0: co_switch_internal(0 -> 1) // setjmp(0) for jump point b
- 0: co_switch_internal(0 -> 1) // longjmp(1)
... voodoo ...
- 1: co_switch() continued via jump point b
- 1: co_sanity_check() // co_stack_addr_main@-10,560 end1@-10,196 end2@-20,196
- 1: dummy_libarchive_callback() // callback 2 for instance 1; leave
- 1: dummy_libarchive_random_stack_depth(0) // leave
- 1: dummy_libarchive_random_stack_depth(1) // leave
- 1: dummy_libarchive_random_stack_depth(2) // leave
- 1: dummy_libarchive_random_stack_depth(3) // leave
- 1: co_switch_internal(1 -> 0) // setjmp(1) for jump point b
- 1: co_switch_internal(1 -> 0) // longjmp(0)
... voodoo ...
- 0: co_switch() continued via jump point b
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
- 0: sub_main() // pass control to coroutine 2 to complete its callback
- 0: co_switch_internal(0 -> 2) // setjmp(0) for jump point b
- 0: co_switch_internal(0 -> 2) // longjmp(2)
... voodoo ...
- 2: co_switch() continued via jump point b
- 2: co_sanity_check() // co_stack_addr_main@-20,608 end1@-20,196 end2@-30,196
- 2: dummy_libarchive_callback() // callback 2 for instance 2; leave
- 2: dummy_libarchive_random_stack_depth(0) // leave
- 2: dummy_libarchive_random_stack_depth(1) // leave
- 2: dummy_libarchive_random_stack_depth(2) // leave
- 2: dummy_libarchive_random_stack_depth(3) // leave
- 2: dummy_libarchive_random_stack_depth(4) // leave
- 2: co_switch_internal(2 -> 0) // setjmp(2) for jump point b
- 2: co_switch_internal(2 -> 0) // longjmp(0)
... voodoo ...
- 0: co_switch() continued via jump point b
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
- 0: sub_main() // leave
- 0: co_launch_internal() // coroutine 0 exiting; process shutting down shortly?
- 0: main() // leave
-rw-rw-r-- 1 simon simon 9416 Nov 6 17:32 cogo-3-dummy-libarchive.log
- Here we can see in the grep output:
- The dummy libarchive code is called in coroutine ID order until the first dummy blocking read callback.
- The test program determines the random order that the dummy blocking read callbacks will be unblocked.
- Control is passed to the dummy blocking read callbacks in the predetermined order.
$ cat cogo-3-dummy-libarchive.log | egrep "(instance_init|// order|callback.*instance.*leave)"
- 1: dummy_libarchive_instance_init()
- 2: dummy_libarchive_instance_init()
- 0: dummy_libarchive_callback_order_init() {} // order 2 1 1 2 <-- milestone 4
- 2: dummy_libarchive_callback() // callback 1 for instance 2; leave
- 1: dummy_libarchive_callback() // callback 1 for instance 1; leave
- 1: dummy_libarchive_callback() // callback 2 for instance 1; leave
- 2: dummy_libarchive_callback() // callback 2 for instance 2; leave
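The scheduling pattern in this output (run every coroutine until its first callback blocks, then unblock callbacks in a chosen order) can be reproduced with a different, better-defined mechanism. The sketch below uses the `<ucontext.h>` functions getcontext(), makecontext() and swapcontext() with separate, statically allocated stacks. This is deliberately not the alloca()/setjmp() technique from cogo.c, and glibc implements these functions with assembly internally, so it does not meet my selection criteria; it only illustrates the ordering. All names and sizes are made up.

```c
#include <ucontext.h>

#define CO_COUNT 2
#define CO_STACK_BYTES (64 * 1024)
#define CALLBACKS_PER_INSTANCE 2

static ucontext_t scheduler_ctx;
static ucontext_t co_ctx[CO_COUNT + 1];               /* index 0 unused */
static char co_stack[CO_COUNT + 1][CO_STACK_BYTES];
static int current;                                   /* coroutine with control */
static int leave_log[CO_COUNT * CALLBACKS_PER_INSTANCE];
static int leave_count;

/* Simulated blocking read callback: yield to the scheduler, and log
 * the coroutine ID once the scheduler resumes us. */
static void blocking_callback(void) {
    swapcontext(&co_ctx[current], &scheduler_ctx);
    leave_log[leave_count++] = current;
}

/* Simulated read that needs several callbacks to complete. */
static void co_entry(void) {
    for (int i = 0; i < CALLBACKS_PER_INSTANCE; i++)
        blocking_callback();
    /* returning resumes scheduler_ctx via uc_link */
}

/* `order` must list each coroutine ID exactly CALLBACKS_PER_INSTANCE
 * times. Returns the number of completed callbacks, or -1 on error. */
int run_in_order(const int *order, int n) {
    leave_count = 0;
    for (int id = 1; id <= CO_COUNT; id++) {
        if (getcontext(&co_ctx[id]) != 0) return -1;
        co_ctx[id].uc_stack.ss_sp = co_stack[id];
        co_ctx[id].uc_stack.ss_size = CO_STACK_BYTES;
        co_ctx[id].uc_link = &scheduler_ctx;
        makecontext(&co_ctx[id], co_entry, 0);
        current = id;
        swapcontext(&scheduler_ctx, &co_ctx[id]);     /* run until first block */
    }
    for (int i = 0; i < n; i++) {
        current = order[i];
        swapcontext(&scheduler_ctx, &co_ctx[current]); /* unblock one callback */
    }
    return leave_count;
}
```

Calling `run_in_order()` with the order 2 1 1 2 completes four callbacks and leaves `leave_log` holding 2, 1, 1, 2, the same order as the 'leave' lines in the grep output.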
- All seems to 'just work' according to the instrumentation output below.
- Same number of bytes in output log.
- Same order of callbacks and call depth, because rand() is seeded with the same srand() value at each process start.
- Same order of co_sanity_check() calls.
- Note: Both platforms have 'down' stacks. I would not trust the current code with an 'up' stack platform and want to test it some more...
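The repeatability comes from seeding, which this small sketch shows: seeding rand() with the same srand() value reproduces the same sequence. The function name `fill_depths()` and its parameters are hypothetical. The guarantee only holds for one C library implementation; a different libc may produce a different sequence for the same seed.

```c
#include <stdlib.h>

/* Fill `out` with n pseudo-random call depths in 0..max_depth,
 * starting from a fixed seed so the sequence is repeatable. */
void fill_depths(unsigned seed, int *out, int n, int max_depth) {
    srand(seed);
    for (int i = 0; i < n; i++)
        out[i] = rand() % (max_depth + 1);
}
```

Two calls with the same seed fill both arrays with identical values, which is why both runs log the same call depths and callback order.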
$ # find an ARM box ... :)
$ lscpu | egrep Arch
Architecture: aarch64
$ truncate --size=0 cogo-3-dummy-libarchive.log ; gcc -O0 -fstack-protector-all -o cogo-3-dummy-libarchive cogo-3-dummy-libarchive.c && ./cogo-3-dummy-libarchive | tee cogo-3-dummy-libarchive.log ; ll cogo-3-dummy-libarchive.log ; cat cogo-3-dummy-libarchive.log | egrep "(instance_init|// order|callback.*instance.*leave)"
...
- 0: main() // leave
-rw-rw-r-- 1 simon simon 9416 Nov 7 22:06 cogo-3-dummy-libarchive.log
- 1: dummy_libarchive_instance_init()
- 2: dummy_libarchive_instance_init()
- 0: dummy_libarchive_callback_order_init() {} // order 2 1 1 2 <-- milestone 4
- 2: dummy_libarchive_callback() // callback 1 for instance 2; leave
- 1: dummy_libarchive_callback() // callback 1 for instance 1; leave
- 1: dummy_libarchive_callback() // callback 2 for instance 1; leave
- 2: dummy_libarchive_callback() // callback 2 for instance 2; leave
- We can see that the stack pointer addresses only vary slightly from platform to platform:
$ lscpu | egrep Arch ; cat cogo-3-dummy-libarchive.log | egrep co_sanity_check
Architecture: x86_64
- 1: co_sanity_check() // co_stack_addr_main@-10,256 end1@-10,196 end2@-20,196
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 2: co_sanity_check() // co_stack_addr_main@-20,256 end1@-20,196 end2@-30,196
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 0: co_sanity_check() // co_stack_addr_main@-30,256 end1@-0 end2@-8,388,608
- 0: co_sanity_check() // co_stack_addr_main@-30,288 end1@-0 end2@-8,388,608
- 1: co_sanity_check() // co_stack_addr_main@-10,336 end1@-10,196 end2@-20,196
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
- 2: co_sanity_check() // co_stack_addr_main@-20,336 end1@-20,196 end2@-30,196
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
- 2: co_sanity_check() // co_stack_addr_main@-20,464 end1@-20,196 end2@-30,196
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
- 1: co_sanity_check() // co_stack_addr_main@-10,512 end1@-10,196 end2@-20,196
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
- 1: co_sanity_check() // co_stack_addr_main@-10,560 end1@-10,196 end2@-20,196
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
- 2: co_sanity_check() // co_stack_addr_main@-20,608 end1@-20,196 end2@-30,196
- 0: co_sanity_check() // co_stack_addr_main@-30,336 end1@-0 end2@-8,388,608
$ lscpu | egrep Arch ; cat cogo-3-dummy-libarchive.log | egrep co_sanity_check
Architecture: aarch64
- 1: co_sanity_check() // co_stack_addr_main@-10,288 end1@-10,164 end2@-20,164
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 2: co_sanity_check() // co_stack_addr_main@-20,288 end1@-20,164 end2@-30,164
- 0: co_sanity_check() // co_stack_addr_main@-112 end1@-0 end2@-8,388,608
- 0: co_sanity_check() // co_stack_addr_main@-30,288 end1@-0 end2@-8,388,608
- 0: co_sanity_check() // co_stack_addr_main@-30,320 end1@-0 end2@-8,388,608
- 1: co_sanity_check() // co_stack_addr_main@-10,368 end1@-10,164 end2@-20,164
- 0: co_sanity_check() // co_stack_addr_main@-30,368 end1@-0 end2@-8,388,608
- 2: co_sanity_check() // co_stack_addr_main@-20,368 end1@-20,164 end2@-30,164
- 0: co_sanity_check() // co_stack_addr_main@-30,368 end1@-0 end2@-8,388,608
- 2: co_sanity_check() // co_stack_addr_main@-20,496 end1@-20,164 end2@-30,164
- 0: co_sanity_check() // co_stack_addr_main@-30,368 end1@-0 end2@-8,388,608
- 1: co_sanity_check() // co_stack_addr_main@-10,544 end1@-10,164 end2@-20,164
- 0: co_sanity_check() // co_stack_addr_main@-30,368 end1@-0 end2@-8,388,608
- 1: co_sanity_check() // co_stack_addr_main@-10,592 end1@-10,164 end2@-20,164
- 0: co_sanity_check() // co_stack_addr_main@-30,368 end1@-0 end2@-8,388,608
- 2: co_sanity_check() // co_stack_addr_main@-20,640 end1@-20,164 end2@-30,164
- 0: co_sanity_check() // co_stack_addr_main@-30,368 end1@-0 end2@-8,388,608
- Coming in Part 4!