I hereby claim:
- I am snickerbockers on github.
- I am snickerbockers (https://keybase.io/snickerbockers) on keybase.
- I have a public key ASCtACoOrMLdcwaEPscgEjMLxff6WEZexDFQGLLFUJLc1go
To claim this, I am signing this object:
(c-add-style "custom-C++"
'("stroustrup"
(c-indent-comments-syntactically-p 1)
(c-offsets-alist
(substatement-open . 0)
(defun-open . 0)
(defun-close . 0)
(defun-block-intro . 4)
(class-open . 0)
#include <stdlib.h>
#include <stdio.h>
/*
 * Options:
 * PAREN - use parentheses around the ternary
 * GREATER - make the root_node's key greater than the key sent to less
 * (default is less)
 */
Last updated: 11:20 PM, 11/12/2017
Name: Dave
Swim: 26
Fly: 28
Run: 41
Power: 38
Luck: 00
Intelligence: 15
Stamina: 18
This is with the native x86_64 jit backend (-x option).
I'm not sure if the generated code counts towards dreamcast_run or not.
It does seem like the code cache is one of my primary bottlenecks, though.
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self               self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
 20.67      2.77     2.77                              dreamcast_run
diff --git a/src/gfx/opengl/opengl_output.c b/src/gfx/opengl/opengl_output.c
index e00a4df..172596f 100644
--- a/src/gfx/opengl/opengl_output.c
+++ b/src/gfx/opengl/opengl_output.c
@@ -56,17 +56,11 @@ static struct shader fb_shader;
 #define FB_VERT_LEN 5
 #define FB_VERT_COUNT 4
 static GLfloat fb_quad_verts[FB_VERT_LEN * FB_VERT_COUNT] = {
- /*
- * it is not a mistake that the texture-coordinates are upside-down
jay@sbckrs_desktop ~/programs/washingtondc/build $ bash ~/pmp.sh
200 epoll_wait,epoll_dispatch,event_base_loop,io_main,start_thread,clone
31 ??,??,dreamcast_run,main
18 dreamcast_run,main
17 ??,??,??,??,??,??,??
14 ??,??,??,??
13 sh4_read_mem_32,??,bios,??
11 ??,bios,??
10 sh4_read_mem_32,??,??,??,??,??,??,??
10 memory_map_read_32,sh4_read_mem_32,??,??,??,??,??,??,??
==== Code Cache perf stats ====
JIT: 828742910 total accesses
JIT: 185777 total tree searches
JIT: 177434 table evictions
JIT: max depth was 13
JIT: max cache size was 4259
JIT: height of root at shutdown is 13
JIT: balance of root at shutdown is 0
JIT: The top 10 most popular code blocks were accessed:
JIT: 0x8c0b5d8c - 30683 times
Need to unify texture cache and framebuffer to support render-to-texture
without needlessly copying the framebuffer back to texture memory after every
STARTRENDER command.
Also need to support framebuffer access from the SH4 (meaning situations where
the CPU writes directly to the framebuffer without using the graphics hw at
all, or situations where the CPU reads from something that's already been
rendered).
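One way to picture the requirement above is a single surface object that tracks which of its two copies is current and only synchronizes when the other side actually asks for the data. The sketch below is only an illustration under that assumption: every name in it (`struct surface`, `surface_render`, `surface_sample`, `surface_cpu_read`, `surface_cpu_write`) is hypothetical, and plain byte arrays stand in for emulated texture memory and for the host GPU texture.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch; none of these names come from the real code base. */
enum surface_state {
    SURFACE_SYNCED,    /* both copies agree */
    SURFACE_GPU_NEWER, /* rendered to; the emulated-memory copy is stale */
    SURFACE_MEM_NEWER  /* CPU wrote to memory; the GPU copy is stale */
};

#define SURFACE_BYTES 16

struct surface {
    uint8_t mem_copy[SURFACE_BYTES]; /* stand-in for emulated texture memory */
    uint8_t gpu_copy[SURFACE_BYTES]; /* stand-in for a host GPU texture */
    enum surface_state state;
    unsigned n_readbacks; /* GPU -> memory copies performed */
    unsigned n_uploads;   /* memory -> GPU copies performed */
};

static void surface_init(struct surface *s)
{
    assert(s != NULL);
    memset(s, 0, sizeof(*s));
    s->state = SURFACE_SYNCED;
}

/* Copy memory -> GPU only if the CPU has written since the last sync. */
static void surface_upload_if_stale(struct surface *s)
{
    if (s->state == SURFACE_MEM_NEWER) {
        memcpy(s->gpu_copy, s->mem_copy, SURFACE_BYTES);
        s->n_uploads++;
        s->state = SURFACE_SYNCED;
    }
}

/* Copy GPU -> memory only if something was rendered since the last sync. */
static void surface_readback_if_stale(struct surface *s)
{
    if (s->state == SURFACE_GPU_NEWER) {
        memcpy(s->mem_copy, s->gpu_copy, SURFACE_BYTES);
        s->n_readbacks++;
        s->state = SURFACE_SYNCED;
    }
}

/* Stand-in for a render pass: draws on the GPU copy, never reads back. */
static void surface_render(struct surface *s, uint8_t fill)
{
    surface_upload_if_stale(s); /* pick up pending CPU writes first */
    memset(s->gpu_copy, fill, SURFACE_BYTES);
    s->state = SURFACE_GPU_NEWER;
}

/* Using the surface as a texture needs only the GPU copy to be current. */
static uint8_t surface_sample(struct surface *s, unsigned off)
{
    assert(off < SURFACE_BYTES);
    surface_upload_if_stale(s);
    return s->gpu_copy[off];
}

/* A CPU read is the only point where a readback is forced. */
static uint8_t surface_cpu_read(struct surface *s, unsigned off)
{
    assert(off < SURFACE_BYTES);
    surface_readback_if_stale(s);
    return s->mem_copy[off];
}

/* A partial CPU write must start from the rendered contents. */
static void surface_cpu_write(struct surface *s, unsigned off, uint8_t val)
{
    assert(off < SURFACE_BYTES);
    surface_readback_if_stale(s);
    s->mem_copy[off] = val;
    s->state = SURFACE_MEM_NEWER;
}
```

With this shape, back-to-back render passes and render-to-texture sampling cost no readbacks at all; the copy back to memory happens only when the CPU touches the surface.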
Previously I've attempted to solve this problem the "easy" way by only copying
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self               self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
 10.76      4.22     4.22  52871895     0.00     0.00  on_packet_received
  8.39      7.51     3.29 508938409     0.00     0.00  memory_map_write_32
  6.14      9.92     2.41 422975160     0.00     0.00  pvr2_ta_fifo_poly_write_32
  5.58     12.11     2.19                              sh4_fpu_inst_fmov_indgeninc_fpu
  5.10     14.11     2.00 519802356     0.00     0.00  memory_map_read_float