@kingcons
Last active August 12, 2019 04:39
clones quickstart

Installing Clones

May need to install SDL2 first: brew install sdl2 (or apt-get install libsdl2-dev on Debian/Ubuntu).

brew install sbcl # or apt-get if you nasty playa
curl -O https://beta.quicklisp.org/quicklisp.lisp
sbcl --load quicklisp.lisp # and follow the prompts
git clone git@github.com:kingcons/clones.git ~/quicklisp/local-projects/clones.git

Running Clones

Start rlwrap sbcl and ...

(ql:quickload :clones)
(in-package :clones)
(change-game "roms/commercial/dk.nes") ;; this path is relative to the clones installation folder/git checkout
(step-frames 4)

Dumping Nametables

Make sure to do this on the mezzanine branch...

(ql:quickload :cl-json)
(defvar *nt* (clones.ppu::ppu-nametable (memory-ppu (cpu-memory *nes*))))
(cl-json:encode-json *nt*)

Dumping Images

Make sure to brew install libpng first and use my fork of cl-png...

git clone git@github.com:kingcons/cl-png.git ~/quicklisp/local-projects/cl-png

Then ...

(ql:quickload '(:zpng :clones))
(in-package :clones)
(change-game "roms/commercial/smb.nes")
(step-frames 30)
(defvar *image* (make-instance 'zpng:pixel-streamed-png :color-type :truecolor :width 256 :height 240))
(with-open-file (out "test.png" :element-type '(unsigned-byte 8) :direction :output
                 :if-does-not-exist :create :if-exists :supersede)
  (zpng:start-png *image* out)
  (dotimes (i (* 256 240))
    (let ((pixel (list (aref clones.ppu:*framebuffer* (+ (* i 3) 0))
                       (aref clones.ppu:*framebuffer* (+ (* i 3) 1))
                       (aref clones.ppu:*framebuffer* (+ (* i 3) 2)))))
      (zpng:write-pixel pixel *image*)))
  (zpng:finish-png *image*))

kingcons commented Aug 3, 2019

[Image: smb_title]

Looks like nametables are identical at frame 45 and I feel confident assuming pattern tables and palette are also identical. We could be handling palette transparency incorrectly. Or for some reason our background logic is off. ... 🤔

The lines in the background we're getting appear clearly at x: 115, y: 70 of Mario's title in the scaled canvas. So that should be x: 57 and y: 35 in the unscaled data. They appear to alternate scanlines, so inspecting the values of our pattern/tile at y: 35 and y: 36 for x: 57 should reveal the issue.
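The coordinate arithmetic above can be sketched in Common Lisp. This is only an illustration: the function names are made up, the 2x canvas scale is an assumption inferred from the numbers in this comment (115,70 to 57,35), and it assumes a scroll position of zero. None of it is code from Clones or rawbones.

```lisp
;; Hypothetical helpers; names are illustrative, not from Clones or rawbones.
(defun unscale (coordinate &optional (scale 2))
  "Map a scaled canvas coordinate back to a native NES pixel coordinate."
  (values (floor coordinate scale)))

(defun tile-position (pixel)
  "Split a pixel coordinate into a coarse (tile) index and a fine (in-tile) offset."
  (floor pixel 8))   ; two values: coarse index, fine offset

(defun describe-pixel (canvas-x canvas-y)
  "Return a plist with the native pixel and its coarse/fine tile coordinates."
  (let ((x (unscale canvas-x))
        (y (unscale canvas-y)))
    (multiple-value-bind (coarse-x fine-x) (tile-position x)
      (multiple-value-bind (coarse-y fine-y) (tile-position y)
        (list :x x :y y
              :coarse-x coarse-x :fine-x fine-x
              :coarse-y coarse-y :fine-y fine-y)))))
```

Calling (describe-pixel 115 70) puts the pixel on scanline 35 with coarse Y 4 and fine Y 3, so the tiles worth comparing are the ones fetched for that row and the row one scanline below it.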

(5 minutes later...) Yep, odd scanlines are finding the wrong background tiles, blank ones, which leads to the sprites showing through.

(10 minutes later...) This looks like a ScrollInfo bug! Apparently we're flipping the nametable index when I don't think we should be. It's alternating between 1 and 0. Even on a single scanline, apparently. I thought I tested this carefully based on what I saw on Nesdev but the data is pretty clear. I'll dig into this more after lunch. NT index 0 background is transparent. We should be reading from NT index 1. Does the scroll need to take mirroring into account somehow? 🤔

Scanline: 35
Coarse Y: 4
Fine Y: 3
BG Tile: [ 0, [ 0, 0, 0, 0, 0, 0, 0, 0 ] ]
NT index: 0
NT offset: 8192
NT byte: 72
Index: 328

Scanline: 36
Coarse Y: 4
Fine Y: 4
BG Tile: [ 1, [ 2, 2, 2, 2, 2, 2, 2, 2 ] ]
NT index: 1
NT offset: 9216
NT byte: 36
Index: 292
NT index: 0
NT offset: 8192
NT byte: 72
Index: 328
NT index: 1
NT offset: 9216
NT byte: 36
Index: 292


kingcons commented Aug 3, 2019

Yeah, so we weren't accounting for the mirroring type in ScrollInfo's next_tile and next_scanline methods. So every scanline we were flipping to an empty nametable, basically. See: https://wiki.nesdev.com/w/index.php/PPU_scrolling#Coarse_X_increment

"The coarse X component of v needs to be incremented when the next tile is reached. Bits 0-4 are incremented, with overflow toggling bit 10. This means that bits 0-4 count from 0 to 31 across a single nametable, and bit 10 selects the current nametable horizontally."

Selecting the next "horizontal nametable" should be done differently depending on whether the nametables are laid out vertically or horizontally. Mirroring matters!
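A rough Common Lisp sketch of the coarse X increment quoted above, plus one possible way to bring mirroring into it: keep the increment itself mirroring-agnostic and apply mirroring only when turning the logical nametable index (bits 10-11 of v) into a physical page. All function names here are hypothetical, and this is not how ScrollInfo's next_tile is actually written.

```lisp
;; Illustrative sketch; names are hypothetical, not taken from rawbones or Clones.
(defun increment-coarse-x (v)
  "Advance the coarse X field (bits 0-4) of the 15-bit PPU address V.
On overflow, wrap to 0 and toggle bit 10, the horizontal nametable select."
  (if (= (ldb (byte 5 0) v) 31)
      (logxor (dpb 0 (byte 5 0) v) #x0400)
      (1+ v)))

(defun logical-nametable (v)
  "Bits 10-11 of V select one of four logical nametables (0-3)."
  (ldb (byte 2 10) v))

(defun physical-nametable (logical mirroring)
  "Map a logical nametable index to one of the two physical 1 KiB pages."
  (ecase mirroring
    ;; Horizontal mirroring: $2000 = $2400 and $2800 = $2C00.
    (:horizontal (ash logical -1))
    ;; Vertical mirroring: $2000 = $2800 and $2400 = $2C00.
    (:vertical (logand logical 1))))
```

With horizontal mirroring, logical nametables 0 and 1 resolve to the same physical page, so toggling bit 10 at the end of a row should keep reading the same tile data rather than landing in an empty table.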


kingcons commented Aug 4, 2019

Did some performance investigation. On the epidernes side, the main performance improvement is likely not updating the Disassembly/NES component unless the render loop is stopped. If things are still choppy after that, we should manually run an effect that updates the canvas rather than going through state management. Frames are heavy!

On the rawbones side, dropping sprite handling killed over 50% of our runtime! We can definitely optimize better than we're currently doing. find_sprite seems to be the main culprit. We can probably trade a one time Array allocation in the context for 256 ints, each one representing the pixel of a sprite on the scanline. Then, every scanline we could clear the array, loop over the sprite candidates, and render them into it all in one go. We'd just have to modify pixel priority to pull pixels from the context.
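The per-scanline buffer idea could look something like the sketch below, written in Common Lisp for consistency with the rest of this gist. The sprite representation (a plist with :x and :pixels) and both function names are assumptions made up for the example. It also stores only a color index per pixel, whereas a real version would need to carry the priority bit and sprite-zero flag as well.

```lisp
;; Hypothetical data layout: each sprite candidate is a plist with :x (left edge)
;; and :pixels (8 color indexes for the current row, 0 meaning transparent).
(defun make-sprite-line ()
  "Allocate the 256-entry buffer once; reuse it for every scanline."
  (make-array 256 :element-type '(unsigned-byte 8) :initial-element 0))

(defun render-sprite-line (buffer candidates)
  "Clear BUFFER, then draw CANDIDATES into it. Earlier candidates win on overlap."
  (fill buffer 0)
  (dolist (sprite candidates buffer)
    (let ((left (getf sprite :x))
          (pixels (getf sprite :pixels)))
      (dotimes (i 8)
        (let ((x (+ left i))
              (color (aref pixels i)))
          ;; Skip off-screen pixels, transparent pixels, and pixels
          ;; already claimed by an earlier (higher priority) sprite.
          (when (and (< x 256)
                     (plusp color)
                     (zerop (aref buffer x)))
            (setf (aref buffer x) color)))))))
```

Pixel priority then becomes a single array lookup per x position instead of a search over the sprite candidates for every pixel.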

More importantly, we've got some palette issues that still need to be fixed (visible in DK, ex: 68,43 should have same color as 66,51 and 90,50 should match 100,50; visible in SMB, ex: 35,200 should be same color as 30,200) and there's still some incorrect scrolling behavior that could be caused by periodic resets to CoarseX(0)/CoarseY(0). Could be something else but merits further investigation.


kingcons commented Aug 4, 2019

Fixed a few bugs this morning. Mostly seems like stuff we would've caught sooner if we had good tests. We were picking the wrong high bits for background tiles; the switch statement was just mismatching cases. Then, we weren't switching on the right values for determining quadrant. We were testing whether coarse_x and coarse_y themselves were odd when we should've checked whether coarse_x / 2 and coarse_y / 2 were even or odd, since each "area" in a quad is a 2x2 tile square.

i.e.

|    | 00 | 01 | 02 | 03 | < coarse_x
| 00 | tl | tl | tr | tr |
| 01 | tl | tl | tr | tr |
| 02 | bl | bl | br | br |
| 03 | bl | bl | br | br |
  ^ coarse_y
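The table translates into a small Common Lisp sketch. The function names are hypothetical; the bit layout of the attribute byte (top-left in bits 0-1, top-right in bits 2-3, bottom-left in bits 4-5, bottom-right in bits 6-7) is the standard NES one.

```lisp
;; Illustrative helpers; the names are hypothetical, not from rawbones or Clones.
(defun quadrant (coarse-x coarse-y)
  "Return which 2x2-tile area of its 4x4-tile attribute block a tile falls in."
  (let ((right-p (logbitp 1 coarse-x))    ; same as (oddp (floor coarse-x 2))
        (bottom-p (logbitp 1 coarse-y)))  ; same as (oddp (floor coarse-y 2))
    (cond ((and bottom-p right-p) :br)
          (bottom-p :bl)
          (right-p :tr)
          (t :tl))))

(defun palette-high-bits (attribute-byte coarse-x coarse-y)
  "Extract the 2-bit palette selector for the tile from its attribute byte."
  (let ((shift (ecase (quadrant coarse-x coarse-y)
                 (:tl 0) (:tr 2) (:bl 4) (:br 6))))
    (ldb (byte 2 shift) attribute-byte)))
```

Testing bit 1 rather than bit 0 is the whole fix: bit 0 only says whether the tile is in the left or right column of its 2x2 area, not which area it is in.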


kingcons commented Aug 6, 2019

The bulk of our remaining scrolling problems are coming from resetting the scroll position to zero. Like a lot.

I added some logging to our STA instructions when they targeted PPUSCROLL and found that writes to $2005 do seem responsible. Based on the lovely disassembly here (https://gist.github.com/1wErt3r/4048722) I found that the two locations doing the writes are InitScroll at 0x8ee6 and SkipSprite0 at 0x815c. I confirmed those addresses using the logged pc from our STA instruction and the Clones disassembler. As far as I can tell, InitScroll always writes zeroes and SkipSprite0 is only writing zeroes for the second (vertical) byte of the scroll, which is fine. Now, to figure out why InitScroll is being called so much. At the moment I'm a bit suspicious of our Hblank and Vblank timings...
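For reference, the kind of logging described here might be sketched like this in Common Lisp. Everything in it is hypothetical (the variable names, the function names, the entry format); it is not the instrumentation that was actually added. It also tracks the first/second write toggle on its own, ignoring that on real hardware the toggle is shared with $2006 and reset by reads of $2002.

```lisp
;; Hypothetical logger; *scroll-writes* and these function names are not part
;; of Clones or rawbones.
(defvar *scroll-writes* '()
  "Most recent first: entries of the form (pc axis value).")

(defvar *write-toggle* nil
  "NIL when the next $2005 write sets X scroll, T when it sets Y scroll.")

(defun log-scroll-write (pc value)
  "Record a write to $2005 and flip the first/second write toggle."
  (push (list pc (if *write-toggle* :y :x) value) *scroll-writes*)
  (setf *write-toggle* (not *write-toggle*)))

(defun writes-by-pc ()
  "Count logged writes per program counter as an alist of (pc . count)."
  (let ((counts '()))
    (dolist (entry *scroll-writes* counts)
      (let ((cell (assoc (first entry) counts)))
        (if cell
            (incf (cdr cell))
            (push (cons (first entry) 1) counts))))))
```

Grouping by pc is what makes it easy to see that only two call sites are involved and which of them is writing zeroes to which axis.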


kingcons commented Aug 12, 2019

Instrumented ANESE today and found that it is subject to the same InitScroll writes as we are so it probably isn't a timing issue. I'm becoming more suspicious of how we approach Nametable Mirroring in general. I wouldn't be mad at double checking both mirroring as detected in ROM parsing and the VRAM reading code to make sure what it's doing seems reasonable. Provided it does, we're looking at seeing if there's something wrong with how we read the ScrollInfo to compute the NT bytes to read for the background and how that stacks up to the "shift registers" approach on PPU rendering.
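As a starting point for the ROM parsing check, here is a minimal Common Lisp sketch of reading the mirroring type from an iNES header. The function name is hypothetical and this is not the parser used in Clones or rawbones; the bit meanings follow the iNES format, where byte 6 carries the mirroring flags.

```lisp
;; Illustrative sketch; the function name is hypothetical.
(defun header-mirroring (header)
  "Return the nametable mirroring declared by a 16-byte iNES HEADER.
Byte 6, bit 3 set means four-screen VRAM; otherwise bit 0 selects
vertical (1) or horizontal (0) mirroring."
  (let ((flags (aref header 6)))
    (cond ((logbitp 3 flags) :four-screen)
          ((logbitp 0 flags) :vertical)
          (t :horizontal))))
```

Comparing this value against what the VRAM reading code assumes for a given ROM is a cheap first check before digging into the ScrollInfo logic. Some mappers can also change mirroring at runtime, which a header check alone will not catch.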
