H<hgoel> to continue, my previous thought: still pretty doubtful, there isn't really any incentive to merge the two
4:40 PM design wise, it's IMO far more convenient to keep them separate
4:40 PM B<bcos_> Based on recent experiments; I'm "relatively convinced" I can do extremely high quality software rendering on modern 8-core CPUs
4:41 PM ..which means, if CPU performance doubled we'd have enough CPU to not bother with GPU at all
4:42 PM H<hgoel> would that include stuff like in modern games?
4:42 PM B<bcos_> Depends what stuff
4:42 PM H<hgoel> mostly lighting equations
4:42 PM plus, if CPU math speeds double, then most likely said doubling can be applied to GPUs and we're back at the same spot
4:43 PM B<bcos_> I was doing full lighting/shadow, focal blur and "infinite supersampling"
4:43 PM ..but no reflections
4:44 PM H<hgoel> in a ray tracer?
4:44 PM B<bcos_> Yes
4:45 PM H<hgoel> how are you handling polys?
4:45 PM and lighting?
4:45 PM B<bcos_> All done on polys
4:46 PM H<hgoel> I mean, any acceleration structures?
4:46 PM B<bcos_> Acceleration structures?
4:46 PM H<hgoel> to speed up determining which polygon a ray is intersecting with
4:47 PM B<bcos_> I split the world into 6 faces (top, bottom, left, ..) and split each face into tiles
4:47 PM H<hgoel> naive solution for instance is to loop over all the polys in a scene checking for intersections
4:47 PM B<bcos_> (64*64 tiles per face)
4:48 PM Then determine which polys affect which tiles; and do "for each pixel/ray ( determine tile, process list of polys for that tile )"
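[Editor's sketch of the tile scheme described above. This is not bcos_'s code: it flattens the idea to a single square face with 2D triangles, and the names `tile_of`, `bin_triangles` and `triangles_at`, plus the bounding-box binning rule, are assumptions made for illustration. The log does not say how polys are actually assigned to tiles.]

```python
import math
from collections import defaultdict

TILES = 64  # tiles per side of one face ("64*64 tiles per face")

def tile_of(x, y, size):
    """Map a point on a size-by-size face to its (column, row) tile, clamped to the grid."""
    tx = max(0, min(math.floor(x * TILES / size), TILES - 1))
    ty = max(0, min(math.floor(y * TILES / size), TILES - 1))
    return tx, ty

def bin_triangles(triangles, size):
    """Assumed binning rule: list each triangle in every tile its bounding box touches."""
    bins = defaultdict(list)
    for index, tri in enumerate(triangles):
        xs = [p[0] for p in tri]
        ys = [p[1] for p in tri]
        if max(xs) < 0 or max(ys) < 0 or min(xs) >= size or min(ys) >= size:
            continue  # entirely off this face
        tx0, ty0 = tile_of(min(xs), min(ys), size)
        tx1, ty1 = tile_of(max(xs), max(ys), size)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[(tx, ty)].append(index)
    return bins

def edge(a, b, p):
    """Signed area test: which side of the edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def covers(tri, p):
    """True if p is inside the triangle (either winding order)."""
    a, b, c = tri
    e0, e1, e2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
    return (e0 >= 0 and e1 >= 0 and e2 >= 0) or (e0 <= 0 and e1 <= 0 and e2 <= 0)

def triangles_at(bins, triangles, x, y, size):
    """For one pixel/ray: determine the tile, then test only that tile's list of polys."""
    candidates = bins.get(tile_of(x, y, size), [])
    return [i for i in candidates if covers(triangles[i], (x, y))]
```

[The point of the structure is that `triangles_at` never looks at polys outside the pixel's tile, which is also why per-pixel cost tracks "number of polygons in tile" rather than total scene size.]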
4:49 PM H<hgoel> I see, how well would that work with animations though?
4:50 PM B<bcos_> For "render entire scene from scratch" I was estimating ~15 frames per second on 8-core Haswell
4:51 PM H<hgoel> at what resolution?
4:51 PM B<bcos_> ..with a bunch of tricks to recycle work ("partial render") to get better frame rates
4:51 PM At 1920*1600
4:51 PM H<hgoel> actually, resolution shouldn't matter a lot
4:52 PM hmm
4:52 PM B<bcos_> You can split it into 3 main parts: polygon processing (depends on number of polygons and number of lights), pixel generation (depends on number of pixels), and post-processing (also number of pixels)
4:53 PM H<hgoel> thing is, since you're doing 'for each pixel/ray' for each tile, which is easily parallelized, a gpu would be able to render out many more tiles per second than a cpu simply due to larger concurrency
4:54 PM B<bcos_> Pixel generation was very sensitive to "number of polygons in tile" too; but I was going to shift to "dynamic number of tiles" thing to help fix that
4:54 PM H<hgoel> what you're describing is kind of what Blender cycles does, but the GPU version when tuned correctly outpaces CPUs basically all the time, with much fancier effects
4:55 PM B<bcos_> And?
4:55 PM H<hgoel> generally, smaller tile sizes lead to better CPU performance (up to a certain point IIRC) whereas larger tile sizes lead to better GPU performance
4:56 PM B<bcos_> If CPU can do it fast enough, then GPU might be able to do it faster but nobody would care because CPU can do it fast enough
4:56 PM It's like..
4:57 PM You could build a car with a jet engine that can do 400 Km/hour, but it'll be expensive and nobody needs to drive that fast so...
4:57 PM H<hgoel> I disagree, with graphics, if you can do something faster, it means you now have more room for even more accurate effects
4:57 PM so the more performance available, the better
4:58 PM so the GPU being able to do something faster does indeed matter
4:58 PM B<bcos_> There is no "more accurate"
4:58 PM H<hgoel> there is, more physically correct lighting equations for instance
4:58 PM B<bcos_> (unless you mean things like reflection and refraction)
4:58 PM H<hgoel> yeah
4:58 PM that's what I mean
4:59 PM B<bcos_> Excluding reflection and refraction, it's impossible to get "more correct" than what I was doing (regardless of performance)
4:59 PM For reflection and refraction, I can't figure out how to implement it
5:00 PM (excluding "pure mirror")
5:00 PM H<hgoel> lots of the things where GPUs are considered essential are like that, things where the application just keeps expanding to fill the room available
5:00 PM reflection would involve creating a new ray at the point at which the ray from the camera gets reflected and figuring out what it hits
5:01 PM B<bcos_> For accurate reflection it mostly becomes "for every pixel, render the entire scene from that pixel's point of view"
5:01 PM H<hgoel> similarly with refraction you'd use the indices of refraction to determine the direction of the new ray spawned
5:01 PM yeah
5:02 PM B— bcos_ nods - anything that causes a ray to change direction becomes "ouch"
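[Editor's sketch of the two "new ray" directions hgoel describes: the mirror reflection and the Snell's-law refraction. The function names and the conventions are assumptions for illustration: `d` is a unit direction pointing toward the surface, `n` is a unit normal pointing back against it, and `n1`/`n2` are the indices of refraction on the incoming and outgoing sides.]

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(d, n):
    """Mirror direction: r = d - 2 (d . n) n."""
    k = 2.0 * dot(d, n)
    return tuple(di - k * ni for di, ni in zip(d, n))

def refract(d, n, n1, n2):
    """Refracted direction from Snell's law, or None on total internal reflection."""
    eta = n1 / n2
    cos_i = -dot(d, n)                      # cosine of the incident angle
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None                         # no transmitted ray exists
    cos_t = math.sqrt(1.0 - sin2_t)
    return tuple(eta * di + (eta * cos_i - cos_t) * ni for di, ni in zip(d, n))
```

[Computing the new direction is the cheap part. The "ouch" is that each spawned ray then has to be traced against the scene again, which the per-tile lists built for the camera's rays do not directly help with.]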
5:03 PM H<hgoel> that's why reflections are a huge pain. how games approximate them these days is by rendering a cubemap of the world without reflections, then when rendering the scene, you just pick the cubemap pixel in the direction of the reflected ray. it's a huge approximation though
5:04 PM but it's better than no reflections and usually looks acceptable
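[Editor's sketch of the cubemap lookup hgoel describes: given a reflected direction, choose the face along the dominant axis and the texel within it. The function name is hypothetical, and the per-face orientation follows the commonly used OpenGL-style layout, which should be treated as an assumption here; other APIs and engines orient faces differently.]

```python
def cubemap_lookup(direction, face_size):
    """Return (face, column, row) of the cubemap texel that a direction points at."""
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax == 0 and ay == 0 and az == 0:
        raise ValueError("direction must be non-zero")
    # The axis with the largest magnitude selects the face; the other two
    # components become in-face coordinates sc, tc in [-ma, ma].
    if ax >= ay and ax >= az:
        face, ma, sc, tc = ("+x" if x > 0 else "-x"), ax, (-z if x > 0 else z), -y
    elif ay >= az:
        face, ma, sc, tc = ("+y" if y > 0 else "-y"), ay, x, (z if y > 0 else -z)
    else:
        face, ma, sc, tc = ("+z" if z > 0 else "-z"), az, (x if z > 0 else -x), -y
    u = (sc / ma + 1.0) / 2.0   # remap [-1, 1] to [0, 1]
    v = (tc / ma + 1.0) / 2.0
    col = min(int(u * face_size), face_size - 1)
    row = min(int(v * face_size), face_size - 1)
    return face, col, row
```

[The lookup depends only on the ray's direction, not on where the reflecting surface is, since the cubemap was rendered from one fixed point. That is what makes it cheap and also what makes it "a huge approximation".]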
5:05 PM but obviously, if a perf boost meant we could render the scene say 1000 times, they'd definitely expand the system to improve the approximation
5:05 PM and so on
5:07 PM B— bcos_ hates those dodgy tricks
5:08 PM B<bcos_> - they always break in various situations, and they break the abstraction
5:08 PM Hrm
5:08 PM H<hgoel> I love that kind of stuff, I find a certain kind of beauty in devising clever tricks to deal with technical limitations
5:09 PM B<bcos_> My approach is "app creates generic scene" then "video driver does all rendering", so that there's an impenetrable abstraction between app and video driver/hardware
5:10 PM H<hgoel> so like the fixed function opengl of the old days?
5:10 PM B<bcos_> Sort of, but different
5:11 PM (different, because it's a "4D" thing)
5:12 PM H<hgoel> I like the new programmable system, simply because it's completely flexible, I know best about what I'm trying to do, so I can squeeze out as much performance as I need
5:12 PM B<bcos_> It's a crippled joke
5:13 PM H<hgoel> how's it crippled?
5:13 PM B<bcos_> When I see "minimum requirements: Nvidia model X, AMD model Y" I want to stab game developers repeatedly in the face
5:13 PM H<hgoel> lol
5:14 PM B<bcos_> For fun, try putting an AMD GPU and an NVidia GPU into a modern computer (with integrated Intel); and then having the same game stretched across 3 monitors where each monitor is connected to a different GPU from a different vendor
5:15 PM H<hgoel> I think that's just a natural part of things, sort of like how a 64-bit program that needs to run at at least a certain number of executions per second will have a minimum CPU model required, simply because anything older is either too slow or doesn't support the required features
5:15 PM B<bcos_> ^ that is relatively trivial for my API to support
5:15 PM More specifically...
5:15 PM H<hgoel> that kind of thing has gotten better with Vulkan/DX12
5:16 PM B<bcos_> My API is designed for "3 monitors attached to 3 completely different computers", where a single app can be stretched across those 3 monitors
5:17 PM H<hgoel> although it would likely still be insane to set up, wouldn't be surprised if all three companies go out of their way to make it as difficult as possible to get their drivers to cooperate and not mess with each other
5:18 PM well, except maybe the Intel ones
5:20 PM B— bcos_ could (in theory) have a 4*4 grid of monitors covering a wall, with each monitor using a different video mode and refresh rate, with an eclectic assortment of computers and video cards (including a mixture of ARM and 80x86, etc); and stretch the same 3D game across all 16 monitors
5:21 PM B<bcos_> ..and the app/game wouldn't know about any of this
5:21 PM (it'd just do the same "generate scene" regardless)
5:22 PM H<hgoel> if the drivers could be made to get along, that'd likely work fine even on windows
5:22 PM B<bcos_> Nonsense
5:22 PM H<hgoel> won't even talk about linux since the graphics driver situation there is just terrible
5:23 PM B<bcos_> All existing/modern graphics APIs are far too low level and force the app/game to deal with "low level hardware details that should've been abstracted but weren't"
5:23 PM H<hgoel> ah, I misread part of your statement, in a networked context, it would be difficult
5:24 PM well, I ought to get back to work on my os
5:25 PM B— bcos_ is supposed to be doing code to obtain and index ACPI tables
5:25 PM B<bcos_> ..which is boring :-(
5:25 PM H<hgoel> haha yeah
5:26 PM that was one thing I just copied from my previous kernel and haven't looked at since