@asumagic
Last active June 26, 2020 20:08
KAG Staging build

Improvements

A bunch of technical explanations inside! If you don't understand everything, what has been emphasized is the most important to know, but this should be interesting for those of you who have been waiting for this build.

64-bit

KAG is now built for x86-64 (i.e. "64-bit").

This means that you now need a 64-bit operating system to run KAG.
While this is almost always the case, in the unlikely event that your system is still 32-bit, you may need to upgrade to a 64-bit OS, provided your CPU supports it.

This comes with overall improved performance (and usually better tooling support for devs).

SDL2 Windowing & Input

Windowing and input are now handled by SDL2 rather than by Irrlicht (though Irrlicht is still used for various other things in the engine). This allows:

  • Borderless fullscreen. This improves fullscreen behavior on all platforms, especially Linux.
  • Resizable game window and toggleable fullscreen (F4).
  • Cursor grabbing is implemented, so your cursor will be constrained to your window, as long as you are playing (excluding menus and scoreboard, for instance).
  • Multiple monitor setups will behave better on all platforms. Cursor grabbing is also enabled in fullscreen so that you do not accidentally click on your other monitors.
  • Text input works more reliably. This is particularly significant for Linux users, where some keyboard layouts (e.g. AZERTY) would misbehave a lot.
  • The launcher window closes properly on Linux now.
  • The window icon now shows properly on all platforms.

Build toolchain changes

KAG is now built using MinGW on Windows instead of MSVS. The compilers now in use are gcc on Windows and clang on Linux, both in up-to-date versions. This has a few direct advantages:

  • Improved performance, catching up on almost a decade of optimizer improvements, including LTO (link-time optimization).
  • C++11/14/17 support for us, which was a requirement for some of the engine changes.

Updated Box2D

Box2D, the physics engine, was updated to the latest available version.

Some benefits:

  • Potentially slightly improved performance, in particular during physics intensive scenes.
  • The glitch that caused you to "glitch out" when landing on the ground was fixed, for whatever cursed reason.

Updated AngelScript

The AngelScript scripting engine, which KAG uses for scripting, was updated.

  • New language features for modders: Anonymous functions, auto type deduction, list initializers and others.
  • Improved loading times due to much faster script compilation.

AngelScript JIT compiler

A JIT compiler developed by BlindMindStudios is in use on the Linux client and server builds, which significantly improves performance in certain cases. Due to issues on the Windows platform, it is not used there.

asllvm is under development and may ultimately replace this JIT compiler for better debuggability, performance and Windows support.

Reworked profiler

The performance profiler has been mostly rewritten to provide information in a tree-like structure.

Here is an example of a performance report:

[20:28:09]  100.0% <loop>
[20:28:09]   81.5% <render>
[20:28:09]     47.3% <world>
[20:28:09]       47.3% <map>
[20:28:09]         34.8% <tilemap>
[20:28:09]         5.2% <spriterender>
[20:28:09]           3.5% <batching>
[20:28:09]           1.7% <buffer>
[20:28:09]             1.4% <update>
[20:28:09]             0.3% <render>
[20:28:09]         3.8% <lightmap>
[20:28:09]           2.3% <render>
[20:28:09]           1.5% <prepare>
[20:28:09]         2.7% <water>
[20:28:09]         0.4% <fire>
[20:28:09]         0.4% <cparticle:render:back>
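A report shaped like this can be produced by walking a tree of timed scopes and printing each node as a share of the total. The following is only a minimal sketch of that idea, not the engine's profiler: `ProfileNode` and `format_report` are hypothetical names, and the timings are supplied by hand instead of being measured.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type: one entry per profiled scope.
struct ProfileNode {
    std::string name;
    double seconds;  // time spent in this scope, children included
    std::vector<std::unique_ptr<ProfileNode>> children;

    ProfileNode(std::string n, double s) : name(std::move(n)), seconds(s) {}

    // Adds a child scope and returns it so that calls can be chained.
    ProfileNode& add(std::string child_name, double child_seconds) {
        children.push_back(
            std::make_unique<ProfileNode>(std::move(child_name), child_seconds));
        return *children.back();
    }
};

// Appends one line per node, indented by two spaces per depth level,
// expressed as a percentage of `total` (long names get truncated).
void format_report(const ProfileNode& node, double total, int depth,
                   std::vector<std::string>& out) {
    assert(total > 0.0);
    char buf[128];
    std::snprintf(buf, sizeof buf, "%*s%.1f%% <%s>", depth * 2, "",
                  100.0 * node.seconds / total, node.name.c_str());
    out.emplace_back(buf);
    for (const auto& child : node.children)
        format_report(*child, total, depth + 1, out);
}
```

Calling `format_report(root, root.seconds, 0, lines)` on the root scope fills `lines` with one indented entry per scope, parents before children.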

Reworked script error reporting

Script error logging was improved in order to provide more context. Displayed colors were tweaked in order to make them even more readable.

Colored demo

The notorious "missing ;?" message issue has been fixed; it was actually a message that the engine was overriding, often incorrectly.
The script console now has improved error logging as well, and will stop hiding error messages that have already occurred.

Script execution trace changes

Line numbers are now displayed for the backtrace of the current script, which helps tracing back script exceptions.

[19:45:38] PRINTING SCRIPT EXECUTION TRACE
[19:45:38] Tip: You can fetch callstack and scriptstack info from string[]@ getCallStack() and string[]@ getScriptStack().
[19:45:38]
[19:45:38] Callstack for current script: standardcontrols
[19:45:38] #1: Line 146: void onCommand(CBlob@, uint8, CBitStream@)
[19:45:38] 
[19:45:38] Script stack (nested script execution, i.e. when causing hook calls from hooks):
[19:45:38] #1: automat
[19:45:38] #2: standardcontrols

Restart actually restarts the game rather than forking

NOTE: This fix currently only applies for the Linux platform (client and dedicated server).

KAG used to fork the process instead of properly restarting it when a game restart was required. This caused /restartserver or autoupdate restarts to leave an extra useless KAG process in memory, and in the case of a client left a blank window running in the background.
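The text does not say how the restart is implemented. One common way to restart in place on Linux is to replace the current process image with `execv`, which leaves no extra process behind. The sketch below assumes that approach; `make_argv` and `restart_in_place` are hypothetical helpers, not engine functions.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <unistd.h>
#include <vector>

// Builds the null-terminated pointer array that execv expects.
// The pointers refer to the strings in `args`, which must outlive the result.
std::vector<char*> make_argv(std::vector<std::string>& args) {
    std::vector<char*> argv;
    for (auto& arg : args)
        argv.push_back(&arg[0]);
    argv.push_back(nullptr);
    return argv;
}

// Replaces the running process with a fresh instance of the program.
// On success this never returns; on failure it returns errno.
// args[0] is assumed to be a usable path to the executable.
int restart_in_place(std::vector<std::string> args) {
    assert(!args.empty());
    std::vector<char*> argv = make_argv(args);
    execv(argv[0], argv.data());
    return errno;  // only reached if execv failed
}
```

Unlike fork-then-exec without exiting the parent, the process ID stays the same and the old instance cannot linger.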

New audio engine

KAG has been ported to a different audio engine built on SoLoud.

Some key advantages:

  • The sound behavior is now much more consistent across platforms. Linux users in particular were suffering from weird positional audio, missing sounds, library issues and crackling noises.
  • In the future, this may allow exposing more features for modding, including effects or custom sound generators (such as sfxr). There are a few implemented already.

Particle and sprite transparency

Partially transparent sprites should now work for things like bubbles.
It mostly works, but due to how sprites are rendered, this can cause visual artifacts related to Z ordering.

New lighting and map buffer logic updates

This is a big one! Lighting has been completely overhauled for nicer looks, flexibility and bug fixing. This also allowed modifying a lot of map rendering logic.

Some 1280x720 screenshots: Screenie 1 Screenie 2 Screenie 3

The most significant difference is that lighting is applied to the screen with some blend mode trickery.

  • Map buffer updates (i.e. updating the appearance of tiles) are much faster and are required much less often, which should help reduce stuttering.
  • Lights are now rendered using a custom texture that is applied over a low-resolution (~1/6 the screen resolution) intermediate texture. In the future, this will allow much more flexibility for modders, and it greatly reduces the performance cost of having many dynamic light sources.
  • Sky and ambient light calculation was greatly improved. The ambient light is noticeably better during mornings/evenings/the night.
  • Lighting applied on larger sprites looks much better as it is no longer calculated per-vertex or per-sprite.
  • For the same reason, water lighting looks much smoother.
    Water smoothness demo
  • Light updates are now much more reliable, which makes switching lanterns on/off look better and much snappier, and prevents weird flickering issues you could observe, particularly on bigger maps. This also fixes some issues like tunnels appearing in full brightness underground for a frame.
  • Light shafts now dim depending on the shaft height. This subjectively looks better in general and was partly done as a performance optimization because of how the lighting updates are performed (so much smaller regions can be updated at a time).
  • Light propagation was reworked. The smoothing pass should make it look nicer and more natural.
    This screenshot demonstrates the overall lightmap rendering, light shafts and light propagation improvements.
    Propagation demo
  • The "ambient occlusion" effect you can observe on background tiles (which was reworked) is currently disabled when using faster graphics, as it has a (negligible) impact on performance. It does not look very different, but here is an example:
    Aphe RP AO demo
  • The daytime cycle should look much smoother, provided the gamemode uses it. Firstly, the ambient lighting color is now determined while rendering the intermediate lighting texture rather than by requiring a lightmap update. Secondly, the time of day is interpolated for rendering, which makes it even smoother.
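Interpolating the time of day for rendering can be illustrated with a small sketch. It assumes the day time is a value that wraps around in [0, 1) and that the renderer knows how far the current frame lies between two game ticks; `interpolate_daytime` and its parameters are hypothetical, not the engine's API.

```cpp
#include <cassert>
#include <cmath>

// Returns the day time to use for a rendered frame.
// `previous` and `current` are the day times at the last two game ticks,
// `alpha` is the position of the frame between those ticks, in [0, 1].
float interpolate_daytime(float previous, float current, float alpha) {
    assert(alpha >= 0.0f && alpha <= 1.0f);
    // Take the shortest way around the wrap point (midnight).
    float delta = current - previous;
    if (delta < -0.5f) delta += 1.0f;
    if (delta > 0.5f) delta -= 1.0f;
    const float t = previous + delta * alpha;
    return t - std::floor(t);  // wrap back into [0, 1)
}
```

Without the wrap handling, a frame rendered between 0.9 and 0.1 would briefly jump back to midday instead of passing through midnight.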

Blob tick throttling

Two fields, tickInterval and offscreenTickInterval, were added to CBlob in order to throttle the tickrate of specific blobs. This avoids a lot of unnecessary processing, but requires explicit usage in scripts, which will be done for the base game.
While it was already possible to throttle tick execution for scripts, this was not possible for entire blobs.
Note that this new functionality goes beyond just disabling execution of scripts and allows skipping most processing of blobs, which can be undesirable for many kinds of blobs.

A negative interval completely disables ticking of the entity. An interval of 2 means one blob tick happens every 2 ticks, etc.
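The interval rule can be written down as a small predicate. This is only a sketch of the semantics described above, not engine code: `should_tick` is a hypothetical helper, the phase (which of the N ticks is the one that runs) is an assumption, and an interval of 0 is treated as invalid since the text does not define it.

```cpp
#include <cassert>
#include <cstdint>

// Decides whether a blob with the given interval is ticked on `game_tick`.
bool should_tick(std::uint32_t game_tick, int interval) {
    assert(interval != 0);            // assumption: 0 is not a meaningful interval
    if (interval < 0) return false;   // negative: ticking is disabled entirely
    if (interval == 1) return true;   // every tick
    // Interval N: one blob tick every N game ticks.
    return game_tick % static_cast<std::uint32_t>(interval) == 0;
}
```

With an interval of 2, exactly half of the game ticks result in a blob tick; with -1, none do.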

getTicksSinceStarted() only counts ticks that were not skipped. (TODO?...)
While physics for a given body are still updated by Box2D (as it is external to blob processing per se), getPosition() will not be updated.

Offscreen tick throttling

When the blob is considered out of screen (in a way you should not rely on), ticks will be skipped according to offscreenTickInterval. This also disables CSprite ticking, whereas tickInterval does not.

Ticking for most out of screen blobs should be throttled as much as possible, as this can help a lot with performance at little risk of bugs. This should generally be avoided for highly dynamic blobs like arrows or ballista bolts, though. This may also cause desyncs when dealing with blobs incrementing a counter in their onTick or anything similar.
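The desync risk can be made concrete with a toy simulation. It assumes that whether a blob is on screen differs from one machine to another and that the offscreen interval simply replaces the regular one while the blob is off screen; `ThrottledBlob` and `simulate` are hypothetical and only model a counter incremented in onTick.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a blob whose script counts its own ticks.
struct ThrottledBlob {
    int tick_interval = 1;
    int offscreen_tick_interval = 1;
    std::uint32_t on_tick_calls = 0;  // counter incremented in onTick
};

// Runs `ticks` game ticks and returns how many onTick calls happened.
std::uint32_t simulate(ThrottledBlob& blob, std::uint32_t ticks, bool on_screen) {
    const int interval =
        on_screen ? blob.tick_interval : blob.offscreen_tick_interval;
    assert(interval != 0);  // assumption: 0 is not a meaningful interval
    for (std::uint32_t t = 0; t < ticks; ++t) {
        if (interval < 0) continue;  // ticking disabled
        if (interval == 1 || t % static_cast<std::uint32_t>(interval) == 0)
            ++blob.on_tick_calls;
    }
    return blob.on_tick_calls;
}
```

Over the same 30 game ticks, a machine that sees the blob counts 30 calls while one that does not (offscreen interval 10) counts 3, so any logic driven by that counter diverges. Deriving timing from the game time rather than from a per-blob counter avoids this.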

Removed launcher

The launcher is now removed - other than for autoupdate on standalone builds. Launching the game now skips directly to the main menu.
The only worthwhile features there were the fullscreen and screen resolution options. The former was moved to video settings; the latter was removed as it is irrelevant for windowed fullscreen.

New script timeout detection

This is still experimental and buggy, but g_timeoutscripts has been updated with better logging and much improved performance (it was previously close to unusable due to performance issues), which should help modders locate infinite loops in their code.

Improved yielding/sleeping logic

When the engine judges performance to be good enough, it may sleep for a millisecond so it does not use a complete CPU core when it does not need to.

However, it used to wait 1ms+ between frames in certain scenarios where the FPS was above 30, which could cause stutter issues.
There was another occurrence of a 1ms+ wait in network code, which was solved.

Now, depending on VSync and some other parameters, it just yields execution to other processes, which has a much lower impact on frame pacing, while not affecting CPU usage.
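The difference between the two waiting strategies can be shown with the standard C++ threading primitives. This sketch does not reproduce the engine's decision logic, which the text only outlines; `WaitPolicy`, `wait_between_frames` and `measure_wait_us` are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

enum class WaitPolicy { SleepOneMs, Yield };

// Waits between two frames using the given strategy.
void wait_between_frames(WaitPolicy policy) {
    if (policy == WaitPolicy::SleepOneMs) {
        // Blocks for at least 1 ms; the scheduler may wake the thread later.
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    } else {
        // Only offers the CPU to other runnable threads; returns right away
        // if nothing else wants to run.
        std::this_thread::yield();
    }
}

// Measures how long one wait takes, in microseconds.
long long measure_wait_us(WaitPolicy policy) {
    const auto start = std::chrono::steady_clock::now();
    wait_between_frames(policy);
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start)
        .count();
}
```

Because a sleep has a lower bound but no upper bound on its duration, it can eat a noticeable part of a frame budget, whereas a yield usually costs far less than a millisecond.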

Servers may also consume less CPU at idle due to improved logic there as well.

Font rendering optimizations

A lot of the font logic around rendering and getting text dimensions is now cached.
Several unnecessary heavy calculations have been avoided, so GUI rendering should be somewhat faster now.
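Caching text dimensions can be sketched as a lookup table keyed by the string. The following is a simplified illustration under stated assumptions, not the engine's implementation: `TextSizeCache` is a hypothetical class and it uses fixed-width glyph metrics in place of a real font.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

struct TextSize {
    int width;
    int height;
};

// Remembers the measured size of each string so repeated GUI queries
// do not redo the layout work.
class TextSizeCache {
public:
    // Fixed-width metrics stand in for real font data (assumption).
    TextSizeCache(int glyph_width, int line_height)
        : glyph_width_(glyph_width), line_height_(line_height) {
        assert(glyph_width > 0 && line_height > 0);
    }

    TextSize measure(const std::string& text) {
        const auto it = cache_.find(text);
        if (it != cache_.end()) return it->second;  // cache hit
        ++misses_;
        const TextSize size = compute(text);
        cache_.emplace(text, size);
        return size;
    }

    std::size_t misses() const { return misses_; }

private:
    // Widest line times glyph width, line count times line height.
    TextSize compute(const std::string& text) const {
        int lines = 1, current = 0, widest = 0;
        for (char c : text) {
            if (c == '\n') { ++lines; current = 0; }
            else if (++current > widest) widest = current;
        }
        return {widest * glyph_width_, lines * line_height_};
    }

    int glyph_width_;
    int line_height_;
    std::size_t misses_ = 0;
    std::unordered_map<std::string, TextSize> cache_;
};
```

A real cache would also need invalidation when the font or its size changes, which this sketch leaves out.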

Debug menu + imgui TODO

F5 will now open a brand new debug menu.

Misc. engine cleanups and improvements

Not much to say about it - lots of dead code removal, modernization and minor optimizations make the engine a tiny bit cleaner and more maintainable, along with small performance improvements for the game.

Here is an incomplete list of small improvements that have not been mentioned before:

  • Juxta (.dll) and KAG (.exe) were merged into the KAG binary, which is more convenient and may reduce executable size a bit.
  • EASTL was removed.
  • Hashmaps are used when possible. The robin-hood-hashing library is used as it offered more consistent performance.
  • Autoconfig & the console are now loaded much earlier. This allows some stuff like moving resolution configuration to autoconfig or having more consistent log output for certain things.
  • Slightly improved performance for blob rendering when g_debug is off.
  • Cut down the CBlob memory footprint by removing legacy stuff and moving certain things out, plus aggressively reordering commonly accessed fields for cache efficiency.
  • Some field reordering in CParticle for the same cache friendliness reason, though particle code still is a bloated slow mess.
  • Reorder the Tile memory layout to reduce its size in memory and possibly improve performance.
  • Reduced the memory footprint for CBitStream by a bit.
  • Marked some C++ classes as final so the compiler can better devirtualize function calls. This mainly targets CBlob.
  • Slightly improved hook call performance from the engine by removing unused code.
  • Optimized some minimap generation logic, which may reduce stuttering.
  • The g_fastdeltas option was removed, as it was causing crashes and could not do what it intended to do anyway.
  • AS garbage collection is now handled explicitly. It mostly has the same logic as the AS automatic GC, but should allow eliminating some unnecessary overhead at script execution.
  • CNet::isClient() is now hardcoded to return false on dedicated servers. This may help eliminate some dead code and unnecessary branching for dedicated servers in the engine.
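Two of the items above, field reordering and final classes, are easy to illustrate in isolation. The types below are made up for the example and do not reflect the real Tile or CBlob layouts; the size difference holds on common ABIs but is not guaranteed by the C++ standard.

```cpp
#include <cassert>
#include <cstdint>

// Fields in an arbitrary order: padding is inserted before `type` and `damage`.
struct TileLoose {
    std::uint8_t flags;
    std::uint32_t type;
    std::uint8_t light;
    std::uint16_t damage;
};

// Same fields sorted from largest to smallest: less or no padding.
struct TileTight {
    std::uint32_t type;
    std::uint16_t damage;
    std::uint8_t flags;
    std::uint8_t light;
};

static_assert(sizeof(TileTight) <= sizeof(TileLoose),
              "reordering should never make the struct larger here");

struct Entity {
    virtual ~Entity() = default;
    virtual int tick() { return 0; }
};

// `final` promises that no class derives from Blob, so a call through a
// Blob reference can be resolved at compile time instead of via the vtable.
struct Blob final : Entity {
    int tick() override { return 1; }
};

int tick_blob(Blob& blob) {
    return blob.tick();  // eligible for devirtualization
}
```

On a typical x86-64 target the first struct takes 12 bytes and the second 8, which adds up quickly for something stored once per map tile.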

Bugs, retrocompatibility breakages and enhancement TODOs

Issues have been ported to a GitHub repo

@asumagic

screen-19-08-20-00-12-38
screen-19-08-20-00-10-53
screen-19-08-20-00-10-23

@asumagic

errdemo

@asumagic

propagationplusao
newpropagationplusshafts
smoothwater
