@bkaradzic
Last active March 20, 2023 21:58

Orthodox C++

What is Orthodox C++?

Orthodox C++ (sometimes referred to as C+) is a minimal subset of C++ that improves on C but avoids all the unnecessary things from so-called Modern C++. It's the exact opposite of what Modern C++ is supposed to be.

Why not Modern C++?

Back in the late 1990s we were also modern-at-the-time C++ hipsters, and we used the latest features. We told everyone that they should use those features too. Over time we learned that it's unnecessary to use some language features just because they are there, that some features we used proved to be bad (like RTTI, exceptions, and streams), and that others backfired by adding unnecessary code complexity. If you think this is nonsense, just wait a few more years and you'll hate Modern C++ too (see the archived LinkedIn article "Why I don't spend time with Modern C++ anymore").

Why use Orthodox C++?

A code base written within Orthodox C++ limitations will be easier to understand, simpler, and it will build with older compilers. Projects written in the Orthodox C++ subset will be more acceptable to other C++ projects, because the subset used by Orthodox C++ is unlikely to violate an adopter's C++ subset preferences.

Hello World in Orthodox C++

#include <stdio.h>

int main()
{
    printf("hello, world\n");
    return 0;
}

What should I use?

  • C-like C++ is a good start. If the code doesn't require more complexity, don't add unnecessary C++ complexities. In the general case, code should be readable to anyone who is familiar with the C language.
  • Don't do this; the end of the "design rationale" in Orthodox C++ should be immediately after "Quite simple, and it is usable. EOF".
  • Don't use exceptions.

Exception handling is the only C++ language feature which requires significant support from a complex runtime system, and it's the only C++ feature that has a runtime cost even if you don't use it – sometimes as additional hidden code at every object construction, destruction, and try block entry/exit, and always by limiting what the compiler's optimizer can do, often quite significantly. Yet C++ exception specifications are not enforced at compile time anyway, so you don't even get to know that you didn't forget to handle some error case! And on a stylistic note, the exception style of error handling doesn't mesh very well with the C style of error return codes, which causes a real schism in programming styles because a great deal of C++ code must invariably call down into underlying C libraries.

  • Don't use RTTI.
  • Don't use the C++ runtime wrappers for C runtime includes (<cstdio>, <cmath>, etc.); use the C runtime headers instead (<stdio.h>, <math.h>, etc.).
  • Don't use streams (<iostream>, <sstream>, etc.); use printf-style functions instead.
  • Don't use anything from STL that allocates memory, unless you don't care about memory management. See CppCon 2015: Andrei Alexandrescu "std::allocator Is to Allocation what std::vector Is to Vexation" talk, and Why many AAA gamedev studios opt out of the STL thread for more info.
  • Don't use metaprogramming excessively for academic masturbation. Use it in moderation, only where necessary, and where it reduces code complexity.
  • Be wary of any features introduced in the current C++ standard; ideally, wait for improvements to those features in the next iteration of the standard. For example, constexpr from C++11 became usable in C++14 (per Jason Turner, curator of cppbestpractices.com).
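
As a rough illustration of several items from the list above (C headers, error codes instead of exceptions, printf instead of streams, no allocating containers), here is a minimal sketch. The names ParseResult, parsePort, and reportPort are hypothetical and exist only for this example.

```cpp
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical error codes for this example; not part of any library.
enum ParseResult
{
    ParseResult_Ok,
    ParseResult_Invalid,
    ParseResult_OutOfRange
};

// Parses a TCP port number from a C string.
// Failure is reported through the return value instead of an exception.
ParseResult parsePort(const char* _str, int* _outPort)
{
    assert(NULL != _str && NULL != _outPort);

    char* end = NULL;
    errno = 0;
    const long value = strtol(_str, &end, 10);

    // No digits consumed, or trailing garbage after the number.
    if (end == _str || '\0' != *end)
    {
        return ParseResult_Invalid;
    }

    if (ERANGE == errno || value < 1 || value > 65535)
    {
        return ParseResult_OutOfRange;
    }

    *_outPort = (int)value;
    return ParseResult_Ok;
}

// Reports the outcome with printf-style formatting rather than iostreams.
void reportPort(const char* _str)
{
    int port = 0;
    const ParseResult result = parsePort(_str, &port);

    if (ParseResult_Ok == result)
    {
        printf("'%s' -> port %d\n", _str, port);
    }
    else
    {
        printf("'%s' rejected (error %d)\n", _str, (int)result);
    }
}
```

The caller checks the returned code at the call site, which is the same style the underlying C library functions use.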

Is it safe to use any of Modern C++ features yet?

Due to the lag in adoption of C++ standards by compilers, OS distributions, etc., it's usually not possible to start using new, useful language features immediately. The general guideline is: if the current year is C++year+5, then it's safe to start selectively using C++year's features. For example, if the standard is C++11 and the current year is >= 2016, then it's probably safe. If the standard required to compile your code is C++17 and the year is 2016, then you're obviously practicing the "Resume Driven Development" methodology. If you're doing this for an open source project, then you're not creating something others can use.
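
The guideline above is simple arithmetic, and it can be written down as a small check. This is only a sketch; the function name isSafeToAdopt is hypothetical, and the standard is passed as its full publication year (2011 for C++11, 2017 for C++17).

```cpp
#include <assert.h>

// Returns true when a standard published in _standardYear has had at least
// five years to be adopted by compilers and OS distributions.
bool isSafeToAdopt(int _standardYear, int _currentYear)
{
    assert(_standardYear >= 1998); // C++98 was the first ISO C++ standard.
    return _currentYear >= _standardYear + 5;
}
```

With this rule, isSafeToAdopt(2011, 2016) is true and isSafeToAdopt(2017, 2016) is false, matching the two examples in the paragraph above.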

UPDATE: As of January 14th, 2022, the Orthodox C++ committee approved the use of C++17.

Any other similar ideas?

Code examples

@bkaradzic
Author

IMO, it's appropriate to discuss things, as long as it's respectful and civilized. Users who don't want to follow can unsubscribe to stop notifications.

@bkaradzic
Author

Also this discussion actually brought up some interesting links...

@swang206

swang206 commented Feb 11, 2022

Sounds like this debate shouldn't be here anymore.

If you want further discussion, join the Discord:
https://discord.gg/vMKhB9Q

> As an aside, what do you think of a string interpolation proposal that would allow you to do something like fn"something '{getUserControlledString(someHandle)}'.\n" and have it be automatically translated to fn("something '", getUserControlledString(someHandle), "'.\n") ? It seems to me like the best compromise between simply building on C++'s current capabilities and getting people who want their pretty format strings to like it, and people who have a particular hatred for anything that looks like a format string will be able to just directly call fn instead. It doesn't have any tax in binary size, performance or security: the compiler just directly translates it to the function call, and it can't lead to a vulnerability since it's only a compile-time thing

I have talked about that proposal for a very long time. Yes, it is a much better solution than the format string one (although you can still technically screw up with macros or something like that). (For me, I do not need interpolated string literals either, but there are losers who love to talk about "readability", which is extremely subjective.)

However, it still does not change the fact that you need an iostream replacement, and that iostream and exception handling (EH) finally have to be deprecated and removed. This solution could only work in that context.
You still need a new IO library to make that work, and std::format is clearly nowhere near that solution, nor is iostream.
There are still cases where you do not want an interpolated string literal. One clear example is using them with macros. Neither format strings nor interpolated string literals can deal with things like this.

An interpolated string literal won't save you for things like this:
https://github.com/tearosccebe/fast_io/blob/03d74a72f377a2488364d6fa92a75bc4952b3f4c/examples/0007.legacy/construct_fstream_from_syscall.cc#L31

	println(
	"Unix Timestamp:",unix_ts,"\n"
	"Universe Timestamp:",static_cast<fast_io::universe_timestamp>(unix_ts),"\n"
	"UTC:",utc(unix_ts),"\n",
	"Local:",local(unix_ts)," Timezone:",fast_io::timezone_name(),"\n"
#ifdef __clang__
	"LLVM clang " __clang_version__ "\n"
#elif defined(__GNUC__) && defined(__VERSION__)
	"GCC " __VERSION__ "\n"
#elif defined(_MSC_VER)
	"Microsoft Visual C++ ",_MSC_VER,"\n"
#else
	"Unknown C++ compiler\n"
#endif
#if defined(_LIBCPP_VERSION)
	"LLVM libc++ ", _LIBCPP_VERSION, "\n"
#elif defined(__GLIBCXX__)
	"GCC libstdc++ ", __GLIBCXX__ , "\n"
#elif defined(_MSVC_STL_UPDATE)
	"Microsoft Visual C++ STL ", _MSVC_STL_UPDATE, "\n"
#else
	"Unknown C++ standard library\n"
#endif
	"fstream.rdbuf():",fiob.fb,"\n"
	"FILE*:",static_cast<fast_io::c_io_observer>(fiob).fp,"\n"
	"fd:",static_cast<fast_io::posix_io_observer>(fiob).fd
#if (defined(_WIN32) && !defined(__WINE__)) || defined(__CYGWIN__)
	,"\n"
	"win32 HANDLE:",static_cast<fast_io::win32_io_observer>(fiob).handle
#ifndef _WIN32_WINDOWS
//NT kernel
	,"\n"
	"zw HANDLE:",static_cast<fast_io::zw_io_observer>(fiob).handle,"\n"
	"nt HANDLE:",static_cast<fast_io::nt_io_observer>(fiob).handle
#endif
#endif
);

So the program will print out different information with different environments.

./construct_fstream_from_syscall
Unix Timestamp:1644618626.884474078
Universe Timestamp:434602343073853826.884474078
UTC:2022-02-11T22:30:26.884474078Z
Local:2022-02-11T17:30:26.884474078-05:00 Timezone:EST
GCC 12.0.1 20220209 (experimental)
GCC libstdc++ 20220209
fstream.rdbuf():0x00007ffe1dce8648
FILE*:0x0000000000def2a0
fd:3
./construct_fstream_from_syscall_clang
Unix Timestamp:1644618743.939916634
Universe Timestamp:434602343073853943.939916634
UTC:2022-02-11T22:32:23.939916634Z
Local:2022-02-11T17:32:23.939916634-05:00 Timezone:EST
LLVM clang 15.0.0 (https://github.com/llvm/llvm-project.git 85628ce75b3084dc0f185a320152baf85b59aba7)
GCC libstdc++ 20220209
fstream.rdbuf():0x00007ffccdf22240
FILE*:0x0000000001efb2a0
fd:3
wine ./construct_fstream_from_syscall.exe 
Unix Timestamp:1644618830.2005686
Universe Timestamp:434602343073854030.2005686
UTC:2022-02-11T22:33:50.2005686Z
Local:2022-02-11T18:33:50.2005686-04:00 Timezone:Eastern Daylight Time
GCC 12.0.0 20211231 (experimental)
GCC libstdc++ 20211231
fstream.rdbuf():0x000000000031fb18
FILE*:0x000000006a0fa250
fd:3
win32 HANDLE:0x0000000000000048
zw HANDLE:0x0000000000000048
nt HANDLE:0x0000000000000048

> Wow... they put a link to an article talking about memory safety in their page about memory safety ? What an incredible thing to do ! This obviously means that they endorse every single word written in the quote you got from that article, no matter that the reason they linked to that article was actually to provide a source for a completely different citation from a different part of the article, or that half of your quote is actually from a completely different article published several months after that first article. And that's also totally exactly the same to if they said it themselves.
>
> ...Yunno, I don't like Google either, but this seems incredibly disingenuous...

Oh yeah, Google founded things like rust-for-linux, etc (guess what? That Alex Gaynor (a (former?) Mozilla employee)) is the lead of the project. Of course they write unsafe hell, but hey! C++ is a dead language, if you use it you are a loser, and human rights violator.

Clearly, it is nothing wrong to say Google says "C++ violates human rights".

@DBJDBJ

DBJDBJ commented Feb 13, 2022

@bkaradzic, with all the respect, I can't help but stand confused why don't you move this gist to a standalone standard separate repository?

@bkaradzic
Author

@DBJDBJ I don't see how that would be beneficial?

@DBJDBJ

DBJDBJ commented Feb 13, 2022

The content might be just one document. Succinct message. At safe distance from discussions.

It all depends on what do you want to do with this. It has nicely grown. Perhaps it needs a bigger place to live.

@aerosayan

@bkaradzic i've been using orthodox c++ for some time ... with some well thought out use for some necessary modern c++ features.

@DBJDBJ , having this information in a separate repo might be useful ... however reading the counter-arguments provided here by others are also useful and necessary for new adopters to fully understand the nuances of where modern c++ goes wrong.

Thanks

@bkaradzic
Author

@aerosayan

> with some well thought out use for some necessary modern c++ features

What you said here is actually the key... "well thought out use" is opposite of "blindly following the latest"

@aerosayan

@bkaradzic

IMHO new devs blindly follow the rules because they blindly trust other experienced developers. I have observed that problems occur due to new devs not understanding that the experienced devs are giving them generic advice and not something that's good for every industry.

If you're making a customer service app, you would need features like multiple inheritance or reference counter based shared pointers. However those things become problematic when writing any multi-threaded high performance (low latency and/or high throughput) code.

I think we can help new devs select programming language features they should use, based on few simple tests. Since we can't give generic advice, I would like to share why I think Orthodox C++ makes code reliable at least for game engine development or numerical solver development :

  1. Are the language features used easy for the compiler to optimize?
  • Things like multiple inheritance and deeply nested operator overloads become a little bit problematic because C++ compilers may not be able to optimize them correctly. Sometimes I needed to manually check the disassembly to verify that the code was actually optimized. This makes the code less reliable. Also, I would like the code to perform well even in debug mode, when not all optimizations may be turned on. If I can't trust the code to be optimized on my current compiler and on any compiler or platform I may use in the future, I will not use it. This is extremely important, because I may be developing on Linux, but I would like my code to run reliably on Windows when compiled with MSVC.
  2. Do the language features used behave consistently on every platform and compiler?
  • This is primarily the reason to not use STL if your application needs to perform reliably and consistently on every platform. I have observed that the STL implementations on Linux and Windows are sometimes different, and that can cause a lot of problems. For example ... a few years ago I observed that std::set on Linux GCC was allocating memory for 3 elements but on Windows MSVC was allocating a lot more. I don't have the exact code to demonstrate it now, but this stackoverflow question shows a similar problem https://stackoverflow.com/questions/57031917/ where the bucket sizes used for std::deque were different on different compilers. Apparently it's due to the standard not specifying how the container should be implemented. This can be a serious problem if you want consistent behavior on every platform and compiler. So using STL might be a death sentence for our code. It's not that STL is bad, but simply that we can't be 100% sure that every implementation of STL will behave consistently.
  3. Are the language features safe to use?
  • Nothing gets me more enraged than stackoverflow and reddit c++ fanatics recommending std::shared_ptr to make your code safe. They use an atomic reference counter and when modified from different threads (like in a multi-threaded rendering engine), the performance gets obliterated. https://stackoverflow.com/questions/31254767 Sure ... your code is free of memory leaks and doesn't suffer from race conditions, but it comes at a severe price. Someone may argue that my code was wrong for creating the race condition, and they would be right ... however I would rather have my code crash severely due to a race condition, so I could fix it, instead of slowly destroying the code's performance over time.
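
To make the std::shared_ptr point concrete, here is a small sketch of where the reference count gets touched. Mesh, useCountByValue, and vertexCountByRef are hypothetical names for this example. Whether the count update is an atomic instruction depends on the standard library implementation and build settings, so treat the comments as the typical case rather than a guarantee.

```cpp
#include <assert.h>
#include <memory>

struct Mesh
{
    int vertexCount;
};

// Passing std::shared_ptr by value copies it: the shared reference count is
// incremented on entry and decremented on exit. In multi-threaded programs
// these updates are typically atomic operations on memory shared between threads.
long useCountByValue(std::shared_ptr<Mesh> _mesh)
{
    assert(_mesh != nullptr);
    return _mesh.use_count();
}

// Passing a plain reference does not touch any reference count.
int vertexCountByRef(const Mesh& _mesh)
{
    return _mesh.vertexCount;
}
```

If a function only reads the object and does not need to share ownership, taking a reference (or raw pointer) avoids the count traffic entirely.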

I don't have more to add at this moment, but if new devs are taught how Orthodox C++ tries to help them create reliable and consistent code, and how abuse of modern C++ can lead to bad code, they will be able to avoid many of the mistakes that we made in the past.

Thanks

@bkaradzic
Author

image

@DBJDBJ

DBJDBJ commented Jul 28, 2022

> @bkaradzic i've been using orthodox c++ for some time ... with some well thought out use for some necessary modern c++ features.
>
> @DBJDBJ , having this information in a separate repo might be useful ... however reading the counter-arguments provided here by others are also useful and necessary for new adopters to fully understand the nuances of where modern c++ goes wrong.
>
> Thanks

... go the full mile and turn it into github hosted site ... Although without Mr @bkaradzic that's not gonna happen ...

@guruprasadah

@bkaradzic , I have a small doubt I wish to ask you. Heap fragmentation is a problem when it comes to STL, so why not write something to a bump allocator? Have a continuous block of memory, and when a realloc is requested, if requested size is bigger than existing, expand the pool, move it to the end and get rid of the hole in between (using memmove). I agree that STL introduces LOT of bloat, and that the includes are often thousands of lines long, but the way I see it, we are ourselves going to implement them if we don't use them, and if we wish for it to be feature complete it will stretch a few hundred lines of code. We can also implement not de-allocating memory when object is destroyed, instead reusing it (memory alloc caching??) in future. If it exceeds a limit, maybe de-alloc it.

Could you please share your thoughts on the above?
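
For readers unfamiliar with the term, a bump allocator in its simplest form looks roughly like the sketch below. This is a simplified illustration, not the full scheme described above: it works on one fixed buffer and does not implement pool expansion or memmove-based compaction. The names BumpAllocator, bumpInit, bumpAlloc, and bumpReset are hypothetical.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hands out memory from one contiguous buffer by advancing an offset.
struct BumpAllocator
{
    uint8_t* m_base;
    size_t   m_capacity;
    size_t   m_offset;
};

void bumpInit(BumpAllocator* _allocator, void* _buffer, size_t _capacity)
{
    _allocator->m_base     = (uint8_t*)_buffer;
    _allocator->m_capacity = _capacity;
    _allocator->m_offset   = 0;
}

// Returns _size bytes aligned to _align (which must be a power of two),
// or NULL when the buffer is exhausted.
void* bumpAlloc(BumpAllocator* _allocator, size_t _size, size_t _align)
{
    assert(0 != _align && 0 == (_align & (_align - 1)));

    const uintptr_t base    = (uintptr_t)_allocator->m_base;
    const uintptr_t current = base + _allocator->m_offset;
    const uintptr_t aligned = (current + (_align - 1)) & ~(uintptr_t)(_align - 1);
    const size_t    start   = (size_t)(aligned - base);

    if (start > _allocator->m_capacity
    ||  _size > _allocator->m_capacity - start)
    {
        return NULL;
    }

    _allocator->m_offset = start + _size;
    return (void*)aligned;
}

// Individual allocations are never freed; everything is released at once.
void bumpReset(BumpAllocator* _allocator)
{
    _allocator->m_offset = 0;
}
```

Because nothing is freed individually, there are no holes to fragment the buffer; the trade-off is that memory is only reclaimed when the whole arena is reset.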

@DBJDBJ

DBJDBJ commented Oct 5, 2022

@guruprasadah the core issue with STL is it is mandatory. Not optional. It is as simple as that.

Thus. If there is some small fast etc. replacement you can not replace STL with that. Or whatever else you fancy. You simply have no choice. It is all or nothing. Actually, it is all.

@guruprasadah

@DBJDBJ Then what do you suggest in place of std::vector ? I once tried to follow this orthodox approach and all I ended up with was an Array class that was 500 LOC. It also did not fully support all operations of std::vector . Fully implementing those will take a couple hundred more LOC. So at this point, the LOC of the STL and the custom version reach the same. So what is the use of half-assing our own implementation, that also suffers from STL problems - instead of just using the STL?

@trcrsired

trcrsired commented Oct 6, 2022

> @DBJDBJ Then what do you suggest in place of std::vector ? I once tried to follow this orthodox approach and all I ended up with was an Array class that was 500 loc. It also fully did not support all operations of std::vector . Fully implementing those will take a couple hundred more LOC. So at this point, LOC of STL and custom reaches same. So what is the use of half-assing our own implementation, that also suffers from STL problems - instead of just stl

No. Reimplementing it yourself does NOT suffer from the problems of the STL.

  1. No bad_alloc
  2. realloc and merging between all trivially_relocatable types
  3. allocator does not cause performance degradation due to the bad design of allocator<T> instead of just allocator.
  4. Provide an extra interface like emplace_back_unchecked to avoid redundant bounds checking of emplace_back.
  5. Zero page semantics aware to call things like calloc instead of malloc to avoid the cost of integers and floating points to be initialized with zero.
  6. No dependency of C++ std::string or anything that uses C++ EH which causes binary bloat and dead code.
    https://github.com/cppfastio/fast_io/blob/master/include/fast_io_dsal/impl/vector.h
    https://github.com/cppfastio/fast_io/blob/master/benchmark/0011.containers/vector/0002.multi_push_back/main.h
./fast_io
fast_io::vector<T>:0.235741445s
./std
std::vector<T>:0.355929078s

No change on allocator. the performance gap is already HUGE due to cache unfriendliness of std::vector + binary bloat issues of std::vector
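
A few of the points in the list above (no bad_alloc, growth through realloc, an unchecked push) can be sketched in a handful of lines. This is an illustration only and is not how fast_io implements its vector; IntArray and the array* functions are hypothetical names, and the element type is fixed to int, which is trivially relocatable.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Growable array of int. Because int is trivially relocatable,
// realloc is allowed to move the storage without running any constructors.
struct IntArray
{
    int*   m_data;
    size_t m_size;
    size_t m_capacity;
};

void arrayInit(IntArray* _array)
{
    _array->m_data     = NULL;
    _array->m_size     = 0;
    _array->m_capacity = 0;
}

void arrayFree(IntArray* _array)
{
    free(_array->m_data);
    arrayInit(_array);
}

// Grows the storage. On allocation failure the process is terminated
// instead of throwing bad_alloc. (Size overflow is not handled in this sketch.)
void arrayReserve(IntArray* _array, size_t _capacity)
{
    if (_capacity <= _array->m_capacity)
    {
        return;
    }

    void* data = realloc(_array->m_data, _capacity * sizeof(int) );
    if (NULL == data)
    {
        abort();
    }

    _array->m_data     = (int*)data;
    _array->m_capacity = _capacity;
}

// Caller guarantees there is room; no capacity branch in release builds.
void arrayPushBackUnchecked(IntArray* _array, int _value)
{
    assert(_array->m_size < _array->m_capacity);
    _array->m_data[_array->m_size++] = _value;
}

// Checked variant: doubles the capacity when full.
void arrayPushBack(IntArray* _array, int _value)
{
    if (_array->m_size == _array->m_capacity)
    {
        arrayReserve(_array, 0 == _array->m_capacity ? 8 : _array->m_capacity * 2);
    }

    arrayPushBackUnchecked(_array, _value);
}
```

A typical use is to call arrayReserve once with a known upper bound and then use arrayPushBackUnchecked inside the hot loop.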

@trcrsired

trcrsired commented Oct 6, 2022

> @bkaradzic , I have a small doubt I wish to ask you. Heap fragmentation is a problem when it comes to STL, so why not write something to a bump allocator? Have a continuous block of memory, and when a realloc is requested, if requested size is bigger than existing, expand the pool, move it to the end and get rid of the hole in between (using memmove). I agree that STL introduces LOT of bloat, and that the includes are often thousands of lines long, but the way I see it, we are ourselves going to implement them if we don't use them, and if we wish for it to be feature complete it will stretch a few hundred lines of code. We can also implement not de-allocating memory when object is destroyed, instead reusing it (memory alloc caching??) in future. If it exceeds a limit, maybe de-alloc it.
>
> Could you please share your thoughts on the above?

The way to write an allocator is exactly the issue of C++ container models. Why is it std::allocator<T>, not just std::allocator? The allocator<T> causes a huge amount of issues. How does allocating memory space have anything to do with element type?

All my code avoids C++ containers for the reason they are not freestanding. I need the code to work in any environment and the C++ vector does not work at all due to the crappy allocator model, allocation failure, and logic_error.

@guruprasadah

> > @DBJDBJ Then what do you suggest in place of std::vector ? I once tried to follow this orthodox approach and all I ended up with was an Array class that was 500 loc. It also fully did not support all operations of std::vector . Fully implementing those will take a couple hundred more LOC. So at this point, LOC of STL and custom reaches same. So what is the use of half-assing our own implementation, that also suffers from STL problems - instead of just stl
>
> No. Reimplement that by yourself does NOT suffer from the problems of STL.
>
>   1. No bad_alloc
>   2. realloc and merging between all trivially_relocatable types
>   3. allocator does not cause performance degradation due to the bad design of allocator<T> instead of just allocator.
>   4. Provide an extra interface like emplace_back_unchecked to avoid redundant bounds checking of emplace_back.
>   5. Zero page semantics aware to call things like calloc instead of malloc to avoid the cost of integers and floating points to be initialized with zero.
>   6. No dependency of C++ std::string or anything that uses C++ EH which causes binary bloat and dead code.
>     https://github.com/cppfastio/fast_io/blob/master/include/fast_io_dsal/impl/vector.h
>     https://github.com/cppfastio/fast_io/blob/master/benchmark/0011.containers/vector/0002.multi_push_back/main.h
> ./fast_io
> fast_io::vector<T>:0.235741445s
> ./std
> std::vector<T>:0.355929078s
>
> No change on allocator. the performance gap is already HUGE due to cache unfriendliness of std::vector + binary bloat issues of std::vector

I am a bit of a new person to c++, but I would like to ask - how is bad_alloc a problem here? I agree exceptions are NOT the best way to handle errors but, then in our own implementation - we throw a similar error, maybe just not using exceptions. Clarify please?

@trcrsired

bad_alloc should NEVER EVER happen. It should just crash. Many libcs even have functions like xmalloc which will kill your process.

The problem is that bad_alloc introduces oddities.

  1. When you throw bad_alloc, you need to run destructors, destructors themselves might allocate again.
  2. Emergency heap may still crash.
  3. Many operating systems, including windows and linux, will kill your process with OOM killer. With virtual memory, bad_alloc never really happens and it just crashes out.

@AsuMagic

> ./fast_io
> fast_io::vector<T>:0.235741445s
> ./std
> std::vector<T>:0.355929078s
>
> No change on allocator. the performance gap is already HUGE due to cache unfriendliness of std::vector + binary bloat issues of std::vector

Not that I doubt it is easy to outperform std::vector::push_back, but I doubt the real world performance impact of std::vector is as brutal as you make it out to be, unless you have such numbers.
By the way, how do you know that your benchmark actually performs the push_backs for all the vectors properly? I do not trust the results for this reason, even if it seems to be doing some amount of what you asked according to your numbers. I quickly checked the std::vector variant and it isn't a problem there, but I don't know about your vector implementation.
I'm not trying to trash on your implementation, by the way, it is actually pretty interesting to me.
Also, TIL about clang's trivial_abi, neat.

I do find it amusing that you defeated the compiler optimization in the benchmark merely by looping the push_backs until it couldn't unroll the loop and figure it out. I feel like the compiler ought to be smart enough to figure out there is no side effect to any of this, but it probably can't see past the complexity, especially because of the relocation logic.
Which... honestly just makes me think push_back should be avoided in general, because it's going to have garbage codegen and optimization implications anyway, at least when you can get away using resize ahead of time and indexing, which, according to my experience with real-world code, is usually feasible. That usecase is probably significantly less affected by the code bloat and EH related issues you mention.

It could very well be that the unchecked variant you mention benefits codegen a lot on that front. However, the preconditions you need to check for in your code to ensure it is safe seem in practice close enough to what you'd need to .resize() and index, though?

What I do wish is that std::vector allowed resizing without initializing the Ts. With non-trivial types, I found that the compiler would sometimes still emit the memory zeroing code even when unnecessary. I worked around that by wrapping my T in a ugly way, but it was trash.
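
The "resize ahead of time and index" pattern mentioned above can be sketched as follows, next to the push_back version for comparison. The function names are hypothetical, and no performance claim is made here; the sketch only shows the two shapes of code being discussed.

```cpp
#include <assert.h>
#include <stddef.h>
#include <vector>

// Appends one element at a time. Every push_back has to check the capacity
// and may reallocate, which makes the loop body harder to optimize.
void fillWithPushBack(std::vector<int>& _out, size_t _count)
{
    _out.clear();
    for (size_t ii = 0; ii < _count; ++ii)
    {
        _out.push_back( (int)(ii * ii) );
    }
}

// Sizes the vector once, then writes through a raw pointer by index.
// Note that resize() value-initializes (zeroes) the new elements first,
// which is the extra cost the comment above wishes could be skipped.
void fillWithResize(std::vector<int>& _out, size_t _count)
{
    _out.resize(_count);
    assert(_out.size() == _count);

    int* data = _out.data();
    for (size_t ii = 0; ii < _count; ++ii)
    {
        data[ii] = (int)(ii * ii);
    }
}
```

Both functions produce the same contents; the second keeps the loop body free of capacity checks and reallocation paths.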

@trcrsired

trcrsired commented Feb 24, 2023

I have written a new guideline that would kill all modern C++ nonsense.
https://github.com/trcrsired/Portable-Cpp-Guideline

@trcrsired

> > ./fast_io
> > fast_io::vector<T>:0.235741445s
> > ./std
> > std::vector<T>:0.355929078s
> >
> > No change on allocator. the performance gap is already HUGE due to cache unfriendliness of std::vector + binary bloat issues of std::vector
>
> Not that I doubt it is easy to outperform std::vector::push_back, but I doubt the real world performance impact of std::vector is as brutal as you make it out to be, unless you have such numbers. By the way, how do you know that your benchmark actually performs the push_backs for all the vectors properly? I do not trust the results for this reason, even if it seems to be doing some amount of what you asked according to your numbers. I quickly checked the std::vector variant and it isn't a problem there, but I don't know about your vector implementation. I'm not trying to trash on your implementation, by the way, it is actually pretty interesting to me. Also, TIL about clang's trivial_abi, neat.
>
> I do find it amusing that you defeated the compiler optimization in the benchmark merely by looping the push_backs until it couldn't unroll the loop and figure it out. I feel like the compiler ought to be smart enough to figure out there is no side effect to any of this, but it probably can't see past the complexity, especially because of the relocation logic. Which... honestly just makes me think push_back should be avoided in general, because it's going to have garbage codegen and optimization implications anyway, at least when you can get away using resize ahead of time and indexing, which, according to my experience with real-world code, is usually feasible. That usecase is probably significantly less affected by the code bloat and EH related issues you mention.
>
> It could very well be that the unchecked variant you mention benefits codegen a lot on that front. However, the preconditions you need to check for in your code to ensure it is safe seem in practice close enough to what you'd need to .resize() and index, though?
>
> What I do wish is that std::vector allowed resizing without initializing the Ts. With non-trivial types, I found that the compiler would sometimes still emit the memory zeroing code even when unnecessary. I worked around that by wrapping my T in a ugly way, but it was trash.

I can guarantee the performance gap is huge at the MACRO level due to the redundant code.
