Orthodox C++

What is Orthodox C++?

Orthodox C++ (sometimes referred to as C+) is a minimal subset of C++ that improves on C but avoids all the unnecessary things from so-called Modern C++. It's the exact opposite of what Modern C++ is supposed to be.

Why not Modern C++?

Back in the late 1990s we were also modern-at-the-time C++ hipsters, and we used the latest features. We also told everyone they should use those features. Over time we learned that it's unnecessary to use some language features just because they are there, that some features we used proved to be bad (like RTTI, exceptions, and streams), or that they backfired through unnecessary code complexity. If you think this is nonsense, just wait a few more years and you'll hate Modern C++ too (see the archived LinkedIn article "Why I don't spend time with Modern C++ anymore").

Why use Orthodox C++?

A code base written within Orthodox C++ limitations will be easier to understand, simpler, and it will build with older compilers. Projects written in the Orthodox C++ subset will be more acceptable to other C++ projects because the subset used by Orthodox C++ is unlikely to violate an adopter's C++ subset preferences.

Hello World in Orthodox C++

#include <stdio.h>

int main()
{
    printf("hello, world\n");
    return 0;
}

What should I use?

  • C-like C++ is a good start; if the code doesn't require more complexity, don't add unnecessary C++ complexities. In the general case, code should be readable to anyone who is familiar with the C language.
  • Don't do this; the end of the "design rationale" in Orthodox C++ should be immediately after "Quite simple, and it is usable. EOF".
  • Don't use exceptions.

Exception handling is the only C++ language feature which requires significant support from a complex runtime system, and it's the only C++ feature that has a runtime cost even if you don't use it – sometimes as additional hidden code at every object construction, destruction, and try block entry/exit, and always by limiting what the compiler's optimizer can do, often quite significantly. Yet C++ exception specifications are not enforced at compile time anyway, so you don't even get to know that you didn't forget to handle some error case! And on a stylistic note, the exception style of error handling doesn't mesh very well with the C style of error return codes, which causes a real schism in programming styles because a great deal of C++ code must invariably call down into underlying C libraries.

  • Don't use RTTI.
  • Don't use the C++ runtime wrappers for C runtime includes (<cstdio>, <cmath>, etc.); use the C runtime headers instead (<stdio.h>, <math.h>, etc.).
  • Don't use streams (<iostream>, <sstream>, etc.); use printf-style functions instead.
  • Don't use anything from the STL that allocates memory, unless you don't care about memory management. See the CppCon 2015 talk by Andrei Alexandrescu, "std::allocator Is to Allocation what std::vector Is to Vexation", and the thread "Why many AAA gamedev studios opt out of the STL" for more info.
  • Don't use metaprogramming excessively for academic masturbation. Use it in moderation, only where necessary, and where it reduces code complexity.
  • Be wary of any features introduced in the current C++ standard; ideally, wait for improvements to those features in the next iteration of the standard. For example, constexpr from C++11 became usable in C++14 (per Jason Turner, cppbestpractices.com curator).
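
As a rough illustration of the guidelines above (not part of the original list), the following sketch reports failure through a return code and an out-parameter instead of an exception, and includes the C runtime headers directly. The names `ParseResult` and `parse_u32` are hypothetical and exist only for this example.

```cpp
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical result codes; the caller inspects these instead of catching exceptions.
enum ParseResult { PARSE_OK = 0, PARSE_EMPTY, PARSE_INVALID, PARSE_RANGE };

// Parses a base-10 unsigned 32-bit integer from a NUL-terminated string.
// On success the value is written to *out; on failure *out is left untouched.
ParseResult parse_u32(const char* text, uint32_t* out)
{
    assert(out != NULL);
    if (text == NULL || *text == '\0')
        return PARSE_EMPTY;
    // strtoul would accept leading whitespace and a sign, so reject those up front.
    if (*text < '0' || *text > '9')
        return PARSE_INVALID;

    errno = 0;
    char* end = NULL;
    unsigned long value = strtoul(text, &end, 10);
    if (*end != '\0')
        return PARSE_INVALID;   // trailing garbage such as "12x"
    if (errno == ERANGE || value > UINT32_MAX)
        return PARSE_RANGE;     // does not fit in 32 bits

    *out = (uint32_t)value;
    return PARSE_OK;
}
```

A caller would write something like `if (parse_u32(arg, &n) != PARSE_OK) { fprintf(stderr, "bad number\n"); return 1; }`, which keeps the error path visible at the call site and composes directly with C libraries.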

Is it safe to use any of Modern C++ features yet?

Due to the lag in adoption of C++ standards by compilers, OS distributions, etc., it's usually not possible to start using new, useful language features immediately. The general guideline is: if the current year is C++year+5, then it's safe to start selectively using C++year's features. For example, if the standard is C++11 and the current year is >= 2016, then it's probably safe. If the standard required to compile your code is C++17 and the year is 2016, then obviously you're practicing the "Resume Driven Development" methodology. If you're doing this for an open source project, then you're not creating something others can use.
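
The guideline is simple arithmetic; a minimal sketch follows. The function name `is_standard_safe` is made up for this example, and the rule is only the rough heuristic stated above, not a guarantee about any particular compiler.

```cpp
#include <assert.h>

// Returns true when the "C++year+5" guideline says features of the standard
// published in `standard_year` (e.g. 2011 for C++11) can be used selectively.
bool is_standard_safe(int standard_year, int current_year)
{
    assert(standard_year > 0 && current_year > 0);
    return current_year >= standard_year + 5;
}
```

With this rule, `is_standard_safe(2011, 2016)` is true, while `is_standard_safe(2017, 2016)` is false.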

UPDATE: As of January 14th, 2022, the Orthodox C++ committee approved the use of C++17.

Any other similar ideas?

Code examples

swang206 commented Feb 7, 2022

Yes. i DO provide compilers for orthodox C++. That is GCC 12.0.1 built with --disable-hosted-libstdcxx. NO HOSTED C++!!!!

Say goodbye to array, move, addressof!
https://bitbucket.org/ejsvifq_mabmip/x86_64-elf-baremetal-toolchain

GabrielRavier commented Feb 7, 2022

🤔 @swang206 What's the problem with unique_ptr ? I can see why people would be against, say, shared_ptr, but unique_ptr seems like a really convenient way of just holding a malloc-ed pointer, unless you hate the very concept of rvalue references or something

swang206 commented Feb 7, 2022

People abuse it for writing things like

std::unique_ptr<FILE,std::function<decltype(fclose)>>

or abusing lambda (which encourages binary bloat).
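
For readers unfamiliar with the pattern being criticized, here is a small sketch of why the deleter type matters. A `std::function` deleter is stored inside every `unique_ptr` object, while an empty functor type normally costs nothing extra. `FileCloser`, `HeavyFile`, `LeanFile` and `open_lean` are hypothetical names, and the second size check holds on the mainstream standard libraries but is an assumption, not something the standard guarantees.

```cpp
#include <assert.h>
#include <stdio.h>
#include <functional>
#include <memory>

// Stateless deleter: an empty class, so it normally adds no storage.
struct FileCloser {
    void operator()(FILE* f) const { fclose(f); }
};

// The pattern from the comment above: a type-erased callable as the deleter.
typedef std::unique_ptr<FILE, std::function<int(FILE*)>> HeavyFile;
// Alternative with a stateless functor as the deleter.
typedef std::unique_ptr<FILE, FileCloser> LeanFile;

static_assert(sizeof(HeavyFile) > sizeof(FILE*),
              "the std::function object is stored inside the unique_ptr");
static_assert(sizeof(LeanFile) == sizeof(FILE*),
              "true on libstdc++, libc++ and MSVC STL; not guaranteed by the standard");

// Opens a file; the result is null when fopen fails, so the caller must still check it.
LeanFile open_lean(const char* path, const char* mode)
{
    assert(path != NULL && mode != NULL);
    return LeanFile(fopen(path, mode));
}
```

Note that this only addresses the size and indirection cost of the deleter; it does not answer the other objections raised in this comment.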

While it ignores the real issues: It encourages ignoring checking return code.

  • unique_ptr's serious performance issues due to ABI
  • You cannot use it as an observer
  • passing unique_ptr const& is technically passing int** from abi perspective.

And std::unique_ptr is NOT freestanding. This means if you use it, your code will FAIL to compile.

https://youtu.be/zxFNnOFYCkI

GabrielRavier commented Feb 8, 2022

Although I don't really see much of a problem w.r.t. usage with FILE, I simply do not see how it "encourages ignoring checking return code". Could you elaborate on that @swang206 ? unique_ptr only does the job of automatically handling simple direct ownership of some memory, it doesn't do any kind of error checking itself or anything I can see that would encourage "ignoring checking return code"...

Also I don't get what you mean by "You cannot use it as an observer", not being an observer is literally the entire point of unique_ptr... I suppose one could introduce something like observer_ptr for that purpose (it's actually in std::experimental rn) but I don't think that's a blemish on the design of something made specifically for the purpose of owning a pointer (such that a unique_ptr always owns what it points to).

I also agree that passing unique_ptr const& is basically passing void ** and that doing so needlessly is bad, but unless you don't understand unique_ptr, you're not likely to make such a mistake.
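
A minimal sketch of the difference being discussed, with a hypothetical `Node` type: taking `unique_ptr const&` hands the callee the address of the smart pointer (a pointer to a pointer), whereas taking the pointee by reference needs one indirection and works no matter how the object is owned.

```cpp
#include <assert.h>
#include <memory>

struct Node {
    int value;
};

// Double indirection: the callee receives the address of the smart pointer
// and must load the stored pointer before it can reach the Node.
int read_via_smart_ref(std::unique_ptr<Node> const& p)
{
    assert(p != nullptr);
    return p->value;
}

// Single indirection: the callee does not care who owns the Node.
int read_via_ref(Node const& n)
{
    return n.value;
}
```

The second form also accepts a `Node` that lives on the stack or inside another object, which the first form cannot do.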

(And that last remark is rather ridiculous. Unless you're on a stupid configuration like a GCC deliberately built without libstdc++, it's not going to be a problem)

swang206 commented Feb 9, 2022

You are objectively wrong.

Chromium abuses std::unique_ptr all the time. I failed to see how making all apis return unique_ptr is not an abuse.

https://github.com/chromium/chromium/blob/c4d3c31083a2e1481253ff2d24298a1dfe19c754/components/sync/protocol/proto_value_conversions.h

unique_ptr const& People did this all the time too.
https://github.com/chromium/chromium/blob/c4d3c31083a2e1481253ff2d24298a1dfe19c754/ui/accessibility/mojom/ax_assistant_structure_mojom_traits.h

"(And that last remark is rather ridiculous. Unless you're on a stupid configuration like a GCC deliberately built without libstdc++, it's not going to be a problem)"

This is also very wrong. C++ standard defines two sets of implementation. One is freestanding implementation. Another is hosted implementation. In the freestanding implementation, C++ standard library will only provide headers that are needed for freestanding. In kernel or embedded systems, libstdc++ will be built with --disable-hosted-libstdcxx, that is still libstdc++, but freestanding C++ one.

https://en.cppreference.com/w/cpp/freestanding
https://youtu.be/4I7XZiS0ru4
https://bitbucket.org/ejsvifq_mabmip/x86_64-elf-baremetal-toolchain

I suggest you download toolchain before making those ridiculous claims which are not even facts. Yes, the freestanding C++ toolchain does provide libstdc++, but only the freestanding part. Not any hosted part, including std::array. You are not allowed to use std::array if you want portable code.

<cstddef> <limits> <cfloat> <climits> <version> <cstdint> <cstdlib> (partial) <new> <typeinfo> <source_location> <exception> <initializer_list> <compare> <coroutine> <cstdarg> <concepts> <type_traits> <bit> <atomic>

Since unique_ptr is the memory header, it is of course not provided by freestanding. That is by definition what the standard requires. std::move, std::forward are in utility headers, of course, you are not allowed to use them too.

In fact, Ben Saks has planned to even remove headers like cfloat, new, typeinfo, exception, etc in freestanding since they are deeply problematic.

There are even things like std::addressof that rely on compiler magic, that C++ standard does not provide in freestanding.

And yes, in a freestanding environment, you do not have std::array, std::unique_ptr, std::move, std::addressof, etc. That is a real fact defined by C++ standard. If you use any feature besides the header that the standard requires, your code will NOT be portable.

You must write those things by yourself if you want portability.
https://github.com/tearosccebe/fast_io/blob/master/include/fast_io_core_impl/freestanding/addressof.h

Also, things like unique_ptr will never be freestanding either since there is no way to guarantee the implementation will provide a heap. Providing things like unique_ptr would pull dependency on the heap.

You do not have things like std::vector, std::string, etc. either, and never will. In particular, they depend on std::logic_error for things like .at() and .substr(), which makes them unusable in a freestanding environment.

GabrielRavier commented Feb 9, 2022

You are objectively wrong.

Chromium abuses std::unique_ptr all the time. I failed to see how making all apis return unique_ptr is not an abuse.

github.com/chromium/chromium/blob/c4d3c31083a2e1481253ff2d24298a1dfe19c754/components/sync/protocol/proto_value_conversions.h

While the amount of types in here is kind of scary, I can't see how it is wrong to return unique_ptr when you're returning an owning pointer. That's the whole point of it, and I find it expresses the ownership quite well in this context.

unique_ptr const& People did this all the time too.
github.com/chromium/chromium/blob/c4d3c31083a2e1481253ff2d24298a1dfe19c754/ui/accessibility/mojom/ax_assistant_structure_mojom_traits.h

Oh come on, all those methods are simple helpers. If the nodes are only ever used with unique_ptr then it only makes sense to write them like that. You're not going to have performance problems with one-line methods in headers.

"(And that last remark is rather ridiculous. Unless you're on a stupid configuration like a GCC deliberately built without libstdc++, it's not going to be a problem)"

This is also very wrong. C++ standard defines two sets of implementation. One is freestanding implementation. Another is hosted implementation. In the freestanding implementation, C++ standard library will only provide headers that are needed for freestanding. In kernel or embedded systems, libstdc++ will be built with --disable-hosted-libstdcxx, that is still libstdc++, but freestanding C++ one.

en.cppreference.com/w/cpp/freestanding
youtu.be/4I7XZiS0ru4
bitbucket.org/ejsvifq_mabmip/x86_64-elf-baremetal-toolchain

I suggest you download toolchain before making those ridiculous claims which are not even facts. Yes, the freestanding C++ toolchain does provide libstdc++, but only the freestanding part. Not any hosted part, including std::array. You are not allowed to use std::array if you want portable code.

Freestanding is a special case, not the normal one, and is only really intended to be used on embedded/bare-metal/kernels/etc., where most existing code would not work anyway. I was more thinking of your toolchain which you appeared to think is something appropriate for general use outside of those special circumstances.
Also, the current state of freestanding where pretty much all of the standard library is thrown out is not exactly considered to be a good thing... See also papers such as this one, which want to add a lot more features that should be able to work perfectly fine in any freestanding environment.

(partial) <source_location> <initializer_list> <type_traits>

Since unique_ptr is the memory header, it is of course not provided by freestanding. That is by definition what the standard requires. std::move, std::forward are in utility headers, of course, you are not allowed to use them too.

In fact, Ben Saks has planned to even remove headers like cfloat, new, typeinfo, exception, etc in freestanding since they are deeply problematic.

There are even things like std::addressof that rely on compiler magic, that C++ standard does not provide in freestanding.

std::addressof is exactly the kind of thing that basically everyone would like to see being usable in freestanding, yes. I see no reason why freestanding code should not be able to use something like that.

And yes, in a freestanding environment, you do not have std::array, std::unique_ptr, std::move, std::addressof, etc. That is a real fact defined by C++ standard. If you use any feature besides the header that the standard requires, your code will NOT be portable.

You must write those things by yourself if you want portability.
github.com/tearosccebe/fast_io/blob/master/include/fast_io_core_impl/freestanding/addressof.h

Also, things like unique_ptr will never be freestanding either since there is no way to guarantee the implementation will provide a heap. Providing things like unique_ptr would pull dependency on the heap.

Only if you pull in std::make_unique. Even std::default_delete doesn't necessarily pull it in, since it can work with a specialization of delete that does not use an actual heap.
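
To make the "no heap required" point concrete, here is a sketch that uses a custom deleter (rather than the `std::default_delete` route mentioned above) to manage slots in a statically allocated pool. `Buffer`, `PoolRelease`, `acquire_buffer` and `buffers_in_use` are hypothetical names. It still includes `<memory>`, which is not a freestanding header, so it only shows that `unique_ptr` itself never calls `new` or `delete`.

```cpp
#include <assert.h>
#include <stddef.h>
#include <memory>

struct Buffer {
    char bytes[64];
    bool in_use;
};

enum { POOL_SIZE = 4 };
static Buffer g_pool[POOL_SIZE]; // static storage, zero-initialized, no heap involved

// Custom deleter: marks the slot as free instead of calling delete.
struct PoolRelease {
    void operator()(Buffer* b) const
    {
        assert(b >= g_pool && b < g_pool + POOL_SIZE);
        b->in_use = false;
    }
};

typedef std::unique_ptr<Buffer, PoolRelease> PooledBuffer;

// Hands out the first free slot, or a null PooledBuffer when the pool is exhausted.
PooledBuffer acquire_buffer()
{
    for (int i = 0; i < POOL_SIZE; ++i) {
        if (!g_pool[i].in_use) {
            g_pool[i].in_use = true;
            return PooledBuffer(&g_pool[i]);
        }
    }
    return PooledBuffer();
}

// Counts the slots currently handed out.
int buffers_in_use()
{
    int count = 0;
    for (int i = 0; i < POOL_SIZE; ++i)
        count += g_pool[i].in_use ? 1 : 0;
    return count;
}
```

When a `PooledBuffer` goes out of scope, its slot is returned to the pool automatically.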

You do not have things like std::vector, std::string, etc. either, and never will. In particular, they depend on std::logic_error for things like .at() and .substr(), which makes them unusable in a freestanding environment.

You just need to remove those from a freestanding version and all is fine (at least w.r.t. containers that don't use the heap). That's what P0829 does for std::array, std::optional, std::string_view and others, btw.

swang206 commented Feb 9, 2022

That is just very wrong. Why do you return std::unique_ptr<T> instead of just returning T? OOP Abuse?

Yes, they DO have issues with things like this. Google just benchmarked the abi issue with std::unique_ptr being not moved. They observed 1.6% performance lost in MACRO (not micro) benchmarks. (Without even considering things like this.) Yes, using unique_ptr is a huge performance bug.

Freestanding isn't any special case. In fact, over 95% of devices are embedded devices. C++ the language is totally misdesigned for environments in basically any environment, particularly embedded and kernel. Linus Torvalds is totally correct. C++ is terrible for the kernel because C++ was never designed for the environment it should be used.

That paper already proves WG21 is a joke and why you need orthodox C++, not modern C++. WG21 cares about adding security bugs like std::filesystem, std::format but ignoring the reality modern C++ is terrible for basically everything. They have a very bad priority on doing things.
https://www.youtube.com/watch?v=zozo8b7-nsw

When will be a working compiler for herbceptions??
When will be their fix to iostream? They really need to remove that and replace it with something else like fast_io instead. It is funny entire WG21 committee cannot find a solution for fixing iostream compared to one single person.
https://youtu.be/CefgZlXeMUg?t=1055

In fact, C++ is so problematic even for environments like wasm due to the lack of support of EH. I recently compile C++ to Lua, I would argue C++ EH would forever be an implementation issue that is totally unusable.

'''
Only if you pull in std::make_unique. Even std::default_delete doesn't necessarily pull it in, since it can work with a specialization of delete that does not use an actual heap.
'''

Those are already very bad. Ben Craig has said those things like heap won't be freestanding. You do not understand how things work at all tbh.

In embedded systems, you usually do not have a heap. You might argue you can add a heap to that. (I am sure embedded folks won't, due to the non-determinism the heap introduces.)

However, In the kernel, they usually have multiple heaps.
Windows kernel, for example, they have two heaps. One is interrupt-safe, another is none interrupt-safe. Assuming default to any of them is just totally wrong. Forcing a global default new is just impossible.

Not mentioning other issues like how to report allocation failures. If they said those facilities should just fail fast and make global operator new being noexcept, it would probably be better. However, there will be big company morons who whine about "our programs are too important, it cannot crash for some other shit"

Similar things like floating point, although CPU does provide floating-point facilities, using floating-point will force the OS kernel to save XMM registers for syscalls. That slows down syscall performance, making using floating-point not possible either.

I would argue only std::array (without at method) and std::span could be in the freestanding. (Yes, i see them as disasters too)
std::optional and std::string_view are designed in the way using EH for reporting logic_error too much, which makes them totally useless.

GabrielRavier commented Feb 10, 2022

That is just very wrong. Why do you return std::unique_ptr instead of just returning T? OOP Abuse?

Uh, I assumed you were complaining about using std::unique_ptr<T> in place of T *. Of course if one uses std::unique_ptr<T> instead of just T where it's not necessary, then of course that's just stupid, but it's the same as if you were to use T * everywhere for no reason, really (which is a pattern I happen to see often in C code, btw...). Also, it looks to me like Dictionary is part of a virtual type hierarchy, which would make it be a lot more logical to use pointers for it (and thus std::unique_ptr where appropriate). Of course, you could then complain about inheritance and all things like it, but std::unique_ptr can't be blamed for that.

Yes, they DO have issues with things like this. Google just benchmarked the abi issue with std::unique_ptr being not moved. They observed 1.6% performance lost in MACRO (not micro) benchmarks. (Without even considering things like this.) Yes, using unique_ptr is a huge performance bug.

Could I get a relevant link to a source on this? Your description of the problem is extremely vague and I can only vaguely guess what the actual problem they're talking about is (it seems doubtful that they're talking about one-line wrapper functions...)

Freestanding isn't any special case. In fact, over 95% of devices are embedded devices. C++ the language is totally misdesigned for environments in basically any environment, particularly embedded and kernel. Linus Torvalds is totally correct. C++ is terrible for the kernel because C++ was never designed for the environment it should be used.

That paper already proves WG21 is a joke and why you need orthodox C++, not modern C++. WG21 cares about adding security bugs like std::filesystem, std::format but ignoring the reality modern C++ is terrible for basically everything. They have a very bad priority on doing things.
youtube.com/watch?v=zozo8b7-nsw

While I can see how some of std::filesystem's design leads to security flaws in certain contexts (certainly the lack of a descriptor-based API is a flaw, although not that much avoidable considering how much work it was to get the API we have to be portable everywhere), I don't see the problem with std::format in that regard. User-controlled format strings are an obvious problem, and it's pretty much the same as printf in that regard. The OOM vulnerability mentioned in the video you link to later (You seem to have shuffled them up for some reason) is barely anything more than what printf would give you - in fact printf just straight up gives you the capacity to write to memory with %n, and it's not gonna be hard to cause a segfault or something like that even if you remove %n like bionic does.

When will be a working compiler for herbceptions??
When will be their fix to iostream? They really need to remove that and replace it with something else like fast_io instead. It is funny entire WG21 committee cannot find a solution for fixing iostream compared to one single person.
youtu.be/CefgZlXeMUg?t=1055

Ah, yes, C++ should completely remove one of their core libraries and immediately replace it with this random other library. This will totally not cause massive backwards compatibility problems...

In fact, C++ is so problematic even for environments like wasm due to the lack of support of EH. I recently compile C++ to Lua, I would argue C++ EH would forever be an implementation issue that is totally unusable.

'''
Only if you pull in std::make_unique. Even std::default_delete doesn't necessarily pull it in, since it can work with a specialization of delete that does not use an actual heap.
'''

Those are already very bad. Ben Craig has said those things like heap won't be freestanding. You do not understand how things work at all tbh.

What do you mean by that ? If you incorrectly use std::unique_ptr with std::default_delete and that results in the program trying to link to a heap where you can't have one (or can't have a sane default one that works everywhere), then it'll just fail linking. You'll just have to use std::unique_ptr correctly as a manager for a non-heap resource.

In embedded systems, you usually do not have a heap. You might argue you can add a heap to that. (I am sure embedded folks won't, due to the non-determinism the heap introduces.)

However, In the kernel, they usually have multiple heaps.
Windows kernel, for example, they have two heaps. One is interrupt-safe, another is none interrupt-safe. Assuming default to any of them is just totally wrong. Forcing a global default new is just impossible.

Well of course if you're working within a context where you have several custom heaps you might want to use, then you should just make it so that neither is the default (i.e. don't make new be either) and have people manually use the correct one. That doesn't preclude something like std::unique_ptr from existing, as long as you set it up to use the correct heap.

Not mentioning other issues like how to report allocation failures. If they said those facilities should just fail fast and make global operator new being noexcept, it would probably be better. However, there will be big company morons who whine about "our programs are too important, it cannot crash for some other shit"

Similar things like floating point, although CPU does provide floating-point facilities, using floating-point will force the OS kernel to save XMM registers for syscalls. That slows down syscall performance, making using floating-point not possible either.

Yes, and the paper doesn't want to just add them to freestanding, obviously (although it does discuss the possibility of adding it for micro-controllers with perfectly functioning FPUs that just don't otherwise have a full hosted implementation, but it's just a potential suggestion right now)

I would argue only std::array (without at method) and std::span could be in the freestanding. (Yes, i see them as disasters too)
std::optional and std::string_view are designed in the way using EH for reporting logic_error too much, which makes them totally useless.

Why would you need EH to be able to have a usable std::optional ? A freestanding version can just remove std::optional::value and that'll be it. People will be able to use it safely with std::optional::has_value and std::optional::operator*.
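
A small sketch of that usage style, assuming C++17 and hypothetical helper names `find_byte` and `index_or_minus_one`: the caller tests `has_value()` and then dereferences with `operator*`, so the throwing `value()` member is never touched.

```cpp
#include <assert.h>
#include <stddef.h>
#include <optional>

// Returns the index of the first occurrence of `needle`, or an empty optional.
std::optional<size_t> find_byte(const char* data, size_t size, char needle)
{
    assert(data != NULL || size == 0);
    for (size_t i = 0; i < size; ++i) {
        if (data[i] == needle)
            return i;
    }
    return std::nullopt;
}

// The check comes first, then operator*; no exception can be thrown on this path.
long index_or_minus_one(const char* data, size_t size, char needle)
{
    std::optional<size_t> found = find_byte(data, size, needle);
    return found.has_value() ? (long)*found : -1L;
}
```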

Same thing for std::string_view, just remove std::string_view::at, std::string_view::copy, std::string_view::substr and a few specializations of std::string_view::compare and you've still got a mostly functional version of it (although I do agree it is a bit more limited than std::optional in that regard...)
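
As an illustration of living without the throwing members, a clamped slice can be built from `data()` and `size()` alone. The helper name `slice` is hypothetical; unlike `substr`, which throws `std::out_of_range` when the position is past the end, it simply returns an empty view.

```cpp
#include <assert.h>
#include <stddef.h>
#include <string_view>

// Returns at most `count` characters starting at `pos`, clamped to the view's bounds.
std::string_view slice(std::string_view s, size_t pos, size_t count)
{
    if (pos > s.size())
        pos = s.size();
    if (count > s.size() - pos)
        count = s.size() - pos;
    assert(pos + count <= s.size());
    return std::string_view(s.data() + pos, count);
}
```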

PedroAreiasIST commented Feb 10, 2022

If only values of intrinsic types are allowed to be returned, most of C++ ownership/resource management problems disappear.
Ownership is delegated upwards.
An example without return types?
Sparse matrix multiplication with CSR (compact sparse row) in 5 levels:

  1. Allocate row space for the result <- ownership of caller -> level1(A,B,C.NR)
  2. Calculate row starting indices for the result. -> level2(A,B,C.NR,C.IC)
  3. Allocate column space for the result and float space (NNZ) <- ownership of caller -> level3(A,B,C.NR,C.IC,C.NNZ)
  4. Calculate column indices for each row -> level4(A,B,C.NR,C.IC,C.JC,C.NNZ)
  5. Perform the floating point multiplication ->level5(A,B,C.NR,C.IC,C.JC,C.NNZ,C.VALUES)
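
The staged approach can be sketched roughly as follows. This is an illustration under assumptions rather than the commenter's actual code: `CsrView`, `spgemm_row_starts` and `spgemm_fill` are hypothetical names, levels 4 and 5 are merged into one pass for brevity, and column indices within a row come out unsorted. Every array, including the `marker` workspace of `b.cols` integers, is allocated and owned by the caller.

```cpp
#include <assert.h>

// Read-only CSR view; all arrays are owned by the caller.
struct CsrView {
    int rows, cols;
    const int* row_start; // rows + 1 entries
    const int* col;       // row_start[rows] entries
    const double* val;    // row_start[rows] entries
};

// Level 2: fills c_row_start (a.rows + 1 entries) and returns the nonzero count of C.
int spgemm_row_starts(const CsrView& a, const CsrView& b, int* c_row_start, int* marker)
{
    assert(a.cols == b.rows);
    for (int j = 0; j < b.cols; ++j) marker[j] = -1;
    int nnz = 0;
    for (int i = 0; i < a.rows; ++i) {
        c_row_start[i] = nnz;
        for (int p = a.row_start[i]; p < a.row_start[i + 1]; ++p) {
            int k = a.col[p];
            for (int q = b.row_start[k]; q < b.row_start[k + 1]; ++q) {
                int j = b.col[q];
                if (marker[j] != i) { marker[j] = i; ++nnz; } // first hit of column j in row i
            }
        }
    }
    c_row_start[a.rows] = nnz;
    return nnz;
}

// Levels 4 and 5: fills column indices and values into caller-allocated arrays.
void spgemm_fill(const CsrView& a, const CsrView& b, const int* c_row_start,
                 int* c_col, double* c_val, int* marker)
{
    for (int j = 0; j < b.cols; ++j) marker[j] = -1;
    for (int i = 0; i < a.rows; ++i) {
        int begin = c_row_start[i];
        int next = begin;
        for (int p = a.row_start[i]; p < a.row_start[i + 1]; ++p) {
            int k = a.col[p];
            double av = a.val[p];
            for (int q = b.row_start[k]; q < b.row_start[k + 1]; ++q) {
                int j = b.col[q];
                if (marker[j] < begin) {            // column j not yet present in row i
                    marker[j] = next;
                    c_col[next] = j;
                    c_val[next] = av * b.val[q];
                    ++next;
                } else {
                    c_val[marker[j]] += av * b.val[q];
                }
            }
        }
        assert(next == c_row_start[i + 1]);
    }
}
```

The caller allocates `c_row_start`, calls `spgemm_row_starts`, uses the returned count to allocate `c_col` and `c_val`, and then calls `spgemm_fill`. No function returns an owning pointer, so ownership never leaves the caller.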

I would say that a deep-copy smart pointer can be useful. My view is to adopt value semantics everywhere with dependence management being dealt by a directed graph.

Is this the ultra-orthodox C++ ?

swang206 commented Feb 10, 2022

Uh, I assumed you were complaining about using std::unique_ptr in place of T *. Of course if one uses std::unique_ptr instead of just T where it's not necessary, then of course that's just stupid, but it's the same as if you were to use T * everywhere for no reason, really (which is a pattern I happen to see often in C code, btw...). Also, it looks to me like Dictionary is part of a virtual type hierarchy, which would make it be a lot more logical to use pointers for it (and thus std::unique_ptr where appropriate). Of course, you could then complain about inheritance and all things like it, but std::unique_ptr can't be blamed for that.

std::unique_ptr should definitely be blamed for that since it makes the abuse so easy.

Could I get a relevant link to a source on this? Your description of the problem is extremely vague and I can only vaguely guess what the actual problem they're talking about is (it seems doubtful that they're talking about one-line wrapper functions...)

https://releases.llvm.org/13.0.0/projects/libcxx/docs/DesignDocs/UniquePtrTrivialAbi.html

Google has measured performance improvements of up to 1.6% on some large server macrobenchmarks, and a small reduction in binary sizes.

While I can see how some of std::filesystem's design leads to security flaws in certain contexts (certainly the lack of a descriptor-based API is a flaw, although not that much avoidable considering how much work it was to get the API we have to be portable everywhere), I don't see the problem with std::format in that regard. User-controlled format strings are an obvious problem, and it's pretty much the same as printf in that regard. The OOM vulnerability mentioned in the video you link to later (You seem to have shuffled them up for some reason) is barely anything more than what printf would give you - in fact printf just straight up gives you the capacity to write to memory with %n, and it's not gonna be hard to cause a segfault or something like that even if you remove %n like bionic does.

std::format is a HUGE issue.

  1. Implementation is insanely complex. Nobody can guarantee it is 100% bug-free. https://github.com/bminor/glibc/tree/master/stdio-common
    https://github.com/bminor/glibc/blob/master/stdio-common/vfprintf-internal.c
    From the experience of glibc, they even told you it is bugged. Even after 30 years of development. Anytime you have a bug with std::format, the attacker can do whatever they want.
  2. Even ignoring the implementation issue, it is still extremely bad. Not just denial of service issue. The attacker can exploit that to fill all your disk and even worse if someone prints data to socket with std::format, they can exploit it to create bot net to attack other computers.

Log4j is exactly the format string vulnerability same with std::format. The history has proven format string is a historical mistake and a huge security hazard no matter how you deal with it.

https://www.netsparker.com/blog/web-security/format-string-vulnerabilities/

What you really need is non-format-string solutions for daily work. Sure, you might argue that sometimes you need format strings for localization, but there is no reason you need the powerful functionality std::format provides if your argument is just localization. Things like width and floating-point precision should NEVER EVER be part of a localization format string, and the functionality should just NEVER fail.

Just

print("Hello World",3);

instead of

format_print("Hello World{}",3);
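
A minimal sketch of what a format-string-free `print` could look like, assuming C++17 fold expressions. This is not fast_io's API: `OutBuf`, `put` and this `print` are hypothetical, and the sketch writes into a fixed buffer rather than stdout so that the result is easy to inspect. Each argument is dispatched by type through overloading, so no caller-supplied string is ever interpreted as a format.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

// Fixed-size output buffer; initialize with `OutBuf o = {};`.
struct OutBuf {
    char data[128];
    size_t len;
};

// Appends a C string, truncating instead of overflowing.
inline void put(OutBuf& o, const char* s)
{
    assert(o.len < sizeof(o.data));
    size_t n = strlen(s);
    size_t room = sizeof(o.data) - 1 - o.len;
    if (n > room)
        n = room;
    memcpy(o.data + o.len, s, n);
    o.len += n;
    o.data[o.len] = '\0';
}

// Appends an int in decimal. The "%d" here is a fixed literal, never user input.
inline void put(OutBuf& o, int v)
{
    char tmp[16];
    snprintf(tmp, sizeof(tmp), "%d", v);
    put(o, tmp);
}

// Appends every argument in order; overload resolution picks the right put().
template <class... Args>
void print(OutBuf& o, const Args&... args)
{
    (put(o, args), ...);
}
```

Calling `print(o, "Hello World", 3)` leaves `Hello World3` in `o.data`; a real implementation would add overloads for more types and write to a file or handle.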

Ah, yes, C++ should completely remove one of their core libraries and immediately replace it with this random other library. This will totally not cause massive backwards compatibility problems...

Nobody said iostream should just be immediately removed or something. It has to go through a transition period.
In fact, fast_io does provide compatibility layer for C++ stream.
https://github.com/tearosccebe/fast_io/blob/c6d7e9d73bd3dfccc995b39666db836e885400a6/include/fast_io_legacy_impl/filebuf/filebuf_file.h#L39
Guess what? No unique_ptr nonsense. Not only it provides an RAII wrapper for std::filebuf, but allows you to construct std::filebuf from NT, win32 handle, posix fd and FILE*. It also allows you to get HANDLE, fd and FILE* from std::filebuf which is something C++ fstream does not provide. They do not expose native handle and you cannot use them with OS apis.

You have to do something to iostream finally.

  1. 50% of C++ projects ban iostream, including GCC and LLVM (even compiler vendors themselves ban it).
  2. They have caused enough issues and dialects. Issues like std::filesystem being TOCTOU is exactly because iostream cannot play the role what POSIX fd should play.
  3. Standard keeps adding iostream dialects like std::filesystem, std::format and also future networking, process stuffs. C++ networking has become a dead horse since it is an iostream dialect. Lack of cryptography issue is totally problematic for networking. These things are just doing what exactly iostream should or could do (because POSIX file descriptors work this way), but causing integration issues.
  4. They bloat binary size since compilers do not remove dead virtual functions.
  5. iostream is not thread-safe due to its usage of std::locale. locale is not thread-safe.
  6. iostream is not exception-safe either.
  7. Both stdio and iostream are EXTREMELY EXTREMELY slow. Even the fastest implementations are at least 10x slower than fast_io. Some implementations can slow up to 50000 times.

https://www.youtube.com/watch?v=grWw7j54KEY

What do you mean by that ? If you incorrectly use std::unique_ptr with std::default_delete and that results in the program trying to link to a heap where you can't have one (or can't have a sane default one that works everywhere), then it'll just fail linking. You'll just have to use std::unique_ptr correctly as a manager for a non-heap resource.

Failing to link is exactly the issue. You won't be able to easily tell what's wrong. You won't see "malloc symbol is not defined"; instead it might show that it requires _sbrk(), _fork() syscalls, etc. How are you going to implement them if you do not know how those functionalities work?

Using std::unique_ptr as a manager for a non-heap resource is EXACTLY the issue. Why not just write a new c_file class instead of using std::unique_ptr<FILE,std::function<decltype(fclose)>>?

You think code like

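// Pass fclose as the deleter; a default-constructed std::function is empty and would throw std::bad_function_call on destruction.
std::unique_ptr<std::FILE,std::function<decltype(fclose)>> uptr(fopen("a.txt","wb"),fclose);
if(!uptr)
     throw std::system_error(errno,std::generic_category());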

is more readable than just defining your own class and doing all the error handling inside the constructors? Just writing a new class not only makes the code more readable and less error-prone but also faster.

c_file cf("a.txt",open_mode::out);

https://github.com/tearosccebe/fast_io/blob/c6d7e9d73bd3dfccc995b39666db836e885400a6/include/fast_io_legacy_impl/c/impl.h#L974
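
For readers who do not want to follow the link, the dedicated-class approach argued for here can be sketched roughly as follows. This is only a minimal illustration: `c_file_sketch` and `native_handle()` are hypothetical names, it takes a plain fopen mode string rather than an `open_mode`, and it is not fast_io's actual `c_file`.

```cpp
#include <cerrno>
#include <cstdio>
#include <system_error>
#include <utility>

// Hypothetical minimal owner of a std::FILE*; NOT fast_io's real c_file.
class c_file_sketch {
    std::FILE* fp_ = nullptr;

    // Closes the stream if one is owned; safe to call repeatedly.
    void close() noexcept
    {
        if (fp_ != nullptr) {
            std::fclose(fp_);
            fp_ = nullptr;
        }
    }

public:
    // Opens the file or throws, so a constructed object always owns a stream.
    c_file_sketch(char const* path, char const* mode)
        : fp_(std::fopen(path, mode))
    {
        if (fp_ == nullptr)
            throw std::system_error(errno, std::generic_category());
    }

    // Single owner: copying is disabled, moving transfers the stream.
    c_file_sketch(c_file_sketch const&) = delete;
    c_file_sketch& operator=(c_file_sketch const&) = delete;

    c_file_sketch(c_file_sketch&& other) noexcept
        : fp_(std::exchange(other.fp_, nullptr)) {}

    c_file_sketch& operator=(c_file_sketch&& other) noexcept
    {
        if (this != &other) {
            close();
            fp_ = std::exchange(other.fp_, nullptr);
        }
        return *this;
    }

    ~c_file_sketch() { close(); }

    std::FILE* native_handle() const noexcept { return fp_; }
};
```

With this sketch the call site shrinks to something like `c_file_sketch cf("a.txt", "wb");`, with the error check living in the constructor.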

I can even add a lot of other functionality, like allowing a c_file to be constructed from a directory entry (that would avoid TOCTOU, another POSIX capability that C++ std::filesystem lacks) and allowing c_file to be constructed from an NT handle, a win32 HANDLE, or a POSIX fd. I can even provide functionality for setting permissions. Are those things you can easily do with your unique_ptr abuse without a crazy number of factory functions?

#if !defined(__AVR__)
	basic_c_family_file(basic_posix_file<char_type>&& phd,open_mode om):
		basic_c_family_io_observer<family,ch_type>{details::my_c_file_open_impl(phd.fd,om)}
	{
		phd.fd=-1;
	}
#if (defined(_WIN32)&&!defined(__WINE__)) || defined(__CYGWIN__)
//windows specific. open posix file from win32 io handle
	template<win32_family wfamily>
	basic_c_family_file(basic_win32_family_file<wfamily,char_type>&& win32_handle,open_mode om):
		basic_c_family_file(basic_posix_file<char_type>(::fast_io::freestanding::move(win32_handle),om),om)
	{
	}
	template<nt_family nfamily>
	basic_c_family_file(basic_nt_family_file<nfamily,char_type>&& nt_handle,open_mode om):
		basic_c_family_file(basic_posix_file<char_type>(::fast_io::freestanding::move(nt_handle),om),om)
	{
	}
#endif
	basic_c_family_file(native_fs_dirent ent,open_mode om,perms pm=static_cast<perms>(436)):
		basic_c_family_file(basic_posix_file<char_type>(ent,om,pm),om)
	{}
	template<::fast_io::constructible_to_os_c_str T>
	basic_c_family_file(T const& file,open_mode om,perms pm=static_cast<perms>(436)):
		basic_c_family_file(basic_posix_file<char_type>(file,om,pm),om)
	{}
	template<::fast_io::constructible_to_os_c_str T>
	basic_c_family_file(native_at_entry nate,T const& file,open_mode om,perms pm=static_cast<perms>(436)):
		basic_c_family_file(basic_posix_file<char_type>(nate,file,om,pm),om)
	{}
#endif

In reality, people just write shit like std::unique_ptr<FILE,std::function<decltype(fclose)>>, causing performance issues and security hazards (due to silently introducing dynamic dispatch).

There is no way you can use std::unique_ptr to manage a non-heap resource without paying for extra overhead. Even if you use things like lambdas etc., it will bloat binary size for no reason, because the compiler treats each lambda with a function body as a different type.
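
One part of the overhead question that is easy to check is object size. The sketch below declares three deleter choices for a `std::FILE*` so their `sizeof` can be compared; the alias names and `fclose_deleter` are made up for illustration. It measures object size only and says nothing about the binary-size or dispatch costs argued above, and the exact numbers depend on the standard library implementation.

```cpp
#include <cstddef>
#include <cstdio>
#include <functional>
#include <memory>

// Stateless deleter: the call to fclose is encoded in the type itself.
struct fclose_deleter {
    void operator()(std::FILE* fp) const noexcept { std::fclose(fp); }
};

// Three ways to spell "unique_ptr that closes a FILE*" (names are hypothetical).
using file_ptr_functor  = std::unique_ptr<std::FILE, fclose_deleter>;
using file_ptr_fnptr    = std::unique_ptr<std::FILE, int (*)(std::FILE*)>;
using file_ptr_function = std::unique_ptr<std::FILE, std::function<int(std::FILE*)>>;

// Object sizes, to be inspected or printed by the caller.
constexpr std::size_t size_functor  = sizeof(file_ptr_functor);
constexpr std::size_t size_fnptr    = sizeof(file_ptr_fnptr);
constexpr std::size_t size_function = sizeof(file_ptr_function);
```

The function-pointer and std::function variants have to store the deleter next to the pointer, so they cannot be smaller than two pointers; whether the empty functor adds nothing is left to the implementation.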

And how do you deal with resources that are not pointers with unique_ptr? Like POSIX fd? unique_ptr in reality is always just an abuse.

https://youtu.be/zxFNnOFYCkI?t=888
Not to mention that a lot of people abuse it for creating data structures like linked lists, leading to stack overflow because of the recursive destructor calls.
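
To make the recursion problem concrete: with a `std::unique_ptr<node> next` member, destroying the head destroys `next`, which destroys its `next`, and so on, one stack frame per node. A common workaround, sketched here with hypothetical names (`node`, `forward_list_sketch`), is to unlink nodes in a loop so that every node is destroyed with an empty `next`.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

struct node {
    int value = 0;
    std::unique_ptr<node> next;
};

// Hypothetical singly linked list that owns its nodes through unique_ptr.
class forward_list_sketch {
    std::unique_ptr<node> head_;
    std::size_t size_ = 0;

public:
    forward_list_sketch() = default;
    forward_list_sketch(forward_list_sketch const&) = delete;
    forward_list_sketch& operator=(forward_list_sketch const&) = delete;

    void push_front(int v)
    {
        auto n = std::make_unique<node>();
        n->value = v;
        n->next = std::move(head_);
        head_ = std::move(n);
        ++size_;
    }

    std::size_t size() const noexcept { return size_; }

    ~forward_list_sketch()
    {
        // Detach the tail first, then drop the old head. The old head's
        // `next` is already empty, so its destructor does not recurse.
        while (head_) {
            std::unique_ptr<node> rest = std::move(head_->next);
            head_ = std::move(rest);
        }
    }
};
```

Without the loop in the destructor, the default member-wise destruction would recurse once per node.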

Conclusion: using unique_ptr is always just wrong.

Why would you need EH to be able to have a usable std::optional ? A freestanding version can just remove std::optional::value and that'll be it. People will be able to use it safely with std::optional::has_value and std::optional::operator*.
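
A minimal sketch of that usage pattern, assuming nothing beyond `has_value()` and `operator*` (the functions `find_port` and `port_or_default` are invented for illustration):

```cpp
#include <optional>

// Hypothetical lookup that may fail; it reports failure with an empty
// optional instead of throwing.
inline std::optional<int> find_port(bool configured)
{
    if (configured)
        return 8080;
    return std::nullopt;
}

// Reads the optional using only has_value() and operator*, never value(),
// so no std::bad_optional_access can be thrown from here.
inline int port_or_default(std::optional<int> const& port, int fallback) noexcept
{
    return port.has_value() ? *port : fallback;
}
```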

Same thing for std::string_view, just remove std::string_view::at, std::string_view::copy, std::string_view::substr and a few specializations of std::string_view::compare and you've still got a mostly functional version of it (although I do agree it is a bit more limited than std::optional in that regard...)
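
For code that still wants substring behaviour without the throwing member, one option is a small free function that clamps its arguments instead of throwing. `clamped_substr` below is a hypothetical helper, not part of any standard or proposal:

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: behaves like std::string_view::substr, except that an
// out-of-range `pos` yields an empty view instead of throwing std::out_of_range.
constexpr std::string_view clamped_substr(std::string_view sv, std::size_t pos,
                                          std::size_t count = std::string_view::npos) noexcept
{
    if (pos > sv.size())
        pos = sv.size();
    std::size_t const remaining = sv.size() - pos;
    if (count > remaining)
        count = remaining;
    return std::string_view(sv.data() + pos, count);
}
```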

It is so easy to end up somewhere you do not want to go, particularly in a hosted environment. Since it throws exceptions, it would randomly bloat binary size, and you have no way to detect that without using a freestanding toolchain. I once had to fix a bug related to using std::string_view's substr, which caused serious binary bloat. What I ended up doing was just banning std::string_view entirely, because it is too awful.

Not to mention other problems, like silently introducing std::string as a dependency in header files, causing an enormous amount of compilation-time pain.


GabrielRavier commented Feb 10, 2022

std::unique_ptr should definitely be blamed for that since it makes the abuse so easy.

Why ? It doesn't make it any easier than T *...

releases.llvm.org/13.0.0/projects/libcxx/docs/DesignDocs/UniquePtrTrivialAbi.html

Welp, you do have a point here. I actually already knew about this, and it is true that std::unique_ptr has found itself disadvantaged by the Itanium ABI, although this is more of an implementation-specific issue than anything else, and the link you gave is about the solution to that, so... (note: I was objecting to your old statement because considering the way you formulated it, you appeared to be specifically arguing that the performance problems were caused by one-line methods in headers)

std::format is a HUGE issue.

  1. Implementation is insanely complex. Nobody can guarantee it is 100% bug-free. github.com/bminor/glibc/tree/master/stdio-common
    github.com/bminor/glibc/blob/master/stdio-common/vfprintf-internal.c
  2. From the experience of glibc, they even told you it is bugged. Even after 30 years of development. Anytime you have a bug with std::format, the attacker can do whatever they want.
    Even ignoring the implementation issue, it is still extremely bad. Not just denial of service issue. The attacker can exploit that to fill all your disk and even worse if someone prints data to socket with std::format, they can exploit it to create bot net to attack other computers.

Log4j is exactly the same kind of format string vulnerability as with std::format. History has proven that format strings are a historical mistake and a huge security hazard no matter how you deal with them.

netsparker.com/blog/web-security/format-string-vulnerabilities

What you really need is non-format-string solutions for daily work. Sure, you might argue that sometimes you need format strings for localization, but if your argument is just localization, there is no reason you need the powerful functionality std::format has. Things like width and floating-point precision should NEVER EVER be part of a localization format string, and the functionality should just NEVER fail.

I mean, your point about format strings is interesting, but it seems odd to specifically bash C++ for adding format strings that are nicely usable in C++ code when this is something that has been done by most languages out there...

Also, the log4j format string vulnerability isn't exactly something inherent to format strings... the way I see it, the vulnerability is basically like if %s in printf would execute its operand as a shell command if it starts with ${system: or something stupid like that. It's not something inherent to format strings and I could very easily see it happening outside of there.. certainly Log4j could have had the same vulnerability without using a format string (i.e. if log.info("something '{}'.\n", userControlled) is vulnerable, I see no reason why an identical API except made without format strings wouldn't find itself just as much vulnerable when people do log.info("something '", userControlled, "'.\n")...).

Nobody said iostream should just be immediately removed or something. It has to go through a transition period.

Well I apologize for the misunderstanding then, the way you formulated it seemed to imply otherwise. I actually do share most of your grievances with iostream btw, I certainly do consider it to generally be pretty shit (although I have some major doubts about fast_io being the magic solution to it all...)

Failing to link is exactly the issue. You won't be able to easily tell what's wrong with it. You won't see "malloc symbol is not defined"; instead it might show things like it requires _sbrk(), _fork() syscalls etc. How are you going to implement them if you do not know how those functionalities work?

??? Under what environment would you see that happen ? If you accidentally link in the entire libc to your kernel or something like that you're gonna have much bigger problems than this...

Conclusion: using unique_ptr is always just wrong.

Giving examples of obviously stupid usages of unique_ptr isn't gonna break it... in the same way I could say something like "using switch is always just wrong" on the basis of the massive amount of "oops forgot a break" fuckups.

It is so easy to end up somewhere you do not want to go, particularly in a hosted environment. Since it throws exceptions, it would randomly bloat binary size, and you have no way to detect that without using a freestanding toolchain. I once had to fix a bug related to using std::string_view's substr, which caused serious binary bloat. What I ended up doing was just banning std::string_view entirely, because it is too awful.

Not to mention other problems, like silently introducing std::string as a dependency in header files, causing an enormous amount of compilation-time pain.

You could also just, yunno, have the standard say that the toolchain #ifdefs out the exception-throwing methods when in freestanding mode... If the standard added such a thing (which is what P0829 is proposing) you'd be able to easily check if you've accidentally used the forbidden functions by compiling in freestanding mode (should be easy enough if you're writing a library intended for that kind of usage), which would directly give you an explicit "unknown method" error while compiling the offending TU.
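
As a rough illustration of the idea (not of P0829's actual wording or mechanism, which is not quoted in this thread), a library type could guard its throwing member behind the standard `__STDC_HOSTED__` macro. `byte_view_sketch` is a hypothetical type:

```cpp
#include <cstddef>
#if __STDC_HOSTED__
#include <stdexcept>
#endif

// Hypothetical minimal view type whose throwing member only exists when the
// implementation reports itself as hosted.
class byte_view_sketch {
    char const* data_;
    std::size_t size_;

public:
    constexpr byte_view_sketch(char const* d, std::size_t n) noexcept
        : data_(d), size_(n) {}

    constexpr std::size_t size() const noexcept { return size_; }

    // Unchecked access: available in hosted and freestanding builds alike.
    constexpr char operator[](std::size_t i) const noexcept { return data_[i]; }

#if __STDC_HOSTED__
    // Checked access throws, so it is compiled out of freestanding builds;
    // calling it there would be a plain "no such member" compile error.
    char at(std::size_t i) const
    {
        if (i >= size_)
            throw std::out_of_range("byte_view_sketch::at");
        return data_[i];
    }
#endif
};
```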


swang206 commented Feb 11, 2022

Why ? It doesn't make it any easier than T *...

If it is T*, it would prevent you from using the feature because you have to worry about memory leak then you have to change your design. It forces a tax on you so you would create designs without pointer tricks like type-erasure for example. With unique_ptr, the tax is much smaller and people just abuse it everywhere. That is exactly the problem of C++ abstractions. There are no zero-overhead abstractions but people like you just think they are zero-overhead and abuse it everywhere. The unique_ptr just creates a huge amount of mess including tons of type-confusions, vptr injection, etc. unique_ptr abuse in the chromium code base is exactly why chromium is plagued with security vulns. Now they are saying they are banning unique_ptr and moving to shared_ptr now. However, it is just another wave of abuse.

Of course google just blames C++ for violating human rights and they are going to RIIR chromium.

I mean, your point about format strings is interesting, but it seems odd to specifically bash C++ for adding format strings that are nicely usable in C++ code when this is something that has been done by most languages out there...

"Most languages have it" is a bad excuse for adding a feature to another language. Every language has its own use cases and design goals. Some features that make sense in one language do not necessarily work in another.
One example is clearly garbage collection. 10 years ago, people were arguing that C++ should add garbage collection because, just like you said, "most languages have garbage collectors". And C++ did. Guess what? It has proven a complete failure and will be removed in C++23. No compiler supports it and C++ people do not want it either.

The same failures happen with std::regex, what is that disaster right now? std::format is regex 2.0.

fast_io has no magic, it just prints parameters one by one with variadic templates. No magic at all. Meanwhile things like std::format have a crazy amount of machinery behind them. Since it is simple, it actually works in the Windows kernel, the Linux kernel, and bare-metal operating systems.
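
The "print parameters one by one with variadic templates" idea can be sketched in a few lines. This is an illustration of the general technique only; `sketch::concat` and `sketch::append_one` are invented names, not fast_io's API, and the sketch appends to a std::string rather than writing to a file.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch {

// Appends text as-is; no format string is ever parsed.
inline void append_one(std::string& out, std::string_view s)
{
    out.append(s);
}

// Appends a signed integer in decimal.
inline void append_one(std::string& out, long long v)
{
    char buf[24];
    char* const end = buf + sizeof buf;
    char* p = end;
    // Work on the magnitude as unsigned so the most negative value is safe.
    unsigned long long u = v < 0 ? 0ULL - static_cast<unsigned long long>(v)
                                 : static_cast<unsigned long long>(v);
    do {
        *--p = static_cast<char>('0' + u % 10);
        u /= 10;
    } while (u != 0);
    if (v < 0)
        *--p = '-';
    out.append(p, static_cast<std::size_t>(end - p));
}

// Handles each argument in order via a C++17 fold expression; the overload
// for each argument is chosen at compile time.
template <typename... Args>
std::string concat(Args const&... args)
{
    std::string out;
    (append_one(out, args), ...);
    return out;
}

} // namespace sketch
```

A call such as `sketch::concat("x=", 42, "\n")` resolves to one `append_one` call per argument, with no runtime parsing step in between.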

??? Under what environment would you see that happen ? If you accidentally link in the entire libc to your kernel or something like that you're gonna have much bigger problems than this...

Do you think C++ freestanding really works? Even GCC's --disable-hosted-libstdcxx just fails to build. I complained to GCC a lot to the point they hate me of course. They finally fixed it this year in GCC12.0.1 due to my complaints. Which has not even been released yet.

Oh, you ask about clang? Or MSVC? They do not even provide a freestanding C++ implementation. Guess what? You do not even have cstdint. That is thanks to all the mess WG21 created, like wasting time and resources adding shit like std::filesystem, charconv, std::format, instead of fixing freestanding.

In reality, most people just use newlib-cygwin (https://github.com/mirror/newlib-cygwin). That creates a huge amount of unreadable linkage errors.

Giving examples of obviously stupid usages of unique_ptr isn't gonna break it... in the same way I could say something like "using switch is always just wrong" on the basis of the massive amount of "oops forgot a break" fuckups.

Oh yeah, for typical modern C++ users like you, you would always think all C features are wrong yeah for sure.

You could also just, yunno, have the standard say that the toolchain #ifdefs out the exception-throwing methods when in freestanding mode... If the standard added such a thing (which is what P0829 is proposing) you'd be able to easily check if you've accidentally used the forbidden functions by compiling in freestanding mode (should be easy enough if you're writing a library intended for that kind of usage), which would directly give you an explicit "unknown method" error while compiling the offending TU.

EASY ENOUGH LOL. Said by a typical modern C++ user like you who lives in an echo chamber and does not even know how to do Canadian compilation.

To test things like freestanding, I have to do Canadian compilation to build over 20 cross toolchains that run on windows, built on linux to ensure every system works correctly. But you are saying it is "EASY ENOUGH".

Dude, you have no idea what you are talking about.


swang206 commented Feb 11, 2022

Welp, you do have a point here. I actually already knew about this, and it is true that std::unique_ptr has found itself disadvantaged by the Itanium ABI, although this is more of an implementation-specific issue than anything else, and the link you gave is about the solution to that, so... (note: I was objecting to your old statement because considering the way you formulated it, you appeared to be specifically arguing that the performance problems were caused by one-line methods in headers)

LOL. Typical modern C++ user. Blaming the implementation not blaming the language itself. The language is poorly designed and ignores reality. Some jokes like "compilers should optimize exceptions" and other C++ abstractions are laughable.

The reality is that compilers are not GOD. They are not general-purpose AI either. If you truly believe compilers can deal with all the shit C++ "zero-overhead" abstractions created, you should believe compilers can just program themselves instead. Oh, then why do you even need a compiler? The AI programs itself to produce machine code. No reason to program anymore. So zero-overhead abstraction.

Zero-overhead (zero-cost) abstractions are one of the biggest lies in programming history. The reality is that compilers are f**king dumb. I have never seen a case where the compiler does the right thing. Of course, people like you would just blame the implementation instead of blaming WG21 for creating low-quality libraries that are extremely hard for compilers to deal with.


swang206 commented Feb 11, 2022

(i.e. if log.info("something '{}'.\n", userControlled) is vulnerable, I see no reason why an identical API except made without format strings wouldn't find itself just as much vulnerable when people do log.info("something '", userControlled, "'.\n")...).

No, there is a huge difference. Because you can do things like

log.info(userControlled, blah blah)

for format string.

The API allows you to control the format string. They can literally do whatever they want.
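
The same distinction shows up with the printf family, which is a convenient way to demonstrate it. The sketch below (with a hypothetical `log_line` helper) always passes untrusted text as an argument to a fixed format string; the dangerous variant is only shown in a comment and never executed.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical logging helper: the format string is a literal and the
// untrusted text is only ever an argument, so any '%' in it is plain data.
inline std::string log_line(char const* user_controlled)
{
    char buf[128];
    int const n = std::snprintf(buf, sizeof buf, "user said: %s", user_controlled);

    // The vulnerable variant would hand the untrusted text over as the
    // format string itself (deliberately not executed here):
    //     std::snprintf(buf, sizeof buf, user_controlled);

    if (n < 0)
        return std::string();
    // snprintf reports the length it wanted; clamp to what fit in buf.
    std::size_t const len = static_cast<std::size_t>(n) < sizeof buf
                                ? static_cast<std::size_t>(n)
                                : sizeof buf - 1;
    return std::string(buf, len);
}
```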

Anyone who defends format string = defend bugs and security vulns. Nothing more.

Format strings have proven to be a trillion-dollar mistake, much worse than pointers being nullable (so you need std::optional), which modern C++ folks love to bitch about.


GabrielRavier commented Feb 11, 2022

If it is T*, it would prevent you from using the feature because you have to worry about memory leak then you have to change your design. It forces a tax on you so you would create designs without pointer tricks like type-erasure for example. With unique_ptr, the tax is much smaller and people just abuse it everywhere.

I mean yes, if you don't understand what a pointer is, you might misuse unique_ptr in such a way, even though unique_ptr is pretty clear on that...

That is exactly the problem of C++ abstractions. There are no zero-overhead abstractions but people like you just think they are zero-overhead and abuse it everywhere. The unique_ptr just creates a huge amount of mess including tons of type-confusions, vptr injection, etc. unique_ptr abuse

It seems to me like normal pointers also create the exact same "mess".

in the chromium code base is exactly why chromium is plagued with security vulns. Now they are saying they are banning unique_ptr and transferring to shared_ptr now. However, it is just another wave of abuse.

Uhhhh... I'm pretty sure they're actually transferring to a massive union w.r.t. the base::Value stuff... shared_ptr has been banned from Chromium for quite a long time now, btw

Of course google just blames C++ for violating human rights and they are going to RIIR chromium.

?????

"Most languages have it" is a bad excuse for adding a feature to another language. Every language has its own use cases and design goals. Some features that make sense in one language do not necessarily work in another.
One example is clearly garbage collection. 10 years ago, people were arguing that C++ should add garbage collection because, just like you said, "most languages have garbage collectors". And C++ did. Guess what? It has proven a complete failure and will be removed in C++23. No compiler supports it and C++ people do not want it either.

Adding garbage collection was a bit stupid, yes. Format strings aren't exactly anything like it, though. GC was kind of useless because C++ is not a language made to be anywhere near that kind of stuff. Format strings are just a nice way to express how to format data, so there isn't much of a reason that they wouldn't be applicable to C++ just as much as to basically every other language.

The same failures happen with std::regex, what is that disaster right now? std::format is regex 2.0.

std::regex was a complete failure partly because of its design and also because of its implementations. That doesn't mean the very idea of regexes is inapplicable to C++ lol.

Do you think C++ freestanding really works? Even GCC's --disable-hosted-libstdcxx just fails to build. I complained to GCC a lot to the point they hate me of course. They finally fixed it this year in GCC12.0.1 due to my complaints. Which has not even been released yet.

Oh, you ask about clang? Or MSVC? They do not even provide a freestanding C++ implementation. Guess what? You do not even have cstdint. That is thanks to all the mess WG21 created, like wasting time and resources adding shit like std::filesystem, charconv, std::format, instead of fixing freestanding.

While WG21 isn't the compiler vendors and they thus aren't exactly responsible for Clang and MSVC not shipping freestanding C++ implementations, I do agree that they share some part of the blame in that freestanding is presently in a pretty bad state, and thus not actually very useful. In fact, I'd say P0829 is perhaps the best chance of Clang/MSVC finally giving a shit about freestanding - if the functionality offered in there is actually useful in a non-negligible way beyond what freestanding C already offers, they might bother working on it.

Oh yeah, for typical modern C++ users like you, you would always think all C features are wrong yeah for sure.

Just in case you're actually serious and believe this, I would clarify that I do not, in fact, think all C features are wrong, or even that switch is.

While I do think having something like -Wimplicit-fallthrough on is pretty important for safety w.r.t. fallthroughs, and that it would have been nice for the language to perhaps have had a requirement for some kind of fallthrough; statement, I was only talking about it as an example of why your argumentation is obviously flawed.
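
For reference, C++17 does have an optional marker along these lines, the `[[fallthrough]]` attribute, which compilers can combine with -Wimplicit-fallthrough to flag unmarked fallthroughs. A minimal sketch (the function `steps_from` is invented for illustration):

```cpp
// Counts how many cleanup steps run when starting at a given stage.
// Every fallthrough is intentional and marked, so a warning such as
// -Wimplicit-fallthrough stays quiet here but would flag an unmarked one.
inline int steps_from(int stage)
{
    int steps = 0;
    switch (stage) {
    case 0:
        ++steps;
        [[fallthrough]]; // intentional: stage 0 also runs stage 1's step
    case 1:
        ++steps;
        [[fallthrough]]; // intentional: stage 1 also runs stage 2's step
    case 2:
        ++steps;
        break;
    default:
        break;
    }
    return steps;
}
```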

EASY ENOUGH LOL. Said by a typical modern C++ user like you who lives in an echo chamber and does not even know how to do Canadian compilation.

To test things like freestanding, I have to do Canadian compilation to build over 20 cross toolchains that run on windows, built on linux to ensure every system works correctly. But you are saying it is "EASY ENOUGH".

Dude, you have no idea what you are talking about.

I say that it would be "easy enough" to test whether you've used something like string_view properly if you have at least one easily available toolchain that has the proposed features from P0829. I'm not saying that it would be "easy enough" to check whether your code works everywhere. I know how much of a pain it is to check whether the same software works everywhere (I've done similar things myself multiple times and it's why I have about 20 Windows and Linux/BSD distribution VMs ready to be fired up to test whether something works on them).

LOL. Typical modern C++ user. Blaming the implementation not blaming the language itself. The language is poorly designed and ignores reality. Some jokes like "compilers should optimize exceptions" and other C++ abstractions are laughable.

I mean for sure WG21 deserves a bit of the blame on this, but it's quite clear that the issue isn't solvable in the standard unless you want to give up on std::unique_ptr for stupidly asinine reasons. That clang and libc++ are working towards solving this is great, though.

Also, on exception optimization, while it's a bit complicated to do, it's not impossible: it's just that nobody has really bothered because unless you're using exceptions in an arguably pretty wrong way, you're not gonna need such optimization.

No, there is a huge difference. Because you can do things like

log.info(userControlled, blah blah)

for format string.

The API allows you to control the format string. They can literally do whatever they want.

Anyone who defends format string = defend bugs and security vulns. Nothing more.

Format strings have proven to be a trillion-dollar mistake, much worse than pointers being nullable (so you need std::optional), which modern C++ folks love to bitch about.

But that's not the issue that log4j had. The issue log4j had is essentially equivalent to if, say, doing print("something '", userControlled, "'.\n") with that fast_io library was vulnerable because it looks at all strings you give it to check if any of them contain ${jndi:address} and then connected to address (thus basically giving control to whoever controls the server considering how fucked the protocol is). That's not a format string vulnerability. It's just an incredibly stupid way to specify something like this.


swang206 commented Feb 11, 2022

It seems to me like normal pointers also create the exact same "mess".

That is clearly wrong. If there is no unique_ptr, you won't use pointers at all and you will write your own class for RAII. unique_ptr creates a mess of abusing them instead of writing your own class.

Adding garbage collection was a bit stupid, yes. Format strings aren't exactly anything like it, though. GC was kind of useless because C++ is not a language made to be anywhere near that kind of stuff. Format strings are just a nice way to express how to format data, so there isn't much of a reason that they wouldn't be applicable to C++ just as much as to basically every other language.

That is your opinion on whether it is stupid or not.

People who added those things have the same reasons as "yours":

  1. Garbage collection wasn't stupid since garbage collection DOES trivialize memory management in a lot of ways and improves the productivity of programmers.
  2. Garbage collection reduces memory fragmentation.
  3. Every mainstream language has garbage collection at that time.
    Those facts are proven in a lot of other languages.

Format strings are just a nice way to express how to format data, so there isn't much of a reason that they wouldn't be applicable to C++ just as much as to basically every other language.

Still, adding a feature that another language has is a very bad reason.

  1. format string is a direct violation of the zero-overhead principle (which is the language design goal of C++).
    It forces a tax on people who do not need format string due to the runtime bloat. This violates the first part of the zero-overhead principle. You do not pay for what you do not use.
    I can always do better by hand. This violates 2nd part of the zero-overhead principle
  2. C++ as a language is powerful enough to the point you do not need format string at all while providing even better expressiveness.
  3. Format string vulns are huge issues.
  4. format string also greatly hurts runtime performance, binary size, and security. Those are all facts and truth and proven by all languages.
  5. format string came from a time when compilers are too bad and they have to shift everything to the runtime. That is no longer true nowadays.
  6. Parsing format string introduces a lot of branches and indirections and memory access which are very bad for today's architectures.

std::regex was a complete failure partly because of its design and also because of its implementations. That doesn't mean the very idea of regexes is inapplicable to C++ lol.

Yes, the truth is that it is inapplicable to C++. Same with std::format. Because those runtime parsing features are direct violations of the zero-overhead principle. Plus there are whiners in WG21 who just want to put locale everywhere. If they do not try to solve the fundamental issue, which is to deprecate iostream and replace it with something else, those features will forever be issues.

I mean for sure WG21 deserves a bit of the blame on this, but it's quite clear that the issue isn't solvable in the standard unless you want to give up on std::unique_ptr for stupidly asinine reasons. That clang and libc++ are working towards solving this is great, though.

That is NOT great at all. Since C++ does not have destructive move semantics, applying things like [[clang::trivial_abi]] will silently cause use-after-free because it messes up the order of destruction. That was explained by Chandler at CppCon 2019. The fundamental problem is that the language is poorly designed, which makes those things happen in the first place.

https://youtu.be/rHIkrotSwcc?t=2379

Also, on exception optimization, while it's a bit complicated to do, it's not impossible: it's just that nobody has really bothered because unless you're using exceptions in an arguably pretty wrong way, you're not gonna need such optimization.

About EH, That is clearly false. C++ EH is a trillion-dollar historical mistake.
https://youtu.be/I_ffAFzi-7M
There are environments where deterministic EH is extremely important, like real-time systems.

C++ EH is a direct violation of the zero-overhead principle due to binary bloat and hurt on optimizations.
https://www.youtube.com/watch?v=ARYP83yNAWk
C++ EH is ridiculously slow, to the point that you should absolutely ban it. It is 100x slower than a syscall. Using exception handling == denial of service, since the attacker can just send invalid data to make your server keep throwing exceptions and hit denial of service EXTREMELY quickly.

Optimizations to EH solve nothing. You still need a dynamic allocation to throw and some sort of RTTI to catch.

C++ EH is also extremely hard to implement. You still do not solve the issues in a lot of environments where EH is not implemented. I just recently compiled C++ code to wasm and then used wasm2lua to translate the wasm to Lua. Do you think Lua is going to provide an EH mechanism like C++ does? That is impossible to implement.

https://youtu.be/_1Dob0kb8pw

C++ EH is a historical mistake and design failure.

because it looks at all strings you give it to check if any of them contain ${jndi:address} and then connected to address (thus basically giving control to whoever controls the server considering how fucked the protocol is). That's not a format string vulnerability. It's just an incredibly stupid way to specify something like this.

Why would fast_io check whether any of them contain ${jndi:address}? It just prints data, dude. ${jndi:address} itself is a format string. That is exactly a format string vulnerability.

fast_io NEVER parses format string. Completely immune to any issues like this.


swang206 commented Feb 11, 2022

?????

????? for what? Google is blaming C++ for violating human rights because they said vulns in C++ helped CCP attack Uyghurs and Tibet's human rights groups. C++ = violating human rights.

If you do not RIIR your C++ code or if you start new C++ projects, you are violating human rights. That is what google said.

https://www.chromium.org/Home/chromium-security/memory-safety/

They referenced https://alexgaynor.net/2019/aug/12/introduction-to-memory-unsafety-for-vps-of-engineering/ here. This Alex Gaynor is one of the core Rust language devs who keeps writing shit about C and C++. Like C++ violates human rights. But Google agrees with it. Clearly, Google agrees C++ violates human rights.

Organizations which write large amounts of C and C++ inevitably produce large numbers of vulnerabilities that can be directly attributed to memory unsafety. These vulnerabilities are exploited, to the peril of hospitals, human rights dissidents, and health policy experts. Using C and C++ is bad for society, bad for your reputation, and it’s bad for your customers.
However, for a small group of people, it is. They’re nutritionists, they’re senior government officials, they’re human rights advocates, they’re Tibetans, and they’re Uyghurs. And for them their web browsers and OS kernels are bulwarks against oppressive forces. This may sound hyperbolic, but every one of those links contains a citation. And we know these tools aren’t living up to these users' hopes for their protection.

Oh yeah. C++ violates human rights.


spacepluk commented Feb 11, 2022

I don’t want to be a party pooper but maybe you should take this conversation somewhere else.


GabrielRavier commented Feb 11, 2022

@spacepluk Half of this thread is people debating about modern C++, including the very person who wrote the gist, so I don't really know if that's really needed 🤷


swang206 commented Feb 11, 2022

Because modern C++ is a mistake, you debate on it.


GabrielRavier commented Feb 11, 2022

That is clearly wrong. If there is no unique_ptr, you won't use pointers at all and you will write your own class for RAII. unique_ptr creates a mess of abusing them instead of writing your own class.

If you want to do "type-confusion", "vptr injection" or other things that are possible using unique_ptr I see 0 reason why you wouldn't be able to do the exact same thing with normal pointers. The only difference is that you'll be more likely to have a bug in your app without unique_ptr.

That is your opinion on whether it is stupid or not.

People who added those things have the same reasons as "yours":

  1. Garbage collection wasn't stupid since garbage collection DOES trivialize memory management in a lot of ways and improves the productivity of programmers.
  2. Garbage collection reduces memory fragmentation.
  3. Every mainstream language has garbage collection at that time.
    Those facts are proven in a lot of other languages.

Just to be clear, it's more the obvious argumentation for why format strings still fit in C++'s paradigm, whereas GC never did, especially in the way it was specified. I'm not saying format strings fit perfectly, but they certainly are better than GC in that regard.

  1. format string is a direct violation of the zero-overhead principle (which is the language design goal of C++).
    It forces a tax on people who do not need format string due to the runtime bloat. This violates the first part of the zero-overhead principle. You do not pay for what you do not use.
    I can always do better by hand. This violates 2nd part of the zero-overhead principle

What do you mean by that ? If you don't want to use a format string, just don't. Are you talking about code that accidentally calls printf when they just intended to print some hardcoded message ?

  1. C++ as a language is powerful enough to the point you do not need format string at all while providing even better expressiveness.
  2. Format string vulns are huge issues.
  3. format string also greatly hurts runtime performance, binary size, and security. Those are all facts and truth and proven by all languages.
  4. format string came from a time when compilers are too bad and they have to shift everything to the runtime. That is no longer true nowadays.

Just as an aside, this is pretty funny when on the previous message you were saying that "compilers are f**king dumb"

  1. Parsing format string introduces a lot of branches and indirections and memory access which are very bad for today's architectures.

On your general point about format strings, I actually mostly agree. They're pretty restrictive compared to what you can do today in C++. I certainly think that std::format wasn't the best thing ever, I was just finding it odd that you were so vindictive against it.

As an aside, what do you think of a string interpolation proposal that would allow you to do something like fn"something '{getUserControlledString(someHandle)}'.\n" and have it be automatically translated to fn("something '", getUserControlledString(someHandle), "'.\n") ? It seems to me like the best compromise between simply building on C++'s current capabilities and getting people who want their pretty format strings to like it, and people who have a particular hatred for anything that looks like a format string will be able to just directly call fn instead. It doesn't have any tax in binary size, performance or security: the compiler just directly translates it to the function call, and it can't lead to a vulnerability since it's only a compile-time thing.
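To make the translated form concrete, here is a minimal sketch of what the call on the right-hand side could resolve to. It assumes a hypothetical variadic `fn` that just concatenates its arguments; `fn`, `append_piece` and `append_all` are illustrative names, not part of any actual proposal or library.

```cpp
#include <string>

// Hypothetical per-type appenders: each argument is appended as plain data.
inline void append_piece(std::string& out, const char* s) { out += s; }
inline void append_piece(std::string& out, const std::string& s) { out += s; }
inline void append_piece(std::string& out, int v) { out += std::to_string(v); }

// End of recursion: nothing left to append.
inline void append_all(std::string&) {}

// Appends the first argument, then recurses on the remaining ones.
template <typename First, typename... Rest>
void append_all(std::string& out, const First& first, const Rest&... rest) {
    append_piece(out, first);
    append_all(out, rest...);
}

// Hypothetical target of the interpolated literal: the compiler would split
// fn"a {x} b" into fn("a ", x, " b"), so no string is parsed at run time.
template <typename... Args>
std::string fn(const Args&... args) {
    std::string out;
    append_all(out, args...);
    return out;
}
```

Because every argument is appended as data, a user-controlled string containing `%s` or `{}` passes through unchanged; nothing in this sketch ever interprets it.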

Why would fast_io check whether any of them contains ${jndi:address}? It just prints data dude. ${jndi:address} itself is a format string. That is exactly a format string vulnerability.

fast_io NEVER parses format string. Completely immune to any issues like this.

I mean, the JNDI stupidity isn't really a format string. It's quite close, but it's not there. Format strings contain placeholders that are replaced by arguments to the function they're passed to. Meanwhile, Log4j takes the string you give it to print as the argument to its format strings, and then looks at specifiers in there and blindly goes off into the web to resolve them. It's incredibly stupid, and it has parallels with format strings, but I don't think you can call it that. In the same way, I would hope you wouldn't call system(userControlled) or popen(userControlled, "r") a "format string vulnerability". It involves doing processing on a string, but it's not a format string.
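For reference, the classic format string case is the one where untrusted text ends up in the format position itself. A minimal sketch of the safe pattern, assuming a hypothetical `log_line` helper built on the standard `snprintf`:

```cpp
#include <stdio.h>
#include <string.h>

// Hypothetical helper: the format string is a fixed literal, and the
// untrusted text is only ever passed as a "%s" argument, so any '%'
// sequences inside it are copied verbatim instead of being interpreted.
static int log_line(char* buf, size_t size, const char* untrusted) {
    return snprintf(buf, size, "user said: %s", untrusted);
}

// The vulnerable variant would be snprintf(buf, size, untrusted):
// there the untrusted text itself is parsed for placeholders.
```

With this shape, input such as `%x %n` is data, not a set of placeholders.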

????? for what? Google is blaming C++ for violating human rights because they said vulns in C++ helped CCP attack Uyghurs and Tibet's human rights groups. C++ = violating human rights.

If you do not RIIR your C++ code or if you start new C++ projects, you are violating human rights. That is what google said.

chromium.org/Home/chromium-security/memory-safety

They referenced alexgaynor.net/2019/aug/12/introduction-to-memory-unsafety-for-vps-of-engineering here. This Alex Gaynor is one of the core rust language dev who keeps writing shit about C and C++. Like C++ violates human rights. But Google agrees with it. Clearly, Google agrees C++ violates human rights.

Organizations which write large amounts of C and C++ inevitably produce large numbers of vulnerabilities that can be directly attributed to memory unsafety. These vulnerabilities are exploited, to the peril of hospitals, human rights dissidents, and health policy experts. Using C and C++ is bad for society, bad for your reputation, and it’s bad for your customers.
However, for a small group of people, it is. They’re nutritionists, they’re senior government officials, they’re human rights advocates, they’re Tibetans, and they’re Uyghurs. And for them their web browsers and OS kernels are bulwarks against oppressive forces. This may sound hyperbolic, but every one of those links contains a citation. And we know these tools aren’t living up to these users' hopes for their protection.

Oh yeah. C++ violates human rights.

Wow... they put a link to an article talking about memory safety in their page about memory safety ? What an incredible thing to do ! This obviously means that they endorse every single word written in the quote you got from that article, no matter that the reason they linked to that article was actually to provide a source for a completely different citation from a different part of the article, or that half of your quote is actually from a completely different article published several months after that first article. And that's also totally exactly the same as if they said it themselves.

...Yunno, I don't like Google either, but this seems incredibly disingenuous...

@bkaradzic
Author

bkaradzic commented Feb 11, 2022

IMO, it's appropriate to discuss things, as long as it's respectful and civilized. Users who don't want to follow can unsubscribe to stop notifications.

@bkaradzic
Author

bkaradzic commented Feb 11, 2022

Also this discussion actually brought up some interesting links...

@swang206

swang206 commented Feb 11, 2022

Sounds like this debate shouldn't be here anymore.

If you want further discussion, join the Discord:
https://discord.gg/vMKhB9Q

As an aside, what do you think of a string interpolation proposal that would allow you to do something like fn"something '{getUserControlledString(someHandle)}'.\n" and have it be automatically translated to fn("something '", getUserControlledString(someHandle), "'.\n") ? It seems to me like the best compromise between simply building on C++'s current capabilities and getting people who want their pretty format strings to like it, and people who have a particular hatred for anything that looks like a format string will be able to just directly call fn instead. It doesn't have any tax in binary size, performance or security: the compiler just directly translates it to the function call, and it can't lead to a vulnerability since it's only a compile-time thing.

I have talked about that proposal for a very long time. Yes, it is a much better solution compared to format string one (although you can still technically screw up with macros or something like that). (For me, I do not need interpolated string literal either, but there are losers who love to talk about "readability" which is extremely subjective.)

However, it still does not change the fact that you need an iostream replacement, and that iostream and exception handling (EH) should finally be deprecated and removed. Only in that context could this solution work.
You still need a new IO library to make that work, and std::format is clearly nowhere near that solution, nor is iostream.
You still have cases where you do not want an interpolated string literal. One clear example is using them with macros. Neither a format string nor an interpolated string literal can deal with things like this.

interpolated string literal won't save you for things like this.
https://github.com/tearosccebe/fast_io/blob/03d74a72f377a2488364d6fa92a75bc4952b3f4c/examples/0007.legacy/construct_fstream_from_syscall.cc#L31

	println(
	"Unix Timestamp:",unix_ts,"\n"
	"Universe Timestamp:",static_cast<fast_io::universe_timestamp>(unix_ts),"\n"
	"UTC:",utc(unix_ts),"\n",
	"Local:",local(unix_ts)," Timezone:",fast_io::timezone_name(),"\n"
#ifdef __clang__
	"LLVM clang " __clang_version__ "\n"
#elif defined(__GNUC__) && defined(__VERSION__)
	"GCC " __VERSION__ "\n"
#elif defined(_MSC_VER)
	"Microsoft Visual C++ ",_MSC_VER,"\n"
#else
	"Unknown C++ compiler\n"
#endif
#if defined(_LIBCPP_VERSION)
	"LLVM libc++ ", _LIBCPP_VERSION, "\n"
#elif defined(__GLIBCXX__)
	"GCC libstdc++ ", __GLIBCXX__ , "\n"
#elif defined(_MSVC_STL_UPDATE)
	"Microsoft Visual C++ STL ", _MSVC_STL_UPDATE, "\n"
#else
	"Unknown C++ standard library\n"
#endif
	"fstream.rdbuf():",fiob.fb,"\n"
	"FILE*:",static_cast<fast_io::c_io_observer>(fiob).fp,"\n"
	"fd:",static_cast<fast_io::posix_io_observer>(fiob).fd
#if (defined(_WIN32) && !defined(__WINE__)) || defined(__CYGWIN__)
	,"\n"
	"win32 HANDLE:",static_cast<fast_io::win32_io_observer>(fiob).handle
#ifndef _WIN32_WINDOWS
//NT kernel
	,"\n"
	"zw HANDLE:",static_cast<fast_io::zw_io_observer>(fiob).handle,"\n"
	"nt HANDLE:",static_cast<fast_io::nt_io_observer>(fiob).handle
#endif
#endif
);

So the program will print out different information with different environments.
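Part of what makes the call above work is that adjacent string literals are concatenated after preprocessing, so an `#if` branch can contribute a piece of text. A minimal sketch of just that mechanism, using plain C strings rather than fast_io (`compiler_banner` is a hypothetical name, and unlike the original it only splices literals, not extra arguments):

```cpp
#include <stdio.h>
#include <string.h>

// Hypothetical helper: returns a banner whose middle part is chosen by the
// preprocessor. The neighbouring literals are merged into one string at
// compile time, so there is nothing to parse or format at run time.
static const char* compiler_banner() {
    return "compiler: "
#if defined(__clang__)
        "LLVM clang " __clang_version__
#elif defined(__GNUC__) && defined(__VERSION__)
        "GCC " __VERSION__
#elif defined(_MSC_VER)
        "Microsoft Visual C++"
#else
        "unknown"
#endif
        "\n";
}
```

Calling `fputs(compiler_banner(), stdout)` prints a different line depending on the compiler that built the program.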

./construct_fstream_from_syscall
Unix Timestamp:1644618626.884474078
Universe Timestamp:434602343073853826.884474078
UTC:2022-02-11T22:30:26.884474078Z
Local:2022-02-11T17:30:26.884474078-05:00 Timezone:EST
GCC 12.0.1 20220209 (experimental)
GCC libstdc++ 20220209
fstream.rdbuf():0x00007ffe1dce8648
FILE*:0x0000000000def2a0
fd:3
./construct_fstream_from_syscall_clang
Unix Timestamp:1644618743.939916634
Universe Timestamp:434602343073853943.939916634
UTC:2022-02-11T22:32:23.939916634Z
Local:2022-02-11T17:32:23.939916634-05:00 Timezone:EST
LLVM clang 15.0.0 (https://github.com/llvm/llvm-project.git 85628ce75b3084dc0f185a320152baf85b59aba7)
GCC libstdc++ 20220209
fstream.rdbuf():0x00007ffccdf22240
FILE*:0x0000000001efb2a0
fd:3
wine ./construct_fstream_from_syscall.exe 
Unix Timestamp:1644618830.2005686
Universe Timestamp:434602343073854030.2005686
UTC:2022-02-11T22:33:50.2005686Z
Local:2022-02-11T18:33:50.2005686-04:00 Timezone:Eastern Daylight Time
GCC 12.0.0 20211231 (experimental)
GCC libstdc++ 20211231
fstream.rdbuf():0x000000000031fb18
FILE*:0x000000006a0fa250
fd:3
win32 HANDLE:0x0000000000000048
zw HANDLE:0x0000000000000048
nt HANDLE:0x0000000000000048

Wow... they put a link to an article talking about memory safety in their page about memory safety ? What an incredible thing to do ! This obviously means that they endorse every single word written in the quote you got from that article, no matter that the reason they linked to that article was actually to provide a source for a completely different citation from a different part of the article, or that half of your quote is actually from a completely different article published several months after that first article. And that's also totally exactly the same as if they said it themselves.

...Yunno, I don't like Google either, but this seems incredibly disingenuous...

Oh yeah, Google funded things like rust-for-linux, etc. (Guess what? That Alex Gaynor, a (former?) Mozilla employee, is the lead of the project.) Of course they write unsafe hell, but hey! C++ is a dead language; if you use it you are a loser and a human rights violator.

Clearly, there is nothing wrong with saying that Google says "C++ violates human rights".

@DBJDBJ

DBJDBJ commented Feb 13, 2022

@bkaradzic, with all due respect, I can't help wondering: why don't you move this gist to a regular standalone repository?

@bkaradzic
Author

bkaradzic commented Feb 13, 2022

@DBJDBJ I don't see how that would be beneficial?

@DBJDBJ

DBJDBJ commented Feb 13, 2022

The content might be just one document. Succinct message. At safe distance from discussions.

It all depends on what you want to do with this. It has nicely grown. Perhaps it needs a bigger place to live.

@aerosayan

aerosayan commented Apr 16, 2022

@bkaradzic I've been using Orthodox C++ for some time ... with some well thought out use for some necessary modern C++ features.

@DBJDBJ , having this information in a separate repo might be useful ... however, reading the counter-arguments provided here by others is also useful and necessary for new adopters to fully understand the nuances of where modern C++ goes wrong.

Thanks

@bkaradzic
Author

bkaradzic commented Apr 16, 2022

@aerosayan

with some well thought out use for some necessary modern c++ features

What you said here is actually the key... "well thought out use" is the opposite of "blindly following the latest"

@aerosayan

aerosayan commented Apr 17, 2022

@bkaradzic

IMHO new devs blindly follow the rules because they blindly trust other experienced developers. I have observed that problems occur due to new devs not understanding that the experienced devs are giving them generic advice and not something that's good for every industry.

If you're making a customer service app, you might need features like multiple inheritance or reference-counted shared pointers. However, those things become problematic when writing any multi-threaded high performance (low latency and/or high throughput) code.

I think we can help new devs select the programming language features they should use, based on a few simple tests. Since we can't give generic advice, I would like to share why I think Orthodox C++ makes code reliable, at least for game engine development or numerical solver development:

  1. Are the language features used easy for the compiler to optimize?
  • Things like multiple inheritance and deeply nested operator overloads become a little bit problematic because C++ compilers may not be able to optimize them well. Sometimes I needed to manually check the disassembly to verify that the code was actually optimized. This makes the code less reliable. Also, I would like the code to perform well even in debug mode, when optimizations may not be turned on. If I can't trust the code to be optimized on my current compiler and on any compiler or platform I may use in the future, I will not use it. This is extremely important, because I may be developing on Linux, but I would like my code to run reliably on Windows when compiled with MSVC.
  2. Do the language features used behave consistently on every platform and compiler?
  • This is primarily the reason to not use the STL if your application needs to perform reliably and consistently on every platform. I have observed that the STL implementations on Linux and Windows are sometimes different, and that can cause a lot of problems. For example ... a few years ago I observed that std::set with GCC on Linux was allocating memory for 3 elements, but with MSVC on Windows it was allocating a lot more. I don't have the exact code to demonstrate it now, but this stackoverflow question shows a similar problem https://stackoverflow.com/questions/57031917/ where the bucket sizes used for std::deque were different on different compilers. Apparently it's due to the standard not specifying how the container should be implemented. This can be a serious problem if you want consistent behavior on every platform and compiler. So using the STL might be a death sentence for our code. It's not that the STL is bad, but simply that we can't be 100% sure that every implementation of the STL will behave consistently.
  3. Are the language features safe to use?
  • Nothing gets me more enraged than stackoverflow and reddit C++ fanatics recommending std::shared_ptr to make your code safe. It uses an atomic reference counter, and when that counter is modified from different threads (like in a multi-threaded rendering engine), the performance gets obliterated. https://stackoverflow.com/questions/31254767 Sure ... your code is free of memory leaks and doesn't suffer from race conditions, but it comes at a severe price. Someone may argue that my code was wrong for creating the race condition, and they would be right ... however, I would rather have my code crash severely due to a race condition, so I could fix it, instead of slowly destroying the code's performance over time.
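As a rough illustration of the reference-counting point in the last item, the sketch below shows where the count actually changes. It assumes a hypothetical `Mesh` type and two hypothetical helper functions; the claim that the count is updated with atomic operations is about typical standard library implementations, not something this code measures.

```cpp
#include <memory>

// Hypothetical payload type used only for this illustration.
struct Mesh {
    int vertex_count;
};

// Taking the shared_ptr by value copies it, which increments the shared
// reference count on entry and decrements it again on return.
inline long use_count_when_copied(std::shared_ptr<Mesh> mesh) {
    return mesh.use_count();
}

// Taking a plain reference to the object involves no ownership traffic:
// the reference count is never touched.
inline int read_vertex_count(const Mesh& mesh) {
    return mesh.vertex_count;
}
```

In hot multi-threaded paths, passing `const Mesh&` (or a raw pointer) to code that only reads the object avoids the count updates entirely, while ownership stays with whoever holds the `std::shared_ptr`.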

I don't have more to add at this moment, but if new devs are taught how Orthodox C++ tries to help them create reliable and consistent code, and how abuse of modern C++ can lead to bad code, they will be able to avoid many of the mistakes that we made in the past.

Thanks
