Skip to content

Instantly share code, notes, and snippets.

@Orvid
Last active September 8, 2020 21:00
Show Gist options
  • Star 3 You must be signed in to star a gist
  • Fork 2 You must be signed in to fork a gist
  • Save Orvid/5c9bc8c54e960a604968 to your computer and use it in GitHub Desktop.
Save Orvid/5c9bc8c54e960a604968 to your computer and use it in GitHub Desktop.

Alright, so, I'm going to be going a bit overboard and giving a full analysis of the changes I've made for HHVM support, which changes were needed because of limitations of the MSVC compiler, standard library, or general design philosophy of the OS (such as file descriptors vs. HANDLE's). I'm also going to be posting this in response to a question asked by one of the HHVM team on the primary PR containing the actual workarounds here, and I don't exactly expect this to be "small", as it's going to be the result of me reviewing the ~300 PRs I've made to HHVM, ~60 PRs to Folly and ~10 to FBThrift.

Now, as a brief warning, and also for some clarity, some of my names for various language constructs may be completely made up, but I aim to include code samples wherever I can to help get the point across. Also, for those who don't know, HHVM is currently tested under GCC 4.8, GCC 4.9, and Clang 3.6. While it does compile with GCC 5.0+, it doesn't actually work, likely due to some optimizations it's doing. When I refer to GCC and Clang in this, I am referring to the specific versions tested. When I refer to MSVC itself, I refer to version 19.00.23026, which is what shipped with Visual Studio 2015 RTM.

  1. So, first up, I'll start with the issues with MSVC's implementation of the C++ standard, or what I'm pretty sure is part of the spec.

  2. offsetof use in constexpr contexts. I was just a bit slow at the draw to get the fix for this into RTM, I filed it as Visual Studio Feedback #1414341 and it was marked as closed within a few days of the RTM release. This is the cause of the largest changes to workaround, and is one of the workarounds added as part of HHVM #5984 (D44535).

  3. constexpr constructor errors with union members with bitfields. This is one that I didn't file until after the RTM release, as it wasn't possible for me to hit this in the RC version, as constexpr constructors didn't work in the RC. This is filed as Visual Studio Feedback #1571281 and has already been marked as fixed for Update 1. This is one of the workarounds added as part of HHVM #5984 (D44535).

  4. Templated function pointer overload resolution failure. This was one that I noticed fairly early, and filed as Visual Studio Feedback #1464651. It was marked as fixed while I was writing this, and is one of the workarounds added as part of HHVM #5984 (D44535).

  5. Now, I'm not certain if this next one is part of the spec or not, and is the earliest workaround for MSVC that I submitted to HHVM, but it does work with GCC and Clang. I have not submitted a bug report on this, although one may exist. Specifically, when defining an enum class with a value that has the same name as an outer type, MSVC complains about attempting to redefine the outer type: Cpp struct Accessor { enum class Kind { Empty, Accessor, }; }; This was worked around in HHVM #5622 (D41511) by simply renaming the member of the enum class from Accessor to AccessorKind. There was another name conflict in HHVM #5636 (D41589), but I'm not sure if that was valid to do in the first place.

  6. In-class initialization of non-primitive static variables. This is currently guarded by a warning-as-error in MSVC, and is also explicitly mentioned in the blog post announcing the RTM version, so I haven't reported it as a bug. HHVM #5971 (D44457) and pieces of Folly #305 are the types of workaround that was done for this. In certain cases, such as HHVM #6061 (D44895) it is possible to work around the issue by simply changing the value to a function.

  7. Next up, a parser error that I had to chase for a while before I figured out the actual issue, because, sometimes even the best of us can overlook the obvious. In this case, the source of the syntax error is in the chain of template instantiations. I did not report this bug, as I'd already worked around it, and, at the time, I was failing at reproducing it in a test that wasn't dependent upon all of HHVM. While typing this up, I was able to reproduce it in a reasonable test case, and have just reported it as Visual Studio Feedback #1832242. I resolved this in HHVM #6020 (D44781) by switching from curly braces to parenthesis, and in Folly #277 by adding an = before the opening curly bracket.

  8. In certain situations, MSVC will also attempt to construct a const type, even though a const wasn't requested. This particular case worked just fine under the RC version of MSVC 2015, but the RTM version doesn't like it, leading to HHVM #6108 (D45249) being needed. Due to the lack of an isolated test-case, this hasn't been submitted as a bug.

  9. [[noreturn]] - You can't currently use [[noreturn]] on a templated function. I hadn't reported this, as I hadn't tested it under any other compiler, and I didn't know if it was actually valid under the spec, but I did test Clang while writing this, and it didn't have problem with it, after riding out a bit of grumpyness from the feedback system, so this is now filed as Visual Studio Feedback #1835949. This is not currently used in HHVM, but I discovered it when I tried to switch all of the noreturn functions to use the C++ 11 attribute in HHVM #6291 (D47385). That PR is not likely to be merged for other reasons though. And, of course, a test case: ```Cpp template [[noreturn]] void func() { throw T(); }

int main() {
  func<int>();
}
```
  1. decltype produces default arguments in its type. At least, I think that's the issue. Either way, through a macro that is used to keep support for multiple ABIs sane, HHVM ends up doing something like this (stripped of it's body and it's actual functionality): ```Cpp #include <boost/type_traits.hpp>
int abi(int kind = 1);
int abi(int kind) {
  return ([&]() -> boost::function_traits<decltype(abi)>::result_type {
    return 0;
  }());
}
```
Which results in this:
```
error C2383: 'Function': default-arguments are not allowed on this symbol
note: see reference to class template instantiation 'boost::function_traits<int (int)>' being compiled
```
Due to some fun, this ended up being reported as [Visual Studio Feedback #1839232](https://connect.microsoft.com/VisualStudio/feedback/details/1839232), and is also dependent upon Boost v1.58.0, as I haven't tested Boost v1.59.0. I attempted to work around this issue in [HHVM #6257](https://github.com/facebook/hhvm/pull/6257) ([D47019](https://reviews.facebook.net/D47019)), however some concerns were raised about `auto` discarding refs, so I was unsure how to handle this, until I randomly ended up reading through the release notes for VS 2015, and noticed `decltype(auto)` which does exactly what we need it to here. It's worth noting that, IIRC, C++ 14 allows lambdas to have default arguments, I'm not sure if that would solve this particular issue though.
  1. I haven't reported this one, but the /analyze pass doesn't seem to know about the C99 format specifiers that were added with MSVC 2015, and causes spurious warnings when they are used. This is triggered a lot of places in HHVM.

  2. I've also not reported this one, but the warning about comma operators within an array index expression, C4709 doesn't seem to be functioning correctly, because it triggers on v << load{func[argc * 8 + pTabOff], dest};, where load is a type, and func has an overloaded [] operator. env.insertedPhis[PhiKey { block, alocID }] is another example of what will trigger it.

  3. Next, there are a few issues with the standard library implementation that I've encountered.

  4. Using std::atomic<char*> as the condition to an if statement results in an error complaining about an ambiguous user-defined-conversion. I filed this as Visual Studio Feedback #1464842 just a bit after the cutoff for changes to make it into the RTM release. I work around this in HHVM #6008 (D44673) and Folly #305 by simply explicitly loading the value.

  5. Attempting to wait for a std::condition_variable with a std::chrono::duration<double> fails. This is one that I've known existed, but hadn't dug into after figuring out the workaround, and, as such hadn't submitted a bug report about it. This has been reported as Visual Studio Feedback #1839243. This is one of the workarounds added as part of HHVM #5984 (D44535). And, the obligitory test case: ```Cpp #include #include <condition_variable> #include

using namespace std;

void idontwanna() {
  mutex m_mutex;
  condition_variable m_cv;
  unique_lock<mutex> lock(m_mutex);
  m_cv.wait_for(lock, chrono::duration<double>(0));
}
```
  1. An std::function cannot be constructed by passing 0. GCC, Clang, and even the Boost implementation allow this, but the MSVC implementation does not, leading to FBThrift #108 where I just don't pass anything at all.

  2. And now, a few issues with the actual tooling, so Visual Studio itself.

  3. The first issue is mostly an issue of the linker not telling you what it's doing. HHVM is giant, and produces a very large set of input libraries to the linker, in full debug mode, the core of the HHVM runtime can surpass 4.5gb. Now, the 32-bit hosted 64-bit linker can't handle a single static library that size, but it can handle many smaller static libaries, however, when the total input set is larger than a certain point, the linker gets absurdly slow. A normal full link is ~3 minutes, but with a large enough input set, that can turn into ~45 minutes. I believe this is it switching to a "low-memory" mode, as, when the linker is doing this, it is doing an absolutely absurd amount of I/O operations (60-70k/second), and, although very few of these actually hit disk, they still incur the cost of the system call to read it from the cache. This is easily fixed by switching to the 64-bit hosted 64-bit toolchain, as I do in a slightly roundabout way in HHVM #6127 (D45351) due to HHVM being built with CMake, however I believe it would be a good idea to have the linker at least output a warning when it enters this mode, as it took me quite a while to figure out what the actual issue was.

  4. I haven't reported this second one, but I believe it is because of not being a keyword in an old either C or C++ spec, not sure which, but either way, MSVC compiles it fine, but Visual Studio doesn't like parsing struct not { Vreg64 s, d; }; complaining about the struct being named not.

  5. Now for a few fairly arbitrary issues that I'm unsure if they are spec related, but have already been worked around.

  6. First up, we have a fun one, that I haven't the slightest clue how GCC and Clang are allowing in the first place. I don't see how this is at all valid in the constexpr context it's used in, but here it goes: ```Cpp #include

struct MIterTable {
  struct Ent { void* array; void* iter; };

  std::array<Ent,7> ents;
};
__declspec(thread) MIterTable tl_miter_table;

int main() {
  static_assert(tl_miter_table.ents.size() == 7, "");
}
```
In this specific context, because the size of the `std::array` is fixed, and not dependent upon what instance of `MIterTable` is referenced, GCC and Clang allow you to do this assert. MSVC, quite rightly in my eyes, complains that it's not valid in a `constexpr` context. Regardless, HHVM does do this in a few places, and I workaround this as part of [HHVM #5984](https://github.com/facebook/hhvm/pull/5984) ([D44535](https://reviews.facebook.net/D44535)) by simply disabling the `static_assert`'s under MSVC.
  1. Under MSVC enums are comparible by default, where-as under GCC and Clang you have to implement the comparison operators yourself. Under MSVC, defining the operators and then attempting to do a comparison results in an error due to ambiguity, so in HHVM #5982 (D44523) I just disabled these under MSVC.

  2. I haven't reported this as a bug due to problems creating an isolated test case for it, but in certain cases, MSVC either won't instatiate a function defined as template<>, or else /Zc:inline is eliminating it as unreferenced. Either way, it results in an undefined symbol at link time. To compound the issue, although removing the template<> from the declaration makes it compile fine for MSVC, both GCC and Clang error if that is not there, leading to template<> being conditionally used for everything except MSVC in HHVM #6101 (D45207). It's worth noting that HHVM did require some modifications to work correctly with /Zc:inline to begin with via HHVM #6112 (D45279).

  3. In some cases the way MSVC resolves overloads differs from GCC and Clang. In HHVM #6104 (D45219) an issue was caused by a constant defined as size_t being passed where an int32_t was expected. There is also a difference in the way MSVC resolves overloads when it comes to deleted functions. In HHVM #6105 (D45243), I had to allow the unsigned constructor of Immed, which was previously declared as deleted, due to MSVC resolving any unsigned parameters to that overload, even if the unsigned type was small enough to fit in an int without truncation or overwriting the sign bit, which is the only constructor that Immed actually defines under GCC and Clang.

  4. MSVC doesn't let you have static arrays with 0 elements, leading to the first change in HHVM #6111 (D45267) This appears to be deliberate on the part of MSVC, and the code in question does absolutely nothing, so it simply got disabled under MSVC.

  5. Now that I've dealt with the issues with current features, I'll move on to the features whose absence has needed to be worked around to compile under MSVC. These are the things that should be reasonable to implement, and may even already be planned for a future release of MSVC.

  6. C99 - VLA. void myFunc(uint8_t p) { uint8_t buf[p]; } I've switched these out for allocas in HHVM #5676 (D41835) and HHVM #6000 (D44625). However, because VLAs are released at the end of the current scope and alloca doesn't free until the end of the function call, this has come back to bite some of the devs, resulting in a partial revert of that PR,HHVM f83e5722 and a bit of discussion around Folly #271 (D46407), resulting in the use of req::vector, which is a per-request allocated value in HHVM instead. This is faster than a normal malloc, but still not as fast as it would be with a VLA, so it would be nice to be able to use VLAs instead.

  7. C99 - Named variadic macro parameters. #define NEW_EXP(cls, e...) Basically this is the difference between being able to name the parameters and having to use __VA_ARGS__. I had to make changes for this in multiple places, including HHVM #5640 (D41613). Note that supporting this might be an opportunity to provide a fix for the way __VA_ARGS__ is expanded when passed as arguments to another macro, eliminating the need for an extra intermediate define, as a grumpy me had to add in HHVM #6205 (D46251) after figuring out the issue and extracting the actual fix from a StackOverflow question about overloaded macros.

  8. C99 - Named member initializers. const ABI abi { .canSpill = true }; HHVM used these in a couple of places, though luckily I didn't have to deal with swapping the order of the initilizers in HHVM #5627 (D41529), and was instead able to just eliminate the named part of it, and just use a normal intializer instead.

  9. Next up, a few things that would be nice to have, and probably aren't unreasonable to implement, but aren't part of a standard.

  10. --whole-archive - The GCC and Clang linkers both have a way to tell the linker to assume that all objects in a static library are to be included as-if they were passed to the linker individually. This is not currently possible with MSVC's linker, and in HHVM #6143 (D45489) I attempt to work around this by using OBJECT libraries, which are part of CMake. They work by simply passing the compiled object files to the linker directly, rather than a static library. This is not likely to make it into HHVM any time soon, because OBJECT libraries aren't supported until CMake 3.0, however, quite a few of the supported Linux distros don't have CMake 3.0+, the automated testers for instance run CMake 2.7, this makes increasing the required version of CMake a much harder thing to do. HHVM needs to do this because it uses dynamic initializers to register extensions on startup, and, if the objects wouldn't otherwise be included in the link, those extensions end up not being linked in, which would mean that a central spot would be required to explicitly register every extension, and that is not easily maintainable.

  11. A way to get the current frame pointer. For GCC and Clang HHVM does this by reading the RBP register, and I can read RBP as well, but it would require an external assembly stub, and I'm not even sure if that would be entirely correct for MSVC. In HHVM #5943 (D44295) I simply declare the variable and set it nullptr, after an always_assert(false).

  12. A way to tell MSVC to clobber all the callee saved registers. This is something specific to HHVM's use-case, and could probably be achieved via an external assembly stub, but in HHVM #6119 (D45309) I've just done an always_assert(false) instead, as it's only needed for the JIT.

  13. A way to allocate specifically in low memory. HHVM specifically requires that the memory allocated for the JIT'd code be within the first 4gb of memory. Currently the plan to work around this is to use a statically allocated array in the binary (HHVM is linked at the absolute lowest address possible, and is set to not be able to load anywhere else), but it would be nice if there were some other way to do this. Unfortunately, I wouldn't be surprised if this were a limitation of the OS, but this would be much nicer if it were possible to do the allocation dynamically at runtime.

  14. constexpr evaluation of certain builtins. GCC supports this, and I don't think Clang does, but it's not an unreasonable thing to support. Folly can make use of _BitScanForward, _BitScanForward64, _BitScanReverse, _BitScanReverse64, and __popcnt64 in this context, however I believe there are a significant number of intrinsics that could make use of this capability. It is possible to call them within the constraints of the constexpr syntax: Cpp inline int __builtin_ffs(int x, unsigned long index = 0) { return (int)(_BitScanForward(&index, (unsigned long)x) ? index : 0); }

  15. There is not currently a way under MSVC to detect at compile time whether we re targetting a little endian or big endian CPU. In FBThrift #110 I simply hard-code it to little endian.

  16. The coalescing operator, ?:. GCC and Clang support it, but it's not part of any spec that I've seen. I worked around this in HHVM #5626 (D41535) and HHVM #6103 (D45225) by simply copying the condition and sticking it between the ? and :, making it into a ternary instead.

  17. Computed gotos. goto *optab[uint8_t(op)]; In HHVM #5669 (D41811) I refactored the bytecode interpreter to use a switch statement rather than a computed goto under MSVC, because it is not currently possible to implement a threaded interpreter, which is significantly faster than a switch based dispatch, without computed gotos, which are not currently supported under MSVC. I suspect this is likely a moderately complex undertaking to implement in MSVC, involving moderately complex changes in both the frontend, optimization, and codegen passes.

  18. Explicit type conversion syntax. At least, that's what MSVC's error message calls it, "error C4576: a parenthesized type followed by an initializer list is a non-standard explicit type conversion syntax". I fixed the places where HHVM was using this syntax to construct some things in HHVM #5686 (D41889) and where FBThrift was using it in FBThrift #111 by simply removing the parenthesis around the type. The compiler knows the syntax exists, it just doesn't support it.

  19. A flatten attribute, __attribute__((__flatten__)), or [[flatten]] as GCC and Clang do it. This is more a performance tweaking piece, and really isn't needed, but I thought it best to include everything.

  20. The <cxxabi.h> header. HHVM primarily uses this for a clean demangling API, and this is a bit less reasonable than the rest of these, but it would eliminate a few spots where the Windows API, UnDecorateSymbolName, for this had to be used instead.

  21. Next, I have a few things that had to be worked around, but are quite a bit less likely to be implemented or changed, by Microsoft at least.

  22. Struct and class being part of the mangled names. This was a big one, because HHVM uses class and struct interchangibly to refer to types. This is a problem for MSVC if the full definition of the type is never declared in a place that includes the type as part of a mangled name. To solve this the forward declarations just have to be synchronized with their actual definitions, for HHVM #5693 (D42015) I wrote a simple tool to do the majority of the fixes, and then did the rest of the fixes manually as I came across them.

  23. Bitfield layout. This is a difference that is allowed by spec, and changing it would break a lot of code, so it's not likely to happen anytime soon. Allowing it via explicit request for certain declarations would be nice though. I attempt to work around it in HHVM #5990 (D44571) however GCC is currently grumpy about enums being stored in less bits than they are, so this will likely require a conditional change for MSVC.

  24. Typeof. I don't even remotely expect this to ever be implemented. GCC supports an extension to C++ that adds a typeof keyword. This keyword performs, in the context that HHVM used it, the exact same job as decltype. I swapped the uses of typeof for decltype in HHVM #5624 (D41517).

  25. Because MSVC will not layout a child struct's members in a parent struct's alignment padding unless the struct has virtual inheritence, which is not possible in this particular case, the assertion on the size of ObjectData had to be changed in HHVM #6116 (D45303) from 24 to 20 under MSVC.

  26. The last one is not quite on HHVM's side of things, but definitely isn't Microsoft's issue. MySQL 5.6, and MySQL 5.7 RC do not compile under MSVC 2015 due, IIRC, to it adding the timeval struct. This means that WebScaleSQL may end up having to make the upgrade to 5.7, a release that is currently intended to be skipped, assuming the actual release of MySQL 5.7 does compile properly with MSVC 2015. Without the WebScaleSQL upgrade, the Windows version of HHVM will likely not be supporting the async MySQL extension, due to the fact that HHVM links against the static version of the runtime, and the MySQL client library would have to be dynamically linked in, and also link against an earlier version of MSVC, which I anticipate wreaking havoc at runtime.

  27. Last up is the differences in design philosophy between Posix and Windows systems.

  28. Stack unwinding. These are just so different that it pretty much has to be handled specially for each platform, there's no getting around it. For the moment, HHVM #6007 (D44667) simply stubs out the unwind mechanics that the JIT uses in a way that will error when it's called.

  29. Timezones. MSVC and the Posix API handle these very differently, and, although having tm_gmtoff and tm_zone as proper members of tm would be nice, it's something that I'm probably just going to have to either workaround, as I did in HHVM #5999 (D44619), or else disable, as I did in HHVM #6096 (D45177).

  30. Locales. It's not much of a difference, but there is a big enough one to justify adding a shim, which is what I did in HHVM #5633 (D41571)

  31. Getting the home directory and temporary directory. Posix systems have environment variables that can be used to get these, Windows has a pair of variables for the home directory, and a system call for the temporary directory.

  32. Sockets. MSVC doesn't support AF_UNIX sockets, so places using them have simply been disabled in my port.

  33. Paths. Windows uses \ as a directory separator, and Unix uses /. There is also a difference in what an absolute path looks like, which has given rise to HHVM #5822 (D43227), which only addresses the issue where I've noticed it.

  34. File Descriptors vs. HANDLEs / SOCKETs. While the WinSock API is almost the same as the Posix API, minus the lack of AF_UNIX support, there is one key difference, the Posix API uses file descriptors, passed around as ints, and the WinSock API uses SOCKETs. Now, unfortunately, due to C++ conventions, even with this difference, it will compile just fine, however, this will likely not work at runtime, because the SOCKETs are being truncated, and, although the compiler generates warnings for this, those warnings are disabled when building HHVM, because it's absolutely full of implicit truncations, even on Linux. Thankfully, because HHVM is 64-bit only, we can handle this issue by simply adding overloads that accept file descriptors, and translating those file descriptors to the actual WinSock functions. This works without needing to change existing code, other than replacing includes, for any function that accepts a SOCKET for a parameter. For functions that return a SOCKET, such as, socket, we have to explicitly reference our wrapper function, which is also stubbed out for non-windows platforms.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment