Alright, so, I'm going to be going a bit overboard and giving a full analysis of the changes I've made for HHVM support, which changes were needed because of limitations of the MSVC compiler, standard library, or general design philosophy of the OS (such as file descriptors vs. HANDLE's). I'm also going to be posting this in response to a question asked by one of the HHVM team on the primary PR containing the actual workarounds here, and I don't exactly expect this to be "small", as it's going to be the result of me reviewing the ~300 PRs I've made to HHVM, ~60 PRs to Folly and ~10 to FBThrift.
Now, as a brief warning, and also for some clarity, some of my names for various language constructs may be completely made up, but I aim to include code samples wherever I can to help get the point across. Also, for those who don't know, HHVM is currently tested under GCC 4.8, GCC 4.9, and Clang 3.6. While it does compile with GCC 5.0+, it doesn't actually work, likely due to some optimizations it's doing. When I refer to GCC and Clang in this, I am referring to the specific versions tested. When I refer to MSVC itself, I refer to version 19.00.23026, which is what shipped with Visual Studio 2015 RTM.
-
So, first up, I'll start with the issues with MSVC's implementation of the C++ standard, or what I'm pretty sure is part of the spec.
-
offsetof
use inconstexpr
contexts. I was just a bit slow at the draw to get the fix for this into RTM, I filed it as Visual Studio Feedback #1414341 and it was marked as closed within a few days of the RTM release. This is the cause of the largest changes to workaround, and is one of the workarounds added as part of HHVM #5984 (D44535). -
constexpr
constructor errors with union members with bitfields. This is one that I didn't file until after the RTM release, as it wasn't possible for me to hit this in the RC version, asconstexpr
constructors didn't work in the RC. This is filed as Visual Studio Feedback #1571281 and has already been marked as fixed for Update 1. This is one of the workarounds added as part of HHVM #5984 (D44535). -
Templated function pointer overload resolution failure. This was one that I noticed fairly early, and filed as Visual Studio Feedback #1464651. It was marked as fixed while I was writing this, and is one of the workarounds added as part of HHVM #5984 (D44535).
-
Now, I'm not certain if this next one is part of the spec or not, and is the earliest workaround for MSVC that I submitted to HHVM, but it does work with GCC and Clang. I have not submitted a bug report on this, although one may exist. Specifically, when defining an
enum class
with a value that has the same name as an outer type, MSVC complains about attempting to redefine the outer type:Cpp struct Accessor { enum class Kind { Empty, Accessor, }; };
This was worked around in HHVM #5622 (D41511) by simply renaming the member of theenum class
fromAccessor
toAccessorKind
. There was another name conflict in HHVM #5636 (D41589), but I'm not sure if that was valid to do in the first place. -
In-class initialization of non-primitive static variables. This is currently guarded by a warning-as-error in MSVC, and is also explicitly mentioned in the blog post announcing the RTM version, so I haven't reported it as a bug. HHVM #5971 (D44457) and pieces of Folly #305 are the types of workaround that was done for this. In certain cases, such as HHVM #6061 (D44895) it is possible to work around the issue by simply changing the value to a function.
-
Next up, a parser error that I had to chase for a while before I figured out the actual issue, because, sometimes even the best of us can overlook the obvious. In this case, the source of the syntax error is in the chain of template instantiations. I did not report this bug, as I'd already worked around it, and, at the time, I was failing at reproducing it in a test that wasn't dependent upon all of HHVM. While typing this up, I was able to reproduce it in a reasonable test case, and have just reported it as Visual Studio Feedback #1832242. I resolved this in HHVM #6020 (D44781) by switching from curly braces to parenthesis, and in Folly #277 by adding an
=
before the opening curly bracket. -
In certain situations, MSVC will also attempt to construct a
const
type, even though aconst
wasn't requested. This particular case worked just fine under the RC version of MSVC 2015, but the RTM version doesn't like it, leading to HHVM #6108 (D45249) being needed. Due to the lack of an isolated test-case, this hasn't been submitted as a bug. -
[[noreturn]]
- You can't currently use[[noreturn]]
on a templated function. I hadn't reported this, as I hadn't tested it under any other compiler, and I didn't know if it was actually valid under the spec, but I did test Clang while writing this, and it didn't have problem with it, after riding out a bit of grumpyness from the feedback system, so this is now filed as Visual Studio Feedback #1835949. This is not currently used in HHVM, but I discovered it when I tried to switch all of the noreturn functions to use the C++ 11 attribute in HHVM #6291 (D47385). That PR is not likely to be merged for other reasons though. And, of course, a test case: ```Cpp template [[noreturn]] void func() { throw T(); }
int main() {
func<int>();
}
```
decltype
produces default arguments in its type. At least, I think that's the issue. Either way, through a macro that is used to keep support for multiple ABIs sane, HHVM ends up doing something like this (stripped of it's body and it's actual functionality): ```Cpp #include <boost/type_traits.hpp>
int abi(int kind = 1);
int abi(int kind) {
return ([&]() -> boost::function_traits<decltype(abi)>::result_type {
return 0;
}());
}
```
Which results in this:
```
error C2383: 'Function': default-arguments are not allowed on this symbol
note: see reference to class template instantiation 'boost::function_traits<int (int)>' being compiled
```
Due to some fun, this ended up being reported as [Visual Studio Feedback #1839232](https://connect.microsoft.com/VisualStudio/feedback/details/1839232), and is also dependent upon Boost v1.58.0, as I haven't tested Boost v1.59.0. I attempted to work around this issue in [HHVM #6257](https://github.com/facebook/hhvm/pull/6257) ([D47019](https://reviews.facebook.net/D47019)), however some concerns were raised about `auto` discarding refs, so I was unsure how to handle this, until I randomly ended up reading through the release notes for VS 2015, and noticed `decltype(auto)` which does exactly what we need it to here. It's worth noting that, IIRC, C++ 14 allows lambdas to have default arguments, I'm not sure if that would solve this particular issue though.
-
I haven't reported this one, but the
/analyze
pass doesn't seem to know about the C99 format specifiers that were added with MSVC 2015, and causes spurious warnings when they are used. This is triggered a lot of places in HHVM. -
I've also not reported this one, but the warning about comma operators within an array index expression, C4709 doesn't seem to be functioning correctly, because it triggers on
v << load{func[argc * 8 + pTabOff], dest};
, whereload
is a type, andfunc
has an overloaded[]
operator.env.insertedPhis[PhiKey { block, alocID }]
is another example of what will trigger it. -
Next, there are a few issues with the standard library implementation that I've encountered.
-
Using
std::atomic<char*>
as the condition to anif
statement results in an error complaining about an ambiguous user-defined-conversion. I filed this as Visual Studio Feedback #1464842 just a bit after the cutoff for changes to make it into the RTM release. I work around this in HHVM #6008 (D44673) and Folly #305 by simply explicitly loading the value. -
Attempting to wait for a
std::condition_variable
with astd::chrono::duration<double>
fails. This is one that I've known existed, but hadn't dug into after figuring out the workaround, and, as such hadn't submitted a bug report about it. This has been reported as Visual Studio Feedback #1839243. This is one of the workarounds added as part of HHVM #5984 (D44535). And, the obligitory test case: ```Cpp #include #include <condition_variable> #include
using namespace std;
void idontwanna() {
mutex m_mutex;
condition_variable m_cv;
unique_lock<mutex> lock(m_mutex);
m_cv.wait_for(lock, chrono::duration<double>(0));
}
```
-
An
std::function
cannot be constructed by passing0
. GCC, Clang, and even the Boost implementation allow this, but the MSVC implementation does not, leading to FBThrift #108 where I just don't pass anything at all. -
And now, a few issues with the actual tooling, so Visual Studio itself.
-
The first issue is mostly an issue of the linker not telling you what it's doing. HHVM is giant, and produces a very large set of input libraries to the linker, in full debug mode, the core of the HHVM runtime can surpass 4.5gb. Now, the 32-bit hosted 64-bit linker can't handle a single static library that size, but it can handle many smaller static libaries, however, when the total input set is larger than a certain point, the linker gets absurdly slow. A normal full link is ~3 minutes, but with a large enough input set, that can turn into ~45 minutes. I believe this is it switching to a "low-memory" mode, as, when the linker is doing this, it is doing an absolutely absurd amount of I/O operations (60-70k/second), and, although very few of these actually hit disk, they still incur the cost of the system call to read it from the cache. This is easily fixed by switching to the 64-bit hosted 64-bit toolchain, as I do in a slightly roundabout way in HHVM #6127 (D45351) due to HHVM being built with CMake, however I believe it would be a good idea to have the linker at least output a warning when it enters this mode, as it took me quite a while to figure out what the actual issue was.
-
I haven't reported this second one, but I believe it is because of
not
being a keyword in an old either C or C++ spec, not sure which, but either way, MSVC compiles it fine, but Visual Studio doesn't like parsingstruct not { Vreg64 s, d; };
complaining about the struct being namednot
. -
Now for a few fairly arbitrary issues that I'm unsure if they are spec related, but have already been worked around.
-
First up, we have a fun one, that I haven't the slightest clue how GCC and Clang are allowing in the first place. I don't see how this is at all valid in the
constexpr
context it's used in, but here it goes: ```Cpp #include
struct MIterTable {
struct Ent { void* array; void* iter; };
std::array<Ent,7> ents;
};
__declspec(thread) MIterTable tl_miter_table;
int main() {
static_assert(tl_miter_table.ents.size() == 7, "");
}
```
In this specific context, because the size of the `std::array` is fixed, and not dependent upon what instance of `MIterTable` is referenced, GCC and Clang allow you to do this assert. MSVC, quite rightly in my eyes, complains that it's not valid in a `constexpr` context. Regardless, HHVM does do this in a few places, and I workaround this as part of [HHVM #5984](https://github.com/facebook/hhvm/pull/5984) ([D44535](https://reviews.facebook.net/D44535)) by simply disabling the `static_assert`'s under MSVC.
-
Under MSVC enums are comparible by default, where-as under GCC and Clang you have to implement the comparison operators yourself. Under MSVC, defining the operators and then attempting to do a comparison results in an error due to ambiguity, so in HHVM #5982 (D44523) I just disabled these under MSVC.
-
I haven't reported this as a bug due to problems creating an isolated test case for it, but in certain cases, MSVC either won't instatiate a function defined as
template<>
, or else/Zc:inline
is eliminating it as unreferenced. Either way, it results in an undefined symbol at link time. To compound the issue, although removing thetemplate<>
from the declaration makes it compile fine for MSVC, both GCC and Clang error if that is not there, leading totemplate<>
being conditionally used for everything except MSVC in HHVM #6101 (D45207). It's worth noting that HHVM did require some modifications to work correctly with/Zc:inline
to begin with via HHVM #6112 (D45279). -
In some cases the way MSVC resolves overloads differs from GCC and Clang. In HHVM #6104 (D45219) an issue was caused by a constant defined as
size_t
being passed where anint32_t
was expected. There is also a difference in the way MSVC resolves overloads when it comes to deleted functions. In HHVM #6105 (D45243), I had to allow theunsigned
constructor ofImmed
, which was previously declared as deleted, due to MSVC resolving any unsigned parameters to that overload, even if the unsigned type was small enough to fit in anint
without truncation or overwriting the sign bit, which is the only constructor thatImmed
actually defines under GCC and Clang. -
MSVC doesn't let you have static arrays with 0 elements, leading to the first change in HHVM #6111 (D45267) This appears to be deliberate on the part of MSVC, and the code in question does absolutely nothing, so it simply got disabled under MSVC.
-
Now that I've dealt with the issues with current features, I'll move on to the features whose absence has needed to be worked around to compile under MSVC. These are the things that should be reasonable to implement, and may even already be planned for a future release of MSVC.
-
C99 - VLA.
void myFunc(uint8_t p) { uint8_t buf[p]; }
I've switched these out foralloca
s in HHVM #5676 (D41835) and HHVM #6000 (D44625). However, because VLAs are released at the end of the current scope andalloca
doesn't free until the end of the function call, this has come back to bite some of the devs, resulting in a partial revert of that PR,HHVM f83e5722 and a bit of discussion around Folly #271 (D46407), resulting in the use ofreq::vector
, which is a per-request allocated value in HHVM instead. This is faster than a normalmalloc
, but still not as fast as it would be with a VLA, so it would be nice to be able to use VLAs instead. -
C99 - Named variadic macro parameters.
#define NEW_EXP(cls, e...)
Basically this is the difference between being able to name the parameters and having to use__VA_ARGS__
. I had to make changes for this in multiple places, including HHVM #5640 (D41613). Note that supporting this might be an opportunity to provide a fix for the way__VA_ARGS__
is expanded when passed as arguments to another macro, eliminating the need for an extra intermediate define, as a grumpy me had to add in HHVM #6205 (D46251) after figuring out the issue and extracting the actual fix from a StackOverflow question about overloaded macros. -
C99 - Named member initializers.
const ABI abi { .canSpill = true };
HHVM used these in a couple of places, though luckily I didn't have to deal with swapping the order of the initilizers in HHVM #5627 (D41529), and was instead able to just eliminate the named part of it, and just use a normal intializer instead. -
Next up, a few things that would be nice to have, and probably aren't unreasonable to implement, but aren't part of a standard.
-
--whole-archive
- The GCC and Clang linkers both have a way to tell the linker to assume that all objects in a static library are to be included as-if they were passed to the linker individually. This is not currently possible with MSVC's linker, and in HHVM #6143 (D45489) I attempt to work around this by using OBJECT libraries, which are part of CMake. They work by simply passing the compiled object files to the linker directly, rather than a static library. This is not likely to make it into HHVM any time soon, because OBJECT libraries aren't supported until CMake 3.0, however, quite a few of the supported Linux distros don't have CMake 3.0+, the automated testers for instance run CMake 2.7, this makes increasing the required version of CMake a much harder thing to do. HHVM needs to do this because it uses dynamic initializers to register extensions on startup, and, if the objects wouldn't otherwise be included in the link, those extensions end up not being linked in, which would mean that a central spot would be required to explicitly register every extension, and that is not easily maintainable. -
A way to get the current frame pointer. For GCC and Clang HHVM does this by reading the RBP register, and I can read RBP as well, but it would require an external assembly stub, and I'm not even sure if that would be entirely correct for MSVC. In HHVM #5943 (D44295) I simply declare the variable and set it
nullptr
, after analways_assert(false)
. -
A way to tell MSVC to clobber all the callee saved registers. This is something specific to HHVM's use-case, and could probably be achieved via an external assembly stub, but in HHVM #6119 (D45309) I've just done an
always_assert(false)
instead, as it's only needed for the JIT. -
A way to allocate specifically in low memory. HHVM specifically requires that the memory allocated for the JIT'd code be within the first 4gb of memory. Currently the plan to work around this is to use a statically allocated array in the binary (HHVM is linked at the absolute lowest address possible, and is set to not be able to load anywhere else), but it would be nice if there were some other way to do this. Unfortunately, I wouldn't be surprised if this were a limitation of the OS, but this would be much nicer if it were possible to do the allocation dynamically at runtime.
-
constexpr
evaluation of certain builtins. GCC supports this, and I don't think Clang does, but it's not an unreasonable thing to support. Folly can make use of_BitScanForward
,_BitScanForward64
,_BitScanReverse
,_BitScanReverse64
, and__popcnt64
in this context, however I believe there are a significant number of intrinsics that could make use of this capability. It is possible to call them within the constraints of theconstexpr
syntax:Cpp inline int __builtin_ffs(int x, unsigned long index = 0) { return (int)(_BitScanForward(&index, (unsigned long)x) ? index : 0); }
-
There is not currently a way under MSVC to detect at compile time whether we re targetting a little endian or big endian CPU. In FBThrift #110 I simply hard-code it to little endian.
-
The coalescing operator,
?:
. GCC and Clang support it, but it's not part of any spec that I've seen. I worked around this in HHVM #5626 (D41535) and HHVM #6103 (D45225) by simply copying the condition and sticking it between the?
and:
, making it into a ternary instead. -
Computed
goto
s.goto *optab[uint8_t(op)];
In HHVM #5669 (D41811) I refactored the bytecode interpreter to use a switch statement rather than a computed goto under MSVC, because it is not currently possible to implement a threaded interpreter, which is significantly faster than a switch based dispatch, without computedgoto
s, which are not currently supported under MSVC. I suspect this is likely a moderately complex undertaking to implement in MSVC, involving moderately complex changes in both the frontend, optimization, and codegen passes. -
Explicit type conversion syntax. At least, that's what MSVC's error message calls it, "error C4576: a parenthesized type followed by an initializer list is a non-standard explicit type conversion syntax". I fixed the places where HHVM was using this syntax to construct some things in HHVM #5686 (D41889) and where FBThrift was using it in FBThrift #111 by simply removing the parenthesis around the type. The compiler knows the syntax exists, it just doesn't support it.
-
A flatten attribute,
__attribute__((__flatten__))
, or[[flatten]]
as GCC and Clang do it. This is more a performance tweaking piece, and really isn't needed, but I thought it best to include everything. -
The
<cxxabi.h>
header. HHVM primarily uses this for a clean demangling API, and this is a bit less reasonable than the rest of these, but it would eliminate a few spots where the Windows API,UnDecorateSymbolName
, for this had to be used instead. -
Next, I have a few things that had to be worked around, but are quite a bit less likely to be implemented or changed, by Microsoft at least.
-
Struct and class being part of the mangled names. This was a big one, because HHVM uses
class
andstruct
interchangibly to refer to types. This is a problem for MSVC if the full definition of the type is never declared in a place that includes the type as part of a mangled name. To solve this the forward declarations just have to be synchronized with their actual definitions, for HHVM #5693 (D42015) I wrote a simple tool to do the majority of the fixes, and then did the rest of the fixes manually as I came across them. -
Bitfield layout. This is a difference that is allowed by spec, and changing it would break a lot of code, so it's not likely to happen anytime soon. Allowing it via explicit request for certain declarations would be nice though. I attempt to work around it in HHVM #5990 (D44571) however GCC is currently grumpy about enums being stored in less bits than they are, so this will likely require a conditional change for MSVC.
-
Typeof. I don't even remotely expect this to ever be implemented. GCC supports an extension to C++ that adds a
typeof
keyword. This keyword performs, in the context that HHVM used it, the exact same job asdecltype
. I swapped the uses oftypeof
fordecltype
in HHVM #5624 (D41517). -
Because MSVC will not layout a child struct's members in a parent struct's alignment padding unless the struct has virtual inheritence, which is not possible in this particular case, the assertion on the size of
ObjectData
had to be changed in HHVM #6116 (D45303) from 24 to 20 under MSVC. -
The last one is not quite on HHVM's side of things, but definitely isn't Microsoft's issue. MySQL 5.6, and MySQL 5.7 RC do not compile under MSVC 2015 due, IIRC, to it adding the
timeval
struct. This means that WebScaleSQL may end up having to make the upgrade to 5.7, a release that is currently intended to be skipped, assuming the actual release of MySQL 5.7 does compile properly with MSVC 2015. Without the WebScaleSQL upgrade, the Windows version of HHVM will likely not be supporting the async MySQL extension, due to the fact that HHVM links against the static version of the runtime, and the MySQL client library would have to be dynamically linked in, and also link against an earlier version of MSVC, which I anticipate wreaking havoc at runtime. -
Last up is the differences in design philosophy between Posix and Windows systems.
-
Stack unwinding. These are just so different that it pretty much has to be handled specially for each platform, there's no getting around it. For the moment, HHVM #6007 (D44667) simply stubs out the unwind mechanics that the JIT uses in a way that will error when it's called.
-
Timezones. MSVC and the Posix API handle these very differently, and, although having
tm_gmtoff
andtm_zone
as proper members oftm
would be nice, it's something that I'm probably just going to have to either workaround, as I did in HHVM #5999 (D44619), or else disable, as I did in HHVM #6096 (D45177). -
Locales. It's not much of a difference, but there is a big enough one to justify adding a shim, which is what I did in HHVM #5633 (D41571)
-
Getting the home directory and temporary directory. Posix systems have environment variables that can be used to get these, Windows has a pair of variables for the home directory, and a system call for the temporary directory.
-
Sockets. MSVC doesn't support
AF_UNIX
sockets, so places using them have simply been disabled in my port. -
Paths. Windows uses
\
as a directory separator, and Unix uses/
. There is also a difference in what an absolute path looks like, which has given rise to HHVM #5822 (D43227), which only addresses the issue where I've noticed it. -
File Descriptors vs.
HANDLE
s /SOCKET
s. While the WinSock API is almost the same as the Posix API, minus the lack of AF_UNIX support, there is one key difference, the Posix API uses file descriptors, passed around asint
s, and the WinSock API usesSOCKET
s. Now, unfortunately, due to C++ conventions, even with this difference, it will compile just fine, however, this will likely not work at runtime, because theSOCKET
s are being truncated, and, although the compiler generates warnings for this, those warnings are disabled when building HHVM, because it's absolutely full of implicit truncations, even on Linux. Thankfully, because HHVM is 64-bit only, we can handle this issue by simply adding overloads that accept file descriptors, and translating those file descriptors to the actual WinSock functions. This works without needing to change existing code, other than replacing includes, for any function that accepts aSOCKET
for a parameter. For functions that return aSOCKET
, such as,socket
, we have to explicitly reference our wrapper function, which is also stubbed out for non-windows platforms.