@phrz
Created September 21, 2018 17:33
Helpers for reading binary file formats in C++.
#include <array>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>

// creates a new instance of the type
// and reads data into it. Good for single
// value types.
template<typename T>
T readBinary(std::ifstream& ifs) {
    T out{};
    ifs.read(
        reinterpret_cast<char*>(&out),
        sizeof(T)
    );
    return out;
}

// reads a series of binary values
// into an array with a given type.
// Should be size and type safe.
template<typename T, std::size_t size>
void readBinaryArray(
    std::ifstream& ifs,
    std::array<T, size>& data
) {
    // data.data() points at the first element,
    // so exactly sizeof(T) * size bytes are filled
    ifs.read(
        reinterpret_cast<char*>(data.data()),
        sizeof(T) * size
    );
}

int main() {
    std::string fileName = "sphere.stl";
    std::ifstream file { fileName, std::ios::binary };
    if(!file.is_open()) {
        std::cerr << "Could not open file." << std::endl;
        return 1;
    }

    // Skipping header (80 bytes)
    file.ignore(80);

    auto triangleCount = readBinary<std::uint32_t>(file);
    std::array<float, 12> points;
    for(std::uint32_t i = 0; i < triangleCount; i++) {
        // reads 12 floats into `points`
        readBinaryArray(file, points);
        // skip the attribute data
        readBinary<std::uint16_t>(file);
    }
}
@ViralTaco

reads a series of binary values into an array with a given type.
Should be size and type safe.

Then you use reinterpret_cast, and honestly I can't find a decent answer for this.
Can you explain to me how this is type safe? I mean, I see what it does.
But I'm not sure whether or not this is UB. I've had answers like "this is UB" and "this is fine", and I kind of agree more with the first one.
My best guess as to why this could be well defined would be this:

An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_cast<cv T*>(static_cast<cv void*>(v)). [ Note: Converting a prvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. — end note ]

(cf. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4713.pdf § 8.5.1.11, p. 107)
Thank you for your time.

@phrz
Author

phrz commented Oct 11, 2019

I have to preface this by saying it’s been a while since I’ve looked at this snippet, which I do not put forth as a good or recommended way of doing things, just one that worked for me in an STL format parser I made for fun. Notwithstanding that, the need to reinterpret comes from the ifstream::read method, which operates on char_type*. I do not see an issue, as I read from arbitrary binary: “type safety” must begin somewhere along the deserialization line, and I begin it after that read call dumps data into a typed object while disregarding its type. If the incoming binary is bad, the object will not reflect a correct value.

@ViralTaco

Ok, sounds good. I've looked into it, and this can be implementation-specific behaviour if the file you read wasn't written by the same implementation, i.e. floats don't have to be 4 bytes wide for some reason. Otherwise, I've tried to find a way to break this but couldn't. This actually is one of the rare times where using reinterpret_cast isn't UB.

I guess if I were to implement this, I'd do the reading and parsing in different functions. It'd be slower, but it'd allow for some type checking. Another possible way of doing this is using ::sscanf, but then it's not really type safe either. The simplest way is probably std::ifstream::operator>>(), but then you'd need a temporary and it'd most likely be slower (before optimization; who knows if the compiler would see through it).

Anyway… Thanks for answering

@phrz
Author

phrz commented Oct 12, 2019

Hey, it’s no problem. To be clear, the intent of this code is to parse binary STL files, which follow a consistent, well-defined encoding; it’s not to deserialize data serialized by this or other C++ implementations. I would not recommend this code for such a broad purpose, but rather a more robust serialization library, perhaps protobufs or bson.
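
Because the layout is fixed, a couple of sanity checks can be derived from it. The sketch below uses the sizes that appear in the gist (80-byte header, a 4-byte count, then 12 floats and a 2-byte attribute per triangle). The helper names `decodeU32LE` and `expectedStlSize` are hypothetical, and the little-endian byte order is an assumption about binary STL files that the thread itself does not state.

```cpp
#include <cstdint>

// Decodes a 32-bit unsigned integer stored least-significant byte
// first, independent of the host's byte order.
// Assumption: the triangle count in the file is little-endian.
std::uint32_t decodeU32LE(const unsigned char* b) {
    return static_cast<std::uint32_t>(b[0])
         | static_cast<std::uint32_t>(b[1]) << 8
         | static_cast<std::uint32_t>(b[2]) << 16
         | static_cast<std::uint32_t>(b[3]) << 24;
}

// Total file size implied by the layout: 80-byte header, 4-byte count,
// then 50 bytes (12 four-byte floats + 2 attribute bytes) per triangle.
std::uint64_t expectedStlSize(std::uint32_t triangleCount) {
    return 80u + 4u + std::uint64_t{50} * triangleCount;
}
```

Comparing `expectedStlSize(triangleCount)` with the actual file size before entering the loop would be one inexpensive way to reject truncated or non-STL input.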
