@rzarzynski
Last active March 8, 2018 13:12
Validation of Ceph official deb packages for the FastCRC32 support in RocksDB

Logic of the fast CRC32 check in RocksDB

RocksDB calls rocksdb::crc32c::IsFastCrc32Supported() to log whether the fast CRC32 path is available. The function delegates to isSSE42() when the macro __SSE4_2__ is defined (or, failing that, when _WIN64 is defined). Otherwise it returns false.

bool IsFastCrc32Supported() {
#ifdef __SSE4_2__
  return isSSE42();
#elif defined(_WIN64)
  return isSSE42();
#else
  return false;
#endif
}

rocksdb::crc32c::IsFastCrc32Supported() is used in DumpSupportInfo():

void DumpSupportInfo(Logger* logger) {
  ROCKS_LOG_HEADER(logger, "Compression algorithms supported:");
  ROCKS_LOG_HEADER(logger, "\tSnappy supported: %d", Snappy_Supported());
  ROCKS_LOG_HEADER(logger, "\tZlib supported: %d", Zlib_Supported());
  ROCKS_LOG_HEADER(logger, "\tBzip supported: %d", BZip2_Supported());
  ROCKS_LOG_HEADER(logger, "\tLZ4 supported: %d", LZ4_Supported());
  ROCKS_LOG_HEADER(logger, "\tZSTD supported: %d", ZSTD_Supported());
  ROCKS_LOG_HEADER(logger, "Fast CRC32 supported: %d",
                   crc32c::IsFastCrc32Supported());
}

However, the check is too costly (it relies on the cpuid instruction) to be executed each time a CRC calculation is invoked. The result is therefore cached: the Choose_Extend() function is used to initialize a static variable.

static inline Function Choose_Extend() {
  return isSSE42() ? ExtendImpl<Fast_CRC32> : ExtendImpl<Slow_CRC32>;
}
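The caching pattern can be sketched in isolation as follows. This is a minimal illustration, not RocksDB's code: the namespace, ProbeCpu(), ProbeCalls() and the placeholder ExtendFast() are hypothetical names, and the "fast" variant simply reuses the portable bitwise CRC32C so the sketch builds on any platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace cache_sketch {

using ExtendFn = uint32_t (*)(uint32_t crc, const char* buf, size_t size);

// Illustration-only counter showing how often the costly probe runs.
inline int& ProbeCalls() {
  static int calls = 0;
  return calls;
}

// Hypothetical stand-in for isSSE42(); the real function issues cpuid.
inline bool ProbeCpu() {
  ++ProbeCalls();
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("sse4.2");
#else
  return false;
#endif
}

// Portable bitwise CRC32C (Castagnoli, reflected polynomial 0x82F63B78).
inline uint32_t ExtendSlow(uint32_t crc, const char* buf, size_t size) {
  assert(buf != nullptr || size == 0);
  crc = ~crc;
  for (size_t i = 0; i < size; ++i) {
    crc ^= static_cast<unsigned char>(buf[i]);
    for (int bit = 0; bit < 8; ++bit) {
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

// Placeholder for the hardware path; it reuses the portable code here.
inline uint32_t ExtendFast(uint32_t crc, const char* buf, size_t size) {
  return ExtendSlow(crc, buf, size);
}

inline ExtendFn ChooseExtend() {
  return ProbeCpu() ? ExtendFast : ExtendSlow;
}

inline uint32_t Extend(uint32_t crc, const char* buf, size_t size) {
  // The static is initialized on the first call only, so the probe runs once.
  static const ExtendFn chosen = ChooseExtend();
  return chosen(crc, buf, size);
}

}  // namespace cache_sketch
```

Calling cache_sketch::Extend() repeatedly leaves ProbeCalls() at 1, which is the whole point of routing the decision through a static initializer.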

In contrast to IsFastCrc32Supported(), Choose_Extend() calls isSSE42() unconditionally, and the call likely returns true. However, Fast_CRC32() itself falls back to Slow_CRC32() when __SSE4_2__ is undefined.

static inline void Fast_CRC32(uint64_t* l, uint8_t const **p) {
#ifdef __SSE4_2__
#  ifdef __LP64__
  *l = _mm_crc32_u64(*l, LE_LOAD64(*p));
  *p += 8;
#  else
  *l = _mm_crc32_u32(static_cast<unsigned int>(*l), LE_LOAD32(*p));
  *p += 4;
  *l = _mm_crc32_u32(static_cast<unsigned int>(*l), LE_LOAD32(*p));
  *p += 4;
#  endif
#elif defined(_WIN64)
// ...
#else
  Slow_CRC32(l, p);
#endif
}

Conclusion: to judge whether RocksDB uses the fast CRC32 path, it is enough to check the value returned by IsFastCrc32Supported(). [UPDATED]: The logic of the check is tricky. It is possible for isSSE42() to return true (so a user pays the init-time cost of cpuid) while Fast_CRC32 ultimately turns into Slow_CRC32.
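The possible combinations can be written down as a tiny model. This is only a sketch of the reasoning above, ignoring the _WIN64 branch; the namespace, the Outcome struct and Model() are hypothetical names that do not exist in RocksDB.

```cpp
namespace model_sketch {

struct Outcome {
  bool logged_fast;   // what "Fast CRC32 supported: %d" would print
  bool cpuid_probed;  // whether isSSE42() runs at init time
  bool hw_crc_used;   // whether the SSE4.2 crc32 instruction is really executed
};

// sse42_macro:   was __SSE4_2__ defined at build time?
// cpu_has_sse42: would isSSE42() return true at run time?
constexpr Outcome Model(bool sse42_macro, bool cpu_has_sse42) {
  return Outcome{
      sse42_macro && cpu_has_sse42,  // IsFastCrc32Supported()
      true,                          // Choose_Extend() always calls isSSE42()
      sse42_macro && cpu_has_sse42   // Fast_CRC32 needs both to use hardware
  };
}

}  // namespace model_sketch
```

In this model logged_fast and hw_crc_used always agree, which is why the logged value is a sufficient indicator, while cpuid_probed stays true even when the slow path is taken.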

ceph-osd_12.2.4-1trusty_amd64.deb

$ objdump -D -C ./usr/bin/ceph-osd
...
0000000000db2850 <rocksdb::crc32c::IsFastCrc32Supported()@@Base>:
  db2850:       31 c0                   xor    %eax,%eax
  db2852:       c3                      retq   
  db2853:       66 66 66 66 2e 0f 1f    data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
  db285a:       84 00 00 00 00 00 

Result: there is no support for the fast CRC32 path.

ceph-osd_12.2.4-1xenial_amd64.deb

$ objdump -D -C ./usr/bin/ceph-osd
...
0000000000e829f0 <rocksdb::crc32c::IsFastCrc32Supported()@@Base>:
  e829f0:       31 c0                   xor    %eax,%eax
  e829f2:       c3                      retq   
  e829f3:       0f 1f 00                nopl   (%rax)
  e829f6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  e829fd:       00 00 00 

Result: there is no support for the fast CRC32 path.

ceph-osd_12.2.4-1~bpo90+1_amd64.deb

$ objdump -D -C ./usr/bin/ceph-osd
...
0000000000e43670 <rocksdb::crc32c::IsFastCrc32Supported()@@Base>:
  e43670:       31 c0                   xor    %eax,%eax
  e43672:       c3                      retq   
  e43673:       0f 1f 00                nopl   (%rax)
  e43676:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  e4367d:       00 00 00 

Result: there is no support for the fast CRC32 path.

@rzarzynski
Author

I added the indentation of the preprocessor directives in Fast_CRC32 to make the code more readable.

@liewegas

liewegas commented Mar 5, 2018

@rzarzynski
Author

The Fast_CRC32 function can get the target attribute, depending on the macros HAVE_SSE42 and __GNUC__.

#if defined(HAVE_SSE42) && defined(__GNUC__)
#  if defined(__clang__)
#    if __has_cpp_attribute(gnu::target)
__attribute__ ((target ("sse4.2")))
#    endif
#  else  // gcc supports this since 4.4
__attribute__ ((target ("sse4.2")))
#  endif
#endif
static inline void Fast_CRC32(uint64_t* l, uint8_t const **p) {

However, grepping the whole Ceph tree (including the RocksDB directory) finds no definition of HAVE_SSE42, only its single use.

$ grep -r HAVE_SSE42 ./
./src/rocksdb/util/crc32c.cc:#if defined(HAVE_SSE42) && defined(__GNUC__)

It also doesn't look like a compiler-provided macro.

#include <iostream>

int main (void) {
#if defined(HAVE_SSE42)
  std::cout << "HAVE_SSE42 defined" << std::endl;
#endif

#if defined(__SSE4_2__)
  std::cout << "__SSE4_2__ defined" << std::endl;
#endif

#if defined(__GNUC__)
  std::cout << "__GNUC__ defined" << std::endl;
#endif
}

Two runs of this program on my machine (Ubuntu 16.04), compiled with and without -msse4.2:

$ g++ -msse4.2 -std=c++17 /tmp/test.cc -o /tmp/test && /tmp/test
__SSE4_2__ defined
__GNUC__ defined
$ g++ -std=c++17 /tmp/test.cc -o /tmp/test && /tmp/test
__GNUC__ defined

@rzarzynski
Author

rzarzynski commented Mar 7, 2018

On the freshest master build, the CRC32 support looks fine.

0000000000a3b750 <rocksdb::crc32c::IsFastCrc32Supported[abi:cxx11]()@@Base>:
  a3b750:       55                      push   %rbp
  a3b751:       41 b8 03 00 00 00       mov    $0x3,%r8d
  a3b757:       31 f6                   xor    %esi,%esi
  a3b759:       48 89 e5                mov    %rsp,%rbp
  a3b75c:       41 57                   push   %r15
  a3b75e:       41 56                   push   %r14
  a3b760:       41 55                   push   %r13
  a3b762:       41 54                   push   %r12
  a3b764:       4c 8d 65 80             lea    -0x80(%rbp),%r12
  a3b768:       53                      push   %rbx
  a3b769:       4c 8d 77 10             lea    0x10(%rdi),%r14
  a3b76d:       49 89 fd                mov    %rdi,%r13
  a3b770:       48 83 ec 58             sub    $0x58,%rsp
  a3b774:       4c 89 37                mov    %r14,(%rdi)
  a3b777:       48 c7 47 08 00 00 00    movq   $0x0,0x8(%rdi)
  a3b77e:       00 
  a3b77f:       64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
  a3b786:       00 00 
  a3b788:       48 89 45 c8             mov    %rax,-0x38(%rbp)
  a3b78c:       31 c0                   xor    %eax,%eax
  a3b78e:       49 8d 44 24 10          lea    0x10(%r12),%rax
  a3b793:       c6 47 10 00             movb   $0x0,0x10(%rdi)
  a3b797:       4c 89 e7                mov    %r12,%rdi
  a3b79a:       48 c7 45 88 00 00 00    movq   $0x0,-0x78(%rbp)
  a3b7a1:       00 
  a3b7a2:       c6 45 90 00             movb   $0x0,-0x70(%rbp)
  a3b7a6:       48 89 45 80             mov    %rax,-0x80(%rbp)
  a3b7aa:       b8 01 00 00 00          mov    $0x1,%eax
  a3b7af:       0f a2                   cpuid  
  a3b7b1:       81 e1 00 00 10 00       and    $0x100000,%ecx
  a3b7b7:       31 d2                   xor    %edx,%edx
  a3b7b9:       89 cb                   mov    %ecx,%ebx

UPDATE: the notcmalloc variant is also OK.

@rzarzynski
Author

My custom luminous v12.2.4 build (with the paranoid checker) is affected:

0000000000e833c0 <rocksdb::crc32c::IsFastCrc32Supported()@@Base>:
  e833c0:       31 c0                   xor    %eax,%eax
  e833c2:       c3                      retq   
  e833c3:       0f 1f 00                nopl   (%rax)
  e833c6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  e833cd:       00 00 00 

@rzarzynski
Author

OK, I believe we've got the ultimate confirmation:

  • the problem occurs on all distros, and
  • the value returned by IsFastCrc32Supported() is consistent with the fallback behavior of Fast_CRC32: it really does end up using Slow_CRC32.

Here is a fragment of one of the logs from the probing shaman build.

/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.4-2-g21e42e3/rpm/el7/BUILD/ceph-12.2.4-2-g21e42e3/src/rocksdb/util/crc32c.cc:359:3: error: #error "Oops, __SSE4_2__ is NOT defined. Going Slow_CRC32()!"
 # error "Oops, __SSE4_2__ is NOT defined. Going Slow_CRC32()!"

The idea is to raise #error in RocksDB whenever Slow_CRC32 is going to be picked. See commits ceph/rocksdb@f884ce3 and ceph/ceph-ci@21e42e3.
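A probe of this kind might look roughly like the sketch below. It is not the content of the commits mentioned above: PARANOID_CRC32_CHECK, the namespace, kFastPathCompiledIn and CompiledCrcPath() are hypothetical names, and the opt-in macro is there only so that the file still compiles in ordinary builds.

```cpp
// PARANOID_CRC32_CHECK is a hypothetical opt-in macro for this sketch.
// When it is set, the build fails as soon as the slow path would be compiled.
#if defined(PARANOID_CRC32_CHECK) && !defined(__SSE4_2__)
#  error "__SSE4_2__ is NOT defined; Fast_CRC32 would fall back to Slow_CRC32"
#endif

namespace probe_sketch {

// Mirrors the preprocessor decision made inside Fast_CRC32.
#ifdef __SSE4_2__
constexpr bool kFastPathCompiledIn = true;
#else
constexpr bool kFastPathCompiledIn = false;
#endif

// Name of the routine that the compiled code would end up running.
constexpr const char* CompiledCrcPath() {
  return kFastPathCompiledIn ? "Fast_CRC32" : "Slow_CRC32";
}

}  // namespace probe_sketch
```

Compiling with -DPARANOID_CRC32_CHECK and without -msse4.2 should stop at the #error line, which gives a build-log signal instead of requiring disassembly of the resulting binary.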

@rzarzynski
Author

About the fix. Just setting HAVE_SSE42 is not enough: the result of IsFastCrc32Supported() would then contradict the actual flow. Moreover, crc32c.cc in RocksDB's master has gained many new things (HAVE_POWER8, HAVE_PCLMUL, NO_THREEWAY_CRC32C), so it looks like we need something specific for our ceph/rocksdb in Luminous.

Most likely, giving IsFastCrc32Supported the same target attribute that Fast_CRC32 currently has, plus defining HAVE_SSE42 in the makefile, will do.
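One way to keep the reported value and the executed path in agreement is to drive both from the same build-time macro. The sketch below is an assumption-laden illustration, not the actual patch: SKETCH_HAVE_SSE42 stands in for HAVE_SSE42, the namespace and function bodies are hypothetical, and the runtime check uses the GCC/Clang builtin __builtin_cpu_supports instead of RocksDB's isSSE42(). Older compilers may refuse to include <nmmintrin.h> without -msse4.2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// SKETCH_HAVE_SSE42 is a hypothetical stand-in for HAVE_SSE42 from the build system.
#if defined(__GNUC__) && defined(__x86_64__)
#  define SKETCH_HAVE_SSE42 1
#  include <nmmintrin.h>
#endif

namespace fix_sketch {

// Portable bitwise CRC32C (Castagnoli, reflected polynomial 0x82F63B78).
inline uint32_t SlowCrc32c(uint32_t crc, const unsigned char* p, size_t n) {
  crc = ~crc;
  for (size_t i = 0; i < n; ++i) {
    crc ^= p[i];
    for (int bit = 0; bit < 8; ++bit) {
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

#ifdef SKETCH_HAVE_SSE42
// The target attribute lets this one function use SSE4.2 without -msse4.2.
__attribute__((target("sse4.2")))
inline uint32_t FastCrc32c(uint32_t crc, const unsigned char* p, size_t n) {
  crc = ~crc;
  for (size_t i = 0; i < n; ++i) {
    crc = _mm_crc32_u8(crc, p[i]);
  }
  return ~crc;
}
#endif

// Guarded by the same macro as FastCrc32c, so the report matches the flow.
inline bool IsFastCrc32Supported() {
#ifdef SKETCH_HAVE_SSE42
  return __builtin_cpu_supports("sse4.2");
#else
  return false;
#endif
}

inline uint32_t Crc32c(const void* data, size_t n) {
  assert(data != nullptr || n == 0);
  const unsigned char* p = static_cast<const unsigned char*>(data);
#ifdef SKETCH_HAVE_SSE42
  if (IsFastCrc32Supported()) {
    return FastCrc32c(0, p, n);
  }
#endif
  return SlowCrc32c(0, p, n);
}

}  // namespace fix_sketch
```

Because the hardware routine and the support check sit behind one macro, a build cannot report fast CRC32 while silently running the slow code, or the other way round.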

@rzarzynski
Author

In master it apparently works because of ceph/rocksdb@11c5d47, which sets HAVE_SSE42.

Also, ceph/rocksdb@019aa70 matters for the whole case: it removes the target attribute from Fast_CRC32 because the build failed on CentOS 7 with older GCCs (see the commit messages).
