Skip to content

Instantly share code, notes, and snippets.

@kiszk
Created August 26, 2020 06:41
Show Gist options
  • Save kiszk/daedf1e033a07d3f484655a69aac78ca to your computer and use it in GitHub Desktop.
Save kiszk/daedf1e033a07d3f484655a69aac78ca to your computer and use it in GitHub Desktop.
Micah, thank you for expanding the scope for Big Endian support in Arrow. I am happy to see this when I am back from one-week vacation. I agree with this since we have just seen the kickoff of BE support in Go. Let me make comments
> 1. As long as there is CI in place to catch regressions (right now I think the CI is fairly unreliable?)
I agree. While TravisCI on s390x are unreliable, each platform can set up CI script.
> 2. No degradation in performance for little-endian architectures (verified by additional micro benchmarks)
Yes, we can do it. @kou suggested me to use the existing mechanism like [1]. Now, it supports C++.We could expand this to other languages.
> 3. Not a large amount of invasive code to distinguish between platforms.
Yes, I will prepare a draft of the principle for endian-independent code in another mail.
> Kazuaki Ishizaki I asked question previously, but could you give some data points around:
Sorry for the delay. I was waiting for comments during my one-week vacation.
> 1. The current state of C++ support (how much code needed to change)?
In my understanding, the PR [2] is the last PR to support big-endian for intra-process and inter-process in C++ except Parquet support.
> 2. How many more PRs you expect to need for Java (and approximate size)?
I have just submitted two PRs [5, 6] today to pass the current test using “mvn install”. The total size of the four PRs [3, 4, 5, 6] are around 200 lines. For CI for Java, I submitted one PR [7] (~100 lines).
In addition, I will submit another PR (~500 lines) to pass integration test.
BTW, I think that it would be good to add test cases to increase test coverage for little-endian and big-endian (e.g. increase the types of a schema to be tested in flight-core). If it is acceptable, we can submit more PRs regardless of endianness.
[1] https://github.com/apache/arrow/pull/7940#issuecomment-672690540
[2] https://github.com/apache/arrow/pull/7507
[3] https://github.com/apache/arrow/pull/7942
[4] https://github.com/apache/arrow/pull/7944
[5] https://github.com/apache/arrow/pull/8056
[6] https://github.com/apache/arrow/pull/8057
[7] https://github.com/apache/arrow/pull/7938
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment