Total number of JIRA tickets assigned to version 0.17.0: 563
Total number of applied patches since 0.16.0: 529
Patches with assigned issue in 0.17.0:
- ARROW-7712: [CI] [Crossbow] Delete fuzzit jobs
- ARROW-6738: [Java] Fix problems with current union comparison logic
- ARROW-7734: [C++] check status details for nullptr in equality
- ARROW-6724: [C++] Allow simpler BufferOutputStream creation
- ARROW-6871: [Java] Enhance TransferPair related parameters check and tests
- ARROW-7301: [Java] Sql type DATE should correspond to DateDayVector
- ARROW-7736: [Release] Retry binary download on transient error
- ARROW-7466: [CI][Java] Fix gandiva-jar-osx nightly build failure
- ARROW-7684: [Rust] Example Flight client and server for DataFusion
- ARROW-7729: [Python][CI] Pin pandas version to 0.25 in the dask integration test
- ARROW-7735: [Release][Python] Use pip to install dependencies for wheel verification
- ARROW-7726: [CI] [C++] Use boost binaries on Windows GHA build
- ARROW-7073: [Java] Support concating vectors values in batch
- ARROW-7750: [Release] Make the source release verification script restartable
- ARROW-7751: [Release] macOS wheel verification also needs arrow-testing
- ARROW-7691: [C++] Check non-scalar Flatbuffers fields are not null
- ARROW-7760: [Release] Fix verify-release-candidate.sh since pip3 seems to no longer be in miniconda, install miniconda unconditionally
- ARROW-6757: [Release] Use same CMake generator for C++ and Python when verifying RC, remove Python 3.5 from wheel verification
- ARROW-7752: [Release] Enable and test dataset in the verification script
- ARROW-7766: [Python][Packaging] Windows py38 wheels are built with wrong ABI tag
- ARROW-7762: [Python] Do not ignore exception for invalid version in ParquetWriter
- ARROW-4226: [C++] Add sparse CSF tensor support
- ARROW-7405: [Java] ListVector isEmpty API is incorrect
- ARROW-7467: [Java] ComplexCopier does incorrect copy for Map nullable info
- ARROW-7524: [C++][CI] Enable Parquet in the VS2019 GHA job
- ARROW-7720: [C++][Python] Add check_metadata argument to Table.equals
- ARROW-7774: [Packaging][Python] Update macos and windows wheel filenames
- ARROW-5981: [C++] Propagate errors from MemoTable to DictionaryBuilder
- ARROW-7780: [Release] Fix Windows wheel RC verification script given lack of "m" ABI tag in Python 3.8
- ARROW-7631: [C++][Gandiva] return zero if there is an overflow while downscaling a decimal
- ARROW-7797: [Release][Rust] Fix arrow-flight's version in datafusion crate
- ARROW-7772: [R][C++][Dataset] Unable to filter on date32 object with date64 scalar
- ARROW-7796: [R] write_* functions should invisibly return their inputs
- ARROW-7799: [R][CI] Remove flatbuffers from homebrew formulae
- ARROW-7745: [Doc] [C++] Update Parquet documentation
- ARROW-7804: [C++][R] Compile error on macOS 10.11
- ARROW-7791: [C++][Parquet] Fix building error "cannot bind lvalue"
- ARROW-7662: [R] Support creating ListArray from R list
- ARROW-5742: [CI][C++] Add nightly Valgrind build
- ARROW-7817: [CI] macOS R autobrew nightly failed on installing dependency from source
- ARROW-7829: [R] Test R bindings on clang
- ARROW-7787: [Rust] Added .collect to Table API
- ARROW-7828: [Release] Remove SSH keys for internal use
- ARROW-7701: [FlightRPC][C++] disable flaky MacOS test
- ARROW-7832: [R] Patches to 0.16.0 release
- ARROW-7795: [Rust] Added support for NOT
- ARROW-7119: [C++][CI] Show automatic backtraces
- ARROW-7624: [Rust] Soundness issues via Buffer methods
- ARROW-7754: [C++] Make Result<> faster
- ARROW-7722: [FlightRPC][Java] disable flaky Flight auth test
- ARROW-2447: [C++] Device and MemoryManager API
- ARROW-7815: [C++] Improve input validation
- ARROW-5757: [Python] Remove Python 2.7 support
- ARROW-7793: [Java] Release accounted-for reservation memory to parent in case of leak
- ARROW-6875: [FlightRPC] implement criteria for ListFlights
- ARROW-7848: [C++][Python][Doc] Add MapType API doc
- ARROW-7834: [Release] Post release task for updating the documentations
- ARROW-7849: [Packaging][Python] Remove the remaining py27 crossbow wheel tasks from the nightlies
- ARROW-1581: [Packaging] Tooling to make nightly wheels available for install
- ARROW-7846: [Python][Dev] Remove dependencies on six
- ARROW-7844: [R] Converter_List is not thread-safe
- ARROW-7833: [R] Make install_arrow() actually install arrow
- ARROW-6165: [Integration] Run integration tests on multiple cores
- ARROW-7781: [C++] Improve message when referencing a missing field
- ARROW-7841: [C++] Use ${HADOOP_HOME}/lib/native/ to find libhdfs.so again
- ARROW-7819: [C++][Gandiva] Add DumpIR to Filter/Projector object
- ARROW-7859: [R] Minor patches for CRAN submission 0.16.0.2
- ARROW-7330: [C++] Migrate Arrow Cuda to Result
- ARROW-7775: [Rust] fix: Don't let safe code arbitrarily transmute readers and writers
- ARROW-7615: [CI][Gandiva] Ensure gandiva_jni library has only a whitelisted set of shared dependencies
- ARROW-7777: [Go] Fix StructBuilder and ListBuilder panics on index out of range
- ARROW-7758: [Python] Safe cast to nanosecond timestamps in to_pandas conversion
- ARROW-7761: [C++][Python] Support S3 URIs
- ARROW-7836: [Rust] "allocate_aligned"/"reallocate" need to initialize memory to avoid UB
- ARROW-7868: [Crossbow] Reduce GitHub API query parallelism
- ARROW-7838: [C++] Only link Boost libraries with tests, not libarrow.so
- ARROW-7725: [C++] Add infrastructure for unity builds and precompiled headers
- ARROW-7400: [Java] Avoid the worst case for quick sort
- ARROW-7546: [Java] Use new implementation to concat vectors values in batch
- ARROW-7869: [Python] Remove boost::system and boost::filesystem from Python wheels
- ARROW-7462: [C++] Add CpuInfo detection for Arm64 Architecture
- ARROW-7839: [Python][Dataset] Expose IPC format in python bindings
- ARROW-7788: [C++][Parquet] Enable Arrow Schema to Parquet Schema for missing types
- ARROW-7862: [R] Linux installation should run quieter by default
- ARROW-7201: [GLib][Gandiva] Add support for BooleanNode
- ARROW-7880: [CI][R] R sanitizer job is not really working
- ARROW-7881: [C++] Fix -Wpedantic warnings
- ARROW-7876: [R] Installation fails in the documentation generation image
- ARROW-7742: [GLib] Add support for MapArray
- ARROW-7786: [R] Wire up check_metadata in Table.Equals method
- ARROW-7889: [Rust] Add support to datafusion-cli for parquet files.
- ARROW-7884: [C++] Relax concurrency rules around GetSize()
- ARROW-7863: [C++][Python][CI] Ensure running HDFS related tests
- ARROW-5357: [Rust] Change Buffer::len to represent total bytes instead of used bytes
- ARROW-7887: [Rust] Add date/time/duration/timestamp types to filter kernel
- ARROW-7080: [C++][Parquet] Read and write "field_id" attribute in Parquet files, propagate to Arrow field metadata. Assorted additional changes
- ARROW-7895: [Python] Remove more python 2.7 cruft
- ARROW-7063: [C++][Python] Add metadata output and toggle in PrettyPrint, add pyarrow.Schema.to_string, disable metadata output by default
- ARROW-7628: [Python] Clarify docs of csv reader skip_rows and nulls in strings
- ARROW-7491: [Java] Improve the performance of aligning
- ARROW-7912: [Format] C data interface
- ARROW-7920: [R] Fill in some missing input validation
- ARROW-7897: [Packaging] Temporarily disable artifact uploading until we fix the deployment issues
- ARROW-7608: [C++][Dataset] Add the ability to list files in FileSystemSource
- ARROW-7547: [C++][Dataset][Python] Add ParquetFileFormat options
- ARROW-7922: [CI][Crossbow] Nightly macOS wheel builds fail (brew bundle edition)
- ARROW-7921: [Go] Add Reset method to various components and clean up comments.
- ARROW-7915: [CI][Python] Enable development mode in tests
- ARROW-7874: [Python][Archery] Validate docstrings with numpydoc
- ARROW-6666: [Rust] Datafusion parquet string literal support
- ARROW-7664: [C++] Rework FileSystemFromUri
- ARROW-7685: [Developer] Add support for GitHub Actions to Crossbow
- ARROW-7899: [Integration][Java] Fix Flight integration test client to verify each batch
- ARROW-7928: [Python] Update Python flight server and client examples for latest API
- ARROW-7930: [CI][Python] Test jpype integration
- ARROW-7879: [C++][Doc] Add doc for the Device API
- ARROW-7934: [C++] Fix UriEscape for empty string
- ARROW-7888: [Python] Update pyarrow.jvm to support jpype 0.7+
- ARROW-7929: [C++] Align CMake target names to upstreams
- ARROW-7937: [Python][Packaging] Remove boost from the macos wheels
- ARROW-6393: [C++] Add EqualOptions support in SparseTensor::Equals
- ARROW-1636: [C++][Integration] Implement integration test parsing in C++ for null type, add integration test data generation
- ARROW-7877: [Packaging] Fix crossbow deployment to github artifacts
- ARROW-7625: [Parquet][GLib] Add support for writer properties
- ARROW-7949: [Git] Ignore macOS specific file: 'Brewfile.lock.json'
- ARROW-7886: [C++][Dataset][Python][R] Consolidate Source and Dataset classes
- ARROW-7926: [Dev] Improve "archery lint" UI
- ARROW-7916: [C++] Project IPC batches to materialized fields only
- ARROW-7958: [Java] Update Avro to version 1.9.2
- ARROW-5949: [Rust] Implement Dictionary Array
- ARROW-7959: [Ruby] Add support for Ruby 2.3 again
- ARROW-7962: [R][Dataset] Followup to "Consolidate Source and Dataset classes"
- ARROW-3543: [R] Better support for timestamp format and time zones in R
- ARROW-7923: [CI][Crossbow] macOS autobrew fails on homebrew-versions
- ARROW-7969: [Packaging] Use cURL to upload artifacts
- ARROW-7970: [Packaging][Python] Use system boost to build the macOS wheels
- ARROW-7947: [Rust] [Flight] [DataFusion] Implement get_schema example
- ARROW-7971: [Rust] Create rowcount utility
- ARROW-7936: [Python] Fix and exercise tests on python 3.5
- ARROW-7931: [C++] Fix crash on corrupt Map array input (OSS-Fuzz)
- ARROW-7978: [Dev] Do not run IWYU in Github Actions "lint" workflow
- ARROW-7764: [C++] Don't keep a null bitmap in ArrayData if null_count == 0
- ARROW-7981: [C++][Dataset] Fix compilation on gcc 5.4
- ARROW-7940: [C++] Remove ARROW_USE_CLCACHE handling
- ARROW-7789: [R] Can't initialize arrow objects when R.oo package is loaded
- ARROW-7983: [CI][R] Nightly builds should be more verbose when they fail
- ARROW-7913: [C++][Python][R] C++ implementation of C data interface
- ARROW-1571: [C++][Compute] Optimize sorting integers in small value range
- ARROW-7975: [C++] Preserve intended buffer size by default when writing to IPC format
- ARROW-7882: [C++][Gandiva] Optimise like function for substring pattern
- ARROW-7749: [C++] Link more tests together
- ARROW-7917: [C++] Find Python 3 in CMake configuration
- ARROW-7988: [R] Fix on.exit calls in reticulate bindings
- ARROW-7992: [C++] Fix MSVC warning (#6525)
- ARROW-7890: [C++] Add Future implementation
- ARROW-7987: [CI][R] Fix for verbose nightly builds
- ARROW-7932: [Rust] implement array_reader for temporal types
- ARROW-8000: [C++] Fix compilation on gcc 4.8
- ARROW-8003: [C++] Use CMAKE_C_COMPILER when building bundled bzip2
- ARROW-7977: [C++] Rename fs::FileStats to fs::FileInfo
- ARROW-7739: [GLib] Use placement new to initialize shared_ptr object in private structs
- ARROW-7999: [C++] Fix crash on corrupt List / Map array input
- ARROW-8008: [C++/Python] Set Python3_FIND_FRAMEWORK=LAST
- ARROW-7998: [C++][Plasma] Make Seal requests synchronous
- ARROW-7995: [C++] Add facility to coalesce and cache reads
- ARROW-8013: [Python][Packaging] Fix building manylinux wheels
- ARROW-7990: [Developer][C++] Add option to run "archery lint --iwyu" on all C++ files, not just the ones that you changed. Add "match" option to iwyu.sh
- ARROW-7974: [C++][Developer] Fix linter warnings when PYTHONDEVMODE enabled
- ARROW-8006: [C++] Initialize spaced data when reading nulls from Parquet
- ARROW-8007: [Python] Remove unused and defunct assert_get_object_equal in plasma tests
- ARROW-5563: [Format] Update integration test JSON format documentation
- ARROW-7872: [C++/Python] Support conversion of list of structs to pandas
- ARROW-7984: [R] Check for valid inputs in more places
- ARROW-7892: [Python] Add FileSystemDataset.format attribute
- ARROW-7980: [Python] Fix creation of tz-aware datetime dtype on first pandas import
- ARROW-8016: [Developer] Fix jira-python deprecation warning in merge_arrow_pr.py
- ARROW-8009: [Java] Fix the hash code methods for BitVector
- ARROW-7837: [Java] copyFromSafe fails due to a bug in handleSafe
- ARROW-7806: [Python] Support LargeListArray and list conversion to pandas.
- ARROW-7048: [Java] Support for combining multiple vectors under VectorSchemaRoot
- ARROW-7935: [Java] Remove Netty dependency for BufferAllocator and ReferenceManager
- ARROW-7444: [GLib] Add LocalFileSystem support
- ARROW-7785: [C++] Improve compilation performance of sparse tensor related code
- ARROW-7943: [C++][Parquet] Add code to generate rep/def levels for nested arrays
- ARROW-8030: [Plasma] Uniform comments style
- ARROW-7991: [C++][Plasma] Allow option for evicting if full when creating an object
- ARROW-7982: [C++] Add function VisitArrayDataInline() helper
- ARROW-8014: [C++] Provide CMake targets exercising tests with a label
- ARROW-8024: [R] Bindings for BinaryType and FixedSizeBinaryType
- ARROW-8011: [C++] Fix buffer size when reading Parquet data to Arrow
- ARROW-4120: [Python] Testing utility for checking for "macro" memory leaks detectible with psutil.Process
- ARROW-6821: [C++][Parquet] Do not require Thrift compiler when building (but still require library)
- ARROW-8021: [Python] Install test requirements including pandas in Appveyor
- ARROW-7530: [Developer] Do not include list of PR commits in commit message when using PR merge tool
- ARROW-7963: [C++][Dataset][Python] Expose Dataset Fragments to Python
- ARROW-8057: [Python] Do not compare schema metadata in Schema.equals and Table.equals by default
- ARROW-8044: [CI][NIGHTLY:gandiva-jar-osx] Pin pygit2 at 1.0.3 for OSX
- ARROW-8055: [GLib][Ruby] Add some metadata bindings to GArrowSchema
- ARROW-8071: [GLib] Fix build error with configure
- ARROW-8072: [Plasma] Add const for plasma protocol
- ARROW-7907: [Python] Add test case for previously failing code involving slicing a 0-length ChunkedArray
- ARROW-8042: [Python] Clean up docstring and error message when creating ChunkedArray with no chunks
- ARROW-8026: [Python] Support memoryview as a value type for creating binary-like arrays
- ARROW-7994: [CI][C++][GLib][Ruby] Move MinGW CI to GitHub Actions from AppVeyor
- ARROW-7675: [R][CI] Move Windows CI from Appveyor to GHA
- ARROW-8083: [GLib] Add support for Peek() to GIOInputStream
- ARROW-7419: [Python] Support SparseCSCMatrix
- ARROW-7587: [C++][Compute] Implement nth_to_indices kernel
- ARROW-7951: [Python] Expose BYTE_STREAM_SPLIT in pyarrow
- ARROW-7802: [C++][Python] Support LargeBinary and LargeString in the hash kernel
- ARROW-8036: [C++] Avoid gtest 1.10 deprecation warnings
- ARROW-8064: [Dev] Implement Comment bot via Github actions
- ARROW-5265: [Python][CI] Add integration test with kartothek
- ARROW-8097: [Dev] Comment bot's crossbow command acts on the master branch
- ARROW-7680: [C++] Fix dataset.factory(...) with Windows paths
- ARROW-7865: [R] Test builds on latest Linux versions
- ARROW-7985: [C++] Fix builder capacity check
- ARROW-8091: [CI][Crossbow] Fix nightly homebrew and R failures
- ARROW-1560: [C++] Kernel implementations for "match" function
- ARROW-8104: [C++] Don't install bundled Thrift
- ARROW-7864: [R] Make sure bundled installation works even if there are system packages
- ARROW-8107: [Packaging][APT] Use HTTPS for LLVM APT repository for Debian GNU/Linux stretch
- ARROW-8109: [Packaging][APT] Drop support for Ubuntu Disco
- ARROW-8028: [Go] Allow duplicate field names in schemas and nested types
- ARROW-8102: [Dev] Crossbow's version detection doesn't work in the comment bot's scenario
- ARROW-8077: [Python][Packaging] Add Windows Python 3.5 wheel build script
- ARROW-2255: [C++][Developer][Integration] Serialize custom field/schema metadata
- ARROW-8106: [Python] Ensure extension array conversion tests pass with latest pandas
- ARROW-8119: [Dev] Make Yaml optional dependency for archery
- ARROW-7412: [C++][Dataset] Provide FieldRef to disambiguate field references
- ARROW-8117: [Datafusion] [Rust] allow cast SQLTimestamp to Timestamp
- ARROW-8120: [Packaging][APT] Add support for Ubuntu Focal
- ARROW-8110: [C#] BuildArrays fails if NestedType is included
- ARROW-8125: [C++] Restore link between tests created with add_arrow_test and arrow-tests target
- ARROW-8128: [C#] NestedType children serialized on wrong length
- ARROW-8124: [Rust] Update library dependencies
- ARROW-8132: [C++] Fix S3FileSystem tests on Windows
- ARROW-8133: [CI] Github Actions sometimes fail to checkout Arrow
- ARROW-8027: [Integration] Add test case for duplicated field names
- ARROW-7616: [Java] Support comparing value ranges for dense union vector
- ARROW-8087: [C++][Dataset] Partitioning schema fields follow paths' segment ordering
- ARROW-8105: [Python] Fix segfault when shrunken masked array is passed to pyarrow.array
- ARROW-7427: [Python] Support SparseCSFTensor
- ARROW-8112: [FlightRPC][C++] make sure status codes round-trip through gRPC
- ARROW-8129: [C++][Compute] Refine compare sort kernel
- ARROW-8136: [Python] Restore creating a dataset from a relative path
- ARROW-7332: [C++][Python] Propagate Arrow Status through Parquet errors
- ARROW-8101: [FlightRPC][Java] Fix null arrays in Flight with no buffers
- ARROW-8130: [C++][Gandiva] fix dex visitor to handle interval type
- ARROW-8140: [Dev] Follow class name change
- ARROW-8139: [C++] FileSystem enum causes attributes warning
- ARROW-6841: [C++] Migrate to LLVM 8
- ARROW-8092: [CI][Crossbow] OSX wheels fail on bundled bzip2
- ARROW-8095: [C++] Add support for string dictionary value with length
- ARROW-7812: [Packaging][Python] Use LLVM 8 in manylinux1 wheels
- ARROW-8126: [C++][Compute] Add nth-to-indices kernel benchmark
- ARROW-8127: [C++] [Parquet] Incorrect column chunk metadata for multipage batch writes
- ARROW-8144: [CI] Cmake 3.2 nightly build fails
- ARROW-8141: [C++] speed unpack1_32 using intrinsics API
- ARROW-8080: [C++] Add ARROW_SIMD_LEVEL option
- ARROW-7858: [C++][Python] Support casting from ExtensionArray
- ARROW-8122: [Python] Empty numpy arrays with shape cannot be deserialized
- ARROW-8153: [Packaging] Update the conda feedstock files and upload artifacts to Anaconda
- ARROW-7365: [Python] Convert FixedSizeList in to_pandas
- ARROW-8146: [C++] Add per-filesystem facility to sanitize a path
- ARROW-7927: [C++] Fix 'cpu_info.cc' compilation warning.
- ARROW-7966: [FlightRPC][C++] Validate individual batches in integration
- ARROW-7824: [C++][Dataset] WriteFragments to disk
- ARROW-8159: [Python] Support pandas.ExtensionDtype in Schema.from_pandas
- ARROW-8123: [Rust] [DataFusion] Add LogicalPlanBuilder
- ARROW-8118: [R] dim method for FileSystemDataset
- ARROW-7390: [C++][Dataset] Fix RecordBatchProjector race
- ARROW-8166: [C++] fix AVX512 intrinsics fail with clang-8
- ARROW-8178: [C++] Update to Flatbuffers 1.12.0
- ARROW-8103: [R] Make default Linux build more minimal
- ARROW-8179: [R] Windows build script tweaking for nightly packaging on GHA
- ARROW-8177: [Rust] Make schema_to_fb_offset public because it is very useful!
- ARROW-7857: [Python] Revert temporary changes to pandas extension array tests
- ARROW-7896: [C++] Refactor from #include guards to #pragma once
- ARROW-8181: [Java][FlightRPC] Expose transport error metadata
- ARROW-8182: [Packaging] Increment the version number detected from the latest git tag
- ARROW-8088: [C++][Dataset] Support dictionary partition columns
- ARROW-7515: [C++] Rename nonexistent and non_existent to not_found
- ARROW-8145: [C++] Rename FileSystem::GetTargetInfos to GetFileInfo
- ARROW-7049: [C++] Fix MinGW64 warning in FieldRef::Get
- ARROW-8188: [R] Adapt to latest checks in R-devel
- ARROW-8165: [Packaging] Make nightly wheels available on a PyPI server
- ARROW-8191: [Packaging][APT] Fix cmake removal in Debian GNU/Linux Stretch
- ARROW-7091: [C++] Move DataType factory decls to type_fwd.h
- ARROW-8194: [CI] Run tests in parallel on Github Actions
- ARROW-6872: [Python] Fix empty table creation from schema with dictionary field
- ARROW-8150: [Rust] Allow writing custom FileMetaData k/v pairs
- ARROW-8186: [Python] Fix dataset expression operation with invalid scalar
- ARROW-8176: [FlightRPC] bind to a free port for integration tests
- ARROW-8200: [GLib] Rename garrow_file_system_target_info{,s}() to ..._file_info{,s}()
- ARROW-8195: [CI][C++][MSVC] Use preinstalled Boost
- ARROW-8187: [R] Make test assertions robust to i18n
- ARROW-8203: [C#] Use the latest SourceLink
- ARROW-8206: [R] Minor fix for backwards compatibility on Linux installation
- ARROW-8207: [Packaging][wheel] Use LLVM 8 in manylinux2010 and manylinux2014
- ARROW-8193: [C++] Fix gcc 4.8 compilation error with non-copyable types in Iterator::ToVector
- ARROW-8192: [C++] script for unpack avx512 intrinsics code
- ARROW-8197: [Rust] [DataFusion] Fix schema returned by physical plan
- ARROW-7898: [Python] Reduce the number of docstring violations using numpydoc
- ARROW-8059: [Python] Make FileSystem objects serializable
- ARROW-8058: [Dataset] Relax DatasetFactory discovery validation
- ARROW-8204: [Rust] [DataFusion] Add support for aliased expressions in SQL
- ARROW-7979: [C++] Add experimental buffer compression to IPC write path. Add "field" selection to read path. Migrate some APIs to Result. Read/write Message metadata
- ARROW-8060: [Python] Make dataset Expression objects serializable
- ARROW-7919: [R] install_arrow() should conda install if appropriate
- ARROW-6915: [Developer] Do not overwrite point release fix versions with merge tool
- ARROW-7733: [Developer] Download new enough Go locally in release verification script
- ARROW-8142: [C++][Compute] Explicit no chunks case for WrapDatumsLike
- ARROW-7771: [Developer] Use ARROW_TMPDIR environment variable in the verification scripts instead of TMPDIR
- ARROW-8219: [Rust] sqlparser crate needs to be bumped to version 0.2.5
- ARROW-7708: [Developer][Release] Include PARQUET issues in release changelogs by scraping git history
- ARROW-8151: [Dataset][Benchmarking] benchmark S3File performance
- ARROW-8184: [Packaging] Use arrow-nightlies organization name on Anaconda and Gemfury to host the nightlies
- ARROW-4815: [Rust] [DataFusion] Add support for SQL wildcard operator
- ARROW-8215: [CI][GLib] Fix install error on macOS
- ARROW-8225: [Rust] Rust Arrow IPC reader must respect continuation markers.
- ARROW-8233: [CI][GLib][R] Fix timeout on MinGW
- ARROW-8239: [Java] fix param checks in splitAndTransfer method
- ARROW-8061: [C++][Dataset] Provide RowGroup fragments for ParquetFileFormat
- ARROW-7741: [C++] Adds parquet write support for nested types
- ARROW-8242: [C++] Flight fails to compile on GCC 4.8
- ARROW-8070: [C++] Cast segfaults on unsupported cast from list to utf8
- ARROW-8243: [Rust] [DataFusion] Fix inconsistency in LogicalPlanBuilder api
- ARROW-8231: [Rust] Parse parquet key_value_metadata
- ARROW-6895: [C++][Parquet] Do not reset dictionary in ByteArrayDictionaryRecordReader during incremental reads
- ARROW-8241: [Rust] Add Schema convenience methods index_of and field_with_name
- ARROW-8246: [C++] Add -Wa,-mbig-obj to CXXFLAGS on MinGW if it is supported
- ARROW-7783: [C++] Set ARROW_COMPUTE=ON if ARROW_DATASET=ON
- ARROW-8224: [C++] Remove APIs deprecated prior to 0.16.0
- ARROW-7941: [Rust] [DataFusion] Add support for named columns in logical plan
- ARROW-8249: [Rust] [DataFusion] Table API now uses LogicalPlanBuilder
- ARROW-8256: [Rust] [DataFusion] Update CLI documentation for 0.17.0 release
- ARROW-8259: [Rust] [DataFusion] ProjectionPushDown now respects LIMIT
- ARROW-8255: [Rust] [DataFusion] Bug fix for COUNT(*)
- ARROW-8267: [CI][GLib] Fix build error on Ubuntu 16.04
- ARROW-2587: [Python][Parquet] Verify nested data can be written
- ARROW-5510: [C++][Python][R][GLib] Implement Feather "V2" using Arrow IPC file format
- ARROW-8252: [CI][Ruby] Add Ubuntu 20.04
- ARROW-8222: [C++] Use bcp to make a slim boost for bundled build
- ARROW-8268: [CI][Ruby] Enable Zstandard on Ubuntu 16.04
- ARROW-8232: [Python] Deprecate pyarrow.open_stream and pyarrow.open_file APIs in favor of accessing via pyarrow.ipc namespace
- ARROW-8269: [Python] Add pandas mark to test_parquet_row_group_fragments to fix nopandas build
- ARROW-8220: [Python] Make dataset FileFormat objects serializable
- ARROW-8183: [C++][Python][FlightRPC] Expose transport error metadata
- ARROW-8168: [Java][Plasma] Improve Java Plasma client off-heap memory usage
- ARROW-8274: [C++] Use LZ4 frame format for "LZ4" compression in IPC
- ARROW-8264: [Rust] [DataFusion] Add utility for printing batches
- ARROW-7792: [R] read_* functions should close connection to file
- ARROW-8280: [C++] Use c-ares_INCLUDE_DIR
- ARROW-8198: [C++] Format Diff of NullArrays
- ARROW-8271: [Packaging] Allow wheel upload failures to gemfury
- ARROW-8238: [C++] Fix FieldPath type definition
- ARROW-8294: [Flight] Add DoExchange to Flight.proto
- ARROW-8286: [Python] Ensure to create FileSystemDataset when passing pathlib path
- ARROW-8288: [Python] Expose with_ modifiers on DataType
- ARROW-8270: [Python][Flight] Update Python server example to support TLS
- ARROW-8218: [C++] Decompress record batch messages in parallel at field level. Only allow LZ4_FRAME, ZSTD compression
- ARROW-8279: [C++] Do not export Codec implementation symbols, remove codec-specific headers
- ARROW-8277: [Python] implemented eq, repr, and provided a wrapper of Take() for RecordBatch
- ARROW-7428: [Format][C++] Add serialization for CSF sparse tensors
- ARROW-7740: [C++] Fix StructArray::Flatten corruption
- ARROW-8303: [Python] Fix test failure on Python 3.5 caused by non-deterministic dict key ordering
- ARROW-8298: [C++][MinGW] Fix gRPC detection
- ARROW-8291: [Packaging] Conda nightly builds can't locate Numpy
- ARROW-8217: [R] Unskip previously failing test on Win32 in test-dataset.R from ARROW-7979
- ARROW-8308: [Rust] Implement DoExchange on examples
- ARROW-7641: [R] Make dataset vignette have executable code
- ARROW-8309: [CI] C++/Java/Rust workflows should trigger on changes to Flight.proto
- ARROW-8292: [Python] Allow to manually specify schema in dataset() function
- ARROW-5585: [Go] Rename TypeEquals to TypeEqual
- ARROW-8272: [CI][Python] Fix test failure on Python 3.5
- ARROW-8304: [Flight][Python] Fix client example with TLS
- ARROW-8310: [C++] Improve auto-retry in S3 tests
- ARROW-7904: [C++][Python] Revamp metadata display, change show_metadata to verbose_metadata
- ARROW-8276: [C++][Dataset] Use Scanner for Fragment.to_table
- ARROW-8082: [Plasma] Add JNI list() interface
- ARROW-7852: [Python] 0.16.0 wheels not compatible with older numpy
- ARROW-8167: [CI] Add support for skipping builds with skip pattern in pull request title
- ARROW-8079: [Python] Implement a wrapper for KeyValueMetadata, duck-typing dict where relevant
- ARROW-8185: [Packaging] Document the available nightly wheels and conda packages
- ARROW-6479: [C++] Inline errors from externalprojects on failure
- ARROW-7008: [C++] Check binary offsets and data buffers for nullness in validation. Produce valid arrays in DictionaryEncode on zero-length arrays
- ARROW-8315: [Python] Fix dataset tests on Python 3.5
- ARROW-8322: [CI] Fix C# workflow file syntax
- ARROW-8244: [Python] Fix parquet.write_to_dataset to set file path in metadata_collector
- ARROW-8319: [CI] Install thrift compiler in the debian build
- ARROW-8005: [Tools] Update apache mirror links
- ARROW-5501: [R] Reorganize read/write file/stream functions
- ARROW-8216: [C++][Compute] Filter out nulls by default
- ARROW-5473: [C++] Fix googletest_ep build failure on windows+ninja
- ARROW-8325: [R][CI] Stop including boost in R windows bundle
- ARROW-8326: [C++] Use TYPED_TEST_SUITE instead of deprecated TYPED_TEST_CASE
- ARROW-8321: [CI] Use bundled thrift in Fedora 30 build
- ARROW-8305: [Java] ExtensionTypeVector should make sure underlyingVector not null
- ARROW-8098: [Go] Avoid unsafe unsafe.Pointer usage
- ARROW-8323: [C++] Add pragmas wrapping proto_utils.h to disable conversion warnings
- ARROW-8245: [Python][Parquet] Skip hidden directories when reading partitioned parquet files
- ARROW-8331: [C++] Fix filter_benchmark.cc compilation
- ARROW-8332: [C++] Don't require Thrift compiler for Parquet build
- ARROW-8307: [Python] Add memory_map= option to pyarrow.feather.read_table
- ARROW-8209: [Python] Improve error message when trying to access duplicate Table column
- ARROW-6837: [C++] Add APIs to read and write "custom_metadata" field of IPC file footer
- ARROW-6996: [Python] Expose boolean filter kernel on ChunkedArray/RecordBatch/Table
- ARROW-8333: [C++] Compile benchmarks in at least one C++ CI entry
- ARROW-8336: [Packaging][deb] Use libthrift-dev on Debian 10 and Ubuntu 19.10 or later
- ARROW-8237: [Python][Documentation] Review Python developer documentation, add Dockerfile showing minimal source build with conda and pip/virtualenv
- ARROW-8327: [FlightRPC][Java] check gRPC trailers for null
- ARROW-8334: [C++] [Gandiva] Missing DATE32 in LLVM Types
- ARROW-8341: [Packaging][deb] Reduce disk usage on building packages
- ARROW-8320: [Format] Add clarification to CDataInterface.rst regarding memory alignment of buffers
- ARROW-8227: [C++] Refine SIMD feature definitions
- ARROW-7891: [C++][GLib][Python][R] Make uniform use of check_metadata=false default. Add Py/R/GLib bindings for RecordBatch::Equals with check_metadata
- ARROW-4304: [Rust] Enhance documentation for arrow
- ARROW-6947: [Rust] [DataFusion] Scalar UDF support
- ARROW-8351: [R][CI] Store the Rtools-built Arrow C++ library as a build artifact
- ARROW-8330: [Documentation] The post release script generates the documentation with a development version
- ARROW-8345: [Python] Ensure feather read/write can work without pandas installed
- ARROW-8275: [Python] Update Feather documentation for V2, Python IPC API cleanups / deprecations
- ARROW-8300: [R] Documentation and changelog updates for 0.17
- ARROW-8346: [CI][GLib] Follow pkg-config change in Homebrew
- ARROW-8349: [CI][NIGHTLY:gandiva-jar-osx] Use latest pygit2
- ARROW-8353: [C++] Fix some compiler warnings in release builds
- ARROW-8358: [C++] Fix some clang-11 compiler warnings
- ARROW-8347: [C++] Migrate Array methods to Result
- ARROW-7794: [Rust] Support releasing arrow-flight
- ARROW-8357: [Rust] [DataFusion] Add format dir to dockerfile for CLI
- ARROW-8366: [Rust] Revert "ARROW-7794: [Rust] Support releasing arrow-flight"
- ARROW-8365: [C++] Error when writing files to S3 larger than 5 GB
- ARROW-8356: [Developer] Support * wildcards with "crossbow submit" via GitHub actions
- ARROW-8361: [C++] Add Result APIs to Buffer methods and functions
- ARROW-8362: [Crossbow] Ensure that the locally generated version is used in the docker tasks
- ARROW-8213: [Python][Dataset] Opening a dataset with a local incorrect path gives confusing error message
- ARROW-8342: [Python] Continue to return dict from "metadata" properties accessing KeyValueMetadata
- ARROW-8343: [GLib] Add GArrowRecordBatchIterator
- ARROW-8352: [R] Add install_pyarrow()
- ARROW-8367: [C++] Deprecate Buffer::FromString(..., MemoryPool*)
- ARROW-8299: [C++] Reusable "optional ParallelFor" function for optional use of multithreading
- ARROW-8370: [C++] Migrate type/schema APIs to Result
- ARROW-6176: [Python] Basic implementation of arrow_ext_class, in pure Python
- ARROW-8371: [Crossbow] Implement and exercise sanity checks for tasks.yml
- ARROW-8369: [CI] Fix crossbow wildcard groups
- ARROW-8372: [C++] Migrate Table and RecordBatch APIs to Result
- ARROW-8316: [CI] Set docker-compose to use docker-cli instead of docker-py for building images
- ARROW-7256: [C++] Remove ARROW_MEMORY_POOL_DEFAULT macro
- ARROW-8375: [CI][R] Make Windows tests more verbose in case of segfault
- ARROW-8354: [R] Fix segfault in Table to Array conversion
- ARROW-8039: [Python] Use dataset API in existing parquet readers and tests
- ARROW-7233: [C++] Use Result in remaining value-returning IPC APIs
- ARROW-8380: [Rust] Export StringDictionaryBuilder from arrow::array crate
- ARROW-8164: [C++][Dataset] Provide Dataset::ReplaceSchema()
- ARROW-7336: [C++][Compute] fix minmax kernel options
- ARROW-8376: [R] Add experimental interface to ScanTask/RecordBatch iterators
- ARROW-8373: [CI][GLib] Find gio-2.0 manually on macOS
- ARROW-7679: [R] Cleaner interface for creating UnionDataset
- ARROW-6510: [Python][Filesystem] Expose nanosecond resolution mtime
- ARROW-8389: [Integration] Run tests in parallel
- ARROW-8335: [Release] Add crossbow jobs to run release verification
- ARROW-8311: [C++] Add push style stream format reader
- ARROW-8388: [C++][CI] Ensure Arrow compiles with GCC 4.8
- ARROW-8396: [Rust] Removes libc dependency
- ARROW-8390: [R] Expose schema unification features
- ARROW-8398: [Python] Remove deprecated API usage from python tests
- ARROW-8158: [Java] Getting length of data buffer and base variable width vector
- ARROW-8397: [C++] Fail to compile aggregate_test.cc on Ubuntu 16.04
- ARROW-8266: [C++] Provide backup mirrors for thrift externalproject
- ARROW-8387: [Rust] Make schema_to_fb public
- ARROW-8408: [Python] Add memory_map argument to feather.read_feather
- ARROW-8414: [Python] Fix non-deterministic row order failure in parquet tests
- ARROW-8393: [C++][Gandiva] Make gandiva function registry case-insensitive
- ARROW-8407: [Rust] Add documentation for Dictionary data type
- ARROW-8410: [C++] Fix compilation errors on modest ARMv8 platforms (rockpro64, rpi4)
- ARROW-8420: [C++] Distinguish ARMv7 from ARMv8 in SetupCxxFlags.cmake
- ARROW-8428: [C++] GCC 4.8 Implicit move-on-return failure in C++ tests
- ARROW-8290: [Python] Improve FileSystemDataset constructor
- ARROW-8406: [C++][Python] Fix file URI handling
- ARROW-8403: [C++] Add ToString() to ChunkedArray, Table and RecordBatch
- ARROW-8409: [R] Add R wrappers for getting and setting global CPU thread pool capacity
- ARROW-8386: [Python] Fix error when pyarrow.jvm gets an empty vector
- ARROW-8295: [C++][Dataset] Push down projection to IpcReadOptions
- ARROW-8433: [R] Add feather alias for ipc format in dataset API
- ARROW-8416: [Python] Add feather alias for ipc format in dataset API
- ARROW-8429: [C++] Implement missing checks in IPC MessageDecoder
- ARROW-8427: [C++][Dataset] Only apply ignore_prefixes to selector results
- ARROW-8415: [C++][Packaging] Fix gandiva linux job
Patches with assigned issue outside of 0.17.0:
- PARQUET-1716: [C++] Add BYTE_STREAM_SPLIT encoder and decoder
- ARROW-7514: [C#] Make GetValueOffset Obsolete
- ARROW-7743: [Rust] Support reading timestamp micros
- ARROW-7768: [Rust] Implement TryClone and Length for Cursor<Vec>
- PARQUET-1788: Remove UBSan when rep/def levels are null
- PARQUET-1770: [C++][CI] Add fuzz target for reading Parquet files
- ARROW-7505: [Java] Remove Netty dependency for ArrowBuf (#6131)
- PARQUET-1799: [C++] Stream API: Relax schema checking when reading
- PARQUET-1797: [C++] Fix fuzzer issues
- PARQUET-1785: [C++] Implement ByteStreamSplitDecoder::DecodeArrow and refactor tests
- PARQUET-1780: [C++] Set ColumnMetadata.encoding_stats field
- PARQUET-1806: [C++] Improve fuzzing seed corpus
- PARQUET-1810: [C++] Fix undefined behaviour on invalid enum values (OSS-Fuzz)
- ARROW-7993: [Java] Support decimal type in ComplexCopier
- ARROW-7335: [C++][Gandiva] Add day_time_interval functions: castBIGINT, extractDay
- PARQUET-1663: [C++] Provide API to check the presence of repeated fields
- PARQUET-1813: [C++] Remove debug print statement from parquet-arrow-schema-test
- ARROW-8086: [Java] Support writing decimal from big endian byte array in UnionListWriter
- ARROW-8096: [C++][Gandiva] fix TreeExprBuilder::MakeNull to create node for interval type
- ARROW-3329: [C++] Add casts from Decimal128 to Decimal128 and Int64
- PARQUET-1819: [C++] Fix crashes on invalid input
- ARROW-8136: [Python] More robust inference of local relative path in dataset
- PARQUET-1823: [C++] Invalid RowGroup returned by parquet::arrow::FileReader
- PARQUET-1819: [C++] Refactor decoding
- PARQUET-1824: [C++] Fix crashes and undefined behaviour on invalid input
- PARQUET-1786: [C++] Improve ByteStreamSplit decoder using SSE2
- PARQUET-1825: [C++] Fix compilation error in column_io_benchmark.cc
- PARQUET-458: [C++][Parquet] Add support for reading/writing DataPageV2 format
- PARQUET-1829: [C++] Fix crashes on invalid input (OSS-Fuzz)
- PARQUET-1828: [C++] Use SSE2 for the ByteStreamSplit encoder
- PARQUET-1831: [C++] Fix crashes on invalid input (OSS-Fuzz)
- ARROW-8225: [Rust] Continuation marker check was in wrong location
- ARROW-8237: [Python][Documentation] Minor corrections to python minimal build documentation
- PARQUET-1835: [C++] Fix crashes on invalid input
- ARROW-7794: [Rust] [Flight] Remove hard-coded relative path to Flight.proto
Patches without assigned issue:
- [maven-release-plugin] prepare for next development iteration
- [Release] Update versions for 1.0.0-SNAPSHOT
- [Release] Update .deb package names for 1.0.0
- [C++] [Dev] Sync arrow-testing submodule (#6373)
- [MINOR][Python] Build more compressors in Windows Python build instructions
JIRAs in 0.17.0 without assigned patch:
- ARROW-7997
- ARROW-7520
- ARROW-7939
- ARROW-7713
- ARROW-8094
- ARROW-8235
- ARROW-7044
- ARROW-6829
- ARROW-8063
- ARROW-7986
- ARROW-8441
- ARROW-5405
- ARROW-8368
- ARROW-6528
- ARROW-7790
- ARROW-5205
- ARROW-4482
- ARROW-7956
- ARROW-8254
- ARROW-7222
- ARROW-7965
- ARROW-7499
- ARROW-4428
- ARROW-7338
- ARROW-7755
- ARROW-7996
- ARROW-8035
- ARROW-8234
- ARROW-8439
- ARROW-4286
- ARROW-8432
- ARROW-8138
- ARROW-7827
- ARROW-8437
- ARROW-7507
- ARROW-6823
- ARROW-7908
- ARROW-6275
- ARROW-7202
- ARROW-3054
- ARROW-8329
- ARROW-7861
- ARROW-8029
- ARROW-8223
- ARROW-8236
- ARROW-8047
- ARROW-7944
- ARROW-8018
- ARROW-1470
- ARROW-3750
- ARROW-7809
- ARROW-7860
- ARROW-3410
- ARROW-7501
- ARROW-8401
- ARROW-8099
- ARROW-7973
- ARROW-6547
- ARROW-1582
- ARROW-7373
- ARROW-7672
- ARROW-7968
- ARROW-3004
- ARROW-8247
- ARROW-1907
- ARROW-8438
- ARROW-7807
- ARROW-8154
- ARROW-8442
- ARROW-8449
- ARROW-6890
- ARROW-5497
- ARROW-7555
- ARROW-8075