This document lays out how a few prominent SQL-on-Hadoop systems read and write decimal values in Parquet files, and describes their respective in-memory decimal formats.
Parquet's logical DECIMAL type can be represented by the following physical types:

1. A 32-bit integer (`INT32`), for values that occupy between 1 and 4 bytes
2. A 64-bit integer (`INT64`), for values that occupy between 1 and 8 bytes
3. A fixed-size array of bytes (`FIXED_LEN_BYTE_ARRAY`), for values that fit into 1 or more bytes
4. A variable-size array of bytes (`BYTE_ARRAY`), for values that fit into 1 or more bytes
Note some possible ambiguity around how many bytes to write for different precisions:

For option 4 (variable-size byte array), the specification says:

> The minimum number of bytes to store the unscaled value should be used.

For option 3 (fixed-size byte array), there is no such requirement. For example, a system may write decimal values as 16-byte arrays even when each individual value fits into a 32-bit integer.
In practice, most systems write the minimum number of bytes needed to store a decimal value of a given precision.
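To make those byte counts concrete, here is a small self-contained sketch (the method name is ours, not from any of the systems above) that computes the fewest two's-complement bytes needed for a given decimal precision:

```java
import java.math.BigInteger;

public class DecimalBytes {
    // Fewest two's-complement bytes that can hold any unscaled value of
    // the given decimal precision.
    static int minBytesForPrecision(int precision) {
        // The largest unscaled magnitude for this precision is 10^precision - 1.
        BigInteger max = BigInteger.TEN.pow(precision).subtract(BigInteger.ONE);
        // One extra bit for the sign, rounded up to whole bytes.
        return (max.bitLength() + 1 + 7) / 8;
    }

    public static void main(String[] args) {
        System.out.println(minBytesForPrecision(9));   // 4 bytes: fits INT32
        System.out.println(minBytesForPrecision(18));  // 8 bytes: fits INT64
        System.out.println(minBytesForPrecision(38));  // 16 bytes
    }
}
```

This is why precisions up to 9 can use `INT32`, precisions up to 18 can use `INT64`, and everything up to 38 fits in 16 bytes.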
Each system supports reading and writing Parquet files. If you're just interested in the TL;DR, see the compatibility matrix below.
|            | Parquet-MR | Hive | Impala | Spark |
|------------|------------|------|--------|-------|
| Parquet-MR | R/W        |      |        |       |
| Hive       |            | R/W  |        |       |
| Impala     |            |      | R/W    |       |
| Spark      |            |      |        | R/W   |
parquet-mr is the original implementation of the Parquet format in Java. All of the systems below except Impala use `parquet-mr` to implement read and write support for Parquet.

For reads, `parquet-mr` requires client libraries to extend a set of abstract classes, e.g., `PrimitiveConverter`, that convert Parquet physical types into the client's preferred logical type.
For writes, `parquet-mr` requires client libraries to extend a set of abstract classes, e.g., `RecordConsumer`, that convert the client's value of a particular logical type into the client's preferred physical type.
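To illustrate the read-side converter pattern, here is a toy stand-in (not the real parquet-mr API, whose `PrimitiveConverter` receives a parquet `Binary` rather than a raw `byte[]`) showing how one converter can materialize all three decimal physical forms as the client's logical type:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class ConverterSketch {
    // Toy stand-in for the converter pattern described above; the real
    // abstract class is org.apache.parquet.io.api.PrimitiveConverter.
    abstract static class DecimalConverter {
        abstract void addBinary(byte[] value);  // FIXED_LEN_BYTE_ARRAY / BYTE_ARRAY
        abstract void addInt(int value);        // INT32
        abstract void addLong(long value);      // INT64
    }

    // A client converter that materializes every physical form as BigDecimal.
    static class BigDecimalConverter extends DecimalConverter {
        final int scale;
        BigDecimal current;

        BigDecimalConverter(int scale) { this.scale = scale; }

        void addBinary(byte[] v) {
            // Bytes hold a big-endian two's-complement unscaled value.
            current = new BigDecimal(new BigInteger(v), scale);
        }
        void addInt(int v)   { current = BigDecimal.valueOf(v, scale); }
        void addLong(long v) { current = BigDecimal.valueOf(v, scale); }
    }
}
```

All three physical forms decode to the same logical value: with scale 2, `addInt(1234)` and `addBinary(new byte[]{0x04, (byte) 0xD2})` both yield 12.34.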
Hive uses custom `FastHiveDecimal`/`FastHiveDecimalImpl` classes, which are backed by 3 Java `long` values, each of which is 8 bytes in memory.

For vectors, Hive uses `DecimalColumnVector`s, which contain an array of `HiveDecimalWritable` objects. `HiveDecimalWritable` is a subclass of `FastHiveDecimal` that produces mutable instances.
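As a rough illustration of how three longs can carry a 38-digit unscaled value, here is a toy base-10^16 limb scheme; the real `FastHiveDecimalImpl` layout carries more bookkeeping (sign, scale, digit counts) than this sketch, so treat it only as the general idea:

```java
import java.math.BigInteger;

public class ThreeLongDecimal {
    // Illustrative only: split a decimal's unscaled magnitude into three
    // longs of up to 16 decimal digits each (lowest digits first). Sign
    // and scale would be tracked separately.
    static final BigInteger CHUNK = BigInteger.TEN.pow(16);

    static long[] split(BigInteger unscaled) {
        BigInteger mag = unscaled.abs();
        long[] parts = new long[3];
        for (int i = 0; i < 3; i++) {
            BigInteger[] dr = mag.divideAndRemainder(CHUNK);
            parts[i] = dr[1].longValueExact(); // < 10^16, always fits a long
            mag = dr[0];
        }
        return parts;
    }

    static BigInteger join(long[] parts) {
        BigInteger v = BigInteger.ZERO;
        for (int i = 2; i >= 0; i--) {
            v = v.multiply(CHUNK).add(BigInteger.valueOf(parts[i]));
        }
        return v;
    }
}
```

Three limbs of 16 digits cover 48 digits, comfortably more than the 38-digit maximum precision.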
Questions:
- Does Hive support reading Parquet decimals written in 32- or 64-bit integer format?
Hive writes the minimum number of bytes necessary to represent decimal values.
Impala has three scalar in-memory decimal types:

- `Decimal4`, backed by `int32_t`
- `Decimal8`, backed by `int64_t`
- `Decimal16`, backed by `__int128_t`

Impala writes every decimal value in the `FIXED_LEN_BYTE_ARRAY` physical format. It writes the minimum number of bytes necessary to represent decimal values.
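The three storage widths correspond to the largest precisions whose values always fit in 4, 8, and 16 bytes (9, 18, and 38 digits). A hedged sketch of that selection rule (the method name is ours, not Impala's):

```java
public class ImpalaDecimalWidth {
    // Hedged sketch: pick the narrowest storage width for a decimal
    // precision, mirroring the usual 9/18/38-digit cutoffs.
    static int storageBytes(int precision) {
        if (precision <= 9)  return 4;   // Decimal4 / int32_t
        if (precision <= 18) return 8;   // Decimal8 / int64_t
        if (precision <= 38) return 16;  // Decimal16 / __int128_t
        throw new IllegalArgumentException("DECIMAL precision is capped at 38");
    }
}
```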
Spark uses a custom `Decimal` class to represent scalar decimals. This class stores either a `BigDecimal` or a `Long`, depending on the precision, along with the precision and scale of the decimal. Spark uses an `Array[Decimal]` collection to hold vectors of decimals.

Spark reads bytes handed to it by the `parquet-mr` API into the `Decimal` object mentioned above.
Spark writes decimal values to Parquet files in two modes: legacy and non-legacy.
- Legacy mode writes all decimal values as fixed size byte arrays.
- Non-legacy mode writes 32-bit integers, 64-bit integers, or fixed size byte arrays depending on the decimal precision.
In both modes, Spark writes only the minimum number of bytes needed to represent values of the decimal type's declared precision.
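A sketch of the non-legacy selection rule as we understand it, with precision cutoffs of 9 and 18 digits matching the `INT32`/`INT64` ranges (the function name is illustrative, not Spark's API):

```java
public class SparkDecimalPhysicalType {
    // Hedged sketch of the physical-type choice for a DECIMAL(precision,
    // scale) column: integers while they fit, fixed-size byte array
    // otherwise; legacy mode always uses byte arrays.
    static String physicalType(int precision, boolean legacyMode) {
        if (legacyMode) return "FIXED_LEN_BYTE_ARRAY";
        if (precision <= 9)  return "INT32";
        if (precision <= 18) return "INT64";
        return "FIXED_LEN_BYTE_ARRAY";
    }
}
```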
Arrow uses a two's-complement, 16-byte custom `Decimal128` class backed by 1 `int64_t` and 1 `uint64_t`.

Arrow converts the scalar in-memory values to their bytes in big-endian byte order. The vector format is an array whose elements are 16-byte fixed-size byte arrays.
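A minimal sketch of that scalar-to-bytes conversion, assuming a sign-extended, big-endian, two's-complement layout and an unscaled value that fits in 128 bits (the method name is ours, not Arrow's API):

```java
import java.math.BigInteger;
import java.util.Arrays;

public class BigEndian16 {
    // Lay out an unscaled value as a 16-byte, big-endian,
    // two's-complement array.
    static byte[] toBigEndian16(BigInteger unscaled) {
        // toByteArray() is already minimal two's complement, big-endian.
        byte[] minimal = unscaled.toByteArray();
        byte[] out = new byte[16];
        // Sign-extend: fill the leading bytes with 0x00 or 0xFF.
        byte pad = (byte) (unscaled.signum() < 0 ? 0xFF : 0x00);
        Arrays.fill(out, 0, 16 - minimal.length, pad);
        System.arraycopy(minimal, 0, out, 16 - minimal.length, minimal.length);
        return out;
    }
}
```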
Every system in this document supports writing decimal values as `FIXED_LEN_BYTE_ARRAY`s. Some systems also support writing decimals as 32- or 64-bit integers. Therefore, for maximum compatibility, parquet-cpp should support reading decimal values backed by the `INT32`, `INT64`, and `FIXED_LEN_BYTE_ARRAY` physical types.

Every system in this document supports reading Parquet files containing decimal values represented as `FIXED_LEN_BYTE_ARRAY`s. Therefore, for maximum compatibility, parquet-cpp should write decimals in the `FIXED_LEN_BYTE_ARRAY` physical format using the smallest number of bytes possible.
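Putting the recommendations together, a hedged sketch of the suggested write path: derive the fixed array length from the column's precision (since a `FIXED_LEN_BYTE_ARRAY` length is fixed per column, not per value), then sign-extend each unscaled value to that length. Names here are illustrative, not parquet-cpp's API:

```java
import java.math.BigInteger;
import java.util.Arrays;

public class FlbaEncode {
    // Encode an unscaled value for a FIXED_LEN_BYTE_ARRAY decimal column:
    // the array length is the fewest bytes that can hold any value of the
    // column's precision, and each value is sign-extended to that length.
    // Assumes the value actually fits the declared precision.
    static byte[] encode(BigInteger unscaled, int precision) {
        // Fewest two's-complement bytes for 10^precision - 1, plus sign bit.
        int width = (BigInteger.TEN.pow(precision).subtract(BigInteger.ONE)
                         .bitLength() + 1 + 7) / 8;
        byte[] minimal = unscaled.toByteArray(); // minimal big-endian form
        byte[] out = new byte[width];
        byte pad = (byte) (unscaled.signum() < 0 ? 0xFF : 0x00);
        Arrays.fill(out, 0, width - minimal.length, pad);
        System.arraycopy(minimal, 0, out, width - minimal.length, minimal.length);
        return out;
    }
}
```

For example, a DECIMAL(9, s) column gets 4-byte arrays, the smallest width that holds every 9-digit value.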