@rvernica
Last active April 19, 2018 14:56
aio_tools benchmark
while (!cursor.end() &&
       ((cellsPerChunk <= 0 && bytesCount < bytesPerChunk) ||
        (cellsPerChunk > 0 && nCells < cellsPerChunk)))
{
    for (size_t i = 0; i < nAttrs; ++i)
    {
        shared_ptr<ConstChunkIterator> citer = cursor.getChunkIter(i);
        switch (_inputTypes[i])
        {
        case TE_INT64:
        {
            // Benchmark shortcut: the per-cell loop below is disabled, so cnt
            // zero-initialized entries (all flagged invalid) are appended in one call.
            // NOTE: if the loop is re-enabled, construct both vectors empty and
            // reserve(cnt); otherwise push_back appends after the cnt placeholders.
            size_t cnt = 100000;
            vector<int64_t> values(cnt);
            vector<bool> is_valid(cnt);
            bytesCount += cnt * _inputSizes[i];
            citer->setPosition(citer->getLastPosition());
            // while (!citer->end())
            // {
            //     Value const& value = citer->getItem();
            //     if(value.isNull())
            //     {
            //         // THROW_NOT_OK(
            //         //     static_cast<arrow::Int64Builder*>(
            //         //         _arrowBuilders[i].get())->AppendNull());
            //         values.push_back(-1);
            //         is_valid.push_back(false);
            //     }
            //     else
            //     {
            //         // THROW_NOT_OK(
            //         //     static_cast<arrow::Int64Builder*>(
            //         //         _arrowBuilders[i].get())->Append(
            //         //             value.getInt64()));
            //         values.push_back(value.getInt64());
            //         is_valid.push_back(true);
            //     }
            //     bytesCount += _inputSizes[i];
            //     ++(*citer);
            // }
            THROW_NOT_OK(
                static_cast<arrow::Int64Builder*>(
                    _arrowBuilders[i].get())->Append(values, is_valid));
            break;
        }
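
The disabled loop in the excerpt collects each cell into two parallel vectors, one for the values and one for the validity flags, before a single bulk append. A minimal, Arrow-free sketch of that collection step follows. It assumes C++17 and models nullable cells with `std::optional`. The helper name `splitValidity` is hypothetical and not part of aio_tools.

```cpp
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper (not from aio_tools): splits nullable cells into the
// parallel values / validity vectors that a bulk append call expects.
std::pair<std::vector<std::int64_t>, std::vector<bool>>
splitValidity(const std::vector<std::optional<std::int64_t>>& cells)
{
    std::vector<std::int64_t> values;
    std::vector<bool> isValid;
    // reserve() keeps the vectors empty, so push_back starts at index 0.
    values.reserve(cells.size());
    isValid.reserve(cells.size());
    for (const auto& cell : cells)
    {
        // -1 is the placeholder the commented-out loop uses for null cells.
        values.push_back(cell.value_or(-1));
        isValid.push_back(cell.has_value());
    }
    return {std::move(values), std::move(isValid)};
}
```

The two vectors always have the same length, and a placeholder slot is only meaningful together with its `false` validity flag.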
Setup
===
Number of runs: 3
Target size: 200.00 MB
Buffer size: 3.05 MB
Chunk size: 100000
Number of records: 6553600
Number of chunks: 66
Fix Size Schema (int64 only)
---
SciDB size: 200.02 MB
In-memory size: 0.00 MB
File size: 204.57 MB
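
The Setup figures can be cross-checked with a little arithmetic. The sketch below makes two assumptions that the output does not state: each record is 32 bytes wide, and "MB" means MiB (1024 × 1024 bytes). Both are inferred only because they reproduce the reported numbers.

```cpp
#include <cstdint>

// Values copied from the Setup output.
constexpr std::uint64_t kRecords = 6553600;
constexpr std::uint64_t kChunkSize = 100000;
// ASSUMPTION: 32 bytes per record, inferred from the reported sizes.
constexpr std::uint64_t kBytesPerRecord = 32;

// Integer division rounded up.
constexpr std::uint64_t ceilDiv(std::uint64_t a, std::uint64_t b)
{
    return (a + b - 1) / b;
}

// ASSUMPTION: the reported "MB" are MiB (1024 * 1024 bytes).
constexpr double toMiB(std::uint64_t bytes)
{
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

constexpr std::uint64_t kChunks = ceilDiv(kRecords, kChunkSize);     // 66
constexpr double kTargetMiB = toMiB(kRecords * kBytesPerRecord);     // 200.0
constexpr double kBufferMiB = toMiB(kChunkSize * kBytesPerRecord);   // about 3.05
```

Under these assumptions the buffer holds exactly one chunk of records, which matches the reported "Buffer size: 3.05 MB".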
Save
===
Fix Size Schema (int64 only)
---
Binary: 1.28 seconds 156.45 MB/second
Arrow: 0.82 seconds 242.81 MB/second
> iquery -aq "aio_save(build(<x:int64>[i=1:400:0:100], i), '/tmp/a', 'format=arrow', 'buffer_size=1000')"
{chunk_no,dest_instance_id,source_instance_id} val
> 0
2018-03-25 01:01:41.000158 [0x7fab10474700] [DEBUG]: ALT_SAVE>> Starting SG
2018-03-25 01:01:41.000159 [0x7fab10474700] [DEBUG]: ALT_SAVE>> opening file
2018-03-25 01:01:41.000159 [0x7fab10474700] [DEBUG]: ALT_SAVE>> starting write
2018-03-25 01:01:41.000159 [0x7fab10272700] [DEBUG]: ALT_SAVE>> builder.getTotalSize 1165
2018-03-25 01:01:41.000159 [0x7fab10272700] [DEBUG]: ALT_SAVE>> bytesCount 900
2018-03-25 01:01:41.000159 [0x7fab10272700] [DEBUG]: ALT_SAVE>> nCells 1
2018-03-25 01:01:41.000169 [0x7fab10474700] [DEBUG]: ALT_SAVE>> wrote 4052 bytes, closing
2018-03-25 01:01:41.000169 [0x7fab10474700] [DEBUG]: ALT_SAVE>> closed
> 1
2018-03-25 01:01:41.000158 [0x7ff3c5dfb700] [DEBUG]: ALT_SAVE>> Starting SG
2018-03-25 01:01:41.000159 [0x7ff3c5efc700] [DEBUG]: ALT_SAVE>> builder.getTotalSize 1965
2018-03-25 01:01:41.000160 [0x7ff3c5efc700] [DEBUG]: ALT_SAVE>> bytesCount 1800
2018-03-25 01:01:41.000160 [0x7ff3c5efc700] [DEBUG]: ALT_SAVE>> nCells 2
2018-03-25 01:01:41.000160 [0x7ff3c5efc700] [DEBUG]: ALT_SAVE>> builder.getTotalSize 1165
2018-03-25 01:01:41.000160 [0x7ff3c5efc700] [DEBUG]: ALT_SAVE>> bytesCount 900
2018-03-25 01:01:41.000161 [0x7ff3c5efc700] [DEBUG]: ALT_SAVE>> nCells 1
> iquery -aq "aio_save(apply(build(<x:int64>[i=1:400:0:100], i), y, string(x)), '/tmp/a', 'format=arrow', 'buffer_size=2000')"
{chunk_no,dest_instance_id,source_instance_id} val
> 0
2018-03-25 01:02:36.000646 [0x7fab10474700] [DEBUG]: ALT_SAVE>> Starting SG
2018-03-25 01:02:36.000647 [0x7fab10474700] [DEBUG]: ALT_SAVE>> opening file
2018-03-25 01:02:36.000647 [0x7fab10474700] [DEBUG]: ALT_SAVE>> starting write
2018-03-25 01:02:36.000648 [0x7fab10676700] [DEBUG]: ALT_SAVE>> builder.getTotalSize 1981
2018-03-25 01:02:36.000648 [0x7fab10676700] [DEBUG]: ALT_SAVE>> bytesCount 1400
2018-03-25 01:02:36.000648 [0x7fab10676700] [DEBUG]: ALT_SAVE>> nCells 1
2018-03-25 01:02:36.000649 [0x7fab10474700] [DEBUG]: ALT_SAVE>> wrote 7092 bytes, closing
2018-03-25 01:02:36.000650 [0x7fab10474700] [DEBUG]: ALT_SAVE>> closed
> 1
2018-03-25 01:02:36.000643 [0x7ff3c5efc700] [DEBUG]: ALT_SAVE>> Starting SG
2018-03-25 01:02:36.000647 [0x7ff3c5bf9700] [DEBUG]: ALT_SAVE>> builder.getTotalSize 3373
2018-03-25 01:02:36.000647 [0x7ff3c5bf9700] [DEBUG]: ALT_SAVE>> bytesCount 2692
2018-03-25 01:02:36.000648 [0x7ff3c5bf9700] [DEBUG]: ALT_SAVE>> nCells 2
2018-03-25 01:02:36.000648 [0x7ff3c5bf9700] [DEBUG]: ALT_SAVE>> builder.getTotalSize 1981
2018-03-25 01:02:36.000648 [0x7ff3c5bf9700] [DEBUG]: ALT_SAVE>> bytesCount 1400
2018-03-25 01:02:36.000648 [0x7ff3c5bf9700] [DEBUG]: ALT_SAVE>> nCells 1
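
The `nCells` and `bytesCount` values in the first log are consistent with the `while` condition in the code excerpt: with `buffer_size=1000` and 900 bytes per chunk, the loop accepts a second chunk because 900 < 1000, then stops at 1800. A minimal sketch of that grouping follows. The helper `planBatches` and the `Batch` struct are hypothetical, and the sketch assumes the counter increments once per loop iteration, which the log suggests but the excerpt does not show.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical summary of one flushed batch (names mirror the log fields).
struct Batch
{
    std::size_t nCells;      // loop iterations that went into the batch
    std::size_t bytesCount;  // bytes accumulated for the batch
};

// Groups consecutive chunks with the same stop condition as the excerpt:
// a byte budget when cellsPerChunk <= 0, otherwise a count limit.
std::vector<Batch> planBatches(const std::vector<std::size_t>& chunkBytes,
                               std::size_t bytesPerChunk,
                               long cellsPerChunk)
{
    std::vector<Batch> batches;
    std::size_t next = 0;
    while (next < chunkBytes.size())
    {
        Batch b{0, 0};
        while (next < chunkBytes.size() &&
               ((cellsPerChunk <= 0 && b.bytesCount < bytesPerChunk) ||
                (cellsPerChunk > 0 &&
                 b.nCells < static_cast<std::size_t>(cellsPerChunk))))
        {
            b.bytesCount += chunkBytes[next];
            ++b.nCells;
            ++next;
        }
        if (b.nCells == 0)
        {
            break;  // a zero byte budget would otherwise never make progress
        }
        batches.push_back(b);
    }
    return batches;
}
```

Calling `planBatches({900, 900, 900}, 1000, 0)` yields two batches, one with 2 iterations and 1800 bytes and one with 1 iteration and 900 bytes, matching the sequence logged by instance 1 in the first query. The budget is checked before each chunk is added, so a batch can overshoot `buffer_size` by up to one chunk.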

Size: 500 MB, 4 SciDB instances

| Read SciDB | Write Arrow Array | Send Arrow Array | Read/Write Arrow Batch | Time | Speed | Notes | Commit |
|---|---|---|---|---|---|---|---|
| -- | ON | ON | ON | 1.71s | 292.23 MB/s | | e5911e2 |
| ON | -- | NOP | NOP | 2.05s | 243.65 MB/s | No output | 9c9550a |
| ON | ON | -- | NOP | 2.43s | 206.14 MB/s | No output | e0a6705 |
| ON | ON | ON | -- | 3.22s | 155.50 MB/s | No output | 9419393 |
| ON | ON | ON | ON | 3.28s | 152.54 MB/s | | 0859db1 |