Fix C++ client time column access returning NULL for non-long types (#17397)#17400
Fix C++ client time column access returning NULL for non-long types (#17397)#17400PDGGK wants to merge 1 commit intoapache:masterfrom
Conversation
There was a problem hiding this comment.
Good job!
Well, I doubt if it is a good idea to support returning int32 and float, because the data will probably be incorrect (I can't even say it is a precision loss) due to overflow.
It will be better to add some comments in the method definitions to point out the potential risk.
Also, please add a unit test or an integration test for the targeted scenario and take a look at the failed tests.
|
And please rebase the latest master branch, the failed c++ tests should already be fixed now. |
|
Great catch on the vector out-of-bounds UB in Regarding your design question: I strongly prefer throwing Could you please update the numeric getters to throw exceptions instead of casting? |
|
Thanks for the reviews @jt2594838 @HTHou @CritasWang! I agree — silently casting a 64-bit timestamp to int32/float is dangerous. I'll update all non-long/non-string getters to throw Summary of changes coming:
Will push the update shortly. |
ea0b2f8 to
a8b95a1
Compare
…he#17397) Add tsBlockColumnIndex < 0 guards to all ByTsBlockColumnIndex getter methods that were missing them. When the time pseudo-column is accessed through a typed getter other than getLong or getString, throw IoTDBException instead of causing undefined behavior via getColumn(-1). Signed-off-by: Zihan Dai <1436286758@qq.com>
a8b95a1 to
97d9b26
Compare
Description
Fixes #17397
Several
ByTsBlockColumnIndexgetter methods inIoTDBRpcDataSet.cppdid not handle thetsBlockColumnIndex < 0case (time pseudo-column in tree model), causing undefined behavior when the time column was accessed through typed getters other thangetLongorgetString.In tree model, column index 1 maps to
tsBlockColumnIndex = -1(the time column). When a user callsgetBooleanByIndex(1), this flows togetBooleanByTsBlockColumnIndex(-1), which callscurTsBlock_->getColumn(-1)— an out-of-boundsstd::vectoraccess (UB in C++).Only
getLongByTsBlockColumnIndexandgetStringByTsBlockColumnIndexcorrectly handled the< 0case. The following five methods were missing the guard:getBooleanByTsBlockColumnIndexIoTDBException(boolean from timestamp is nonsensical)getDoubleByTsBlockColumnIndexdoublegetFloatByTsBlockColumnIndexfloatgetIntByTsBlockColumnIndexint32_tgetBinaryByTsBlockColumnIndexBinarystringDesign Discussion
For the numeric types (
int,float,double), I chose to return the timestamp value (cast to the target type) to stay consistent withgetLongByTsBlockColumnIndexwhich already returns the timestamp. However, an alternative approach would be to throwIoTDBExceptionfor all non-long/non-string types, since:static_cast<int32_t>(timestamp)silently truncates 64-bit timestampsfloatorbooleanis likely a user errorI'd appreciate the maintainers' preference on which approach is better. Happy to change if throwing is preferred.
Note
The Java client (
IoTDBRpcDataSet.java) has the same asymmetry — onlygetLongandgetStringhandletsBlockColumnIndex < 0. The other typed getters (getBoolean,getInt,getFloat,getDouble,getBinary) will throwIndexOutOfBoundsExceptionin Java. This could be addressed as a follow-up.