c++ - "Flattening" std::set<std::string> for storage and comparison? -
this might silly question based on fact std::set<> has comparison operators, think might have optimization specific use case , want make sure i'm not hurting myself somehow.
essentially, have costly operation takes input std::set&. i'm caching operation's result can return result if same inputs have been passed in. require storing copies of sets (which i'm doing in
std::map<std::set<std::string>, result*>
, , doing search every time operation called. since same operation going called thousands of times in row, cached std::set found >99% of time. experimented thought might small improvement, based on fact characters invalid in passed-in strings: flattened std::set single string, component strings being delimited ':' character. std::map becomes
std::map<std::string, result*>
and every time operation called, set flattened , single string searched in cache.
i surprised performance improvement. test run used std::sets containing 5 strings, each 30 characters long, , run of 10,000,000 searches. on workstation, times each run were
std::map<std::set<std::string>, result*> : 138.8 seconds std::map<std::string, result> : 89.2 seconds
it seems that, overhead of flattening set every call, second method huge improvement. guess question is: why? doing potentially bad here implementors of std::set purposefully avoided (i.e. potentially causing bad heap fragmentation bigger string?) because individual strings in set in different locations , have compared separately? shooting myself in foot? seems obvious improvement in specific case give such performance boost.
why?
data locality.
std::set
implemented binary search tree. may search operation faster due caching on machine std::string
, in comparison std::set
.
Comments
Post a Comment