IndexedVector - holding a vector of unique objects with faster finding elements











up vote
10
down vote

favorite












I am trying to write a class that will enable me to hold an std::vector of some unique objects, but faster finding elements already stored in a vector. I want the uniqueness to be based on some key element of the held objects (not based on compare operator).



I considered Boost::multi_index_container which is almost exactly what I need, except that I couldn't easily extract std::vector from it. I need exactly std::vector to pass it to some external API. If I had multi_indexed_container then every time I needed an std::vector (very often) I would have to construct it from scratch (based on iterators of multi_index_container) which I think would cost too much.



I also considered sorted vector. This would allow me to hold only vector, but I would have to take care about proper insertion (so that it is sorted). It is easy, but I think it would still cost too much, as the complexity of a binary search algorithm to find the element in sorted vector is $O(logN)$. I need something better.



The last resort is a known idea of having a vector and a hashmap which stores indexes of the vector as values. The complexity of finding element in such storage would be constant (as it is for hashmap).



I also want to use my container with types which have public field representing ID (key). I want my container to be able to receive information which of the fields of the type it holds represents the key. I don't want to store the value object in two places (vector and hashmap), I only want to store it in a vector, and in the hashmap store only the ID (key).



One thing I need to add: I need this for environment where I will take care of thread safety myself, and exceptions are not used (no need to worry about exception safety).



Here is the class together with some example usage and some basic tests. Do you think it is OK? Does it lack anything?



#include <iostream>
#include <vector>
#include <unordered_map>

template <class V, typename Mem, Mem V::*member>
class IndexedVector {
using key_type = Mem;
using vector_storage = std::vector<V>;
using hashed_storage = std::unordered_map<key_type, size_t>;
using hashed_pair = std::pair<key_type, size_t>;
using hashed_iter = typename hashed_storage::iterator;

public:
bool push(const V& val) {
key_type key = (val.*member);
hashed_iter it = indices.find(key);
if (it == indices.end())
{
size_t vi = values.size();
values.push_back(val);
indices.insert(hashed_pair(key, vi));
return true;
}
return false;
}

bool erase(const V& val) {
key_type key = (val.*member);
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return false;
}

size_t idx = hiter->second;
values[idx] = values[values.size() - 1];
indices.find((values[idx].*member))->second = idx;

values.pop_back();
indices.erase(hiter);
return true;
}

V& operator(size_t idx)
{
return values[idx];
}

const V& operator(size_t idx) const
{
return values[idx];
}

size_t idx(key_type key)
{
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return -1;
}
else
{
return hiter->second;
}
}

const std::vector<V>& getValues() { return values; }
size_t size() { return values.size(); }

private:
hashed_storage indices;
vector_storage values;
};

struct A
{
int x;
int key;
};

std::ostream& operator<<(std::ostream& out, const A& a)
{
out << "[" << a.x << "," << a.key << "]";
return out;
}

bool operator==(const A& a1, const A& a2)
{
return a1.x == a2.x && a1.key == a2.key;
}

#define TEST(n) if ((n)) { std::cout << "OKn"; } else { std::cout << "NOT OKn"; }

int main()
{
IndexedVector<A, decltype(A::key), &A::key> s;
auto& v = s.getValues();

A a1{100, 1};
A a2{200, 2};

TEST(s.size() == 0);

s.push(a1);
TEST(s.size() == 1);
TEST(s.idx(a1.key) == 0);
TEST(v[0] == a1);
TEST(s[0] == a1);

s.push(a2);
TEST(s.size() == 2);
TEST(v[0] == a1);
TEST(s.idx(a1.key) == 0);
TEST(v[1] == a2);
TEST(s.idx(a2.key) == 1);

TEST(s.erase(a1) == true);
TEST(s.size() == 1);
TEST(v[0] == a2);

TEST(s.erase(a2) == true);
TEST(s.size() == 0);

TEST(s.erase(a1) == false);
TEST(s.size() == 0);
TEST(s.idx(a1.key) == -1);
}









share|improve this question




















  • 1




    I can see myself only one issue: exception-safety. If vector's push_back method or operator throws (because the type used as V throws when copying or something like that) then IndexedVector would lose its consistency (the behaviour would be undefined). What do you think about it? Is it real danger and flaw of this class? I guess that with such design there is not much I could do to prevent this danger. What do you think?
    – YotKay
    Sep 22 '17 at 11:21










  • Does this site work? I am starting to doubt it...
    – YotKay
    Sep 23 '17 at 19:45















up vote
10
down vote

favorite












I am trying to write a class that will enable me to hold an std::vector of some unique objects, but faster finding elements already stored in a vector. I want the uniqueness to be based on some key element of the held objects (not based on compare operator).



I considered Boost::multi_index_container which is almost exactly what I need, except that I couldn't easily extract std::vector from it. I need exactly std::vector to pass it to some external API. If I had multi_indexed_container then every time I needed an std::vector (very often) I would have to construct it from scratch (based on iterators of multi_index_container) which I think would cost too much.



I also considered sorted vector. This would allow me to hold only vector, but I would have to take care about proper insertion (so that it is sorted). It is easy, but I think it would still cost too much, as the complexity of a binary search algorithm to find the element in sorted vector is $O(logN)$. I need something better.



The last resort is a known idea of having a vector and a hashmap which stores indexes of the vector as values. The complexity of finding element in such storage would be constant (as it is for hashmap).



I also want to use my container with types which have public field representing ID (key). I want my container to be able to receive information which of the fields of the type it holds represents the key. I don't want to store the value object in two places (vector and hashmap), I only want to store it in a vector, and in the hashmap store only the ID (key).



One thing I need to add: I need this for environment where I will take care of thread safety myself, and exceptions are not used (no need to worry about exception safety).



Here is the class together with some example usage and some basic tests. Do you think it is OK? Does it lack anything?



#include <iostream>
#include <vector>
#include <unordered_map>

template <class V, typename Mem, Mem V::*member>
class IndexedVector {
using key_type = Mem;
using vector_storage = std::vector<V>;
using hashed_storage = std::unordered_map<key_type, size_t>;
using hashed_pair = std::pair<key_type, size_t>;
using hashed_iter = typename hashed_storage::iterator;

public:
bool push(const V& val) {
key_type key = (val.*member);
hashed_iter it = indices.find(key);
if (it == indices.end())
{
size_t vi = values.size();
values.push_back(val);
indices.insert(hashed_pair(key, vi));
return true;
}
return false;
}

bool erase(const V& val) {
key_type key = (val.*member);
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return false;
}

size_t idx = hiter->second;
values[idx] = values[values.size() - 1];
indices.find((values[idx].*member))->second = idx;

values.pop_back();
indices.erase(hiter);
return true;
}

V& operator(size_t idx)
{
return values[idx];
}

const V& operator(size_t idx) const
{
return values[idx];
}

size_t idx(key_type key)
{
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return -1;
}
else
{
return hiter->second;
}
}

const std::vector<V>& getValues() { return values; }
size_t size() { return values.size(); }

private:
hashed_storage indices;
vector_storage values;
};

struct A
{
int x;
int key;
};

std::ostream& operator<<(std::ostream& out, const A& a)
{
out << "[" << a.x << "," << a.key << "]";
return out;
}

bool operator==(const A& a1, const A& a2)
{
return a1.x == a2.x && a1.key == a2.key;
}

#define TEST(n) if ((n)) { std::cout << "OKn"; } else { std::cout << "NOT OKn"; }

int main()
{
IndexedVector<A, decltype(A::key), &A::key> s;
auto& v = s.getValues();

A a1{100, 1};
A a2{200, 2};

TEST(s.size() == 0);

s.push(a1);
TEST(s.size() == 1);
TEST(s.idx(a1.key) == 0);
TEST(v[0] == a1);
TEST(s[0] == a1);

s.push(a2);
TEST(s.size() == 2);
TEST(v[0] == a1);
TEST(s.idx(a1.key) == 0);
TEST(v[1] == a2);
TEST(s.idx(a2.key) == 1);

TEST(s.erase(a1) == true);
TEST(s.size() == 1);
TEST(v[0] == a2);

TEST(s.erase(a2) == true);
TEST(s.size() == 0);

TEST(s.erase(a1) == false);
TEST(s.size() == 0);
TEST(s.idx(a1.key) == -1);
}









share|improve this question




















  • 1




    I can see myself only one issue: exception-safety. If vector's push_back method or operator throws (because the type used as V throws when copying or something like that) then IndexedVector would lose its consistency (the behaviour would be undefined). What do you think about it? Is it real danger and flaw of this class? I guess that with such design there is not much I could do to prevent this danger. What do you think?
    – YotKay
    Sep 22 '17 at 11:21










  • Does this site work? I am starting to doubt it...
    – YotKay
    Sep 23 '17 at 19:45













up vote
10
down vote

favorite









up vote
10
down vote

favorite











I am trying to write a class that will enable me to hold an std::vector of some unique objects, but faster finding elements already stored in a vector. I want the uniqueness to be based on some key element of the held objects (not based on compare operator).



I considered Boost::multi_index_container which is almost exactly what I need, except that I couldn't easily extract std::vector from it. I need exactly std::vector to pass it to some external API. If I had multi_indexed_container then every time I needed an std::vector (very often) I would have to construct it from scratch (based on iterators of multi_index_container) which I think would cost too much.



I also considered sorted vector. This would allow me to hold only vector, but I would have to take care about proper insertion (so that it is sorted). It is easy, but I think it would still cost too much, as the complexity of a binary search algorithm to find the element in sorted vector is $O(logN)$. I need something better.



The last resort is a known idea of having a vector and a hashmap which stores indexes of the vector as values. The complexity of finding element in such storage would be constant (as it is for hashmap).



I also want to use my container with types which have public field representing ID (key). I want my container to be able to receive information which of the fields of the type it holds represents the key. I don't want to store the value object in two places (vector and hashmap), I only want to store it in a vector, and in the hashmap store only the ID (key).



One thing I need to add: I need this for environment where I will take care of thread safety myself, and exceptions are not used (no need to worry about exception safety).



Here is the class together with some example usage and some basic tests. Do you think it is OK? Does it lack anything?



#include <iostream>
#include <vector>
#include <unordered_map>

template <class V, typename Mem, Mem V::*member>
class IndexedVector {
using key_type = Mem;
using vector_storage = std::vector<V>;
using hashed_storage = std::unordered_map<key_type, size_t>;
using hashed_pair = std::pair<key_type, size_t>;
using hashed_iter = typename hashed_storage::iterator;

public:
bool push(const V& val) {
key_type key = (val.*member);
hashed_iter it = indices.find(key);
if (it == indices.end())
{
size_t vi = values.size();
values.push_back(val);
indices.insert(hashed_pair(key, vi));
return true;
}
return false;
}

bool erase(const V& val) {
key_type key = (val.*member);
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return false;
}

size_t idx = hiter->second;
values[idx] = values[values.size() - 1];
indices.find((values[idx].*member))->second = idx;

values.pop_back();
indices.erase(hiter);
return true;
}

V& operator(size_t idx)
{
return values[idx];
}

const V& operator(size_t idx) const
{
return values[idx];
}

size_t idx(key_type key)
{
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return -1;
}
else
{
return hiter->second;
}
}

const std::vector<V>& getValues() { return values; }
size_t size() { return values.size(); }

private:
hashed_storage indices;
vector_storage values;
};

struct A
{
int x;
int key;
};

std::ostream& operator<<(std::ostream& out, const A& a)
{
out << "[" << a.x << "," << a.key << "]";
return out;
}

bool operator==(const A& a1, const A& a2)
{
return a1.x == a2.x && a1.key == a2.key;
}

#define TEST(n) if ((n)) { std::cout << "OKn"; } else { std::cout << "NOT OKn"; }

int main()
{
IndexedVector<A, decltype(A::key), &A::key> s;
auto& v = s.getValues();

A a1{100, 1};
A a2{200, 2};

TEST(s.size() == 0);

s.push(a1);
TEST(s.size() == 1);
TEST(s.idx(a1.key) == 0);
TEST(v[0] == a1);
TEST(s[0] == a1);

s.push(a2);
TEST(s.size() == 2);
TEST(v[0] == a1);
TEST(s.idx(a1.key) == 0);
TEST(v[1] == a2);
TEST(s.idx(a2.key) == 1);

TEST(s.erase(a1) == true);
TEST(s.size() == 1);
TEST(v[0] == a2);

TEST(s.erase(a2) == true);
TEST(s.size() == 0);

TEST(s.erase(a1) == false);
TEST(s.size() == 0);
TEST(s.idx(a1.key) == -1);
}









share|improve this question















I am trying to write a class that will enable me to hold an std::vector of some unique objects, but faster finding elements already stored in a vector. I want the uniqueness to be based on some key element of the held objects (not based on compare operator).



I considered Boost::multi_index_container which is almost exactly what I need, except that I couldn't easily extract std::vector from it. I need exactly std::vector to pass it to some external API. If I had multi_indexed_container then every time I needed an std::vector (very often) I would have to construct it from scratch (based on iterators of multi_index_container) which I think would cost too much.



I also considered sorted vector. This would allow me to hold only vector, but I would have to take care about proper insertion (so that it is sorted). It is easy, but I think it would still cost too much, as the complexity of a binary search algorithm to find the element in sorted vector is $O(logN)$. I need something better.



The last resort is a known idea of having a vector and a hashmap which stores indexes of the vector as values. The complexity of finding element in such storage would be constant (as it is for hashmap).



I also want to use my container with types which have public field representing ID (key). I want my container to be able to receive information which of the fields of the type it holds represents the key. I don't want to store the value object in two places (vector and hashmap), I only want to store it in a vector, and in the hashmap store only the ID (key).



One thing I need to add: I need this for environment where I will take care of thread safety myself, and exceptions are not used (no need to worry about exception safety).



Here is the class together with some example usage and some basic tests. Do you think it is OK? Does it lack anything?



#include <iostream>
#include <vector>
#include <unordered_map>

template <class V, typename Mem, Mem V::*member>
class IndexedVector {
using key_type = Mem;
using vector_storage = std::vector<V>;
using hashed_storage = std::unordered_map<key_type, size_t>;
using hashed_pair = std::pair<key_type, size_t>;
using hashed_iter = typename hashed_storage::iterator;

public:
bool push(const V& val) {
key_type key = (val.*member);
hashed_iter it = indices.find(key);
if (it == indices.end())
{
size_t vi = values.size();
values.push_back(val);
indices.insert(hashed_pair(key, vi));
return true;
}
return false;
}

bool erase(const V& val) {
key_type key = (val.*member);
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return false;
}

size_t idx = hiter->second;
values[idx] = values[values.size() - 1];
indices.find((values[idx].*member))->second = idx;

values.pop_back();
indices.erase(hiter);
return true;
}

V& operator(size_t idx)
{
return values[idx];
}

const V& operator(size_t idx) const
{
return values[idx];
}

size_t idx(key_type key)
{
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return -1;
}
else
{
return hiter->second;
}
}

const std::vector<V>& getValues() { return values; }
size_t size() { return values.size(); }

private:
hashed_storage indices;
vector_storage values;
};

struct A
{
int x;
int key;
};

std::ostream& operator<<(std::ostream& out, const A& a)
{
out << "[" << a.x << "," << a.key << "]";
return out;
}

bool operator==(const A& a1, const A& a2)
{
return a1.x == a2.x && a1.key == a2.key;
}

#define TEST(n) if ((n)) { std::cout << "OKn"; } else { std::cout << "NOT OKn"; }

int main()
{
IndexedVector<A, decltype(A::key), &A::key> s;
auto& v = s.getValues();

A a1{100, 1};
A a2{200, 2};

TEST(s.size() == 0);

s.push(a1);
TEST(s.size() == 1);
TEST(s.idx(a1.key) == 0);
TEST(v[0] == a1);
TEST(s[0] == a1);

s.push(a2);
TEST(s.size() == 2);
TEST(v[0] == a1);
TEST(s.idx(a1.key) == 0);
TEST(v[1] == a2);
TEST(s.idx(a2.key) == 1);

TEST(s.erase(a1) == true);
TEST(s.size() == 1);
TEST(v[0] == a2);

TEST(s.erase(a2) == true);
TEST(s.size() == 0);

TEST(s.erase(a1) == false);
TEST(s.size() == 0);
TEST(s.idx(a1.key) == -1);
}






c++ vectors






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Sep 18 '17 at 6:57

























asked Sep 17 '17 at 18:04









YotKay

1665




1665








  • 1




    I can see myself only one issue: exception-safety. If vector's push_back method or operator throws (because the type used as V throws when copying or something like that) then IndexedVector would lose its consistency (the behaviour would be undefined). What do you think about it? Is it real danger and flaw of this class? I guess that with such design there is not much I could do to prevent this danger. What do you think?
    – YotKay
    Sep 22 '17 at 11:21










  • Does this site work? I am starting to doubt it...
    – YotKay
    Sep 23 '17 at 19:45














  • 1




    I can see myself only one issue: exception-safety. If vector's push_back method or operator throws (because the type used as V throws when copying or something like that) then IndexedVector would lose its consistency (the behaviour would be undefined). What do you think about it? Is it real danger and flaw of this class? I guess that with such design there is not much I could do to prevent this danger. What do you think?
    – YotKay
    Sep 22 '17 at 11:21










  • Does this site work? I am starting to doubt it...
    – YotKay
    Sep 23 '17 at 19:45








1




1




I can see myself only one issue: exception-safety. If vector's push_back method or operator throws (because the type used as V throws when copying or something like that) then IndexedVector would lose its consistency (the behaviour would be undefined). What do you think about it? Is it real danger and flaw of this class? I guess that with such design there is not much I could do to prevent this danger. What do you think?
– YotKay
Sep 22 '17 at 11:21




I can see myself only one issue: exception-safety. If vector's push_back method or operator throws (because the type used as V throws when copying or something like that) then IndexedVector would lose its consistency (the behaviour would be undefined). What do you think about it? Is it real danger and flaw of this class? I guess that with such design there is not much I could do to prevent this danger. What do you think?
– YotKay
Sep 22 '17 at 11:21












Does this site work? I am starting to doubt it...
– YotKay
Sep 23 '17 at 19:45




Does this site work? I am starting to doubt it...
– YotKay
Sep 23 '17 at 19:45










1 Answer
1






active

oldest

votes

















up vote
2
down vote














Does this site work? I am starting to doubt it...




Well, AFAIK, the front page can filter by "my tags" or sort by "date" but not both? So it's very easy for something to get stuck on the second page forever. One thing you could try is, after posting a snippet for review, go over to the C++ Slack and ask if anyone there would like to give you a review.





template <class V, typename Mem, Mem V::*member>


This works, but it's kind of cumbersome. As you noticed:



template <class V, typename Mem, Mem V::*member> class IndexedVector { ... }
IndexedVector<A, decltype(A::key), &A::key> s;


In C++17 you could get a similar effect by writing simply



template <class V, auto member> class IndexedVector { ... }
IndexedVector<A, &A::key> s;


However, there are other problems associated with pointers-to-members. For example, you can't use your approach to key on a key that doesn't have a type — so, no keying on bit-fields. And you can't use your approach (pre-C++17) to key on a member that isn't a member of A — so, no keying on inherited data members.



So the canonical C++ way to do this is to provide a class type that encapsulates the appropriate behavior. Like how std::hash and std::less do it. We'd write:



template <class V, class KeyFunction> class IndexedVector { ... }
struct MyKeyFunction { int operator()(const A& a) const { return a.key; } };
IndexedVector<A, MyKeyFunction> s;


Finally: I strongly recommend CamelCase for all template parameter names. So I'd write Member, not member.





#define TEST(n) if ((n)) { std::cout << "OKn"; } else { std::cout << "NOT OKn"; }


The double-parentheses around n aren't necessary; double parens don't do anything here that single parens wouldn't. Also, for hygiene, you should wrap every statement-like macro in do ... while(0) — even though in this case I think you're safe.



#define TEST(n) do { 
if ((n)) { std::cout << "OKn"; }
else { std::cout << "NOT OKn"; }
} while (0)




bool operator==(const A& a1, const A& a2) 


Any time you have == you should also have !=. You get a pass here because this is just test code.





size_t idx(key_type key) { ... }
const std::vector<V>& getValues() { return values; }
size_t size() { return values.size(); }


All three of these methods should be const-qualified.





    hashed_iter it = indices.find(key);
auto hiter = indices.find(key);


It's mildly confusing to the reader that you sometimes say it and sometimes say hiter. I would stick with it, unless you're worried that the reader might need to distinguish hash iterators from other iterators. (And in that case, I'd say just hit, and I'd never use it to refer to a hash iterator.)



It's also unusual to abbreviate iterator in the name of a typedef. I'd replace all your uses of hashed_iter with hash_iterator.





using hashed_pair = std::pair<key_type, size_t>;
[...]
indices.insert(hashed_pair(key, vi));


What does this gain you, compared to using the two-argument emplace?



indices.emplace(key, vi);


In fact, this exact situation just came up a few hours ago on the C++ Slack — someone pointed out that it's too easy to get the hashed_pair typedef wrong and end up making unnecessary copies! And then what do you do? You get the typedef wrong!



using hashed_pair = std::pair</*oops no const*/ key_type, size_t>;
[...]
indices.insert(hashed_pair(key, vi));


This code constructs a pair<key_type, size_t>. Then it calls indices.insert, which wants a parameter of type pair<const key_type, size_t>; so it has to construct a pair<const key_type, size_t> out of the pair you already constructed...



Well, actually insert also has an overload taking any old P&& type (overload #2 here), so what you wrote is just a hair less efficient than the .emplace(key, vi) version. (It costs just one extra move/destroy.) But what you wrote is more verbose and causes the reader to go, "Huh?"; so, I would recommend against it.





size_t idx(key_type key)
{
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return -1;
}
else
{
return hiter->second;
}
}


Since key_type might be something heavyweight, like std::string, shouldn't this method take (const key_type& key) instead of (key_type key)?



Also, utter nitpick, but many C++ programmers find code easier to read when they can see more of it on the screen at once. (Up to a point, of course!) So that leads to a general preference for more compact bracing styles:



size_t idx(const key_type& key) const {
auto it = indices.find(key);
if (it != indices.end()) {
return it->second;
} else {
return -1;
}
}


And once it's in this form, we can reduce the number of indented lines and perhaps more directly express our intent by writing:



size_t idx(const key_type& key) const {
auto it = indices.find(key);
if (it != indices.end()) {
return it->second;
}
return -1;
}


Some compilers might complain about the implicit conversion of -1 to size_t. I might try to be explicit there; or even assert that such a case ought to be impossible, if that's what I meant.



    return size_t(-1);
// or
assert(false);


Compilers will definitely complain about your comparison of size_t with -1 down in the test cases; I'd definitely fix that.



TEST(s.idx(a1.key) == size_t(-1));





share|improve this answer





















  • There's quite a good case for static constexpr auto invalid_index = size_t(-1); to avoid writing ugly conversions everywhere.
    – Toby Speight
    13 hours ago











Your Answer





StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f175902%2findexedvector-holding-a-vector-of-unique-objects-with-faster-finding-elements%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
2
down vote














Does this site work? I am starting to doubt it...




Well, AFAIK, the front page can filter by "my tags" or sort by "date" but not both? So it's very easy for something to get stuck on the second page forever. One thing you could try is, after posting a snippet for review, go over to the C++ Slack and ask if anyone there would like to give you a review.





template <class V, typename Mem, Mem V::*member>


This works, but it's kind of cumbersome. As you noticed:



template <class V, typename Mem, Mem V::*member> class IndexedVector { ... }
IndexedVector<A, decltype(A::key), &A::key> s;


In C++17 you could get a similar effect by writing simply



template <class V, auto member> class IndexedVector { ... }
IndexedVector<A, &A::key> s;


However, there are other problems associated with pointers-to-members. For example, you can't use your approach to key on a key that doesn't have a type — so, no keying on bit-fields. And you can't use your approach (pre-C++17) to key on a member that isn't a member of A — so, no keying on inherited data members.



So the canonical C++ way to do this is to provide a class type that encapsulates the appropriate behavior. Like how std::hash and std::less do it. We'd write:



template <class V, class KeyFunction> class IndexedVector { ... }
struct MyKeyFunction { int operator()(const A& a) const { return a.key; } };
IndexedVector<A, MyKeyFunction> s;


Finally: I strongly recommend CamelCase for all template parameter names. So I'd write Member, not member.





#define TEST(n) if ((n)) { std::cout << "OKn"; } else { std::cout << "NOT OKn"; }


The double-parentheses around n aren't necessary; double parens don't do anything here that single parens wouldn't. Also, for hygiene, you should wrap every statement-like macro in do ... while(0) — even though in this case I think you're safe.



#define TEST(n) do { 
if ((n)) { std::cout << "OKn"; }
else { std::cout << "NOT OKn"; }
} while (0)




bool operator==(const A& a1, const A& a2) 


Any time you have == you should also have !=. You get a pass here because this is just test code.





size_t idx(key_type key) { ... }
const std::vector<V>& getValues() { return values; }
size_t size() { return values.size(); }


All three of these methods should be const-qualified.





    hashed_iter it = indices.find(key);
auto hiter = indices.find(key);


It's mildly confusing to the reader that you sometimes say it and sometimes say hiter. I would stick with it, unless you're worried that the reader might need to distinguish hash iterators from other iterators. (And in that case, I'd say just hit, and I'd never use it to refer to a hash iterator.)



It's also unusual to abbreviate iterator in the name of a typedef. I'd replace all your uses of hashed_iter with hash_iterator.





using hashed_pair = std::pair<key_type, size_t>;
[...]
indices.insert(hashed_pair(key, vi));


What does this gain you, compared to using the two-argument emplace?



indices.emplace(key, vi);


In fact, this exact situation just came up a few hours ago on the C++ Slack — someone pointed out that it's too easy to get the hashed_pair typedef wrong and end up making unnecessary copies! And then what do you do? You get the typedef wrong!



using hashed_pair = std::pair</*oops no const*/ key_type, size_t>;
[...]
indices.insert(hashed_pair(key, vi));


This code constructs a pair<key_type, size_t>. Then it calls indices.insert, which wants a parameter of type pair<const key_type, size_t>; so it has to construct a pair<const key_type, size_t> out of the pair you already constructed...



Well, actually insert also has an overload taking any old P&& type (overload #2 here), so what you wrote is just a hair less efficient than the .emplace(key, vi) version. (It costs just one extra move/destroy.) But what you wrote is more verbose and causes the reader to go, "Huh?"; so, I would recommend against it.





size_t idx(key_type key)
{
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return -1;
}
else
{
return hiter->second;
}
}


Since key_type might be something heavyweight, like std::string, shouldn't this method take (const key_type& key) instead of (key_type key)?



Also, utter nitpick, but many C++ programmers find code easier to read when they can see more of it on the screen at once. (Up to a point, of course!) So that leads to a general preference for more compact bracing styles:



size_t idx(const key_type& key) const {
auto it = indices.find(key);
if (it != indices.end()) {
return it->second;
} else {
return -1;
}
}


And once it's in this form, we can reduce the number of indented lines and perhaps more directly express our intent by writing:



size_t idx(const key_type& key) const {
auto it = indices.find(key);
if (it != indices.end()) {
return it->second;
}
return -1;
}


Some compilers might complain about the implicit conversion of -1 to size_t. I might try to be explicit there; or even assert that such a case ought to be impossible, if that's what I meant.



    return size_t(-1);
// or
assert(false);


Compilers will definitely complain about your comparison of size_t with -1 down in the test cases; I'd definitely fix that.



TEST(s.idx(a1.key) == size_t(-1));





share|improve this answer





















  • There's quite a good case for static constexpr auto invalid_index = size_t(-1); to avoid writing ugly conversions everywhere.
    – Toby Speight
    13 hours ago















up vote
2
down vote














Does this site work? I am starting to doubt it...




Well, AFAIK, the front page can filter by "my tags" or sort by "date" but not both? So it's very easy for something to get stuck on the second page forever. One thing you could try is, after posting a snippet for review, go over to the C++ Slack and ask if anyone there would like to give you a review.





template <class V, typename Mem, Mem V::*member>


This works, but it's kind of cumbersome. As you noticed:



template <class V, typename Mem, Mem V::*member> class IndexedVector { ... }
IndexedVector<A, decltype(A::key), &A::key> s;


In C++17 you could get a similar effect by writing simply



template <class V, auto member> class IndexedVector { ... }
IndexedVector<A, &A::key> s;


However, there are other problems associated with pointers-to-members. For example, you can't use your approach to key on a key that doesn't have a type — so, no keying on bit-fields. And you can't use your approach (pre-C++17) to key on a member that isn't a member of A — so, no keying on inherited data members.



So the canonical C++ way to do this is to provide a class type that encapsulates the appropriate behavior. Like how std::hash and std::less do it. We'd write:



template <class V, class KeyFunction> class IndexedVector { ... }
struct MyKeyFunction { int operator()(const A& a) const { return a.key; } };
IndexedVector<A, MyKeyFunction> s;


Finally: I strongly recommend CamelCase for all template parameter names. So I'd write Member, not member.





#define TEST(n) if ((n)) { std::cout << "OKn"; } else { std::cout << "NOT OKn"; }


The double-parentheses around n aren't necessary; double parens don't do anything here that single parens wouldn't. Also, for hygiene, you should wrap every statement-like macro in do ... while(0) — even though in this case I think you're safe.



#define TEST(n) do { 
if ((n)) { std::cout << "OKn"; }
else { std::cout << "NOT OKn"; }
} while (0)




bool operator==(const A& a1, const A& a2) 


Any time you have == you should also have !=. You get a pass here because this is just test code.





size_t idx(key_type key) { ... }
const std::vector<V>& getValues() { return values; }
size_t size() { return values.size(); }


All three of these methods should be const-qualified.





    hashed_iter it = indices.find(key);
auto hiter = indices.find(key);


It's mildly confusing to the reader that you sometimes say it and sometimes say hiter. I would stick with it, unless you're worried that the reader might need to distinguish hash iterators from other iterators. (And in that case, I'd say just hit, and I'd never use it to refer to a hash iterator.)



It's also unusual to abbreviate iterator in the name of a typedef. I'd replace all your uses of hashed_iter with hash_iterator.





using hashed_pair = std::pair<key_type, size_t>;
[...]
indices.insert(hashed_pair(key, vi));


What does this gain you, compared to using the two-argument emplace?



indices.emplace(key, vi);


In fact, this exact situation just came up a few hours ago on the C++ Slack — someone pointed out that it's too easy to get the hashed_pair typedef wrong and end up making unnecessary copies! And then what do you do? You get the typedef wrong!



using hashed_pair = std::pair</*oops no const*/ key_type, size_t>;
[...]
indices.insert(hashed_pair(key, vi));


This code constructs a pair<key_type, size_t>. Then it calls indices.insert, which wants a parameter of type pair<const key_type, size_t>; so it has to construct a pair<const key_type, size_t> out of the pair you already constructed...



Well, actually insert also has an overload taking any old P&& type (overload #2 here), so what you wrote is just a hair less efficient than the .emplace(key, vi) version. (It costs just one extra move/destroy.) But what you wrote is more verbose and causes the reader to go, "Huh?"; so, I would recommend against it.





size_t idx(key_type key)
{
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return -1;
}
else
{
return hiter->second;
}
}


Since key_type might be something heavyweight, like std::string, shouldn't this method take (const key_type& key) instead of (key_type key)?



Also, utter nitpick, but many C++ programmers find code easier to read when they can see more of it on the screen at once. (Up to a point, of course!) So that leads to a general preference for more compact bracing styles:



size_t idx(const key_type& key) const {
auto it = indices.find(key);
if (it != indices.end()) {
return it->second;
} else {
return -1;
}
}


And once it's in this form, we can reduce the number of indented lines and perhaps more directly express our intent by writing:



size_t idx(const key_type& key) const {
auto it = indices.find(key);
if (it != indices.end()) {
return it->second;
}
return -1;
}


Some compilers might complain about the implicit conversion of -1 to size_t. I might try to be explicit there; or even assert that such a case ought to be impossible, if that's what I meant.



    return size_t(-1);
// or
assert(false);


Compilers will definitely complain about your comparison of size_t with -1 down in the test cases; I'd definitely fix that.



TEST(s.idx(a1.key) == size_t(-1));





share|improve this answer





















  • There's quite a good case for static constexpr auto invalid_index = size_t(-1); to avoid writing ugly conversions everywhere.
    – Toby Speight
    13 hours ago













up vote
2
down vote










up vote
2
down vote










Does this site work? I am starting to doubt it...




Well, AFAIK, the front page can filter by "my tags" or sort by "date" but not both? So it's very easy for something to get stuck on the second page forever. One thing you could try is, after posting a snippet for review, go over to the C++ Slack and ask if anyone there would like to give you a review.





template <class V, typename Mem, Mem V::*member>


This works, but it's kind of cumbersome. As you noticed:



template <class V, typename Mem, Mem V::*member> class IndexedVector { ... }
IndexedVector<A, decltype(A::key), &A::key> s;


In C++17 you could get a similar effect by writing simply



template <class V, auto member> class IndexedVector { ... }
IndexedVector<A, &A::key> s;


However, there are other problems associated with pointers-to-members. For example, you can't use your approach to key on a key that doesn't have a type — so, no keying on bit-fields. And you can't use your approach (pre-C++17) to key on a member that isn't a member of A — so, no keying on inherited data members.



So the canonical C++ way to do this is to provide a class type that encapsulates the appropriate behavior. Like how std::hash and std::less do it. We'd write:



template <class V, class KeyFunction> class IndexedVector { ... }
struct MyKeyFunction { int operator()(const A& a) const { return a.key; } };
IndexedVector<A, MyKeyFunction> s;


Finally: I strongly recommend CamelCase for all template parameter names. So I'd write Member, not member.





#define TEST(n) if ((n)) { std::cout << "OKn"; } else { std::cout << "NOT OKn"; }


The double-parentheses around n aren't necessary; double parens don't do anything here that single parens wouldn't. Also, for hygiene, you should wrap every statement-like macro in do ... while(0) — even though in this case I think you're safe.



#define TEST(n) do { 
if ((n)) { std::cout << "OKn"; }
else { std::cout << "NOT OKn"; }
} while (0)




bool operator==(const A& a1, const A& a2) 


Any time you have == you should also have !=. You get a pass here because this is just test code.





size_t idx(key_type key) { ... }
const std::vector<V>& getValues() { return values; }
size_t size() { return values.size(); }


All three of these methods should be const-qualified.





    hashed_iter it = indices.find(key);
auto hiter = indices.find(key);


It's mildly confusing to the reader that you sometimes say it and sometimes say hiter. I would stick with it, unless you're worried that the reader might need to distinguish hash iterators from other iterators. (And in that case, I'd say just hit, and I'd never use it to refer to a hash iterator.)



It's also unusual to abbreviate iterator in the name of a typedef. I'd replace all your uses of hashed_iter with hash_iterator.





using hashed_pair = std::pair<key_type, size_t>;
[...]
indices.insert(hashed_pair(key, vi));


What does this gain you, compared to using the two-argument emplace?



indices.emplace(key, vi);


In fact, this exact situation just came up a few hours ago on the C++ Slack — someone pointed out that it's too easy to get the hashed_pair typedef wrong and end up making unnecessary copies! And then what do you do? You get the typedef wrong!



using hashed_pair = std::pair</*oops no const*/ key_type, size_t>;
[...]
indices.insert(hashed_pair(key, vi));


This code constructs a pair<key_type, size_t>. Then it calls indices.insert, which wants a parameter of type pair<const key_type, size_t>; so it has to construct a pair<const key_type, size_t> out of the pair you already constructed...



Well, actually insert also has an overload taking any old P&& type (overload #2 here), so what you wrote is just a hair less efficient than the .emplace(key, vi) version. (It costs just one extra move/destroy.) But what you wrote is more verbose and causes the reader to go, "Huh?"; so, I would recommend against it.





size_t idx(key_type key)
{
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return -1;
}
else
{
return hiter->second;
}
}


Since key_type might be something heavyweight, like std::string, shouldn't this method take (const key_type& key) instead of (key_type key)?



Also, utter nitpick, but many C++ programmers find code easier to read when they can see more of it on the screen at once. (Up to a point, of course!) So that leads to a general preference for more compact bracing styles:



size_t idx(const key_type& key) const {
auto it = indices.find(key);
if (it != indices.end()) {
return it->second;
} else {
return -1;
}
}


And once it's in this form, we can reduce the number of indented lines and perhaps more directly express our intent by writing:



size_t idx(const key_type& key) const {
auto it = indices.find(key);
if (it != indices.end()) {
return it->second;
}
return -1;
}


Some compilers might complain about the implicit conversion of -1 to size_t. I might try to be explicit there; or even assert that such a case ought to be impossible, if that's what I meant.



    return size_t(-1);
// or
assert(false);


Compilers will definitely complain about your comparison of size_t with -1 down in the test cases; I'd definitely fix that.



TEST(s.idx(a1.key) == size_t(-1));





share|improve this answer













Does this site work? I am starting to doubt it...




Well, AFAIK, the front page can filter by "my tags" or sort by "date" but not both? So it's very easy for something to get stuck on the second page forever. One thing you could try is, after posting a snippet for review, go over to the C++ Slack and ask if anyone there would like to give you a review.





template <class V, typename Mem, Mem V::*member>


This works, but it's kind of cumbersome. As you noticed:



template <class V, typename Mem, Mem V::*member> class IndexedVector { ... }
IndexedVector<A, decltype(A::key), &A::key> s;


In C++17 you could get a similar effect by writing simply



template <class V, auto member> class IndexedVector { ... }
IndexedVector<A, &A::key> s;


However, there are other problems associated with pointers-to-members. For example, you can't use your approach to key on a key that doesn't have a type — so, no keying on bit-fields. And you can't use your approach (pre-C++17) to key on a member that isn't a member of A — so, no keying on inherited data members.



So the canonical C++ way to do this is to provide a class type that encapsulates the appropriate behavior. Like how std::hash and std::less do it. We'd write:



template <class V, class KeyFunction> class IndexedVector { ... }
struct MyKeyFunction { int operator()(const A& a) const { return a.key; } };
IndexedVector<A, MyKeyFunction> s;


Finally: I strongly recommend CamelCase for all template parameter names. So I'd write Member, not member.





#define TEST(n) if ((n)) { std::cout << "OKn"; } else { std::cout << "NOT OKn"; }


The double-parentheses around n aren't necessary; double parens don't do anything here that single parens wouldn't. Also, for hygiene, you should wrap every statement-like macro in do ... while(0) — even though in this case I think you're safe.



#define TEST(n) do { 
if ((n)) { std::cout << "OKn"; }
else { std::cout << "NOT OKn"; }
} while (0)




bool operator==(const A& a1, const A& a2) 


Any time you have == you should also have !=. You get a pass here because this is just test code.





size_t idx(key_type key) { ... }
const std::vector<V>& getValues() { return values; }
size_t size() { return values.size(); }


All three of these methods should be const-qualified.





    hashed_iter it = indices.find(key);
auto hiter = indices.find(key);


It's mildly confusing to the reader that you sometimes say it and sometimes say hiter. I would stick with it, unless you're worried that the reader might need to distinguish hash iterators from other iterators. (And in that case, I'd say just hit, and I'd never use it to refer to a hash iterator.)



It's also unusual to abbreviate iterator in the name of a typedef. I'd replace all your uses of hashed_iter with hash_iterator.





using hashed_pair = std::pair<key_type, size_t>;
[...]
indices.insert(hashed_pair(key, vi));


What does this gain you, compared to using the two-argument emplace?



indices.emplace(key, vi);


In fact, this exact situation just came up a few hours ago on the C++ Slack — someone pointed out that it's too easy to get the hashed_pair typedef wrong and end up making unnecessary copies! And then what do you do? You get the typedef wrong!



using hashed_pair = std::pair</*oops no const*/ key_type, size_t>;
[...]
indices.insert(hashed_pair(key, vi));


This code constructs a pair<key_type, size_t>. Then it calls indices.insert, which wants a parameter of type pair<const key_type, size_t>; so it has to construct a pair<const key_type, size_t> out of the pair you already constructed...



Well, actually insert also has an overload taking any old P&& type (overload #2 here), so what you wrote is just a hair less efficient than the .emplace(key, vi) version. (It costs just one extra move/destroy.) But what you wrote is more verbose and causes the reader to go, "Huh?"; so, I would recommend against it.





size_t idx(key_type key)
{
auto hiter = indices.find(key);
if (hiter == indices.end())
{
return -1;
}
else
{
return hiter->second;
}
}


Since key_type might be something heavyweight, like std::string, shouldn't this method take (const key_type& key) instead of (key_type key)?



Also, utter nitpick, but many C++ programmers find code easier to read when they can see more of it on the screen at once. (Up to a point, of course!) So that leads to a general preference for more compact bracing styles:



size_t idx(const key_type& key) const {
auto it = indices.find(key);
if (it != indices.end()) {
return it->second;
} else {
return -1;
}
}


And once it's in this form, we can reduce the number of indented lines and perhaps more directly express our intent by writing:



size_t idx(const key_type& key) const {
auto it = indices.find(key);
if (it != indices.end()) {
return it->second;
}
return -1;
}


Some compilers might complain about the implicit conversion of -1 to size_t. I might try to be explicit there; or even assert that such a case ought to be impossible, if that's what I meant.



    return size_t(-1);
// or
assert(false);


Compilers will definitely complain about your comparison of size_t with -1 down in the test cases; I'd definitely fix that.



TEST(s.idx(a1.key) == size_t(-1));






share|improve this answer












share|improve this answer



share|improve this answer










answered 13 hours ago









Quuxplusone

10.8k11857




10.8k11857












  • There's quite a good case for static constexpr auto invalid_index = size_t(-1); to avoid writing ugly conversions everywhere.
    – Toby Speight
    13 hours ago


















  • There's quite a good case for static constexpr auto invalid_index = size_t(-1); to avoid writing ugly conversions everywhere.
    – Toby Speight
    13 hours ago
















There's quite a good case for static constexpr auto invalid_index = size_t(-1); to avoid writing ugly conversions everywhere.
– Toby Speight
13 hours ago




There's quite a good case for static constexpr auto invalid_index = size_t(-1); to avoid writing ugly conversions everywhere.
– Toby Speight
13 hours ago


















draft saved

draft discarded




















































Thanks for contributing an answer to Code Review Stack Exchange!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


Use MathJax to format equations. MathJax reference.


To learn more, see our tips on writing great answers.





Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


Please pay close attention to the following guidance:


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f175902%2findexedvector-holding-a-vector-of-unique-objects-with-faster-finding-elements%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Quarter-circle Tiles

build a pushdown automaton that recognizes the reverse language of a given pushdown automaton?

Mont Emei