Addition function for two Big Integers represented as std::string
This function is from my attempt to write a Big Integer class in C++. Below, I have a simple addition function. It takes two strings of arbitrary length that contain only digits (no '-' sign either) and returns the sum of the two integers.
Could someone help me optimize it for performance?
Since this is my first attempt at writing C++ code, I would also appreciate feedback on code clarity and style, as well as any improvements regarding security and memory. Please assume that the input is properly sanitized and contains no leading zeros. I have not included the entire class, as the function works standalone.
std::string BigInt::add(const std::string &a, const std::string &b) const
{
    //I think keeping the assignment and declaration of addend in
    // multiple lines prevents copy constructors, right?
    std::string result(a.size() > b.size() ? a:b);
    std::string addend;
    addend = a.size() > b.size() ? b:a;
    char carry = 0;
    char adder = 0;
    for(int cnt = (result.size() - 1); cnt >= 0; cnt--)
    {
        adder = result[cnt] - '0' + carry;
        //cnt is non-negative, so the cast is safe
        if( (result.size() - addend.size()) <= static_cast<unsigned>(cnt))
        {
            adder += addend[cnt-(result.size()-addend.size())] - '0';
        }
        result[cnt] = adder%10 + '0';
        carry = adder/10;
    }
    //This propagates the final carry. O(n) insert, can maybe be optimized?
    if(carry)
    {
        result.insert(result.begin(), carry + '0');
    }
    return result;
}
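(For context, a minimal harness showing the intended use; hypothetical, since the full class is not shown, and it assumes BigInt is default-constructible and compiled alongside this member.)

#include <cassert>
#include <string>

// Hypothetical harness (not part of the post), assuming a default-constructible BigInt.
int main()
{
    BigInt big;
    assert(big.add("999", "1") == "1000");      // the carry ripples through every digit
    assert(big.add("12345", "678") == "13023"); // operands of different lengths
    return 0;
}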
Tags: c++, performance, integer
asked yesterday by Gnik, edited yesterday by Toby Speight
1 Answer
Accepted answer (score 3), answered yesterday by hoffmale:
Memory management
There are no leaks, which is great!
However, there is an unnecessary copy being made:
//I think keeping the assignment and declaration of addend in
// multiple lines prevents copy constructors, right?
std::string addend;
addend = a.size() > b.size() ? b:a;
Well, the comment is correct in that the copy constructor doesn't get called, but the copy assignment operator does, and that one not only creates a copy of the right-hand side operand, but also needs to clean up the original state.
A better way would be to use a reference instead:
const std::string &addend = a.size() > b.size() ? b : a;
Now, no copy is being made, but you can still easily access the smaller operand.
Performance
The simplest way to improve performance would be to choose better data structures and data representations.
The current implementation with std::string has one slight advantage (it is a string, i.e. directly human-readable), and some disadvantages:
- It wastes a lot of memory (1 byte per decimal digit = 25.6x more memory usage than necessary).
- It operates in really small steps (1 byte at a time).
- There are some unnecessarily complicated repeated checks.
- Inserting a new leading digit (because of carryover) requires all digits to be moved in memory, plus a likely reallocation.
For better performance, these issues should be addressed.
NOTE: For the best performance, you need to have a look at how the target processor works and understand how you can apply that knowledge to your advantage. That will likely include having to use embedded assembly (e.g. for x86 platforms, having access to the CPU carry flag can be really helpful) and similar tricks.
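(As an illustration of that note, and not part of the original answer: rather than raw inline assembly, GCC and Clang expose the add-with-carry idea through overflow builtins such as __builtin_uaddll_overflow, which the compiler can often lower to an add/adc chain. A minimal sketch, assuming two little-endian limb vectors of equal size:)

#include <cstddef>
#include <vector>

// Sketch (not from the answer): add two equally sized little-endian limb
// vectors using the GCC/Clang overflow builtins.
std::vector<unsigned long long> add_limbs(const std::vector<unsigned long long> &a,
                                          const std::vector<unsigned long long> &b)
{
    std::vector<unsigned long long> result(a.size());
    unsigned long long carry = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        unsigned long long sum;
        // Each builtin returns true if the addition wrapped around.
        bool c1 = __builtin_uaddll_overflow(a[i], b[i], &sum);
        bool c2 = __builtin_uaddll_overflow(sum, carry, &sum);
        result[i] = sum;
        carry = (c1 || c2) ? 1 : 0; // at most one of the two additions can wrap
    }
    if (carry)
    {
        result.push_back(carry);
    }
    return result;
}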
But this question is targeting C++, so I'll try to give an example that can be portably done within C++ with not too much overhead.
So, we need to address the issues mentioned above. Let's go through them one at a time:
Inserting a new leading digit (because of carryover) requires all digits to be moved in memory, plus a likely reallocation.
This can be amended by storing the digits in reverse order (so a new digit gets inserted at the back instead of the front) and by switching to a std::vector for storage (with maybe some extra capacity to prevent the reallocation).
std::vector<char> BigInt::add(const std::vector<char> &a, const std::vector<char> &b) const
{
    // Some helper variables
    auto min_size = std::min(a.size(), b.size());
    auto max_size = std::max(a.size(), b.size());
    std::vector<char> result(a.size() > b.size() ? a:b);
    const std::vector<char> &addend = a.size() > b.size() ? b : a;
    char carry = 0;
    char adder = 0;
    for(int cnt = 0; cnt < max_size; ++cnt)
    {
        adder = result[cnt] - '0' + carry;
        if(cnt < min_size)
        {
            adder += addend[cnt] - '0';
        }
        result[cnt] = adder%10 + '0';
        carry = adder/10;
    }
    if(carry)
    {
        result.push_back(carry + '0');
    }
    return result;
}
Already looks a lot cleaner, doesn't it?
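(A hypothetical addition, not from the answer: if the class keeps std::string at its boundary, small helpers along these lines could convert between the human-readable string and the reversed std::vector<char> digit storage used above.)

#include <string>
#include <vector>

// Hypothetical helpers (not part of the answer): convert between the
// human-readable decimal string and the reversed digit storage.
std::vector<char> to_digits(const std::string &s)
{
    // Reverse so that the least significant digit comes first.
    return std::vector<char>(s.rbegin(), s.rend());
}

std::string to_decimal_string(const std::vector<char> &digits)
{
    // Reverse back so that the most significant digit comes first.
    return std::string(digits.rbegin(), digits.rend());
}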
There are some unnecessarily complicated repeated checks.
Some of the complicated length calculations already got taken care of, but there is one part that is needlessly repeated:
if(cnt < min_size)
{
    adder += addend[cnt] - '0';
}
Why needlessly? Well, we know that this condition will be true for the first min_size iterations of the for loop, and false afterwards. With that knowledge, we can move that check one level higher and replace the for loop with:
for(int cnt = 0; cnt < min_size; ++cnt)
{
    adder = (result[cnt] - '0') + carry + (addend[cnt] - '0');
    result[cnt] = adder%10 + '0';
    carry = adder/10;
}
for(int cnt = min_size; carry && cnt < max_size; ++cnt)
{
    adder = result[cnt] - '0' + carry;
    result[cnt] = adder%10 + '0';
    carry = adder/10;
}
Of course, now the loop bodies are nearly identical, but that can easily be fixed with some refactoring. Full code:
std::vector<char> BigInt::add(const std::vector<char> &a, const std::vector<char> &b) const
{
    // Some helper variables
    auto min_size = std::min(a.size(), b.size());
    auto max_size = std::max(a.size(), b.size());
    std::vector<char> result(a.size() > b.size() ? a:b);
    const std::vector<char> &addend = a.size() > b.size() ? b : a;
    char carry = 0;
    auto add_digits = [&](char lhs, char rhs)
    {
        auto adder = lhs + rhs + carry;
        carry = adder / 10;
        return adder % 10;
    };
    for(int cnt = 0; cnt < min_size; ++cnt)
    {
        result[cnt] = add_digits(result[cnt] - '0', addend[cnt] - '0') + '0';
    }
    for(int cnt = min_size; carry && cnt < max_size; ++cnt)
    {
        result[cnt] = add_digits(result[cnt] - '0', 0) + '0';
    }
    if(carry)
    {
        result.push_back(carry + '0');
    }
    return result;
}
It wastes a lot of memory (1 byte per decimal digit = 25.6x more memory usage than necessary).
It operates in really small steps (1 byte at a time).
Fixing this requires an increase in the digit "size": before, we were using base-10 digits stored in chars. This can easily be improved to base-256 digits stored in chars (assuming an 8-bit char; it could be larger!).
But that's basically just calculating on the chars themselves. Most processors can work directly with bigger numbers, like ints (which are usually the largest integer type the target CPU can handle natively). So, why not use ints as "digits"?
Sadly, ints aren't exactly portable themselves; signed integers can be handled very differently on different processors (e.g. one's complement or two's complement). But unsigned integers are thoroughly specified by the C++ standard, so we can use those.
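(A small aside, not from the original answer: this well-defined wraparound is exactly what makes carry detection cheap. For two unsigned operands, the sum wrapped if and only if it ended up smaller than either operand, which is the post-addition form of the check_overflow helper used below.)

// Aside (not from the answer): carry detection via well-defined unsigned
// wraparound. For two operands, the sum wrapped exactly when sum < lhs.
unsigned int add_with_carry_out(unsigned int lhs, unsigned int rhs, unsigned int &carry_out)
{
    unsigned int sum = lhs + rhs;      // modulo arithmetic, never undefined behaviour
    carry_out = (sum < lhs) ? 1u : 0u;
    return sum;
}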
Also, we can instruct the result vector to preallocate space in case we need room for a new leading digit.
These changes can easily be applied on top of the code above:
bool check_overflow(unsigned int lhs, unsigned int rhs)
{
    return lhs > std::numeric_limits<unsigned int>::max() - rhs;
}

std::vector<unsigned int> BigInt::add(const std::vector<unsigned int>& a, const std::vector<unsigned int>& b) const
{
    auto min_size = std::min(a.size(), b.size());
    auto max_size = std::max(a.size(), b.size());
    auto result = std::vector<unsigned int>();
    result.reserve(max_size + 1);
    const auto& larger = (a.size() > b.size() ? a : b);
    const auto& smaller = (a.size() > b.size() ? b : a);
    auto carry = 0u;
    auto add_digit = [&](unsigned int lhs, unsigned int rhs)
    {
        auto sum = lhs + rhs + carry;
        carry = (check_overflow(lhs, rhs) || (sum == 0 && carry)) ? 1 : 0;
        return sum;
    };
    for(auto i = 0; i < min_size; ++i)
    {
        result.push_back(add_digit(larger[i], smaller[i]));
    }
    for(auto i = min_size; i < max_size; ++i)
    {
        result.push_back(add_digit(larger[i], 0));
    }
    if(carry) result.push_back(carry);
    return result;
}
Of course, this isn't perfect performance yet; much can still be improved by using knowledge about the target architecture, e.g. by using SSE instructions or similar. Still, this should be quite a lot faster than the original approach, and it should be fully portable between different platforms.
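(A hypothetical usage sketch, not part of the answer: it assumes the limb-based add above is available as a free function named add, and checks that a carry out of the low limb produces a new high limb.)

#include <cassert>
#include <climits>
#include <vector>

// Hypothetical test (not from the answer); limbs are little-endian, and
// `add` is assumed to be the function above exposed as a free function.
int main()
{
    std::vector<unsigned int> a{UINT_MAX}; // largest single-limb value
    std::vector<unsigned int> b{1u};
    std::vector<unsigned int> sum = add(a, b);

    // UINT_MAX + 1 overflows the low limb: low limb 0, new high limb 1.
    assert((sum == std::vector<unsigned int>{0u, 1u}));
    return 0;
}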
Excellent review! Thank you! – Gnik (yesterday)
Instead of min(), max() and assignments, I would have used something like std::tie(smaller, larger) = std::minmax(a, b, [](const auto& lhs, const auto& rhs) { return lhs.size() < rhs.size(); }); – Calak (21 hours ago)
@Calak: std::tie cannot be used to initialize references, so your suggestion would need to copy the inputs. – hoffmale (18 hours ago)
However, IIUC it should work with structured bindings: const auto& [smaller, larger] = std::minmax(a, b, [](const auto& lhs, const auto& rhs) { return lhs.size() < rhs.size(); });. That said, this requires C++17, and I've tried to keep my answer C++11 compatible (since OP didn't specify an exact version). – hoffmale (18 hours ago)