Python 3 multi-threaded pinger
My goal: I want to ping every single IPv4 address and record whether or not it responded.
The way I have it set up, every IP address corresponds to an index: for example, 0.0.0.0 is index 0 and 0.0.1.0 is index 256. So if 0.0.0.0 responded, then the 0th element of the bitarray is true. At the end I write the bitarray to a file.
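(The mapping described above is just base-256 positional notation on the four octets; as a quick illustration, not part of the original program, the claimed index values check out:)

```python
def index_of(f1, f2, f3, f4):
    # Treat the four octets as the digits of a base-256 number.
    return ((f1 * 256 + f2) * 256 + f3) * 256 + f4

print(index_of(0, 0, 0, 0))  # 0
print(index_of(0, 0, 1, 0))  # 256
```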
Here is the code:
    import subprocess
    from bitarray import bitarray
    import threading
    import time

    response_array = bitarray(256 * 256 * 256 * 256)
    response_array.setall(False)

    def send_all_pings():
        index = 0
        for f1 in range(256):
            for f2 in range(256):
                for f3 in range(256):
                    for f4 in range(256):
                        thread = PingerThread(".".join(map(str, [f1, f2, f3, f4])), index)
                        thread.start()
                        index += 1
        time.sleep(30)
        print("Writing response array to file")
        with open('responses.bin', 'wb') as out:
            response_array.tofile(out)

    class PingerThread(threading.Thread):
        def __init__(self, address, index):
            threading.Thread.__init__(self)
            self.address = address
            self.index = index

        def run(self):
            if subprocess.call(["ping", "-c", "1", "-w", "1", self.address]) == 0:
                response_array[self.index] = True
            else:
                response_array[self.index] = False
What can I do to make this run faster? Any optimisations at all, even if very small, are welcome!
python performance python-3.x multithreading networking
edited yesterday by 200_success
asked yesterday by Kos
2 Answers
Opening four billion network connections, potentially all at once, doesn't sound like a good idea. I can't tell whether you'll hit the OS handle limit, or whether that would be handled gracefully (say, by blocking until a handle is free), but I'd rather set up a sane limit up front.
answered yesterday by millimoose
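One simple way to impose such a limit is a fixed-size worker pool, so no more than a set number of pings is ever in flight. A minimal sketch (the helper names and the pool size are illustrative, not from the question):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def ping(address):
    # One ping, one-second deadline, output discarded; True means a reply came back.
    return subprocess.call(
        ["ping", "-c", "1", "-w", "1", address],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    ) == 0

def scan(addresses, max_workers=256):
    # The executor never runs more than max_workers pings concurrently,
    # so the process stays under the OS handle limit regardless of input size.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(ping, addresses))
```

The pool size is the knob: raise it until the network, not the handle count, is the bottleneck.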
Some suggestions:
- IP addresses are only formatted as octets (0-255) for human readability; they really just represent 32-bit integers. Instead of, for example, 127.0.0.1 you can use 2130706433 (127 * 2**24 + 1). In other words, range(2**32) covers the entire range of IPv4 addresses.
- Using a Python library to ping hosts is very likely going to be much faster than starting a shell command for every address.
- Use multiprocessing rather than Python threads to avoid running into the global interpreter lock.
- response_array will end up taking half a gigabyte of memory (2**32 bits). If you really need the kind of detail you're logging, you should write each entry to disk as soon as possible (keeping the file open all the while). You could also simplify your reporting, such as saving only the IP addresses which don't respond, or saving to two files, one with responding IPs and the other with non-responding ones. You'll have to keep a file (or a pair of files) per process, to avoid them clobbering each other.
edited yesterday; answered yesterday by l0b0
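On the first point, the standard library's ipaddress module already converts both ways between the dotted form and the integer, so the bitarray index and the address can be the same number. A sketch:

```python
import ipaddress

# Dotted quad -> 32-bit integer, and back again.
print(int(ipaddress.IPv4Address("127.0.0.1")))   # 2130706433, i.e. 127 * 2**24 + 1
print(str(ipaddress.IPv4Address(2130706433)))    # "127.0.0.1"

# Iterating the whole IPv4 space is then just iterating the integers:
# for index in range(2**32):
#     address = str(ipaddress.IPv4Address(index))
```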
Since this is mostly waiting on the network, async I/O might be even better than multiprocessing. You probably don't want 4 billion open connections, but that's an issue with the original code as well. – millimoose, yesterday
Also, saving to two files from umpteen concurrent processes sounds like either a bottleneck or a way to end up with a file full of garbage. – millimoose, yesterday
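The comment's two ideas, async I/O plus a cap on how much is open at once, can be combined with a semaphore. This is an illustrative sketch using asyncio's subprocess support, not code from either answer:

```python
import asyncio

async def ping(address, limit):
    # The semaphore caps how many pings are in flight at any moment.
    async with limit:
        proc = await asyncio.create_subprocess_exec(
            "ping", "-c", "1", "-w", "1", address,
            stdout=asyncio.subprocess.DEVNULL,
            stderr=asyncio.subprocess.DEVNULL,
        )
        return await proc.wait() == 0

async def scan(addresses, max_in_flight=512):
    limit = asyncio.Semaphore(max_in_flight)
    return await asyncio.gather(*(ping(a, limit) for a in addresses))

# e.g. asyncio.run(scan("192.0.2.%d" % i for i in range(1, 5)))
```

Unlike one thread per address, the event loop handles all the waiting in a single process, and the semaphore keeps the open-handle count bounded.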