Python 3 multi-threaded pinger











My goal: I want to ping every single IPv4 address and record whether or not they responded.



The way I have it set up, every IP address corresponds to an index. For example, 0.0.0.0 is index 0 and 0.0.1.0 is index 256. So if 0.0.0.0 responded, the 0th element of the bitarray is true. At the end I write the bitarray to a file.



Here is the code:



    import subprocess
    from bitarray import bitarray
    import threading
    import time

    response_array = bitarray(256 * 256 * 256 * 256)
    response_array.setall(False)

    def send_all_pings():
        index = 0
        for f1 in range(256):
            for f2 in range(256):
                for f3 in range(256):
                    for f4 in range(256):
                        thread = PingerThread(".".join(map(str, [f1, f2, f3, f4])), index)
                        thread.start()
                        index += 1

        time.sleep(30)
        print("Writing response array to file")
        with open('responses.bin', 'wb') as out:
            response_array.tofile(out)


    class PingerThread(threading.Thread):
        def __init__(self, address, index):
            threading.Thread.__init__(self)
            self.address = address
            self.index = index

        def run(self):
            if subprocess.call(["ping", "-c", "1", "-w", "1", self.address]) == 0:
                response_array[self.index] = True
            else:
                response_array[self.index] = False


What can I do to make this run faster? Any optimisations at all, even if very small, are welcome!










Tags: python, performance, python-3.x, multithreading, networking






edited yesterday by 200_success
asked yesterday by Kos






















2 Answers






Opening four billion network connections, potentially all at once, doesn't sound like a good idea. I can't tell offhand whether you'll hit an OS limit, or whether that would be handled gracefully (for example by blocking until a handle is free), so I'd rather set a sane limit up front.
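One way to put such a limit in place is a fixed-size thread pool. The following is only a minimal sketch, not a tuned scanner: the names `ping_once` and `scan`, the pool size of 64, and the chunk size are assumptions made for illustration, while the `ping` flags are the ones from the question. Addresses are fed to the pool in chunks so that the whole address space is never queued at once.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
import subprocess


def ping_once(address):
    """Return True if a single ping succeeds (same flags as in the question)."""
    return subprocess.call(
        ["ping", "-c", "1", "-w", "1", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    ) == 0


def scan(addresses, probe=ping_once, max_workers=64, chunk_size=1024):
    """Yield (address, responded) pairs with at most max_workers probes running."""
    addresses = iter(addresses)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            # Take the next chunk lazily instead of queueing every address.
            chunk = list(islice(addresses, chunk_size))
            if not chunk:
                break
            # pool.map returns results in the same order as its input.
            yield from zip(chunk, pool.map(probe, chunk))
```

Because `probe` is a parameter, the pool logic can be tried out with a stand-in function before any real ping is sent.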







answered yesterday by millimoose
























Some suggestions:

• IP addresses are only written as dotted octets (0-255) for human readability; underneath they are just 32-bit integers. Instead of, for example, 127.0.0.1 you can use 2130706433 (127 × 2^24 + 1). In other words, range(2**32) covers the entire IPv4 address space.

• Using a Python library to ping hosts is very likely going to be much faster than starting an external ping process for every address.

• Use multiprocessing rather than Python threads to avoid running into the global interpreter lock.

• response_array holds 2^32 bits, which is 512 MiB of memory for the whole run. If you really need the kind of detail you're logging, you should be writing each entry to disk ASAP (keeping the file open all the while). You could also look into simplifying your reporting, such as only saving the IP addresses which don't respond, or saving to two files, one with responding IPs and the other with non-responding ones. You'll have to store a file (or a pair of files) per process, to avoid them clobbering each other.
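The integer mapping and the write-as-you-go idea can be sketched together with the standard library's `ipaddress` module. This is an illustration under assumptions, not the answer's own code: `address_for`, `index_for`, and `record_responders` are hypothetical helper names, and `probe` stands for whatever ping function is used.

```python
import io
import ipaddress


def address_for(index):
    """Map an index in range(2**32) to its dotted-quad string."""
    return str(ipaddress.IPv4Address(index))


def index_for(address):
    """Inverse mapping: dotted-quad string to integer index."""
    return int(ipaddress.IPv4Address(address))


def record_responders(indices, probe, out):
    """Write each responding address to `out` as soon as its result is known.

    Returns the number of responders, so nothing has to be held in memory.
    """
    count = 0
    for index in indices:
        address = address_for(index)
        if probe(address):
            out.write(address + "\n")
            count += 1
    return count


# In Python, ** is exponentiation and ^ is bitwise XOR,
# so the full IPv4 space is range(2 ** 32).
ALL_INDICES = range(2 ** 32)
```

For example, `record_responders(range(index_for("127.0.0.0"), index_for("127.0.0.4")), probe, io.StringIO())` walks four loopback addresses; passing an open file instead of the `StringIO` buffer gives the incremental logging described above.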







edited yesterday
answered yesterday by l0b0








• Since this is mostly waiting on the network, async I/O might be even better than multiprocessing. You probably don't want 4 billion open connections, though, but that's an issue with the original code as well. – millimoose, yesterday

• Also, saving to two files from umpty concurrent processes sounds like either a bottleneck or a way to end up with a file full of garbage. – millimoose, yesterday
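The async I/O suggestion from the comments could look roughly like the sketch below, which uses `asyncio` with a semaphore to cap the number of pings in flight. The names `ping_once` and `scan` and the limit of 100 are assumptions chosen for illustration; the `ping` flags are those from the question.

```python
import asyncio


async def ping_once(address):
    """Run one ping (same flags as the question) without blocking the event loop."""
    process = await asyncio.create_subprocess_exec(
        "ping", "-c", "1", "-w", "1", address,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )
    return await process.wait() == 0


async def scan(addresses, probe=ping_once, limit=100):
    """Probe every address with at most `limit` probes running at once.

    Returns a list of booleans in the same order as `addresses`.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(address):
        # The semaphore blocks here once `limit` probes are already running.
        async with semaphore:
            return await probe(address)

    return await asyncio.gather(*(guarded(a) for a in addresses))
```

Note that `asyncio.gather` creates one task per address, so a full scan would need to call `scan` on one batch of addresses at a time, for example with `asyncio.run(scan(batch))`. Since only the event loop's thread collects the results, it can also be the single writer to the output file.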













