Function to return subword of a camelcase string











Given a camelcase string and an index, return the subword of the string that includes that index, e.g.:



find_word('CamelCaseString', 6) -> 'Case'
find_word('ACamelCaseString', 0) -> 'A'


My code:



def find_word(s, index):
    for i in range(index, 0, -1):
        if s[i].isupper():
            left = i
            break
    else:
        left = 0

    for i in range(index, len(s)-1):
        if s[i].islower() and s[i+1].isupper() or s[i:i+2].isupper():
            right = i
            break
    else:
        right = len(s) - 1

    return s[left:right+1]


Can this be made more concise/efficient?







python strings interview-questions
















asked 13 hours ago by Eugene Yarmash












  • Shouldn't the first example return String ?
    – Heslacher
    13 hours ago








  • No, the index 6 is character 'a' in subword 'Case'.
    – Eugene Yarmash
    13 hours ago










  • OK, now I got it.
    – Heslacher
    12 hours ago




























2 Answers
Review

  • Add docstrings and tests... or both in the form of doctests!

        def find_word(s, index):
            """
            Find the CamelCase word surrounding the given index in the string.

            >>> find_word('CamelCaseString', 6)
            'Case'
            >>> find_word('ACamelCaseString', 0)
            'A'
            """

            ...



  • Loop like a native.

    Instead of going over the indexes, we can loop over the items directly. Instead of

        range(index, 0, -1)

    we can loop over the item and its offset at the same time using enumerate:

        for i, char in enumerate(string[index:0:-1]):

    However, this may be slower, since the slice creates a new string object. Note also that i counts up from 0 starting at index, so the actual position in the string is index - i.




  • If we can be sure that the given string is a CamelCase string,

    then we can drop part of your second if statement.

        if s[i].islower() and s[i+1].isupper() or s[i:i+2].isupper():

    would become

        if s[i+1].isupper():



  • Actually, your code is quite good from a performance perspective.

    We could, however, use a while loop to move both boundaries at once, for a small performance gain.
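A minimal sketch of the while-loop idea, combined with the doctest and the simplified uppercase check from the points above. The name find_word_while is hypothetical, and the sketch assumes that every uppercase letter starts a new word. Whether it is actually faster than the two for loops would need to be measured.

```python
def find_word_while(s, index):
    """
    Return the CamelCase word of ``s`` that contains position ``index``.

    >>> find_word_while('CamelCaseString', 6)
    'Case'
    >>> find_word_while('ACamelCaseString', 0)
    'A'
    """
    left, right = index, index + 1  # the candidate word is s[left:right]
    while True:
        # The left edge keeps moving until it sits on an uppercase letter
        # (or reaches the start of the string).
        move_left = left > 0 and not s[left].isupper()
        # The right edge keeps moving until the next character is uppercase
        # (or the string ends).
        move_right = right < len(s) and not s[right].isupper()
        if not (move_left or move_right):
            return s[left:right]
        if move_left:
            left -= 1
        if move_right:
            right += 1
```

Saved to a file (say example.py, a hypothetical name), the doctests can be run with `python -m doctest -v example.py`.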




(Slower, yet more readable) Alternative

A different approach to finding CamelCase words is to use a regex.

We can find all CamelCase words with the following regex: r"([A-Z][a-z]*)"

We can then use re.finditer to create an iterator over the matches, loop over them, and return the match whose span contains our index (start <= index < end).



import re

def find_word_2(string, index):
    for match in re.finditer(r"([A-Z][a-z]*)", string):
        if match.start() <= index < match.end():
            return match.group()
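To see why the start <= index < end test picks the right word, it can help to list the spans that re.finditer reports. The helper word_spans below is a hypothetical name used only for this illustration.

```python
import re


def word_spans(string):
    """List (word, start, end) for every match of the CamelCase pattern."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"[A-Z][a-z]*", string)]


for word, start, end in word_spans("CamelCaseString"):
    print(word, start, end)
# Camel 0 5
# Case 5 9
# String 9 15
```

The end offset is exclusive, so index 6 falls into the span 5..9 and yields 'Case'.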


NOTE: The regex version is more readable, but it is likely to be a lot slower for large inputs, since it scans the matches from the start of the string until it reaches the index.
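The performance claim can be checked with a rough timeit comparison. This is only a sketch: the input string and the repeat count are arbitrary assumptions, and the numbers will vary between machines and Python versions.

```python
import re
import timeit


def find_word(s, index):
    # Loop-based version from the question, reproduced for comparison.
    for i in range(index, 0, -1):
        if s[i].isupper():
            left = i
            break
    else:
        left = 0

    for i in range(index, len(s) - 1):
        if s[i].islower() and s[i+1].isupper() or s[i:i+2].isupper():
            right = i
            break
    else:
        right = len(s) - 1

    return s[left:right+1]


def find_word_2(string, index):
    # Regex-based version from this answer.
    for match in re.finditer(r"([A-Z][a-z]*)", string):
        if match.start() <= index < match.end():
            return match.group()


# Hypothetical test input: a long string with the index near the end,
# the unfavourable case for scanning matches from the start.
long_string = "CamelCase" * 10_000
index = len(long_string) - 2

loop_time = timeit.timeit(lambda: find_word(long_string, index), number=20)
regex_time = timeit.timeit(lambda: find_word_2(long_string, index), number=20)
print(f"loop:  {loop_time:.4f} s")
print(f"regex: {regex_time:.4f} s")
```

The loop version only inspects characters around the index, while the regex version has to walk through every preceding word first, so the gap should grow with the position of the index.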

















answered 11 hours ago by Ludisposed
























An alternative approach is to trade space for time by pre-calculating a mapping from letter indexes to the individual words. That makes the actual lookup $O(1)$ at the cost of $O(n)$ extra space. This may be especially useful if the function is executed many times for the same string and needs a constant-time response.

And, as this is tagged with interview-questions, I personally think it would be beneficial for a candidate to mention this idea of pre-calculating indexes for future constant-time lookups.

We could use a list to store the mappings between indexes and words:



import re


class Solver:
    def __init__(self, word):
        # indexes[i] holds the word that contains character i
        self.indexes = []
        for match in re.finditer(r"([A-Z][a-z]*)", word):
            matched_word = match.group()
            for index in range(match.start(), match.end()):
                self.indexes.append(matched_word)

    def find_word(self, index):
        return self.indexes[index]


solver = Solver('CamelCaseString')
print(solver.find_word(2))  # prints "Camel"
print(solver.find_word(5))  # prints "Case"
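If the $O(n)$ list is a concern, a possible middle ground (not part of the approach above, just a sketch of a different technique) is to store one start offset per word and binary-search those offsets with bisect. Lookups then take $O(\log k)$ for $k$ words, with $O(k)$ extra entries. The class name SpanSolver is hypothetical, and the sketch assumes the string consists only of words matched by the regex.

```python
import re
from bisect import bisect_right


class SpanSolver:
    """Hypothetical variant: one stored entry per word instead of per character."""

    def __init__(self, word):
        matches = list(re.finditer(r"[A-Z][a-z]*", word))
        self.starts = [m.start() for m in matches]  # sorted start offsets
        self.words = [m.group() for m in matches]
        self.length = len(word)

    def find_word(self, index):
        if not 0 <= index < self.length:
            raise IndexError("index out of range")
        # bisect_right counts the starts that are <= index, so the word
        # containing index is the one just before that position.
        pos = bisect_right(self.starts, index) - 1
        if pos < 0:
            raise IndexError("index precedes the first matched word")
        return self.words[pos]


solver = SpanSolver('CamelCaseString')
print(solver.find_word(2))  # prints "Camel"
print(solver.find_word(5))  # prints "Case"
```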
















answered 4 hours ago by alecxe





























