Zintis Zilbers

I assist people and organizations in dealing with complex IT matters.

Find Substrings for line Encoding [CF]

10 Jun 2016 » python, codefights

Given a string, return its encoding defined as follows:

First, the string is divided into the least possible number of disjoint substrings consisting of identical characters for example, "aabbbc" is divided into ["aa", "bbb", "c"] Next, each substring with length greater than one is replaced with a concatenation of its length and the repeating character for example, substring "bbb" is replaced by "3b" Finally, all the new strings are concatenated together in the same order and a new string is returned.


A substring of a string S is another string S' that occurs in S. For example, "Fights" is a substring of "CodeFights", but "CoFi" isn’t.


For s = "aabbbc", the output should be lineEncoding(s) = "2a3bc".


  • [time limit] 4000ms (py)
  • [input] string s (String consisting of lowercase English letters.)

Constraints: 4 ≤ s.length ≤ 15.

  • [output] string (Encoded version of s.)


import re
def lineEncoding(s):
    grub = [ m.group(0) for m in re.finditer(r"(\w)\1*", s )]
    numb = 0
    out  = []
    for i in grub:
        numb += 1
        if len(i) > 1:
            out.append(grub[numb-1].replace(grub[numb-1], str(len(i))+i[0]))
    return ''.join(out)

Result Tests:

s = "aabbbc"
>>> lineEncoding(s)
>>> s = "abbcabb"
>>> lineEncoding(s)
>>> s = "abcd"
>>> lineEncoding(s)