Find the optimal pattern

Question

Given a string s composed of lowercase letters, such as

aabaaababbbbaaba

and a positive integer n, such as 4, output a length-n string t such that when t is repeated to the length of s, they have as many chars in common as possible. For the given example, the optimal output would be aaba, because it has thirteen chars in common with the target string:

s: aabaaababbbbaaba
t: aabaaabaaabaaaba (aaba)
   ^^^^^^^^  ^ ^^^^

and no possible t has more. However, for aaaaaab, there are two possible outputs: aaaa and aaba, which each have 6 chars in common with the target string:

s: aaaaaab
t: aaaaaaaa (aaaa)
   ^^^^^^ 

s: aaaaaab
t: aabaaaba (aaba)
   ^^ ^^^^

Either aaaa or aaba can be outputted, or both if you'd like. Note that s is not ever repeated; the trailing a in both repeated values of t is simply ignored.

Test cases

Inputs -> Valid outputs
1 a -> a
1 aa -> a
2 aa -> aa
1 ab -> a b
2 ab -> ab
1 abb -> b
2 abb -> ab bb
2 ababa -> ab
2 abcba -> ab
2 aabbbbb -> bb  (ab is not a valid output here)
3 aababba -> aab abb
3 aababbaa -> aab
3 asdasfadf -> asf
3 asdasfadfsdf -> asf adf
2 abcdefghijklmnopqrstuvwxyzyx -> yx
2 supercalifragilisticexpialidocious -> ic ii
3 supercalifragilisticexpialidocious -> iri ili ioi
4 supercalifragilisticexpialidocious -> scii
5 supercalifragilisticexpialidocious -> iapic
2 eeeebaadbaecaebbbbbebbbbeecacebdccaecadbbbaceebedbbbddadebeddedbcedeaadcabdeccceccaeaadbbaecbbcbcbea -> bb be
10 bbbbacacbcedecdbbbdebdaedcecdabcebddbdcecebbeeaacdebdbebaebcecddadeeedbbdbbaeaaeebbedbeeaeedadeecbcd -> ebbbdbeece ebdbdbeece
20 aabbbaaabaaabaaaabbbbabbbbabbbabbbbbabbaaaababbbaababbbaababaaaabbaaabbaabbbabaaabbabbaaabbaaaaaaaba -> aabbbbaaabbabbbaabba

Rules

You may assume the input will only ever be a non-empty string of lowercase letters and a positive integer no greater than the length of the string.
You may take the inputs in any standard format and in either order.
You may output a single string, or more than one in the form of an array, separated by newlines or spaces, etc.
Your code must finish for each test-case in less than 1 minute on any fairly modern computer.
This is code-golf, so make your code as short as possible.

This challenge is Zgarb-quality. Nice work! – Martin Ender♦ 13 hours ago — Martin Ender♦, 13 hours ago

ais523 · Answer 1 · 2017-03-22 17:01:52Z

Jelly, 11 bytes

sZµṢŒrUṀṪµ€

Try it online!

Wasn't expecting to beat Dennis on this one, so tried to FGITW it (after trying several possibilities; there's more than one way to make 11). I came in shorter, much to my surprise.

Takes the string then the count as command-line arguments. Outputs on stdout.

Explanation

sZµṢŒrUṀṪµ€
s            Split {the first input} into {the second input}-sized groups
 Z           Transpose
  µ      µ€  On each of the transposed groups:
   Ṣ           Sort it;
    Œr         Run-length encode it;
      U        Rearrange it to the form {count, letter};
       Ṁ       Take the largest element (i.e. largest count)
        Ṫ      Take the second element of the pair (i.e. just the letter)

This uses the insight that the letter in each position of the pattern must be the most common letter corresponding to that position. We can find the letters corresponding to a particular pattern via splitting into pattern-sized groups, and transposing. The main reason this solution is so long is that Jelly doesn't seem to have a short way to find the mode of a list (I made several attempts, but they're all at least six bytes long).

Jelly, 10 bytes, based on @Dennis' solution

⁸ċ$ÞṪ
sZÇ€

Try it online!

This is a combination of @Dennis' solution and my own; there was a five-byte mode in that solution, which I stole for this solution. (I already had solutions based on ⁸ċ, but couldn't get below six bytes with it; I hadn't thought of using Þ.)

Explanation

µ…µ€ and Ç€ (with the … on the previous line) are both three bytes long (the latter needs a newline), and equivalent. Normally I use the former, but the latter's more flexible, as it allows you to use ⁸ to mention the argument.

This makes it possible to sort (Þ) by the number of occurrences in ⁸ (⁸ċ), then take the last element (Ṫ), to find the mode in just five characters.

Nice job outgolfing Dennis with his own language! :P – HyperNeutrino 12 hours ago — HyperNeutrino, 12 hours ago

Martin Ender · Answer 2 · 2017-03-22 16:57:23Z

Mathematica, 41 bytes

#&@@@Commonest/@(#2~Partition~UpTo@#)&

Input and output are lists of characters.

Also based on the modes of the lines of the transpose. I believe they called the built-in for the mode of a list Commonest solely to spite code golfers.

For only three additional bytes and without any restructuring, we can get a list of all optimal solutions:

Tuples[Commonest/@(#2~Partition~UpTo@#)]&

Rod · Answer 3 · 2017-03-22 19:29:10Z

up vote 5 down vote

Python 2, 106

Now it's a different answer! I was thinking about one(almost)-liner from the beggining. Now even shorter, based on zip usage by @Rod.

Thanks to @L3viathan and @Rod for clarification about using lambdas as answer

Try it online

lambda S,N:max(combinations(S,N),key=lambda s:sum(x==y for x,y in zip(S,s*len(S))))
from itertools import*

Explanation:

combinations(S,N) creates all combinations length N from characters of S

max() have argument key which takes as input function to use to compare elements

lambda s:sum(x==y for x,y in zip(S,s*len(S))) passed as such function

This lambda counts number of matching characters in list of tuples, produces by zip(S,s*len(S))

s - one of combinations and it's multipled by len(S) which creates string that is guaranteed longer than S

zip creates tuples of characters of each string S and s*len(S) and ignores all characters that can't be matched (in case of one string longer than another)

So max chooses combination, that produce maximum sum

edited 10 hours ago

Rod

3,6161634

answered 13 hours ago

Dead Possum

1535

1

you don't need to use [] on list comprehension inside functions, also you are using 1 for ... if <cond> you can use directly <cond> for ... since it will be used on sum, python will take True as 1 and False as 0 – Rod 13 hours ago

@Rod Thank you! If I would sqeeze my answer more, it will transform into your answer, approach is same :D So I'm trying something different right now – Dead Possum 13 hours ago

Yep, just saying so you can use on your future answers :3 – Rod 12 hours ago

1

Switching to a lambda will save 7 bytes. – L3viathan 11 hours ago

1

@DeadPossum He meant this (note the footer and the header) and yes, a function is a valid answer, if its a lambda you don't even need the f= (unless it's recursive) – Rod 10 hours ago

| show 4 more comments

RootTwo · Answer 4 · 2017-03-22 20:24:45Z

up vote 5 down vote

Python 3, 99, 73 61 bytes

-12, thx to @Rod

lambda s,n:''.join(max(s,key=s[i::n].count)for i in range(n))

Same idea, but rewrote it to eliminate the import statement.

lambda s,n:''.join(max(s,key=lambda c:s[i::n].count(c))for i in range(n))

Original

from collections import*
lambda s,n:''.join(Counter(s[i::n]).most_common(1)[0][0]for i in range(n))

Explanation:

s[i::n]                  a slice of every nth character of s, starting at position i

Counter(s[i::n])         counts the characters in the slice
  .most_common()         returns a list of (character, count) pairs, sorted by decreasing count
    [0][0]               grabs the letter from the first pair (i.e., the most common letter
      for i in range(n)  repeat for all starting positions

''.join                  combines the most common letters into a single string

edited 9 hours ago

answered 11 hours ago

RootTwo

8244

you can switch to python2.7 and drop the ''.join() to return a list of strings – Rod 11 hours ago

@Rod Dropping ''.join(...) would make it return a generator, not sure if that's allowed output. – L3viathan 11 hours ago

@L3viathan it need to be python2.7 to work, added to the other comment – Rod 10 hours ago

Can you write some explanation of how this works? – Dead Possum 10 hours ago

Also you can replace key=lambda c:s[i::n].count(c) with key=s[i::n].count (probably) – Rod 10 hours ago

| show 1 more comment

Dennis · Answer 5 · 2017-03-22 23:52:33Z

Jelly, 12 11 bytes

s@ZċþZMḢ$€ị

Try it online!

How it works

s@ZċþZMḢ$€ị  Main link. Arguments: n (integer), s (string)

s@           Split swapped; split s into chunks of length n.
  Z          Zip/transpose, grouping characters that correspond to repetitions.
   ċþ        Count table; for each slice in the previous result, and each character
             in s, count the occurrences of the character in the group.
             This groups by character.
     Z       Zip/transpose to group by slice.
        $€   Map the two-link chain to the left over the groups.
      M        Find all maximal indices.
       Ḣ       Head; pick the first.
          ị  Index into s to retrieve the corresponding characters.

Neil · Answer 6 · 2017-03-22 17:33:30Z

up vote 3 down vote

JavaScript (ES6), 105 bytes

(n,s)=>[...Array(n)].map((_,i)=>[...s].map((c,j)=>j%n-i||(o[c]=-~o[c])>m&&(m++,r=c),m=0,o={})&&r).join``

answered 12 hours ago

Neil

35.2k32182

Very nice. I golfed a solution for reference when adding test cases and came up at 122 bytes, looping through every char, saving the counts in an array of objects, then building the string from that array. – ETHproductions 12 hours ago

add a comment |

Mnemonic · Answer 7 · 2017-03-22 16:53:48Z

up vote 2 down vote

Pyth, 11 bytes

meo/dNd.TcF

Takes input as s,n and outputs as a list of characters.

Explanation

meo/dNd.TcF
         cFQ   Split s into chunks of length n.
       .T      Transpose.
m o/dNd        Sort characters in each string by frequency.
 e             Take the most common.

edited 13 hours ago

answered 13 hours ago

Mnemonic

25912

add a comment |

ETHproductions · Answer 8 · 2017-03-22 19:09:34Z

Japt, 16 15 bytes

Saved 1 byte thanks to @obarakon

Ç=VëUZ)¬ñ!èZ o

14 bytes of code + 1 byte for the -P flag. Try it online!

Ungolfed and explanation

 Ç   =VëUZ)¬ ñ!èZ o
UoZ{Z=VëUZ)q ñ!èZ o}
                          Implicit: U = input number, V = input string
Uo                        Create the range [0...U).
  Z{               }      Map each item Z by this function:
      VëUZ                  Take every U'th char of V, starting at index Z.
    Z=    )                 Call the result Z.
           q                Split the result into chars.
             ñ!èZ           Sort each char X by the number of occurrences of X in Z.
                  o         Pop; grab the last item (the most common char).
                      -P  Join the results (array of most common chars) into a string.

I think you can replace gJ with o – obarakon 11 hours ago — obarakon, 11 hours ago
@obarakon That's genius, thanks! – ETHproductions 11 hours ago — ETHproductions, 11 hours ago

Rod · Answer 9 · 2017-03-22 17:00:47Z

up vote 1 down vote

Python 2, 132 bytes

from itertools import*
p,k=input()
b=l=len(p)
for i in combinations(p,k):
 x=sum(x!=y for x,y in zip(p,i*l))
 if x<b:b,o=x,i
print o

Try it online!

answered 13 hours ago

Rod

3,6161634

add a comment |

Emigna · Answer 10 · 2017-03-22 17:13:32Z

05AB1E, 17 bytes

Iôð«øvy{.¡é®èÙJðÜ

Try it online!

Explanation

Iô                 # split 2nd input in chunks of 1st input size
  ð«               # append a space to each
    ø              # zip
     vy            # for each y in the zipped list
       {           # sort the string
        .¡         # group into chunks of consecutive equal elements
          é        # sort by length
           ®è      # pop the last element (the longest)
             Ù     # remove duplicate characters from the string
              J    # join the stack into one string
               ðÜ  # remove any trailing spaces

Jörg Hülsermann · Answer 11 · 2017-03-22 20:56:21Z

PHP, 245 Bytes

function p($c,$s,$r=""){global$a;if(($c-strlen($r)))foreach(str_split(count_chars($s,3))as$l)p($c,$s,$r.$l);else{for($v=str_pad("",$w=strlen($s),$r);$z<$w;)$t+=$v[$z]==$s[$z++];$a[$t][]=$r;}}p($argv[1],$argv[2]);ksort($a);echo join(" ",end($a));

Online Version

Breakdown

function p($c,$s,$r=""){
    global$a;
    if(($c-strlen($r)))  # make permutation
        foreach(str_split(count_chars($s,3))as$l)
            p($c,$s,$r.$l); #recursive
    else{
        for($v=str_pad("",$w=strlen($s),$r);$z<$w;) 
        $t+=$v[$z]==$s[$z++]; #compare strings
        $a[$t][]=$r; # insert value in array
    }
}
p($argv[1],$argv[2]); #start function with the input parameter
ksort($a); # sort result array 
echo join(" ",end($a)); #Output

asked	today
viewed	742 times
active	today

current community

your communities

more stack exchange communities

Find the optimal pattern

Test cases

Rules

11 Answers 11

Jelly, 11 bytes

Explanation

Jelly, 10 bytes, based on @Dennis' solution

Explanation

Mathematica, 41 bytes

Python 2, 106

Python 3, 99, 73 61 bytes

Jelly, 12 11 bytes

How it works

JavaScript (ES6), 105 bytes

Pyth, 11 bytes

Explanation

Japt, 16 15 bytes

Ungolfed and explanation

Python 2, 132 bytes

05AB1E, 17 bytes

PHP, 245 Bytes

Breakdown

Your Answer

Not the answer you're looking for? Browse other questions tagged code-golf string or ask your own question.

Visit Chat

Hot Network Questions

current community

your communities

more stack exchange communities

Find the optimal pattern

Test cases

Rules

11 Answers 11

Jelly, 11 bytes

Explanation

Jelly, 10 bytes, based on @Dennis' solution

Explanation

Mathematica, 41 bytes

Python 2, 106

Python 3, 99, 73 61 bytes

Jelly, 12 11 bytes

How it works

JavaScript (ES6), 105 bytes

Pyth, 11 bytes

Explanation

Japt, 16 15 bytes

Ungolfed and explanation

Python 2, 132 bytes

05AB1E, 17 bytes

PHP, 245 Bytes

Breakdown

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged code-golf string or ask your own question.

Visit Chat

Related

Hot Network Questions