Find the first word starting with each letter

Question

Given a string, find the first word starting with each letter (case insensitive).

Sample

Using "Ferulas flourish in gorgeous gardens." as input:

"Ferulas flourish in gorgeous gardens."
 ^^^^^^^          ^^ ^^^^^^^^
 |                |  |
 |                |  --> is the first word starting with `g`
 |                --> is the first word starting with `i`
 --> is the first word starting with `f`

The output for this sample should then be the matched words joined by a single space:

"Ferulas in gorgeous"

Challenge

Both input and output must be a string representation, or the closest alternative in your language.

Program or function allowed.

You can consider a word to be a run of one or more of these characters: lowercase or uppercase letters, digits, underscore.

This is code-golf, shortest answer in bytes wins.

More samples:

input: "Take all first words for each letter... this is a test"
output: "Take all first words each letter is"

input: "Look ^_^ .... There are 3 little dogs :)"
output: "Look _ There are 3 dogs"

input: "...maybe some day 1 plus 2 plus 20 could result in 3"
output: "maybe some day 1 plus 2 could result in 3"
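
For reference, the rules above can be sketched in ungolfed Python roughly as follows. This is only an illustration and not part of the challenge; the function name first_words is made up here, and the re.ASCII flag is an assumption chosen so that \w means exactly letters, digits and underscore, as the challenge defines a word.

```python
import re

def first_words(text: str) -> str:
    """Keep the first word for each initial character, ignoring case."""
    seen = set()
    kept = []
    # re.ASCII restricts \w to [a-zA-Z0-9_], the challenge's word definition.
    for word in re.findall(r"\w+", text, flags=re.ASCII):
        key = word[0].lower()
        if key not in seen:
            seen.add(key)
            kept.append(word)
    return " ".join(kept)

print(first_words("Ferulas flourish in gorgeous gardens."))
```

Run on the four sample inputs, this sketch should reproduce the listed outputs.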

Sp3000 · Answer 1 · 2016-04-07 06:27:00Z

Retina, 45 bytes

i`\b((\w)\w*)\b(?<=\b\2\w*\b.+)

\W+
 
^ | $

Simply uses a single regex to remove later words starting with the same \w character (case insensitive with the i option), converts runs of \W to a single space, then removes any leading/trailing space from the result.
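
The three stages described above can be approximated in Python as a rough sketch. Python's re module does not accept the variable-length lookbehind used in the Retina pattern, so a callback with a set of seen initials stands in for it here; that substitution, the name retina_style and the re.ASCII flag are assumptions of this sketch, not part of the answer.

```python
import re

def retina_style(text: str) -> str:
    seen = set()

    def drop_repeat(match):
        word = match.group(0)
        key = word[0].lower()
        if key in seen:
            return ""  # a later word whose initial was already seen
        seen.add(key)
        return word

    # Stage 1: delete later words (the callback replaces the lookbehind).
    text = re.sub(r"\w+", drop_repeat, text, flags=re.ASCII)
    # Stage 2: collapse each run of non-word characters into one space.
    text = re.sub(r"\W+", " ", text, flags=re.ASCII)
    # Stage 3: remove the leading/trailing space that may be left over.
    return re.sub(r"^ | $", "", text)

print(retina_style("Look ^_^ .... There are 3 little dogs :)"))
```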

Try it online!

Edit: See @Kobi's answer for a shorter version using M!`

Darn it, barely beat me to it! I couldn't figure out the lookbehind though. – GamrCorps 4 hours ago

I've added another Retina answer - I think that's OK if they are different enough (the basic concept is similar, of course). – Kobi 1 hour ago

@Kobi It's much better, so I'm glad to see it :) Makes me realise how much more I need to learn about Retina's line options and what not. – Sp3000 1 hour ago

Kobi · Answer 2 · 2016-04-07 06:43:44Z

Retina, 28 bytes:

M!i`\b(\w)(?<!\b\1.+)\w*
¶

M! - Match each word and print all matches separated by newlines.
i - Ignore case.
\b(\w) - Capture the first character of each word.
(?<!\b\1.+) - After matching that character, check that no previous word starts with the same one. \1.+ requires at least two characters, so the current word itself is skipped.
\w* - Match the rest of the word.
The above matches only words - all other characters are removed.
¶ followed by a line containing a single space - Replace newlines with spaces.
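
As a rough illustration of what the negative lookbehind checks, here is a Python sketch. Python's re module rejects variable-length lookbehinds, so the check is emulated by searching the text before each match; the function name kobi_style and the re.ASCII flag are assumptions of this sketch.

```python
import re

def kobi_style(text: str) -> str:
    kept = []
    for m in re.finditer(r"\b(\w)\w*", text, flags=re.ASCII):
        first = re.escape(m.group(1))
        # Emulates (?<!\b\1.+): look in the text before this word for an
        # earlier word that starts with the same character, ignoring case.
        earlier = re.search(r"\b" + first, text[:m.start()],
                            flags=re.ASCII | re.IGNORECASE)
        if earlier is None:
            kept.append(m.group(0))
    # Joining with spaces plays the role of the final newline-to-space stage.
    return " ".join(kept)

print(kobi_style("...maybe some day 1 plus 2 plus 20 could result in 3"))
```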

Try it online!

user81655 · Answer 3 · 2016-04-07 07:04:24Z

JavaScript (ES6), 73 71 bytes

s=>s.match(u=/\w+/g).filter(w=>u[n=parseInt(w[0],36)]?0:u[n]=1).join` `

Saved 2 bytes thanks to @edc65!

Test

var solution = s=>s.match(u=/\w+/g).filter(w=>u[n=parseInt(w[0],36)]?0:u[n]=1).join` `;
var testCases = [
  "Ferulas flourish in gorgeous gardens.",
  "Take all first words for each letter... this is a test",
  "Look ^_^ .... There are 3 little dogs :)",
  "...maybe some day 1 plus 2 plus 20 could result in 3"
];
document.write("<pre>"+testCases.map(t=>t+"\n"+solution(t)).join("\n\n")+"</pre>");

Using parseInt("_",36) = NaN? Blasphemy! – Sp3000 3 hours ago

The fun fact is: it works @Sp3000 – edc65 1 hour ago

Using u=regexp is really clever. Save 2 bytes s=>s.match(u=/\w+/g).filter(w=>u[w=parseInt(w[0],36)]?0:u[w]=1).join' ' – edc65 1 hour ago

@edc65 Thanks. It's actually quite convenient that there are 37 possible outputs for a single base-36 digit. – user81655 1 hour ago
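
The base-36 keying discussed in these comments can be sketched in Python for readers who don't write JavaScript. This is an analogy rather than a translation: Python's int(ch, 36) raises ValueError where parseInt returns NaN, so None is used as the 37th key, and a plain dict replaces the properties stored on the regex object. The names base36_key and first_words_base36 are made up for this sketch.

```python
import re

def base36_key(ch: str):
    """Return 0..35 for an ASCII digit or letter (any case), else None."""
    try:
        return int(ch, 36)
    except ValueError:
        return None  # stands in for JavaScript's NaN, e.g. for "_"

def first_words_base36(text: str) -> str:
    seen = {}
    kept = []
    for word in re.findall(r"\w+", text, flags=re.ASCII):
        key = base36_key(word[0])
        if key not in seen:
            seen[key] = 1
            kept.append(word)
    return " ".join(kept)

print(first_words_base36("Look ^_^ .... There are 3 little dogs :)"))
```

Because "a" and "A" both map to 10, case-insensitivity comes for free, and every word starting with an underscore shares the single non-numeric key.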

Jakube · Answer 4 · 2016-04-07 07:14:36Z

Pyth, 21 bytes

J:z"\w+"1jdxDJhM.ghkJ

Try it online: Demonstration or Test Suite

J:z"\w+"1 finds all the words in the input using the regex \w+ and stores them in J.

.ghkJ groups the words by their first letter, hM takes the first from each group, xDJ sorts these words by their index in the input string, and jd puts spaces between them.
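
The same group / take-first / re-sort pipeline can be sketched in Python. The function name pyth_style is made up, and lowercasing the grouping key is an assumption added here to satisfy the case-insensitive rule; the explanation above does not say how the Pyth code handles case.

```python
import re

def pyth_style(text: str) -> str:
    # J: all words in the input.
    words = re.findall(r"\w+", text, flags=re.ASCII)
    # Group the words by first character (lowercased: an assumption).
    groups = {}
    for word in words:
        groups.setdefault(word[0].lower(), []).append(word)
    # Take the first word of each group, visiting groups in key order,
    # which scrambles the original word order.
    firsts = [groups[key][0] for key in sorted(groups)]
    # Restore input order by sorting on each word's index in the word list.
    firsts.sort(key=words.index)
    return " ".join(firsts)

print(pyth_style("Look ^_^ .... There are 3 little dogs :)"))
```

Grouping loses the input order once the groups are visited by key, which is why the final sort by index is needed.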

Adnan · Answer 5 · 2016-04-07 05:50:07Z

05AB1E, 40 bytes

Code:

94L32+çJžj-DU-ð¡""Kvy¬Xsl©åï>iX®«Uy}\}ðý

Try it online!

Alex A. · Answer 6 · 2016-04-07 06:26:15Z

Julia, 165 155 151 129 102 bytes

g(s,d=[])=join(filter(i->i!=0,[(c=lcfirst(w)[1])∈d?0:(d=[d;c];w)for w=split(s,r"\W",keep=1<0)])," ")

This is a function that accepts a string and returns a string.

Ungolfed:

function g(s, d=[])
    # Split the string into an array on unwanted characters, then for
    # each word, if the first letter has been encountered, populate
    # this element of the array with 0, otherwise note the first letter
    # and use the word. This results in an array of words and zeros.
    x = [(c = lcfirst(w)[1]) ∈ d ? 0 : (d = [d; c]; w) for w = split(s, r"\W", keep=1<0)]

    # Remove the zeros, keeping only the words. Note that this works
    # even if the word is the string "0" since 0 != "0".
    z = filter(i -> i != 0, x)

    # Join into a string and return
    return join(z, " ")
end

Saved 53 bytes with help from Sp3000!

Dennis · Answer 7 · 2016-04-07 06:34:42Z

Jelly, 32 31 bytes

ØB;”_
e€¢¬œṗf€¢¹ÐfµZḢŒlQi@€$ịj⁶

Try it online!

Dr Green Eggs and Ham DJ · Answer 8 · 2016-04-07 04:53:57Z

Vim, 57 keystrokes

:s/[^a-zA-Z_ ]//g<cr>A <cr>ylwv$:s/\%V\c<c-v><c-r>"\h* //eg<c-v><cr>@q<esc>0"qDk@q

Explanation:

:s/[^a-zA-Z_ ]//g                                 #Remove all invalid chars.
A <cr>                                            #Enter insert mode, and enter 
                                                  #a space and a newline at the end
ylwv$:s/\%V\c<c-v><c-r>"\h* //eg<c-v><cr>@q<esc>  #Enter all of this text on the 
                                                  #next line

0                                                 #Go to the beginning of the line
"qD                                               #Delete this line into register
                                                  #"q"
k@q                                               #Run "q" as a macro  

#Macro
ylw                                               #Yank a single letter
   v$                                             #Visual selection to end of line
     :s/                                          #Substitute regex
       \%V\c                                      #Only apply to the selection and 
                                                  #ignore case
            <c-v><c-r>"                           #Enter the yanked letter
                       \h*                        #All "Head of word" chars
                                                  #And a space
                           //                     #Replace with an empty string
                             eg                   #Continue the macro if not found
                                                  #Apply to all matches
                               <c-v><cr>          #Enter a <CR> literal
                                        @q<esc>   #Recursively call the macro

I'm really disappointed by how long this one is. The "invalid" chars (everything but a-z, A-Z, _ and space) really threw me off. I'm sure there's a better way to do this:

:s/[^a-zA-Z_ ]//g

\h matches all of that except for the space, but I can't figure out how to put the metachar in a range. If anyone has tips, I'd love to hear 'em.
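
The overall idea of the macro (take each word in turn and delete every later word with the same initial) can be sketched in Python. This is not a literal translation of the keystrokes: the sketch compares first characters only, and the name vim_style is made up. It does keep the first substitution as written, so digits are stripped along with punctuation.

```python
import re

def vim_style(line: str) -> str:
    # Same character class as the first :s command: everything except
    # letters, underscore and space is removed (digits included).
    line = re.sub(r"[^a-zA-Z_ ]", "", line)
    words = line.split()
    i = 0
    while i < len(words):
        initial = words[i][0].lower()
        # Drop every later word that starts with the same character.
        words[i + 1:] = [w for w in words[i + 1:] if w[0].lower() != initial]
        i += 1
    return " ".join(words)

print(vim_style("Look ^_^ .... There are 3 little dogs :)"))
```

Since the digit "3" is removed in the first step, this sample comes out as "Look _ There are dogs" rather than the expected "Look _ There are 3 dogs".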

Why a-zA-Z_ and not \w? Digits are valid. – edc65 1 hour ago

Katenkyo · Answer 9 · 2016-04-07 07:15:11Z

Lua, 172 Bytes

It ended up way longer than I wanted...

t={}(...):gsub("[%w_]+",function(w)b=nil for i=1,#t
do b=t[i]:sub(1,1):lower()==w:sub(1,1):lower()and 1 or b
end t[#t+1]=not b and w or nil end)print(table.concat(t," "))

Ungolfed

t={}                           -- initialise the accepted words list
(...):gsub("[%w_]+",function(w)-- iterate over each group of alphanumericals and underscores
  b=nil                        -- initialise b (boolean->do we have this letter or not)
  for i=1,#t                   -- iterate over t
  do
    b=t[i]:sub(1,1):lower()    -- compare the first char of the i-th word in t
       ==w:sub(1,1):lower()    -- with the first char of the current word
           and 1               -- if they are equal, set b to 1
           or b                -- else, don't change it
  end
  t[#t+1]=not b and w or nil   -- insert w into t if b isn't set
end)

print(table.concat(t," "))     -- print the content of t separated by spaces

malik · Answer 10 · 2016-04-07 07:20:30Z

C# (LINQPAD) - 136 bytes

var w=Util.ReadLine().Split(' ');var l=string.Join(" ",w.Select(s=>w.First(f=>Regex.IsMatch(""+f[0],"(?i)"+s[0]))).Distinct());l.Dump();
