Case insensitive 'Contains(string)'

Question

Is there a way to make the following return true?

string title = "ASTRINGTOTEST";
title.Contains("string");

There doesn't seem to be an overload that allows me to set the case sensitivity. Currently I uppercase them both, but that's just silly.

UPDATE
The silliness I refer to is the i18n issues that come with up- and down-casing.

UPDATE
This question is ancient, and since then I have realized that I asked for a simple answer to a really vast and difficult topic, if you care to investigate it fully.
For most cases, in monolingual English code bases, the String.IndexOf answer with StringComparison.OrdinalIgnoreCase will suffice. I suspect it is the most popular answer because most people coming here fall into this category.
The answer based on CultureInfo.CompareInfo, however, brings up the inherent problem that we can't compare text case-insensitively until we know that both texts are in the same culture and we know what that culture is. It is maybe a less popular answer, but I think it is more correct, and that's why I marked it as accepted.

How is it silly? Do you mean that you're doing 2 passes on the string? I would think case-insensitive comparisons merely combine the two steps. — Calyth, Jan 14 '09 at 21:44
Since I will use it on the worldwebz i have to take foreign characters into account. As mentioned in an answer below, upcasing as well as downcasing gives internationalization issues. — Boris Callens, Jan 15 '09 at 14:15
Upper-casing both strings is silly, because you create two new strings and then still perform a case-sensitive search. There is unnecessary additional processing and memory involved in creating new strings like that, especially if you're searching through a set of strings and you upper-case the search or source terms redundantly. The IndexOf method that allows the specification of a StringComparison value is better. — Triynko, May 3 '11 at 20:39
@ColonelPanic: Correct. If you know the culture, this becomes less of a problem. But often, you either don't know or don't care. — Boris Callens, Mar 18 '13 at 7:58

Colonel Panic · Accepted Answer · 2016-02-26 09:44:17Z

up vote 791 down vote accepted

To test if the string paragraph contains the string word (thanks @QuarterMeister)

culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0

Where culture is the instance of CultureInfo describing the language that the text is written in.

This solution is transparent about the definition of case-insensitivity, which is language dependent. For example, the English language uses the characters I and i for the upper and lower case versions of the ninth letter, whereas the Turkish language uses these characters for the eleventh and twelfth letters of its 29 letter-long alphabet. The Turkish upper case version of 'i' is the unfamiliar character 'İ'.

Thus the strings tin and TIN are the same word in English, but different words in Turkish. As I understand, one means 'spirit' and the other is an onomatopoeia word. (Turks, please correct me if I'm wrong, or suggest a better example)

To summarise, you can only answer the question 'are these two strings the same but in different cases' if you know what language the text is in. If you don't know, you'll have to take a punt. Given English's hegemony in software, you should probably resort to CultureInfo.InvariantCulture, because it'll be wrong in familiar ways.
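The effect of the culture argument can be sketched as follows. This is a minimal illustration, not part of the original answer: the class and method names (CultureAwareSearch, ContainsIgnoreCase) are hypothetical, and the only framework call used is the CompareInfo.IndexOf overload shown above.

```csharp
using System;
using System.Globalization;

public static class CultureAwareSearch
{
    // Hypothetical helper: wraps the CompareInfo.IndexOf call from the answer
    // so that the culture is always passed explicitly by the caller.
    public static bool ContainsIgnoreCase(string paragraph, string word, CultureInfo culture)
    {
        return culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0;
    }
}
```

Calling ContainsIgnoreCase("TIN", "tin", CultureInfo.GetCultureInfo("en-US")) treats I and i as a case pair, while passing CultureInfo.GetCultureInfo("tr-TR") applies the Turkish casing rules described above, so the two calls may disagree. The exact result depends on the globalization data available to the runtime.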

answered Mar 17 '13 at 18:22 by Colonel Panic; edited Feb 26 '16 at 9:44

45

Why not culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0? That uses the right culture and is case-insensitive, it doesn't allocate temporary lowercase strings, and it avoids the question of whether converting to lowercase and comparing is always the same as a case-insensitive comparison. – Quartermeister Mar 18 '13 at 15:32

9

This solution also needlessly pollutes the heap by allocating memory for what should be a searching function – JaredPar Mar 18 '13 at 16:09

14

Comparing with ToLower() will give different results from a case-insensitive IndexOf when two different letters have the same lowercase letter. For example, calling ToLower() on either U+0398 "Greek Capital Letter Theta" or U+03F4 "Greek Capital Letter Theta Symbol" results in U+03B8, "Greek Small Letter Theta", but the capital letters are considered different. Both solutions consider lowercase letters with the same capital letter different, such as U+0073 "Latin Small Letter S" and U+017F "Latin Small Letter Long S", so the IndexOf solution seems more consistent. – Quartermeister Mar 18 '13 at 17:47

3

@Quartermeister - and BTW, I believe .NET 2 and .NET 4 behave differently on this, as .NET 4 always uses NORM_LINGUISTIC_CASING while .NET 2 did not (this flag appeared with Windows Vista). – Simon Mourier Mar 23 '13 at 8:13

11

+1 for completeness - answers with a proper form of explanation are the only way users will actually learn from SO – TheGeekZn May 5 '14 at 12:57


Sergey Brunov · Answer 2 · 2015-05-13 00:44:19Z

up vote 1970 down vote

You could use the String.IndexOf Method and pass StringComparison.OrdinalIgnoreCase as the type of search to use:

string title = "STRING";
bool contains = title.IndexOf("string", StringComparison.OrdinalIgnoreCase) >= 0;

Even better is defining a new extension method for string:

public static class StringExtensions
{
    public static bool Contains(this string source, string toCheck, StringComparison comp)
    {
        return source.IndexOf(toCheck, comp) >= 0;
    }
}

...

string title = "STRING";
bool contains = title.Contains("string", StringComparison.OrdinalIgnoreCase);
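On newer runtimes the extension method may not be needed at all. The sketch below assumes .NET Core 2.1 or later (including .NET 5+), where string.Contains(string, StringComparison) is a built-in instance method; it is not available on .NET Framework. The class and method names are hypothetical.

```csharp
using System;

public static class BuiltInContainsDemo
{
    // Assumption: the target runtime is .NET Core 2.1+ / .NET 5+, where
    // string.Contains(string, StringComparison) exists as an instance method.
    public static bool TitleContains(string title, string value)
    {
        return title.Contains(value, StringComparison.OrdinalIgnoreCase);
    }
}
```

If an extension method with the same signature is also in scope, the compiler prefers the instance method, so existing call sites should keep compiling.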

answered Jan 14 '09 at 21:44 by JaredPar; edited May 13 '15 at 0:44 by Sergey Brunov

15

Indeed does look like the best way to go. Weird that such a thing is not standard framework. Thx. – Boris Callens Jan 15 '09 at 14:17

147

@boris: help make it part of the framework: vote here: connect.microsoft.com/VisualStudio/feedback/details/435324/… – Ian Mercer Jul 28 '10 at 18:00

6

This gives the same answer as paragraph.ToLower(culture).Contains(word.ToLower(culture)) with CultureInfo.InvariantCulture and it doesn't solve any localisation issues. Why over complicate things? stackoverflow.com/a/15464440/284795 – Colonel Panic Mar 17 '13 at 18:52

32

@ColonelPanic the ToLower version includes 2 allocations which are unnecessary in a comparison / search operation. Why needlessly allocate in a scenario that doesn't require it? – JaredPar Mar 18 '13 at 16:09

2

@Seabiscuit that won't work because string is an IEnumerable<char> hence you can't use it to find substrings – JaredPar Nov 6 '14 at 17:55


Liam · Answer 3 · 2015-09-15 15:45:31Z

You can use IndexOf() like this:

string title = "STRING";

if (title.IndexOf("string", 0, StringComparison.CurrentCultureIgnoreCase) != -1)
{
    // The string exists in the original
}

Since 0 (zero) can be an index, you check against -1.

MSDN

The zero-based index position of value if that string is found, or -1 if it is not. If value is String.Empty, the return value is 0.
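The return values quoted above can be checked with a short sketch. StringComparison.OrdinalIgnoreCase is used here instead of the current culture so that the results do not depend on the machine's settings; the class name is hypothetical.

```csharp
using System;

public static class IndexOfReturnValues
{
    public static void Demo()
    {
        const StringComparison cmp = StringComparison.OrdinalIgnoreCase;

        Console.WriteLine("STRING".IndexOf("string", 0, cmp)); // 0: match at the first position
        Console.WriteLine("STRING".IndexOf("ring", 0, cmp));   // 2: match further in
        Console.WriteLine("STRING".IndexOf("xyz", 0, cmp));    // -1: not found
        Console.WriteLine("STRING".IndexOf("", 0, cmp));       // 0: the empty string is always found
    }
}
```

This is why the test must be != -1 (or >= 0) rather than > 0: a match at the start of the string legitimately returns 0.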

Jed · Answer 4 · 2015-07-10 14:14:32Z

up vote 96 down vote

Alternate solution using Regex:

bool contains = Regex.IsMatch("StRiNG to search", "string", RegexOptions.IgnoreCase);

Notice

As @cHao has pointed out in his comment, there are scenarios that will cause this solution to return incorrect results. Make sure you know what you're doing before you implement this solution haphazardly.

answered Jul 28 '10 at 17:18 by Jed; edited Jul 10 '15 at 14:14

2

Good idea. RegexOptions values can also be combined bitwise, e.g. RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.CultureInvariant, in case that helps anyone. – Saravanan Aug 24 '11 at 4:36

7

Must say I prefer this method although using IsMatch for neatness. – wonea Sep 7 '11 at 17:40

19

What's worse, since the search string is interpreted as a regex, a number of punctuation chars will cause incorrect results (or trigger an exception due to an invalid expression). Try searching for "." in "This is a sample string that doesn't contain the search string". Or try searching for "(invalid", for that matter. – cHao Sep 9 '11 at 13:28

7

@cHao: In that case, Regex.Escape could help. Regex still seems unnecessary when IndexOf / extension Contains is simple (and arguably more clear). – Dan Mangiarelli Sep 9 '11 at 16:44

3

Note that I was not implying that this Regex solution was the best way to go. I was simply adding to the list of answers to the original posted question "Is there a way to make the following return true?". – Jed Sep 13 '11 at 15:43
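The Regex.Escape suggestion from the comments can be sketched like this. The helper name ContainsLiteral is hypothetical; Regex.Escape, Regex.IsMatch and the RegexOptions flags are the standard System.Text.RegularExpressions members.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexContains
{
    // Hypothetical helper: escapes the search text so that characters such as
    // '.' or '(' are matched literally instead of being parsed as a pattern.
    public static bool ContainsLiteral(string input, string value)
    {
        return Regex.IsMatch(
            input,
            Regex.Escape(value),
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }
}
```

With this wrapper, searching for "." in "This is a sample string that doesn't contain the search string" returns false, and searching for "(invalid" returns false instead of throwing. It remains heavier than the IndexOf approach, so it is probably only worth using when a regex is needed anyway.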


fubo · Answer 5 · 2015-06-05 06:02:45Z

up vote 45 down vote

One issue with the answer is that it will throw an exception if a string is null. You can add that as a check so it won't:

public static bool Contains(this string source, string toCheck, StringComparison comp)
{
    if (string.IsNullOrEmpty(toCheck) || string.IsNullOrEmpty(source))
        return true;

    return source.IndexOf(toCheck, comp) >= 0;
}

answered Dec 7 '10 at 21:11 by FeiBao 飞豹; edited Jun 5 '15 at 6:02 by fubo

7

If toCheck is the empty string it needs to return true per the Contains documentation: "true if the value parameter occurs within this string, or if value is the empty string (""); otherwise, false." – amurra Feb 16 '11 at 16:13

3

Based on amurra's comment above, doesn't the suggested code need to be corrected? And shouldn't this be added to the accepted answer, so that the best response is first? – David White Aug 30 '11 at 3:43

8

Now this will return true if source is an empty string or null no matter what toCheck is. That cannot be correct. Also IndexOf already returns true if toCheck is an empty string and source is not null. What is needed here is a check for null. I suggest if (source == null || value == null) return false; – Colin Jul 1 '13 at 12:21

The source can't be null – I'm Blue Da Ba Dee Dec 14 '16 at 16:55
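The correction suggested in the comments could look roughly like the following. It is a sketch of one reasonable reading of that suggestion, not the answer author's code; the method is named ContainsSafe (a hypothetical name) so that it does not clash with the other Contains overloads on this page.

```csharp
using System;

public static class NullSafeStringExtensions
{
    // Hypothetical name, chosen to avoid clashing with other Contains overloads.
    public static bool ContainsSafe(this string source, string toCheck, StringComparison comp)
    {
        // A null on either side is treated as "no match" instead of throwing.
        if (source == null || toCheck == null)
            return false;

        // IndexOf returns 0 for an empty toCheck, so "" counts as contained,
        // which matches the documented behaviour of string.Contains.
        return source.IndexOf(toCheck, comp) >= 0;
    }
}
```

Because extension methods are static calls, ((string)null).ContainsSafe("x", StringComparison.OrdinalIgnoreCase) returns false rather than throwing a NullReferenceException. Whether returning false or throwing ArgumentNullException is the better contract is a design choice.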


Ed S. · Answer 6 · 2009-01-14 21:54:21Z

up vote 40 down vote

You could always just up or downcase the strings first.

string title = "string";
bool contains = title.ToUpper().Contains("STRING");  // true

Oops, just saw that last bit. A case-insensitive compare would *probably* do the same anyway, and if performance is not an issue, I don't see a problem with creating uppercase copies and comparing those. I could have sworn that I once saw a case-insensitive compare...

answered Jan 14 '09 at 21:42 by Ed S.; edited Jan 14 '09 at 21:54

1

Interestingly, I've seen ToUpper() recommended over the use of ToLower() in this sort of scenario, because apparently ToLower() can "lose fidelity" in certain cultures - that is, two different upper-case characters translate to the same lower-case character. – Matt Hamilton Jan 14 '09 at 21:47

72

Search for "Turkey test" :) – Jon Skeet Jan 14 '09 at 21:48

3

In some French locales, uppercase letters don't have the diacritics, so ToUpper() may not be any better than ToLower(). I'd say use the proper tools if they're available - case-insensitive compare. – Blair Conrad Jan 14 '09 at 22:03

3

Don't use ToUpper or ToLower, and do what Jon Skeet said – Peter Gfader Aug 21 '09 at 2:49

8

Just saw this again after two years and a new downvote... anyway, I agree that there are better ways to compare strings. However, not all programs will be localized (most won't) and many are internal or throwaway apps. Since I can hardly expect credit for advice best left for throwaway apps... I'm moving on :D – Ed S. Jan 25 '11 at 7:28


abatishchev · Answer 7 · 2011-01-25 06:58:32Z

StringExtension class is the way forward, I've combined a couple of the posts above to give a complete code example:

public static class StringExtensions
{
    /// <summary>
    /// Allows case insensitive checks
    /// </summary>
    public static bool Contains(this string source, string toCheck, StringComparison comp)
    {
        return source.IndexOf(toCheck, comp) >= 0;
    }
}

johnnyRose · Answer 8 · 2016-07-19 19:22:36Z

up vote 23 down vote

This is clean and simple.

Regex.IsMatch(file,fileNamestr,RegexOptions.IgnoreCase)

answered Nov 9 '12 at 4:25 by takirala; edited Jul 19 '16 at 19:22 by johnnyRose

15

This will match against a pattern, though. In your example, if fileNamestr has any special regex characters (e.g. *, +, ., etc.) then you will be in for quite a surprise. The only way to make this solution work like a proper Contains function is to escape fileNamestr by doing Regex.Escape(fileNamestr). – XåpplI'-I0llwlg'I - Feb 3 '13 at 15:18


Fabian Bigler · Answer 9 · 2015-07-23 10:30:30Z

OrdinalIgnoreCase, CurrentCultureIgnoreCase or InvariantCultureIgnoreCase?

Since this is missing, here are some recommendations about when to use which one:

Dos

Use StringComparison.OrdinalIgnoreCase for comparisons as your safe default for culture-agnostic string matching.
Use StringComparison.OrdinalIgnoreCase comparisons for increased speed.
Use StringComparison.CurrentCulture-based string operations when displaying the output to the user.
Switch current use of string operations based on the invariant culture to use the non-linguistic StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase when the comparison is linguistically irrelevant (symbolic, for example).
Use ToUpperInvariant rather than ToLowerInvariant when normalizing strings for comparison.

Don'ts

Use overloads for string operations that don't explicitly or implicitly specify the string comparison mechanism.
Use StringComparison.InvariantCulture-based string operations in most cases; one of the few exceptions would be persisting linguistically meaningful but culturally-agnostic data.

Based on these rules you should use:

string title = "STRING";
if (title.IndexOf("string", 0, StringComparison.[YourDecision]) != -1)
{
    // The string exists in the original
}

where [YourDecision] depends on the recommendations above.

link of source: http://msdn.microsoft.com/en-us/library/ms973919.aspx
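As a rough illustration of how the recommendations might translate into concrete choices, the sketch below fills in [YourDecision] for two common situations. The class, method and parameter names are hypothetical, and the mapping of situation to comparison type is one reading of the guidelines, not an authoritative rule.

```csharp
using System;

public static class ComparisonChoice
{
    // Symbolic data (protocol keywords, identifiers, keys): no linguistic rules
    // are wanted, so the ordinal, case-insensitive comparison is used.
    public static bool ContainsKeyword(string header, string keyword)
    {
        return header.IndexOf(keyword, 0, StringComparison.OrdinalIgnoreCase) != -1;
    }

    // Text typed by and displayed to the user: follow the user's current culture.
    public static bool MatchesUserQuery(string displayedText, string query)
    {
        return displayedText.IndexOf(query, 0, StringComparison.CurrentCultureIgnoreCase) != -1;
    }
}
```

The second method can give different results on machines with different culture settings, which is the intended behaviour for user-facing text but a likely source of bugs for symbolic data.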

serhio · Answer 10 · 2011-09-09 13:55:07Z

up vote 10 down vote

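I know that this is not C#, but the framework (via VB.NET) already has such a function:

Dim str As String = "UPPERlower"
' InStr returns a 1-based position (0 if not found); CompareMethod.Text ignores case.
Dim b As Boolean = InStr(str, "UpperLower", CompareMethod.Text) > 0

C# variant:

string myString = "Hello World";
bool contains = Microsoft.VisualBasic.Strings.InStr(myString, "world", Microsoft.VisualBasic.CompareMethod.Text) > 0;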

answered Sep 9 '11 at 13:23 by serhio; edited Sep 9 '11 at 13:55

Do you also know how it works internally? – Boris Callens Mar 18 '13 at 8:12


Casey · Answer 11 · 2013-12-06 14:11:23Z

The InStr method from the VisualBasic assembly is the best if you have a concern about internationalization (or you could reimplement it). Looking at it in dotPeek shows that it accounts not only for upper and lower case, but also for kana type and full- vs. half-width characters (mostly relevant for Asian languages, although there are full-width versions of the Roman alphabet too). I'm skipping over some details, but check out the private method InternalInStrText:

private static int InternalInStrText(int lStartPos, string sSrc, string sFind)
{
  int num = sSrc == null ? 0 : sSrc.Length;
  if (lStartPos > num || num == 0)
    return -1;
  if (sFind == null || sFind.Length == 0)
    return lStartPos;
  else
    return Utils.GetCultureInfo().CompareInfo.IndexOf(sSrc, sFind, lStartPos, CompareOptions.IgnoreCase | CompareOptions.IgnoreKanaType | CompareOptions.IgnoreWidth);
}
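The same comparison options can be used from C# without referencing the VisualBasic assembly. The following is a sketch modelled on the decompiled method above, with two deliberate differences that are assumptions of this example: the culture is passed in explicitly instead of coming from Utils.GetCultureInfo(), and the result is a bool rather than a position. The names LooseSearch and ContainsLoose are hypothetical.

```csharp
using System;
using System.Globalization;

public static class LooseSearch
{
    // The same flag combination that InternalInStrText passes to CompareInfo.IndexOf.
    private const CompareOptions Loose =
        CompareOptions.IgnoreCase | CompareOptions.IgnoreKanaType | CompareOptions.IgnoreWidth;

    public static bool ContainsLoose(string source, string value, CultureInfo culture)
    {
        // Mirrors the decompiled logic: an empty source never matches,
        // and an empty search string always matches a non-empty source.
        if (string.IsNullOrEmpty(source)) return false;
        if (string.IsNullOrEmpty(value)) return true;

        return culture.CompareInfo.IndexOf(source, value, Loose) >= 0;
    }
}
```

With these flags, full-width Latin letters and half-width katakana are intended to compare equal to their ordinary counterparts; how exactly that plays out depends on the globalization data the runtime uses.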

johnnyRose · Answer 12 · 2016-07-19 19:23:43Z

up vote 8 down vote

Just like this:

string s="AbcdEf";
if(s.ToLower().Contains("def"))
{
    Console.WriteLine("yes");
}

answered Jul 13 '14 at 9:54 by cdytoby; edited Jul 19 '16 at 19:23 by johnnyRose

This is not culture-specific and may fail for some cases. culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) should be used. – hikalkan Jul 22 '14 at 7:50

Why avoid string.ToLower() when doing case-insensitive string comparisons? Tl;Dr It's costly because a new string is "manufactured". – Liam Oct 10 '16 at 10:00


Peter Mortensen · Answer 13 · 2013-07-08 08:10:58Z

up vote 7 down vote

Use this:

string.Compare("string", "STRING", new System.Globalization.CultureInfo("en-US"), System.Globalization.CompareOptions.IgnoreCase);

answered Jul 11 '11 at 7:53 by mr.martan; edited Jul 8 '13 at 8:10 by Peter Mortensen

23

The questioner is looking for Contains not Compare. – DuckMaestro Jul 11 '11 at 8:05

@DuckMaestro, the accepted answer is implementing Contains with IndexOf. So this approach is equally helpful! The C# code example on this page is using string.Compare(). SharePoint team's choice that is! – vulcan raven Jan 5 '13 at 10:07


Jodrell · Answer 14 · 2014-12-04 08:55:50Z

Ultimately, a generic "contains" operation comes down to a function like this,

/// <summary>
/// Determines whether the source contains the sequence.
/// </summary>
/// <typeparam name="T">The type of the items in the sequences.</typeparam>
/// <param name="sourceEnumerator">The source enumerator.</param>
/// <param name="sequenceEnumerator">The sequence enumerator.</param>
/// <param name="equalityComparer">An equality comparer.</param>
/// <remarks>
/// An empty sequence will return <c>true</c>.
/// The sequence must support <see cref="IEnumerator.Reset"/>
/// if it does not begin the source.
/// </remarks>
/// <returns>
/// <c>true</c> if the source contains the sequence;
/// otherwise <c>false</c>.
/// </returns>
public static bool Contains<T>(
    IEnumerator<T> sourceEnumerator,
    IEnumerator<T> sequenceEnumerator,
    IEqualityComparer<T> equalityComparer)
{
    if (equalityComparer == null)
    {
        equalityComparer = EqualityComparer<T>.Default;
    }

    while (sequenceEnumerator.MoveNext())
    {
        if (sourceEnumerator.MoveNext())
        {
            if (!equalityComparer.Equals(
                sourceEnumerator.Current,
                sequenceEnumerator.Current))
            {
                sequenceEnumerator.Reset();
            }
        }
        else
        {
            return false;
        }
    }

    return true;
}

This can be trivially wrapped in an extension version accepting IEnumerable, like this:

public static bool Contains<T>(
        this IEnumerable<T> source,
        IEnumerable<T> sequence,
        IEqualityComparer<T> equalityComparer = null)
{
    if (sequence == null)
    {
        throw new ArgumentNullException("sequence");
    }

    using(var sequenceEnumerator = sequence.GetEnumerator())
    using(var sourceEnumerator = source.GetEnumerator())
    {
        return Contains(
            sourceEnumerator,
            sequenceEnumerator,
            equalityComparer);
    }
}

Now, this will work for the ordinal comparison of any sequences, including strings, since string implements IEnumerable<char>,

// The optional parameter ensures the generic overload is invoked
// not the string.Contains() implementation.
"testable".Contains("est", EqualityComparer<char>.Default)

However, as we know, strings are not generic, they are specialized. There are two key factors at play.

The "casing" issue which itself has various language dependent edge cases.
The rather involved issue of how a set of "Text Elements" (letters/numbers/symbols etc.) are represented by Unicode Code Points and what valid sequences of chars can represent a given string, details are expanded in these answers.

The net effect is the same. Strings that you might assert are linguistically equal can be validly represented by different combinations of chars. What's more, the rules for validity change between cultures.

All this leads to a specialized string based "Contains" implementation like this.

using System.Globalization;

public static bool Contains(
         this string source,
         string value,
         CultureInfo culture = null,
         CompareOptions options = CompareOptions.None)
{
    if (value == null)
    {
        throw new ArgumentNullException("value");
    }

    var compareInfo = culture == null ? 
            CultureInfo.CurrentCulture.CompareInfo :
            culture.CompareInfo;

    var sourceEnumerator = StringInfo.GetTextElementEnumerator(source);
    var sequenceEnumerator = StringInfo.GetTextElementEnumerator(value);

    while (sequenceEnumerator.MoveNext())
    {
        if (sourceEnumerator.MoveNext())
        {
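            // TextElementEnumerator.Current is typed as object; GetTextElement() returns the string.
            if (!(compareInfo.Compare(
                    sourceEnumerator.GetTextElement(),
                    sequenceEnumerator.GetTextElement(),
                    options) == 0))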
            {
                sequenceEnumerator.Reset();
            }
        }
        else
        {
            return false;
        }
    }

    return true;
}

This function can be used to perform a case-insensitive, culture-specific "contains" that will work whatever the normalization of the strings, e.g.

"testable".Contains("EST", CultureInfo.CurrentCulture, CompareOptions.IgnoreCase)

Interesting code; unfortunately it does not work as expected. "ssstring".Contains("sstring") == True, but with your code Contains("ssstring", "sstring", null) == False. Simply put, resetting the sequence (sequenceEnumerator.Reset();) is not the only operation you need to perform. You also need to return the sourceEnumerator to the position after the first match. Symbolic code: if (compareInfo.Compare(...) == 0) { if (firstMatch) { sourceEnumerator.PushPosition(); firstMatch = false; } }else{ sequenceEnumerator.Reset(); if (!firstMatch) { sourceEnumerator.PopPosition(); firstMatch = true; } } – Julo, Jan 27 at 9:47
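One way to address the backtracking problem described in this comment is to materialise the text elements first and retry the match from every starting position. This is a simple sketch, not the answer author's code: it trades extra allocations and a worst-case quadratic scan for straightforward correctness, and the names TextElementSearch, Elements and ContainsElements are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TextElementSearch
{
    // Splits a string into text elements as defined by StringInfo.
    private static List<string> Elements(string s)
    {
        var result = new List<string>();
        TextElementEnumerator e = StringInfo.GetTextElementEnumerator(s);
        while (e.MoveNext())
        {
            result.Add(e.GetTextElement());
        }
        return result;
    }

    public static bool ContainsElements(
        string source, string value, CultureInfo culture, CompareOptions options)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (value == null) throw new ArgumentNullException(nameof(value));

        CompareInfo compareInfo = (culture ?? CultureInfo.CurrentCulture).CompareInfo;
        List<string> src = Elements(source);
        List<string> seq = Elements(value);

        // Try every start position, so a failed partial match never skips candidates.
        for (int start = 0; start + seq.Count <= src.Count; start++)
        {
            int i = 0;
            while (i < seq.Count &&
                   compareInfo.Compare(src[start + i], seq[i], options) == 0)
            {
                i++;
            }
            if (i == seq.Count) return true; // also true when value is empty
        }
        return false;
    }
}
```

Because it compares one text element at a time, this can still disagree with CompareInfo.IndexOf in cases where a culture treats a single element as equal to several (or the reverse), so for production use the CompareInfo.IndexOf approach from the accepted answer is probably the safer default.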

TarmoPikaro · Answer 15 · 2015-10-17 07:46:58Z

This is quite similar to other examples here, but I've decided to simplify the enum to a bool, primarily because the other alternatives are normally not needed. Here is my example:

public static class StringExtensions
{
    public static bool Contains(this string source, string toCheck, bool bCaseInsensitive )
    {
        return source.IndexOf(toCheck, bCaseInsensitive ? StringComparison.OrdinalIgnoreCase : StringComparison.Ordinal) >= 0;
    }
}

And usage is something like:

if( "main String substring".Contains("SUBSTRING", true) )
....

johnnyRose · Answer 16 · 2016-07-19 19:23:19Z

up vote 3 down vote

Using a Regex is a straightforward way to do this:

Regex.IsMatch(title, "string", RegexOptions.IgnoreCase);

answered Sep 18 '13 at 13:08 by Stend; edited Jul 19 '16 at 19:23 by johnnyRose

3

Your answer is exactly the same as guptat59's but, as was pointed out on his answer, this will match a regular expression, so if the string you're testing contains any special regex characters it will not yield the desired result. – Casey Dec 9 '13 at 22:55

1

This is a straight up copy of this answer and suffers from the same issues as noted in that answer – Liam Oct 10 '16 at 10:04


Mr.B · Answer 17 · 2016-10-19 12:08:48Z

The trick here is to look for the string, ignoring case, but to keep it exactly the same (with the same case).

var s = "Factory Reset";
var txt = "reset";
int first = s.IndexOf(txt, StringComparison.InvariantCultureIgnoreCase);
// IndexOf returns -1 when txt is absent, so guard before calling Substring.
var subString = first >= 0 ? s.Substring(first, txt.Length) : null;

Output is "Reset"

Tamilselvan K · Answer 18 · 2016-10-26 14:34:14Z

up vote 0 down vote

return "strcmpstring1".IndexOf("strcmpstring2", StringComparison.CurrentCultureIgnoreCase) >= 0;

answered Oct 26 '16 at 14:34 by Tamilselvan K

FelixSFD · Answer 19 · 2016-12-11 13:41:20Z

up vote 0 down vote

You can use the string.IndexOf() method. It is case-insensitive only if you pass a StringComparison value such as StringComparison.OrdinalIgnoreCase.

answered Dec 11 '16 at 13:39 by Okan SARICA; edited Dec 11 '16 at 13:41 by FelixSFD

asked	8 years ago
viewed	516609 times
active	2 months ago


protected by Shankar Damodaran Jan 15 '14 at 17:55
