Until I answered a question wrong yesterday, I didn't know that the _content field in the index holds the content of all indexable fields in Sitecore. You would think that would be the first thing you learn about Sitecore search. Whoops.

So now I want to understand how it works. Looking through all the config, I can only find the computed index field below, and after decompiling it, it does not seem to index all indexable fields.

<field fieldName="_content" type="Sitecore.ContentSearch.ComputedFields.MediaItemContentExtractor,Sitecore.ContentSearch">
    <mediaIndexing ref="contentSearch/indexConfigurations/defaultLuceneIndexConfiguration/mediaIndexing"/>
</field>

I am looking for the config setting that makes the _content field do what it does.

It's hard to know everything about Sitecore, but as we get more experience it's humbling when you stumble upon something like this. But nothing wrong with that :-) I knew about it but didn't know really how it worked, so thanks for asking! – Dylan Young 2 hours ago
Accepted answer (7 votes):

There is no config which you can change to adapt the logic. It's hardcoded.

Sitecore.Search.Crawlers.DatabaseCrawler adds content to the _content field.

In the AddAllFields method there is:

item.Fields.ReadAll();
foreach (Sitecore.Data.Fields.Field field in item.Fields)
{
    bool tokenize = this.IsTextField(field);
    FieldCrawlerBase fieldCrawler = FieldCrawlerFactory.GetFieldCrawler(field);

    if (this.IndexAllFields)
        document.Add((IFieldable) this.CreateField(field.Key, fieldCrawler.GetValue(), tokenize, 1f));
    if (tokenize)
        document.Add((IFieldable) this.CreateField(BuiltinFields.Content, fieldCrawler.GetValue(), true, 1f));
}

So after it adds the crawled value to the proper document field, it also adds the value to the _content field (if IsTextField returned true).

This logic is skipped if the "Exclude From Text Search - Turn on to ignore the contents of this field in the full text search index" box is checked on the field (copied from Chris Auer's comment).

And here is the list of field types that are considered text fields:

"Single-Line Text"

"Rich Text"

"Multi-Line Text"

"text"

"rich text"

"html"

"memo"

"Word Document"
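
To make the rule concrete, here is a minimal standalone sketch of the decision described above. It does not use any Sitecore types: SketchField, ContentFieldSketch and BuildContent are hypothetical names made up for illustration, and both the case-insensitive type matching and the way the "Exclude From Text Search" flag is combined with the type check are assumptions, not decompiled code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a Sitecore field; NOT a real Sitecore type.
public sealed record SketchField(string Key, string Type, string Value, bool ExcludeFromTextSearch = false);

public static class ContentFieldSketch
{
    // The field types listed in the answer. Matching case-insensitively is an assumption.
    private static readonly HashSet<string> TextFieldTypes = new(StringComparer.OrdinalIgnoreCase)
    {
        "Single-Line Text", "Rich Text", "Multi-Line Text",
        "text", "rich text", "html", "memo", "Word Document"
    };

    // A field counts as text if its type is in the list and it is not excluded
    // from text search (assumed reading of the behaviour described above).
    public static bool IsTextField(SketchField field) =>
        TextFieldTypes.Contains(field.Type) && !field.ExcludeFromTextSearch;

    // Collects the values that would be appended to _content, in field order.
    public static List<string> BuildContent(IEnumerable<SketchField> fields)
    {
        var content = new List<string>();
        foreach (var field in fields)
        {
            if (IsTextField(field))
            {
                content.Add(field.Value);
            }
        }
        return content;
    }
}
```

Passing a list of SketchField values to ContentFieldSketch.BuildContent returns only the values of text-type fields that are not excluded, which mirrors the `if (tokenize)` branch in the crawler loop.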


Also copied from Artsem Prashkovich's comment so it's visible:

In addition, the first definition of the _content field is in the Sitecore.ContentSearch.AbstractDocumentBuilder<T> class, defined in Sitecore.ContentSearch.dll. It has an AddSpecialFields method with the following code:

this.AddSpecialField("_content", (object) indexable3.Name, false);

this.AddSpecialField("_content", (object) indexable3.DisplayName, false);

This means the Name and DisplayName of the item are also in the _content field.

In addition, the first definition of the _content field is in the Sitecore.ContentSearch.AbstractDocumentBuilder<T> class, defined in Sitecore.ContentSearch.dll. It has an AddSpecialFields method with the following code: this.AddSpecialField("_content", (object) indexable3.Name, false); this.AddSpecialField("_content", (object) indexable3.DisplayName, false); This means the Name and DisplayName of the item are also in the _content field. – Artsem Prashkovich 2 hours ago
    
Thanks @ArtsemPrashkovich ! Always good to learn something new! – Marek Musielak 2 hours ago
    
And I am assuming that if I check the "Exclude From Text Search - Turn on to ignore the contents of this field in the full text search index" box on the field, all this logic is ignored. – Chris Auer 2 hours ago
@ChrisAuer you're absolutely right! I will update the answer. – Marek Musielak 2 hours ago
