Storage Locations for Files Gathered by the Crawl Component (SharePoint 2013)

When gathering files from a content source, the SharePoint 2013 Crawl Component can be a very I/O-intensive process: it writes all of the files it gathers from content repositories to local temporary file paths, and the Content Processing Component reads them from there during document parsing. This post can help you understand where the Crawl Components write these temporary files, which helps with planning and performance troubleshooting (e.g. why does disk performance of my C:\ drive get so bad, or worse, why does the drive fill up, when I start a large crawl?)

By default, all Search data files will be written within the Installation Path

  • The Data Directory (by default, a sub-directory of the Installation Path) specifies the path for all Search data files including those used by I/O intensive components (Crawl, Analytics, and Index Components)
    • The Data Directory can only be configured at the time of installation (i.e. it can only be changed by uninstalling/re-installing SharePoint on the given server)
      • From the Installation Wizard, choose the “File Location” tab as seen below
      • IMPORTANT: Before uninstalling SharePoint, first modify your Search topology by removing any Search components from the applicable server. Once SharePoint is re-installed, you can once again deploy the components back to this server.
    • The defined path can be viewed in the registry:

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\15.0\Search\Setup\DataDirectory

    • Advanced Note: The Index files (by default, written to the Data Directory) path can be configured separately when provisioning an Index Component via PowerShell using the “RootDirectory” parameter
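
As a quick illustration, here is a minimal PowerShell sketch that reads the configured DataDirectory from the registry and shows how the "RootDirectory" parameter could be supplied when provisioning a new Index Component. The server name and index path are hypothetical placeholders, and the topology change is only sketched rather than a full procedure.

# Read the Search data directory configured at install time (run on a Search server)
$dataDir = (Get-ItemProperty "HKLM:\SOFTWARE\Microsoft\Office Server\15.0\Search\Setup").DataDirectory
Write-Host "Search DataDirectory: $dataDir"

# Hypothetical example: provision an Index Component with its files on another drive
$ssa      = Get-SPEnterpriseSearchServiceApplication
$clone    = $ssa.ActiveTopology.Clone()
$instance = Get-SPEnterpriseSearchServiceInstance -Identity "indexServer"   # placeholder server name
New-SPEnterpriseSearchIndexComponent -SearchTopology $clone -SearchServiceInstance $instance `
    -IndexPartition 0 -RootDirectory "E:\SearchIndex"
Set-SPEnterpriseSearchTopology -Identity $clone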

[Image: Installation Wizard "File Location" tab showing the default Installation Path and Data Directory]
(As a side note: the graphic only shows the default locations specified at install time. It is recommended to change these to a path on a drive other than C:\.)

For the Crawl Component:

  • When crawling [gathering] an item, the filter daemon (mssdmn.exe – a child process of the Crawl Component that actually interfaces with an end content repository using a Search Connector/Protocol Handler) will download any applicable file blobs to the SSA’s “TempPath” (e.g. an HTML file, a Word document, a PowerPoint presentation, etc)
    • In the graphic below, this is step 2a
    • The defined path can be viewed either:
      • In the registry (of a Crawl server)

        HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\15.0\Search\Global\Gathering Manager\TempPath

      • Or as a property of the SSA:

        $SSA = Get-SPEnterpriseSearchServiceApplication

        $SSA.TempPath

  • When the filter daemon completes the gathering of an item, it is returned to the Gathering Manager (mssearch.exe – responsible for orchestrating a crawl of a given item) and the applicable blob is moved to the “GathererDataPath”, which is a path relative to the DataDirectory mentioned above.
    • In the graphic below, this occurs in step 2b
    • The defined path can be viewed in the registry (of a Crawl server):

      HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\15.0\Search\Components\-GUID-of-theSSA-crawl-0\GathererDataPath

  • The GathererDataPath is mapped as a network share (used by the Content Processing Components); a PowerShell sketch that reads these paths together follows this list
    • The shared path can be viewed in the registry (of a Crawl server):

      HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\15.0\Search\Components\-GUID-of-theSSA-crawl-0\GathererDataShare
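
As referenced above, the following is a minimal PowerShell sketch that reads these crawl-related paths on a server hosting a Crawl Component. The registry value names are the ones listed above; the component key names (containing the SSA GUID plus "-crawl-0") vary per farm, so the sketch simply enumerates the component keys.

# Run on a server hosting a Crawl Component
$searchRoot = "HKLM:\SOFTWARE\Microsoft\Office Server\15.0\Search"

# SSA-wide temp path used by the filter daemon (mssdmn.exe)
(Get-ItemProperty "$searchRoot\Global\Gathering Manager").TempPath

# Per-crawl-component gatherer paths (key names contain the SSA GUID, e.g. ...-crawl-0)
Get-ChildItem "$searchRoot\Components" | ForEach-Object {
    $props = Get-ItemProperty $_.PSPath
    if ($props.GathererDataPath) {
        Write-Host ("{0}: GathererDataPath={1}  GathererDataShare={2}" -f `
            $_.PSChildName, $props.GathererDataPath, $props.GathererDataShare)
    }
}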

[Image: crawl flow diagram showing steps 1 through 6b between the Crawl Component and the Content Processing Component]
Usage by the Content Processing Components:

  • When the item is fed from the Crawler to the Content Processing Component (step 3 above), the item is only logically submitted to the CPC as a serialized payload of properties representing that particular item; any related blob remains on the Crawler and is retrieved by a later stage in the processing flow
    • For SharePoint list items, there would typically not be a blob (unless the list item had an attachment)
    • For a document in a SharePoint library, the blob would represent the item’s associated file (such as a Word document)
  • During the Document Parsing stage in the processing flow (e.g. during step 4 above), the item’s blob will be retrieved from the Crawl Component via the GathererDataShare
  • When the Crawl Component receives a callback (success or failure) from the CPC (e.g. in step 6b above after an item has been processed), the temporary blob is then deleted from the GathererDataPath

[Image: the GathererDataShare from which the Content Processing Component retrieves blobs]
An example path to an item with DocID 933112 would look like the following:

file://crawlSrv/gthrsvc_7ecdbb10-3c86-4298-ab09-04f61aaeb636-crawl-0//f8/0xe3cf8_1.aspx   

#0xe3cf8 hex = 933112 decimal

Where:

  • crawlSrv is a server running a crawl component
  • gthrsvc_-GUID-of-theSearchAdminWebServiceApp--crawl-0 is the name of the crawl component
    • This GUID can be identified using the following PowerShell:

      $SSA = Get-SPEnterpriseSearchServiceApplication

      $searchAdminWeb = Get-SPServiceApplication -Name $SSA.id

      $searchAdminWeb.id

      7ecdbb10-3c86-4298-ab09-04f61aaeb636

  • And the file name is re-named to the hex value of the DocID (a small conversion sketch follows this list)
    • For example: 0xe3cf8 hex = 933112 decimal
    • Which we can see in ULS, such as:
      • From the Crawl Component (in this case, running on server “faceman”):

        mssearch.exe     SharePoint Server Search Crawler:Content Plugin      af7zf VerboseEx

        CTSDocument: FeedingDocument: properties : strDocID = ssic://933112 key = path values =\\FACEMAN\gthrsvc_7ecdbb10-3c86-4298-ab09-04f61aaeb636-crawl-0\\f8\0xe3cf8.aspx 

      • From the Content Processing Component:

        NodeRunnerContent2-834ebb1f-009    Search    Document Parsing      ai3ef VerboseEx

        AttachDocParser – Parsing: ‘file://faceman/gthrsvc_7ecdbb10-3c86-4298-ab09-04f61aaeb636-crawl-0//f8/0xe3cf8.aspx’
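
As noted above, the temporary file name is simply the DocID rendered in hex. Below is a small PowerShell sketch of that conversion; the share name and subfolder are taken from the example path above and are illustrative only.

# Convert a crawl DocID to the hex value used in the gatherer file name
$docId = 933112
$hex   = [Convert]::ToString($docId, 16)        # -> e3cf8
Write-Host ("DocID {0} -> 0x{1}" -f $docId, $hex)

# Illustrative reconstruction of the temporary path from the example above
$share = "\\crawlSrv\gthrsvc_7ecdbb10-3c86-4298-ab09-04f61aaeb636-crawl-0"
Write-Host ("{0}\f8\0x{1}.aspx" -f $share, $hex)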


SharePoint 2010 Search returns no documents for a specific library in a specific SharePoint site

Hi All,

I came across a situation where a user searching for documents with the "search in same site" option (instead of "all sites") in the search box gets no results, whereas documents from other libraries within the same site can be found.

Why does this happen?

The first explanation that comes to mind for a missing search result is that the content was not crawled and therefore not indexed.

Yes, that is true, but we need to ask why.

On investigation, I found the library configured as shown below:

[Image: library versioning settings showing why draft items are not crawled in SharePoint]

By default, SharePoint only crawls major versions of files, and draft items are only viewable by their creators. SharePoint is behaving as expected out of the box: draft items are not crawled.

Resolution:

This behavior can be altered in Document Library Settings -> Versioning Settings -> Draft Item Security

Select the option “Any user who can read items”.

This will allow all users, including the crawling account, to see draft items.

  • Alternatively, select the "Create major versions" option, or publish the documents as major versions, if the client wants those documents to appear in search results (see the PowerShell sketch below).
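
For illustration, the draft item security setting can also be changed through PowerShell via the list's DraftVersionVisibility property. This is a minimal sketch; the site URL and library name are hypothetical placeholders.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical site and library
$web  = Get-SPWeb "http://portal/sites/teamsite"
$list = $web.Lists["Documents"]

# Reader = "Any user who can read items" (other values: Author, Approver)
$list.DraftVersionVisibility = [Microsoft.SharePoint.DraftVisibilityType]::Reader
$list.Update()
$web.Dispose()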

https://support.microsoft.com/en-us/help/2304855/draft-items-are-not-crawled-in-sharepoint

Crawl error "Processing this item failed because of an unknown error when trying to parse its contents" in SharePoint

During search troubleshooting, I came across the following crawl error in the crawl log of a SharePoint 2013 environment.

Processing this item failed because of an unknown error when trying to parse its contents. (Error parsing document ‘http://********.*****.com/Project/abcd/Q_M/ABX/SitePages/Homepage.aspx’. Sandbox worker pool is closed.; ; SearchID = *******************)

In order to fix this, you can try the following action plan:
Open "Local Policies" in the Local Security Policy console.
Click on "User Rights Assignment".

[Image: the User Rights Assignment node in the Local Security Policy console]

Make sure that the search service account has the following rights:

  • Replace a process level token
  • Adjust memory quotas for a process
  • Impersonate a client after authentication

[Images: the corresponding policy property dialogs in the Local Security Policy console]

Please make sure that the policies don’t get changed afterwards.

After implementing the above changes, clear the SharePoint configuration cache.
After clearing the cache, start a full crawl and the errors should be gone.

SharePoint crawl error "The SharePoint item being crawled returned an error when attempting to download the item" (for example, on .aspx files)

Error:

SharePoint crawl log error: "The SharePoint item being crawled returned an error when attempting to download the item" (observed, for example, on .aspx files)

Solution:

1. Open Regedit on your search server(s).
2. Navigate to this registry key: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Global\Gathering Manager
3. Change the value "UserAgent" from "MSIE 4.01" to "MSIE 8.0" (see the sketch after these steps).
4. Restart the SharePoint Search Service.
5. Open a SharePoint Management Shell.
6. Run Get-SPSessionStateService.
7. If this returns false, deploy one:
Enable-SPSessionStateService -DatabaseName "NameOfDatabase"
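
The registry portion of those steps can also be scripted. The following is a minimal sketch, run on each search server, assuming the same key and value names as above; OSearch14 is the SharePoint 2010 Search service name, so adjust it for your version.

# SharePoint search gatherer settings (14.0 hive, as in the steps above)
$key = "HKLM:\SOFTWARE\Microsoft\Office Server\14.0\Search\Global\Gathering Manager"

# Replace the MSIE 4.01 token in the existing user-agent string with MSIE 8.0
$ua = (Get-ItemProperty $key).UserAgent
Set-ItemProperty -Path $key -Name UserAgent -Value ($ua -replace "MSIE 4\.01", "MSIE 8.0")

# Restart the search service so the new user agent is picked up
Restart-Service OSearch14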

Display templates in SharePoint Server 2013

Display templates for the Content Search Web Part

You can use the following display templates to change the appearance of content that is shown in a Content Search Web Part. These display template files are located in the Content Web Parts subfolder in the Display Templates folder in the Master Page Gallery.

[Table image: display templates for the Content Search Web Part]

Display templates for the Refinement Web Part and the Taxonomy Refinement Web Part

You can use the display templates listed in the following table to change the appearance of content that is shown in a Refinement Web Part and a Taxonomy Refinement Web Part. These display template files are located in the Filters subfolder in the Display Templates folder in the Master Page Gallery. Note that there are different display templates for different refiner types.

[Table image: display templates for the Refinement Web Part and the Taxonomy Refinement Web Part]

Display templates for the Search Results Web Part

You can use the display templates in the following table to change the appearance of content shown in a Search Results Web Part. Note that the hover panels for the different result types have separate display templates. These display template files are located in the Search subfolder in the Display Templates folder in the Master Page Gallery.

[Table images: display templates for the Search Results Web Part and the result-type hover panels]

Search diagnostics and reports in SharePoint

We can access and analyze several query and crawl health reports, logs and usage reports from the Search service application in the SharePoint Central Administration to monitor the health of the search system.

The health reports and logs only contain information after a full crawl has completed. To run a full crawl, we have to set up a Search service application, add at least one content source, and then start a full crawl.

To view the health reports and the crawl log, you have to be an administrator of the Search service application. Alternatively, an administrator who is a member of the Farm Administrators group can grant user accounts Read permissions on the Search service application (a PowerShell sketch for this follows). A user account that has Read permissions can only view the Search service application status page, the health reports and the crawl log.
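
The following is a minimal sketch of granting such Read access from PowerShell. The account name is a hypothetical placeholder, and the right name "Read (Diagnostic Pages Only)" is the one shown in the Search service application Administrators dialog; verify the exact string in your environment.

$ssa       = Get-SPEnterpriseSearchServiceApplication
$security  = Get-SPServiceApplicationSecurity $ssa -Admin
$principal = New-SPClaimsPrincipal -Identity "CONTOSO\reportviewer" -IdentityType WindowsSamAccountName
Grant-SPObjectSecurity -Identity $security -Principal $principal -Rights "Read (Diagnostic Pages Only)"
Set-SPServiceApplicationSecurity $ssa -Admin -ObjectSecurity $security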

Query health reports:

  1. Trend
  2. Overall
  3. Main Flow
  4. Federation
  5. SharePoint Search Provider
  6. People Search Provider
  7. Index Engine

To view query health reports:

  1. Verify that the user account that is performing this procedure is an administrator of or has Read permissions to the Search service application.
  2. In Central Administration, under Application Management, click Manage service applications.
  3. On the Service Applications page, click the Search service application.
  4. On the Search Administration page, in the Quick Launch, in the Diagnostics section, click Query Health Reports.
  5. On the Search Service Application: Query Latency Trend page, click the query report that you want to view.

The following table shows which reports are available.

[Table image: available query health reports]

Crawl health reports:

SharePoint 2013 provides the following reports about crawl health:

  1. Crawl Rate
  2. Crawl Latency
  3. Crawl Queue
  4. Crawl Freshness
  5. Content Processing Activity
  6. CPU and Memory Load
  7. Continuous Crawl

To view crawl health reports

  1. Verify that the user account that is performing this procedure is an administrator of or has Read permissions to the Search service application.
  2. In Central Administration, under Application Management, click Manage service applications.
  3. On the Service Applications page, click the Search service application.
  4. On the Search Administration page, in the Quick Launch, in the Diagnostics section, click Crawl Health Reports.
  5. On the Search Service Application: Crawl Reports page, click the crawl health report that you want to view.

The following table shows which reports are available.

[Table image: available crawl health reports]

Crawl log:

The crawl log tracks information about the status of crawled content. This log lets you determine whether crawled content was successfully added to the index, whether it was excluded because of a crawl rule, or whether indexing failed because of an error. The crawl log also contains information such as the time of the last successful crawl and whether any crawl rules were applied. You can use the crawl log to diagnose problems with the search experience.
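
Besides the views in Central Administration described below, the crawl log can also be queried from PowerShell through the CrawlLog object of the Search service application. This is a minimal sketch, assuming the GetCrawledUrls signature documented for SharePoint 2013; the URL filter is a hypothetical example.

$ssa      = Get-SPEnterpriseSearchServiceApplication
$crawlLog = New-Object -TypeName Microsoft.Office.Server.Search.Administration.CrawlLog -ArgumentList $ssa

# GetCrawledUrls(getCountOnly, maxRows, urlQueryString, isLike,
#                contentSourceID, errorLevel, errorID, startDateTime, endDateTime)
# Return up to 50 entries whose URL starts with the (hypothetical) site below
$results = $crawlLog.GetCrawledUrls($false, 50, "http://portal/sites/teamsite", $true, -1, -1, -1, [System.DateTime]::MinValue, [System.DateTime]::MaxValue)
$results.Rows | Format-Table -AutoSize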

To view the crawl log

  1. Verify that the user account that is performing this procedure is an administrator of the Search service application, or has Read permissions to it.
  2. In Central Administration, under Application Management, click Manage service applications.
  3. On the Service Applications page, click the Search service application.
  4. On the Search Administration page, in the Quick Launch, in the Diagnostics section, click Crawl Log.
  5. On the Crawl Log – Content Source page, click the view that you want.

[Image: the available crawl log views]

Additional columns in the Content Source, Host Name and Crawl History views:

[Image: additional columns in the Content Source, Host Name and Crawl History views]

Usage reports (search report):

To view usage reports

  1. Verify that the user account that is performing this procedure is an administrator of or has Read permissions to the Search service application.
  2. In Central Administration, under Application Management, click Manage service applications.
  3. On the Service Applications page, click the Search service application.
  4. On the Search Administration page, in the Quick Launch, in the Diagnostics section, click Usage Reports.
  5. On the View Usage Reports page, click the usage or search reports view that you want view.

[Image: the available usage and search reports]

 

Query 0 Server Not Responding – Event ID 2587

Error:

SERVER1:
Windows 2008 R2 (x64)
SharePoint 2010

SERVER2:
Windows 2008 R2 (x64)
SQL 2008

Services running on SERVER1 (none running as LocalSystem, LocalService, or NetworkService):

SharePoint 2010 Administration (started/auto)

SharePoint 2010 Timer (started/auto)

SharePoint 2010 Tracing (started/auto)

SharePoint 2010 User Code Host (stopped/disabled)

SharePoint 2010 VSS Writer (stopped/manual)

SharePoint Foundation Search V4 (started/manual)

SharePoint Server Search 14 (started/manual)

The RELATED ISSUE(s):

Search/indexing is not working and, in turn, backups are failing, claiming:

Failure Message Object Query-0 (D: on SERVER1) failed in event On Backup

Search Service Application reports:

Index Partition – 0 – SERVER2Search_Service_Application_PropertyStoreDB_9a482efd99954748a062952a3d2617d7

Query Component 0 SERVER1 Not Responding

System Event Log on SERVER1 reports:

Event ID 2587

The following conditions are currently affecting index propagation to this server for search service application ‘Search Service Application’:

1. Query 0, catalog Main: failing to copy index files from crawl component 0 for 1490 minutes. Access is denied. 0x80070005

2. Query 0 is not being automatically disabled because the minimum number of ready query components per partition is 2.

Solution:

  1. Disconnect the query server from the search topology.
  2. Stop the Search service in Central Administration and clear the index files on the query server.
  3. After that, start the search service instance using the Start-SPServiceInstance PowerShell cmdlet.

Run the following PowerShell to reset the Query server:

$ssa = Get-SPEnterpriseSearchServiceApplication -Identity "SSAName"
$queryComponents = $ssa | Get-SPEnterpriseSearchQueryTopology -Active | Get-SPEnterpriseSearchQueryComponent
$component = $queryComponents | where {$_.ServerName -eq "QueryServerName"}
$component.Recover()

Internal server error exception when users perform a search in SharePoint

Symptoms

When users perform a search in a Microsoft SharePoint environment, they receive the following error message:

Internal server error exception:

Troubleshoot issues with Microsoft SharePoint Foundation

Additionally, the following critical exception may be written to the ULS log of the SharePoint Web front-end server:

Process: OWSTIMER.EXE (0x1CE0)

Product: SharePoint Foundation

Category: Topology

EventID: 8031

Message:
An exception occurred while updating addresses for connected app {eaf6c00c-cc3f-460e-8bf2-ad9b991ea6ea_aa16845d-045a-46bc-bbc6-d701ff13950d}. The uri endpoint information may be stale. System.InvalidOperationException: The requested application could not be found at Microsoft.SharePoint.SPTopologyWebServiceApplicationProxy.ProcessCommonExceptions(Uri endpointAddress, String operationName, Exception ex, SPServiceLoadBalancerContext context) at Microsoft.SharePoint.SPTopologyWebServiceApplicationProxy.ExecuteOnChannel(String operationName, CodeBlock codeBlock) at Microsoft.SharePoint.SPTopologyWebServiceApplicationProxy.GetEndPoints(Guid serviceId) at Microsoft.SharePoint.SPConnectedServiceApplicationAddressesRefreshJob.Execute(Guid targetInstanceId) bf94139a-66f8-4aab-af31-406a5ebb6db9

Cause

This issue occurs when the Search service application proxy is associated with a web application but is not associated with any Search service applications in the farm.

Resolution

To allow users to search without receiving this error message, reassociate the web application with a valid Search service application proxy. A valid Search service application proxy will be associated with a Search service application in the Manage service applications option. To do this, follow these steps (or use the PowerShell sketch after the list):

  1. Open Central Administration.
  2. Click Manage web applications.
  3. Select the affected web application.
  4. On the ribbon, click Service Connections.
  5. Under Configure Service Application Associations, select the valid Search service application proxy check box.
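
The same association can be made from PowerShell. This is a minimal sketch, assuming a hypothetical web application URL and that the first Search service application proxy returned is the valid one to associate.

# Hypothetical web application URL
$webApp      = Get-SPWebApplication "http://portal"

# Pick a Search service application proxy (verify it is backed by a Search service application)
$searchProxy = Get-SPEnterpriseSearchServiceApplicationProxy | Select-Object -First 1

# Add the proxy to the proxy group used by the web application
Add-SPServiceApplicationProxyGroupMember -Identity $webApp.ServiceApplicationProxyGroup -Member $searchProxy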


How to reset a corrupt search index in SharePoint 2013 when Index Reset is not working

On a recent SharePoint 2013 deployment, I faced a strange issue wherein the search index got corrupted and I was not able to reset it. I could get to the Index Reset screen and the Reset Index button was clickable, but each time I clicked it the page went into an infinite loop and the index was never reset.

After much head-scratching and googling around, I found a small piece of information that really helped and did the trick. Below is a step-wise approach to resolving such an issue; a PowerShell sketch that automates the cache-clearing steps follows the manual procedure.

Step-1
Stop the SharePoint Timer service (SPTimerV4)

[Image: the SharePoint Timer service in the Services console]

Step-2
Navigate to the cache folder in the following location: %SystemDrive%\ProgramData\Microsoft\SharePoint\Config

Step-3
Locate the folder that contains the file "Cache.ini". There may be multiple GUID folders containing a Cache.ini file.

Step-4
Back up these folders with the Cache.ini file

Step-5
Delete all the XML configuration files in the GUID folder
Note: When you delete the xml files within the GUID folders, make sure that you do not delete the GUID folder and the Cache.ini file that is located in the GUID folder

xmlfiles

Step-6
Open the Cache.ini file, delete its contents, and replace all the text with the number 1. Save and close the file.
Step-7
Start the SharePoint Timer service.
After the above steps have been executed and the timer service has run for a short while, verify that the Cache.ini file now contains a value other than 1, which indicates that the cache has been rebuilt.
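
The same cache-clearing procedure can be scripted; the following is a minimal sketch of the steps above for a single server (run it on each server in the farm), intended as an illustration rather than a supported tool.

# Stop the SharePoint timer service
Stop-Service SPTimerV4

# Find the config cache GUID folder(s) under ProgramData
$configPath = Join-Path $env:ProgramData "Microsoft\SharePoint\Config"
Get-ChildItem $configPath -Directory | Where-Object {
    Test-Path (Join-Path $_.FullName "Cache.ini")
} | ForEach-Object {
    # Delete the XML files but keep the GUID folder and Cache.ini
    Get-ChildItem $_.FullName -Filter *.xml | Remove-Item
    # Reset Cache.ini to "1" so the cache is rebuilt
    Set-Content -Path (Join-Path $_.FullName "Cache.ini") -Value "1"
}

# Start the timer service again; the cache repopulates on its own
Start-Service SPTimerV4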

[Image: Cache.ini after the cache has been rebuilt]