Wednesday, February 27, 2008

Crawling SharePoint sites using the SPS3 protocol handler

When setting up MOSS index crawls I noticed an odd SPS3:// prefix. The following article explains why it should be used for index crawling to improve performance, especially over a WAN: Jose Barreto's Blog : Crawling SharePoint sites using the SPS3 protocol handler

Microsoft SharePoint Products and Technologies Team Blog : SharePoint SDK Downloads Now Live with SP1 Updates

After whetting our appetite with MOSS SP1, at last version 1.3 of the MOSS SDK is here. (Note the new IntelliSense XML file.)

Microsoft SharePoint Products and Technologies Team Blog : SharePoint SDK Downloads Now Live with SP1 Updates

Indexing PDF content in MOSS Sharepoint 2007

Out of the box, MOSS only indexes the metadata of a PDF and not the text inside it, even when that text is available. To index the full content you need an iFilter, which is explained well here: Steven Van de Craen's Blog.

We used Foxit's iFilter because it has an x64 version, and our MOSS servers run on x64.

After getting this installed I was still having trouble with some files not being indexed, and found that out of the box MOSS will only index files up to 16 MB. You can raise this limit using a registry setting on the server (documented here). This works fine, but you should be mindful of possible index timeout errors on the server. Mindsharp say that you can increase the timeout value here, but I haven't managed to find this setting in MOSS Enterprise yet.
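Before changing the limit, it can help to know which files are actually affected. The following is a minimal Python sketch, not anything MOSS itself provides: it assumes you have a local or file-share copy of the documents (the folder path is hypothetical), that the 16 MB limit is counted in binary megabytes, and the function name files_over_limit is made up for illustration.

```python
from pathlib import Path

# Assumption: the 16 MB limit is measured in binary megabytes (1024 * 1024 bytes).
BYTES_PER_MB = 1024 * 1024


def files_over_limit(root, limit_mb=16, pattern="*.pdf"):
    """Return (path, size_in_mb) pairs for files larger than limit_mb, largest first."""
    limit_bytes = limit_mb * BYTES_PER_MB
    oversized = []
    # rglob walks the folder tree recursively; matching is case-sensitive on some systems.
    for path in Path(root).rglob(pattern):
        if path.is_file():
            size = path.stat().st_size
            if size > limit_bytes:
                oversized.append((path, size / BYTES_PER_MB))
    return sorted(oversized, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the folder that holds a copy of the documents.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, size_mb in files_over_limit(root):
        print(f"{size_mb:8.1f} MB  {path}")
```

The largest size it reports gives a rough idea of how far the limit would need to be raised, which matters given the timeout concern above.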

(There is an interesting article on custom SharePoint searches here.)

Thursday, February 21, 2008