Search Analytics with SharePoint / FAST

It is no secret that Norconex is very fond of search analytics.  While you may understand the importance of gathering good search metrics, it may be more difficult to grasp how a product such as Norconex’s very own Search Analytics would integrate with your enterprise search application.

While Norconex Search Analytics can be integrated with ANY enterprise search product (Autonomy, Endeca, Exalead, Solr, etc.), in this post we have chosen to demonstrate its integration with Microsoft FAST Search for SharePoint 2010, as it may provide a familiar interface to many of you.  The Search Analytics .Net API was used to perform the integration illustrated here.  The Search Analytics JavaScript API could have been used instead.

What metrics to track?

The foundation of the FAST Search Center, as shown in Figure 1, provides a rich user interface to access the FAST Search Server capabilities of SharePoint 2010.  There are four main statistics areas that Norconex Search Analytics can track for you:

  • Search: The core information about a user search.
  • Refinements: Any section of a search screen where the user can choose additional means of filtering the results.  Facets are a good example.
  • Navigation: Changing from one results page to another.
  • View: The viewing of a document.

For example, the Navigation Statistics area shows how many people moved from one page of search results to another.
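To make the four areas concrete, here is a minimal sketch of how such actions could be modeled and counted.  It is not the Norconex Search Analytics API: the `Action`, `Event` and `count_by_action` names, the field names and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The four tracked areas named in the list above."""
    SEARCH = "search"
    REFINEMENT = "refinement"
    NAVIGATION = "navigation"
    VIEW = "view"


@dataclass(frozen=True)
class Event:
    # All field names are hypothetical, chosen for this sketch only.
    action: Action
    query: str        # the search terms the action belongs to
    detail: str = ""  # facet value, page label, or document path


def count_by_action(events):
    """Return how many events were logged for each action type."""
    counts = {action: 0 for action in Action}
    for event in events:
        counts[event.action] += 1
    return counts


# Made-up sample session: one search, one refinement, two page changes, one view.
events = [
    Event(Action.SEARCH, "power boat"),
    Event(Action.REFINEMENT, "power boat", "Company: Richelieu"),
    Event(Action.NAVIGATION, "power boat", "page 2"),
    Event(Action.NAVIGATION, "power boat", "page 3"),
    Event(Action.VIEW, "power boat", "/docs/vessels.docx"),
]
navigation_count = count_by_action(events)[Action.NAVIGATION]
```

In this sample session, `navigation_count` is the kind of number the Navigation Statistics area would report.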

Figure 1 - FAST Search Center

Integration with FAST Search

For each of the main statistics areas, we will demonstrate how various SharePoint user interface elements can translate into useful statistics in Norconex Search Analytics.  While we won’t illustrate every statistics report available, there should be enough to give you a good sense of how Norconex Search Analytics can work alongside your enterprise search.

Figure 2 - Norconex Search Analytics

Search Statistics

From the Search Analytics perspective, the Search is the “main” action, representing a user submitting a query to the search engine.  The information stored includes the user’s search terms, the search elapsed time, and where the search originated.

One of the key query metrics is the list of the most popular terms submitted to the search engine.  The Search metrics can help you give users a more engaging experience by guiding them toward suggested links or specific documents.  For example, imagine that you are a lawyer and you notice that many searches are being performed on a new statute or on legislation coming into force.  You may then want to promote documents that relate to that piece of legislation for your search user community.

In FAST Search Center, search administrators have many tools to suggest specific links to users (visual best bets, promotions/demotions, etc.).  Therefore, to continue the example, you could adjust the relevancy of documents that deal with a specific piece of legislation by promoting and/or demoting them, so you can target specific groups of users when they search on a specific word.

As shown in Figure 3, you can easily see that many users are searching for keywords that relate to boats (power boat, pilot boat, etc.).  You could then configure FAST Search Center to always display a link related to marine or vessel content whenever a user types the keyword “boat”.
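The “most popular terms” metric boils down to a frequency count.  The sketch below shows the idea on a made-up list of queries; the `top_terms` function and the sample data are assumptions for illustration and do not reflect how Norconex Search Analytics computes its reports.

```python
from collections import Counter


def top_terms(queries, n=3):
    """Count individual words across all queries, ignoring case."""
    counts = Counter()
    for query in queries:
        counts.update(query.lower().split())
    return counts.most_common(n)


# Hypothetical query log for illustration.
queries = ["power boat", "pilot boat", "Boat rental", "power supply"]
popular = top_terms(queries, n=2)
```

Here “boat” comes out on top, which is the signal that would prompt you to add a suggested link for that keyword.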

Figure 3 - Scenario for suggesting links

Another way the Query Term metrics can help you understand user intentions is by identifying key terms that are being searched for but are not usually found in documents.  This typically happens when a user searches for a specific word but the document expresses the same meaning with a different word.  For example, if you are a legal practitioner looking for the term “statute”, the document might refer to it as “legislation” or “act”.  Therefore, you might consider creating a library of synonyms for legal terminology.
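One simple way to surface such terms is to look for queries that repeatedly return no results.  The following is a minimal sketch under that assumption; the `zero_result_terms` function, the log format of (query, result count) pairs and the sample numbers are all hypothetical.

```python
from collections import Counter


def zero_result_terms(search_log, min_searches=2):
    """Return queries that returned no results at least `min_searches` times.

    `search_log` is an iterable of (query, result_count) pairs.
    """
    misses = Counter(query.lower() for query, hits in search_log if hits == 0)
    return [term for term, count in misses.most_common() if count >= min_searches]


# Made-up log: "statute" is searched twice and never found.
search_log = [
    ("statute", 0),
    ("legislation", 42),
    ("Statute", 0),
    ("act", 17),
    ("bylaw", 0),
]
candidates = zero_result_terms(search_log)
```

Terms in `candidates` are good starting points for a synonym library, since users keep asking for them and the documents evidently use other words.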

Identifying words, phrases or even abbreviations with a meaning identical or similar to the preferred query term is instrumental in creating a well-managed and usable information environment for your enterprise search.  It gives the user the “right” answer by deriving the appropriate “meaning” from the search term and returning the best candidate even if the term does not exist in the document.  The practice of identifying synonyms also provides a means to share a common terminology within an organization and to establish a set of best practices and governance for managing your information.

In FAST Search Center, as illustrated in Figure 4, the search administrator can create synonym expansions by taking variants of a word and assigning them to the FAST search engine.  Returning to our previous example, the term “boat” might indicate that the searcher is also interested in similar terms like “vessel” or even “power boat”.
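Conceptually, a synonym expansion maps a query term to extra terms that are searched alongside it.  The sketch below only illustrates that idea; FAST Search handles expansions internally through its administration interface, and the `SYNONYMS` table and `expand_query` function here are assumptions, not FAST or Norconex APIs.

```python
# Hypothetical synonym table built from the "boat" example above.
SYNONYMS = {"boat": ["vessel", "power boat"]}


def expand_query(query, synonyms=SYNONYMS):
    """Return the original query followed by any configured synonyms."""
    terms = [query]
    terms.extend(synonyms.get(query.lower(), []))
    return terms


expanded = expand_query("Boat")
```

A query with no configured synonyms simply comes back unchanged.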

Figure 4 - Scenario for synonyms

Refinement Statistics

The Refinement Statistics are responsible for tracking refinements made to a search result.  A “refine” action typically represents a user clicking on a facet value, a date selection, etc.

When you perform a search, you might notice the list of properties on the left-hand side of the web page, as shown in Figure 5.  These can be referred to as “refiners” or “facet” values.    As per Figure 5, you will find a facet for the Result Type, the Site, the Modified Date and the Company.

Figure 5 also shows that the user has selected the refinement value “Richelieu”.  On the Norconex Search Analytics server, the Top Refinement panel now displays the “Richelieu” refinement, along with refinement values from other user searches.

Figure 5 - Scenario for refinement

The Refinement Panel provides a manageable framework by which users can meet a wide range of information needs, from simple fact finding to more complex exploratory search and discovery.  But how do we configure such a framework when there are hundreds or even thousands of facet values?  It is not practical to display an entire facet list to your users.  So how can we provide a refinement framework without overwhelming the real estate of the web page?  One approach is to look at what is being used, and by whom, based on the Refinement Statistics data.  You could then determine various display formats for the facets and reveal appropriate content to help your users narrow their search and have a more productive search experience.  With the Refinement Statistics data in hand, you could also decide not to display specific facet values at all if they are not being used by your search user community.  Therefore, the Refinement Statistics can help you deliver a unique search experience by catering to your users’ needs based on their professional interests and hobbies.
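The “show only what is used” approach can be sketched as a filter over click counts.  This is an illustration only: the `facet_values_to_show` function and its inputs are hypothetical, and the company names other than “Richelieu” are made up.

```python
from collections import Counter


def facet_values_to_show(refinement_clicks, all_values, min_clicks=1):
    """Keep facet values clicked at least `min_clicks` times, most used first."""
    clicks = Counter(refinement_clicks)
    used = [value for value in all_values if clicks[value] >= min_clicks]
    # sorted() is stable, so equally popular values keep their original order.
    return sorted(used, key=lambda value: -clicks[value])


# Hypothetical data: every value of the Company facet, and what users clicked.
all_values = ["Acme", "Globex", "Richelieu"]
refinement_clicks = ["Richelieu", "Richelieu", "Acme"]
visible = facet_values_to_show(refinement_clicks, all_values)
```

Raising `min_clicks` trims the list further, which is one way to keep a long facet from taking over the page.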

View Statistics

The View Statistics are responsible for tracking documents viewed by search users.  When a user clicks to view a document from a result set, it typically represents a “view” action.

Norconex Search Analytics captures the user’s activity by recording which links they select.  Going back to our previous example, if your users search for “power boat” and, after reviewing the list of search results, always click on the same link, the View Statistics will keep track of that.  If users continue to select that document, the search administrator could adjust its weight so that it is returned higher in the search results.
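The pattern described above, one document that keeps getting clicked for the same query, can be found by grouping views by query.  The sketch below assumes a hypothetical log of (query, document path) pairs; `most_viewed_per_query` is not part of any Norconex or SharePoint API.

```python
from collections import Counter, defaultdict


def most_viewed_per_query(view_log):
    """Map each query to its most clicked document and that click count."""
    by_query = defaultdict(Counter)
    for query, document in view_log:
        by_query[query.lower()][document] += 1
    return {query: counts.most_common(1)[0] for query, counts in by_query.items()}


# Made-up view log for illustration.
view_log = [
    ("power boat", "/docs/a"),
    ("power boat", "/docs/a"),
    ("Power Boat", "/docs/b"),
]
favorites = most_viewed_per_query(view_log)
```

A document that dominates the clicks for a query is a natural candidate for promotion.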

This is a powerful feature.  Knowing which documents are the most viewed will help you determine whether a document should be promoted so it is returned higher in the search results.  As demonstrated in Figure 6, in the FAST Search Center, the search administrator has the ability to increase or decrease the relevancy ranking for specific content.

Figure 6 - Scenarios for the View Statistics

Navigation Statistics

The Navigation Statistics are responsible for keeping track of page navigation.  A “navigation” action typically represents a user navigating between search result pages.

Some users have difficulty formulating good search keywords, which often forces them to rework their queries and sift through the search results page by page.  When a lawyer executes a query, for example searching for “civil rights”, the extremely large number of results and the inherent difficulty of wading through them may discourage him from searching further.  Finding which terms cause an excessive amount of page navigation may help the search administrator give these terms better treatment.
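Finding the terms that cause excessive page navigation can be sketched as tracking the deepest page reached for each query.  The `deep_navigation_queries` function, the (query, page number) log format and the threshold are assumptions for illustration.

```python
from collections import defaultdict


def deep_navigation_queries(navigation_log, min_page=5):
    """Return queries for which users reached result page `min_page` or beyond."""
    deepest = defaultdict(int)
    for query, page in navigation_log:
        key = query.lower()
        deepest[key] = max(deepest[key], page)
    return sorted(query for query, page in deepest.items() if page >= min_page)


# Made-up navigation log: (query, result page the user moved to).
navigation_log = [
    ("civil rights", 2),
    ("civil rights", 6),
    ("power boat", 2),
]
needs_attention = deep_navigation_queries(navigation_log)
```

Queries in `needs_attention` are the ones worth reviewing for better guided navigation or relevancy tuning.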

Other users may be wading through large sets of page results to get a better understanding of what they are searching for, because they don’t understand the subject matter.  This type of user will typically click through several result pages, quickly looking at each one and deciding on the best candidate for their needs.  Moreover, while moving back and forth through the result pages, the user might be doing some fact finding that provides important clues to help them refine their query.  When a lawyer who is not a subject expert in family law executes a query, for example searching for “living common-law”, he might decide to refine his query after acquiring a better understanding of the subject while browsing through the page results.

The search engine might be a great source of frustration or a great source of information for many search users.  So knowing what your users are searching for and how they are browsing through the search results might help you improve your search engine to drive better search results.  You might want to invest more time in providing a better guided navigation or you might want to boost the relevancy of specific documents that will help prevent exhaustive winnowing of search results.  When the first few pages are not valuable, the user may leave with a negative feeling towards your search engine.

In Figure 7, after monitoring the page views in Norconex Search Analytics, you can see that a few users are browsing through six pages of results.  After further investigation, you find out that one of the documents on those pages contains valuable information for a specific author.  So, as a search administrator, you decide to add the managed property “author” as a sortable field so that users can search on a specific author and sort the results.

Figure 7 - Scenarios for page navigation

Conclusion

In this post, we have examined only some of the benefits of integrating Norconex Search Analytics with FAST Search for SharePoint 2010.  You will find ways to apply the suggested tuning tips with most, if not all, major enterprise search platforms.

If, as an organization, you are struggling with managing your enterprise search and are not sure where to begin or what to do next, Norconex Search Analytics will definitely help you get started in understanding the behavior of your search users.  To design better search experiences, we must first understand the complexities of user behavior in seeking information.  Implementing an enterprise search engine in your organization is an iterative process.  You have to understand user needs, manage the information, enhance the site navigation, provide better search performance, etc.  Every iteration improves the effectiveness of the search engine and drives better search results.

Are you keeping a close eye on your search statistics?  What are they telling you?  Do you have a story to tell us?  We would definitely like to hear it!

Search expert with more than 10 years of software development (.NET, Java, etc) and years of experience as a specialist in various Enterprise Search products including but not limited to Autonomy IDOL, Verity and FAST Search for SharePoint 2010. He has unique practical experience in eDiscovery and has gained knowledge of the Canadian Federal Legislation procedures going back to 2003. He has created and designed various legislation software and Internet/Intranet websites.