Monday, July 22, 2019

Five things you should know before going live with Azure Search for Sitecore


If you're reading this post, you've probably configured (or are about to configure) a shiny new Sitecore 9.x production environment running in the cloud, with Azure Search as the search provider. Not having to worry about Solr virtual machine maintenance, Java licensing, etc. is appealing. But it is important to know that Azure Search comes with its own nuances, which you might otherwise learn the hard way.


Cost
Even though Azure Search for Sitecore is deployed with 1 replica / 1 partition in small configurations, you should expect to spend at least ~$736 per month (3 replicas x ~$245.28 for S1) for the reasons described below. This sounds like a lot compared to hosting a VM on Azure with Solr installed. Note that you pay per search unit, and the number of search units equals the number of replicas multiplied by the number of partitions.
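
To make the arithmetic explicit, here is a minimal Python sketch of the search unit calculation. The function name and the constant are made up for illustration; the S1 price is the approximate figure quoted above and will differ by region and over time.

```python
# Approximate monthly price of one S1 search unit, as quoted in this post.
# Assumption: treat this as a placeholder and check current Azure pricing.
S1_PRICE_PER_UNIT = 245.28


def monthly_search_cost(replicas, partitions, price_per_unit=S1_PRICE_PER_UNIT):
    """Return the estimated monthly cost of a search service.

    Billing is per search unit, where
    search units = replicas * partitions.
    """
    search_units = replicas * partitions
    return search_units * price_per_unit


# Default small deployment: 1 replica x 1 partition.
small = monthly_search_cost(replicas=1, partitions=1)

# 3 replicas x 1 partition, the setup this post argues for.
recommended = monthly_search_cost(replicas=3, partitions=1)

print(f"1 x 1: ${small:.2f}/month, 3 x 1: ${recommended:.2f}/month")
```

Because the two numbers are multiplied, adding a partition to a 3-replica service adds three search units at once, not one.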


Scaling
High availability does not come out of the box. If you only have one replica, you should expect periodic downtime. Here is an excerpt from Microsoft Docs:

Replicas not only help reduce query latency but can also allow for high availability. With a single replica, you should expect periodic downtime due to server reboots after software updates or for other maintenance events that will occur. As a result, it is important to consider if your application requires high availability of searches (queries) as well as writes (indexing events). Azure Search offers SLA options on all the paid search offerings with the following attributes:

  • 2 replicas for high availability of read-only workloads (queries)
  • 3 or more replicas for high availability of read-write workloads (queries and indexing)
In other words, if you use Azure Search to display any important listings on the website, you should have at least 3 replicas. Unless you have really expensive queries, I do not see a reason to add more partitions. This is my personal opinion, and it might contradict the Sitecore XP 9.1 ARM templates – topologies and tiers.

Logging
Azure Search comes with fairly extensive query logging, which apparently shows queries that are not listed on the monitoring page. It dumps all data to Azure Blob storage, and I highly recommend setting it up from the very beginning. It logs the query text, duration, and result count. Here is an example from a log file:

"resultSignature": 200, "durationMS": 156, "properties": { "Description" : "POST /indexes('sitecore-master-index-782')/docs/search.index" , "Query" : "?api-version=2017-11-11" , "Documents" : 1, "IndexName" : "sitecore-master-index-782" }}

The monitoring page is also quite useful, even though it does not seem to visualize 50x responses.

[Image: Azure Search queries]
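
Once the logs land in blob storage, they are easy to analyze with a small script. The following is a minimal Python sketch, not an official tool. It assumes the log excerpt above is the tail of a JSON object whose leading fields were cut off, so the sample record below only contains the fields visible in the excerpt. The function names and the 1000 ms threshold are hypothetical.

```python
import json

# Sample record rebuilt from the excerpt above. Assumption: real records
# carry additional leading fields that are not shown in this post.
SAMPLE_RECORD = """
{"resultSignature": 200, "durationMS": 156, "properties": {
  "Description": "POST /indexes('sitecore-master-index-782')/docs/search.index",
  "Query": "?api-version=2017-11-11",
  "Documents": 1,
  "IndexName": "sitecore-master-index-782"}}
"""


def summarize(record_text):
    """Extract status, duration, index name and document count from one record."""
    record = json.loads(record_text)
    props = record.get("properties", {})
    return {
        "status": record.get("resultSignature"),
        "duration_ms": record.get("durationMS"),
        "index": props.get("IndexName"),
        "documents": props.get("Documents"),
    }


def needs_attention(summary, max_duration_ms=1000):
    """Flag requests that failed (status >= 500) or were slower than the threshold."""
    status = summary["status"] or 0
    duration = summary["duration_ms"] or 0
    return status >= 500 or duration > max_duration_ms


summary = summarize(SAMPLE_RECORD)
print(summary, "needs attention:", needs_attention(summary))
```

Filtering on the status code this way is one option for spotting the 50x responses that the monitoring page does not seem to show.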


Index Updates
Due to the performance (or, let's say, cost/performance) constraints of Azure Search, it makes sense to reduce the number of queries sent to it and, at the very least, to disable index updates on CD instances.

Robustness
This also applies to Solr implementations, but it's worth noting anyway: make sure you are prepared for a search service outage (for whatever reason) and that it does not break your website. If the search hangs (and you don't have a timeout), the user's session will likely hang as well, for a couple of minutes. So make sure you have timeouts configured in your search-related code, and if Azure Search is busy or not responding, drop the connection early.
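
The idea of giving up early can be sketched in a language-neutral way. The Python example below is only an illustration of the pattern, not Sitecore code: `search_with_timeout`, `search_fn`, and the default values are hypothetical names chosen for this sketch. It stops waiting after a deadline and returns a fallback (for example, an empty result list) so the page can still render.

```python
import concurrent.futures


def search_with_timeout(search_fn, query, timeout_s=2.0, fallback=None):
    """Run search_fn(query), but stop waiting after timeout_s seconds.

    Returns the fallback value if the search times out or raises.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(search_fn, query)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Search is hanging: give up so the user's request is not blocked.
        return fallback
    except Exception:
        # Search failed outright: degrade gracefully instead of erroring.
        return fallback
    finally:
        # Do not wait for a hung worker thread before returning.
        executor.shutdown(wait=False)
```

Note that this only stops the caller from waiting; the background call keeps running until it finishes on its own. In real code you would also set a timeout on the underlying HTTP client so that the connection itself is dropped.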

Also, note that the Azure Search provider is still new, so it is possible that you'll encounter bugs and issues. I saw one where the index catalog got corrupted and CM started removing the live index after each rebuild (9.1 Initial Release). Hopefully, this one has been fixed in a newer version.

Hope this helps you avoid mistakes (and downtime!) when going live with Azure Search. Got your own hints/suggestions? Share them in the comments below!
