@tlaak
Last active July 29, 2020 18:35
Foobar Agency consulting challenge

Foobar Agency - Consulting Search Challenge

Challenge description

What do we know about the situation and what questions should we be asking?

Getting answers to the following questions will put us in a better position to negotiate the options and to support the stakeholders in their decision-making.

These are the key problems the team is facing.

Performance optimisation

the team has found that the search in the (legacy) shop is not performed optimally and is looking for options to make the search better for the end customer

To be able to solve the problem, we would need to understand it better.

  • What does the performance of the search mean in this context and how was it measured?
  • What would ideal, optimised performance look like?
  • What would the impact of improved performance be? Does it affect conversion rate or customer satisfaction enough to be worth the costs? How would we measure this?
  • How long does it take on average to return a filtered list of results?
  • Is the search "intelligent" enough to find matches for search terms that have typos in them?
  • Does it match synonyms or things like similar colours?
  • Does it provide totally wrong results if the fuzzy matching is too greedy?
  • How is the indexing handled? When the catalogue of items updates, how long does it take to update the index so that items removed from the selection are not returned in search results any more? Same also for new items.
  • How much manual work is needed to keep the index always up to date, and can it be automated?
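To answer the question about average response time, the search could be timed directly. The following is a minimal sketch, not the team's actual tooling: `fake_search` is a hypothetical stand-in for whatever call the webshop makes to the search backend, and the choice of mean, median and 95th percentile as summary figures is an assumption about what would be useful to report.

```python
import math
import statistics
import time
from typing import Callable, Sequence


def measure_latency_ms(search: Callable[[str], object],
                       queries: Sequence[str],
                       repeats: int = 5) -> dict:
    """Run every query `repeats` times and summarise latency in milliseconds."""
    samples = []
    for query in queries:
        for _ in range(repeats):
            start = time.perf_counter()
            search(query)
            samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank 95th percentile of the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {
        "count": len(samples),
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
    }


def fake_search(query: str) -> list:
    """Hypothetical stand-in for the real search call (e.g. an HTTP request)."""
    catalogue = ["red shoes", "blue shirt", "red shirt"]
    return [item for item in catalogue if query in item]


if __name__ == "__main__":
    print(measure_latency_ms(fake_search, ["red", "shirt"]))
```

Running the same measurement against the current Solr setup and against a proof of concept would give comparable numbers, provided both are queried with the same set of real search terms.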

Unknown technology

but the development team is not familiar with it and the programming language.

This sounds like one of the key problems the team is facing. Becoming productive in a new, complicated piece of software that is written and configured in an unfamiliar programming language takes anywhere from 1 to 6 months for each developer, depending on their skill level. During this time the productivity of the team is significantly decreased and the chances of critical bugs and misconfigurations are much higher. Any outage or unpredictable behaviour in the search functionality will have a direct impact on the conversion rate and sales numbers.

Providing training in Solr to the team can cost roughly €1,000 per day for each team member in direct costs. During the training they are not writing any code for the project, which puts the ongoing work on hold.

Training is not guaranteed to deliver the expected results, as developers would still need to put the newly learned skills into practice. Providing training only to selected members of the team will lower the training costs, but these members will then have to pass on their newly acquired knowledge to the rest of the team, which reduces the productivity of the whole team.

Working with a new programming language and technology that do not provide an ideal developer experience will probably lower the morale of the team, cause more frustration and fatigue, and increase the risk of burnout and sick days, all of which will negatively affect the quality of the work and the performance of the team.

Options

Extend Solr

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world's largest internet sites.

Questions to ask

  • What would be the cost of better and faster servers?
  • Is the server actually the bottleneck here or is Solr just in general slow regardless of the hardware it is running on?
  • Can the server capacity be easily increased in the future if the site popularity grows more than is expected?
  • Where would the servers be hosted? What's the guaranteed availability of the service provider?
  • What is the current failure rate?
  • What is the amount of work required by the migration?
  • Would a migration require changes in the webshop?
  • Would the migration cause any outages to the webshop?
  • By doing this, would other services dependent on Solr be affected?
  • If yes, would the performance of these services be improved as well?
  • Would it be possible to hire a Solr specialist to do the configuration changes?
  • How is access to the servers managed? Are team members able to get access in a centralised way?

Choose different product

This is how Algolia describes their product:

Our mission is to make you a search expert. Push data to our API to make it searchable in real time. Build your dream front end with one of our web or mobile UI libraries. Tune relevance and get analytics right from your dashboard.

Costs are the main concern

Algolia is already familiar technology to the team members. This means that they would be more productive right away.

Costs are €200,000 per annum, which is quite a large number. It would be useful to know the cost of the Solr instance for comparison. Cost should not be the only factor when decisions are made, but sometimes it matters a lot if there isn't any budget for increased costs. The licence fee covers only direct costs, though. Indirect costs like team productivity, maintenance and configuration requirements, availability and impact on user experience should be considered as well.

Task

Challenges

What do we know about the person who is making the final call? Do they base decisions purely on numbers and facts that are presented to them? How did the company originally select Solr? Are there deeper connections to the Solr provider that we should be aware of? Do we have a good relationship with them or do we need support from someone else?

There are lots of variables and it's not easy to make the right decision. Developers really like to work with technology they feel comfortable with, so the choice might have a massive impact on the team's productivity and satisfaction, and on the ability to hire more developers to the team later on.

A €200,000 licence cost per annum is roughly equivalent to the cost of 5 developers working for 2 months. If these 5 developers are largely unproductive for 2 months because they are learning a technology they are not familiar with, it will cost roughly the same amount of money as the Algolia licence. On top of this there will be added costs from the server hosting and maintenance.
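The equivalence can be checked with a back-of-the-envelope calculation. The sketch below rests on two assumptions that are not stated in the challenge: a developer costs about €1,000 per working day (borrowing the per-day figure from the training estimate) and a month has 20 working days. The `productivity_loss` parameter is a hypothetical knob for cases where the team is slowed down rather than fully blocked.

```python
ALGOLIA_LICENCE_EUR = 200_000   # per annum, from the challenge description
DAY_RATE_EUR = 1_000            # assumption: cost of one developer per working day
WORKING_DAYS_PER_MONTH = 20     # assumption


def lost_productivity_cost_eur(developers: int,
                               months: int,
                               productivity_loss: float = 1.0) -> float:
    """Cost of the share of developer time lost while learning a new technology.

    productivity_loss is a fraction: 1.0 means no useful output at all,
    0.5 means the team works at half speed.
    """
    developer_days = developers * months * WORKING_DAYS_PER_MONTH
    return developer_days * DAY_RATE_EUR * productivity_loss


if __name__ == "__main__":
    full_stop = lost_productivity_cost_eur(developers=5, months=2)
    half_speed = lost_productivity_cost_eur(developers=5, months=2, productivity_loss=0.5)
    print(f"Fully blocked team: EUR {full_stop:,.0f}")
    print(f"Half-speed team:    EUR {half_speed:,.0f}")
    print(f"Algolia licence:    EUR {ALGOLIA_LICENCE_EUR:,.0f}")
```

Under these assumptions a fully blocked team of 5 costs exactly the licence fee over 2 months. A team slowed to half speed reaches the same figure after 4 months, which is still within the 1 to 6 month ramp-up range estimated earlier.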

How do we prove that Algolia would be a better solution?

  • Are there available unbiased benchmarks between Algolia and Solr?
  • Is it possible to get a demo or build a quick proof of concept, showcasing the benefits of Algolia?
  • Could other services benefit from this migration as well?
  • What's the guaranteed availability of Algolia? How big is the risk of outages?
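When comparing guaranteed availability figures, it helps to translate a percentage into the downtime it still permits. This is a small sketch of that conversion. The percentages used are illustrative only and are not taken from Algolia's or any hosting provider's actual SLA.

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap years


def max_downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year a service may be down while still meeting the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Illustrative availability levels, not real SLA figures.
    for availability in (99.0, 99.9, 99.99):
        hours = max_downtime_hours_per_year(availability)
        print(f"{availability}% availability allows up to {hours:.2f} hours of downtime per year")
```

Multiplying the allowed downtime by the webshop's average hourly revenue would give a rough upper bound on the sales at risk for each option.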

Features of Algolia and where it shines

According to comparison charts, Algolia is ultra fast, super easy to implement and set up, modern, and comes with excellent support. It handles typos well and is designed to search for records (i.e. products) instead of pages. It can provide instant search results as you type and supports smart highlighting.

Algolia integrates well with React, Vue, Angular, JavaScript and WordPress.

Features of Solr and where it shines

Solr is powerful, strong at indexing, scalable, customisable and enterprise-ready.

Solr integrates well with Datadog, Netdata and Lucene (it's based on Lucene).

Solr requires a server to run and that server needs hosting, maintenance, security updates and trained professional staff to operate it.

Summary

Algolia provides a REST API to query and update the search indices. All input and output is provided in JSON, making it extremely easy to use in front-end JavaScript. This means that Algolia is a more suitable search tool for a modern front-end or mobile-driven application where the records consist of products. It would be an ideal search solution for a webshop.
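To give an idea of how little is involved, here is a minimal sketch of a search request against the REST API using only the Python standard library. The endpoint layout and header names follow Algolia's REST API documentation as I understand it and should be verified against the current docs. The application ID, API key and index name (`products`) are placeholders, not real values.

```python
import json
import urllib.parse
import urllib.request


def build_search_request(app_id: str, api_key: str, index_name: str,
                         query: str, hits_per_page: int = 5):
    """Return (url, headers, body) for a search query; nothing is sent yet."""
    index = urllib.parse.quote(index_name, safe="")
    url = f"https://{app_id}-dsn.algolia.net/1/indexes/{index}/query"
    # Search parameters are passed as a URL-encoded string inside the JSON body.
    params = urllib.parse.urlencode(
        {"query": query, "hitsPerPage": hits_per_page},
        quote_via=urllib.parse.quote,
    )
    headers = {
        "X-Algolia-Application-Id": app_id,
        "X-Algolia-API-Key": api_key,
        "Content-Type": "application/json",
    }
    body = json.dumps({"params": params}).encode("utf-8")
    return url, headers, body


def search(app_id: str, api_key: str, index_name: str, query: str) -> list:
    """Send the query and return the matching records from the JSON response."""
    url, headers, body = build_search_request(app_id, api_key, index_name, query)
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["hits"]


if __name__ == "__main__":
    # Placeholder credentials: this only prints the request, it does not call the API.
    url, headers, body = build_search_request("YOUR_APP_ID", "YOUR_SEARCH_KEY",
                                              "products", "red shoes")
    print(url)
    print(body.decode("utf-8"))
```

In practice the team would more likely use one of Algolia's official client or UI libraries, but the raw request shows that the integration surface is a single JSON call.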

By being a SaaS product, Algolia allows the company to eliminate a server dependency.

It would be extremely simple to start using Algolia and integrate it into the webshop in a matter of hours and days instead of weeks and months.

Solr is stronger in cases where large amounts of content such as whole pages or text logs are indexed (used by e.g. Slack and Coursera). It is an enterprise level tool which gives lots of configuration options with the overhead cost of configuration, maintenance and special developer requirements.

Solr is configured in XML and written in Java, both of which differ from the typical front-end technology stack. Solr would add a need for Java backend developers, and XML does not play well with modern front-end languages and libraries.

Based on this information I would recommend using Algolia as the search provider.

I would expect the client to make a decision based on multiple variables, where overall cost (including licences, team costs and server costs) probably plays the key role. If there is a good relationship with the client and they trust us, they will most likely listen to what we say and follow our recommendation. This is easier if we provide enough data points to use as the basis of the decision-making process.
