sleroux/frecency.md

## frecency.md

      
    Raw
  

              frecency.md
            
          
    The frecency query [1] we're making for our search recommendations/top sites is painfully slow. The query can up to a few seconds before completing. This is noticeable in places such as Top Sites where we run this query every time the user visits the panel. The items in Top Sites don’t render until the query completes. When taking a closer look at the query we're running to determine frecency, there are a number of factors which are causing it to be slow. Unlike desktop, frecency on mobile is calculated in real time instead of being pre-calculated. This results in a complex query that touches various tables with inner joins and computationally expensive ORDER/COALESCE operations. The query is then run on a potentially large data set since it touches the history table. In cases where a user has attached their FxA account, their desktop history will also be part of this.
After some discussion, there are 2 approaches we can take to optimize the way we handle frecency:


Tweak/modify the existing query. There may be opportunities in the current query that we can take advantage of to make it a bit quicker. For example, instead of doing a full table scan of history items, we can ignore items in the table that have no local visits right off the bat in our first query to reduce the data set we perform the joins/frecency calculations against. Modifying the existing query requires less coding work but could have larger affects on the UX of some areas of the app such as Top Sites. For example, ignoring remote visits from frecency will result in top sites not being affected by syncing with your FxA account.


Split the calculation and retrieval of frecency data into two different operations. This approach would follow more closely with desktop in that frecency values would be calculated not in real time, but according to some sort of heuristic. For example, desktop calculates frecency on a daily basis where as on mobile we would need to determine one that would fit our needs. The advantage of this approach is that it moves the long operation of calculating frecency into an non-blocking task and removes the need for spending the time calculating it when we need to present data to the user. On the other hand, because frecency would be calculated ‘lazily’, the data now has a ‘stale’ state which would require us to add in a invalidation heuristic. The other downside is that this would changes to the database schema and altered queries.


[1] Frecency Query: https://github.com/mozilla/firefox-ios/blob/master/Storage/SQL/SQLiteHistory.swift#L306