Our search engine currently consists of an InstantSearch front-end on top of an open-source search engine called Typesense. We wrote specialized scripts to clean and load the government transparency data into the Typesense instance.
The front-end is currently public with unlimited usage. We want to limit the number of daily searches available to each visitor (similar to newspaper paywalls), while still allowing people to request a fair-use account (journalists, for example).
Technically this means:
- Implementing a back-end that proxies search requests to Typesense and checks user authentication. This requires full-fledged user management that meets reasonable security standards, preferably built on existing authentication solutions or libraries.
- Modifying the InstantSearch-based front-end to send the proxied requests with user credentials, and embedding the view in a page that lets users optionally authenticate through a user portal. Payment management is not needed, but there has to be an admin page for manually enabling full access for paying users.
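As a rough illustration of the check the proxy would run before forwarding a search to Typesense, here is a minimal sketch in plain Python. The names `SearchQuota`, `allow`, `visitor_key`, and `DAILY_FREE_SEARCHES` are hypothetical, and the limit of 20 is an assumed placeholder, since no number has been decided. The counter is kept in memory only to keep the example short. A real implementation would store it in the database and take `full_access` from the authenticated user record.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

# Assumed placeholder: the actual daily limit has not been decided.
DAILY_FREE_SEARCHES = 20


@dataclass
class SearchQuota:
    """In-memory daily search counter per visitor (hypothetical helper)."""

    limit: int = DAILY_FREE_SEARCHES
    # Maps a visitor key (e.g. user id or session id) to (day, searches used).
    _usage: dict = field(default_factory=dict)

    def allow(self, visitor_key: str, full_access: bool = False,
              today: Optional[date] = None) -> bool:
        """Return True if the search may be forwarded, and count it."""
        # Accounts with full access enabled by an admin are never limited.
        if full_access:
            return True

        today = today or date.today()
        day, used = self._usage.get(visitor_key, (today, 0))

        # A new day resets the visitor's counter.
        if day != today:
            used = 0

        if used >= self.limit:
            return False

        self._usage[visitor_key] = (today, used + 1)
        return True
```

The proxy endpoint would call `allow(...)` once per incoming search and return an error response instead of contacting Typesense when it returns `False`. How anonymous visitors are identified (cookie, IP address, or something else) is left open here and is part of the proposal.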
Stack:
- Back-end solutions implemented in Python (FastAPI, Flask, Django) are encouraged but not required.
- The database system must be open source and self-hosted (PostgreSQL, MariaDB, MongoDB, etc.).
- If possible, the front-end should be kept separate from the back-end as a static site, so that it can be served directly through CDNs. This means server-side templating is discouraged, but proposals that use it are still accepted.