This is important because it is a security risk
Some typical example of the problem:
- https://cidportal.jrc.ec.europa.eu/apps/gitlab/jeodpp/data-management/data-catalog/blob/2e2687193b52f1b44e3194dcb23a76c50ee06588/search.py#L82-84
- https://cidportal.jrc.ec.europa.eu/apps/gitlab/jeodpp/data-management/data-catalog/blob/2e2687193b52f1b44e3194dcb23a76c50ee06588/src/sql.py#L19-22
- https://cidportal.jrc.ec.europa.eu/apps/gitlab/jeodpp/data-management/data-catalog/blob/2e2687193b52f1b44e3194dcb23a76c50ee06588/search.py#L439
You should never use str.format
or %
to add user input into a query. You should always pass
the query parameters separately (i.e. when you call conn.execute()
).
Please read this carefully.
Unless you plan on rewriting this relatively soon, I would suggest to take the time to fix it.
Relevant reading:
- https://www.fullstackpython.com/sql-injection.html
- https://phpdelusions.net/sql_injection
- xkcd comic + comic explanation
This is a performance issue.
You open and close connections which takes time and CPU at the database. If your workers only do a handful of queries then it is not a huge deal. Still, you can easily do better by using connection pooling (which means that you open a bunch of connections which you reuse, instead of a creating a new connection from scratch its time).
There are two ways to implement this with postgresql:
We discussed this yesterday.
- records. Makes working with raw SQL queries a bit easier. See an example of the API here. Perhaps the data-export could prove useful?