Skip to content

Instantly share code, notes, and snippets.

@kissgyorgy
Last active May 6, 2024 08:06
Show Gist options
  • Save kissgyorgy/e2365f25a213de44b9a2 to your computer and use it in GitHub Desktop.
Save kissgyorgy/e2365f25a213de44b9a2 to your computer and use it in GitHub Desktop.
Python: py.test fixture for SQLAlchemy test in a transaction, create tables only once!
from sqlalchemy import create_engine
from sqlalchemy.orm import Session
from myapp.models import BaseModel
import pytest
@pytest.fixture(scope="session")
def engine():
return create_engine("postgresql://localhost/test_database")
@pytest.fixture(scope="session")
def tables(engine):
BaseModel.metadata.create_all(engine)
yield
BaseModel.metadata.drop_all(engine)
@pytest.fixture
def dbsession(engine, tables):
"""Returns an sqlalchemy session, and after the test tears down everything properly."""
connection = engine.connect()
# begin the nested transaction
transaction = connection.begin()
# use the connection with the already started transaction
session = Session(bind=connection)
yield session
session.close()
# roll back the broader transaction
transaction.rollback()
# put back the connection to the connection pool
connection.close()
@kissgyorgy
Copy link
Author

No, feel free to use it.

@glaucouri
Copy link

I suggest a more robust approach at tables function

@pytest.fixture(scope="session")
def tables(engine):
    BaseModel.metadata.create_all(engine)
    try:
        yield
    finally:
        BaseModel.metadata.drop_all(engine)

@kissgyorgy
Copy link
Author

I suggest a more robust approach at tables function

You don't need to do that. Pytest is doing it for you: https://docs.pytest.org/en/stable/fixture.html#yield-fixtures-recommended

@glaucouri
Copy link

you are right, thank's for point out

@nzewail
Copy link

nzewail commented Jun 9, 2021

Am I the only one who finds this taking a while to drop the tables at the end? When I remove that step it seems to run quite quickly but for some reason dropping the tables at the end of the session is taking a while?

@pattersongp
Copy link

Thanks for sharing this

@peterdudfield
Copy link

Thanks

@aopis
Copy link

aopis commented Apr 17, 2022

I've trying to use your conftest on my project but I don't get to add new rows on the db.
I defined a simple test:

def test_simple(dbsession):
    genres = Genres.query.all()
    assert len(genres) == 25
    new_genre = Genres(name="my_custom_genre")
    dbsession.add(new_genre)
    dbsession.commit()
    genres = Genres.query.all()
    assert len(genres) == 26

I don't get any exception but the last assert is False and the new genre is not on the db, after the commit, list of genres is still 25.
Any clue on what's going on?
The conftest is exactly than yours except for the database_uri

@kissgyorgy
Copy link
Author

From the Genres.query attribute, I'm guessing you are using Flask-SQLAlchemy plugin. If that 's the case, you are probably not configured the plugin correctly I'm guessing? And the fixture is creating a different table that the one you are trying to connect to in the test maybe?

This snippet has nothing to do with Flask-SQLAlchemy, as I mentioned it earlier.

@danielTobon43
Copy link

danielTobon43 commented Dec 2, 2022

Hi @kissgyorgy, could you please post an example of a test where we need to add some dummy data to the testing table before the test case and remove the data from the table once the test case is done?

This is my case: I need to add some dummy data to the table before a specific test case is run, then execute a GET query with that dummy data and then, I want to clean (or go back to the initial state: empty table) the table after the test case is done so the table will be available for a new test case. I was wondering if for this task I will have to use the db.commit() because when I just run the db.add(MYDATA) without the db.commit() and then, I run a GET request to the table, it is empty. Or is this the standard approach? because what I understand is that, once we call commit(), the transaction is completed and we are not able to rollback to a previous state., Could you help me with this?

@aqc-carlodri
Copy link

For me I had to change the last fixture to this:

@pytest.fixture
def db_session(engine: Engine, tables):
    connection = engine.connect()
    session = Session(bind=connection)

    yield session

    session.rollback()
    connection.close()

to get it working as expected.

Using SQLAlchemy 2.0 and Python 3.10.

@kissgyorgy
Copy link
Author

@aqc-carlodri if your tests commit anything to the database, you are not rolling it back this way and there can be junk left for the next run.

@aqc-carlodri
Copy link

@kissgyorgy thanks for the feedback, will test further then, for some reason with your code only the first test works, the others fail to see anything in the DB even if I add stuff.

@kissgyorgy
Copy link
Author

Yes, that's the whole point of these fixtures. Tests should be independent from each other, DB should be empty at the start of every test. The nested transaction what makes it possible to commit anything within one test case but as it will be rolled back at the end, there is no effect on other tests.

@dtcMLOps
Copy link

dtcMLOps commented Mar 13, 2023

In my case, this is my fixture

@pytest.fixture(scope="session")
def dbsession(tables) -> Generator[TestingSessionLocal, Any, None]:
    """fixture that returns an sqlalchemy session, and after the test tears down everything properly."""
    connection = engines["TEST"].connect()
    # begin the nested transaction
    transaction = connection.begin()
    # use the connection with the already started transaction
    session = TestingSessionLocal()

    yield session

    session.close()
    # roll back the broader transaction
    transaction.rollback()
    # put back the connection to the connection pool
    connection.close()

and in the endpoint to avoid committing a transaction I use flush according to sqlachemy docs

https://docs.sqlalchemy.org/en/20/orm/session_api.html#sqlalchemy.orm.Session.flush

def seed_database(dbsession, records: List[object]) -> None:
    """function to add dummy data to the database

    Args:
        dbsession (fixture): database connection
        records (list[object]): list of items to be added in the database
    """
    dbsession.bulk_save_objects(records)
    dbsession.flush()

and in the endpoint

def test_mytable(dbsession, test_client, new_data):

    # GIVEN 3 records
    record1 = new_data(probability=0.1)
    record2 = new_data(probability=0.2)
    record3 = new_data(probability=0.3)
    seed_database(dbsession, records=[record1, record2, record3])

    # WHEN the endpoint is requested (GET)
    response = test_client.get(url=f"/myendpoint")

    # THEN check that the response is valid 
    assert response.status_code == 200
    assert len(response.json()) == expected

With this approach, every new test execution will have an empty database table for adding new registers

@kissgyorgy
Copy link
Author

@dtcMLOps if you you only flush in test code, there is no point using a nested transaction, as without commit, closing the session will clean up everything. The point of my snippet is that you can real code with transaction handling, committing and everything and tests will still be isolated and leave no junk in the database with the nested transaction rollback.

@dtcMLOps
Copy link

dtcMLOps commented Mar 20, 2023

@dtcMLOps if you you only flush in test code, there is no point using a nested transaction, as without commit, closing the session will clean up everything. The point of my snippet is that you can real code with transaction handling, committing and everything and tests will still be isolated and leave no junk in the database with the nested transaction rollback.

Hi @kissgyorgy, your approach to running a test is not correct if I want to run tests concurrently. Commiting to the database will make the next test case that is pointing to the same table fail since there are "registers" committed from the previous test case. My approach allows me to run the test cases concurrently.

@pxu-mdsol
Copy link

pxu-mdsol commented Mar 31, 2023

@kissgyorgy are you sure your dbsession.commit() in your code really commiting to the database? I set the breakpoint right before and after dbsession.add(...), and dbsession.commit(), while monitoring the database status (I am using a database in container). I do see your line 14 and 16 create and drop tables in the database, but I never see dbsession.commit() really do anything in the database.

Overall it seems that all tests passed, however I have doubts whether physically "committing to the database" is true.

Could you post one of your test cases?

@kissgyorgy
Copy link
Author

@pxu-mdsol the whole point of this is that things are committed to a sub-transaction only, so nothing will be saved in the database after the test case. It depends on isolation level settings if committed things can be seen from other connections/transactions. See: https://en.wikipedia.org/wiki/Isolation_(database_systems)

@pxu-mdsol
Copy link

@kissgyorgy Thanks for the quick response. I know that "nothing will be saved in the database after the test case", how about during the test case? I assume you meant that it does "change the database", it's just everything rollback at the end of each test case.

As for as whether I can monitor that "change" in the database with a separate session in the database (such as a mysql query in a mysql client) concurrently with the test session, your point is that depends on the isolation level settings?

@pxu-mdsol
Copy link

pxu-mdsol commented Mar 31, 2023

Ok, I realized that I missed the role of "transaction" in line 24. It seems that if this "transaction" is created, then dbsession.commit() in the test case will not commit changes to the database, unless we literally call transaction.commit().

So the direct answer to my question:

"nothing will be saved in the database after the test case, how about during the test case?"

is NO. Even during the test case, changes never commit to database. They all stay in transaction, and finally canceled by your line 32.

@kissgyorgy
Copy link
Author

@pxu-mdsol

"nothing will be saved in the database after the test case, how about during the test case?"
is NO. Even during the test case, changes never commit to database. They all stay in transaction, and finally canceled by your line 32.

That is False. I'm not familiar with database implementation specifics, but I'm pretty sure you can find files on disk related to transactions for certain databases and engines. A concrete example is SQLite in WAL mode. Also as I mentioned, you can see the data committed during test based on isolation levels, so the data is definitely saved in the database somewhere.

I think the more precise way to put it is that "during test case, data will be saved in the database in a transaction and after the test case, everything will be deleted by rolling back the transaction."

@pxu-mdsol
Copy link

@kissgyorgy I agree you -- transaction is saved in the database. Thanks for the discussion!

@bashtian-fr
Copy link

its been a long time since this gist was posted, but I am having really inconsistent behaviors where one time it will succeed my tests, then another run it will complain that there is no such table. Like the create_all and drop_all weren't called or that the table already exists, Did anyone experience this behavior? I only have 3 tests in a class.

@kissgyorgy
Copy link
Author

The database and table creation fixtures are session scoped, so they are only created at the start of the first test which uses it and destroyed at the end of the test suite.
These SQLAlchemy APIs didn't change.
The only problem I can guess is that the test run was abrupted and the database tables were not deleted. You can try starting the tables fixture with drop_all() , so the tables are deleted BEFORE every test run.

@bashtian-fr
Copy link

Thanks for the answer, I tried that too but that would not help.
Altho I found the problematic part but I don't know how to solve it yet. The issue is that I have a lot of Core CPUs and I had pytest --numprocesses/-n on auto. My assumption is that 2 (or more) of the tests from the same class - which are using these db fixtures - were running on different processes which would create new test session. Setting it to 1 (or not adding it) solved the problem.

@kissgyorgy
Copy link
Author

With pytest-xdist, you should make a different test database for every process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment