pip install sqlalchemy urllib3[secure]
pip install -e git+https://github.com/okfn/tabulator-py@master#egg=tabulator
pip install -e git+https://github.com/okfn/datapackage-py.git@master#egg=datapackage
pip install -e git+https://github.com/okfn/jsontableschema-sql-py.git@update#egg=jtssql
pip install -e git+https://github.com/okfn/datapackage-storage-py@master#egg=datapackage_storage
-
-
Save vitorbaptista/19d476d99595584e9ad5 to your computer and use it in GitHub Desktop.
Importing and exporting a datapackage into/from a sqlite database
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| import sqlalchemy | |
| import jtssql | |
| import dpsql | |
| # The first steps of importing and exporting are the same. | |
| # The difference is that when importing we want to push a | |
| # datapackage to the storage, and when exporting we want to | |
| # create a datapackage from the data in the storage. | |
| # The first step is to point the DB with the data we want to | |
| # export. We'll use the same SQLite DB that was created on | |
| # "import_datapackage.py" | |
| engine = sqlalchemy.create_engine('sqlite:///nasdaq-listings.sqlite') | |
| # We then create a JSON Table Schema storage using this engine | |
| storage = jtssql.Storage(engine) | |
| # Now we're able to export the DB to a datapackage. The first parameter | |
| # is the storage we just created, the second is the path where we want | |
| # the datapackage descriptor to be written. The last parameter is the name | |
| # of the datapackage. The name must be a string following the data packages | |
| # spec (check http://dataprotocols.org/data-packages/#required-fields) | |
| datapackage_storage.export_package(storage, 'nasdaq-listings/datapackage.json', 'nasdaq-listings') |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| import sqlalchemy | |
| import jtssql | |
| import dpsql | |
| # We'll import a datapackage into a SQLite database. To do so, | |
| # we first have to create a sqlalchemy engine pointing to the SQLite DB | |
| # we want to use (it'll be created if it doesn't exist). | |
| engine = sqlalchemy.create_engine('sqlite:///nasdaq-listings.sqlite') | |
| # We then create a JSON Table Schema storage using this engine | |
| storage = jtssql.Storage(engine) | |
| # Now we're able to import the package. There're two required parameters: | |
| # the storage, where the package will be imported into, and the datapackage. | |
| # We can pass the datapackage in many ways. As an actual dict, a path to a local file, | |
| # a URL, etc. Here we'll pass a URL. For all other options, check | |
| # https://github.com/okfn/datapackage-py/blob/master/datapackage/datapackage.py#L27-L34 | |
| datapackage_storage.import_package(storage, 'https://github.com/datasets/nasdaq-listings/archive/master.zip') |
Hi,
I tried following you word to word but I get error "ImportError: No module named jtssql" every time I run the python code. Has something changed. Do you have a new gist? Please guide me.
Same question: why has jtssql disappeared? I'm building a pipeline that imports a frictionless tabular data package to SQLite, processing it using SQL then exporting it to another package. As far as I know, there is no way with the existing tools/libraries to automatically get data out of SQLite to get a complete tabular package (with all embedded CSV files).
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Shall we change
import dpsqltoimport datapackage_storage?