As far as I know the docs are not super clear on this. But if your
resources have the correct path
and sql dialect
(new in v5) the
workflow you described should work with a call to
Resource.to_pandas
.
{
"name": "date",
"path": "sqlite:///database.db",
"dialect": {
"sql": {
"table": "date"
}
},
"schema": {...}
}
Here is an example. First reading the data package
import pandas as pd
from frictionless import Package
package = Package('datapackage.json')
and now checking that the types were corrected converted from the original schema
sales = package.get_resource('sales').to_pandas()
# unit_price and total_price are stored as text in sqlite with euopean separators (eg. 1.049,99)
sales.dtypes
date_id int64
product_id int64
quantity int64
unit_price float64
total_price float64
dtype: object
date = package.get_resource('date').to_pandas()
date.dtypes
date_val datetime64[ns]
year int64
yearmonth object
weekend bool
dtype: object
You can also do validation. Here a regex pattern and a foreign-key validation errors are raised:
package.validate()
{'valid': False,
'stats': {'tasks': 3, 'errors': 2, 'warnings': 0, 'seconds': 0.019},
'warnings': [],
'errors': [],
'tasks': [{'name': 'date',
'type': 'table',
'valid': True,
'place': 'sqlite:///database.db',
'labels': ['id', 'date_val', 'year', 'yearmonth', 'weekend'],
'stats': {'errors': 0,
'warnings': 0,
'seconds': 0.004,
'fields': 5,
'rows': 5},
'warnings': [],
'errors': []},
{'name': 'product',
'type': 'table',
'valid': False,
'place': 'sqlite:///database.db',
'labels': ['id', 'code', 'name', 'description', 'category'],
'stats': {'errors': 1,
'warnings': 0,
'seconds': 0.004,
'fields': 5,
'rows': 3},
'warnings': [],
'errors': [{'type': 'constraint-error',
'title': 'Constraint Error',
'description': 'A field value does not conform to a '
'constraint.',
'message': 'The cell "sku002" in row at position "4" '
'and field "code" at position "2" does not '
'conform to a constraint: constraint '
'"pattern" is "SKU\\d{3}"',
'tags': ['#table', '#row', '#cell'],
'note': 'constraint "pattern" is "SKU\\d{3}"',
'cells': ['3',
'sku002',
'Product C',
'Yet another great product',
'Category 2'],
'rowNumber': 4,
'cell': 'sku002',
'fieldName': 'code',
'fieldNumber': 2}]},
{'name': 'sales',
'type': 'table',
'valid': False,
'place': 'sqlite:///database.db',
'labels': ['sale_id',
'date_id',
'product_id',
'quantity',
'unit_price',
'total_price'],
'stats': {'errors': 1,
'warnings': 0,
'seconds': 0.01,
'fields': 6,
'rows': 5},
'warnings': [],
'errors': [{'type': 'foreign-key',
'title': 'ForeignKey Error',
'description': 'Values in the foreign key fields '
'should exist in the reference table',
'message': 'Row at position "6" violates the foreign '
'key: for "product_id": values "4" not '
'found in the lookup table "product" as '
'"id"',
'tags': ['#table', '#row'],
'note': 'for "product_id": values "4" not found in the '
'lookup table "product" as "id"',
'cells': ['5', '5', '4', '3', '14,99', '44,97'],
'rowNumber': 6,
'fieldNames': ['product_id'],
'fieldCells': ['4'],
'referenceName': 'product',
'referenceFieldNames': ['id']}]}]}
– Source