{
  "id" : "1111",
  "updateTimestamp" : "2024-04-10 12:12:12",
  "updateUser" : "testuser",
  "shareClassId" : "1234",
  "shareClassCode" : "9999",
  "accountNumber" : "xxxx",
  "accountType" : {
    "id" : "8888",
    "valueType" : "test_type",
-- set up db context
CREATE SCHEMA TRP_TESTING_MHARRIS.PRDMSTR;
USE SCHEMA TRP_TESTING_MHARRIS.PRDMSTR;
-- check my S3 Storage Integration
-- docs: https://docs.snowflake.com/en/sql-reference/sql/create-storage-integration
DESC INTEGRATION mharris_trp_S3_Int;
-- Create JSON file format
-- there are more options for this, but here is a basic one.
-- create main DB
CREATE DATABASE TEST;
-- main DB table
CREATE TABLE TEST.PUBLIC.TEST_TABLE AS
SELECT 'A' AS COL1,
       2 AS COL2;
-- clone main DB
CREATE DATABASE TEST_CLONE CLONE TEST;
create OR REPLACE function get_text_classification(x text, y text)
returns variant
language python
runtime_version = 3.9
imports=('@TIAA_TESTING.NLP_MODEL.model_data_test/bart-large-mnli.joblib')
packages = ('cachetools==4.2.2', 'transformers==4.32.1', 'joblib', 'pytorch')
handler = 'get_text_classification'
as $$
import pandas as pd
import sys
## Changes from previous version
## max_batch_size=10 controls the batch size; with the big pretrained file, a lower number here helps performance
## PandasSeries[list] input so that we can concatenate the class_labels argument
## sentences.iloc[0][1] extracts the class label from the input series
## sentence[0] extracts the text to be classified from the input series
session.clear_imports()
session.clear_packages()
from snowflake.snowpark.functions import pandas_udf, object_construct, array_construct
{
  "user": "YOUR USER NAME",
  "password": "YOUR PASSWORD",
  "role": "ACCOUNTADMIN",
  "account": "ORG-ACCOUNT",
  "warehouse": "VWH NAME",
  "database": "DB NAME",
  "schema": "SCHEMA NAME"
}
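A minimal sketch of how a connection file like the one above might be read and checked before use. The helper name `load_connection_parameters` and the idea of validating the keys up front are assumptions for illustration, not part of the original gist; the key list simply mirrors the fields shown in the JSON.

```python
import json

# Keys taken from the JSON template above.
REQUIRED_KEYS = (
    "user", "password", "role", "account",
    "warehouse", "database", "schema",
)


def load_connection_parameters(path):
    """Load connection parameters from a JSON file (hypothetical helper)."""
    with open(path, encoding="utf-8") as fh:
        params = json.load(fh)
    # Fail early with a readable message if the template is incomplete.
    missing = [key for key in REQUIRED_KEYS if key not in params]
    if missing:
        raise KeyError("missing connection keys: " + ", ".join(missing))
    return params
```

With Snowpark installed, the returned dict could then be handed to `Session.builder.configs(params).create()` to open a session.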
import traceback
import pandas as pd
# from sklearn.metrics import accuracy_score, classification_report
# from sklearn.model_selection import train_test_split
# from sklearn.preprocessing import LabelEncoder
from torch import Tensor, device
from torch.utils.data import DataLoader
from transformers import BartTokenizerFast
from transformers import pipeline, BartForSequenceClassification, Trainer, TrainingArguments, \
    EvalPrediction, AdamW, get_linear_schedule_with_warmup
## Function-specific imports need to be in the same notebook cell to be compiled with the UDF
## cachetools is a helper library used to cache the model file for faster loading
import cachetools
import sys
import joblib
## This @ decorator tells Snowpark to cache this function
## Note: this is an arbitrary Python function, not a UDF or SPROC
## Note: sys._xoptions.get("snowflake_import_directory") returns the location of the import dir
## Note: serializing the model with joblib is preferable to a zip/gzip
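The notes above describe a cached model loader that reads from the UDF import directory. A rough, standard-library-only sketch of that pattern follows. It is not the gist's own code: `functools.lru_cache` stands in for the `cachetools` decorator and `pickle` stands in for `joblib`, so the example runs without extra packages, and the function names `import_dir` and `load_model` are hypothetical.

```python
import functools
import pickle
import sys


def import_dir(default="."):
    # Inside a Snowflake Python UDF, staged IMPORTS are placed in the
    # directory stored under this key; outside Snowflake the key is
    # absent, so the (assumed) default is returned instead.
    return sys._xoptions.get("snowflake_import_directory", default)


@functools.lru_cache(maxsize=1)
def load_model(path):
    # Stand-in for a cachetools-decorated loader: the file is read and
    # deserialized once per path, and later calls reuse the same object.
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

The point of the cache is that a large pretrained file is deserialized once and reused across calls, rather than being reloaded every time the handler runs.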
name: snowpark_torch_v2_1_12_1
channels:
  - https://repo.anaconda.com/pkgs/snowflake
dependencies:
  - python=3.9
  - pip
  - traceback2
  - pandas
  - scikit-learn
  - pytorch
SET AWS_ACCESS_KEY_ID='YOUR AWS_Access_Key_ID';
SET AWS_SECRET_ACCESS_KEY='YOUR AWS_Secret_Access_Key';
CREATE OR REPLACE SECRET my_aws_access_key
  TYPE = GENERIC_STRING
  SECRET_STRING = $AWS_ACCESS_KEY_ID;
CREATE OR REPLACE SECRET my_aws_secret_key
  TYPE = GENERIC_STRING