shaypal5/pdp_post_adv2.py

## pdp_post_adv2.py
>>> mp = MyPipelineAndModel(
...     savings_max_val=101,
...     drop_gender=False,
...     standardize=True,
...     ohencode_country=True,
...     savings_bin_val=1,
...     pca_threshold=25,
...     fit_intercept=True)
>>> mp
<PdPipeline -> LogisticRegression>
>>> mp.estimator
LogisticRegression()
>>> mp.pipeline
A pdpipe pipeline:
[ 0]  Drop columns Columns with at least 0.2 missing value rate
[ 1]  Drop rows by label values
[ 2]  Encode label values
[ 3]  Drop columns 'Name'
[ 4]  Apply dataframe method set_index with kwargs {'keys': 'id'}
[ 5]  Drop rows by qualifier <RowQualifier: Qualify rows with X[Savings] >
      101>
[ 6]  Assign column Viking with df[Country].isin(['Denmark', 'Finland']) &
      ~df[Bearded]
[ 7]  Assign column YearlyGrands with df[Savings] * 1000 / df[Age]
[ 8]  Bin Savings by [1].
[ 9]  One-hot encode 'Country'
[10]  Tokenize Quote
[11]  Stemming tokens in Quote...
[12]  Remove stopwords from Quote
[13]  Count-vectorizing column Quote.
[14]  Decompose columns Columns that start with Quote with PCA
[15]  Encode 'Savings_bin', 'Gender'
[16]  Scale columns Columns of dtypes <class 'numpy.number'>
[17]  Drop columns 'Bearded'
[18]  Transform input dataframes to the following schema: <Learnable Schema>
[19]  Validates conditions
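
The session above suggests that constructor arguments such as `savings_max_val` and `drop_gender` decide which stages end up in the pipeline. The sketch below illustrates that pattern in plain Python, without pdpipe or scikit-learn. It is an assumption about the general idea, not the author's implementation: `SketchPipelineAndModel`, `drop_column`, `drop_rows_above`, and the default argument values are hypothetical names and values introduced only for illustration, and rows are modelled as plain dictionaries rather than a dataframe.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Stage = Callable[[List[Row]], List[Row]]


def drop_column(name: str) -> Stage:
    """Return a stage that removes one column from every row."""
    def stage(rows: List[Row]) -> List[Row]:
        return [{k: v for k, v in row.items() if k != name} for row in rows]
    return stage


def drop_rows_above(column: str, max_val: float) -> Stage:
    """Return a stage that drops rows whose value in `column` exceeds `max_val`."""
    def stage(rows: List[Row]) -> List[Row]:
        return [row for row in rows if row[column] <= max_val]
    return stage


class SketchPipelineAndModel:
    """Hypothetical stand-in for MyPipelineAndModel (not the author's code)."""

    def __init__(self, savings_max_val: float = 100, drop_gender: bool = True):
        # Stages that are always present in this sketch.
        stages = [
            drop_column("Name"),
            drop_rows_above("Savings", savings_max_val),
        ]
        # Optional stage, toggled by a constructor argument.
        if drop_gender:
            stages.append(drop_column("Gender"))
        self.stages = stages

    def transform(self, rows: List[Row]) -> List[Row]:
        # Apply every stage in order, feeding each output to the next stage.
        for stage in self.stages:
            rows = stage(rows)
        return rows


rows = [
    {"Name": "a", "Savings": 50, "Gender": "F"},
    {"Name": "b", "Savings": 500, "Gender": "M"},
]
mp_sketch = SketchPipelineAndModel(savings_max_val=101, drop_gender=False)
result = mp_sketch.transform(rows)
```

With `drop_gender=False` the `Gender` column stays in the data, which matches the printed pipeline above, where `Gender` is still available to be encoded at stage 15.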