@adityantamarapu
Last active August 26, 2020 03:07
Created on Skills Network Labs
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Data Wrangling"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>medName</th>\n",
" <th>medlife_price</th>\n",
" <th>manufacturer</th>\n",
" <th>Ingredients</th>\n",
" <th>medicineType</th>\n",
" <th>Pack Size</th>\n",
" <th>strength</th>\n",
" <th>pack_form</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>ABAMUNE 300 MG</td>\n",
" <td>1333.55</td>\n",
" <td>CIPLA LTD.</td>\n",
" <td>ABACAVIR</td>\n",
" <td>TABLET</td>\n",
" <td>30</td>\n",
" <td>300 MG</td>\n",
" <td>Strip</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>ABAMUNE L</td>\n",
" <td>2537.00</td>\n",
" <td>CIPLA LTD.</td>\n",
" <td>ABACAVIR</td>\n",
" <td>TABLET</td>\n",
" <td>30</td>\n",
" <td>Not Available</td>\n",
" <td>Strip</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>ACEFLEX PLUS 100/500 MG</td>\n",
" <td>15.99</td>\n",
" <td>CIPLA LTD.</td>\n",
" <td>ACECLOFENAC</td>\n",
" <td>TABLET</td>\n",
" <td>10</td>\n",
" <td>100/500 MG</td>\n",
" <td>Strip</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>ACEFLEX PLUS 100/500 MG</td>\n",
" <td>15.99</td>\n",
" <td>CIPLA LTD.</td>\n",
" <td>ACECLOFENAC</td>\n",
" <td>TABLET</td>\n",
" <td>10</td>\n",
" <td>100/500 MG</td>\n",
" <td></td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>ACIGENE MINT</td>\n",
" <td>5.18</td>\n",
" <td>CIPLA LTD.</td>\n",
" <td>NaN</td>\n",
" <td>TABLET</td>\n",
" <td>10</td>\n",
" <td>Not Available</td>\n",
" <td>Strip</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" medName medlife_price manufacturer Ingredients \\\n",
"0 ABAMUNE 300 MG 1333.55 CIPLA LTD. ABACAVIR \n",
"1 ABAMUNE L 2537.00 CIPLA LTD. ABACAVIR \n",
"2 ACEFLEX PLUS 100/500 MG 15.99 CIPLA LTD. ACECLOFENAC \n",
"3 ACEFLEX PLUS 100/500 MG 15.99 CIPLA LTD. ACECLOFENAC \n",
"4 ACIGENE MINT 5.18 CIPLA LTD. NaN \n",
"\n",
" medicineType Pack Size strength pack_form \n",
"0 TABLET 30 300 MG Strip \n",
"1 TABLET 30 Not Available Strip \n",
"2 TABLET 10 100/500 MG Strip \n",
"3 TABLET 10 100/500 MG \n",
"4 TABLET 10 Not Available Strip "
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"df=pd.read_excel(r\"Dataset M.xlsx\")\n",
"df1=pd.read_excel(r\"Dataset O.xlsx\")\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>medName</th>\n",
" <th>onemg_price</th>\n",
" <th>Pack Size</th>\n",
" <th>Pack Form</th>\n",
" <th>Strength</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Cofsils Cough Syrup</td>\n",
" <td>69.7</td>\n",
" <td>100</td>\n",
" <td>Bottle</td>\n",
" <td>Not Available</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Nicogum 2 Mini Lozenges Mint Plus</td>\n",
" <td>76.5</td>\n",
" <td>10</td>\n",
" <td>Strip</td>\n",
" <td>Not Available</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Nicogum 4 Nicotine Gum Fresh Mint Sugar Free</td>\n",
" <td>74.8</td>\n",
" <td>10</td>\n",
" <td>Packet</td>\n",
" <td>Not Available</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Nicotex 21mg Patch</td>\n",
" <td>637.5</td>\n",
" <td>7</td>\n",
" <td>Packet</td>\n",
" <td>21mg</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Nicotex 4mg Chewing Gums Classic Fresh Mint</td>\n",
" <td>74.8</td>\n",
" <td>9</td>\n",
" <td>Packet</td>\n",
" <td>4mg</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" medName onemg_price Pack Size \\\n",
"0 Cofsils Cough Syrup 69.7 100 \n",
"1 Nicogum 2 Mini Lozenges Mint Plus 76.5 10 \n",
"2 Nicogum 4 Nicotine Gum Fresh Mint Sugar Free 74.8 10 \n",
"3 Nicotex 21mg Patch 637.5 7 \n",
"4 Nicotex 4mg Chewing Gums Classic Fresh Mint 74.8 9 \n",
"\n",
" Pack Form Strength \n",
"0 Bottle Not Available \n",
"1 Strip Not Available \n",
"2 Packet Not Available \n",
"3 Packet 21mg \n",
"4 Packet 4mg "
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.drop(['manufacturer','Ingredients','medicineType'],axis=1,inplace=True)\n",
"df1.drop(['manufacturer','Ingredients','Prescription Required','Units in Pack','Stock'],axis=1,inplace=True)\n",
"new=df1[\"onemg_price\"].str.split('¹',1,expand=True)\n",
"df1[\"onemg_price\"]=new[1]\n",
"df1.head()\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>medName</th>\n",
" <th>onemg_price</th>\n",
" <th>Pack Size</th>\n",
" <th>Pack Form</th>\n",
" <th>Strength</th>\n",
" <th>medNameSplit1</th>\n",
" <th>medNameSplit2</th>\n",
" <th>medNameSplit3</th>\n",
" <th>medNameSplit4</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Cofsils Cough Syrup</td>\n",
" <td>69.7</td>\n",
" <td>100</td>\n",
" <td>Bottle</td>\n",
" <td>Not Available</td>\n",
" <td>Cofsils</td>\n",
" <td>Cough Syrup</td>\n",
" <td>Cough</td>\n",
" <td>Syrup</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Nicogum 2 Mini Lozenges Mint Plus</td>\n",
" <td>76.5</td>\n",
" <td>10</td>\n",
" <td>Strip</td>\n",
" <td>Not Available</td>\n",
" <td>Nicogum</td>\n",
" <td>2 Mini Lozenges Mint Plus</td>\n",
" <td>2</td>\n",
" <td>Mini Lozenges Mint Plus</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Nicogum 4 Nicotine Gum Fresh Mint Sugar Free</td>\n",
" <td>74.8</td>\n",
" <td>10</td>\n",
" <td>Packet</td>\n",
" <td>Not Available</td>\n",
" <td>Nicogum</td>\n",
" <td>4 Nicotine Gum Fresh Mint Sugar Free</td>\n",
" <td>4</td>\n",
" <td>Nicotine Gum Fresh Mint Sugar Free</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Nicotex 21mg Patch</td>\n",
" <td>637.5</td>\n",
" <td>7</td>\n",
" <td>Packet</td>\n",
" <td>21mg</td>\n",
" <td>Nicotex</td>\n",
" <td>21mg Patch</td>\n",
" <td>21mg</td>\n",
" <td>Patch</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Nicotex 4mg Chewing Gums Classic Fresh Mint</td>\n",
" <td>74.8</td>\n",
" <td>9</td>\n",
" <td>Packet</td>\n",
" <td>4mg</td>\n",
" <td>Nicotex</td>\n",
" <td>4mg Chewing Gums Classic Fresh Mint</td>\n",
" <td>4mg</td>\n",
" <td>Chewing Gums Classic Fresh Mint</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" medName onemg_price Pack Size \\\n",
"0 Cofsils Cough Syrup 69.7 100 \n",
"1 Nicogum 2 Mini Lozenges Mint Plus 76.5 10 \n",
"2 Nicogum 4 Nicotine Gum Fresh Mint Sugar Free 74.8 10 \n",
"3 Nicotex 21mg Patch 637.5 7 \n",
"4 Nicotex 4mg Chewing Gums Classic Fresh Mint 74.8 9 \n",
"\n",
" Pack Form Strength medNameSplit1 \\\n",
"0 Bottle Not Available Cofsils \n",
"1 Strip Not Available Nicogum \n",
"2 Packet Not Available Nicogum \n",
"3 Packet 21mg Nicotex \n",
"4 Packet 4mg Nicotex \n",
"\n",
" medNameSplit2 medNameSplit3 \\\n",
"0 Cough Syrup Cough \n",
"1 2 Mini Lozenges Mint Plus 2 \n",
"2 4 Nicotine Gum Fresh Mint Sugar Free 4 \n",
"3 21mg Patch 21mg \n",
"4 4mg Chewing Gums Classic Fresh Mint 4mg \n",
"\n",
" medNameSplit4 \n",
"0 Syrup \n",
"1 Mini Lozenges Mint Plus \n",
"2 Nicotine Gum Fresh Mint Sugar Free \n",
"3 Patch \n",
"4 Chewing Gums Classic Fresh Mint "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"new1=df1[\"medName\"].str.split(' ',1,expand=True)\n",
"df1[\"medNameSplit1\"]=new1[0]\n",
"df1[\"medNameSplit2\"]=new1[1]\n",
"new11=df1[\"medNameSplit2\"].str.split(' ',1,expand=True)\n",
"df1[\"medNameSplit3\"]=new11[0]\n",
"df1[\"medNameSplit4\"]=new11[1]\n",
"df1.head()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>medName</th>\n",
" <th>medlife_price</th>\n",
" <th>Pack Size</th>\n",
" <th>strength</th>\n",
" <th>pack_form</th>\n",
" <th>medNameSplit1</th>\n",
" <th>medNameSplit2</th>\n",
" <th>medNameSplit3</th>\n",
" <th>medNameSplit4</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>ABAMUNE 300 MG</td>\n",
" <td>1333.55</td>\n",
" <td>30</td>\n",
" <td>300 MG</td>\n",
" <td>Strip</td>\n",
" <td>ABAMUNE</td>\n",
" <td>300 MG</td>\n",
" <td>300</td>\n",
" <td>MG</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>ABAMUNE L</td>\n",
" <td>2537.00</td>\n",
" <td>30</td>\n",
" <td>Not Available</td>\n",
" <td>Strip</td>\n",
" <td>ABAMUNE</td>\n",
" <td>L</td>\n",
" <td>L</td>\n",
" <td></td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>ACEFLEX PLUS 100/500 MG</td>\n",
" <td>15.99</td>\n",
" <td>10</td>\n",
" <td>100/500 MG</td>\n",
" <td>Strip</td>\n",
" <td>ACEFLEX</td>\n",
" <td>PLUS 100/500 MG</td>\n",
" <td>PLUS</td>\n",
" <td>100/500 MG</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>ACEFLEX PLUS 100/500 MG</td>\n",
" <td>15.99</td>\n",
" <td>10</td>\n",
" <td>100/500 MG</td>\n",
" <td></td>\n",
" <td>ACEFLEX</td>\n",
" <td>PLUS 100/500 MG</td>\n",
" <td>PLUS</td>\n",
" <td>100/500 MG</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>ACIGENE MINT</td>\n",
" <td>5.18</td>\n",
" <td>10</td>\n",
" <td>Not Available</td>\n",
" <td>Strip</td>\n",
" <td>ACIGENE</td>\n",
" <td>MINT</td>\n",
" <td>MINT</td>\n",
" <td></td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" medName medlife_price Pack Size strength pack_form \\\n",
"0 ABAMUNE 300 MG 1333.55 30 300 MG Strip \n",
"1 ABAMUNE L 2537.00 30 Not Available Strip \n",
"2 ACEFLEX PLUS 100/500 MG 15.99 10 100/500 MG Strip \n",
"3 ACEFLEX PLUS 100/500 MG 15.99 10 100/500 MG \n",
"4 ACIGENE MINT 5.18 10 Not Available Strip \n",
"\n",
" medNameSplit1 medNameSplit2 medNameSplit3 medNameSplit4 \n",
"0 ABAMUNE 300 MG 300 MG \n",
"1 ABAMUNE L L \n",
"2 ACEFLEX PLUS 100/500 MG PLUS 100/500 MG \n",
"3 ACEFLEX PLUS 100/500 MG PLUS 100/500 MG \n",
"4 ACIGENE MINT MINT "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"new0=df[\"medName\"].str.split(' ',1,expand=True)\n",
"df[\"medNameSplit1\"]=new0[0]\n",
"df[\"medNameSplit2\"]=new0[1]\n",
"new00=df[\"medNameSplit2\"].str.split(' ',1,expand=True)\n",
"df[\"medNameSplit3\"]=new00[0]\n",
"df[\"medNameSplit4\"]=new00[1]\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(4284, 9)"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df00=df.drop_duplicates(keep=\"first\")\n",
"df11=df1.drop_duplicates(keep='first')\n",
"df11.shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df00['medName'] = df00['medName'].str.lower() \n",
"df00['medNameSplit1'] = df00['medNameSplit1'].str.lower() \n",
"df00['medNameSplit2'] = df00['medNameSplit2'].str.lower() \n",
"df00['medNameSplit3'] = df00['medNameSplit3'].str.lower() \n",
"df00['medNameSplit4'] = df00['medNameSplit4'].str.lower()\n",
"df11['medName'] = df11['medName'].str.lower() \n",
"df11['medNameSplit1'] = df11['medNameSplit1'].str.lower() \n",
"df11['medNameSplit2'] = df11['medNameSplit2'].str.lower() \n",
"df11['medNameSplit3'] = df11['medNameSplit3'].str.lower() \n",
"df11['medNameSplit4'] = df11['medNameSplit4'].str.lower()\n",
"df11.sort_values(\"medName\", axis = 0, ascending = True, \n",
" inplace = True, na_position ='last') \n",
"df00.sort_values(\"medName\", axis = 0, ascending = True, \n",
" inplace = True, na_position ='last') "
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>medName_x</th>\n",
" <th>medlife_price</th>\n",
" <th>Pack Size_x</th>\n",
" <th>strength</th>\n",
" <th>pack_form</th>\n",
" <th>medNameSplit1</th>\n",
" <th>medNameSplit2</th>\n",
" <th>medNameSplit3</th>\n",
" <th>medNameSplit4</th>\n",
" <th>medName_y</th>\n",
" <th>onemg_price</th>\n",
" <th>Pack Size_y</th>\n",
" <th>Pack Form</th>\n",
" <th>Strength</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>ab phylline</td>\n",
" <td>117.82</td>\n",
" <td>100 ML</td>\n",
" <td>Not Available</td>\n",
" <td>Bottle</td>\n",
" <td>ab</td>\n",
" <td>phylline</td>\n",
" <td>phylline</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>ab phylline 100 mg</td>\n",
" <td>104.55</td>\n",
" <td>10</td>\n",
" <td>100 MG</td>\n",
" <td>Capsule</td>\n",
" <td>ab</td>\n",
" <td>phylline 100 mg</td>\n",
" <td>phylline</td>\n",
" <td>100 mg</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>ab phylline 100 mg</td>\n",
" <td>104.55</td>\n",
" <td>10</td>\n",
" <td>100 MG</td>\n",
" <td>Bottle</td>\n",
" <td>ab</td>\n",
" <td>phylline 100 mg</td>\n",
" <td>phylline</td>\n",
" <td>100 mg</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>ab phylline n</td>\n",
" <td>148.32</td>\n",
" <td>10</td>\n",
" <td>Not Available</td>\n",
" <td>Strip</td>\n",
" <td>ab</td>\n",
" <td>phylline n</td>\n",
" <td>phylline</td>\n",
" <td>n</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>ab phylline sr 200 mg</td>\n",
" <td>163.20</td>\n",
" <td>10</td>\n",
" <td>200 MG</td>\n",
" <td>Bottle</td>\n",
" <td>ab</td>\n",
" <td>phylline sr 200 mg</td>\n",
" <td>phylline</td>\n",
" <td>sr 200 mg</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" medName_x medlife_price Pack Size_x strength pack_form \\\n",
"0 ab phylline 117.82 100 ML Not Available Bottle \n",
"1 ab phylline 100 mg 104.55 10 100 MG Capsule \n",
"2 ab phylline 100 mg 104.55 10 100 MG Bottle \n",
"3 ab phylline n 148.32 10 Not Available Strip \n",
"4 ab phylline sr 200 mg 163.20 10 200 MG Bottle \n",
"\n",
" medNameSplit1 medNameSplit2 medNameSplit3 medNameSplit4 medName_y \\\n",
"0 ab phylline phylline NaN \n",
"1 ab phylline 100 mg phylline 100 mg NaN \n",
"2 ab phylline 100 mg phylline 100 mg NaN \n",
"3 ab phylline n phylline n NaN \n",
"4 ab phylline sr 200 mg phylline sr 200 mg NaN \n",
"\n",
" onemg_price Pack Size_y Pack Form Strength \n",
"0 NaN NaN NaN NaN \n",
"1 NaN NaN NaN NaN \n",
"2 NaN NaN NaN NaN \n",
"3 NaN NaN NaN NaN \n",
"4 NaN NaN NaN NaN "
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dffinal=pd.merge(df00, df11, how='left', left_on=['medNameSplit1','medNameSplit2','medNameSplit3','medNameSplit4']\n",
" , right_on=['medNameSplit1','medNameSplit2','medNameSplit3','medNameSplit4'])\n",
"dffinal.head()\n",
"# dffinal=pd.concat([df00[\"medName\"],df11[\"medName\"]],join='left')\n",
"# dffinal.head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df00.cmp(df11)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"###### Hi, its been a while since I learnt python I usually used tools like Excel and Tableau for the most parts. So, It has been refreshing to try to complete this task. As you can see I tried many ways to solve this but I couldnt. But, I am not willing to give up learning. I am tired of all the fake internship offers on Internshalla who either try to sell courses or ask to do marketing for them. I was actually glad when I was given an actual task which was really hard. If you guys are not going to go through with my resume, will you atleast send me more tasks like this, because I am not giving up learning. Thanks."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python",
"language": "python",
"name": "conda-env-python-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.11"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
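
A note on the merge step: in the rows shown, the left join returns only NaN on the 1mg side. A likely cause is that the two sites spell the same product differently (for example, `300 MG` in one table against `21mg`-style strengths in the other), so the four split fragments rarely agree exactly. The sketch below shows one possible alternative, not the author's method: build a single normalized key from each name and merge on that key, with `indicator=True` reporting which rows found a partner. The two small tables, their prices, and the `name_key` helper are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical miniature versions of the two price tables (made-up rows).
medlife = pd.DataFrame({
    "medName": ["NICOTEX 4 MG", "ABAMUNE 300 MG", "ACIGENE MINT"],
    "medlife_price": [70.00, 1333.55, 5.18],
})
onemg = pd.DataFrame({
    "medName": ["Nicotex 4mg", "Abamune 300mg", "Cofsils Cough Syrup"],
    "onemg_price": [74.8, 1300.0, 69.7],
})


def name_key(names: pd.Series) -> pd.Series:
    """Lower-case the names and drop every non-alphanumeric character."""
    return names.str.lower().str.replace(r"[^a-z0-9]", "", regex=True)


medlife["key"] = name_key(medlife["medName"])
onemg["key"] = name_key(onemg["medName"])

# indicator=True adds a _merge column: "both" for matched rows,
# "left_only" for Medlife rows with no 1mg counterpart.
merged = medlife.merge(onemg, on="key", how="left",
                       suffixes=("_medlife", "_onemg"), indicator=True)
matched = merged[merged["_merge"] == "both"]
print(matched[["medName_medlife", "medlife_price", "onemg_price"]])
```

Stripping spaces and punctuation is a blunt normalization: it pairs `300 MG` with `300mg`, but it will not pair names that differ in wording, so the `left_only` rows still need manual review or fuzzier matching.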