- Online Prototyping
- User Testing & Feedback
- UI Design Patterns
- Colours & Gradients
- User & Profile Photos
- Stock Photography
- Icons
"""
Tech Company Information Extractor

This script extracts information about technology companies from a large text corpus
using the Fireworks AI API. It processes the input in chunks, extracts structured data
based on a predefined schema, and saves the results in multiple formats.

Requirements:
- Python 3.7+
- requests
import pandas as pd


def makelist_ofallstrings(pdrow):
    # Collect every comma-separated token from every entry of pdrow.
    temp_list = []
    for i in range(len(pdrow)):
        list_sep = (pdrow[i]).split(sep=',')
        for j in list_sep:
            temp_list.append(j)
    # Strip surrounding whitespace once all tokens are collected.
    temp_list = [x.strip() for x in temp_list]
    df_temp = pd.Series(temp_list)
    df_temp = df_temp.astype(str)
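The flattening step above can also be written without index-based loops. The following is a minimal sketch using only the standard library; `flatten_comma_separated` is a hypothetical name, and it assumes the input is any iterable of comma-separated values (for example, the values of a pandas column). Converting each entry with `str()` is also an assumption, added so that non-string entries do not raise.

```python
def flatten_comma_separated(rows):
    """Split each entry on commas and return one flat list of stripped tokens."""
    # str(row) is an assumption: it lets non-string entries pass through split().
    return [token.strip() for row in rows for token in str(row).split(",")]


tokens = flatten_comma_separated(["apple, banana", "cherry ,date"])
print(tokens)  # ['apple', 'banana', 'cherry', 'date']
```

Iterating over the values directly, rather than over `range(len(...))`, also avoids label-based lookups failing when a pandas index does not start at 0.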
"""
df_order1 = pd.read_excel(r'visit_orders_jan_thru_may2018.xlsx')
df_order2 = pd.read_excel(r'visit_orders_JUN_thru_DEC2018.xlsx')
df_order_total = pd.concat([df_order1, df_order2])
df_order_total.info()
df_order_total.shape
"""
def nonetype_remove(pdrow):
    # Keep only truthy items: this drops None, but also '', 0 and empty containers.
    res = [i for i in pdrow if i]
    return res
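Because the filter above tests truthiness, it removes more than `None`. If only `None` should be dropped, an identity check is one option. This is a minimal sketch; `remove_none` is a hypothetical name, not part of the original gist.

```python
def remove_none(items):
    """Drop only None values, keeping other falsy values such as 0 and ''."""
    return [item for item in items if item is not None]


row = ["a", None, "", 0, "b"]
print([i for i in row if i])  # ['a', 'b']
print(remove_none(row))       # ['a', '', 0, 'b']
```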
import unicodedata

import nltk


def basic_clean(text):
    """
    A simple function to clean up the data. All words that are not
    designated as stop words are lemmatized after encoding and basic
    regex parsing are performed.
    """
    wnl = nltk.stem.WordNetLemmatizer()
    stopwords = nltk.corpus.stopwords.words('english')
    text = (unicodedata.normalize('NFKD', text)
            .encode('ascii', 'ignore')
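The preview above is cut off, so the rest of `basic_clean` is not shown. To illustrate the kind of pipeline the docstring describes, here is a standard-library sketch of the normalization and stop-word steps. Everything in it is an assumption for illustration: `normalize_and_tokenize` and `STOPWORDS` are hypothetical names, the stop word set is deliberately tiny, and lemmatization is omitted so the example runs without NLTK data.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny stop word set used only for this sketch.
STOPWORDS = {"the", "a", "an", "and", "of", "is"}


def normalize_and_tokenize(text):
    """Fold text to lowercase ASCII, strip punctuation, and drop stop words."""
    ascii_text = (unicodedata.normalize("NFKD", text)
                  .encode("ascii", "ignore")  # drops the combining accent marks
                  .decode("ascii")
                  .lower())
    words = re.sub(r"[^\w\s]", "", ascii_text).split()
    return [word for word in words if word not in STOPWORDS]


print(normalize_and_tokenize("The Café is déjà-vu!"))  # ['cafe', 'dejavu']
```

NFKD decomposition separates each accented letter into a base letter plus a combining mark, which is why the ASCII encode step keeps `e` and `a` instead of discarding the whole character.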
def basic_cleanandcaptialize(lisofstrings):
    return_list = []
    for string in lisofstrings:
        # basic_clean returns a list of words; rejoin them before title-casing.
        string = " ".join(basic_clean(string))
        return_list.append(string.title())
    return return_list
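One detail of `str.title()` worth knowing: it capitalizes the letter after any non-letter character, including apostrophes. Whether this matters here depends on whether `basic_clean` strips punctuation first, which the truncated preview does not show. The short comparison below uses `string.capwords` from the standard library as one possible alternative; the sample phrase is made up.

```python
import string

phrase = "o'neil's pizza"
print(phrase.title())           # O'Neil'S Pizza
print(string.capwords(phrase))  # O'neil's Pizza
```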
df1['Data_total_grams'] = data
# words_indices_dataframe = pd.DataFrame()
for j in range(len(df1['Data_total_grams'])):
    df1['Data_total_grams'][j] = df1['Data_total_grams'][j].split(sep=',')
    df1['Data_total_grams'][j] = basic_cleanandcaptialize(df1['Data_total_grams'][j])
    for p in range(len(df1['Data_total_grams'][j])):
        # print(len(list_of_strings[p].strip()))
        q = (df1['Data_total_grams'][j][p].strip()).split()
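Assigning through `df1['Data_total_grams'][j] = ...` is chained indexing, which pandas may flag with a `SettingWithCopyWarning` and which does not write through when copy-on-write is enabled. A column-wise version of the split-and-clean step could look like the sketch below. The sample data is hypothetical, and plain `strip().title()` stands in for `basic_cleanandcaptialize`, whose full cleaning logic is not shown in the preview.

```python
import pandas as pd

# Hypothetical sample data standing in for the original `data`.
df = pd.DataFrame({"Data_total_grams": ["foo bar, baz", " qux ,quux corge"]})

# Split every cell on commas, then strip and title-case each piece.
df["Data_total_grams"] = (
    df["Data_total_grams"]
    .str.split(",")
    .apply(lambda parts: [part.strip().title() for part in parts])
)

print(df["Data_total_grams"].tolist())
# [['Foo Bar', 'Baz'], ['Qux', 'Quux Corge']]
```

Assigning the whole column at once sidesteps per-cell writes entirely, so the question of whether an intermediate object is a view or a copy never arises.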