/TextCluster.py Secret
Created
October 11, 2021 09:19
#!pip install -U sentence-transformers
#https://colab.research.google.com/drive/182BUqhmnIXBGdefxf7LaMGJnssJNVZoP?usp=sharing#scrollTo=ZYWRFiyhzU0g
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
embedder = SentenceTransformer('distilbert-base-nli-stsb-mean-tokens')
corpus = [
    "Advanced knowledge of SQL, R and / Or Python (4 years) with methodologies used for advanced analytics is required. Strong data visualization ability and ability to comprehend advanced Excel outputs to suggest alternate visual representation for concepts. Knowledge of statistics, optimization or related field. Experience with common data science toolkits - Pandas, Qlik Sense, Shiny, Plotly for better visualization",
    "Minimum 8+ years of experience in developing and delivering end to end applications using Java/J2EE, preferably in a Fintech company. Hands-on experience in designing and developing backend components, frontend components. Strong understanding of agile software practices. Strong understanding of application security standards and best practices. Prior experience in managing a team of engineers. Strong ability to collaborate with cross-functional teams including architects, engineers, quality engineering and operations teams to build solutions.Responsible for architecture assessments, cloud migration strategies and execution, legacy enterprise modernization and enablement across custom technologies and cloud migration opportunities",
    "Proven skills & experience in solution design and architecture, cloud-based solution using different cloud service provider and cloud native services, including designing the cloud infrastructure, designing the cloud application architecture, and designing the cloud security architecture. Ensuring technical viability and successful deployments, while orchestrating key resources and infusing key Infrastructure technologies (e.g. Windows and Linux IaaS, Security, Networking, etc.), and Application Development and DevOps technologies (e.g. App Service, containers, serverless, cloud native, etc.) as appropriateAdvanced problem-solving skills using programming concepts and Data Structures along with basic computer science fundamentals.",
    "Ability to extract meaningful scores from text and prior experience in text mining projects will be preferred. Prior experience in building models (data cleaning, dependent variable selection, independent variable study and understanding, variable reduction, bivariate analysis, variables grouping, logistic/linear model build, model validation, etc.) will be preferred. Actuaries/FRM/CFA/CQF/PRM certification would be a plus"
]
# Encode each job description into a fixed-size sentence embedding.
corpus_embeddings = embedder.encode(corpus)
# Then, we perform k-means clustering using sklearn:
num_clusters = 3
clustering_model = KMeans(n_clusters=num_clusters)
clustering_model.fit(corpus_embeddings)
# labels_ holds one cluster id per corpus entry; the numbering of the ids can differ between runs.
cluster_assignment = clustering_model.labels_
cluster_assignment  # displayed when this is the last expression in a notebook cell
# Group the original sentences by the cluster id assigned to them.
clustered_sentences = [[] for i in range(num_clusters)]
for sentence_id, cluster_id in enumerate(cluster_assignment):
    clustered_sentences[cluster_id].append(corpus[sentence_id])
for i, cluster in enumerate(clustered_sentences):
    print("Cluster ", i+1)
    print(cluster)
    print("")
#Java and Cloud
#ML
#Finance
#Not bad :)
#Cluster 1
#['Minimum 8+ years of experience in developing and delivering end to end applications using Java/J2EE, preferably in a Fintech company. Hands-on experience in designing and developing backend components, frontend components. Strong understanding of agile software practices. Strong understanding of application security standards and best practices. Prior experience in managing a team of engineers. Strong ability to collaborate with cross-functional teams including architects, engineers, quality engineering and operations teams to build solutions.Responsible for architecture assessments, cloud migration strategies and execution, legacy enterprise modernization and enablement across custom technologies and cloud migration opportunities', 'Proven skills & experience in solution design and architecture, cloud-based solution using different cloud service provider and cloud native services, including designing the cloud infrastructure, designing the cloud application architecture, and designing the cloud security architecture. Ensuring technical viability and successful deployments, while orchestrating key resources and infusing key Infrastructure technologies (e.g. Windows and Linux IaaS, Security, Networking, etc.), and Application Development and DevOps technologies (e.g. App Service, containers, serverless, cloud native, etc.) as appropriateAdvanced problem-solving skills using programming concepts and Data Structures along with basic computer science fundamentals.']
#Cluster 2
#['Advanced knowledge of SQL, R and / Or Python (4 years) with methodologies used for advanced analytics is required. Strong data visualization ability and ability to comprehend advanced Excel outputs to suggest alternate visual representation for concepts. Knowledge of statistics, optimization or related field. Experience with common data science toolkits - Pandas, Qlik Sense, Shiny, Plotly for better visualization']
#Cluster 3
#['Ability to extract meaningful scores from text and prior experience in text mining projects will be preferred. Prior experience in building models (data cleaning, dependent variable selection, independent variable study and understanding, variable reduction, bivariate analysis, variables grouping, logistic/linear model build, model validation, etc.) will be preferred. Actuaries/FRM/CFA/CQF/PRM certification would be a plus']
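
The grouping step in the script (building one list of sentences per cluster id) can be isolated into a small helper, which makes it easy to check without downloading a model. The sketch below is only an illustration: the function name `group_by_cluster` and the toy labels are hypothetical and not part of the original gist; the labels stand in for the values of `clustering_model.labels_`.

```python
from typing import List, Sequence


def group_by_cluster(
    sentences: Sequence[str],
    labels: Sequence[int],
    num_clusters: int,
) -> List[List[str]]:
    """Return one list of sentences per cluster id, preserving corpus order."""
    if len(sentences) != len(labels):
        raise ValueError("sentences and labels must have the same length")
    groups: List[List[str]] = [[] for _ in range(num_clusters)]
    for sentence, label in zip(sentences, labels):
        # int() also accepts NumPy integer labels such as those in labels_.
        groups[int(label)].append(sentence)
    return groups


# Toy labels standing in for clustering_model.labels_ (hypothetical values).
demo = group_by_cluster(["java", "cloud", "sql", "finance"], [0, 0, 1, 2], 3)
for i, cluster in enumerate(demo):
    print("Cluster", i + 1, cluster)
```

With the script's variables, the equivalent call would be `group_by_cluster(corpus, cluster_assignment, num_clusters)`.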