Andrea Volpini cyberandy

Verifying my Blockstack ID is secured with the address 13a7qtZbLBSMrY32fZLoKtcnnkEzRmezob https://explorer.blockstack.org/address/13a7qtZbLBSMrY32fZLoKtcnnkEzRmezob
@cyberandy
cyberandy / generate-md.py
Last active August 19, 2023 18:53 — forked from pshapiro/metadesc.py
Use the sumy summarizer to extract summaries from HTML pages that can be used as meta descriptions.
import csv
import os
import requests
import sys
import pandas as pd
from sumy.parsers.html import HtmlParser
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lsa import LsaSummarizer as Lsa
from sumy.summarizers.luhn import LuhnSummarizer as Luhn
from sumy.summarizers.text_rank import TextRankSummarizer as TxtRank
cyberandy / generate-kw.py
Last active April 26, 2019 20:30
How to generate keywords using textgenrnn (read more about it here: https://wordlift.io/blog/en/keyword-suggestion-tool-tensorflow/)
from textgenrnn import textgenrnn
textgen = textgenrnn(weights_path='keygen_weights.hdf5',
                     vocab_path='keygen_vocab.json',
                     config_path='keygen_config.json')
textgen.generate_samples(max_gen_length=5, temperatures=[0.2,0.5])
cyberandy / read-GNews.py
Last active January 5, 2024 19:48
A super-simple Python script to read Google News RSS feeds and store data in a CSV file.
# Expected use >> python read-GNews.py -q [query] -l [language] -p [country]
# The following command will search for the latest news written in German from Austria about "Redbull"
# python read-GNews.py -q Redbull -l de -p AT
#
# Queries can be provided as strings using quotation marks >> python read-GNews.py -q "Redbull Media House" -l de
# Multiple queries can be executed at once >> python read-GNews.py -q "Redbull Media House" -q Redbull -l de -p at -p de
# The script will save a CSV file containing Title, Link, pubDate, Description, Source and Alexa Traffic Rank.
import feedparser
import time
import requests
from bs4 import BeautifulSoup

USER_AGENT = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}


def fetch_results(search_term, number_results, language_code):
    # Validate argument types before any request is made.
    assert isinstance(search_term, str), 'Search term must be a string'
    assert isinstance(number_results, int), 'Number of results must be an integer'
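The usage comments at the top of read-GNews.py describe repeatable -q and -p flags. The preview above does not show how the script parses them, so the following is only a minimal sketch of one way to do it with argparse. The long option names, the defaults ("en", "US"), and the helper names build_parser and expand_jobs are assumptions made for illustration, not part of the original script.

```python
import argparse
from itertools import product


def build_parser():
    # Hypothetical parser mirroring the -q / -l / -p flags from the usage comments.
    parser = argparse.ArgumentParser(
        description="Read Google News RSS feeds and store data in a CSV file.")
    # action="append" lets a flag be repeated: -q "Redbull Media House" -q Redbull
    parser.add_argument("-q", "--query", action="append", required=True)
    parser.add_argument("-l", "--language", default="en")  # assumed default
    parser.add_argument("-p", "--country", action="append")
    return parser


def expand_jobs(args):
    # One job per (query, country) pair. Country codes are upper-cased so
    # that "-p at" and "-p AT" are treated the same way.
    countries = [c.upper() for c in (args.country or ["US"])]  # assumed default
    return [(q, args.language, c) for q, c in product(args.query, countries)]


# Same arguments as the multi-query example in the usage comments.
args = build_parser().parse_args(
    ["-q", "Redbull Media House", "-q", "Redbull", "-l", "de", "-p", "at", "-p", "de"])
jobs = expand_jobs(args)
for job in jobs:
    print(job)
```

Each resulting tuple could then be handed to a fetch function one at a time, which keeps the command-line handling separate from the feed-reading logic.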
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://wordlift.io/blog/en/#website",
  "type": "http://schema.org/WebSite",
  "http://schema.org/alternateName": "SEO made simple",
  "http://schema.org/name": "WordLift Blog",
  "http://schema.org/potentialAction": {
    "type": "http://schema.org/SearchAction",
    "http://schema.org/query-input": "required name=search_term_string",
    "http://schema.org/target": "https://wordlift.io/blog/en/?s={search_term_string}"
cyberandy / lodgingbusiness-wordlift-markup
Last active March 24, 2023 06:47
Example of a JSON-LD for LodgingBusiness that WordLift produces
[
  {
    "@context": "http://schema.org",
    "@id": "http://data.wordlift.io/[entity-name]",
    "@type": ["LodgingBusiness"],
    "description": "Here goes the description",
    "mainEntityOfPage": "https://www.happywordliftclient.com",
    "image": [
      { "@type": "ImageObject", "url": "https://www.happywordliftclient.com/wp-content/uploads/2011/07/img-19az-1200x675.jpg", "width": 1200, "height": 675 },
      { "@type": "ImageObject", "url": "https://www.happywordliftclient.com/wp-content/uploads/2011/07/img-19az-1200x900.jpg", "width": 1200, "height": 900 },
cyberandy / embeddings_projectors.tsv
Last active December 8, 2021 15:13
KG embeddings WordLift Blog
-1.939915269613265991e-01 1.591661721467971802e-01 -1.672201305627822876e-01 1.209882497787475586e-01 -5.643929913640022278e-02 4.935293793678283691e-01 -6.038862466812133789e-02 -1.090269684791564941e-01 2.572527155280113220e-02 3.857810050249099731e-02 -1.982329189777374268e-01 -1.586464643478393555e-01 2.125328406691551208e-02 -4.903687909245491028e-02 1.097567379474639893e-01 -9.682913124561309814e-02 -1.226566657423973083e-01 -2.898987829685211182e-01 5.542956665158271790e-02 1.637598723173141479e-01 9.308864921331405640e-02 9.676701575517654419e-02 -1.217111721634864807e-01 -1.757894456386566162e-01 -4.662900418043136597e-02 -1.659229546785354614e-01 1.724372655153274536e-01 8.154600858688354492e-02 -1.724783033132553101e-01 3.298109024763107300e-02 -1.414460539817810059e-01 3.574710339307785034e-02 -2.298669368028640747e-01 8.744690567255020142e-02 -1.625491678714752197e-01 -1.283032745122909546e-01 -2.943382263183593750e-01 -6.229398399591445923e-02 1.395162492990493774e-01 -1.934481412172317505e-01 -
cyberandy / embeddings_meta.tsv
Last active December 8, 2021 15:16
KG embeddings meta
wl0216/entity/artificial_intelligence
wl0216/entity/gennaro_cuofano
wl0216/entity/google
wl0216/entity/google_analytics
wl0216/entity/how_to_optimize_your_website_for_voice_search
wl0216/entity/json-ld
wl0216/entity/knowledge_graph
wl0216/entity/metadata
wl0216/entity/microdata_html
wl0216/entity/natural_language_processing
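An embeddings file and its metadata file are only useful together when they line up: one label per vector, and every vector with the same number of dimensions. The sketch below checks that and returns the [rows, columns] pair that a projector config's "tensorShape" field would carry. It assumes whitespace-separated values with one vector per line and a single-column metadata file without a header row. The function names and the two sample vectors are hypothetical; the labels are taken from the listing above.

```python
def tensor_shape(vectors_text):
    """Return [rows, columns] for whitespace-separated vectors, one per line."""
    rows = [line.split() for line in vectors_text.splitlines() if line.strip()]
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError("vectors have inconsistent dimensions: %s" % sorted(widths))
    return [len(rows), widths.pop() if widths else 0]


def count_labels(metadata_text):
    """Count non-empty lines; assumes one label per line and no header row."""
    return sum(1 for line in metadata_text.splitlines() if line.strip())


def check_alignment(vectors_text, metadata_text):
    """Raise if the number of labels differs from the number of vectors."""
    shape = tensor_shape(vectors_text)
    labels = count_labels(metadata_text)
    if labels != shape[0]:
        raise ValueError("%d labels for %d vectors" % (labels, shape[0]))
    return shape


# Hypothetical two-vector sample; the real file is far larger.
vectors = "0.1 0.2 0.3\n-0.4 0.5 0.6\n"
labels = "wl0216/entity/google\nwl0216/entity/json-ld\n"
shape = check_alignment(vectors, labels)
print(shape)
```

Running a check like this before publishing the two files makes a mismatch surface as an explicit error rather than as mislabeled points in the visualization.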
{
  "embeddings": [
    {
      "tensorName": "My tensor",
      "tensorShape": [
        1000,
        50
      ],
      "tensorPath": "https://gist.githubusercontent.com/cyberandy/f40e161d69188df7d5a901b50a64f0fe/raw/674f5c78a0c33d6b6a3da4d726ef918938d4bd68/embeddings_projectors.tsv",
      "metadataPath": "https://gist.githubusercontent.com/cyberandy/092f00976a842497a113280197cb61d7/raw/6aab7043891897269c5c35df81f878e4e0a040b1/embeddings_meta.tsv"