@nutanc
Created September 24, 2024 04:53
import json

import requests
from bs4 import BeautifulSoup


def extract_text_from_url(url):
    """Fetch a page and return its visible text, one phrase per line."""
    try:
        # Send a GET request to the URL (the timeout avoids hanging indefinitely)
        response = requests.get(url, timeout=30)
        response.raise_for_status()  # Raise an exception for bad status codes
        # Parse the HTML content
        soup = BeautifulSoup(response.text, 'html.parser')
        # Remove script and style elements
        for script in soup(["script", "style"]):
            script.decompose()
        # Get text
        text = soup.get_text()
        # Break into lines and remove leading and trailing space on each
        lines = (line.strip() for line in text.splitlines())
        # Break multi-headlines (phrases separated by two spaces) into a line each
        chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
        # Remove blank lines
        text = '\n'.join(chunk for chunk in chunks if chunk)
        return text
    except requests.RequestException as e:
        return f"An error occurred: {e}"


url = "https://speech-kws.ozonetel.com/text_chunker/chunk_text_semantic"
headers = {
    "accept": "application/json",
    "Content-Type": "application/json"
}
# extract_url="https://timesofindia.indiatimes.com/etimes/trending/in-this-indian-village-no-one-cooks-food-at-home/articleshow/113600676.cmsl"
# extracted_text = extract_text_from_url(extract_url)
# print(extracted_text)
extracted_text="""
On Sunday, India witnessed what is surely is one of the greatest days in country’s sports history. India had never won gold in 44 previous editions of the Chess Olympiad, only to win two on same day read more
Gukesh, Divya and Co at the wheel: The rapid rise of Indian chess and the leaders behind the revolution
The Indian contingent celebrates during the closing ceremony of the 45th Chess Olympiad in Budapest after winning gold in both the Open as well as the Women's categories. Image credit: X/@FIDE_chess
2024 has been a special year for Indian sports. The Indian men’s cricket team finally ended an 11-year wait for a global title. At the Olympics, shooter Manu Bhaker achieved a series of firsts with two bronze medals and the Indian men’s hockey team won a second consecutive bronze.
And earlier this month, India would sign off from the Paralympics with 29 medals, registering their best performance ever in the multi-sport event.
On Sunday, the nation witnessed what surely is one of the greatest days in Indian sporting history, and not just chess. India had never won gold in 44 previous editions of the Chess Olympiad, only to win two of them in the 45th edition that took place in Budapest, Hungary.
Read | ‘Superbly dominant performance’, Netizens hail Indian team after winning historic double gold
The prestigious tournament has existed for a century now but it was only a decade ago that India finally won a medal — winning bronze in the open category in the 2014 edition in Tromso, Norway. Eight years later, India would make the most of home advantage to win two bronze medals in Chennai — a second in the Open category and a first for the women’s team.
The tournament had been dominated by the Soviet Union for decades and Russia continued to make merry after the USSR was dissolved. Several other nations such as Ukraine, Armenia and China also started to catch up in recent years.
On Sunday, it was India’s turn to announce itself as a new superpower in the world of chess. A nation that no longer depended solely on the exploits of one extraordinary individual, but one that had built solid teams across categories to vie for global titles.
Read | How India won historic double gold in Budapest
Titles such as the ones in the Olympiad that they had narrowly missed out on in Chennai two years ago but finally added to their cabinet on Sunday.
The likes of R Praggnanandhaa, Vidit Gujrathi and Pentala Harikrishna had their moments during the course of the tournament. However, it was only due to dominant performances from D Gukesh and Arjun Erigaisi that allowed the Indian team to virtually assure itself of gold even before the final round had taken place.
Gukesh has been having a stellar run this year, winning the Candidates Tournament in Toronto to become the tournament’s youngest champion that earned him a spot in the World Championship showdown against China’s Ding Liren in Singapore later this year.
And judging by the manner in which he bulldozed his opponents in Budapest, Gukesh appears to have raised his game a notch higher since Candidates.
Read | D Gukesh and other history makers from Indian team in Budapest
The 18-year-old from Chennai, the third-youngest Grandmaster ever, finished the 45th Olympiad with a score of 9/10 score with a 3056 performance rating, winning his second consecutive individual Olympiad gold as a result.
D Gukesh was in red-hot form in the 45th Chess Olympiad, winning eight games in 10 rounds. AP
The most defining part of his campaign, in which he finished with eight wins in 10 rounds, was his victory over Fabiano Caruana during India’s Round 10 meeting with USA. It was that decisive result against the former world No 3 that gave India the belief that the gold was theirs for the taking.
For Gukesh, however, his focus was primarily on helping his team win gold rather on his own performance, and he was prepared to do “whatever it took” to help the Srinath Narayanan-captained team achieve their goal.
“Since what happened last time—we were so close as the team to win gold—this time I thought no matter what I’m going to do whatever it takes to win the team gold.
“I did not really think about the individual performance much, I just wanted the team to win,” Gukesh told Chess24 after India’s victory.
— Susan Polgar (@SusanPolgar) September 22, 2024
Erigaisi was also impressive as he became the second Indian to win individual gold in Board 3, finishing with nine wins after appearing in all 11 rounds that earned him a rating of 2,968. His exploits in Budapest would also help him dislodge Caruana from the No. 3 spot in the world rankings, which currently has two Indians among the top-five with Gukesh at the fifth spot.
On Sunday, it was his victory over Jan Subelj during Round 11 meeting with Slovenia that ultimately confirmed the one of the greatest moments in Indian chess history. And while Gukesh’s victory over Caruana in Round 10 was crucial, Erigaisi’s win against Leinier Dominguez was also important in helping India return to winning ways after being held to a draw by Uzbekistan.
As for the women’s section, Divya Deshmukh and Vantika Agrawal headlined India’s performance with Performance Ratings of 2,608 and 2,558 respectively that helped them win individual gold in their respective boards.
Divya Deshmukh won women’s individual gold in Board 3 at the 45th Chess Olympiad. Image: ChessBaseIndia/X
Deshmukh’s resilience showed in how she stood tall and collected India’s only win in their eighth-round meeting with Poland, defeating Aleksandra Maltsevskaya. The 18-year-old from Nagpur finished with eight wins and three draws in 11 rounds while Agrawal won six out of nine, the remaining ending in stalemates.
And like Praggs, Vidit and Pentala, the trio of R Vaishali, Harika Dronavalli and Tania Sachdev too chipped in with important contributions to help the team finish on top of the podium.
India’s historic feat in Budapest can’t be termed their greatest ever in chess just yet. Let’s not forget one Viswanathan Anand and how he put the country on the chess map and made the sport a household name since becoming the first Grandmaster from India in 1988.
Anand isn’t just India’s greatest chess player of all time after multiple world championship reigns along with other achievements such as breaching the 2800 Elo Rating. He surely has done enough to rank among the greatest of all time.
It’s not just on the chess board where Vishy has excelled; the current crop might not have had the kind of meteoric rise that they have experienced without the icon’s guiding hand. Anand had achieved it all as a player and now has chosen to make an impact on chess similar to what Prakash Padukone and Pullela Gopichand did in badminton.
But the emergence of Gukesh, Arjun, Pragg and Divya in recent years has given Indian chess of a new hope — that of ushering in another golden era, one where there will be multiple contenders for world and Olympiad titles.
It certainly has been an extraordinary year for Indian chess, one where Gukesh won the Candidates Tournament, Arjun rose to the No 3 spot and Praggnanandhaa defeating world No 1 Magnus Carlsen. All that’s left is for Gukesh to wrest the world title from Liren’s grasp a couple of months for now.
The Chinese team decided against having a Liren vs Gukesh face off in Budapest. The latter, however, has fired quite the stern warning with his stellar performance already.
A Bombay Bong with an identity crisis. Passionately follow cricket. Hardcore fan of Team India, the Proteas and junk food. Self-proclaimed shutterbug.
"""
data = {
    "text": extracted_text
}
response = requests.post(url, headers=headers, json=data)
print("Status Code:", response.status_code)
print("Response JSON:")
print(json.dumps(response.json(), indent=2))

nutanc commented Sep 24, 2024

Example output
Status Code: 200
Response JSON:
{
"segmented_text": [
"\nOn Sunday, India witnessed what is surely is one of the greatest days in country\u2019s sports history. India had never won gold in 44 previous editions of the Chess Olympiad, only to win two on same day read more\n\nGukesh, Divya and Co at the wheel: The rapid rise of Indian chess and the leaders behind the revolution\n\nThe Indian contingent celebrates during the closing ceremony of the 45th Chess Olympiad in Budapest after winning gold in both the Open as well as the Women's categories. Image credit: X/@FIDE_chess\n\n2024 has been a special year for Indian sports. The Indian men\u2019s cricket team finally ended an 11-year wait for a global title. At the Olympics, shooter Manu Bhaker achieved a series of firsts with two bronze medals and the Indian men\u2019s hockey team won a second consecutive bronze.\n\n",
"And earlier this month, India would sign off from the Paralympics with 29 medals, registering their best performance ever in the multi-sport event.\n\n On Sunday, the nation witnessed what surely is one of the greatest days in Indian sporting history, and not just chess. India had never won gold in 44 previous editions of the Chess Olympiad, only to win two of them in the 45th edition that took place in Budapest, Hungary.\n\n Read | \u2018Superbly dominant performance\u2019, Netizens hail Indian team after winning historic double gold\n\nThe prestigious tournament has existed for a century now but it was only a decade ago that India finally won a medal \u2014 winning bronze in the open category in the 2014 edition in Tromso, Norway. Eight years later, India would make the most of home advantage to win two bronze medals in Chennai \u2014 a second in the Open category and a first for the women\u2019s team.\n\n The tournament had been dominated by the Soviet Union for decades and Russia continued to make merry after the USSR was dissolved. Several other nations such as Ukraine, Armenia and China also started to catch up in recent years.\n\n On Sunday, it was India\u2019s turn to announce itself as a new superpower in the world of chess. A nation that no longer depended solely on the exploits of one extraordinary individual, but one that had built solid teams across categories to vie for global titles.\n\n Read | How India won historic double gold in Budapest\n\nTitles such as the ones in the Olympiad that they had narrowly missed out on in Chennai two years ago but finally added to their cabinet on Sunday.\n\n",
"The likes of R Praggnanandhaa, Vidit Gujrathi and Pentala Harikrishna had their moments during the course of the tournament. However, it was only due to dominant performances from D Gukesh and Arjun Erigaisi that allowed the Indian team to virtually assure itself of gold even before the final round had taken place.\n\n Gukesh has been having a stellar run this year, winning the Candidates Tournament in Toronto to become the tournament\u2019s youngest champion that earned him a spot in the World Championship showdown against China\u2019s Ding Liren in Singapore later this year.\n\n And judging by the manner in which he bulldozed his opponents in Budapest, Gukesh appears to have raised his game a notch higher since Candidates.\n\n Read | D Gukesh and other history makers from Indian team in Budapest\n\nThe 18-year-old from Chennai, the third-youngest Grandmaster ever, finished the 45th Olympiad with a score of 9/10 score with a 3056 performance rating, winning his second consecutive individual Olympiad gold as a result.\n",
"D Gukesh was in red-hot form in the 45th Chess Olympiad, winning eight games in 10 rounds. AP\n\nThe most defining part of his campaign, in which he finished with eight wins in 10 rounds, was his victory over Fabiano Caruana during India\u2019s Round 10 meeting with USA. It was that decisive result against the former world No 3 that gave India the belief that the gold was theirs for the taking.\n\n For Gukesh, however, his focus was primarily on helping his team win gold rather on his own performance, and he was prepared to do \u201cwhatever it took\u201d to help the Srinath Narayanan-captained team achieve their goal.\n\n \u201cSince what happened last time\u2014we were so close as the team to win gold\u2014this time I thought no matter what I\u2019m going to do whatever it takes to win the team gold.\n\n \u201cI did not really think about the individual performance much, I just wanted the team to win,\u201d Gukesh told Chess24 after India\u2019s victory.\n\n \u2014 Susan Polgar (@SusanPolgar) September 22, 2024\n\nErigaisi was also impressive as he became the second Indian to win individual gold in Board 3, finishing with nine wins after appearing in all 11 rounds that earned him a rating of 2,968. His exploits in Budapest would also help him dislodge Caruana from the No. 3 spot in the world rankings, which currently has two Indians among the top-five with Gukesh at the fifth spot.\n\n On Sunday, it was his victory over Jan Subelj during Round 11 meeting with Slovenia that ultimately confirmed the one of the greatest moments in Indian chess history.",
"And while Gukesh\u2019s victory over Caruana in Round 10 was crucial, Erigaisi\u2019s win against Leinier Dominguez was also important in helping India return to winning ways after being held to a draw by Uzbekistan.\n\n As for the women\u2019s section, Divya Deshmukh and Vantika Agrawal headlined India\u2019s performance with Performance Ratings of 2,608 and 2,558 respectively that helped them win individual gold in their respective boards.\n",
"Divya Deshmukh won women\u2019s individual gold in Board 3 at the 45th Chess Olympiad. Image: ChessBaseIndia/X\n\nDeshmukh\u2019s resilience showed in how she stood tall and collected India\u2019s only win in their eighth-round meeting with Poland, defeating Aleksandra Maltsevskaya. The 18-year-old from Nagpur finished with eight wins and three draws in 11 rounds while Agrawal won six out of nine, the remaining ending in stalemates.\n\n And like Praggs, Vidit and Pentala, the trio of R Vaishali, Harika Dronavalli and Tania Sachdev too chipped in with important contributions to help the team finish on top of the podium.\n\n India\u2019s historic feat in Budapest can\u2019t be termed their greatest ever in chess just yet. Let\u2019s not forget one Viswanathan Anand and how he put the country on the chess map and made the sport a household name since becoming the first Grandmaster from India in 1988.\n\n Anand isn\u2019t just India\u2019s greatest chess player of all time after multiple world championship reigns along with other achievements such as breaching the 2800 Elo Rating. He surely has done enough to rank among the greatest of all time.\n\n It\u2019s not just on the chess board where Vishy has excelled; the current crop might not have had the kind of meteoric rise that they have experienced without the icon\u2019s guiding hand. Anand had achieved it all as a player and now has chosen to make an impact on chess similar to what Prakash Padukone and Pullela Gopichand did in badminton.\n\n But the emergence of Gukesh, Arjun, Pragg and Divya in recent years has given Indian chess of a new hope \u2014 that of ushering in another golden era, one where there will be multiple contenders for world and Olympiad titles.\n It certainly has been an extraordinary year for Indian chess, one where Gukesh won the Candidates Tournament, Arjun rose to the No 3 spot and Praggnanandhaa defeating world No 1 Magnus Carlsen. 
All that\u2019s left is for Gukesh to wrest the world title from Liren\u2019s grasp a couple of months for now.\n\n The Chinese team decided against having a Liren vs Gukesh face off in Budapest. The latter, however, has fired quite the stern warning with his stellar performance already.\n\n A Bombay Bong with an identity crisis. Passionately follow cricket. Hardcore fan of Team India, the Proteas and junk food.",
"Self-proclaimed shutterbug. \n"
]
}
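
The output above returns the chunks under a `segmented_text` key, with the original newlines and leading spaces left in. A minimal sketch of how a caller might inspect that response follows. The helper name `summarize_chunks` and the 60-character preview length are assumptions for illustration and are not part of the service or the script above.

```python
def summarize_chunks(payload, preview_length=60):
    """Return (index, word_count, preview) for each non-empty chunk.

    `payload` is the parsed JSON response, i.e. a dict that is expected to
    hold a list of strings under "segmented_text" (as in the example output).
    """
    chunks = payload.get("segmented_text", [])
    summary = []
    for index, chunk in enumerate(chunks):
        # Collapse newlines and runs of spaces into single spaces
        cleaned = " ".join(chunk.split())
        if not cleaned:
            continue  # skip chunks that contain only whitespace
        summary.append((index, len(cleaned.split()), cleaned[:preview_length]))
    return summary


if __name__ == "__main__":
    # Tiny hand-written payload in the same shape as the example output
    sample = {
        "segmented_text": [
            "\nFirst paragraph.\n\n Second paragraph.\n",
            "Self-proclaimed shutterbug. \n",
        ]
    }
    for index, words, preview in summarize_chunks(sample):
        print(f"chunk {index}: {words} words: {preview}")
```

With the script above, this could be called as `summarize_chunks(response.json())` once the status code has been checked.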
