advenk / gsoc25.md
Last active September 1, 2025 17:50
Enhancing Hindi IE - DBpedia Hindi Chapter 2025

Contributor: Aditya Venkatesh

Mentors: Dr. Sanju Tiwari, Debarghya Datta, Dr. Ronak Panchal

Description: This project took place over the summer of 2025 as part of Google Summer of Code under DBpedia. Its aim was to evaluate and enhance the stages of the existing pipeline for information extraction from Hindi text. It had three goals:

  • Streamline the existing pipeline and make it easy to run
  • Evaluate the performance of the existing pipeline
  • Experiment with and implement new triplet extraction methods using Small Language Models (SLMs)
advenk / SPARQL.md
Last active September 1, 2025 13:36
Hindi SPARQL Endpoint - Motivation and Sample Queries

Report: Hindi DBpedia SPARQL Endpoint Query Demonstration

Date: 29 July 2025

Objective: To demonstrate the successful deployment and operational capability of the new Hindi DBpedia SPARQL endpoint.

1. Introduction

This document first describes how to deploy the Hindi endpoint on any server using a few Docker commands. The endpoint is deployed against the Hindi wiki dump as of 1 June 2025. To update it, the dump must be extracted again and deployed separately.
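
As a rough illustration of what querying such an endpoint can look like once it is running, here is a minimal Python sketch that uses only the standard library and the standard SPARQL Protocol (a GET request with a `query` parameter). The endpoint URL `http://localhost:8890/sparql`, the generic query, and the helper names `build_request`, `run_query`, and `main` are assumptions for illustration; they are not taken from the actual deployment.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint URL (assumption): replace with the address
# where the Hindi endpoint is actually deployed.
ENDPOINT = "http://localhost:8890/sparql"

# A generic query that works on any RDF store: fetch a handful of triples.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(endpoint: str, query: str) -> list[dict]:
    """Send the query and return the list of result bindings."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["results"]["bindings"]


def main() -> None:
    """Print subject, predicate, and object for each returned row."""
    for row in run_query(ENDPOINT, QUERY):
        print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```

Calling `main()` against a running endpoint should print a few triples, which is a quick way to confirm that the deployment is answering queries.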