ZanSara
@ZanSara
ZanSara / url_reconstruction.py
Last active December 10, 2021 11:02
Add links to Wikipedia to crawled pages
import os
from time import sleep
from pprint import pprint
import requests
from pathlib import Path
from haystack.document_stores.elasticsearch import ElasticsearchDocumentStore
from haystack.utils import launch_es
from haystack.nodes import DensePassageRetriever
ZanSara / text_to_speech_with_haystack_pipeline.py
Last active May 20, 2022 14:30
text_to_speech_with_haystack_pipeline.py
import logging
from typing import TYPE_CHECKING, Union, List, Optional, Dict, Any, Tuple
import os
import hashlib
from pathlib import Path
from dataclasses import asdict
if not TYPE_CHECKING:
    from pydantic.dataclasses import dataclass
ZanSara / wikimedia_category_downloader.py
Created November 28, 2022 13:38
Script to download all Wikimedia images in a category
# From https://colab.research.google.com/drive/12jGo_tm2bAD7NRiqxvF-XfKfEWgKIx4X#scrollTo=sDL9EihTwBaC&uniqifier=1
# pip install lxml aiohttp asyncio nest_asyncio aiofiles
import shutil
from lxml import etree
from lxml import html
import aiohttp
import asyncio
import aiofiles
ZanSara / wikipedia_list_downloader.py
Last active December 6, 2022 11:42
Wikipedia get list of pages
from pathlib import Path
import os
import logging
import wikipedia
OUTPUT_DIR = Path(__file__).parent / "animals"
LIST_FILE = Path(__file__).parent / "list_of_zoo_animals.txt"
OVERWRITE_EXISTING = False
with open(LIST_FILE, 'r') as list_file:
ZanSara / MultiModalRetriever_live_coding.ipynb
Created October 7, 2023 18:26
MultiModalRetriever - Live coding
ZanSara / Dec_1st_OpenNLP_Meetup.ipynb
Created October 7, 2023 18:41
OpenNLP Meetup #7 - A Practical Introduction to Image Retrieval (Slides)
ZanSara / Cheetah.ipynb
Last active October 7, 2023 18:53
Cheetah.ipynb
ZanSara / RAG_Pipelines.ipynb
Last active October 25, 2023 22:13
Office Hours - RAG Pipelines 2.0
ZanSara / Translated_Hybrid_Retrieval.ipynb
Last active October 15, 2023 13:52
Translated Hybrid Retrieval
ZanSara / rag_pipelines_from_scratch_to_production.ipynb
Last active November 21, 2023 15:59
RAG Pipelines from scratch (0.88.0).ipynb