@AA-Durocell
Last active May 17, 2024 14:37
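# Note: newer LangChain releases move these imports to
# langchain_community.document_loaders and langchain_text_splitters.
from langchain.document_loaders import WikipediaLoader
from langchain.text_splitter import TokenTextSplitter

# Fetch Wikipedia pages matching the query (requires the `wikipedia` package).
raw_documents = WikipediaLoader(query="Kendrick Lamar").load()
# Split into chunks of at most 512 tokens; adjacent chunks share 24 tokens.
text_splitter = TokenTextSplitter(chunk_size=512, chunk_overlap=24)
# Only the first three loaded pages are split.
documents = text_splitter.split_documents(raw_documents[:3])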
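
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sliding-window sketch in plain Python. It is not LangChain's implementation: `split_with_overlap` is a hypothetical helper, and the placeholder strings stand in for the tokenizer's tokens. It only illustrates the general idea that each new chunk starts `chunk_size - chunk_overlap` tokens after the previous one.

```python
from typing import List


def split_with_overlap(
    tokens: List[str], chunk_size: int, chunk_overlap: int
) -> List[List[str]]:
    """Hypothetical helper: cut `tokens` into overlapping windows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    start = 0
    while start < len(tokens):
        end = min(start + chunk_size, len(tokens))
        chunks.append(tokens[start:end])
        if end == len(tokens):  # the last window reached the end of the input
            break
        start += step
    return chunks


# Toy input: ten placeholder tokens (an assumption, not real tokenizer output).
tokens = [f"t{i}" for i in range(10)]
chunks = split_with_overlap(tokens, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

With the gist's settings (`chunk_size=512`, `chunk_overlap=24`) the same arithmetic gives a step of 488 tokens, so the last 24 tokens of one chunk reappear at the start of the next.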