@alphasecio
Created May 17, 2023 01:31
Split text file using LangChain CharacterTextSplitter
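import streamlit as st
from langchain.text_splitter import CharacterTextSplitter

# Accept a single plain-text upload
text_file = st.file_uploader("Upload text file", type="txt")

if text_file is not None:
    # Uploaded files arrive as bytes; decode to str before splitting
    text = text_file.read().decode("utf-8")

    if st.button("Load"):
        try:
            # Split on blank lines, then merge the pieces into chunks of up to
            # 1000 characters that share up to 200 characters with their neighbours
            text_splitter = CharacterTextSplitter(
                separator="\n\n",
                chunk_size=1000,
                chunk_overlap=200,
                length_function=len,
            )
            texts = text_splitter.create_documents([text])
            # Show the first chunk as a quick sanity check
            st.success(texts[0])
        except Exception as e:
            st.error(e)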
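
To make the roles of `separator`, `chunk_size`, and `chunk_overlap` more concrete, here is a minimal pure-Python sketch of greedy separator-based chunking with overlap. It is an illustration under simplifying assumptions, not LangChain's actual implementation: the function name `split_with_overlap` is hypothetical, lengths are measured with `len`, and a single piece longer than `chunk_size` is emitted as-is rather than split further.

```python
def split_with_overlap(text, separator="\n\n", chunk_size=1000, chunk_overlap=200):
    """Greedy separator-based chunking with overlap (illustrative sketch only)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    # Break the text on the separator and drop empty pieces
    pieces = [p for p in text.split(separator) if p]

    chunks = []
    current = []  # pieces that make up the chunk being built
    for piece in pieces:
        candidate = separator.join(current + [piece])
        if current and len(candidate) > chunk_size:
            # The next piece does not fit, so emit what has been collected
            chunks.append(separator.join(current))
            # Keep only a tail that fits the overlap budget and still
            # leaves room for the next piece within chunk_size
            while current and (
                len(separator.join(current)) > chunk_overlap
                or len(separator.join(current + [piece])) > chunk_size
            ):
                current.pop(0)
        current.append(piece)

    # Emit whatever remains after the last piece
    if current:
        chunks.append(separator.join(current))
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc\n\ndddd"
print(split_with_overlap(sample, chunk_size=10, chunk_overlap=4))
# → ['aaaa\n\nbbbb', 'bbbb\n\ncccc', 'cccc\n\ndddd']
```

With `chunk_overlap=4`, each chunk repeats the last paragraph of the previous one; setting it to `0` in the same call yields two disjoint chunks instead.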