@Cdaprod
Last active July 17, 2023 23:37
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To summarize a Notion export, first load its pages with the DirectoryLoader class from the langchain.document_loaders module. DirectoryLoader loads every matching file in a directory and takes options for selecting files (glob), showing a progress bar (show_progress), loading with multiple threads (use_multithreading), and changing the loader class applied to each file (loader_cls).\n",
"\n",
"Here is an example of how to use the DirectoryLoader to load documents from a Notion export:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import DirectoryLoader\n",
"\n",
"loader = DirectoryLoader('path/to/notion/export', glob=\"**/*.md\")\n",
"docs = loader.load()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
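"In this example, the glob parameter specifies which files to load: the `**/*.md` pattern matches every Markdown file in the export directory and its subdirectories. By default, DirectoryLoader parses each file with UnstructuredFileLoader, which requires the unstructured package to be installed.\n",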
"\n",
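"Once the documents are loaded, you can summarize them with one of LangChain's summarization chains. The stuff chain type places all documents into a single prompt, so it only works when the whole export fits in the model's context window. The map_reduce chain type summarizes each document separately and then combines the partial summaries, which handles larger exports at the cost of more LLM calls.\n",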
"\n",
"For example, to summarize the data using the map_reduce chain, you can do the following:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
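"from langchain.chains.summarize import load_summarize_chain\n",
"from langchain.llms import OpenAI\n",
"\n",
"# The chain needs an LLM; OpenAI is an assumption here (it reads OPENAI_API_KEY), any LangChain LLM works.\n",
"llm = OpenAI(temperature=0)\n",
"# map_reduce summarizes each document separately, then combines the partial summaries.\n",
"chain = load_summarize_chain(llm, chain_type=\"map_reduce\")\n",
"summary = chain.run(docs)"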
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The summary variable will contain the summarized text.\n",
"\n",
"Alternatively, you can use the stuff chain type:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
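"from langchain.chains.summarize import load_summarize_chain\n",
"from langchain.llms import OpenAI\n",
"\n",
"# OpenAI is an assumption here (it reads OPENAI_API_KEY); substitute any LangChain LLM.\n",
"llm = OpenAI(temperature=0)\n",
"# stuff places every document into a single prompt, so the export must fit the context window.\n",
"chain = load_summarize_chain(llm, chain_type=\"stuff\")\n",
"summary = chain.run(docs)"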
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Again, the summary variable will contain the summarized text.\n",
"\n",
"Remember to replace 'path/to/notion/export' with the actual path to your Notion export directory."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}