@alfredplpl
Created September 1, 2023 19:47
# MIT License
from transformers import AutoTokenizer
import transformers
from langchain.document_loaders import PyPDFLoader
import torch

# YaRN-extended Llama 2 13B with a 128k-token context window.
model = "NousResearch/Yarn-Llama-2-13b-128k"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

# Load the PDF; PyPDFLoader returns one document per page.
loader = PyPDFLoader("/path/to/paper")
documents = loader.load()
print(len(documents))  # number of pages

# Concatenate all pages into a single string.
document = ""
for doc in documents:
    document += doc.page_content

# Note: deleting newlines outright fuses words that were split across lines.
text = document.replace("\n", "")
print(len(text))  # number of characters

question = "I am going to summarize the academic contribution of this paper in the following statement."
sequences = pipeline(
    f"I am going to read the following academic paper. \n\n {text} \n\n {question}\n",
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=20000,  # total length: prompt tokens plus generated tokens
)

for seq in sequences:
    print(f"Result: {seq['generated_text']}")
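
In the Hugging Face generation API, `max_length` bounds the total sequence length, prompt included, so a long paper leaves only the remainder for the summary. Below is a minimal sketch of a pre-flight check. The helper name `generation_budget` is hypothetical and not part of the gist; it accepts any sequence of token IDs, for example `tokenizer(prompt)["input_ids"]`.

```python
def generation_budget(prompt_ids, max_length):
    """Return how many new tokens fit when max_length also counts the prompt.

    Hypothetical helper (not part of the gist). `prompt_ids` is any sequence
    of token IDs, e.g. tokenizer(prompt)["input_ids"].
    """
    remaining = max_length - len(prompt_ids)
    if remaining <= 0:
        raise ValueError(
            f"Prompt has {len(prompt_ids)} tokens but max_length is {max_length}; "
            "nothing is left for generation."
        )
    return remaining


if __name__ == "__main__":
    # Stand-in for real token IDs: a prompt of 19,000 tokens.
    fake_prompt_ids = list(range(19_000))
    print(generation_budget(fake_prompt_ids, max_length=20_000))  # prints 1000
```

If the budget turns out to be too small, one option is to pass `max_new_tokens` instead of `max_length`, which limits only the generated tokens.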
@alfredplpl
Author

Result: I am going to read the following academic paper.

(a paper with 19k tokens)

I am going to summarize the academic contribution of this paper in the following statement.

“We propose a new task, Relation Inversion, which aims to learn a relation prompt ⟨R⟩that accurately captures the relation that co-exists in multiple exemplar images. Specifically, with objects in eachexemplar image following a specific relation, we aim to obtain a relationprompt in the text embedding space of the pre-trained text-to-image dif-fusion model. The obtained relation prompt ⟨R⟩can then be used as a wordin new sentences to make novel entities interact via the relation in newexemplar images.”
...
https://arxiv.org/abs/2303.13495
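
The fused words in the quoted output ("eachexemplar", "relationprompt", "dif-fusion") are consistent with the script deleting newlines outright, which joins the last word of one line to the first word of the next. The following is a sketch of one alternative clean-up, not the author's method. The helper `clean_pdf_text` is hypothetical; it assumes that a hyphen directly before a newline always marks a split word, so it will wrongly merge a genuine compound that happens to break at its hyphen.

```python
import re


def clean_pdf_text(raw):
    """Normalize text extracted from a PDF (hypothetical helper, not from the gist)."""
    # Rejoin words split by a hyphen at a line break: "dif-\nfusion" -> "diffusion".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Replace remaining newlines with spaces so neighbouring words stay separate.
    text = text.replace("\n", " ")
    # Collapse runs of whitespace left over from the page layout.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    sample = "objects in each\nexemplar image, a text-to-image dif-\nfusion model"
    print(clean_pdf_text(sample))
    # objects in each exemplar image, a text-to-image diffusion model
```

In the script, this would stand in for `document.replace("\n", "")`, at the cost of a slightly longer prompt because the spaces are kept.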
