@phact
Last active November 4, 2023 01:32
I noticed that some text from the SQuAD dataset fails embedding generation with OpenAI's text-embedding-ada-002, so I decided to do a bit of digging.
import openai
texts2 = ['What form of destruction was considered too limited by a smaller group of experts?', 'Prior to being a formal legal term, how was the word "genocide" used in an indictment scenario?', 'Who ultimately defined genocide as a series of strategies leading up to the annihilation of an entire group?', "Lemming's concept of genocide triggered legal action in which realm?", 'What was the nationality of anthropologist Peg LeVine?', 'What relative term did LeVine coin to refer to cultural destruction, without the death of its members?', 'What term was coined to describe the destruction of culture?', 'What kind of scientist is Peg LeVine?', 'What elements of group existence, other than people themselves, can be targets of genocide?', 'What has been the primary focus in the study of genocide?', 'In prosecuting genocide, what must the act be formally acknowledged as?', 'In a general aspect, what is genocide viewed as?', 'In trials of genocidal crimes, what responsibly party is difficult to prosecute?', 'Long before genocide was established as a legal term, what treaty was in place to protect various groups from persecution and mass killings?', 'Why does genocide often go unpunished?', 'Who was the Peace of Westphalia designed to protect?', 'What year was the Peace of Westphalia signed?', 'When was the word "genocide" first used?', 'What is the etymology of the term "genocide"?', 'What is the definition of genocide?', 'Who coined the term "genocide"?', 'Who referred to acts of genocide in 1941?', 'The word "genocide" was unknown until what year?', 'In 1941, how did Winston Churchill refer to the mass killings of Russian prisoners of war?', 'What was the name of the Polish-Jewish lawyer who first described Nazi atrocities as "genocide?"', 'What is the etymological basis of the word "genocide?"', 'As it pertains to violent crimes against targeted groups, what is the ultimate motivation within the actions of genocide?', 'Several considerations were involved in meeting the requirement to determine what?', '', 'What is the key aspect of the targeted part of the group at the starting point of the inquiry?', 'The number of people targeted in a genocide should not be solely evaluated by what?', 'In addition to the numeric size of a targeted group, what other consideration was useful to the ICTY?', 'The issue of what is raised by judges in Paragraph 13?', 'What is the basis for suggesting that several factors regarding the activity of the perpetrators be considered?', 'The extent of what by the perpetrators was considered in an examination of their activity and level of control?', "What will always be restricted in terms of a perpetrator's intent to destroy?", 'While the factor cannot independently indicate if the targeted group is substantial, it can do what?', 'On which date did the Genocide Convention become effective?', 'What was the minimum number of countries necessary to form parties?', 'Of the five permanent members of the UN Security Council, how many were parties to the treaty?', 'What member ratified in 1970?', 'The delay in support by certain powerful members meant the Convention was largely powerless for over how many decades?', 'In 1998 it was written that the CPPCG was a legal entity resulting in which type of compromise?', 'Rather than a definition, the text of the treaty is considered as what type of tool?', 'What does the treaty possess that others lack?', 'The writers Jonassohn and Bjornson cite various reasons for the lack of widespread support of what?', 'What two writers examined the lack of an accepted and singular definition for genocide?', 'The two writers suggested that academics adjusted what in their different definitions to assist them in interpreting events?', 'What writer joined Jonassohn in the study of the whole of human history?', 'With whom was Leo Kuper paired in research that focused on 20th century works?']
texts1 = ['What piece did Chopin dedicate to Schumann?', 'What other musician shows to have elements of Chopin in his work?', 'Who dedicated his 1915 piano Études to Chopin?', "For what publisher to Debussy edit Chopin's music for?", "Who was a student of Chopin's former students and actually recorded some Chopin music?", 'What music did Debussy play a lot at the Paris Conservatoire?', 'Who were Wang Jiawei and Nyima Gyaincain?', 'What important trade did the Ming Dynasty have with Tibet?', 'During what years did the Mongol leader Kublai Khan rule?', 'Who did the Yongle Emperor try to build a religious alliance with?', 'Deshin Shekpa was the head of what school?', 'The Tibetan leaders had a diplomacy with what neighboring state?', 'What did the Tibetans use against Ming forays?', 'Who were the armed protectors for the Gelug Dalai Lama?', 'Which regime did Güshi Khan help establish?', 'When was the Mongol-Tibetan alliance started?', 'In what century did the Tibetan Empire fall?', 'Who signed multiple peace treaties with the Tang?', 'What did one of the treaties between the Tang and Tibet help fix?', 'Who was the Tangs biggest rival?', 'What year did Tang and Tibet sign a treaty to fix the borders?', 'When did the Five Dynasties and Ten Kingdoms period of China take place?', 'When did the Song dynasty take place?', 'What dynasty was concerned with countering northern enemy states?', 'Who ruled the Liao dynasty?', 'Who ruled the Jin dynasty?', 'Which ruler took Western Xia under their control?', "Who was Genghis Khan's successor?", 'What years did Ögedei Khan rule?', 'Who invaded Tibet?', 'Who was the Mongol prince?', 'Who was the leader of the Sakya school of Tibetan Buddhism?', 'Who was the regent of the Mongol Empire?', 'In what years was Töregene Khatun the regent of the Mongol Empire?', 'How many states were ruled by myriarchies?', 'What title did prince Kublai rule as from 1260 to 1294?', 'Who was the superior of prince Kublai?', 'Who became the second Karmapa Lama?', 'With whom did Kublai Khan have a unique relationship with?', 'When did Kublai Khan conquer the song dynasty?', '', 'When did the Yuan dynasty rule?', 'Which dynasty ruled all of china?', 'What did Khubilai claim for a while?', 'Where did Khubilai seek support as Emperor?', 'What year was the Sakya viceregal regime eradicated?', 'Who placed the Sakya viceregal regime position of authority?', 'Who eradicated the Sakya viceregal regime?', 'Which dynasty became ruler of Tibet?']
texts3 = ['What other musician shows to have elements of Chopin in his work?', 'Who dedicated his 1915 piano Études to Chopin?', "For what publisher to Debussy edit Chopin's music for?", "Who was a student of Chopin's former students and actually recorded some Chopin music?", 'What music did Debussy play a lot at the Paris Conservatoire?', 'Who were Wang Jiawei and Nyima Gyaincain?', 'What important trade did the Ming Dynasty have with Tibet?', 'During what years did the Mongol leader Kublai Khan rule?', 'Who did the Yongle Emperor try to build a religious alliance with?', 'Deshin Shekpa was the head of what school?', 'The Tibetan leaders had a diplomacy with what neighboring state?', 'What did the Tibetans use against Ming forays?', 'Who were the armed protectors for the Gelug Dalai Lama?', 'Which regime did Güshi Khan help establish?', 'When was the Mongol-Tibetan alliance started?', 'In what century did the Tibetan Empire fall?', 'Who signed multiple peace treaties with the Tang?', 'What did one of the treaties between the Tang and Tibet help fix?', 'Who was the Tangs biggest rival?', 'What year did Tang and Tibet sign a treaty to fix the borders?', 'When did the Five Dynasties and Ten Kingdoms period of China take place?', 'When did the Song dynasty take place?', 'What dynasty was concerned with countering northern enemy states?', 'Who ruled the Liao dynasty?', 'Who ruled the Jin dynasty?', 'Which ruler took Western Xia under their control?', "Who was Genghis Khan's successor?", 'What years did Ögedei Khan rule?', 'Who invaded Tibet?', 'Who was the Mongol prince?', 'Who was the leader of the Sakya school of Tibetan Buddhism?', 'Who was the regent of the Mongol Empire?', 'In what years was Töregene Khatun the regent of the Mongol Empire?', 'How many states were ruled by myriarchies?', 'What title did prince Kublai rule as from 1260 to 1294?', 'Who was the superior of prince Kublai?', 'Who became the second Karmapa Lama?', 'With whom did Kublai Khan have a unique relationship with?', 'When did Kublai Khan conquer the song dynasty?', '', 'When did the Yuan dynasty rule?', 'Which dynasty ruled all of china?', 'What did Khubilai claim for a while?', 'Where did Khubilai seek support as Emperor?', 'What year was the Sakya viceregal regime eradicated?', 'Who placed the Sakya viceregal regime position of authority?', 'Who eradicated the Sakya viceregal regime?', 'Which dynasty became ruler of Tibet?']
control = ["I like puppies", "kittens are nice too"]
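def getEmbedding(process):
    # `process` is either a single string or a list of strings.
    # Note: openai.Embedding.create is the pre-1.0 openai SDK interface.
    try:
        response = openai.Embedding.create(model="text-embedding-ada-002", input=process)
        print('success')
        #print(f'response {response}')
    except Exception as e:
        # Report the input that failed along with the error from the API.
        print(f"failed to get embeddings for {process}")
        print(e)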
getEmbedding(control)
getEmbedding(texts1)
getEmbedding(texts2)
print('texts1 one by one')
for text in texts1:
    getEmbedding(text)
print('texts2 one by one')
for text in texts2:
    getEmbedding(text)
print('texts1 slices')
length = 1+len(texts1)
print(length)
for i in range(1, length):
    print(f'the first {i} of texts1')
    getEmbedding(texts1[:i])
print('texts2 slices')
length = 1+len(texts2)
print(length)
for i in range(1, length):
    print(f'the first {i} of texts2')
    getEmbedding(texts2[:i])
print('texts3 slices')
length = 1+len(texts3)
print(length)
for i in range(1, length):
    print(f'the first {i} of texts3')
    getEmbedding(texts3[:i])
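Each of the three lists above contains an empty string (`''`), and the OpenAI embeddings API reference states that the input cannot be an empty string. Whether that is what triggers the failures seen here is an assumption; the script above does not confirm it. A minimal sketch for checking a batch before sending it follows. The helper names `find_empty_inputs` and `drop_empty_inputs` are hypothetical and not part of the openai library.

```python
from typing import List, Tuple


def find_empty_inputs(texts: List[str]) -> List[int]:
    """Return the positions of empty strings in a batch (hypothetical helper)."""
    return [i for i, text in enumerate(texts) if text == ""]


def drop_empty_inputs(texts: List[str]) -> Tuple[List[str], List[int]]:
    """Return the non-empty texts and their original positions (hypothetical helper).

    Keeping the positions lets the caller map embeddings back to the
    original list after the empty entries have been removed.
    """
    kept = [(i, text) for i, text in enumerate(texts) if text != ""]
    return [text for _, text in kept], [i for i, _ in kept]


# Small stand-in batch; texts1, texts2, or texts3 could be passed instead.
sample = ["I like puppies", "", "kittens are nice too"]
print(find_empty_inputs(sample))
cleaned, positions = drop_empty_inputs(sample)
print(cleaned, positions)
```

If a cleaned batch passed to `getEmbedding` succeeds where the original batch failed, that would support the empty-string explanation; if it still fails, the cause lies elsewhere.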