@alexcpn
Created April 18, 2023 05:46
Huggingface GPT2 output generation based on parameters

Processing Message from input() Question: New York

Generated 1

New York City. New Yorkers live within walking distance of the capital, and over 90% are located at or near high-speed Internet access points (h/t to WIRED). NYC is a global cultural center with an important influence on commerce; it constitutes one major city in terms
[a]century's worth [of news content]. With its rich media culture coupled by vibrant online communities that foster collaboration among writers from aroundthe world—from emerging markets like China through Latin America into Europe via Asia —NYC has become perhaps most influential place for new creative expression.[1][2], where innovative ideas can be disseminated quickly across disparate audiences without compromising quality control as well,[3],[4](http://www:washingtonpost.-times/.wp.] NYX provides opportunities both inside your home town hall meeting room full time but also outside when you're not there because many people don't have internet connections yet! It offers unparalleled opportunity–and often financial gain –to connect directly between yourself & friends worldwide while still being able "initiate" their work remotely using technology."

Generated 2

New York City. New York, often called New York City[a] or NYC, is the most populous city in the United States. With a 2020 population of 8,804,190 distributed over 300.46 square miles (778.2 km2), New York City is the most densely populated major city in the United States and more than twice as populous as Los Angeles, the nation's second-largest city. New York City is located at the southern tip of New York State. It constitutes the geographical and demographic center of both the Northeast megalopolis and the New York metropolitan area, the largest metropolitan area in the U.S. by both population and urban area. With over 20.1 million people in its metropolitan statistical area and 23.5 million in its combined statistical area as of 2020, New York is one of the world's most populous megacities, and over 58 million people live within 250 mi (400 km) of the city.New York City is a global cultural, financial, entertainment, and media center with a significant influence on commerce, health care and life sciences, research, technology, education, politics, tourism, dining, art, fashion, and sports. Home to the headquarters of the United Nations, New

Difference between 1 and 2

Generated 1

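  # Tokenize the prompt; `tokenizer`, `model`, `prompt`, and `device` are defined elsewhere.
  encoded_input = tokenizer(prompt, truncation=True, padding=False, return_tensors="pt")
  outputs = model.generate(
      input_ids=encoded_input.input_ids.to(device),
      min_new_tokens=200,       # force at least 200 new tokens
      max_new_tokens=250,       # stop after at most 250 new tokens
      # num_beams=1,            # 1 means no beam search
      # early_stopping=True,
      # num_beam_groups=1,      # 1 is the default
      # temperature=1,          # 1 is the default
      no_repeat_ngram_size=1,   # no token may appear twice in the sequence
  )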
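
The effect of `no_repeat_ngram_size` in the call above can be sketched in plain Python. This is only an illustration of the blocking rule, not the library's implementation; `banned_next_tokens` is a hypothetical helper, and the word-level tokens are a simplification (the real rule works on token IDs, so `New` and ` New` with a leading space count as different tokens).

```python
def banned_next_tokens(generated, ngram_size):
    """Return the tokens that would complete an n-gram already present
    in `generated` (hypothetical helper, for illustration only)."""
    if ngram_size < 1 or len(generated) < ngram_size - 1:
        return set()
    # The last (ngram_size - 1) tokens are the prefix of the next n-gram.
    # For ngram_size=1 the prefix is empty, so it matches every position.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned = set()
    for i in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


tokens = ["New", "York", "City", ".", "New"]
print(banned_next_tokens(tokens, 1))  # every token seen so far is banned
print(banned_next_tokens(tokens, 2))  # only tokens that followed "New" before
```

With a size of 1, every token already in the sequence (prompt included) is excluded from the next step, so a long generation is pushed toward tokens it has not used yet.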

Generated 2

  test_output = model.generate(
      input_ids=encoded_input.input_ids.to(device),
      max_length=250,  # total length, prompt tokens included
      num_return_sequences=1,
      pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token, so reuse EOS
  )
  test_answer = tokenizer.decode(test_output[0], skip_special_tokens=True)
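
The two calls also bound the output length differently: `max_new_tokens` counts only generated tokens, while `max_length` counts the prompt as well. A minimal sketch of that arithmetic follows; `new_token_budget` is a hypothetical helper, and the prompt length of 3 tokens is an assumed value, not one measured from the run above.

```python
def new_token_budget(prompt_len, max_length=None, max_new_tokens=None):
    """Upper bound on newly generated tokens (hypothetical helper)."""
    if max_new_tokens is not None:
        # max_new_tokens ignores the prompt length entirely.
        return max_new_tokens
    if max_length is not None:
        # max_length covers prompt + generated tokens.
        return max(max_length - prompt_len, 0)
    raise ValueError("pass max_length or max_new_tokens")


prompt_len = 3  # assumed token count for a short prompt
print(new_token_budget(prompt_len, max_new_tokens=250))  # 250
print(new_token_budget(prompt_len, max_length=250))      # 247
```

For a short prompt the two limits are nearly the same, so the length setting alone is unlikely to explain the gap in output quality.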

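The model is overfitted on a small text, and Generated 2 gives the better output. A likely reason is that the first call sets `no_repeat_ngram_size=1`, which forbids any token from appearing twice, together with `min_new_tokens=200`, which forces generation to continue; the model therefore cannot reproduce the text it memorized. The second call has neither constraint.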
