Last active
January 21, 2021 12:34
Generate First Letter Problem Using Source from "China Daily"
import urllib.request
import urllib.parse
import json
import re
import random

print('''
Generate First Letter Problem Using Source China Daily
By XGN from HHS 2020
V1.1 build 20201209
''')

searchCnt=int(input("Enter search page count:"))
res=[]  # collected [question, answer] pairs

def search(s,source,find):
    # Split the article text into rough sentences on ?, ! and .
    for i in re.split(r"[?!.]",s):
        # Keep only sentences that contain the keyword as a space-delimited word.
        if " "+find.lower()+" " in " "+i.lower()+" ":
            res.append([i.replace(find,find[0]+"_______")+"(From "+source+")",find])

def lookFor(word):
    print("Trying to download:",word)
    base="http://newssearch.chinadaily.com.cn/rest/en/search?keywords=%s&sort=dp&page=%d&curType=story&type=&channel=&source="
    for i in range(searchCnt):
        print("Searching Page",i)
        try:
            # URL-encode the keyword so spaces and special characters are safe in the query.
            request=urllib.request.Request(base%(urllib.parse.quote(word),i))
            with urllib.request.urlopen(request) as response:
                ret=response.read().decode('utf-8')
            # json.loads() no longer accepts an encoding argument (removed in Python 3.9).
            js=json.loads(ret)
            for article in js["content"]:
                search(article["plainText"],article["title"],word)
        except Exception as e:
            print("Failed to fetch!:",e)

# Read the keywords, one per line, skipping blank lines.
with open('in.txt',mode='r') as f:
    keywords=f.readlines()
for i in keywords:
    if i.strip():
        lookFor(i.strip())

random.shuffle(res)

# Write the questions and the answers to separate files, numbered the same way.
with open('out.txt',mode='w',encoding='utf-8') as f2:
    f2.write("Questions:\n")
    for i in range(len(res)):
        f2.write("%d. %s\n\n"%(i+1,res[i][0]))
with open('ans.txt',mode='w',encoding='utf-8') as f3:
    f3.write("Answers:\n")
    for i in range(len(res)):
        f3.write("%d. %s\n"%(i+1,res[i][1]))
Known issues
1. doesn't work for words with '-'
2. you need to enter different forms of words manually, like participate and participated
3. some sentences are wrongly cropped
4. sometimes shows the suffix of the string too
   - now sometimes doesn't clip the word out
v1.1 update log: Fixed issue #4
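One likely cause of the "doesn't clip the word out" case is that the script checks for the keyword case-insensitively but replaces it case-sensitively, so a sentence that starts with "Participate" is selected but left unblanked. A minimal sketch of a case-insensitive, whole-word replacement follows. The helper name `blank_word` is hypothetical and not part of the script, and this only addresses the blanking step, not how the search service handles words with '-'.

```python
import re

def blank_word(sentence, word):
    """Blank every whole-word occurrence of `word`, ignoring case.

    Hypothetical helper for illustration; the original script uses str.replace.
    """
    # (?<!\w) and (?!\w) stop the match from firing inside longer words
    # such as "participated"; re.escape keeps '-' and similar characters literal.
    pattern = re.compile(r"(?<!\w)" + re.escape(word) + r"(?!\w)", re.IGNORECASE)
    # Keep the first letter exactly as it appears in the sentence.
    return pattern.sub(lambda m: m.group(0)[0] + "_______", sentence)

print(blank_word("Participate now, or participate later", "participate"))
print(blank_word("They participated", "participate"))
print(blank_word("a well-known author", "well-known"))
```

Swapping this in for `i.replace(find, ...)` inside `search` would be one way to try it; the sentence-selection check would still need to agree with the same matching rule.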
What is this
This is a small program that grabs sentences from China Daily, blanks out the chosen keywords, and asks you to fill each one in with only its first letter given.
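To make the idea concrete, here is a rough, offline sketch of how one piece of text becomes a question and answer pair. It mirrors the script's approach (split on sentence punctuation, keep sentences containing the word, replace the word with its first letter plus a blank), but the function name `make_questions`, the sample sentence, and the source title are made up for illustration, and the whitespace trimming is an addition.

```python
import re

def make_questions(text, word, source):
    """Return (question, answer) pairs for sentences in `text` that contain `word`."""
    questions = []
    # Rough sentence split on ?, ! and . as in the script.
    for sentence in re.split(r"[?!.]", text):
        # Space-padded, case-insensitive check for the keyword as a whole word.
        if " " + word.lower() + " " in " " + sentence.lower() + " ":
            blanked = sentence.replace(word, word[0] + "_______")
            questions.append((blanked.strip() + " (From " + source + ")", word))
    return questions

# Made-up sample text, not a real China Daily article.
sample = "Many students participate in the event. It starts on Monday."
pairs = make_questions(sample, "participate", "Sample Title")
for question, answer in pairs:
    print(question, "->", answer)
```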
How to use
First create a file called in.txt in the same folder as the program.
Put all the words you want to make problems from in it, one per line.
Then run the program. It will ask you to enter an integer, the search page limit. A bigger number gives you more problems but takes longer to generate. If you are not sure, enter 1 (about 10 problems for each word).
Wait for it to finish, then open out.txt and ans.txt.
Example files: in.txt, out.txt, ans.txt
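The trade-off between the page limit and running time can be seen by listing the requests a run would make: one per keyword per page. The sketch below only builds the URLs and does not contact the server. The URL template is the one used in the script; the function name `planned_requests` and the sample keywords are hypothetical.

```python
from urllib.parse import quote

# URL template taken from the script; %s is the keyword, %d the page index.
BASE = ("http://newssearch.chinadaily.com.cn/rest/en/search"
        "?keywords=%s&sort=dp&page=%d&curType=story&type=&channel=&source=")

def planned_requests(keyword_lines, page_count):
    """List the search URLs a run would request for the given in.txt lines."""
    urls = []
    for line in keyword_lines:
        word = line.strip()
        if not word:
            continue  # skip blank lines
        # Pages are numbered from 0, as in the script's range(searchCnt) loop.
        for page in range(page_count):
            urls.append(BASE % (quote(word), page))
    return urls

# Sample in.txt content (made up): two keywords and one blank line.
urls = planned_requests(["participate\n", "\n", "climate change\n"], 2)
print(len(urls), "requests")
for url in urls:
    print(url)
```

With two keywords and a page limit of 2 this gives four requests, so doubling either the word list or the page limit roughly doubles the download time.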