@kanj
Last active June 26, 2024 14:00
chatGPT Prompts for Larry Gonick's "The Cartoon Guide to Statistics"

I enjoy Larry Gonick's Cartoon Guides. They combine an amazing repertoire of dad jokes with insights into the material, expressed graphically. Unfortunately, most of them do not have problems or quizzes. I decided to try chatGPT as a way to customize testing my comprehension. I am familiar with the material, as I have taken courses in the subject. I wanted to share what I did, the issues I ran into, and how I addressed them. Let's get started.

My first prompt was to set up a course outline:
 I am an adult learner and I would like to create a course outline in statistics based on the following headings:
1. What is statistics
2. Data Description
3. Probability
4. Random Variables
5. A Tale of Two Distributions
6. Sampling
7. Confidence Intervals
8. Hypothesis Testing
9. Comparing Two Populations
10. Experimental Design
11. Regression

The next step is to generate a quiz and take it:
  Please provide a 10 question multiple choice quiz for section 1 - what is statistics. Do not provide the answers until prompted.

Then grade the quiz:
  Please provide an answer key, without explanations, for the most recent quiz.

One is likely to get some answers wrong. I would put them into four categories:

  1. Did not read the question or did not read all the alternatives
  2. Did not understand the material
  3. Question involved material outside the text
  4. Incorrect Scoring (AKA Hallucination)

For 1 and 2, the following prompt should be useful:
 Please explain the answer to question 9 of the most recent quiz
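
The grade-and-review step can also be done locally once the answer key is in hand. The following is a minimal sketch, not part of the original workflow: it compares my answers with the key and prints the explanation prompt for each missed question. The function names and the sample answers are hypothetical and only for illustration.

```python
# Prompt wording taken from the text above; {n} is the question number.
EXPLAIN_TEMPLATE = "Please explain the answer to question {n} of the most recent quiz"


def missed_questions(my_answers, answer_key):
    """Return the 1-based numbers of questions where my answer differs from the key."""
    return [
        n
        for n, (mine, key) in enumerate(zip(my_answers, answer_key), start=1)
        if mine.strip().upper() != key.strip().upper()
    ]


def follow_up_prompts(my_answers, answer_key):
    """Build one explanation prompt per missed question."""
    return [EXPLAIN_TEMPLATE.format(n=n) for n in missed_questions(my_answers, answer_key)]


# Hypothetical data: ten answers from the learner and the key returned by chatGPT.
my_answers = list("ABCDABCDAB")
answer_key = list("ABCDABCDCB")

for prompt in follow_up_prompts(my_answers, answer_key):
    print(prompt)
```

Note that this only flags disagreements with the key. It cannot tell which of the four categories a miss falls into, so the explanation still has to be read critically.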

For 3, the explanation will sometimes provide enough information. Searching the internet for additional material may also be useful. Here are a couple of sample queries:

  • site:wikipedia.org categorical
  • site:math.libretexts.org categorical

Incorrect scoring (AKA Hallucination) requires more customization. We start with the standard prompt for an explanation of the selected answer.
  Please explain the answer to question 7 of the most recent quiz

Question 7: Which sampling method is used when a researcher divides a population into smaller groups and then selects individuals from each group?

  • A. Simple random sampling
  • B. Cluster sampling
  • C. Systematic sampling
  • D. Stratified sampling

Correct Answer:

B. Cluster sampling

Explanation:

Cluster sampling involves dividing the population into clusters or groups and then randomly selecting entire clusters to include in the sample. Once clusters are selected, all individuals within those clusters are included in the sample.

How Cluster Sampling Applies to the Question:

In the context of the question, cluster sampling fits because it describes a method where the population is first divided into smaller groups or clusters, and then a subset of these clusters is chosen randomly for inclusion in the sample. Individuals within the selected clusters are then included in the study or survey.

Why Other Options Are Incorrect:

  • A. Simple random sampling: In simple random sampling, each individual in the population has an equal chance of being selected independently of others, without first grouping into clusters.

  • C. Systematic sampling: Systematic sampling involves selecting individuals from a list at regular intervals, with a random starting point, rather than grouping into clusters.

  • D. Stratified sampling: Stratified sampling involves dividing the population into homogeneous subgroups (strata) and then randomly selecting individuals from each stratum. This method ensures representation from each subgroup.

Advantages and Considerations:

  • Advantages: Cluster sampling can be more practical and cost-effective than other methods, especially when the population is large and geographically dispersed.

  • Considerations: It's important to ensure that clusters are representative of the entire population to avoid bias. Variability between clusters should also be considered during analysis.

Therefore, in the context of the question, option B, cluster sampling, is the correct answer as it involves dividing the population into smaller groups (clusters) and then selecting individuals from within each group.

I followed up with a prompt to identify the error:
 In your explanation for question 7 you stated that individuals from the clusters are selected but elsewhere you stated everyone in a selected cluster is selected. Which is correct?

I apologize for any confusion caused. Let's clarify the correct understanding of cluster sampling:

Cluster Sampling Explanation:

In cluster sampling, the process involves the following steps:

  1. Dividing into Clusters: The population is divided into clusters or groups. These clusters are typically based on naturally occurring groups within the population, such as geographic regions, schools, hospitals, etc.

  2. Random Selection of Clusters: A random sample of clusters is selected from the population. This random selection can be done using simple random sampling or another randomization method.

  3. Including Individuals: After selecting clusters, all individuals within the chosen clusters are included in the sample. This means that if Cluster A and Cluster B are randomly selected, then all individuals within Cluster A and Cluster B are included in the study or survey.

Clarification:

  • When I mentioned "individuals from the clusters are selected," this was incorrect phrasing. In cluster sampling, it's not about selecting individuals from within the clusters but rather selecting entire clusters themselves randomly.

  • The correct approach is that once a cluster is selected, all individuals within that cluster are automatically included in the sample. This is a defining characteristic of cluster sampling.

Correct Understanding:

  • Cluster Sampling: Involves selecting clusters (groups) randomly from the population and including all individuals within those selected clusters in the sample. This approach is different from methods like stratified sampling (where individuals are selected from within strata) or simple random sampling (where individuals are selected randomly without grouping).

Therefore, the accurate statement is that in cluster sampling, all individuals within the selected clusters are selected for inclusion in the sample. This method is used when it's more practical or efficient to sample groups (clusters) rather than individuals directly.

Therefore, in the context of the question, option B, cluster sampling, is the correct answer as it involves dividing the population into smaller groups (clusters) and then selecting individuals from within each group.
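
The distinction that chatGPT keeps blurring can be made concrete in a few lines of code. This is a minimal sketch using only Python's standard `random` module; the population of four schools, the function names, and the sample sizes are assumptions made up for illustration. Stratified sampling draws some individuals from every group, while cluster sampling draws a few whole groups and keeps everyone in them.

```python
import random


def stratified_sample(groups, per_group, rng):
    """Select `per_group` individuals at random from EVERY group (stratum)."""
    return {name: rng.sample(members, per_group) for name, members in groups.items()}


def cluster_sample(groups, n_clusters, rng):
    """Select `n_clusters` whole groups at random and keep ALL of their members."""
    chosen = rng.sample(sorted(groups), n_clusters)
    return {name: list(groups[name]) for name in chosen}


# Hypothetical population: four schools with five students each.
groups = {f"school_{s}": [f"{s}{i}" for i in range(1, 6)] for s in "ABCD"}

rng = random.Random(0)  # fixed seed so repeated runs give the same samples
strat = stratified_sample(groups, per_group=2, rng=rng)
clust = cluster_sample(groups, n_clusters=2, rng=rng)

print("Stratified:", strat)  # every school appears, two students from each
print("Cluster:   ", clust)  # only two schools appear, with all five students each
```

Under this reading, Question 7 ("divides a population into smaller groups and then selects individuals from each group") describes `stratified_sample`, which corresponds to option D, not option B.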

I tried the following prompts:

  • In your explanation for question 7 you stated that individuals from the clusters are selected but elsewhere you stated everyone in a selected cluster is selected. Which is correct?
  • Please provide an updated 10 question multiple choice quiz for section 6 - Sampling. Do not provide the answers until prompted. (This quiz had the same issue as the original quiz)
  • Please review your answer to question 5 of the most recent quiz based on our previous discussion about cluster sampling. (Did not work)
  • Based on our previous discussion we agreed that cluster sampling is the sampling of clusters, not individuals. Do you agree with this and do you want to change your answer to question 5 of the most recent quiz? (chatGPT changed the question to get the original answer)
  • Based on our previous discussion we agreed that cluster sampling is the sampling of clusters, not individuals. Do you agree with this and do you agree that the correct answer to question 5 of the most recent quiz is B - Stratified sampling? (This worked)
  • Please provide an updated answer key, without explanations, for the most recent quiz. (This worked)

I was unable to fix the hallucination about sampling within the conversation, but I was able to get a corrected answer key for a specific quiz. If I wanted a new quiz, I would have to repeat the process.

Given the issue with hallucinations, is the use of chatGPT as a tutor viable? My answer is a tentative yes, based on the following:

  • For a scoring error to go unnoticed, both the learner and chatGPT would have to mark the same wrong answer.
  • The learner should use a reputable source to learn from.
  • The learner should be diligent in reviewing missed questions.
  • Even reputable texts have a small fraction of wrong answers.
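
The first point can be put into rough numbers. The sketch below is a back-of-the-envelope estimate, not a measured result. It assumes that the learner and the answer key make mistakes independently, and that a wrong answer is equally likely to land on any of the incorrect options. The error rates of 20% and 10% are hypothetical placeholders.

```python
def undetected_error_rate(p_learner_wrong, p_key_wrong, n_choices=4):
    """Chance that a question is answered wrong yet scored as correct.

    Both the learner and the key must be wrong, and the learner must pick
    the same wrong option as the key: one of the (n_choices - 1) wrong options.
    Assumes independent errors spread uniformly over the wrong options.
    """
    return p_learner_wrong * p_key_wrong / (n_choices - 1)


# Hypothetical rates: the learner misses 20% of questions, the key is wrong on 10%.
rate = undetected_error_rate(p_learner_wrong=0.2, p_key_wrong=0.1)
print(f"Undetected error rate per question: {rate:.4f}")
```

With these made-up rates, fewer than one question in a hundred would slip through unnoticed. Every other disagreement shows up as a missed question, which is why reviewing misses diligently matters.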
