Outline for


Introduction to the Practice of Statistics


by David S. Moore and George P. McCabe



Chapter 1

What is statistics?
Section 1.1
  1. Important questions of statistics
    1. What questions are relevant to the data?
    2. Who are the individuals the data describes?
    3. What, precisely, are the variables?
    4. How was the data acquired?
    5. How can the information in a single variable be described succinctly?
    6. Are there relationships between variables?
  2. Exploring (single) variables
    1. Use of graphs, charts, stem plots, histograms, time plots (Q: Can each of these be used equally well for all types of variables?)
    2. Features of note: center, spread, deviations, symmetry, number of modes, outliers, seasonal variation, trends (Q: Do each of the aforementioned concepts apply to all types of variables?)
Terms to Know: statistics, individuals, cases, variable (categorical and quantitative), frequency and relative frequency, distribution, bar graph vs. histogram (what is the difference?)
Section 1.2
  1. Numerical summaries of distributions
    1. Why do we use them?
    2. What are the drawbacks of their use?
    3. For what types of distributions are they most effective? (Note: the answer may not be the same for all numerical summaries!)
    4. Which are resistant?
    5. Which measures of center and spread are paired together?
  2. Outliers
    1. Be able to recognize them (from graph — See Sect 1.1; 1.5 × IQR method)
    2. Propose appropriate (context-specific) ways of dealing with them
    Use of Technology: Be able to
    1. Enter data
    2. Sort data
    3. Find the mean, median, variance, standard deviation
    4. Produce box plot (Minitab only)
Terms to Know: mean, median, measure of center/spread, resistant measure, percentiles/quartiles, IQR, five-number summary, box plot, linear transformation
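A minimal sketch of the five-number summary and the 1.5 × IQR outlier rule from this section, in Python with numpy; the data are made up, and numpy's default quartile rule differs slightly from the textbook's recipe:

```python
import numpy as np

# Made-up measurements; any one-variable data set works here
data = np.array([4.9, 5.1, 5.3, 5.6, 5.8, 6.0, 6.1, 6.4, 9.8])

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# 1.5 x IQR rule: points beyond the "fences" are suspected outliers
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = data[(data < low_fence) | (data > high_fence)]

print("five-number summary:", data.min(), q1, median, q3, data.max())
print("suspected outliers:", outliers)   # flags 9.8 with these data
```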
Section 1.3
  1. Density curves
    1. How can a smooth curve represent a distribution?
    2. Why is this type of “mathematical model” useful?
    3. Why is it essential that the area under such a curve be 1?
    4. Describe the placement of mean, median and percentiles along such a curve
    5. What is special about “normal” (density) curves? How many such curves are there?
  2. Normal distributions
    1. How do you tell if a distribution is well-approximated as a normal distribution?
    2. What are some types of data which are typically normally distributed?
    3. Standardizing a normal distribution
      1. Amounts to a linear transformation
      2. Computing z-scores and going from such a standardized score back to an unstandardized one
      3. Use of Table A to determine area under the standard normal curve, and interpreting the meaning of such areas
      4. Why standardize?
      5. Normal probabilities
    Use of Technology: Be able to
    1. answer questions such as those posed in Examples 1.25-1.27
    2. produce normal quantile plots for a given set of data (Minitab only)
    3. perform density estimation for a given data set (Minitab only)
Terms to Know: density curve, normal curve, (standard) normal distribution, standardized value (or z-score), normal quantile plot (normal probability plot), granularity
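A minimal sketch of the standardization steps from Section 1.3, using scipy's norm.cdf as a software stand-in for Table A; the population parameters are hypothetical:

```python
from scipy.stats import norm

mu, sigma = 64.5, 2.5   # hypothetical normal population (e.g., heights in inches)
x = 68.0

z = (x - mu) / sigma            # standardized value (z-score)
p_below = norm.cdf(z)           # area left of z under the standard normal curve
p_above = 1 - p_below

x_back = mu + z * sigma         # un-standardizing recovers the original value

print(f"z = {z:.2f}; P(X < {x}) = {p_below:.4f}; P(X > {x}) = {p_above:.4f}")
```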

Chapter 2: Looking at Data — Relationships

  1. Association between variables
    1. Describe tendencies, not hard-and-fast rules
    2. Not same as causation
  2. Explanatory variable(s)
    1. Often chosen as a result of how data is to be used
    2. Even if data suggests association, wrong to assume changes in explanatory variable cause changes in response variable
  3. Guiding principles
    1. Start with graphical analysis, then add numerical summaries
    2. Look for overall patterns and deviations from those patterns
    3. When the overall pattern is quite regular, use a compact mathematical model to describe it
Terms to Know: associated variables, explanatory/response variables, causation
Section 2.1: Scatterplots
  1. Scatterplots
    1. Relationship between two quantitative variables
    2. Each individual in the study has corresponding point
    3. If one variable designated as explanatory, put it on horizontal axis
    4. Including a categorical variable as a 3rd variable
  2. Examining scatterplots
  3. Studying relationships between a categorical variable and a quantitative one
    1. Use methods of Chapter 1 (back-to-back stemplots, side-by-side boxplots, etc.)
      2. Cannot discuss positive/negative relationship except in those cases where categorical variable has natural ordering (See Example 8, p. 115)
Terms to Know: scatterplot, overall pattern, deviation from a pattern, form/direction/strength of a relationship, outlier, positive/negative association, linear relationship, cluster, smoothing a scatterplot
Section 2.2: Correlation
  1. Correlation
    1. Establishes the strength of a linear relationship between two quantitative variables
    2. Properties
      1. r has same value regardless of which variable is considered explanatory
      2. Direction of relationship comes from sign of r
      3. r has no units, and is unaffected by which units are used for a variable
      4. -1 ≤ r ≤ 1
      5. Will not detect strong nonlinear relationships between variables (Plot your data!)
      6. Not resistant to outliers
    3. Understand the formula as one involving standardized scores for the two variables
  2. Determining r values by sight
    1. Changes in scale do not affect correlation, but can make our eyes think so
    2. Practice determining r; or see a scatterplot for a given value of r
Terms to Know: linear relationship, correlation, strength of a relationship
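The view of r as built from standardized scores (item 3 above) can be checked directly. A small Python sketch with made-up paired data:

```python
import numpy as np

# Hypothetical paired measurements
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

n = len(x)
zx = (x - x.mean()) / x.std(ddof=1)   # standardized scores for x
zy = (y - y.mean()) / y.std(ddof=1)   # standardized scores for y

# The text's formula: r is (nearly) the average product of the z-scores
r = (zx * zy).sum() / (n - 1)

print(f"r = {r:.4f}")                  # matches np.corrcoef(x, y)[0, 1]
```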
Section 2.3: Least-Squares Regression
  1. Least-squares regression line
    1. Requires two quantitative variables, one designated as explanatory (x), the other as response (y)
    2. The line that best fits the data (i.e., of all possible lines drawn, it's the one that makes the sum of squares of vertical distances to data points the smallest)
    3. Calculation of slope, y-intercept from data (p. 141)
    4. Is dependent upon the units of measurement for explanatory/response variables
  2. Prediction
    1. Regression line is used to predict value of response variable y at a fixed value of explanatory variable x
    2. Reliability/accuracy
      1. Interpolation (prediction at x value falling inside observed data values) vs. extrapolation (prediction at x values far from observed values; often inaccurate)
      2. Interpolated values should be good if strength of fit is good (i.e., if r² is close to 1 — see below)
      3. Poor results may occur if regression line in one population is used to make predictions in another population
  3. Connections between correlation and regression
    1. Correlation used in calculation of slope for regression line
    2. r² = (variance of predicted values)/(variance of observed values); i.e., it is the fraction of variation in response values that is explained by least-squares regression of y on x
Terms to Know: regression line, slope, intercept, prediction, square of the correlation
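A short sketch of the slope/intercept recipe (b = r·sy/sx, a = ȳ − b·x̄) and prediction, with hypothetical data; this mirrors the formulas on p. 141 but is not the book's own code:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # explanatory (hypothetical)
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # response (hypothetical)

r = np.corrcoef(x, y)[0, 1]
b = r * y.std(ddof=1) / x.std(ddof=1)     # slope = r * s_y / s_x
a = y.mean() - b * x.mean()               # intercept: line passes through (x-bar, y-bar)

# Prediction (interpolation: 3.5 lies inside the observed x range)
y_hat = a + b * 3.5

print(f"y-hat = {a:.3f} + {b:.3f} x;  r^2 = {r**2:.3f}")
print(f"predicted y at x = 3.5: {y_hat:.3f}")
```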
Section 2.4: Cautions about Correlation and Regression
  1. Assessing strength of a linear relationship
    1. Look at residuals
      1. Difference of observed value and predicted (by the regression line) value
      2. Part of the variation in the response variable left unexplained by the linear association
      3. Mean of residuals for least-squares regression is always 0
    2. Residual plots
      1. Scatterplot with unchanged explanatory variable, but response variable is the residual
      2. Can support or refute whether overall pattern of original variables is linear (see discussion of Figure 2.19 on p. 156)
  2. Looking beyond regression
    1. Time plot of residuals — one way presence of lurking variable may be detected
    2. Investigating outliers (both in x and y directions)
      1. Large studentized residuals help to detect outliers
      2. Large DFITS help to detect influential observations
  3. Warnings
    1. Beware lurking variables
    2. Do not take associations as causation
    3. Correlations based on averaged data are likely to be much stronger than correlations based on individual observations
    4. Successful prediction does not require a cause-and-effect relationship
Terms to Know: residual, residual plot, outliers, influential points, restricted-range
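To make the residual facts above concrete (mean residual always 0; residual plot as x against residual), a small Python sketch with hypothetical data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b, a = np.polyfit(x, y, 1)                # least-squares slope and intercept
residuals = y - (a + b * x)               # observed minus predicted

# Mean of least-squares residuals is always 0 (up to rounding error)
print(f"mean residual: {residuals.mean():.2e}")

# A residual plot is just x against the residuals; a curved pattern here
# would argue against a linear overall pattern in the original data.
for xi, ri in zip(x, residuals):
    print(f"x = {xi:.1f}   residual = {ri:+.3f}")
```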
Section 2.5: An Application — Exponential Growth and World Oil Production
Section 2.6: Relations in Categorical Data
  1. Distributions of a two-way table
    1. Marginal distributions
    2. Conditional distributions
      1. At the cell level
      2. Involves looking at row/column percents
      3. Key to discovering nature of relationship between variables
        1. That some relationship exists can be ascertained using test of significance of Chapter 9
        2. Column/row percents can be plotted in a histogram
  2. Simpson's paradox
    1. Illustrates how aggregate data can hide lurking variables
Terms to Know: counts/frequencies, percents/relative frequencies, two-way table, row/column variable, marginal distributions, conditional distributions, three-way table, aggregate data
Section 2.7: The Question of Causation
  1. Explanations for associations
    1. Direct causation
      1. Direct cause-and-effect between explanatory and response variables
      2. Difficult to prove via an observational study; best established through controlled experiments
    2. Common response
      1. Both explanatory and response variables change in response to some third (lurking) variable
    3. Confounding
      1. Lurking variable(s) present
      2. Cannot distinguish between effects of several variables upon the response variable
  2. Evidence for causation outside of a controlled experiment
    1. Association is very strong
    2. Association is consistent across numerous studies
    3. Higher levels of treatment associated with stronger responses
    4. Alleged cause preceded (in time) the response
    5. Alleged cause is plausible
Terms to Know: cause-and-effect, lurking variables, common response, confounding

Chapter 3

    Understand the difference in attitudes when looking at data for:
    1. exploratory data analysis
    2. an answer to a specific question
Section 3.1
  1. Some Internet resources of (available) data
  2. Designs for producing data
    1. Sampling
      1. What are the advantages/disadvantages as compared to a census?
    2. Observational studies vs. experiments
      1. What types of questions can/cannot be answered using an observational study? (see also Section 3.2)
      2. Be able to give examples of each type.
Terms to Know: anecdotal evidence, available data, designs, sample, census, observational study, experiment, confounding
Section 3.2
  1. Advantages of experiments (over observational studies):
    1. Provide good evidence for causation
    2. Can minimize the effect of lurking variables
    3. Can study effects of combined factors
  2. Design of experiments
    1. Control
      1. Comparisons between treatments
      2. Arranging experimental units (subjects) into blocks
        1. Compare to “strata” for sampling
        2. Matched pairs (only when there are just two treatments)
    2. Randomization
      1. Purpose: to remove effects of lurking variables
      2. Complete vs. randomization within blocks
      3. Use of Minitab and Table B to randomize
    3. Replication (many experimental units reduce effects of chance variation)
Terms to Know: experimental units/subjects, treatment, factors, explanatory/response/lurking variables, level, placebo, placebo effect, causation, bias, control/experimental group, matching, randomization, statistical significance, blind/double-blind study, lack of realism, blocks
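Where the outline calls for Minitab or Table B to randomize, general software can do the same job. A minimal sketch of complete randomization in Python; the subject labels and group sizes are hypothetical:

```python
import random

# Hypothetical: 20 subjects to be split evenly between two treatments
subjects = list(range(1, 21))

random.seed(1)               # fixed seed so the assignment is reproducible
random.shuffle(subjects)     # complete randomization

treatment_A = sorted(subjects[:10])
treatment_B = sorted(subjects[10:])
print("A:", treatment_A)
print("B:", treatment_B)
```

For randomization within blocks, the same shuffle would be applied separately inside each block.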
Section 3.3
  1. Some designs for sampling
    1. Voluntary response samples
      1. Ex: A TV news magazine gives an 800-number for the audience to express its opinion.
      2. Highly subject to bias
    2. Probability samples
      1. Each member of the population is assigned a certain probability for being chosen and random chance is used to choose
      2. SRS is special case when each member of population is assigned the same probability (i.e., equally likely to be chosen)
    3. Stratified random samples
      1. Population is divided into “strata” (analogous to “blocks” from experiments)
      2. An SRS is taken within each stratum
    4. Multistage samples
      1. A hybrid of stratifying and using SRS
        1. Population is stratified (or subdivided)
        2. SRS used to select subdivisions to work with
      2. Makes “door-to-door” interviews more practical (less costly)
  2. Issues surrounding surveys (and statistical studies in general)
    1. Questions to ask in determining the soundness of a statistical study
    2. Given the answers to the above list of questions, what types of bias is the study prone to?
Terms to Know: population, sample, undercoverage, nonresponse/response bias
Section 3.4
  1. Simulations
    1. Understand how simulations give a sense of the amount of variability in a statistic for various sample sizes.
    2. Be able to use Minitab to produce a sampling distribution (such as Fig. 3.6) when given:
  2. Questions:
Terms to Know: parameter, statistic, sampling variability, simulation, sampling distribution, unbiased estimator, capture-recapture sampling
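A rough stand-in for the Minitab simulation this section asks about: estimating the sampling variability of a sample proportion at several sample sizes. The population proportion 0.6 is made up:

```python
import random

p, n_samples = 0.6, 1000     # hypothetical true proportion; 1000 simulated samples

def sample_proportion(n):
    """Draw n independent observations, return the sample proportion."""
    return sum(random.random() < p for _ in range(n)) / n

random.seed(1)
for n in (10, 100, 1000):
    props = [sample_proportion(n) for _ in range(n_samples)]
    spread = max(props) - min(props)
    print(f"n = {n:5d}: sample proportions ranged over {spread:.3f}")
# The range shrinks as n grows: larger samples mean less sampling variability.
```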

Chapter 4

Section 4.1
  1. Definition of probability
    1. Empirical (probability = long-term relative frequency)
    2. Relies entirely on randomness
      1. Short-term results are unpredictable
      2. Long-term pattern behavior
  2. Independence of trials
    1. Fundamental nature of the assumption
      1. Q: Is probability theory anti-religion?
    2. Difficult for people to accept
      1. Law of Averages (the gambler's fallacy): “I just got 10 heads in a row; must be due for some tails.”
      2. Myth of short-run regularity: “I made 10 shots in a row; must have the touch tonight.”
Terms to Know: random, probability (empirical vs. intuitive — see Section 4.2, p. 297, for the latter)
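The long-run-relative-frequency idea can be seen in a quick simulation (a fair coin, probability 0.5, is assumed):

```python
import random

random.seed(2)
heads = 0
for flips in range(1, 100_001):
    heads += random.random() < 0.5
    if flips in (10, 100, 10_000, 100_000):
        # Short-run results wobble; the long-run relative frequency settles near 0.5
        print(f"after {flips:6d} flips: proportion heads = {heads / flips:.4f}")
```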
Section 4.2
  1. Probability Model (Note the connection to sampling distribution)
    1. List of all possible outcomes (sample space )
    2. Assignment of a probability to each outcome
      1. 0 ≤ P(A) ≤ 1 for every event A
        1. Note how probabilities are assigned under the assumption that the finitely-many outcomes are equally-likely.
      2. If S denotes the sample space, then P(S) = 1
      3. P(Aᶜ) = 1 - P(A) for every event A
      4. Sum rule: P(A or B) = P(A) + P(B)
        1. Requires events A and B to be disjoint (use of Venn diagram to determine this)
        2. P(A) = Σ P(Ai), where the Ai are the (finitely-many) individual outcomes in event A
        3. See Section 4.5 for more general sum rule (one that applies even for non-disjoint events)
  2. Computing probabilities using multiplication rule: P(A and B) = P(A) P(B)
    1. Events A and B must be independent
      1. Different concept from disjointness
      2. Independence cannot be determined from Venn diagram (unlike disjointness)
    2. See Section 4.5 for more general multiplication rule (one that applies even for non-independent events)
Terms to Know: probability model, outcome, sample space, event, disjoint events, independent events, complement of an event, addition (sum)/multiplication rule
Section 4.3
  1. Discrete random variable
    1. A certain kind of probability model
    2. Lends itself well to display via a probability histogram. Some have special names:
      1. Figure 4.6(a) (probability of obtaining a certain digit from a table of random digits) is example of a uniform probability distribution
      2. Figure 4.8 (Example 4.16) is an example of a binomial probability distribution
  2. Continuous random variable
    1. Probability model is specified by a density curve
      1. Probability of an event corresponds to an area under the curve. Note the implications of P(S) = 1 .
      2. Individual outcomes have probability 0. Thus P(X < v) and P(X ≤ v) are equal.
    2. Important class of examples are the normally-distributed continuous random variables
Terms to Know: (discrete/continuous) random variable, density curve, probability histogram
Section 4.4
  1. Summarizing probability distributions
    1. Mean (expected value) μX of a random variable X
      1. Know how to calculate for discrete random variables (use weighted average)
      2. Law of Large Numbers and estimation of μ
    2. Variance/standard deviation for a discrete random variable
    3. Rules for means, variances
      1. under linear transformation: a + bX
      2. of the sum of two random variables (Note the independence requirement for variances)
  2. Law of Large Numbers
    1. Allows stable prediction of random outcomes
    2. Does not tell how large
    3. Contrast to the various “laws of small numbers” on p. 332 — randomness is generally misunderstood by the public
Terms to Know: weighted average, mean/variance/standard deviation of a probability distribution (mean = expected value), Law of Large Numbers
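A minimal sketch of the weighted-average calculations and the a + bX rules above, with a made-up discrete distribution:

```python
# Hypothetical discrete distribution: values of X and their probabilities
values = [0, 1, 2, 3]
probs  = [0.2, 0.4, 0.3, 0.1]

mu = sum(x * p for x, p in zip(values, probs))              # weighted average
var = sum((x - mu) ** 2 * p for x, p in zip(values, probs)) # weighted squared deviation
sd = var ** 0.5

# Rules for linear transformations a + bX:
a, b = 5, 2
mu_new = a + b * mu        # mean of a + bX
var_new = b ** 2 * var     # variance of a + bX (the shift a drops out)

print(f"mu = {mu}, sigma^2 = {var:.3f}, sigma = {sd:.3f}")
print(f"a + bX: mean = {mu_new}, variance = {var_new:.3f}")
```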
Section 4.5
  1. Conditional probability
    1. Events need not be independent
    2. Q: What does P(B | A) equal when A and B are independent?
  2. Expanded rules for computing probabilities
    1. Addition Rule (Inclusion-Exclusion Principle): P(A or B) = P(A) + P(B) - P(A and B)
    2. Multiplication Rule: P(A and B) = P(A) P(B | A)
    3. Combinations of these rules and use of tree diagrams
    4. Bayes' Rule
Terms to Know: union, intersection, conditional probability, personal probability
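The expanded rules combine naturally in a tree-diagram calculation ending in Bayes' rule. A sketch with hypothetical probabilities:

```python
# Hypothetical numbers for a two-branch tree diagram:
# event A with P(A) = 0.01, and an event B with the conditional probabilities below
p_A = 0.01
p_B_given_A = 0.95          # P(B | A)
p_B_given_notA = 0.10       # P(B | not A)

# Multiplication rule: P(A and B) = P(A) P(B | A)
p_A_and_B = p_A * p_B_given_A

# Total probability across the two branches of the tree
p_B = p_A * p_B_given_A + (1 - p_A) * p_B_given_notA

# Bayes' rule: reverse the conditioning
p_A_given_B = p_A_and_B / p_B
print(f"P(A | B) = {p_A_given_B:.4f}")   # about 0.0876 with these numbers
```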

Chapter 5

Section 5.1
  1. Binomial Distributions
    1. A discrete probability distribution
    2. Applicable situations (i.e., binomial settings)
      1. n (a fixed number) independent observations (rule-of-thumb: population-to-sample size ratio at least ten is sufficient for approximate independence; see Example 5.3 and following)
      2. Each observation falls into one of two categories (call them successes and failures)
      3. Probability of success is p (fixed) for each observation
    3. A different binomial distribution B(n, p) for each pair of values n and p
      1. Skewed for small values of n, p
      2. Approximately normal when np ≥ 10 and n(1 - p) ≥ 10 (that is, when the expected number of successes and failures are both at least 10)
    4. Computing probabilities for binomially-distributed random variables
      1. Table C
      2. Using calculator/Minitab
    5. Formulas for mean, standard deviation of B(n, p) (bottom, p. 380)
  2. Count vs sample proportion
    1. Both are natural random variables for categorical outcomes
    2. proportion = (count)/(number of observations)
    3. Mean and standard deviation formulas for proportion
      1. Valid if count is binomially distributed; approx. valid in SRS where population-to-sample size ratio is at least 10
      2. Mean formula shows sample proportion for an SRS is unbiased estimator of parameter p
      3. S.D. formula quantifies how spread goes down as sample size goes up
    4. Probability distribution of sample proportion
      1. Not binomially distributed even when count is
      2. Like count, it is approximately normal when np ≥ 10 and n(1 - p) ≥ 10
  3. Approximating binomially-distributed random variable with normal distribution
    1. As a rule, do only if np ≥ 10 and n(1 - p) ≥ 10
    2. Expect better results if p ≈ 1/2
    3. Use continuity correction when n is not large
  4. Binomial probabilities P(X = k)
    1. Found in Table C for certain values of n, k, and p
    2. Formula from which these table values come (see p. 388)
Terms to Know: population vs. sampling distribution (related to parameter vs. statistic), sample proportion, binomial distribution, count, continuity correction, success/failure, factorial, unbiased estimator (p. 382)
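A sketch of the binomial calculations above in Python (scipy): exact probabilities in place of Table C, the mean/SD formulas, and the normal approximation with continuity correction. The n and p are made up but satisfy the np ≥ 10 and n(1 - p) ≥ 10 rule:

```python
from scipy.stats import binom, norm

n, p = 50, 0.3                     # hypothetical B(n, p); np = 15, n(1-p) = 35
mu = n * p                         # mean of B(n, p)
sigma = (n * p * (1 - p)) ** 0.5   # standard deviation of B(n, p)

# Exact binomial probability P(X <= 12), the software analogue of Table C
exact = binom.cdf(12, n, p)

# Normal approximation with continuity correction:
# P(X <= 12) is approximated by P(Z <= (12.5 - mu) / sigma)
approx = norm.cdf((12.5 - mu) / sigma)

print(f"mu = {mu}, sigma = {sigma:.3f}")
print(f"exact = {exact:.4f}, normal approx = {approx:.4f}")
```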
Section 5.2: Sampling Distribution of Sample Mean
  1. Sample mean in an SRS of size n
    1. A random variable X̄
      1. Individual observations Xi also random variables
        1. Each Xi distributed as the population, if population is large compared to size n of sample
        2. Mean and S.D. for each Xi are the population mean and S.D.: μ and σ
      2. Definition of X̄: X̄ = (1/n) Σ Xi
      3. Mean and standard deviation of X̄ — see p. 399
  2. Distribution of mean X̄ compared to population distribution
    1. Distribution of X̄ is
      1. normal if population (individual Xi) is normal
      2. increasingly normal (as sample size n increases) even if population is not (Central Limit Theorem)
    2. Spread for X̄ not as great as for population; decreases as n increases (reflected in formula for S.D.)
Terms to Know: sampling distribution, sample mean, unbiased estimator
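A quick simulation illustrating the Central Limit Theorem and the shrinking spread of X̄; the exponential population is an arbitrary choice of a skewed, non-normal distribution:

```python
import numpy as np

rng = np.random.default_rng(3)

# Strongly skewed (exponential) population with mean 1 and SD 1
pop_mean, reps = 1.0, 10_000

for n in (2, 10, 50):
    # 10,000 simulated SRSs of size n; take the mean of each
    means = rng.exponential(pop_mean, size=(reps, n)).mean(axis=1)
    print(f"n = {n:3d}: mean of x-bar = {means.mean():.3f}, "
          f"SD of x-bar = {means.std(ddof=1):.3f} (theory: {pop_mean/np.sqrt(n):.3f})")
# The SD of x-bar tracks sigma/sqrt(n), and a histogram of the means
# (not shown) looks more and more normal as n grows: the Central Limit Theorem.
```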
Section 5.3

Chapter 6: Inference on the mean

Section 6.1: Confidence intervals
  1. Nature of confidence intervals
    1. Range from (estimate - margin of error) to (estimate + margin of error)
    2. Have an associated confidence level C
      1. C percent of the time the confidence interval for the estimated statistic contains the parameter — see Fig. 6.2, p. 438
      2. Desirable to have C as high as possible (usual values: 90%, 95%, 99%)
    3. Margin of error
      1. Desirable to have as small as possible (at odds with desire for a high confidence level)
      2. Can decrease in one of three ways (see bullets on p. 442)
    4. Assume unbiased estimator of parameter — account only for chance variation
  2. Confidence interval for a population mean
    1. Underlying assumptions
      1. Population is normally distributed
        1. Without this, confidence won't be as great as advertised
        2. With n ≥ 15 (# of observations) and no extreme outliers or skewness, confidence isn't severely compromised
        3. Data is unbiased, subject only to random sampling error
      2. Data is an SRS of the population, or can be considered as one (not a multistage/stratified sample)
    2. Extends from (sample mean - margin of error) to (sample mean + margin of error)
    3. The “fine print”
      1. If possible, explain and/or correct outliers (nonresistance of sample mean)
      2. In practice, won't know σ; might substitute sample standard deviation s if large sample size
      3. Mustn't interpret
        1. confidence level as a probability that true mean lies in interval; rather, as how often the method gives correct answers
        2. confidence interval/level as a prediction that C% of observations lie inside this interval
Terms to Know: margin of error, inference, confidence level
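A minimal sketch of the level C interval x̄ ± z*·σ/√n, assuming (hypothetically) that σ is known; the data are made up:

```python
from scipy.stats import norm
import numpy as np

# Hypothetical SRS from a normal population with known sigma
data = np.array([102.0, 98.5, 101.2, 99.8, 103.1, 100.4, 97.9, 101.7])
sigma = 2.0                  # assumed known population standard deviation
C = 0.95

z_star = norm.ppf((1 + C) / 2)            # critical value for confidence level C
m = z_star * sigma / np.sqrt(len(data))   # margin of error
x_bar = data.mean()

print(f"{C:.0%} CI for mu: ({x_bar - m:.2f}, {x_bar + m:.2f})")
# Interpretation: the *method* captures mu in C% of all samples; this is
# not a probability statement about the one interval in hand.
```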
Section 6.2: Significance Tests
  1. Hypotheses
    1. Null hypothesis (H0 )
      1. A supposition about a population parameter: p = p0 (in this section, μ = μ0)
      2. Will test compatibility of H0 with sample statistic
    2. Alternative hypothesis (Ha )
      1. Statement of an alternative to H0 we suspect to be true
      2. One-sided (Ha: p > p0 or Ha: p < p0) vs. two-sided (Ha: p ≠ p0)
  2. Test for mean
    1. Underlying assumption: sample mean is normally-distributed as N(μ, σ/√n)
      1. True if population is normally distributed
      2. Approximately true if sample size n is large
    2. Compute test statistic (a z-score) for sample mean assuming hypothesized population mean
    3. Get associated P-value (probability associated with test statistic and Ha; see box, p. 461)
    4. Compare P-value to predetermined significance level α
      1. α is written as a decimal, not a percentage (i.e., it is between 0 and 1)
      2. Common levels of significance: 0.1, 0.05, 0.01
Terms to Know: null hypothesis, alternative hypothesis (1 or 2-sided), test statistic, P-value, statistical significance
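The z test steps above, sketched in Python with hypothetical data and a made-up H0:

```python
from scipy.stats import norm
import numpy as np

# Hypothetical test of H0: mu = 100 vs. two-sided Ha, with sigma known
data = np.array([102.0, 98.5, 101.2, 99.8, 103.1, 100.4, 97.9, 101.7])
mu0, sigma, alpha = 100.0, 2.0, 0.05

z = (data.mean() - mu0) / (sigma / np.sqrt(len(data)))  # test statistic

p_two_sided = 2 * norm.sf(abs(z))    # P(Z >= |z|), doubled for a 2-sided Ha
p_one_sided = norm.sf(z)             # would be used for Ha: mu > mu0

print(f"z = {z:.3f}, two-sided P-value = {p_two_sided:.4f}")
print("reject H0" if p_two_sided < alpha else "fail to reject H0")
```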
Section 6.3: Use/Abuse of Tests
  1. Significance tests are not appropriate for all data sets
    1. Outliers can exaggerate/de-emphasize significance
    2. Confounding is not removed by such tests
      1. Statistical significance establishes results are unlikely due to random chance
      2. Does not provide reason for significance (could be some suspected effect of a treatment, could be poor study design)
  2. Significance level α
    1. Importance of choosing a level ahead of time when decisions will be made based upon results
    2. Choosing a level: Consider
      1. how believable is the null hypothesis
      2. consequences of rejecting null hypothesis
    3. Avoid thinking of results as insignificant if α is not reached, significant if it is
  3. Misinterpreting statistical significance
    1. Statistical significance vs. practical importance
    2. Significance may lead to rejecting null hypothesis in favor of alternative; lack of significance only means results are consistent with null hypothesis
  4. Danger of “searching for significance”
Section 6.4

Chapter 7

Section 7.1: Inference for the Mean of a Population
  1. t distributions
    1. Correct distribution for sample mean when σ (for underlying population) is not known and s (the sample standard deviation) is used in its place
    2. Description
      1. Standardized so centered about 0
      2. Symmetric and bell-shaped
      3. Larger spread than normal distribution
    3. Degrees of freedom
      1. df = n - 1
      2. More like N(0, 1) as df increases
  2. One-sample t confidence intervals
    1. Used in place of confidence interval for population mean (as learned in Section 6.1) when σ (for population) is unknown
    2. Determination of margin of error
      1. One-sample t statistic (determined for a confidence level C from Table D) used in place of z statistic
      2. Use standard error of sample mean in place of standard deviation for sample mean
  3. One-sample t test
    1. Used in place of z test (see Section 6.2, p. 461) when σ unknown
    2. Formulate null/alternative hypotheses just as usual
    3. Determine t statistic as you would z statistic, but using SE for sample mean rather than SD
    4. Determine P-value from appropriate t distribution (Table D)
    5. Note method of reporting conclusion (as at end of Example 7.5, p. 511)
  4. Matched pairs t procedures (comparative inference)
    1. Procedures are just like above, but performed on the difference
    2. Usually have H0: μ = 0 and one-sided alternative hypothesis
  5. When are t procedures valid
    1. Exactly correct when population is normal
    2. Approximately correct when n ≥ 15 except in case of outliers or strong skewness
    3. Clear skewness (no outliers) OK if n ≥ 40
  6. Power of the t test
    1. This is optional reading. To fully understand the discussion, you ought to study Section 6.4 as well.
  7. Inference for non-normal populations
    1. Use a known distribution that is not normal but fits well
    2. Make a transformation that brings about normality
    3. Use distribution-free procedures (Example: the sign test)
Terms to Know: standard error, one-sample t, degrees of freedom, matched pairs test, robust
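A sketch of the one-sample t interval and test, with made-up data; scipy's ttest_1samp plays the role of Table D for the P-value:

```python
from scipy.stats import t, ttest_1samp
import numpy as np

data = np.array([5.2, 4.8, 6.1, 5.5, 4.9, 5.8, 6.3, 5.0])  # hypothetical SRS
mu0, C = 5.0, 0.95
n = len(data)

# t confidence interval: x-bar +/- t* SE, with SE = s / sqrt(n) and df = n - 1
se = data.std(ddof=1) / np.sqrt(n)
t_star = t.ppf((1 + C) / 2, df=n - 1)
print(f"{C:.0%} CI: ({data.mean() - t_star*se:.3f}, {data.mean() + t_star*se:.3f})")

# One-sample t test of H0: mu = mu0 (two-sided by default)
stat, p_value = ttest_1samp(data, mu0)
print(f"t = {stat:.3f}, P-value = {p_value:.4f}")
```

A matched pairs analysis would apply the same two calls to the differences within pairs.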
Section 7.2: Comparison of Two Means
  1. Context of two-sample problems
    1. Want to compare responses in two groups
      1. Often (but not exclusively) used in comparative experiments
      2. Usually comparisons are made on the groups' mean responses
    2. Groups can be considered as samples from distinct populations
    3. Responses of units in one group independent of those in the other
  2. Two-sample statistics
    1. Two-sample z statistic
      1. How the formula follows from previous (one-sample) z statistics
        1. Two random variables, one from each group (measuring same thing, but possibly having different distributions)
        2. Looking at difference between these variables, so sample/population mean is difference of ones for each group, variance for difference computed from individual variances via formula, p. 337
      2. Is distributed normally (or approximately so) as N(0, 1) when underlying populations are normal (or approximately)
      3. Used when standard deviations for underlying populations are known (somewhat unusual)
    2. Two-sample t statistic
      1. Used when
        • Samples and population distributions of both groups satisfy conditions mentioned in Section 7.1 for t procedure validity, and
        • standard deviations of populations are not known
      2. Formula is one arising naturally from that for two-sample z statistic
      3. Does not have t distribution
        1. Is approximately t for the correct df
        2. Best df comes from formula, p. 549 (but use software or method below instead of memorizing this)
        3. We get good (conservative) estimate taking df = min{n1 - 1, n2 - 1}
  3. Inference on the difference of two population means
    1. Two-sample t significance test
      1. Null hypothesis: the population means are equal
      2. Notation used in results
      3. Interpretation of results
    2. Two-sample t confidence interval
      1. Interpretation of such an interval
  4. Robustness
    1. Most robust against nonnormality if sample sizes equal
    2. If sample sizes equal and distributions of two populations the same, can take sample sizes as low as 5
    3. Using t procedures with small samples
  5. Optional material
    1. Software approximation for degrees of freedom
    2. Pooled two-sample t procedures
Terms to Know: difference of sample means, two-sample z and t statistics, conservative estimates
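A sketch of the two-sample t procedures with hypothetical samples; scipy's unpooled (Welch) test computes the df formula from p. 549 itself, and the conservative hand rule is shown alongside:

```python
from scipy.stats import ttest_ind
import numpy as np

group1 = np.array([24.1, 26.3, 22.8, 25.5, 27.0, 23.9])  # hypothetical samples
group2 = np.array([21.7, 23.2, 20.9, 22.5, 24.1, 21.3])

# equal_var=False gives the unpooled two-sample t, matching the text's
# procedure; the software works out the degrees of freedom internally.
stat, p_value = ttest_ind(group1, group2, equal_var=False)
print(f"t = {stat:.3f}, P-value = {p_value:.4f}")

# Conservative hand method from the outline: df = min(n1 - 1, n2 - 1)
df_conservative = min(len(group1), len(group2)) - 1
print(f"conservative df = {df_conservative}")
```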
Section 7.3: Optional Topics in Comparing Distributions

Chapter 8

Section 8.1: Inference for 1-proportions
  1. Large-sample confidence interval for population proportion
    1. Basic assumptions
      1. Population-to-sample size ratio is at least 10 (so count is approximately binomially distributed)
      2. Sample size is large enough that expected value of successes and failures are both at least 10 (so binomial dist. well-approximated by normal dist.)
    2. Interval is (p̂ - m, p̂ + m)
      1. Margin of error determined differently than for inference on a population mean (see Section 6.1)
      2. Standard error (SE) of sample proportion: like standard deviation with sample proportion in place of true proportion (unknown parameter)
      3. Desired level of confidence (percentage) → z* (from the ∞ row of Table D)
  2. Large-sample significance test for population proportion (H0 : p = p0 )
    1. Comparison to confidence interval
      1. Significance test good if specific (ideal) p0 is suspected
      2. Confidence interval provides range of compatible p
    2. Basic assumptions: as for confidence intervals but np0 ≥ 10 and n(1 - p0) ≥ 10.
    3. P-values determined from appropriate choice of P(Z ≤ z), P(Z ≥ z), or P(Z ≥ |z|)
  3. Determination of sample size
    1. Must first specify:
      1. a desired margin of error m
      2. a desired level (percentage) of confidence → z*
      3. a guess p* at the true proportion (can take the worst-case guess p* = 0.5)
Terms to Know: standard error, approximate level C confidence interval, null/alternative hypothesis, P-value, z-statistic (test statistic), sample proportion
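A sketch of the large-sample interval for p and the sample-size calculation, with made-up counts:

```python
from scipy.stats import norm
import numpy as np

count, n, C = 520, 1000, 0.95          # hypothetical: 520 successes in an SRS of 1000
p_hat = count / n

z_star = norm.ppf((1 + C) / 2)
se = np.sqrt(p_hat * (1 - p_hat) / n)  # standard error: p-hat replaces unknown p
m = z_star * se
print(f"{C:.0%} CI for p: ({p_hat - m:.4f}, {p_hat + m:.4f})")

# Sample size for a desired margin of error m0, worst-case guess p* = 0.5
m0, p_star = 0.03, 0.5
n_needed = np.ceil((z_star / m0) ** 2 * p_star * (1 - p_star))
print(f"n needed for margin {m0}: {n_needed:.0f}")
```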
Section 8.2: Comparison of Two Proportions
  1. Setting
    1. Have categorical data (one variable, 2 options) for samples from two groups (populations)
    2. Want to compare proportions between populations
  2. Inference procedures on the difference of two population proportions (Interpretation and notation)
    1. Confidence intervals
    2. Tests of significance
      1. Standard error arrived at in a somewhat different way than all previous standard errors
      2. pooled estimate (combining of two sample proportions)
      3. Null hypothesis: the two population proportions are equal
  3. Optional material
    1. Relative risk
Terms to Know: difference of sample proportions, pooled estimate
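A sketch of the two-proportion z test with its pooled standard error; the counts are hypothetical:

```python
from scipy.stats import norm
import numpy as np

# Hypothetical counts of successes in samples from two populations
x1, n1 = 120, 400
x2, n2 = 90, 380
p1, p2 = x1 / n1, x2 / n2

# Pooled estimate under H0: p1 = p2 (combine the two samples)
p_pool = (x1 + x2) / (n1 + n2)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z = (p1 - p2) / se
p_value = 2 * norm.sf(abs(z))          # two-sided
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
```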

Chapter 9: Inference for Two-Way Tables

  1. Two-way tables
    1. Give counts for two categorical variables
      1. Can be used for categorical information (S or F) for samples from 2 populations (like material studied in Section 8.2)
      2. Variables may have more than two options, resulting in more rows and/or columns
    2. Constructing them
      1. Columns for explanatory variable, rows for response variable
      2. Additional row/column for totals
      3. Grand total
    3. Row/column/marginal percents
  2. Test of significance
    1. Results in 2 × 2 case same as if you use procedures of Section 8.2
    2. Hypotheses
      1. Null hypothesis: No association between variables
        1. Leads to expected cell counts
        2. Rejected for small P-value (taken from Table F; see below) in favor of alternative hypothesis
      2. Alternative hypothesis: association exists
        1. Always two-sided
        2. Exact nature of association ascertained by looking at data and should be included in answer (see, for example, the 1st paragraph on p. 632; the last full paragraph on p. 636)
    3. chi-square statistic
      1. df = (#rows - 1)(#cols - 1)
      2. Distributed as χ²(df) (Table F) if
        1. table is 2 × 2 and each expected cell count is at least 5
        2. table is bigger than 2 × 2, each expected cell count is at least 1, and the average expected cell count is at least 5
  3. Two models for two-way tables (neither model allows the same unit to be counted in more than one cell)
    1. Explanatory variable is the population (i.e., each column represents a different population — as in male vs. female; GM cars vs. Ford vs. Chrysler)
    2. Columns represent subdivisions within a single population (as in categorizing Americans by their age as in Table 4.1, p. 350; cats by their source as in Exercise 9.3, p. 644; etc.)
  4. Optional material
    1. Meta-analysis
Terms to Know: two-way table, cell, row/column percentages, expected cell counts, chi-square statistic, joint/conditional distributions
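A sketch of the chi-square test on a hypothetical two-way table; scipy returns the expected cell counts, which should be checked against the guidelines above before trusting the χ² distribution:

```python
from scipy.stats import chi2_contingency
import numpy as np

# Hypothetical 2x3 two-way table of counts (rows: response, columns: groups)
table = np.array([[30, 45, 25],
                  [70, 55, 75]])

chi2, p_value, df, expected = chi2_contingency(table, correction=False)

print(f"chi-square = {chi2:.3f}, df = {df}, P-value = {p_value:.4f}")
print("expected cell counts:\n", expected.round(1))
# df = (rows - 1)(cols - 1) = 2 here; a small P-value argues for an
# association, whose nature must still be read off the table itself.
```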

Chapter 12: One-Way ANOVA (see Powerpoint presentation)

  1. Setting for its usage
    1. Two variables: one categorical, one quantitative
      1. Categorical variable usually is population (group) to which unit belongs
      2. Extension of idea, called two-way ANOVA, can deal with two categorical, one quantitative variable
    2. Extension of 2-sample t test
      1. Comparison of means (of quantitative variable) between groups
      2. Gives same results as 2-sample t test when just two groups
  2. The model assumptions
    1. There are I populations
    2. A sample is drawn from each population
      1. Sample size from 1st population is N1, from 2nd population is N2, etc.
      2. xij represents the jth observation from the ith group
      3. x̄i represents the sample mean (statistic) within the ith group
      4. x̄ represents the sample mean (statistic) for all observations in all groups
    3. Each population is normally distributed about a mean μi with standard deviation σi (parameters)
      1. Assumption should be checked when possible by looking at histograms/normal quantile plots within each group
    4. Each σi is the same (i.e., σi = σ for each i)
      1. If not, the problem can often (but not always) be overcome with a transform of the data
      2. Not usually worth formal test to see if S.D.s are the same — consider OK if rule in box on middle of p. 752 is satisfied
      3. Estimate σ using pooled (sample) standard deviation sp (sp² = MSE in Minitab output; see formula on bottom of p. 752)
  3. One-way ANOVA test
    1. A test of significance
      1. H0: "no difference in mean between groups"
      2. Ha: mean in at least one group differs from other groups
    2. Test statistic
      1. F statistic
        1. F = MSG/MSE
        2. Use of dfs in computing MSG/MSE from SSG/SSE
          • Degrees of freedom in numerator: DFG = I - 1
          • Degrees of freedom in denominator: DFE = N - I (N is total number of units across all groups)
        3. Gives ratio of variation among group means to variation within groups
      2. New distribution, called an F distribution
        1. Table E in back of text
        2. Requires knowledge of F statistic, df for numerator (DFG), df for denominator (DFE)
    3. Coefficient of determination R² = SSG/SST (as in regression, it gives the fraction of the total variation in the observations that is explained by the differences among group means)
    4. If test demonstrates significance, further analysis must be done to determine how means vary between groups; some alternatives:
      1. Graphical displays (side-by-side boxplots, histograms, etc.)
      2. Contrasts
        1. Preferable when investigator has predisposed opinion about how means will compare in various groups
        2. We will not study these
      3. Multiple comparisons
        1. Inspect difference of means between any two groups (idea is like, though not the same as, using a 2-sample t test on each possible pairing of groups)
        2. Tests of significance are possible on differences of these means — we will not do these
        3. Confidence intervals on pairs of differences of means: Be able to
          • understand/interpret Minitab output for Tukey's/Fisher's Pairwise Comparisons
          • understand the difference between individual and overall (or family) error rates
Terms to Know: one-way ANOVA, group, variation among/within groups, ANOVA table, degrees of freedom (DFG, DFE, DFT), sum of squares (SSG, SSE, SST), mean squares (MSG, MSE), F statistic, multiple comparisons, coefficient of determination (R²), pooled standard deviation (sp)
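A sketch of the one-way ANOVA F test on three hypothetical groups; scipy's f_oneway does the MSG/MSE arithmetic:

```python
from scipy.stats import f_oneway
import numpy as np

# Hypothetical samples from I = 3 groups
g1 = np.array([18.2, 20.1, 17.6, 19.4, 18.8])
g2 = np.array([22.4, 21.7, 23.1, 20.9, 22.8])
g3 = np.array([19.5, 18.9, 20.3, 19.8, 20.6])

F, p_value = f_oneway(g1, g2, g3)
print(f"F = {F:.3f}, P-value = {p_value:.5f}")

# Degrees of freedom, as in the outline: DFG = I - 1, DFE = N - I
I, N = 3, len(g1) + len(g2) + len(g3)
print(f"DFG = {I - 1}, DFE = {N - I}")
# F = MSG/MSE compares variation among group means to variation within
# groups; a significant F says some group mean differs, not which one.
```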

Chapter 13: Two-Way ANOVA

Terms to Know: