Econ 2500 – Introductory Statistics

York University, Department of Economics, Professor Xianghong Li

Practice Problems

Chapter 1

1. The states differ greatly in the kinds of severe weather that afflict them. Table ta01_005 (on the course website) shows the average property damage caused by tornadoes per year over the period from 1950 to 1999 in each of the 50 states and Puerto Rico.
a. What are the top five states for tornado damage? The bottom five?
b. Make a histogram of the data by hand, with classes “0 ≤ damage < 10,” “10 ≤ damage < 20,” and so on. Describe the shape, center, and spread of the distribution. Which states may be outliers?

2. The data file ex01_035.txt (on the course website) gives the nightly study times claimed by first-year college men and women. The most common methods for formal comparison of two groups use x̄ and s to summarize the data. We wonder if this is appropriate here.
a. In general, what kinds of distributions are best summarized by x̄ and s?
b. Use R to draw separate histograms for men and women.
c. Each set of study times appears to contain a high outlier. Are these points flagged as suspicious by the 1.5 × IQR rule? How much does removing the outlier change x̄ and s for each group?
The presence of outliers makes us reluctant to use the mean and standard deviation for these data unless we remove the outliers on the grounds that these students were exaggerating.
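
For part (c) of Problem 2, the 1.5 × IQR rule can be sketched in R roughly as follows. This is a minimal illustration, not a worked solution: the helper name flag_outliers_iqr and the vector times are hypothetical, the numbers are made up, and nothing is assumed about the layout of ex01_035.txt.

```r
# Hypothetical helper: TRUE for values outside the 1.5 x IQR fences.
flag_outliers_iqr <- function(x) {
  # Q1 and Q3 using R's default quantile method
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - 1.5 * iqr   # lower fence
  upper <- q[2] + 1.5 * iqr   # upper fence
  x < lower | x > upper
}

# Made-up study times in minutes, for illustration only (not the course data).
times <- c(60, 90, 120, 120, 150, 180, 360)

flagged <- flag_outliers_iqr(times)
times[flagged]                      # values flagged as suspicious

# Mean and standard deviation with and without the flagged values.
c(mean = mean(times), sd = sd(times))
c(mean = mean(times[!flagged]), sd = sd(times[!flagged]))
```

Note that R's default quantile() method can give slightly different quartiles from the hand method of taking the median of each half of the data, so the fences may differ a little from a hand calculation.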

3. Create a set of 5 positive numbers (repeats allowed) that have median 10 and mean 7. What thought process did you use to create your numbers?

4. Use the definition of the mean x̄ to show that the sum of the deviations xᵢ − x̄ of the observations from their mean is always zero. This is one reason why the variance and standard deviation use squared deviations.

5. If you ask a computer to generate “random numbers” between 0 and 1, you will get observations from a uniform distribution. The following figure graphs the density curve for a uniform distribution. Use areas under this density curve to answer the following questions.

