Statistics Workbook For Dummies®, 2nd Edition with Online Practice
Published by: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, www.wiley.com
Copyright © 2019 by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Trademarks: Wiley, For Dummies, the Dummies Man logo, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc., and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc., is not associated with any product or vendor mentioned in this book.
LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.
For general information on our other products and services, please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993, or fax 317-572-4002. For technical support, please visit https://hub.wiley.com/community/support/dummies.
Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.
Library of Congress Control Number: 2019931550
ISBN 978-1-119-54751-8 (pbk); ISBN 978-1-119-54767-9 (ebk); ISBN 978-1-119-54768-6
Perhaps you’re taking a statistics class, or you’re about to take one. You may understand some of the basic ideas, but you have questions and want a place to go for a little extra help to give you an edge. And you also want a heads-up as to what instructors really think about when they write their exams. Well, look no further; help has arrived in the form of Statistics Workbook For Dummies, 2nd Edition.
This workbook helps you become more comfortable with and confident about statistics. Through plenty of practical problems that take you from step one all the way to your final exam, you review the concepts you know, identify areas where you need to focus more work, and address the little things that can make the difference between a B and an A.
As a statistics professor who has taught tens of thousands of students over the years, I have noticed that certain problems keep cropping up and causing my pen to take points off exams over and over again. And believe me: I want nothing more than to put my red pen away. So I give you all my secrets about what professors really want you to know, the kinds of questions they ask, and the types of answers they love and hate to see (so you can avoid the latter). And I focus only on the topics that you absolutely need to know, with minimal background information.
The major objectives of this workbook are for you to understand, calculate, and interpret the most common statistical formulas and techniques; get a handle on basic probability; gain confidence with difficult statistical topics such as the central limit theorem and p-values; know which statistical technique to use in different situations (for example, when to employ what kind of confidence interval); and evaluate and pinpoint problems with studies, polls, and experiments.
Although I wrote this workbook to serve as a companion to Statistics For Dummies, 2nd Edition (also published by Wiley and written by yours truly), this workbook works quite well with any introductory statistics textbook.
You may be asking how this workbook is different from other workbooks on the shelf. Well, here are a few ways, listed in order of importance:
I also used a few conventions while writing this book that you should be aware of:
This book is for you if you have some exposure to statistics already and want more opportunities to enjoy success through additional practice of the skills and techniques. Or perhaps you’re taking a statistics class and could use some extra support (and insider information). Or maybe you just really want to understand p-values because they keep you awake at night (been there, done that).
Note: If you’re totally new to the subject of statistics, I suggest that you first read Statistics For Dummies, 2nd Edition (Wiley), because I cover the various concepts of statistics in much more detail in that book (but any introductory text will suffice). After you feel comfortable and confident with the material, you can try the problems in this workbook. Or, as an alternative, you can use this workbook to practice along with what you read in Statistics For Dummies, 2nd Edition.
Icons in this workbook draw your attention to certain features that occur on a regular basis. Think of them as road signs that you encounter on a trip. Here are the road signs you encounter on your journey through this workbook.
Be sure to check out the free Cheat Sheet for a handy guide that covers tips and tricks for answering statistics questions. To get this Cheat Sheet, simply go to www.dummies.com and enter “Statistics Workbook For Dummies” in the Search box.
You also have the opportunity to complete online quizzes for Chapters 1 through 18 that test your knowledge of the concepts in each chapter. To gain access to the online practice, all you have to do is register by following these simple steps:
www.dummies.com/go/getaccess
If you do not receive this email within two hours, please check your spam folder before contacting us through our Technical Support website at http://support.wiley.com or by phone at 877-762-2974.
Now you’re ready to go! You can come back to the practice material as often as you want — simply log on with the username and password you created during your initial login. No need to enter the access code a second time.
Your registration is good for one year from the day you activate your PIN.
I wrote this workbook in a nonlinear way, so you can start anywhere and still understand what’s happening. However, I can make some recommendations to readers who are interested in knowing where to start:
Part 1
IN THIS PART …
Get down to the basics of number crunching.
Make and interpret charts and graphs.
Crank out and understand descriptive statistics.
Develop important skills for critiquing others’ statistics.
Chapter 1
IN THIS CHAPTER
Making tables to summarize categorical data
Highlighting the difference between frequencies and relative frequencies
Interpreting and evaluating tables
Categorical data is data in which individuals are placed into groups or categories, such as gender, region, or type of movie. Summarizing categorical data involves boiling down all the information into just a few numbers that tell its basic story. Because categorical data involves pieces of data that belong in categories, you have to look at how many individuals fall into each group and summarize the numbers appropriately. In this chapter, you practice making, interpreting, and evaluating frequency and relative frequency tables for categorical data.
One way to summarize categorical data is to simply count, or tally up, the number of individuals that fall into each category. The number of individuals in any given category is called the frequency (or count) for that category. If you list all the possible categories along with the frequency for each, you create a frequency table. The total of all the frequencies should equal the size of the sample (because you place each individual in one category).
See the following for an example of summarizing data by using a frequency table.
Q. Suppose that you take a sample of 10 people and ask them all whether they own a cellphone. Each person falls into one of two categories: yes or no. The data are shown in the following table.
Person # | Cellphone | Person # | Cellphone
1 | Y | 6 | Y
2 | N | 7 | Y
3 | Y | 8 | Y
4 | N | 9 | N
5 | Y | 10 | Y
A. Data summaries boil down the data quickly and clearly.
Own a Cellphone? | Frequency
Y | 7
N | 3
Total | 10
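If you'd rather let a computer do the tallying, the following is a minimal sketch in Python. It assumes the ten cellphone responses are typed in as a list; the variable names are illustrative and aren't part of the workbook.

```python
from collections import Counter

# Cellphone responses for persons 1 through 10, copied from the data table.
responses = ["Y", "N", "Y", "N", "Y", "Y", "Y", "Y", "N", "Y"]

# Counter tallies how many times each category appears (the frequency).
frequency = Counter(responses)

# The frequencies should add up to the sample size.
total = sum(frequency.values())

for category in ("Y", "N"):
    print(f"{category}: {frequency[category]}")
print(f"Total: {total}")
```

Checking that the total matches the sample size is a quick way to catch an individual who was skipped or counted twice.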
1 You survey 20 shoppers to see what type of soft drink they like best, Brand A or Brand B. The results are: A, A, B, B, B, B, B, B, A, A, A, B, A, A, A, A, B, B, A, A. Which brand do the shoppers prefer? Make a frequency table and explain your answer.
2 A local city government asks voters to vote on a tax levy for the local school district. A total of 18,726 citizens vote on the issue. The yes count comes in at 10,479, and the rest of the voters said no.
3 A zoo asks 1,000 people whether they’ve been to the zoo in the last year. The surveyors count that 592 say yes, 198 say no, and 210 don’t respond.
4 Suppose that instead of showing the number in each group, you show just the percentage (called a relative frequency). What’s one advantage a relative frequency table has over a frequency table?
Another way to summarize categorical data is to show the percentage of individuals who fall into each category, thereby creating a relative frequency. The relative frequency of a given category is the frequency (number of individuals in that category) divided by the total sample size, multiplied by 100 to get the percentage. For example, if you survey 50 people and 10 are in favor of a certain issue, the relative frequency of the “in-favor” category is $\frac{10}{50} = 0.20$ times 100, which gives you 20 percent. If you list all the possible categories along with their relative frequencies, you create a relative frequency table. The total of all the relative frequencies should equal 100 percent (subject to possible round-off error).
See the following for an example of summarizing data by using a relative frequency table.
Q. Using the cellphone data from the following table, make a relative frequency table and interpret the results.
Person # | Cellphone | Person # | Cellphone
1 | Y | 6 | Y
2 | N | 7 | Y
3 | Y | 8 | Y
4 | N | 9 | N
5 | Y | 10 | Y
A. The following table shows a relative frequency table for the cellphone data. Seventy percent of the people sampled reported owning cellphones, and 30 percent admitted to being technologically behind the times.
Own a Cellphone? | Relative Frequency
Y | 70%
N | 30%
You get the 70 percent by taking $\frac{7}{10} \times 100\% = 70\%$, and you calculate the 30 percent by taking $\frac{3}{10} \times 100\% = 30\%$.
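The same divide-by-the-total step can be sketched in Python. This is only an illustration: the helper name `relative_frequencies` is hypothetical, and the counts are the 7 yes and 3 no responses from the cellphone example.

```python
def relative_frequencies(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    return {category: 100 * count / total for category, count in counts.items()}


# Frequencies from the cellphone example: 7 yes, 3 no.
cellphone_counts = {"Y": 7, "N": 3}
cellphone_percents = relative_frequencies(cellphone_counts)

for category, percent in cellphone_percents.items():
    print(f"{category}: {percent:.0f}%")

# The relative frequencies should total 100 percent (up to round-off error).
print(f"Total: {sum(cellphone_percents.values()):.0f}%")
```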
5 You survey 20 shoppers to see what type of soft drink they like best, Brand A or Brand B. The results are: A, A, B, B, B, B, B, B, A, A, A, B, A, A, A, A, B, B, A, A. Which brand do the shoppers prefer?
6 A local city government asked voters in the last election to vote on a tax levy for the local school district. A record 18,726 voted on the issue. The yes count came in at 10,479, and the rest of the voters checked the no box. Show the results in a relative frequency table.
7 A zoo surveys 1,000 people to find out whether they’ve been to the zoo in the last year. The surveyors count that 592 say yes, 198 say no, and 210 don’t respond. Make a relative frequency table and use it to find the response rate (percentage of people who respond to the survey).
8 Name one disadvantage that comes with creating a relative frequency table compared to using a frequency table.
Not all summaries of categorical data are fair and accurate. Knowing what to look for can help you keep your eyes open for misleading and incomplete information.
Instructors often ask you to “interpret the results.” In this case, your instructor wants you to use the statistics available to talk about how they relate to the given situation. In other words, what do the results mean to the person who collects the data?
See the following for an example of critiquing a data summary.
Q. You watch a commercial where the manufacturer of a new cold medicine (“Nocold”) compares it to the leading brand. The results are shown in the following table.
How Nocold Compares | Percentage
Much better | 47%
At least as good | 18%
A. Much like the cold medicines I always take, the table about “Nocold” does “Nogood.”
9 Suppose that you ask 1,000 people to identify from a list of five vacation spots which ones they’ve already visited. The frequencies you receive are Disney World: 216; New Orleans: 312; Las Vegas: 418; New York City: 359; and Washington, D.C.: 188.
10 If you have only a frequency table, can you find the corresponding relative frequency table? Conversely, if you have only a relative frequency table, can you find the corresponding frequency table? Explain.
1 Eleven shoppers prefer Brand A, and nine shoppers prefer Brand B. The frequency table is shown in the following table. Brand A got more votes, but the results are pretty close.
Brand Preferred | Frequency
A | 11
B | 9
Total | 20
2 Frequencies are fine for summarizing data as long as you keep the total number in perspective.
Vote | Frequency
Y | 10,479
N | 8,247
Total | 18,726
3 This problem shows the importance of reporting not only the results of participants who respond but also what percentage of the total actually respond.
Gone to the Zoo in the Last Year? | Frequency
Y | 592
N | 198
Nonrespondents | 210
Total | 1,000
4 Showing the percents rather than counts means making a relative frequency table rather than a frequency table. One advantage of a relative frequency table is that everything sums to 100 percent, making it easier to interpret the results, especially if you have a large number of categories.
5 Relative frequencies do just what they say: They help you relate the results to each other (by finding percentages).
Brand Preferred | Relative Frequency
A | 55%
B | 45%
6 The results are shown in the following table. The yes percentage is $\frac{10{,}479}{18{,}726} \times 100\% \approx 55.96\%$. Because the total is 100%, the no percentage is $100\% - 55.96\% = 44.04\%$.
Vote | Relative Frequency
Y | 55.96%
N | 44.04%
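With numbers this large, a quick calculation helps you double-check the percentages. The sketch below is one way to do it in Python, using the vote totals from the problem; the variable names are made up for illustration.

```python
# Vote totals given in the problem.
total_votes = 18_726
yes_votes = 10_479

# The rest of the voters said no.
no_votes = total_votes - yes_votes

# Relative frequency = frequency / total, times 100, rounded to two decimals.
yes_percent = round(100 * yes_votes / total_votes, 2)
no_percent = round(100 * no_votes / total_votes, 2)

print(f"Y: {yes_percent}%")
print(f"N: {no_percent}%")
```

Computing the no percentage directly from its own count, rather than subtracting from 100, gives you an independent check that the two percentages add up.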
7 You can see the relative frequency table that follows this answer. Knowing the response rate is critical for interpreting the results of a survey. The higher the response rate, the better. The response rate is $59.2\% + 19.8\% = 79\%$, the total percentage of people who responded in any way (yes or no) to the survey. (Note that 21% is the nonresponse rate.)
Gone to the Zoo in the Last Year? | Relative Frequency
Y | 59.2%
N | 19.8%
Nonrespondents | 21.0%
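As a rough sketch, here is how the zoo survey's relative frequencies and response rate could be computed in Python. The counts come from the problem; the dictionary layout and variable names are assumptions made for the example.

```python
# Counts from the zoo survey, including the people who didn't respond.
zoo_counts = {"Y": 592, "N": 198, "Nonrespondents": 210}
sample_size = sum(zoo_counts.values())

# Relative frequency of each category, as a percentage of everyone surveyed.
zoo_percents = {
    category: 100 * count / sample_size for category, count in zoo_counts.items()
}

# Response rate: everyone who answered either way, out of everyone surveyed.
response_rate = 100 * (zoo_counts["Y"] + zoo_counts["N"]) / sample_size

for category, percent in zoo_percents.items():
    print(f"{category}: {percent:.1f}%")
print(f"Response rate: {response_rate:.1f}%")
```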
8 One disadvantage of a relative frequency table is that if you see only the percents, you don’t know how many people participated in the study; therefore, you don’t know how precise the results are. You can get around this problem by putting the total sample size somewhere at the top or bottom of your relative frequency table.
When making a relative frequency table, include the total sample size somewhere on the table.
9 Be careful about how you interpret tables where an individual can be in more than one category at the same time.
Location | % Who Have Been There | % Who Haven’t Been There
Disney World | 21.6% | 78.4%
New Orleans | 31.2% | 68.8%
Las Vegas | 41.8% | 58.2%
New York City | 35.9% | 64.1%
Washington, D.C. | 18.8% | 81.2%
Not all tables involving percents should sum to 100 percent. Don’t force tables to sum to 100 percent when they shouldn’t; do make sure you understand whether each individual can fall under more than one category. In those cases, a typical relative frequency table isn’t appropriate.
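To see why the vacation-spot percentages from Problem 9 can't be forced into an ordinary relative frequency table, it may help to run the numbers. The Python sketch below assumes each location is compared against all 1,000 people surveyed; the names are illustrative only.

```python
# Each of the 1,000 people could name more than one location.
people_surveyed = 1000
visited_counts = {
    "Disney World": 216,
    "New Orleans": 312,
    "Las Vegas": 418,
    "New York City": 359,
    "Washington, D.C.": 188,
}

# Each location gets its own yes/no split out of everyone surveyed.
visited_percents = {}
for location, count in visited_counts.items():
    been = 100 * count / people_surveyed
    visited_percents[location] = been
    print(f"{location}: {been:.1f}% have been, {100 - been:.1f}% haven't")

# The counts add up to more than the number of people surveyed,
# so the "have been" percentages can't total 100 percent.
total_mentions = sum(visited_counts.values())
print(f"Total mentions: {total_mentions} from {people_surveyed} people")
```

Within each row the two percentages still total 100 percent, which is the sense in which each location works as its own two-category table.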
10 You can always sum all the frequencies to get a total and then find each relative frequency by taking the frequency divided by the total. However, if you have only the percents, you can’t go back and find the original counts unless you know the total number of individuals. Suppose that you know that 80 percent of the people in a survey like ice cream. How many people in the survey like ice cream? If the total number of respondents is 100, $0.80 \times 100 = 80$ people like ice cream. If the total is 50, you’re looking at $0.80 \times 50 = 40$ positive answers. If the total is 5, you deal only with $0.80 \times 5 = 4$. This illustrates why relative frequency tables need to have the total sample size somewhere.
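The ice cream arithmetic can be sketched in a few lines of Python. The function name `count_from_percent` is hypothetical; the point is simply that the same 80 percent turns into very different counts depending on the total you supply.

```python
def count_from_percent(percent, total):
    """Recover a count from a percentage, which requires knowing the total."""
    return round(percent / 100 * total)


# The same 80 percent corresponds to different counts for different totals.
for total in (100, 50, 5):
    liked = count_from_percent(80, total)
    print(f"Total of {total}: {liked} people like ice cream")
```

Without the `total` argument there is nothing to compute, which is the code version of the warning about relative frequency tables that leave out the sample size.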
Watch for total sample sizes when given a relative frequency table. Don’t be misled by percentages alone, thinking they’re always based on large sample sizes, because many are not.