6.4.2 Chapter Practice 2: FDA Recall Data

In this chapter practice, we will look at how we can make sense of data we have never seen before. We are set to work with an Excel file called FDA_Recalls_Trends_2005-14.xlsx. (We will download this file a bit later on this page, after we have gone through some background.)

Why start with the context? Because having a general understanding of what we will find in the data will allow us to process it faster. Therefore, before we open the file, we must start with what the file name tells us.

What is FDA? What are recalls in general, and what does the word trends mean?

“The Food and Drug Administration (FDA) is responsible for protecting the public health by ensuring the safety, efficacy, and security of human and veterinary drugs, biological products, and medical devices; and by ensuring the safety of our nation’s food supply, cosmetics, and products that emit radiation (Source).”

We may not know what the FDA is off the top of our heads, but we live in an age where we encounter recalls on a regular basis. Some of the most notable recalls of 2019 involved romaine lettuce, but there are more store-, brand-, or product-specific recalls that have impacted HEB, Kroger, Walmart and their produce. One of the most notable Texas recalls impacted a beloved ice cream brand again and again. Recalls may impact food products, but there are many recent recalls that have impacted technology or devices we use every day. The much-publicized recall of the Galaxy Note 7 cost Samsung over $5 billion. A recent Volkswagen recall impacted hundreds of thousands of consumers, while the Takata Airbag recalls are impacting tens of millions of drivers. Recalls happen everywhere: from India to Indiana. Recalls impact consumer health and may cause fatalities.

What is a Food Recall?

A food recall is when a food producer takes a product off the market because there is reason to believe that it may cause consumers to become ill. In some situations, government agencies may request a food recall. Food recalls may happen for many reasons, including but not limited to:

  • Discovery of organisms, including bacteria such as Salmonella or parasites such as Cyclospora.
  • Discovery of foreign objects such as broken glass or metal.
  • Discovery of a major allergen that does not appear on the product label.

Source: https://www.foodsafety.gov/recalls-and-outbreaks

Why Do Food Recalls Matter?

“Food recalls are most importantly a public health issue, but they are also significant economic issues. The average cost of a recall to a food company is $10M in direct costs, in addition to brand damage and lost sales according to a joint industry study by the Food Marketing Institute and the Grocery Manufacturers Association. However, the costs for larger brands may be significantly higher based on the preliminary recall costs reported by firms of some recent recalls.” A single product’s recall can cause an entire industry to lose hundreds of millions of dollars.

Source: Recall: The Food Industry’s Biggest Threat to Profitability

Let’s look at the Info sheet in the workbook!

Now that we have considered why we should invest in looking at this data, download the FDA_Recalls_Trends_2005-14 Excel file to your class folder on your computer and open it. Our data file is going to show us data from 2005 to 2014. We can generally use multiyear data to see if we can spot trends or patterns.

  • How big is the data file? > Is its size measured in kilobytes, megabytes or gigabytes? Why does this matter?
  • Is there a data dictionary? > Read it! What can we learn about the data based on it? What can we do with field names that do not make sense when we read them?
  • What is the source of your data? > What can we do if there is no source noted?
  • Do we know who collected the data? > Do we know why? If not, then what?
  • What is the purpose of this data? > Does the source say what the purpose of this file is? If it does not, then what should we do?
  • Do you like the structure and the colors of the data dictionary? > If you do not, what can you do?
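
The file size question above can be made concrete with a small sketch. It assumes the 1,024-based units that most operating systems report; the function name `describe_size` and the byte counts are hypothetical illustrations, not values taken from the FDA workbook.

```python
def describe_size(num_bytes):
    """Return a readable size using 1,024-based units (an assumed convention)."""
    units = ["bytes", "KB", "MB", "GB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the number is below 1,024,
        # or at the largest unit we know about.
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024


# Made-up byte counts, just to show the three scales.
small_file = describe_size(2048)
medium_file = describe_size(5 * 1024 * 1024)
large_file = describe_size(3 * 1024 ** 3)
```

For a real file, `os.path.getsize(path)` from the Python standard library returns the byte count to pass in. A file measured in kilobytes opens quickly almost anywhere, while one measured in gigabytes may be slow to open or may not fit on a single worksheet.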


Let’s look at the Data sheet!

  • How much data is in the file? > Click the first cell of the data, then highlight the full range using CTRL+SHIFT+RIGHT ARROW followed by CTRL+SHIFT+DOWN ARROW. How many cells are highlighted?
  • How many field names (columns)? How many records (rows)?
  • Do the field names make sense?
  • What types of questions can you answer with the data?
  • What functions do you know in Excel that will help you process data?
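
The relationship between fields, records, and highlighted cells can be sketched outside of Excel as well. The example below is only an illustration: the four field names and all values are made up and stand in for the Data sheet, and the sample is read with Python's standard `csv` module.

```python
import csv
import io

# Hypothetical sample standing in for the Data sheet (all values are made up).
sample = """Product,Reason,Recall Class,Pounds Recalled
Sample Product A,Misbranding,I,1200
Sample Product B,Undeclared Allergen,I,300
Sample Product C,Misbranding,II,800
"""

rows = list(csv.reader(io.StringIO(sample)))
header, records = rows[0], rows[1:]

field_count = len(header)       # number of columns (field names)
record_count = len(records)     # number of rows, not counting the header
# Highlighting the full range selects the header row plus every record.
cell_count = field_count * (record_count + 1)
```

With 4 fields and 3 records, the highlighted range holds 16 cells: 4 columns times 4 rows, because the header row is part of the selection.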

How would you process this data?

Use multiple methods of finding the correct answer in your data!
  • How do you find values? > Any shortcuts?
  • How do you add up numbers? > Functions? Total Row? PivotTable?
  • How do you count text? > Functions? Total Row? PivotTable?
  • How do you average a range of values? > Functions? Total Row? PivotTable?
  • How do you find the highest and lowest values? > Sort in what order?
  • How do you show only a subset of your data? > Excel Table? PivotTable?
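
The operations listed above (adding, counting, averaging, finding extremes, and filtering) follow the same logic in any tool. The sketch below shows them in Python on a handful of made-up records. "Recall Class" and "Pounds Recalled" echo the sample questions, while "Product" and "Reason" are assumed field names, and none of the values come from the FDA workbook. The comments name the closest Excel function for comparison.

```python
from statistics import mean

# Hypothetical records: made-up values, NOT taken from the FDA workbook.
records = [
    {"Product": "Sample Product A", "Reason": "Misbranding",
     "Recall Class": "I", "Pounds Recalled": 1200},
    {"Product": "Sample Product B", "Reason": "Undeclared Allergen",
     "Recall Class": "I", "Pounds Recalled": 300},
    {"Product": "Sample Product C", "Reason": "Misbranding",
     "Recall Class": "II", "Pounds Recalled": 800},
    {"Product": "Sample Product D", "Reason": "Processing Defect",
     "Recall Class": "III", "Pounds Recalled": 50},
]

# Add up numbers (similar to =SUM on the Pounds Recalled column).
total_pounds = sum(r["Pounds Recalled"] for r in records)

# Count text matches (similar to =COUNTIF on the Recall Class column).
class_one_count = sum(1 for r in records if r["Recall Class"] == "I")

# Count distinct products within one class (similar to a PivotTable row list).
class_one_products = {r["Product"] for r in records if r["Recall Class"] == "I"}

# Average with a condition (similar to =AVERAGEIF on the Reason column).
misbranding_avg = mean(
    r["Pounds Recalled"] for r in records if r["Reason"] == "Misbranding"
)

# Highest and lowest values (similar to sorting descending or ascending).
largest = max(records, key=lambda r: r["Pounds Recalled"])
smallest = min(records, key=lambda r: r["Pounds Recalled"])

# Show only a subset (similar to a filter on an Excel Table).
over_500 = [r["Product"] for r in records if r["Pounds Recalled"] > 500]
```

Computing the same answer two different ways, for example with a function and again with a PivotTable, is a quick check that a range or a condition was not set up incorrectly.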

Sample Questions

1. How many pounds of products were recalled overall?
a) 122,650,120
b) 873,873,649
c) 213,128,535
d) 428,744,535
2. How many different types of products were recalled from Class I?
a) 875
b) 478
c) 640
d) 280

3. How many pounds of Spaghettios were recalled?
a) 2875
b) 4640
c) 1740
d) 6280
4. How many pounds of Hot Dog Products were recalled?
a) 122875
b) 124610
c) 121740
d) 146280
5. Which product is the one with the most pounds recalled?
a) Canned Meat Products
b) Fresh and Frozen Ground Turkey Products
c) Frozen Pot Pie Products
d) Raw and Frozen Beef Products
6. Which of the following products were recalled on Dec 03 2010?
a) Spaghettios
b) Chicken Tamales
c) Instant Noodle Products
d) Italian Sausage Products
7. Instant Noodle Products were recalled for this Recall Class number:
a) I
b) II
c) III
d) IV
8. Pizza Products were recalled for this reason:
a) Processing Defect
b) Salmonella
c) Undeclared Allergen
d) Undeclared Substance
9. How many products with over 250,000 pounds recalled were recalled because of Listeria monocytogenes?
a) 8
b) 42
c) 160
d) 258
10. What is the average of Pounds Recalled for the reason of Misbranding?
a) 357,172
b) 63,278
c) 129,763
d) 59,529


Answer Key

1.D  2.B  3.C  4.B  5.D  6.B  7.B  8.C  9.A  10.D

FDA Data practice by Emese Felvegi is licensed under CC BY 4.0.



Excel For Decision Making by Emese Felvegi; Noreen Brown; Barbara Lave; Julie Romey; Mary Schatz; Diane Shingledecker; and Robert McCarn is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.