U.S. Energy Information Administration – company level imports data
The data you will be working with is a larger data set from the U.S. Energy Information Administration (EIA) website. The EIA “collects, analyzes, and disseminates independent and impartial energy information to promote sound policymaking, efficient markets, and public understanding of energy and its interaction with the economy and the environment (Source: EIA.gov).” EIA content is often among the data sets that analysts with backgrounds in accounting, finance, supply chain, and management information systems work within energy-related industries.
Go to the EIA website and review information shared about Petroleum Administration for Defense Districts (PADDs). Read about the history of PADDs and how these abbreviations are used to describe movements through a variety of means of transportation (pipeline, tanker, barge, and rail) between districts today. Bookmark, save or print this page for your records in order to quickly access information and answer questions regarding PADD regions as you work with your data following the steps below.
- Create a folder on your computer for “EIA Data Project”.
- Download the following files shared below:
- Open the EIA_2016_Company_Level_Imports file. There are four sheets in this workbook. The documentation notes the source of the data, the IMPORTS sheet contains 20 columns and 24618 rows, totaling to 492,360 individual points of data. As you look at the header row, there are field names that make sense at once, but there are field names we need explanations for. Therefore, you should open the explanatory notes contained in your EIA_DataDictionary to find out what fields like R_S_NAME or PCOMP_RNAM mean. The two sheets in your workbook that follow the IMPORTS sheet are tables listing World Regions and US Regions with corresponding countries or states listed next to them. You will be using the contents of IMPORTS. (The World and US Regions sheets can be used to make observations about your data based on regional origin or processing to serve as a higher level category above states or PADDs.)
- Open your data file and rename it to EIA_2016_Company_Level_Imports_YourName.xlsx.
- Take some time to familiarize yourself with the PADD regions, the data file and the data dictionary.
- Insert a PivotTable based on the contents of the IMPORTS sheet into a new worksheet. Confirm the source range as IMPORTS!$A$1:$T$24618.
- Move the inserted sheet after IMPORTS, rename it to Processing Facilities.
- Look up the field names for Processing Facility Name and Processing Company State in your Data Dictionary. Use these field names to create a PivotTable showing how many (COUNT) processing facilities are in each state. You can use your output to answer questions such as: How many processing facilities are in the US? Which state has the least/most processing facilities? How many processing facilities are there in Texas?
- Add the field name for the country name to the Filters PivotTable field. Use the filter to show changes in the numbers of processing facilities by country. What is the breakdown of the top five states if you filter for Canada? What is the breakdown if you filter for Mexico? Consider the connection between the changes in outputs based on geographic location.
- Clear the country filter for now, but leave it in place. Add the field name for Product Name to the Filters PivotTable field. Add Quantity for Values. Filter for Crude Oil. Which state processes the most crude oil overall? Which state is second or third?
- Filter for Canada, Mexico, Venezuela. Observe how the outputs for crude oil imported from certain countries changes the order of states.
- Go back to the IMPORTS sheet. Insert a PivotTable based on its contents into a new worksheet. Confirm the source range as IMPORTS!$A$1:$T$24618.
- Move the inserted sheet after the Processing Facilities sheet and rename it to PADDs.
- Use the field names for Processing Company PADD and Processing Company Name to create a PivotTable. Use the output to determine which Processing Company PADD has the most processing companies.
- Add a filter for the Product Name field and determine which Processing Company PADD has the most processing companies for Crude Oil.
- Drag Product Name and Processing Company PADD under the Filter fields, filter for Crude Oil and for PCOMP_PADD 3. Move Processing Company Name under Rows, Processing Facility Name under Values (COUNT). What does this output show? What do the highest and lowest values in this output represent?
- Save your work for your records.
Conceptual idea by Oswald Vinueza. Practice problems by Emese Felvégi and Bauer Tutoring Services based on chapter content and practice. CC BY-NC-SA 3.0.