# Individual Project Using Excel for Hypothesis Testing

Instructions

1) Submit your Excel file along with your answers for full credit.
2) Do not email me your work.
3) Submit both your Excel file and the Word file with your answers to the folder named “Excel Project Submission” under the Assignment tab in Classes.
4) If the Excel file is not submitted, only half credit will be given.
5) No late work is accepted.

Scenario #1

Please go to the website http://www.ssa.gov/OACT/babynames/ and scroll down to the bottom, “items of interest.” In the “Popular Names by Birth Year” column, enter your birth year and choose the top 1000 names by popularity.

A. Male students should use the male name column and female students should use the female name column. Find the most popular name and calculate its proportion of the total. Construct a 95% confidence interval for the proportion of people in your cohort who share that name.
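The calculation in part A can be sketched outside Excel as a cross-check. The snippet below is a minimal illustration, assuming the usual normal-approximation (Wald) interval for a proportion; the function name `proportion_ci` and the counts are hypothetical and are not taken from the SSA site.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(count, total, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = count / total
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    # (comparable to Excel's NORM.S.INV(0.975)).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / total)
    return p_hat, p_hat - margin, p_hat + margin


# Hypothetical counts for illustration only.
p_hat, low, high = proportion_ci(count=25_000, total=1_500_000)
print(f"p_hat = {p_hat:.5f}, 95% CI = ({low:.5f}, {high:.5f})")
```

Replace the hypothetical counts with the number of births for the most popular name and the total you computed in your spreadsheet.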

B. Now, move to the “Popularity of a Name” column. For the name you found in part A, determine whether there is sufficient evidence to conclude that the name was more popular in the year 2000 (if that year does not work, try 2001, 2002, 2003, etc.) than it was in your cohort. Use the 5% level of significance. (This question requires all five steps of hypothesis testing that we discussed, including the null and alternative hypotheses.)
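One way to check the test statistic for part B is a right-tailed two-proportion z-test. This is only a sketch under that assumption (pooled standard error, hypotheses H0: p_2000 <= p_cohort and Ha: p_2000 > p_cohort); the function name and the counts below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Right-tailed test of H0: p1 <= p2 against Ha: p1 > p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion, used for the standard error under H0.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value


# Hypothetical counts: sample 1 is the year 2000, sample 2 is the cohort year.
z, p_value = two_proportion_z_test(x1=30_000, n1=2_000_000, x2=25_000, n2=1_500_000)
alpha = 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```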

Note: For this question, use Excel for all the numerical work and clearly state your answers.

Scenario #2

Please go to the website https://www.census.gov/data/datasets/2017/demo/popproj/2017-popproj.html and download Table 1, “Projected Population by Single Year of Age, Sex, Race, and Hispanic Origin for the United States: 2016 to 2060,” as an Excel file. Among all the data, use only the rows where sex, origin, and race are all “0.” Then use the values in the POP_1 and POP_2 columns.

A. Construct a 95% confidence interval for each of POP_1 and POP_2, using the mean and standard deviation calculated in Excel.
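For reference, the interval in part A follows the pattern "mean plus or minus critical value times standard error." The sketch below assumes a normal critical value and the sample standard deviation (as in Excel's STDEV.S); with a small number of rows a t critical value (for example from Excel's T.INV.2T) may be the better choice. The function name and the values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, confidence=0.95):
    """Confidence interval for a mean, using a normal critical value."""
    n = len(values)
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical values standing in for one POP column.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
low, high = mean_ci(sample)
print(f"95% CI = ({low:.3f}, {high:.3f})")
```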

B. Based on the data in the file, is there sufficient evidence that the population mean of POP_1 is lower than that of POP_2 (where sex, origin, and race are all “0”) at the 4% level of significance? (Show all five steps.)

C. How about POP_3 and POP_4? Is there sufficient evidence that the population mean of POP_3 is greater than that of POP_4 (where sex, origin, and race are all “0”) at the 5% level of significance? (Show all five steps.)
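The one-sided comparisons in parts B and C can be sketched as a two-sample test of means. The version below is an illustration only: it assumes the two columns are treated as independent samples with unequal variances and uses a normal approximation for the p-value, which may differ from the method covered in class. The function name, the `alternative` parameter, and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_z_test(a, b, alternative="less"):
    """One-sided test comparing mean(a) with mean(b).

    alternative="less":    Ha is mean(a) < mean(b)  (as in part B)
    alternative="greater": Ha is mean(a) > mean(b)  (as in part C)
    """
    # Standard error of the difference, unequal variances assumed.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    if alternative == "less":
        p_value = NormalDist().cdf(z)
    elif alternative == "greater":
        p_value = 1 - NormalDist().cdf(z)
    else:
        raise ValueError("alternative must be 'less' or 'greater'")
    return z, p_value


# Hypothetical data standing in for two POP columns.
col_a = [1, 2, 3, 4, 5]
col_b = [3, 4, 5, 6, 7]
z, p_value = two_sample_z_test(col_a, col_b, alternative="less")
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < 0.04 else "Fail to reject H0")
```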

Scenario #3

Please go to the website https://www.ncdc.noaa.gov/temp-and-precip/national-temperature-index/ and check “USCRN” for the datasets from 1895 to 2013. Use those datasets in Excel. You can click on CSV to download the datasets for both USCRN and CLIMDIV.

A. Construct a 90% confidence interval for the temperature in the USCRN dataset over the 1895 to 2013 time period.

B. Now, check “CLIMDIV” for the datasets over the same time period. Construct a 93% confidence interval for the temperature in CLIMDIV.
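Because 90% and 93% are less common confidence levels than 95%, it can help to compute the critical values directly rather than look them up. The short sketch below assumes two-sided normal critical values (comparable to Excel's NORM.S.INV); the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided normal critical value for a given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


for level in (0.90, 0.93, 0.95):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```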

C. Now, I am told that the average temperature reported by CLIMDIV is higher than the one reported by USCRN. Do the datasets provide sufficient evidence to support this claim at the 4% level of significance?

Scenario #4

Please go to the website https://www.ncdc.noaa.gov/cdo-web/datatools/records and, on the “View Selected Records” tab, set the conditions as “Daily Records” for timescale, “Highest Max Temperature” for parameter, “March 14, 2018” as the Starting Date and “April 12, 2018” as the End Date in the Date Range, “All” for record type, “Country” for location category, and “United States” for country, and then click on “Show Records.”

A. Find the name of the station with the median value of the “record.” (Look for the “record” column.)
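A small sketch of the lookup in part A follows. It assumes that, when the number of rows is even, the lower of the two middle values is used so that the median matches an actual row; the station names and values are hypothetical.

```python
from statistics import median_low

# Hypothetical (station, record) pairs for illustration only.
records = [
    ("STATION A", 78),
    ("STATION B", 85),
    ("STATION C", 81),
    ("STATION D", 90),
    ("STATION E", 85),
]

# median_low always returns a value that is present in the data.
median_value = median_low(value for _, value in records)

# More than one station can share the median value.
median_stations = sorted({name for name, value in records if value == median_value})
print(median_value, median_stations)
```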

B. Construct a 95% confidence interval for the “record” values.

C. How many stations fall outside the range of 2 standard deviations from the mean? If the same station appears more than once, count it once.
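The count in part C can be sketched as follows. This is a minimal illustration, assuming the range is the mean plus or minus 2 sample standard deviations (as with Excel's AVERAGE and STDEV.S) and that duplicates are removed by station name; the function name and the data are hypothetical.

```python
from statistics import mean, stdev


def stations_outside_two_sd(records):
    """Return the distinct station names whose record lies outside mean +/- 2 SD."""
    values = [value for _, value in records]
    center, spread = mean(values), stdev(values)
    low, high = center - 2 * spread, center + 2 * spread
    # A set keeps each station once even if it appears in several rows.
    return {name for name, value in records if value < low or value > high}


# Hypothetical (station, record) pairs; "HOT" appears twice on purpose.
records = [
    ("S1", 80), ("S2", 81), ("S3", 79), ("S4", 80), ("S5", 82),
    ("S6", 78), ("S7", 80), ("S8", 81), ("S9", 79), ("S10", 80),
    ("HOT", 110), ("HOT", 110),
]
outside = stations_outside_two_sd(records)
print(len(outside), sorted(outside))
```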
