Monday, March 16, 2015

T and Z tests and Chi-Squared Testing

Question 1
2a.
Z = (3.2 - 4) / (0.73 / sqrt(50))   Z = -7.75   CV = ±1.96
Z = (11.7 - 10) / (1.3 / sqrt(50))   Z = 9.24   CV = ±1.96
Z = (77 - 75) / (5.71 / sqrt(50))   Z = 2.47   CV = ±1.96
b.
The null hypothesis is that there is no difference between the average number of Asian beetles at the county and state levels. The alternative hypothesis is that there is a difference between the average number of Asian beetles at the county and state levels.
I reject the null hypothesis in favor of the alternative, so there is a statistically significant difference between the average number of Asian beetles at the county and state levels. I say this because my z score of -7.75 falls outside the critical values of ±1.96, which puts it in the rejection region of the distribution.
The null hypothesis is that there is no difference between the average number of emerald ash borer beetles at the county and state levels. The alternative hypothesis is that there is a difference between the average number of emerald ash borer beetles at the county and state levels.
I reject the null hypothesis in favor of the alternative, so there is a statistically significant difference between the average number of emerald ash borer beetles at the county and state levels. I say this because my z score of 9.24 falls outside the critical values of ±1.96, which puts it in the rejection region of the distribution.

The null hypothesis is that there is no difference between the average number of golden nematodes at the county and state levels. The alternative hypothesis is that there is a difference between the average number of golden nematodes at the county and state levels.
I reject the null hypothesis in favor of the alternative, so there is a statistically significant difference between the average number of golden nematodes at the county and state levels. I say this because my z score of 2.47 falls outside the critical values of ±1.96, which puts it in the rejection region of the distribution.
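
As a rough check on the arithmetic in 2a, here is a minimal Python sketch that recomputes each z score from the numbers written in the formulas. The function name and the dictionary layout are my own, and I am assuming the first number in each numerator is the sample mean and the second is the mean it is being compared against; the assignment data itself is not reproduced here.

```python
import math


def one_sample_z(sample_mean, comparison_mean, std_dev, n):
    """Return the z statistic: (difference in means) / (standard error)."""
    standard_error = std_dev / math.sqrt(n)
    return (sample_mean - comparison_mean) / standard_error


CRITICAL_VALUE = 1.96  # two-tailed test at the 0.05 significance level

# (sample mean, comparison mean, standard deviation, n) copied from the formulas.
# Which mean is the county and which is the state is an assumption.
tests = {
    "Asian beetle": (3.2, 4, 0.73, 50),
    "Emerald ash borer": (11.7, 10, 1.3, 50),
    "Golden nematode": (77, 75, 5.71, 50),
}

results = {}
for pest, (mean_a, mean_b, sd, n) in tests.items():
    z = one_sample_z(mean_a, mean_b, sd, n)
    reject_null = abs(z) > CRITICAL_VALUE  # outside +/-1.96 means reject
    results[pest] = (z, reject_null)
    print(f"{pest}: z = {z:.2f}, reject null: {reject_null}")
```

Comparing the absolute value of z to 1.96 handles both tails at once, so a large negative score is treated the same way as a large positive one.
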
3. 
The null hypothesis is that there is no difference in the number of people per party in the intervening years. The alternative hypothesis is that there is a difference in the number of people per party in the intervening years.
I reject the null hypothesis in favor of the alternative, so there is a statistically significant difference in the number of people per party in the intervening years. I say this because my t score of 4.92 exceeds the critical value of 1.711 at the .05 significance level.

Introduction

This assignment was all about visually and statistically comparing "Northern" and "Southern" Wisconsin. I placed them in quotation marks because there is no exact measure of what the northern and southern parts of Wisconsin are. We were presented with the following theoretical situation: the tourism board of Wisconsin has asked you to conduct a bit of research regarding the concept of "Up-North." We were provided with a large variety of data from which we were to choose 3 variables to explore. On those 3 variables we were asked to conduct a Chi-Squared test. The Chi-Squared test lets us statistically compare counties north of highway 29 with counties south of highway 29. We also compare the counties through maps based on the 3 variables we chose.

Methods

Part 1

For the first portion of this assignment we created a variety of maps. The first map we were asked to create is one that divides the counties in the state into two groups: counties north of highway 29 and counties south of highway 29. In order to do this I brought in a street map and zoomed in to the state of Wisconsin to locate highway 29. I then brought in a shapefile of the counties in Wisconsin and laid it over the street map. I turned the transparency up to 70% so I could see the street map through the counties. Then, looking at each county's position in relation to highway 29, I assigned a 1 to counties to the north and a 2 to counties to the south. To do this I added a new column to the Excel spreadsheet of all the county data. (Figure 1)

Figure 1
The next step was to bring this Excel table into ArcMap. To do this I just hit the add data button and selected my file. Once my Excel sheet was in ArcMap, the next step was to join it with the county shapefile. By doing this I am able to map the data in my Excel sheet. After I joined the Excel sheet and the shapefile, the next step was to choose my 3 variables from the data provided for us. I chose gun deer licenses sold, bow deer licenses sold, and miles of ATV trails per county. Having chosen these variables, I created 3 new fields in the attribute table of the county shapefile. (Figure 2)

Figure 2
Once the fields were created, I began assigning a rank of 1-4 to each county in each field based on its quantity. To do this I went to the provided data column, such as tr_ATV, which has the ATV trail miles by county, and hit the statistics button. This gives information about the field such as the sum and the max and min values. I am interested in the max value. I took this number, divided it by 4, and then subtracted that result from the max value 3 times to get the break points for my 4 ranks. (Figure 3) I then took my 4 numbers and, in the select by attributes option in the county attribute table, entered each one with a < symbol in front of it. Then in the fields I created I entered a 1-4 based on the number range selected in select by attributes. When I was done my table looked like this. (Figure 4)
Figure 4


Figure 3
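
The ranking steps described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are made up, the county names and trail miles are hypothetical rather than taken from the assignment data, and I am assuming that rank 1 is the lowest class and rank 4 the highest. The top class uses "less than or equal to" so that the county holding the max value still gets a rank.

```python
def equal_interval_breaks(max_value, classes=4):
    """Upper limits of equal-width classes running from 0 up to max_value."""
    step = max_value / classes
    return [step * i for i in range(1, classes + 1)]


def rank_value(value, breaks):
    """Return the 1-based class whose upper limit first covers the value."""
    for rank, upper_limit in enumerate(breaks, start=1):
        if value <= upper_limit:
            return rank
    return len(breaks)  # anything above the last limit falls in the top class


# Hypothetical ATV trail miles per county (not the real tr_ATV values).
trail_miles = {"County A": 10, "County B": 70, "County C": 90, "County D": 160}

breaks = equal_interval_breaks(max(trail_miles.values()))
ranks = {county: rank_value(miles, breaks) for county, miles in trail_miles.items()}

for county, rank in ranks.items():
    print(f"{county}: {trail_miles[county]} miles -> rank {rank}")
```

Dividing the max by 4 gives equal-width classes, which is what makes a few very large counties push most of the others into the lowest rank.
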

Once these ranks were assigned I was ready to map my results. This is easy to do. In the symbology tab for the county shapefile, just select the field you want to map and assign a color scheme to it. In the legend the ranks of 1-4 are still there, but I changed the labels so that the actual numerical values are displayed instead of the ranks. The maps below are my results for the 3 chosen variables and the north-south map.

Results

This is a map showing the northern and southern counties of Wisconsin based on their location relative to highway 29. Anything north of the highway was classified as north and anything south as south, and the counties that 29 goes through were assigned by looking at whether more of the county was north or south of 29. This is my version; others who did this could have different counties in the north or south based on their interpretation. From the map we can see that 29 does a pretty good job of dividing the state in half from top to bottom.

Figure 5

This next map is a choropleth map displaying the number of gun deer licenses bought per county. We can see that the majority of counties have fairly low license purchases. The northern part of the state especially seems to have low values. Geographically speaking, I think this is caused by the amount of wilderness and forest in the area and its less densely populated towns. The fewer people there are, the fewer licenses are sold. I don't think this represents the number of hunters in these areas, however, because many people buy their licenses in a different county and then drive to these areas to hunt. Overall I would say there is a higher number of licenses bought in the southern part of the state, but I think that is because there are more people there to buy them.


Figure 6

Looking at bow deer license purchases we see a pattern similar to the gun deer purchases. Again the northern part of the state has fewer purchases in general, but there is a slight increase in the number of counties with higher purchase counts. The southern part of the state is pretty much the same.


Figure 7
This final map shows the number of miles of ATV trails per county. As I expected, there are more ATV trails in the northern part of the state. I think this has a lot to do with the fact that there are more snowmobile trails up north as well, because there is usually more snow. These trails get used for snowmobiling in winter and ATV riding in summer. There are fewer people as well, which is necessary for ATV trails because you can't put them through cities or on paved roads. The point of an ATV is to go off-road on all terrains, and you need space to be able to do that. There is also more rugged terrain in this part of the state, which people enjoy riding more than the flat corn fields in the south.

Figure 8
Part 2
After I mapped out the data and made it visually appealing and easy to understand, the next part of the assignment was to do statistical analysis. The analysis we were supposed to use is called Chi-Squared. The point of this test is to check whether two areas differ on a categorical variable, here the 1-4 rank of each county. In this case the two areas are the previously determined northern and southern Wisconsin counties. In order to perform this test we used a program called SPSS. Once the program is open we bring in the table that we exported from ArcMap containing all the county data. Then we open the crosstabs window. We choose the Chi-squared statistic and then bring in the north-south county field for the rows and one of the 3 ranked variables for the columns. Hit OK and the output looks like figure 9. I ran this test once for each variable, so I got 3 different charts. (Figures 9-11)

ATV Chi-Squared Figure 9
Looking at the ATV map we would state the null hypothesis that there is no difference between the number of ATV trail miles in northern Wisconsin compared to southern. The alternative hypothesis is that there is a difference between the number of ATV trail miles. In this case I would reject the null hypothesis because we can see from the map and the Chi-Squared value that there is a difference in miles of ATV trails between northern and southern Wisconsin.
Bow Deer Chi-Squared Figure 10

Looking at the bow deer map we would state the null hypothesis that there is no difference between the number of bow deer licenses sold in northern Wisconsin compared to southern. The alternative hypothesis is that there is a difference between the number of licenses sold. In this case I would fail to reject the null hypothesis because, looking at the map and the Chi-Squared value, we see no statistically significant difference between the number of bow deer licenses sold in northern and southern Wisconsin.
Gun Deer Chi-Squared Figure 11
Looking at the gun deer map we would state the null hypothesis that there is no difference between the number of gun deer licenses sold in northern Wisconsin compared to southern. The alternative hypothesis is that there is a difference between the number of licenses sold. In this case I would fail to reject the null hypothesis because, looking at the map and the Chi-Squared value, we see no statistically significant difference between the number of gun deer licenses sold in northern and southern Wisconsin.
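
To show what the crosstabs Chi-Squared test is doing behind the scenes, here is a minimal Python sketch of the calculation. It is not the SPSS output and not my actual data: the counts in the table are hypothetical, the function name is my own, and instead of the p-value that SPSS reports it compares the statistic to 7.815, the critical value for 3 degrees of freedom at the .05 significance level.

```python
def chi_squared(observed):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if region and rank were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Hypothetical counts of counties by region (rows) and rank 1-4 (columns).
observed = [
    [4, 8, 10, 6],   # north of highway 29
    [30, 8, 4, 2],   # south of highway 29
]

CRITICAL_VALUE_DF3 = 7.815  # chi-squared critical value, df = 3, alpha = .05

statistic, df = chi_squared(observed)
reject_null = statistic > CRITICAL_VALUE_DF3
print(f"chi-squared = {statistic:.2f}, df = {df}, reject null: {reject_null}")
```

A 2 by 4 table like this one always has (2 - 1) x (4 - 1) = 3 degrees of freedom, which is why the same critical value would apply to all three of my variables.
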




Conclusion

From my results I don't think that we can clearly say what is northern and what is southern Wisconsin. For one of my variables there was a difference between the two parts of the state, but for the other 2 variables the statistics showed no significant difference. More variables would have to be tested to get a better idea of northern and southern Wisconsin and what defines them.