Thursday, February 26, 2015

Z Scores, Mean Center, and Standard Distance

Introduction

For this activity we were given tornado width data for the states of Oklahoma and Kansas. One data set covers 1995 to 2006 and the other covers 2007 to 2012. Using the spatial distribution and width of the tornados, our task was to determine whether tornado shelters should be installed and, if so, where. People in both states are questioning whether these shelters are really necessary or a waste of money. We answer this using the mean center, weighted mean center, standard distance, and weighted standard distance.

Methodology

We used several methods to solve this problem. Listed below are those methods and what they mean for this analysis.

Mean Center
The mean center is the average location of a set of points. To find it, you calculate the average of the x values and the average of the y values of your data points, then write the two averages as a coordinate pair such as (5, 4). This point is the average location of all of your points.
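As a minimal sketch of that calculation, the Python below averages the x and y values separately. The function name `mean_center` and the four coordinates are hypothetical, picked only so that the result matches the (5, 4) example; they are not taken from the tornado data.

```python
def mean_center(points):
    """Return (mean x, mean y) for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return mean_x, mean_y


# Hypothetical coordinates, not real tornado locations.
points = [(2, 1), (4, 6), (6, 3), (8, 6)]
print(mean_center(points))  # prints (5.0, 4.0)
```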

Weighted Mean Center
The weighted mean center is found the same way as the mean center, except that each point is given a weight. The weight gives some points more importance than others, which pulls the mean center toward the heavily weighted points. The points no longer count equally, as they do when you calculate the mean center.
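The effect of the weights can be sketched in a few lines of Python. The function name `weighted_mean_center`, the coordinates, and the widths are all hypothetical values chosen for illustration. Unweighted, these four points average to (5, 4); giving the point at (2, 1) a much larger weight pulls the center toward it.

```python
def weighted_mean_center(points, weights):
    """Return the weighted mean center of (x, y) points."""
    total = sum(weights)
    wx = sum(w * x for (x, _), w in zip(points, weights)) / total
    wy = sum(w * y for (_, y), w in zip(points, weights)) / total
    return wx, wy


# Hypothetical coordinates and tornado widths (feet), for illustration only.
points = [(2, 1), (4, 6), (6, 3), (8, 6)]
widths = [300, 50, 50, 100]
print(weighted_mean_center(points, widths))  # roughly (3.8, 2.7)
```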

Standard Distance
Standard distance measures how spread out features are around a center point, usually their mean center. It is very similar to the way a standard deviation measures the spread of data values around the statistical mean, but it is used for spatial analysis.
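One common way to write this is as the square root of the sum of the x and y variances around the mean center. The sketch below assumes that population (divide by n) form; the function name `standard_distance` and the sample coordinates are hypothetical.

```python
import math


def standard_distance(points):
    """Return the standard distance of (x, y) points around their mean center."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Variance in each direction, dividing by n (population form).
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    return math.sqrt(var_x + var_y)


# Hypothetical coordinates, not real tornado locations.
points = [(2, 1), (4, 6), (6, 3), (8, 6)]
print(round(standard_distance(points), 2))  # about 3.08
```

The result is the radius of a circle drawn around the mean center, which is how the standard distance appears on a map.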

Weighted Standard Distance
This method is very similar to standard distance, except that weights are applied to the features in the same way they are applied to points for the weighted mean center. In this activity the weight is tornado width, so the wider a tornado is, the more strongly the standard distance circle is pulled toward that feature.
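A sketch of the weighted version follows, assuming the usual form in which both the center and the squared deviations are weighted. The function name `weighted_standard_distance`, the coordinates, and the widths are hypothetical.

```python
import math


def weighted_standard_distance(points, weights):
    """Return the weighted standard distance around the weighted mean center."""
    total = sum(weights)
    # Weighted mean center.
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    # Weighted variance in each direction.
    var_x = sum(w * (x - cx) ** 2 for (x, _), w in zip(points, weights)) / total
    var_y = sum(w * (y - cy) ** 2 for (_, y), w in zip(points, weights)) / total
    return math.sqrt(var_x + var_y)


# Hypothetical coordinates and tornado widths (feet), for illustration only.
points = [(2, 1), (4, 6), (6, 3), (8, 6)]
widths = [300, 50, 50, 100]
print(round(weighted_standard_distance(points, widths), 2))  # about 3.31
```

With equal weights the function reduces to the ordinary standard distance, so any difference between the two circles comes entirely from the widths.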

Data
All of the above tools were applied to the data we were provided. The data were point feature classes showing tornado locations in Oklahoma and Kansas, split into two time periods: 1995 to 2006 and 2007 to 2012. Each point includes both the location of the tornado and its width. We were also given a shapefile of Oklahoma and Kansas that includes county data.

Results

To answer whether the tornado shelters are important, in the right place, or should be moved, we used the methods described above. My results are shown in Figures 1-7.

Figure 1
Figure 1 shows the locations of tornados from 1995 to 2006. The data are displayed as graduated circles, which lets us see the relative size of each tornado, as shown in the map legend. The smaller the circle, the narrower the tornado; widths are measured in feet. I also calculated the mean center and weighted mean center for this data set. The mean center ends up close to the middle of the study area. The weighted mean center sits slightly south of it because it takes the width of each tornado into account, not just its location. There are more large tornados in the southern part of the study area, which pulls the weighted mean center in that direction.


Figure 2
This map also shows tornado locations as graduated circles, this time for the 2007 to 2012 data. As in the map above, the mean center falls near the center of the study area, and the weighted mean center again moves to the south. This suggests that the southern part of the study area had more large tornados during this time period as well.



Figure 3

This map simply combines the graduated circles, mean centers, and weighted mean centers for both time periods. As you can see, both weighted mean centers are slightly south of their mean centers, which is to be expected based on the first two maps. Based on the data we received, this map suggests that larger tornados are more likely in the southern part of the study area. That higher likelihood makes me agree with having tornado shelters in this area even more so than in the northern part. Having data that span this long a period helps to illustrate the trend: tornados are bigger in the southern part of the study area.


Figure 4
The next method we look at is the standard distance, which calculates the spatial standard deviation of features around a given point. In these maps that point is the weighted mean center. As you can see from this map of the 1995 to 2006 data, the majority of the tornados fall within this distance, and the circle is fairly well centered on the study area, following the location of the weighted mean center. When the values are weighted by tornado width, the circle again moves southward toward the higher concentration of tornado occurrences. These circles contain approximately 68% of the tornados.

Figure 5
Looking at the more recent data set, we see that the standard distance circle is again in the middle of the study area but extends farther north to take in 68% of the tornados. When the weight is applied, the circle moves southward and also gets smaller because the tornados there are more concentrated.


Figure 6
As with the mean centers, we combine the standard distance maps and graduated circles for the two time periods. From this map we can see that the south central part of the study area has the highest concentration of tornados.

Figure 7

We also calculated the standard deviation of tornado occurrences by county in the two states. This choropleth map shows the results. The counties with values of -0.50 to 0.50 standard deviations are closest to the mean. The counties above this range have many more tornados than the two-state average. As you can see, the central part of the study area has the highest number of tornados, well above the average. This agrees with the results of Figures 1-6.

A z-score expresses how many standard deviations a particular observation lies above or below the mean. In this exercise we chose three counties to find z-scores for: Russell, Caddo, and Alfalfa. We found the mean and standard deviation by creating a choropleth map and looking at its statistics. The standard deviation was 4.3 and the mean was 4. Below are the results for the three counties.

Russell = 25 tornados with a z-score of 4.88
Caddo = 13 tornados with a z-score of 2.09
Alfalfa = 4 tornados with a z-score of 0.23

Looking at the z-scores, we can see that Russell County has many more tornados than the other counties: its count is almost 5 standard deviations above the mean. Alfalfa is right about average, with a z-score of 0.23 that is barely above the mean.
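The z-score calculation itself is z = (count - mean) / standard deviation. The short sketch below applies it to the three counties using the rounded mean of 4 and standard deviation of 4.3 quoted above; the function name `z_score` is just an illustrative choice.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations value lies from the mean."""
    return (value - mean) / std_dev


MEAN = 4.0     # mean tornado count per county, as reported above
STD_DEV = 4.3  # standard deviation, as reported above

# County tornado counts as listed above.
counts = {"Russell": 25, "Caddo": 13, "Alfalfa": 4}
for county, count in counts.items():
    print(f"{county}: z = {z_score(count, MEAN, STD_DEV):.2f}")
```

With these rounded figures the formula reproduces 4.88 for the 25-tornado county and 2.09 for Caddo. For a count of 4 it gives 0.00 rather than 0.23; under the same mean and standard deviation, a z-score of 0.23 would correspond to a count of about 5.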

Next I found the number of tornados that is exceeded roughly 70% of the time. To do this you find the z-score that leaves 70% of the distribution above it, which is 0.52 on the z chart. This falls on the negative side of the curve, so we place a - in front of the 0.52. Multiplying by the standard deviation and adding the mean, 4 + (-0.52)(4.3), I found that about 1.76 tornados is the number that should be exceeded 70% of the time.
Next I found the number of tornados that is exceeded roughly 20% of the time. As before, look at the z chart, but instead of 20% find 80%, which has a z-score of 0.84. Doing the same calculation, 4 + (0.84)(4.3), we find that about 7.6 tornados is the number that should be exceeded 20% of the time.
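The z chart lookup can also be done in code. The sketch below uses `NormalDist.inv_cdf` from Python's standard `statistics` module in place of the printed table, and it assumes county tornado counts are roughly normally distributed with the mean of 4 and standard deviation of 4.3 reported above. The function name `count_exceeded_with_probability` is hypothetical.

```python
from statistics import NormalDist

MEAN = 4.0     # mean tornado count per county, as reported above
STD_DEV = 4.3  # standard deviation, as reported above


def count_exceeded_with_probability(p, mean=MEAN, std_dev=STD_DEV):
    """Return the count x that a normal variable exceeds with probability p."""
    # The z-score with probability p above it has probability 1 - p below it.
    z = NormalDist().inv_cdf(1 - p)
    return mean + z * std_dev


print(round(count_exceeded_with_probability(0.70), 2))
print(round(count_exceeded_with_probability(0.20), 2))
```

Because the function uses unrounded z-scores instead of the two-decimal chart values, its results differ slightly in the second decimal place from the hand calculation.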

Conclusions

Based on the maps and results of this study, I recommend that tornado shelters be placed and kept in the central to south central part of the study area. This area has the biggest and most concentrated tornado occurrences compared to the rest of the area. If the trend seen over the time period explored here continues, shelters will need to be added progressively southward, where the highest threat of tornados will shift.