Tuesday, November 5, 2013

Small Town Utopia

          We all have our own biases about how the town that we live in is the best, but is there really a clear winner? That is the question I set out to answer in my project. Although there are many towns I could have put in my sample, I just used the small suburbs around Rochester. These suburbs included Byron, Kasson, Stewartville, Triton, and Pine Island. I chose six categories to rank each town on. Three of these categories were educationally based, and the other three had to do with the town itself. The six categories were MCA math scores, MCA reading scores, college readiness, average house value, houses per 1000 people, and crime rates. I chose these because I thought they were all pretty important to making a town great. I was hoping that I could collect numerical data for each category and, using those numbers, rank each town. The academic data was the easiest to find. I was able to get all of it from one website and then make graphs and compare them. The other three categories were a little more difficult because I had to find websites that had information for all of the towns so that I could be consistent. I found that different websites had slightly different numbers for the categories, so if I used data for one town from one website, I needed to use that same website for all of the other towns. I created graphs for all of the categories and used z-scores to rank the average house values.
            Although I thought it would be relatively easy to find all of the data and compare it, it was harder than I expected. I think that the data I chose had minimal undercoverage and nonresponse. This is because the categories that I chose weren't based on surveys; they were strictly city data. One bias that could have occurred was in the houses on the market. The website I used didn't belong to a specific real estate agency, but it could have been biased toward one, making sure that more of that agency's houses showed up. I also learned that it was harder to rank the towns than I thought. It was easy when it came to academics because the data was already in percentages. I had to use z-scores and find the standard deviation for the house values, which is something I learned in stats. It was fun to be able to actually apply that to something that I was doing on my own. When looking at my graphs, it was interesting to see that the math test scores were a lot more varied than the reading test scores. All of the towns were in the 78%-88% range when it came to the reading tests, while in math, the scores varied from 45% to 74%. It makes me curious as to why it is that way, but that might be a project for another time. I learned that the area around Rochester is a pretty safe place when it comes to crime. All of the towns had the same crime rate, which was the same as the average crime rate of Minnesota. Looking at the number of houses on the market made me wonder why certain towns would have more or fewer houses for sale. Is it because people are trying to get out? Is it because everyone is trying to move there, or no one wants to leave? Maybe there's a higher demand in some towns versus others. Again, that's not something I ever found out, but this project raised more questions that I would be interested in answering sometime.
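
For readers who want to try the z-score step themselves, here is a minimal sketch in Python. The house values below are made-up placeholders, not the numbers from the project, and ranking a higher value as "better" is only an assumption. A z-score is (value - mean) / standard deviation, so it shows how far each town sits from the group average.

```python
from statistics import mean, pstdev

# Placeholder average house values in dollars (NOT the project's real data).
house_values = {
    "Byron": 200_000,
    "Kasson": 170_000,
    "Stewartville": 160_000,
    "Triton": 150_000,
    "Pine Island": 170_000,
}

def z_scores(values):
    """Return {name: z}, where z = (value - mean) / standard deviation."""
    numbers = list(values.values())
    mu = mean(numbers)
    # Population SD, treating the five towns as the whole group being compared.
    sigma = pstdev(numbers)
    return {name: (v - mu) / sigma for name, v in values.items()}

scores = z_scores(house_values)

# Rank from highest z-score to lowest (assumes a higher value ranks first).
ranking = sorted(scores, key=scores.get, reverse=True)
for town in ranking:
    print(f"{town:13s} z = {scores[town]:+.2f}")
```

Because z-scores are measured in standard deviations, they could be compared with z-scores from the other categories even though the original units differ.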

These are a couple of the graphs and the stats that I used for my project.

Measuring Test Preparation


During this quarter of statistics, the main project I worked on involved creating quizzes for students to take before testing on a module. While I liked the flexible structure of this class, I found that I sometimes felt like I was going into tests unsure of how well I knew the material. The main focus of my course improvement project was to create a pretest that gives students a good indicator of how well they understand the concepts before testing. Specifically, I focused on the two modules dealing with hypothesis testing and quantitative distributions.

In order to create quizzes that are a good indicator of a person's level of understanding of specific concepts, I made the material on the quizzes directly related to the concepts that appear on the tests. Because of this, the grade a person receives on the pre-quiz should give that person a rough idea of how well they would do on the test if they took it without further preparation. The questions are in either true/false or multiple choice format, and the quizzes have an average of about twenty-five questions each. Additionally, each question has one correct answer and between one and three incorrect answers, depending on the question. All of the incorrect answers are possible answers a person could get from doing the problem incorrectly or without full understanding of the concepts involved. The types of questions I included on these quizzes varied depending on their similarity to homework problems, short answer questions, or direct test questions.

Finally, I researched multiple interactive quiz sites to find the one that would be most compatible with the questions I created. I found that the Edmodo site had received the best reviews and that it was iPad compatible, so I decided it would be the best option. Upon completion of this project, I found that creating questions about the concepts really required a deep understanding of them. Not only did I have to understand the concepts, but I also had to be able to apply them and get accurate results.

Reflecting back on this quarter, one benefit of this statistics class being largely based upon self-paced work is the ability to pursue projects that are interesting and engaging. In addition to learning new concepts through the required modules, I also feel like I have learned a lot from projects this quarter. While my own project taught me a lot, the projects that my classmates completed also gave me a deeper insight into the world of statistics and its possibilities.



Screen shots of sample quiz questions are shown in the above images

Monday, November 4, 2013

Football Tendencies

My favorite project that I did was my Football Tendency Project. I really enjoyed doing this one because it was about a subject that I really enjoy talking about: football. This project didn't take a whole lot of time to do, but it definitely wasn't an easy subject to get the stats for. There was a lot of calculating to do to get all of my information. I was able to use our football team's Hudl program to get a lot of the formations and things so that I had good data. That way I didn't have to watch 8 games' worth of film to get all of the formations and the number of times the plays were run. I was able to pull the data from the website.

I took all of the formations and the number of times they were used against us during the past football season so that I could predict what teams would want to do in order to be successful against our team. Then I was able to calculate which formations were most commonly used against our team.
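
The counting step can be sketched in a few lines of Python. This is only an illustration: the play log below is made up (the real data came from Hudl), and the three formation names are borrowed from the list later in this post.

```python
from collections import Counter

# Hypothetical play log: one entry per play an opponent ran against us.
plays = [
    "Trips Gun", "Pistol", "White", "Trips Gun",
    "Trips Gun", "White", "Pistol", "Trips Gun",
]

counts = Counter(plays)
total = sum(counts.values())

# List formations from most used to least used, with their share of all plays.
for formation, n in counts.most_common():
    print(f"{formation:10s} {n:3d} plays  {n / total:.0%}")
```

The same idea would work for any tally, such as plays per formation or which side of the field a play went to.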

There were also a lot of things that I was able to learn from using this program to get my information. The first thing it taught me was how to use this website further during my football season. I learned how in depth the website goes into all of the stats that the coaches enter into the system. I was able to see the top 3 plays the team ran out of each formation and which side of the field the play most likely went to. That right there is a project in itself. The second thing that I was able to pull from the data was the most used formations: Trips Gun, White, and Pistol. This is something that I'm sure the coaches know but didn't put on Hudl. I believe a lot of my data is dependable, but all of it is from last year, and teams change from year to year and become less and less predictable as they pick up on teams' schemes against them.

There's Always Room for Improvement

When I asked myself what I could do to improve this course, I thought of aspects of the class that I did not like. Then I realized I shouldn't do a project just because I dislike something, but because a majority of students in the statistics classes see a problem with something. This idea is what jump started my project. I decided that before I try to address an issue, I should find out what students think the major issues are. To do this, I made a survey using Google Forms which asked students about videos, practice problems, and the current structure of the class (group lessons, hybrid, etc.). 33 students between second and fourth block took the survey, and the results I received were very telling.

Once I received the results, I discovered the best way to prepare for module tests and how the structure of the class contributes to students' success in the class. Additionally, I was able to see what aspects of the class were negative and positive according to students. What did I do with these results? I made a list of ways I thought we could improve the negative results. Ultimately, I created course improvement projects that students and Mr. Pethan could do to improve the learning opportunities in the course.

After getting such constructive feedback, I decided this would be the project to share with the class and Mr. Pethan. When preparing to present my results, I realized that no one wants to see lists of numbers or wordy explanations. So, I made an infographic using Piktochart to effectively relay my findings. Below, you can see the infographic as well as the most significant results from the survey.


This project was a long process, so I did not have much time to improve the areas students found problematic. However, I picked one need that over 75% of students said they wanted addressed. Students said they wanted videos in the solution key explaining how to find answers using an online application called StatKey. I made videos for 3 modules that use this application, using the scenarios from practice problems so students could see how to reach the answer. Here is an example of one of the videos I made.


Overall, I have found this statistics class to be a new, innovative way to approach education. Whether using new teaching methods or traditional ones, there will always be aspects to improve upon. In my opinion, it is crucial to ask students where improvements can be made so classes can offer the most effective learning opportunities.

Thursday, October 31, 2013

Course Improvement Project by Shemar Odell

The project from statistics that I chose to blog about was my course improvement project. This project is simply me going through the different units and writing down the most necessary things to know in order to do well on the test for that unit. The reasoning behind doing this for a course improvement project is a bit more complicated. I guess the best way to explain it would be to give you an example. So here is the situation: I'm prepared and ready for the test, and our teacher decides it's a good idea to give every student a verbal mini quiz before the test. Now here is my problem: I know, for the most part, what I'm talking about, but whenever someone grills me with questions like that, I always draw a blank. It's suddenly like we are speaking two completely different languages, and his words are going in one ear and straight out the other, with no comprehension whatsoever.

When I realized I had this problem, I decided to turn it into a course improvement project, but I didn't know how. So I thought, hey, what's the best way I learn something new? And the answer is, I learn new things by learning about the big picture first, then eventually getting to the very small details. Now, occasionally I would forget something along the way, which is basically the premise behind the project.

Here is an example from unit one:
  • Know the definition of the following terms 
    • Individuals 
    • Subjects 
    • Variables 
      • Quantitative variables 
      • Categorical variables 
    • Frequency 
    • Relative frequency 
  • Know what the following graphs are and their advantages/disadvantages 
    • Bar graph 
    • Pie/Circle graph 
    • Histograms
This list is very simple and easy to understand. The farther to the left, the more general, and the farther to the right, the more specific. It goes in order from unit one all the way to unit seven.

By Shemar Odell

Yahtzee Simulator

My main objective for this class was to apply programming to various situations in which statistical analysis could be useful to find an optimized solution through simulation.  While looking at project ideas on Piazza, I came across the idea for Yahtzee simulation and thought it would be a fun, engaging project to get engrossed in.  

I soon started to develop the major algorithm for making the game itself work.  This took quite a long time, as I had to completely scrap the first version of the program I made and replace it with a simpler, more efficient object-oriented approach.  The program itself also generates a file you can open in Excel that contains data from all of the games that are simulated.  

My AI is currently very, very primitive, as it simply holds the dice that have the best chance of getting you a Yahtzee.  This is one area where I plan on expanding in the next quarter.  I will make players with different playing strategies, similar to Mr. Pethan's randomly generated players in the ultimate frisbee game, and these players will show me what strategy for Yahtzee will produce the highest, most consistent scores, which I can then analyze through means, standard deviations, and confidence intervals.
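
To give a feel for this kind of simulation, here is a minimal sketch in Python. It is not the code from my simulator: the function names `play_turn` and `simulate` are made up for this example, it only plays single turns instead of whole games, and the "hold the most common face" rule is a simplified stand-in for the hold strategy described above. It estimates how often one turn ends in a Yahtzee and puts a 95% confidence interval around that rate.

```python
import random
from collections import Counter

def play_turn(rng):
    """Roll five dice, hold the most common face, re-roll the rest up to twice."""
    dice = [rng.randint(1, 6) for _ in range(5)]
    for _ in range(2):  # two re-rolls are allowed after the first roll
        face, count = Counter(dice).most_common(1)[0]
        if count == 5:
            break  # already a Yahtzee, nothing left to re-roll
        # Keep the held dice and re-roll the others.
        dice = [face] * count + [rng.randint(1, 6) for _ in range(5 - count)]
    return len(set(dice)) == 1  # True if all five dice match

def simulate(turns, seed=0):
    """Return the observed Yahtzee rate and a 95% confidence interval."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = sum(play_turn(rng) for _ in range(turns))
    p = hits / turns
    # Normal-approximation interval for a proportion.
    margin = 1.96 * (p * (1 - p) / turns) ** 0.5
    return p, (p - margin, p + margin)

rate, interval = simulate(20_000)
print(f"Yahtzee rate: {rate:.2%} (95% CI {interval[0]:.2%} to {interval[1]:.2%})")
```

Swapping in a different hold rule inside `play_turn` and comparing the resulting intervals is one way different strategies could be tested against each other.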

Anyway, if anyone is interested in the code, it is below in a zip file.  Once it is downloaded, open the folder and copy the two files anywhere you like; just make sure they are both in the same folder.  Once you do this, make sure you have either Python 2.7 or 3.3 installed on your computer, right-click on "simulator.py," and click "Edit with IDLE."  Now scroll to the bottom of the document, which should look like this:



Now, where it says playGames(1000, False), you can enter in any number for 1000 and that will be how many games of Yahtzee will be simulated by the program, and False should probably be kept false.  Now, save the document and hit F5 to run the program.  You will see that displayed on the shell is the average information for valuable stats, while the raw data will be stored in the newly created file game_stats, which is in the same directory as your programs.

Feel free to ask me any questions you have about the program or how it works.

Link to download >>>>Yahtzee Files<<<<

NFL QB rating through week 5

        I decided to spend some time gathering data on quarterbacks in the NFL through week 5 and put the data into Excel, where I was able to recreate the passer rating and create my own overall quarterback rating. I combined passing and rushing to show that quarterbacks cannot be judged on passing alone, because some quarterbacks do way more than that to help their team.  I found it extremely interesting to see which quarterbacks ended up on top and which ones were towards the bottom; a few of the rankings actually surprised me. The following are not the best 3, but they show how the two ratings can differ in either direction or stay pretty close, depending on how the rushing yards affected them.



It was a little hard because having to input that much data into an Excel spreadsheet is a little ridiculous.  Then I realized I didn't even need that much data for the project.  I only needed the data used for the NFL passer rating and a few rushing stats to tweak it.  It was also hard to decide which things to switch so my formula would work. It ended up looking like =(((Rule 1+Rule 2+Rule 3+Rule 4)/6)*100), but each of the rules has other formulas inside of it based on completions, touchdowns, interceptions, and yards.  This was really just an attempt to find a better way to classify NFL quarterbacks.
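
For anyone curious about what sits inside the four rules, here is a sketch of the standard NFL passer rating in Python. Matching "Rule 1" through "Rule 4" to the four components below is an assumption based on the shape of the formula above, and my rushing tweak is not reproduced here. Each component is capped between 0 and 2.375 before the four are added, divided by 6, and multiplied by 100.

```python
def clamp(x, low=0.0, high=2.375):
    """Keep each component inside the range the passer rating allows."""
    return max(low, min(high, x))

def passer_rating(attempts, completions, yards, touchdowns, interceptions):
    """Standard NFL passer rating from five passing totals."""
    a = clamp((completions / attempts - 0.3) * 5)        # completion percentage
    b = clamp((yards / attempts - 3) * 0.25)             # yards per attempt
    c = clamp((touchdowns / attempts) * 20)              # touchdown rate
    d = clamp(2.375 - (interceptions / attempts) * 25)   # interception rate
    return (a + b + c + d) / 6 * 100

# A made-up stat line that maxes out every component (a "perfect" rating).
print(round(passer_rating(20, 20, 400, 5, 0), 1))  # 158.3
```

The cap on each component is why the scale tops out at 158.3 instead of a round number like 100.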

Wednesday, October 30, 2013

Who's Got the Best Poll?

I used most of my time in this class to analyze political polls in the special New Jersey Senate election. I am obsessed with politics and enjoy every aspect of it. Polling caught my interest because it was something I didn't understand in the world of politics. I did some background research so that I could analyze polling charts and backgrounds to help me predict the outcome of the election. Nate Silver's blogs contributed many theories to my project, but I designed my own ranking method. I used the New Jersey election as a kind of "guinea pig."  After reading a ton about political polling, I was able to use realclearpolitics.com and print polling charts. I used the sample size, recency, and recognition of the polling agency to rank the polls. I ranked a pile of charts and went into election day hoping to be correct, and I wasn't. I then got the election results and analyzed why I didn't rank correctly. I came to the conclusion that the locality of a poll also contributes to its correctness. Locality was not an aspect of polling I found in any ranking system out there. This new aspect could affect the entire ranking of polling agencies and their effectiveness for every political race. Locality allows for name recognition of an agency, which could make voters more receptive.
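
Here is a rough sketch in Python of how a ranking based on sample size, recency, and recognition could be scored. Everything specific in it is an assumption made for illustration: the three polls are invented, the 1-to-5 recognition scale is made up, and giving the three factors equal weight is just one possible choice, not the exact method I used.

```python
from dataclasses import dataclass

@dataclass
class Poll:
    name: str
    sample_size: int
    days_before_election: int
    recognition: int  # subjective score: 1 (unknown agency) to 5 (well known)

# Hypothetical polls, not real results from the New Jersey race.
polls = [
    Poll("Poll A", 1200, 3, 4),
    Poll("Poll B", 600, 1, 2),
    Poll("Poll C", 900, 10, 5),
]

def score(poll, max_n, max_days):
    """Average of three factors, each scaled to the range 0 to 1."""
    size = poll.sample_size / max_n                      # bigger sample scores higher
    recency = 1 - poll.days_before_election / max_days   # more recent scores higher
    recognition = poll.recognition / 5
    return (size + recency + recognition) / 3            # equal weights (an assumption)

max_n = max(p.sample_size for p in polls)
max_days = max(1, max(p.days_before_election for p in polls))  # avoid dividing by zero

ranked = sorted(polls, key=lambda p: score(p, max_n, max_days), reverse=True)
for p in ranked:
    print(f"{p.name}: {score(p, max_n, max_days):.2f}")
```

A locality factor could be added as a fourth term in `score` in the same way.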

This project was a lot of work, but also a lot of fun. I was glad to find something new to learn in the political world! Polling is just the first step in understanding the political process. I will now continue on and discover new ways to understand momentum and recognition, maybe even political message.

If you are interested in attempting this project, here is a link to the project instructions: Who's Got the Best Poll?

Tuesday, October 29, 2013

Ultimate Frisbee Drafting

My favorite thing that we have done in statistics this year is the ultimate frisbee project.  I liked how we played it first ourselves and then watched Moneyball so we understood what drafting was like.  Playing it ourselves also helped us understand what to look for in a player.  Jake, Brady and I looked through the list of players and picked out a few that looked promising.  Next we went through and rated them numerically based on skill.

In the movie Moneyball, Brad Pitt picked his players by their on-base percentage.  For the most part we picked players that had good percentages for short throws and catches. We did this because when we played frisbee as a class we realized that it was a lot easier to score with multiple short throws instead of a few long throws.  It was obvious that short throws were more beneficial just from looking at the statistics of the short throws compared to the long throws. We also looked at people who had won a lot of games.  So that our team had some balance, we also made sure to have at least two people who could make and catch long throws.
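
The short-throw reasoning can be checked with a quick calculation. This is only a sketch with made-up completion rates (90% for a short throw, 50% for a long one) and the assumption that three short throws cover about the same distance as one long throw; the real player percentages were different for every player.

```python
def chain_success(p_complete, throws):
    """Chance that every throw in a chain is completed (throws assumed independent)."""
    return p_complete ** throws

# Made-up completion rates, for illustration only.
short_chain = chain_success(0.90, 3)  # three short throws in a row
long_throw = chain_success(0.50, 1)   # one long throw

print(f"Three short throws: {short_chain:.1%}")
print(f"One long throw:     {long_throw:.1%}")
```

With these numbers the chain of short throws still succeeds more often than the single long throw, which matches what we saw when we played as a class.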

During the actual draft, about half of the players we wanted were already taken, so we had to make a couple of last-second decisions.  For these we used the same decision-making process, although we were rushed and did not put as much thought into it.  In the end we had a good draft, and we completely dominated the other teams and got the extra credit points.  I liked this project because it helped me learn how to look at a large amount of data and how to interpret it.   It was kind of fun to go through and pick different players with different strengths to make a solid team.



Nate Levy

We have learned many different things during this quarter of statistics. In the beginning of the year, we mostly focused on the core modules of high school stats, such as doing homework and taking tests. We have done multiple assignments, projects, tests, and presentations. In my opinion, I did my best work on the course improvement project. I put a heaping amount of time and effort into this project. Ian, Isaac, and I have been working on this project together. 

For this project, we decided to make another review and solution key for all of the modules we completed. We divided this project up into three different parts. Ian handled the first three modules with ease, while he left the harder modules (4, 5, and 6) for me to complete. Meanwhile, Isaac made a whole plethora of multiple choice questions from the essays on the tests. We then combined all of these modules to create one gigantic final review. 

This project has taught me all sorts of different things. I really had to use most of my time during class working on this project. It gave me a whole new respect for teachers that have put their time and effort to make these questions and solutions. It also taught me about teamwork, and how important it can be off the field, like in a statistics classroom. If one person did not finish their part, then our whole review would have been slaughtered, and then we would not get a substantial grade. It took a longer time to finish the solution key than the original questions, because I actually had to calculate all of the answers. I had a lot of fun working on this together with my group, and putting together the huge final review. It was also a lot of fun making the questions, and using our brilliant, creative noggins to make them. 

Final Review

Throughout this quarter in Stats I have completed multiple assignments, projects, and presentations, some fun and others challenging. My course improvement project, I thought, was my best work and could show how much time and effort I have put into this class. I did this project with two of my friends, Nate Levy and Isaac Jestus.


We split up this project into three parts: I did the first three modules, Nate completed the last three modules, and Isaac put all the essay questions from the tests into multiple choice format. I then combined all of these to make one massive review. Those who feel they need a review of the whole quarter can use all of it, and those who only need to review certain modules can find them easily in our neatly organized review.

This project showed me that I had to make sure I really used my time wisely so I wouldn't fall behind and lose points. It also showed me how important teamwork was. My teammates also had to stay on task, or else we wouldn't all get done at the same time, making the ones who finished first lose points. This also taught me that teachers have a super tough job trying to make up questions for homework and/or tests. It took me forever to finish the solutions for the review because I kept having to change the questions, either because I couldn't figure them out with the numbers I had on there or because there was no possible solution. The thing I had the most fun with was making graphs that go with the questions I put on there. Examples of my graphs are spread throughout this blog.

Sunday, October 27, 2013

Welcome!

Welcome to the Byron High School Statistics class blog.  During each of the two quarters, every student in class will create a post that shares something they are excited about or proud of in class.  Posts will include a description of the project, some original photos or videos that clarify what the student did, and a reflective component that demonstrates what they learned in the process.  Please engage by commenting on the posts, and share this blog with other Statistics students and teachers!

For more about the course itself, including all of our open course materials, check out mrpethan.com.