Basics of Data Collection

Prerequisites
None

Most statistical analyses require that your data be in numerical rather than verbal form (you can’t punch letters into your calculator). Therefore, data collected in verbal form must be coded so that it is represented by numbers. To illustrate, consider the data in Table 1.

Table 1. Example Data.
Student Name   Hair Color   Gender   Major            Height   Computer Experience
Norma          Brown        Female   Psychology       5’4”     Lots
Amber          Blonde       Female   Social Science   5’7”     Very little
Paul           Blonde       Male     History          6’1”     Moderate
Christopher    Black        Male     Biology          5’10”    Lots
Sonya          Brown        Female   Psychology       5’4”     Little

Can you conduct statistical analyses on the above data, or must you re-code it in some way? For example, how would you go about computing the average height of the five students? You cannot enter students’ heights in their current form into a statistical program; the computer would probably give you an error message because it does not understand notation such as 5’4”. One solution is to change all the numbers to inches. So, 5’4” becomes (5 x 12) + 4 = 64, and 6’1” becomes (6 x 12) + 1 = 73, and so forth. In this way, you are converting height in feet and inches to simply height in inches. From there, it is very easy to ask a statistical program to calculate the mean height in inches for the five students.
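The conversion described above can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: the function name height_to_inches and the pattern used to read the feet-and-inches notation are assumptions made for this example, and the heights are those listed in Table 1.

```python
import re
from statistics import mean

def height_to_inches(text):
    """Convert a height such as 5'4" (or 5’4”) to total inches.

    Hypothetical helper for illustration; it accepts straight or
    curly quote marks and rejects anything else.
    """
    match = re.fullmatch(r"\s*(\d+)\s*['’]\s*(\d+)\s*[\"”]\s*", text)
    if match is None:
        raise ValueError(f"Unrecognized height format: {text!r}")
    feet, inches = int(match.group(1)), int(match.group(2))
    return feet * 12 + inches

# Heights of the five students in Table 1, as originally reported.
heights = ["5’4”", "5’7”", "6’1”", "5’10”", "5’4”"]

inches = [height_to_inches(h) for h in heights]
print(inches)        # [64, 67, 73, 70, 64]
print(mean(inches))  # 67.6
```

Keeping the reported values and doing the conversion afterward means the original answers stay available if the conversion ever needs to be checked.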

You may ask, “Why not simply ask subjects to write their height in inches in the first place?” Well, the number one rule of data collection is to ask for information in such a way as it will be most accurately reported. Most people know their height in feet and inches and cannot quickly and accurately convert it into inches “on the fly.” So, in order to preserve data accuracy, it is best for researchers to make the necessary conversions.

Let’s take another example. Suppose you wanted to calculate the mean amount of computer experience for the five students shown in Table 1. One way would be to convert the verbal descriptions to numbers as shown in Table 2. Thus, "Very Little" would be converted to "1" and "Little" would be converted to "2."

Table 2. Conversion of verbal descriptions to numbers.
1             2        3          4      5
Very Little   Little   Moderate   Lots   Very Lots
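As a rough sketch, the coding scheme in the table above can be applied with a simple lookup. The names EXPERIENCE_CODES and code_experience are made up for this example. Lowercasing each response before the lookup is also an assumption, made so that "Very little" and "Very Little" receive the same code.

```python
from statistics import mean

# Codes taken from the conversion table; keys are lowercased so that
# capitalization differences in the responses do not matter.
EXPERIENCE_CODES = {
    "very little": 1,
    "little": 2,
    "moderate": 3,
    "lots": 4,
    "very lots": 5,
}

def code_experience(response):
    """Return the numeric code for a verbal description (hypothetical helper)."""
    return EXPERIENCE_CODES[response.strip().lower()]

# Computer experience of the five students in Table 1.
responses = ["Lots", "Very little", "Moderate", "Lots", "Little"]

codes = [code_experience(r) for r in responses]
print(codes)        # [4, 1, 3, 4, 2]
print(mean(codes))  # 2.8
```

A response that is not in the table raises a KeyError here, which is usually preferable to silently assigning it a number.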

Measurement Examples

Example #1: How much information should I record?

Say you are volunteering at a track meet at your college, and your job is to record each runner’s time as they pass the finish line for each race. Their times are shown in large red numbers on a digital clock with eight digits to the right of the decimal point, and you are told to record the entire number in your tablet. Thinking eight decimal places is a bit excessive, you only record runners’ times to one decimal place. The track meet begins, and runner number one finishes with a time of 22.93219780 seconds. You dutifully record her time in your tablet, but only to one decimal place, that is, 22.9. Race number two finishes and you record 32.7 for the winning runner. The fastest time in race number three is 25.6. The winning time in race number four is 22.9, race number five is…. But wait! You suddenly realize your mistake; you now have a tie between runner one and runner four for the title of Fastest Overall Runner! You should have recorded more information from the digital clock. That information is now lost, and you cannot go back in time and record running times to more decimal places.

The point is that you should think very carefully about the scales and specificity of information needed in your research before you begin collecting data. If you believe you might need additional information later but are not sure, measure it; you can always decide not to use some of the data, or “collapse” your data down to lower scales if you wish, but you cannot expand your data set to include more information after the fact. In this example, you probably would not need to record eight digits to the right of the decimal point. But recording only one decimal digit is clearly too few.
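A short sketch makes the loss of information concrete. Only runner one’s full clock reading appears in the example, so the full-precision value for runner four below is hypothetical, chosen only because it also rounds to 22.9.

```python
runner_one = 22.93219780   # full clock reading from the example
runner_four = 22.87654321  # hypothetical reading; the example only gives 22.9

# Recorded to one decimal place, the two times are indistinguishable.
print(round(runner_one, 1), round(runner_four, 1))  # 22.9 22.9

# Two decimal places would have been enough to break this tie.
print(round(runner_one, 2), round(runner_four, 2))  # 22.93 22.88
```

Rounding a detailed measurement down to fewer digits is always possible later, but the recorded 22.9 cannot be turned back into 22.93219780.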

Example #2

Pretend for a moment that you are teaching five children in middle school (yikes!), and you are trying to convince them that they must study more in order to earn better grades. To prove your point, you decide to collect actual data from their recent math exams, and, toward this end, you develop a questionnaire to measure their study time and subsequent grades. You might develop a questionnaire which looks like the following:

2. Please indicate how much you studied for this math exam:
a lot……………moderate……….…….little