CP 6691 - Week 3 (Part 2)

Descriptive Research Designs


Table of Contents

Purpose of Descriptive Research
Traps to Avoid When Identifying the Type of Research Design
External and Internal Validity Threats to Consider
Evaluate Sample Study #12 (Human Sexuality Instruction ...)
Evaluate Sample Study #13 (Natural Rates of Teacher Approval ...)

Assignment for Week 4

Part 2 of 2

Purpose of Descriptive Research

We will begin the study of each new research design by describing the purpose of the design. You may recall from the first class period, when we talked about the evaluation quiz format, that the first question asks you to identify what kind of research design the study represents. The reason is that each design type has attributes, unique from other design types, that you need to recognize in order to evaluate the design properly. The best way to identify the design is by determining what the purpose of the research study is.

In the case of Descriptive research designs, the purpose is to describe attributes of a group of subjects. If you can determine that this is the purpose of a research study, then you can be sure it is a descriptive study. But, it's not often easy to determine the purpose statement. With some designs which we will study later in the course, there are alternative means. But, for descriptive research designs, you must either identify the purpose as clearly descriptive, or you must systematically eliminate all the other types of designs, leaving only descriptive. As we study other designs, this process will take shape and become clearer. For now, we only know one design type, so there's nothing else to eliminate.

Let's examine the characteristics of descriptive research. First of all, descriptive research comes in two flavors: survey research and direct observation research. If you recall our discussions in Week 1 about types of measures, you should recognize that survey research makes use of surveys (questionnaires) for data gathering, while direct observation research makes use of observations. When evaluating a descriptive research study, a large part of your evaluation will involve the measure used to collect data as well as the method used to select the sample for the study (things we've already discussed in previous lessons).


Traps to Avoid When Identifying the Type of Research Design

Now that you know how to correctly identify the type of research design you're evaluating, let's look at some traps to avoid in deciding what kind of research design you have. One trap is relying on the measures used for data collection to identify the design. It's tempting, for instance, to reason something like this:

If descriptive research designs use questionnaires (surveys) and observations, then if I see a study that uses either surveys or observations, it must be a descriptive study.

Unfortunately, this reasoning is wrong. Remember that a researcher can use any of the four measures we studied in Lesson 4 (singly or in combination) to collect data for his/her research. There is no restriction on what sorts of measures (instruments) the researcher may use.

Another trap is relying on the type of statistics used in the study. The principal statistics used in descriptive research are descriptive statistics. However, just because a particular study uses descriptive statistics does not make it a descriptive research design. In fact, every well-done research study, regardless of its type, ought to include descriptive statistics on the attributes of the subjects. We call these demographic data, and they are vitally important in helping readers (and evaluators) generalize results of the study to populations other than the target population used by the researcher.
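
To make the idea of reporting descriptive statistics on demographic data concrete, here is a minimal sketch in Python. The values are entirely hypothetical and are not drawn from any study; the point is simply the kind of summary (counts, means, standard deviations) a report should provide.

# Minimal sketch: the kind of descriptive statistics a report should
# provide for demographic data. All values below are hypothetical.
from statistics import mean, stdev
from collections import Counter

ages = [34, 41, 29, 52, 38, 45, 31, 47]              # hypothetical subject ages
genders = ["F", "M", "F", "F", "M", "F", "M", "F"]   # hypothetical gender codes

print(f"n = {len(ages)}")
print(f"Mean age = {mean(ages):.1f}, SD = {stdev(ages):.1f}")
print("Gender counts:", dict(Counter(genders)))      # e.g., {'F': 5, 'M': 3}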

Descriptive research studies will not have research hypotheses stated in the report because of the nature of descriptive research. Descriptive studies are generally done in areas where there is very little, if any, knowledge about certain populations or groups of subjects. Because so little is known, it is virtually impossible to formulate an educated guess (a hypothesis) as to the outcome of the study. This brings up yet another trap to avoid. Do not rely on the absence of a research hypothesis as proof that a study must be descriptive. It is the prerogative of the researcher to use a research hypothesis, a research objective, or research questions in his/her study. In fact, there are many research studies (some you will encounter) that do not contain a research hypothesis -- and they are not all descriptive research designs.

If you avoid these three traps when trying to identify a research design for a study, you'll make very few errors. If you concentrate on finding the purpose and using that as your determining factor, you'll increase your chances of success.


External and Internal Validity Threats to Consider

Before we evaluate our first full research study, let's briefly review the threats to the validity of the study that we've covered the last few weeks. Actually, what I'll do is to introduce you to a categorization technique that should make it easy to remember all the threats you've studied so far and the additional threats you'll learn about in the weeks to come.
[Figure: a generic model of a quantitative research design, showing the external portion (sample selection and generalization) and the internal portion (the study methodology).]

The figure above is a generic model of a quantitative research design. From left to right, you see that the researcher selects a sample from the population of interest. That selection process, as you'll remember, can be either random or non-random. This is considered the External portion of the design. After the sample is selected, we enter into the Internal portion of the design. Here is where the actual research methodology is undertaken: instruments are administered, data gathered, treatments may be conducted on the subjects, data are analyzed, and results (findings) are produced. After the findings are generated, the only thing left to do is to try to generalize them back to the population of interest if possible.

There are several things to notice in this model, so let's take them one at a time. First, notice that both the external portion and the internal portion of the study have threat possibilities. There are actually three external threats, although only two are discussed here. The third one will be introduced in a later lesson. You have actually seen these threats before, but by different names. The names you see here are their traditional names, and are the ones used by most researchers. Population validity is a measure of how representative the sample is of the population. Recall from Week 1 that we talked about ensuring the sample is representative of the population. This is done by randomly selecting a relatively large sample from the population. In some instances, it may even be necessary to stratify the sample to get the proper representation. Well, the more representative the sample is, the higher the population validity (and that's good). So, you can determine the level of population validity by paying attention to how the sample was selected from the population and how large the sample is.
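
If it helps to see the sampling ideas concretely, here is a minimal sketch in Python of simple random selection versus stratified random selection. The population, the region variable used as the stratum, and the 10 percent sampling fraction are all hypothetical.

# Minimal sketch: simple random vs. stratified random sampling.
# The population and its "region" stratum are hypothetical.
import random

random.seed(1)  # reproducible illustration

# Hypothetical population of 1,000 people, each tagged with a region
population = [{"id": i, "region": random.choice(["urban", "rural"])}
              for i in range(1000)]

# Simple random sample of 100
simple_sample = random.sample(population, 100)

# Stratified sample: draw 10 percent from each region so the sample
# mirrors the population's regional makeup
stratified_sample = []
for region in ("urban", "rural"):
    stratum = [p for p in population if p["region"] == region]
    stratified_sample.extend(random.sample(stratum, len(stratum) // 10))

print(len(simple_sample), len(stratified_sample))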

The second external validity threat is called Personological Variable Validity. This is simply a matter of determining whether the researcher is collecting data on all the relevant personological (or demographic) variables of the subjects selected for the study. Here's a quick example. Say a researcher were going to do a study of voting preferences in a particular city for an upcoming election. The researcher would select a random sample of potential voters and would collect (and report) demographic variables as well. Besides the variable of voting preference, what other demographic variable would you expect the researcher to report? Think about it for a minute, then look in the next paragraph for the answer.

Of all the demographic variables the researcher could collect, I would expect him/her to collect and report the age of the subjects. That would be relevant to me as an evaluator since I would want to know that the people surveyed were eligible voters. It would, for instance, be pretty meaningless if the researcher surveyed school-age children, don't you think? So, when you determine what the researcher's purpose is for the study, be sure to check that he/she collects and reports the appropriate kinds of demographic (personological) variables (at least those you, the evaluator, think are appropriate). If the researcher does a good job of collecting and reporting demographic data, then personological variable validity is high. If the researcher does not report many (or any) demographic variables on the subjects of the study, then personological variable validity is low.

Before moving to the internal portion of the study, let me say one very important thing about external validity threats. External threats ONLY affect external "events" in the study (i.e., generalization). External threats cannot affect the internal workings of the study. In other words, regardless of how the sample was selected, or the size of the sample, or the kinds of demographic data collected, they cannot affect the findings of the study. They will only affect the researcher's (or your) ability to generalize those findings back to the population.

The dark gray box in the center of the figure is the internal portion of the study. Notice that there are three internal validity threats here. Actually, there are eight internal threats in total. We'll cover the rest in later lessons. The Instrumentation threat is a combination of a bunch of threats (weaknesses) we discussed in Week 3 (and in Chapter 6 of your textbook), when we talked about the things that need to be evaluated on the four different measuring instruments used in social science research: pencil-and-paper tests, questionnaires, interviews, and observations. Instrumentation also includes instrument validity and reliability. In some research designs, tests are administered before and after a treatment. When this occurs, the instrumentation threat concerns itself with whether the two tests are of equal difficulty or not. So, you can see that the instrumentation threat contains a lot of stuff, but it all has to do with the instruments used to collect data.

Finally, the Appropriate Use of Inferential Statistics threat is one we talked about in the previous lesson dealing with inferential statistics. It has to do with whether the researcher is using the most appropriate inferential statistic (parametric or non-parametric) depending on whether the data being analyzed are continuous or discrete, respectively. If you don't remember this discussion, please review the lesson. If the researcher does not use inferential statistics in the study, you don't have to worry about this. There is another aspect of this statistics threat, but we'll cover it in a later lesson.
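
As a quick refresher, here is a minimal sketch of that decision rule in Python, using SciPy's t test and Mann-Whitney test as stand-ins for a parametric and a non-parametric statistic. The scores and the judgment that the data are discrete are hypothetical; the point is only the branching logic.

# Minimal sketch of the parametric vs. non-parametric decision rule.
# The scores and the "is_continuous" judgment below are hypothetical.
from scipy import stats

group_a = [12, 15, 9, 14, 11, 13]   # hypothetical scores for group A
group_b = [10, 8, 11, 9, 12, 7]     # hypothetical scores for group B
is_continuous = False               # the evaluator's judgment about the data

if is_continuous:
    result = stats.ttest_ind(group_a, group_b)      # parametric test
else:
    result = stats.mannwhitneyu(group_a, group_b)   # non-parametric test

print(f"p = {result.pvalue:.3f}")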

Now, try evaluating study 12 in the Supplemental Book (SB). Use the Typical Evaluation Quiz Format (#11 in the SB) to guide your evaluation by answering each of the five questions there. When you have finished your evaluation, read on in the sections that follow and compare your evaluation with mine. Try to categorize your threats (weaknesses) in terms of the external and internal threats we discussed above.


Evaluating Sample Study #12
(Human Sexuality Instruction in Counselor Education Curricula)

1. What kind of research design is this?

As stated at the bottom of page 305 of the study, the purpose of this study is "... to determine to what extent graduate students in counseling are being educated in matters related to human sexuality." Although it is not worded exactly like the purpose statement we cited earlier for descriptive research studies, it appears to be similar. The study is interested in describing the amount of human sexuality instruction being provided to a specific population (graduate students in counseling). So, although we can't be absolutely certain, we can be reasonably sure this is a descriptive study. Once we learn other research designs, it will be easier to identify this as a descriptive study by systematically eliminating the other design types.
 

2. What is the research hypothesis, objective, or question(s), or if none, so state.

At the end of the introduction (bottom of page 307, top of page 308) is a restatement of the purpose statement we saw on page 305. This statement is clearly not a hypothesis since it doesn't predict an outcome. It also isn't a question. But, could it be a research objective? To answer that question, apply the criteria from Chapter 5 of your text. Is it a concise statement? - yes. Is it grounded in theory or previous research? - yes (as presented in the introduction). Does it clearly identify the variables to be studied and the relationships between them? The answer to this question is no, because the variables are not clearly stated (or defined well). For instance, what is "sexuality counseling?" What does "the extent to which" mean --- number of courses, number of hours, number of faculty involved in the instruction, number of students enrolled in such courses, etc.? Because these variables are so unclear, this statement doesn't qualify as a good research objective.
 

3. To what population would you feel comfortable generalizing results of this study?

To determine the answer to this question, we have to look at who the subjects are, how they were selected, the population they were selected from, how many were selected, and any additional demographic data provided by the researchers on the subjects. In the Methods section on page 308, we see that participants for this study came from counselor education programs listed in the 1983-1985 Counselor Preparation manual. We also know that 476 programs were surveyed. From the purpose of the study, we got the impression that the target population would be graduate counselor education programs (presumably throughout the United States).

The first question, then, is whether the 476 counselor education programs listed in the manual are representative of all graduate counselor education programs in the U.S., or only of a particular region of the country. We aren't told anything about the criteria for schools to get into this manual, nor are we given any demographic data about the schools in the manual that might help us understand what differentiates schools in this manual from schools not in this manual. We are told that surveys were received from all 50 states. So, evidently, schools in this manual are from across the U.S. But, are the schools represented proportionately by state? That is, are there more schools in the manual from states with more counselor education schools? We aren't told the answer to this question either. In fact, we know so little about the schools in this manual that we should be very careful generalizing beyond the schools in this manual - the schools that were surveyed.

Whatever the researchers discover about the schools that responded to their survey, it should be obvious that those results automatically apply to the sample. That isn't generalization. Generalization means to go beyond the sample studied. So, it would be incorrect to answer this question on an evaluation quiz by saying the results from this study generalize to the 270 programs that responded to the survey. You cannot generalize to the sample studied because the results of the study already apply to the sample (that's where the data came from to generate the results).  If you cannot generalize any farther than the sample studied, then you must conclude that no generalization is possible.

So, we know we can't generalize to all counselor education programs in the U.S. But, what about generalizing results back to the original 476 programs found in the manual? Since only 270 programs responded (59 percent), could we not generalize results from these 270 programs back to the other 41 percent that didn't respond? To answer this question, we have to go back to our study of non-respondent (volunteer) bias from a few lessons ago. Recall that we said it's unwise to assume that those who do respond to a survey are similar in their response patterns to those who do not respond. We really can't say one way or another. With such a large non-respondent group, you should have very little confidence that the responses of the 59 percent represent those of the other 41 percent.

All of this leaves us with a sense that we cannot really confidently generalize the results of this study to any other group since there is so little demographic data provided by the researchers to allow us to do so.
 

4. Identify the strengths and threats to validity in this study.

Let's consider the strengths and threats by External and Internal threat categories.  This is a good, systematic way to ensure that you don't miss anything during the evaluation.

External Threat Categories:

1.  Population Validity - Whether this is a strength or a threat depends on the sample size and sampling method. The researcher sent surveys to all 476 counselor education programs listed in the manual.  This is NOT the sample size, however.  When a survey (questionnaire) is the instrument used to collect data, the sample in the study is always made up of those who actually returned the survey (those who are actually participating in the study).  So, in this study, the sample size is 270.  Since this was a voluntary response survey, it is, by definition, a non-random sample.  Therefore, we have a non-random sample of 270 subjects for this study.  The question is whether 270 is a large enough sample size to overcome the sampling error produced by the non-random sampling method. Although 270 appears to be a large number, we are told that it represents only 59 percent of the population to which the survey was initially sent.  So, it's probably fair to say that a 59 percent return rate is not large enough to overcome non-random sampling error.  Thus, population validity would be considered a threat in this study.

2.  Personological Variable Validity - Because this threat category deals with the demographic data collected on the participants of the study, it can also be called Demographics.  The question you need to ask yourself is whether there are demographic data about the subjects of the study that you feel are important to understanding the results, but were not included in the study.  If you feel there were important demographic data left out, then you will find Demographics to be a threat.  Otherwise, it's a strength.

There is one unique twist in this study you need to be aware of.  Typically, quantitative studies in the social sciences deal with human subjects as the sample.  This study is different, though.  The "subjects" of this study are the counselor education programs!  That's what's being studied.  The humans in the study are simply inputting data into the survey instrument to describe their programs.  Therefore, when considering the Demographics threat category in this study, you should consider the descriptive data reported on the programs.

If you believe that the study provided enough demographic data to sufficiently describe the programs so that you could understand the results, then you should indicate that Demographics are a strength.  If, however, you think there are data not reported that would have helped you better understand the results of the study, then you should indicate that Demographics are a threat.  AND you should also indicate what demographics you believe should have been reported.

For example, in this study, I believe it would have been interesting to know the gender makeup of the faculty for each human sexuality course.  Someone might also extend that to include the gender makeup of the student population for each course.  Since these were not included in the demographics, I would find the demographics to be a threat in this study.

Internal Threat Categories:

1.  Instrumentation - We already identified one threat related to the relatively low response rate to the questionnaire (59 percent). On the strength side, the questionnaire was pilot tested and the researchers used a follow-up reminder post card system to try to increase response rate.

Another potential threat was the procedure used to complete the surveys, in which department chairpersons were asked to identify appropriately informed faculty members on the assumption that such individuals would provide valid and reliable responses.  Is it reasonable to assume that faculty members selected by department chairpersons actually were the most informed and best qualified? This is a decision you must make as evaluator.

Also, consider that, as the research report states, "Instructors were asked to make professional judgments about the quality and quantity of sexuality training at their institutions." Without the actual questionnaire to look at, it's impossible to know whether the questions were subjectively or objectively based. But, given the statement above, it's probably reasonable to assume that making judgments about program "quality" might involve some self-report bias.

2.  Appropriate Use of Inferential Statistics - This study used no inferential statistics.  How do I know that?  Two reasons:  First, the researchers tell us in the first sentence in the Results section of the report:  "Descriptive statistics were used to analyze data."  Second, inferential statistics are used to analyze differences between groups and relationships between variables.  Look at all the data tables in this study and you won't find any that address differences between groups or relationships between variables.  So, no threat here...no strength either.
 

5. Are there any ethical problems in this study?

The question of ethical problems relates to two areas. One involves harm done to the subjects of the study by either something the researcher does to the subjects or something the researcher withholds from the subjects that they would normally receive. Nothing of that sort appears to have occurred in this study. The other area involves the unethical manipulation of data or conditions of the study to make results appear different than they really are. This is virtually impossible to determine from a research report, and is usually something you should not accuse a researcher of unless you have hard evidence from a reliable source.

Next, try your hand at evaluating study 13 in the Supplemental Book (SB). As with the previous study, try to categorize your threats (weaknesses) in terms of the external and internal threats we have discussed.

Evaluating Sample Study #13
(Natural Rates of Teacher Approval and Disapproval in British Secondary School Classrooms)

1. What kind of research design is this?

The purpose of this study (stated at the end of the Introduction section on page 40) is to determine "the extent to which British secondary school teachers make use of praise and reprimand in their classroom teaching." It appears from this statement that the study will focus on describing how British secondary school teachers (the population) use praise and reprimand in their classes (the attributes of interest). Therefore, it appears this is a descriptive research design.
 

2. What is the research hypothesis, objective, or question(s), or if none, so state.

The same statement identified in #1 above is actually a research objective. We can be confident of this because the statement meets the criteria stated in Chapter 5 of the text. It is concise, grounded in theory and previous research, and clearly identifies the variables of interest (use of praise and reprimand - positive and negative feedback). In fact, this statement more clearly identifies the variables than was the case in the previous study. Compare the two statements for yourself.
 

3. To what population would you feel comfortable generalizing results of this study?

The sample in this study was 130 secondary school teachers selected from a wide variety of schools from different education authorities (we call them school districts in the U.S.) within the West Midlands area of Great Britain. We are told that the sample is large and that it's incidental, which is another term for a convenience sample. So, it is not a random sample. If the sample is not random, it may not be representative. However, if it's a large sample (as this one seems to be), it may be representative merely because of its size. Since we're not told anything about the West Midlands area, we only have the researchers' word that this sample is large (in relation to the number of secondary school teachers). Since we have no reason to mistrust the researchers, we can conclude that this large sample is probably representative of the West Midlands area. But, we cannot say it is representative of the rest of the secondary schools of Great Britain. Therefore, to answer this question, we could say that the results may be generalized to all secondary schools in the West Midlands area of Great Britain because the sample of 130 teachers is likely large enough to overcome the sampling bias (error) created by the non-random (incidental) sampling method.
 

4. Identify the strengths and threats to validity in this study.

Let's consider the strengths and threats by External and Internal threat categories. 

External Threat Categories:

1.  Population Validity - One potential weakness (threat) in this study is the lack of a random selection of the sample. This is mitigated somewhat, however, by the fact that the sample is large (according to the researchers).

2.  Personological Variables (Demographics) - Since the results of the study relate to the teachers' use of praise and reprimand, it would seem to me that the experience level of the teacher could affect his/her use of praise and reprimand in certain situations.  It also seems to me that the age of the teacher may have an impact on the teacher's use of praise and reprimand.  Since neither of these demographic variables was addressed, I would most likely say that demographics are a threat in this study.

Internal Threat Categories:

1.  Instrumentation - We are told observers were trained in the use of the observation instrument, which is a strength. We are also told that the validity and reliability of the OPTIC instrument is fully described in another study. Since the researchers created this instrument, it is their responsibility to assess validity and reliability. On the weakness side, though, the researchers should have assessed the reliability of the instrument in the present study, since reliability changes with each new group being measured. Another strength is the inter-observer agreement figures reported on page 41 (another measure of reliability). The high values of these agreement indices support the conclusion that observers were well-trained.

2.  Appropriate Use of Inferential Statistics - The researchers use both descriptive and inferential statistics to analyze the data. Since inferential statistics are being used, you, as the evaluator, must assess whether the researchers used appropriate inferential statistics. To determine this, you should recall from the previous lesson that whether or not inferential statistics are appropriate in a given study depends on the type of data being analyzed.

Look at Table 4 on page 43 of the study. Statistical tests are performed to determine if statistically significant differences exist between male and female teachers regarding the use of certain positive and negative behaviors. Note the column in the table labeled "P." That stands for the probability that the difference could have occurred by chance. Recall from the previous lesson that if the P value is less than .05, the observed difference would be expected to occur by chance fewer than 5 times in 100; thus we declare the result to be statistically significant. Notice that only one teacher behavior (Disapproval to social behavior) is statistically significant, meaning that the difference found between male and female teachers is unlikely to be a chance occurrence.

The particular inferential test used to determine this P value is called the "Mann-Whitney" test. Is this a parametric or non-parametric test? It's actually a non-parametric test, and happens to be appropriate in this instance because the data being tested are categorical (frequencies of occurrence of different behaviors). Since categorical data are discrete, only non-parametric statistics would be appropriate to analyze them. But, what if you didn't know that the Mann-Whitney test was non-parametric? No problem, you would just apply the logic of the rule we learned in the previous lesson, which says: if the data being analyzed are continuous, then it is appropriate to use parametric statistics; otherwise, non-parametric statistics must be used. In this study, your reasoning should go something like this:

Since the frequency of behaviors (which is categorical data) is being analyzed, and since categorical data are not continuous, the only appropriate statistical test would be a non-parametric test. If the Mann-Whitney test is non-parametric, then it is appropriate. If it is a parametric test, then it is inappropriate.
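
If you want to see what such a test looks like in practice, here is a minimal sketch in Python using SciPy's Mann-Whitney function. The frequency counts are hypothetical and are not the study's data; SciPy is simply a convenient stand-in for whatever software the researchers actually used.

# Minimal sketch: Mann-Whitney test on hypothetical frequency counts of
# "disapproval to social behavior" for male vs. female teachers.
from scipy import stats

male_teachers   = [14, 9, 17, 12, 15, 11, 13]   # hypothetical counts
female_teachers = [7, 10, 5, 8, 6, 9, 11]       # hypothetical counts

u_stat, p = stats.mannwhitneyu(male_teachers, female_teachers)
print(f"U = {u_stat}, p = {p:.3f}")

if p < .05:
    print("Statistically significant - unlikely to be a chance difference")
else:
    print("Not statistically significant")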

On page 44 of the SB, we see another inferential statistical test used by the researchers - correlation coefficients. Once again, the frequency of behaviors is being analyzed (which is discrete, categorical data). The only appropriate statistical test in this case is non-parametric. According to the researchers, they used "the non-parametric Spearman rank order" test. So, again, they have used an appropriate statistical test.
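
Again, as a minimal sketch only (with hypothetical counts, not the study's data), here is how a non-parametric Spearman rank-order correlation might be computed with SciPy.

# Minimal sketch: Spearman rank-order correlation between two sets of
# hypothetical per-teacher behavior frequencies.
from scipy import stats

approval_counts    = [12, 7, 15, 4, 9, 11, 6, 13]   # hypothetical counts
disapproval_counts = [5, 9, 3, 12, 8, 6, 10, 4]     # hypothetical counts

rho, p = stats.spearmanr(approval_counts, disapproval_counts)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")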
 

5. Are there any ethical problems in this study?

Teachers are being observed in this study. We presume that these teachers, like teachers in this country, are observed routinely. Therefore, there is nothing unethical in this. The report also says that teachers are being observed by other teachers. Some may have ethical concerns with this arrangement. However, I don't believe this to be especially unethical as long as teachers are trained in the observation protocols. Some students have noted that students may be harmed by these teachers being evaluated. I disagree with this since teacher observation is a way of life for teachers, and all students eventually have to endure the occasional "visitor" in the classroom. This does no harm to the students and is, therefore, not an ethical problem. I don't see any other potential ethical problems with this study.

If you have any questions concerning this evaluation (if you found things I didn't discuss here, or if you don't understand something I've discussed here), talk with other members of the class to see if you can resolve the issues with them. If not, discuss your questions with the instructor in class or via email.


End of Week 3 Lesson

Assignment For Next Week
Gall: Chapter 11 (Group-Comparison or Causal-Comparative Designs)
SB: Studies 13 and 14
Guide: Chapter 2