Evaluating Research: Methodological Issues


Psychology 9990



Whenever research is conducted, data are obtained. Researchers must try to make sure that the way in which the data are collected is the same every time. When findings differ each time the research is repeated, such inconsistencies are problems of reliability.

The reliability of the measures used to collect data depends on the ‘tool’ used. A researcher collecting reaction times or pulse rates as data will probably have high reliability, as the machines used are likely to produce very consistent measures of time or rate.

One way to check reliability is to use the test-retest procedure. This involves using a measure once, and then using it again in the same situation. If reliability is high, similar results will be obtained on both occasions, meaning there will be a high correlation between the two sets of scores.

Consider an experiment on food preferences based on different cuisines, such as ‘Chinese’ and ‘Mexican’, in which the researcher is not sure whether the questionnaire is a reliable measure of ‘preference’. They take a group of participants and give them the questionnaire on two separate occasions. All the participants would have to be tested at the same time of day (for example at breakfast, lunch or dinner) and on the same day of the week, so that their hunger levels, and consequently their ‘food preferences’, would be the same. If the ‘food preference’ scale was reliable, this test-retest procedure would produce a high correlation between the scores on the first and second testings. If the reliability was low, the questionnaire would need to be redesigned.
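The test-retest check described above comes down to correlating two sets of scores. The following is a minimal sketch in Python, assuming made-up preference scores for six participants and assuming Pearson's correlation coefficient as the measure of correlation (the notes do not say which coefficient should be used). The function name and the scores are purely illustrative.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_1 = sum(first) / len(first)
    mean_2 = sum(second) / len(second)
    # Sum of cross-products of deviations from each mean
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    # Sums of squared deviations for each testing occasion
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    return cov / sqrt(var_1 * var_2)


# Hypothetical 'food preference' scores (1-10) for the same six
# participants on the first and second testing occasions.
test_scores = [7, 4, 9, 5, 8, 3]
retest_scores = [6, 4, 9, 5, 7, 3]

r = pearson_r(test_scores, retest_scores)
print(f"test-retest correlation r = {r:.2f}")
```

A value of r close to +1 would suggest the questionnaire gives consistent scores across occasions; a value near 0 would suggest the measure needs to be redesigned.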

Another reliability problem is the subjective interpretation of data. For instance, a researcher using a questionnaire or interview with open questions may find that the same answers could be interpreted in different ways, producing low reliability. If these differences arose between different researchers, this would be an inter-rater reliability problem. However, this can be reduced by operationalizing, that is, by clearly defining the variables being measured.

Similarly, if researchers in an observation gave different interpretations of the same actions, this would be low inter-observer reliability. If the reliability was low, the researchers in either case would need to discuss why the differences arose and find ways to make their interpretations or observations more alike. This can be done by agreeing on operational definitions of the variables being measured and by looking at examples together. These steps would help to make the research more objective.
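How closely two observers agree can be put into numbers. The sketch below is one possible way to do this, not a method named in these notes: it computes simple percentage agreement and Cohen's kappa (a standard statistic that corrects agreement for chance). The behaviour categories and the codings are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of observations on which both raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: product of each rater's category proportions, summed
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical codings of the same eight observed actions by two observers
observer_a = ["help", "help", "ignore", "help", "ignore", "ignore", "help", "ignore"]
observer_b = ["help", "ignore", "ignore", "help", "ignore", "ignore", "help", "help"]

agreement = percent_agreement(observer_a, observer_b)
kappa = cohens_kappa(observer_a, observer_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Low values on either figure would be a prompt for the observers to revisit their operational definitions and compare examples together.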

To minimize differences in the way research is conducted that could reduce reliability, standardization can be used, that is, the procedure is kept the same for every participant. This could include the instructions, materials and apparatus, although there would usually be no reason for many of these to change. The important aspects of standardization are those factors which might differ, such as an experimenter’s manner towards participants in different levels of the IV, an interviewer’s body language or verbal mannerisms, or an observer’s success at concealing their presence.


Many factors affect validity. These include reliability, because a test or task cannot measure what it is intended to measure unless it measures consistently. Objectivity also affects validity: if a researcher is subjective in their handling, and specifically their interpretation, of data, their findings will not properly reflect the intended measure.

There are different types of validity. One is face validity, which is whether a measure or procedure appears, on the surface, to test what it is supposed to test. Consider a test of helping behaviour that involved offering to assist people who were stuck in a bathtub full of spiders or lizards.

It might not be a valid test of helping because people who were frightened of spiders or lizards would not help, even though they might otherwise be altruistic (selflessly helpful). This would be a lack of face validity.

If participants think that they understand the aim of the study, their behaviour is likely to be affected by social desirability and demand characteristics, which lowers validity. When designing a study, the researcher should aim to minimize demand characteristics, that is, avoid making it apparent to participants how they are expected to behave.

For Example: 
In the study on ‘False Memories’ conducted by Laney et al., the researchers needed to hide the aim of the study. They did this by using several mock/filler questionnaires, such as the ‘Food Preference Questionnaire’, ‘Food Cost Questionnaire’ and ‘Memory or Belief Questionnaire’, alongside the actual ones (‘Food History Inventory’ and ‘Restaurant Questionnaire’).

Participants who guess the aim might try to remember a certain piece of information really well, or might not report it at all, if that is what they think the researcher expects.

Another problem of validity is whether the findings are too specific to the study itself and cannot be applied to other situations. If the findings lack this general reach, the study lacks ecological validity. This type of validity concerns whether findings from the laboratory apply to the ‘real world’.

For Example:
The findings of an experiment on anxiety and panic attacks conducted inside a laboratory may differ from real-life experiences of anxiety and panic attacks. However, it is also worth noting that a test of anxiety and panic attacks conducted at home may not accurately reflect the situation of people who have these experiences at work or during healthcare procedures. If so, the findings of this test may not generalize beyond the situation tested.

The task itself matters too. If participants are asked to do tasks that are similar to those in real-life contexts, the study has mundane realism (the degree to which the task is similar to events in real life). This is important because a study with realistic tasks would naturally have higher ecological validity. For instance, in an experiment on emotional responses to dangerous animals, bears, insects, bats or tigers could be used.

Only a small number of people are likely to have seen bears or tigers, and a few more would have seen a bat, but insects are likely to have been seen by everybody in the participant sample. A task using insects would therefore have higher mundane realism and thus higher ecological validity. Ecological validity is a type of external validity, which refers to whether the findings of a study can be generalized beyond the present study.


As is apparent, ecological validity contributes to the generalisability of the results. Another factor which affects the ability to generalize is the sample of participants.

If the sample is very small, or does not contain a wide range of the different kinds of people in the population (in terms of gender, age, ethnicity, etc.), it is unlikely to be representative.

Restricted samples like this are more likely to occur when opportunity or volunteer sampling is used than when random sampling is used.
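The difference between random and opportunity sampling can be illustrated with a short Python sketch. Everything here is an assumption for illustration: the population is a list of 100 ID numbers, and ‘opportunity’ is modelled crudely as taking whoever comes first in the list.

```python
import random

# Hypothetical sampling frame: 100 people identified by ID numbers 1-100
population = list(range(1, 101))


def random_sample(frame, size, seed=None):
    """Every member of the frame has an equal chance of being chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(frame, size)


def opportunity_sample(frame, size):
    """Whoever is most readily available, assumed here to be the first few."""
    return frame[:size]


chosen_randomly = random_sample(population, 10, seed=1)
chosen_by_opportunity = opportunity_sample(population, 10)
print("random:     ", sorted(chosen_randomly))
print("opportunity:", chosen_by_opportunity)
```

The opportunity sample always draws from the same narrow part of the list, which mirrors how such samples can end up restricted, while the random draw can reach any part of the population.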

Important Things To Remember About Research Methodology And Processes:

•  Are measures reliable? 

•  Are the tools and equipment being used producing consistent results?

•  Are the researchers using them in consistent ways?

•  Is the interpretation of data objective?

•  Is the study valid? Does it represent what the aim intends to find out?

•  Take into account the role of reliability and generalizability when assessing validity.

•  Are there any variables that may affect results, such as social desirability, demand characteristics, familiarity bias or researcher bias?

•  To improve the study, focus on the method, design, procedure and sampling technique.

© 2019-2022 O’Level Academy. All Rights Reserved