There are many reasons why outliers occur in data. There can be technical faults with machinery, a flaw in the design of the experiment, a participant who misunderstood the instructions (or did not really respect the concept of the experiment), or a participant who simply has extraordinary results. Although it is often accepted that outliers can be removed in certain circumstances, I am going to go through some of these causes and explain why I think outliers should not be excluded from the findings.
When there is a technical fault with machinery, the data obtained can be drastically affected. On these occasions, the results may not represent what the investigator is attempting to analyse. For example, a researcher may be interested in reaction times, but if a machine is faulty it might record times that are hugely different from the time it actually took the participant to react. In instances like this it is easy to assume that outliers should simply be cast away, as they are in fact not valid. However, I would argue that if the machine was faulty for some of the trials, then there is no certainty that it worked for the rest. Perhaps the other results seem ‘normal’ only because the inaccuracies of the machinery were less obvious. I feel that there is really no other way to proceed than to collect the data from the participants again (after ensuring the machinery is no longer faulty) to get accurate data. Obviously, this would cost the researchers a great deal of money, effort and time, but as researchers of science surely we need to make sure that the results are true. Here is a link http://pareonline.net/getvn.asp?v=9&n=6 that describes how, using certain statistical methods, you can instead keep your outliers without compromising your results.
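To give a rough idea of what keeping outliers "without compromising your results" can look like, here is a minimal Python sketch. The reaction times are made up for illustration, `winsorized_mean` is a hypothetical helper I wrote for this post, and the median and winsorizing are simply two common robust approaches; I am not claiming they are the exact methods described in the linked article.

```python
from statistics import mean, median

def winsorized_mean(values, k=1):
    """Mean after pulling the k lowest and k highest values in to the
    nearest remaining value. No observation is deleted.
    (Hypothetical helper, written for illustration only.)"""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must be non-negative and less than half the sample size")
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    # Clamp every value into the range [low, high].
    clamped = [min(max(v, low), high) for v in ordered]
    return mean(clamped)

# Made-up reaction times in milliseconds; the last one is an extreme value.
reaction_times_ms = [310, 295, 330, 305, 320, 2900]

plain_mean = mean(reaction_times_ms)                 # pulled far upwards by 2900
robust_median = median(reaction_times_ms)            # barely affected by 2900
robust_mean = winsorized_mean(reaction_times_ms, 1)  # extreme value pulled in, not dropped

print(f"mean: {plain_mean:.1f} ms")
print(f"median: {robust_median:.1f} ms")
print(f"winsorized mean: {robust_mean:.1f} ms")
```

In this sketch every participant still contributes a data point. The extreme score just has less leverage over the summary, which is one way of reporting a typical value without deciding that someone's result does not count.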
The next outlier issue I will discuss concerns participants. It is sometimes thought that participants can, purposefully or not, make mistakes when taking part in experiments. Although there are reasons for which people remove outliers from data, such as data entry mistakes (http://184.108.40.206/wiki/Dealing_with_Outliers), I believe that regardless of whether the participant did not understand the task or just didn’t bother doing it to the best of their abilities, outliers should not be removed from the results of the investigation. I think that if the participants did not understand what was expected of them, there has been a fault in the methods of the experimenter (Rosenthal, 1994). The instructions should be written or delivered well enough for everyone to understand easily. This will ensure that participants are aware of what is expected of them, as ethical guidelines demand.
If participants have not completed the task properly because of a lack of interest, it is widely accepted that the data are not valid and so should be dismissed. However, I think that sometimes this completely misses the point. We, as psychologists, attempt to investigate human behaviour. If we ask a person to do a task, no matter how they react, what gives us the right to say that they are incorrect and so shouldn’t be counted? Any human reaction should be important to our full understanding of human behaviour, even if it is not the reaction we were looking or hoping for. We are not laboratory rats; we are humans.
This link http://pareonline.net/getvn.asp?v=9&n=6 also describes beautifully how sometimes the only valid scores look like outliers. For example, if you ask teenagers about their drug use, many of them might under-report their true use due to demand characteristics. The honest scores would therefore seem exceptionally high and might be regarded as outliers, which is another thing we must be careful to consider.
Finally, there is the occurrence of chance. Sometimes a person’s results are just very extreme compared to the rest of the sample. Whether this is because they come from a different population to the rest or because a different factor is affecting them, should we really exclude them just because their results are different? And if we can do this, where do we draw the line? We might one day end up with studies removing any and all pieces of data that do not agree with the theories, and that is not really science, is it?
Rosenthal, R. (1994). Science and ethics in conducting, analysing, and reporting psychological research. Psychological Science, 5(3), 127-134. doi: 10.1111/j.1467-9280.1994.tb00646.x