Data Analysis
Essential questions
How to analyze numerical experimental data?
What statistical methods can be used for experimental data analysis?
Numeric raw data collected in an experiment are transformed and presented in a way that allows a statistical analysis of the relationships between the independent and dependent variables.
Laboratory experiments usually produce multiple measurements of the same parameter (replicate measurements), each of which can be subject to error. Statistical analysis is used to summarize the measured values by calculating the average, which is an estimate of the true mean. It also determines the variance, which indicates the uncertainty of the measured variable. Strictly speaking,
measured value = true value ± error
While every effort is made to control the sources of error in experimental research, errors are not completely avoidable and thus should be properly represented. To evaluate the uncertainty in a measurement precisely, true repeats of the experiment should be conducted. They should involve redundancy in every aspect of the experimental setup to reduce the errors in the data collected.
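As an illustration of summarizing replicate measurements, the following is a minimal sketch using only the Python standard library. The values in `replicates` are hypothetical and chosen purely for demonstration; the standard error of the mean is included as one common way to express the uncertainty of the average.

```python
import math
import statistics

# Hypothetical replicate measurements of one parameter (arbitrary units).
replicates = [9.8, 10.1, 10.0, 9.9, 10.2]

n = len(replicates)
mean = statistics.mean(replicates)          # estimate of the true mean
variance = statistics.variance(replicates)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(replicates)      # sample standard deviation
std_err = std_dev / math.sqrt(n)            # standard error of the mean

print(f"mean = {mean:.3f}")
print(f"variance = {variance:.4f}")
print(f"standard error = {std_err:.4f}")
```

The result would typically be reported as the mean plus or minus the standard error (or the standard deviation, depending on the convention of the field).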
The following are some examples of applications of statistics in experimental data analysis. When the same parameter is measured under two different experimental settings, and the researcher wants to know whether there is a difference between the two sets of measured values, a t-test can be used to establish if that is the case; analysis of variance (ANOVA) extends the comparison to more than two sets.
If one variable was measured under a variety of experimental conditions defined by a second variable, regression analysis can be used to provide a mathematical description of their relationship.
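A minimal sketch of simple linear regression is shown below, assuming a straight-line relationship y = slope * x + intercept fitted by ordinary least squares. The data in `x` and `y` are hypothetical, and the helper name `fit_line` is introduced here only for illustration.

```python
import math

def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept; also returns Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    r = s_xy / math.sqrt(s_xx * s_yy)
    return slope, intercept, r

# Hypothetical data: x is the controlled condition, y the measured response.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r = fit_line(x, y)
print(f"y = {slope:.2f} * x + {intercept:.2f}, r = {r:.4f}")
```

The correlation coefficient `r` indicates how closely the points follow the fitted line; values near 1 or -1 suggest a strong linear relationship.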
Most statistical methods are based on the assumption that the errors in measurement have a normal probability distribution (also known as the Gaussian distribution). This is commonly the case for continuous values obtained in experimental research.
Numeric data that fit a normal distribution can be analyzed with parametric tests. These tests are also based on the assumption that the samples are equally homogeneous, that is, they have similar variances and are obtained by random sampling. Non-parametric statistical tests are used to analyze data that do not follow a normal distribution, or for which normality cannot be established, for example because of a small sample size.
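As one example of a non-parametric test, the sketch below computes the Mann-Whitney U statistic, which compares two independent samples using only the ordering of the values and not their distribution. The choice of this particular test, the function name `mann_whitney_u`, and the sample data are assumptions made for illustration.

```python
def mann_whitney_u(a, b):
    """Return the Mann-Whitney U statistic (the smaller of U_a and U_b)."""
    # For every pair (x from a, y from b), count 1 if x > y and 0.5 for a tie.
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)

# Hypothetical measurements from two independent groups.
group_a = [1.2, 1.5, 1.1, 1.9, 1.4]
group_b = [2.3, 2.8, 1.6, 3.1, 2.5]

u = mann_whitney_u(group_a, group_b)
print(f"U = {u}")
```

A small U means the two groups overlap little. To decide whether the difference is significant, U is compared with a tabulated critical value for the given sample sizes or converted to a p-value.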
Commonly, a measured value is compared with either a known value or another measured value. Analysis of variance (ANOVA) examines differences between samples of experimental data. It allows simultaneous comparison and statistical analysis of multiple data sets rather than the comparison of only two averages. For the comparison of two averages, or of an experimental average and a known value, ANOVA reduces to what is commonly called a t-test, which can be used to test a hypothesis. The first step is to formulate the null hypothesis. Conventionally, the null hypothesis states that there is no difference between the two values. Statistical analysis of the data is then applied to examine the strength of the evidence for rejecting the null hypothesis in favor of a specific alternative hypothesis:
https://www.youtube.com/watch?v=0XsovsSnRuw
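The hypothesis-testing steps described above can be sketched as follows. This is a minimal illustration, assuming two independent samples with equal variances (the pooled two-sample t-test) and a significance level of 0.05. The data, the function names `pooled_t` and `one_way_f`, and the use of a tabulated critical value in place of a computed p-value are choices made here for demonstration. The sketch also checks numerically that, for two groups, the ANOVA F statistic equals the square of the t statistic.

```python
import math
import statistics

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_err, df

def one_way_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical measurements. Null hypothesis: the two means do not differ.
control = [5.1, 4.9, 5.0, 5.2, 4.8]
treated = [5.6, 5.4, 5.7, 5.5, 5.3]

t_stat, df = pooled_t(control, treated)
f_stat = one_way_f(control, treated)

# Tabulated two-tailed critical value of t for alpha = 0.05 and df = 8.
T_CRIT = 2.306
reject_null = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.3f}, df = {df}, F = {f_stat:.3f}")
print("reject null hypothesis" if reject_null else "fail to reject null hypothesis")
```

When the absolute value of t exceeds the critical value, the null hypothesis of no difference is rejected at the chosen significance level. With more than two groups, `one_way_f` can be called with additional samples, and F is then compared with a critical value of the F distribution.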
Suggested readings
Ali Z., Bhaskar S. B. (2016) Basic statistical tools in research and data analysis. Indian J Anaesth. 60(9):662-669.
Satake E. B. (2015) Statistical Methods and Reasoning for the Clinical Sciences: Evidence-Based Practice. 1st ed. San Diego: Plural Publishing, Inc.
McDonald, J.H. (2014) Handbook of Biological Statistics (3rd ed.). Sparky House Publishing, Baltimore, Maryland.