STATISTICAL METHODS IN BIOLOGY

STATISTICAL METHODS IN BIOLOGY
1. Introduction
2. Populations and samples
3. Hypothesis testing and parameter estimation
4. Experimental design for biological data
5. Most widely used statistical tests I
6. Most widely used statistical tests II
7. Linear regression
8. Nonlinear regression
9. Regression model fit
10. Correlation
11. Elements of statistical data modelling
12. Model comparison
13. Analysis of variance
14. Analysis of covariance
15. Summary of the material, analysis of examples, discussion

INTRODUCTION
1. Why plan an experiment, and what are the stages of planning an experiment?
2. The power of a test
   - estimating p
   - factors affecting the power of testing
3. Types of samples
   - random
   - selection according to a prespecified criterion
   - case-control
   - blocked design
   - cross-over
   - split plot
4. Taking measurements
   - calibration
   - inaccuracy
   - influence of the observer
   - examples of measurements

INTRODUCTION
Why do we plan biological experiments?
1. The structure of the sample must be appropriate for testing the experimental hypotheses
2. The sample size must guarantee sufficient power of testing
3. Constraints on sample size: ethical issues, costs, time
Copyright 2017 Joanna Szyda

INTRODUCTION
Stages of designing an experiment
1. Formulation of the research objective, e.g. impact of fresh air on lipid concentration in mussel tissues
2. Formulation of the hypotheses $H_0$ and $H_1$, e.g. $H_0: k_1 = k_2$, $H_1: k_1 \neq k_2$
3. Determining the size and structure of the sample
4. Sample collection
5. Test
6. Decision on hypotheses

POWER OF TESTING

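POWER OF TESTING
Type II error

ERRORS
                       TRUE HYPOTHESIS
ACCEPTED HYPOTHESIS    $H_0$                       $H_1$
$H_0$                  correct decision            type II error ($\beta$)
$H_1$                  type I error ($\alpha$)     correct decision

$\beta$: probability of rejecting a true $H_1$
$1-\beta$: power = probability of detecting a true difference / effect / etc.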

POWER OF TESTING
Power analysis should precede each experiment

POWER OF TESTING - factors affecting the power
1. Sample size
   It is easier to detect an effect in large samples
   - more informative
   - lower impact of sampling error

POWER OF TESTING - factors affecting the power
2. Effect size
   $H_0: k_1 = k_2$, $H_1: k_1 \neq k_2$
   It is easier to detect large effects
   [Figure: two panels comparing $k_1$ and $k_2$]

POWER OF TESTING - factors affecting the power
3. Variability of data in the sample (sample variance)
   It is easier to detect a difference in homogeneous samples
   [Figure: two panels comparing $k_1$ and $k_2$]

POWER OF TESTING - factors affecting the power
4. Assumed type I error rate
   It is easier to detect a true difference with a large $\alpha$, but we then also allow a higher probability of rejecting a true $H_0$
   [Figure: true $H_0$ and true $H_1$, with the regions where $H_0$ and $H_1$ are accepted]

POWER OF TESTING - example
1. Calculation of the sample size required for a given $\alpha$, $1-\beta$ and effect size
2. t-test
3. G*Power software (http://www.gpower.hhu.de/)

POWER OF TESTING - determining sample size for a given power 1. Information from the literature

POWER OF TESTING - determining sample size for a given power 2. Numerically
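
One common way to carry out the numerical calculation for a comparison of two group means is the normal approximation $n = 2\,(z_{1-\alpha/2} + z_{1-\beta})^2\,\sigma^2 / \delta^2$ per group. The sketch below is only an illustration under stated assumptions: a two-sided test, equal group sizes and a known common standard deviation $\sigma$. The function name and the example values are hypothetical and not taken from the lecture.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    Assumption: normal approximation with known common sigma,
    n = 2 * (z_{1-alpha/2} + z_{power})^2 * (sigma / delta)^2
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the type I error rate
    z_power = z.inv_cdf(power)            # quantile matching the requested power
    n = 2 * (z_alpha + z_power) ** 2 * (sigma / delta) ** 2
    return ceil(n)                        # round up to whole individuals


# Hypothetical effect sizes expressed in units of sigma
n_medium = sample_size_per_group(delta=0.5, sigma=1.0)
n_large = sample_size_per_group(delta=1.0, sigma=1.0)
print(n_medium, n_large)
```

An exact calculation based on the t distribution, as performed by G*Power, gives a slightly larger n because $\sigma$ has to be estimated from the sample.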

POWER OF TESTING - determining sample size for a given power
3. Empirically: computer simulations
   - repeated (e.g. 1 000 times) generation of in silico data following predefined rules
   - known true values
   - testing
   - counting type I and type II errors
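
The empirical approach can be sketched as a small simulation. This is a minimal illustration, not the lecture's own procedure: it assumes normally distributed data with a known standard deviation and uses a two-sided z-test instead of a t-test to stay within the Python standard library. All names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def rejection_rate(n, delta, sigma=1.0, alpha=0.05, n_sim=2000, seed=1):
    """Fraction of simulated experiments in which H0: k1 = k2 is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma * (2 / n) ** 0.5   # standard error of the difference in means
    rejected = 0
    for _ in range(n_sim):
        # One in silico data set with known true means 0 and delta
        mean1 = sum(rng.gauss(0.0, sigma) for _ in range(n)) / n
        mean2 = sum(rng.gauss(delta, sigma) for _ in range(n)) / n
        if abs((mean2 - mean1) / se) > z_crit:
            rejected += 1
    return rejected / n_sim


# H0 true (no difference): every rejection is a type I error
type_1_rate = rejection_rate(n=63, delta=0.0)
# H1 true (difference of 0.5 sigma): rejections are correct detections
power = rejection_rate(n=63, delta=0.5, seed=2)
# H1 true: every non-rejection is a type II error
type_2_rate = 1 - power
print(type_1_rate, power, type_2_rate)
```

Repeating the call for a range of n shows the smallest sample size at which the estimated power reaches the required level.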

TYPES OF SAMPLES

TYPES OF SAMPLES - random
Random selection
1. Relatively easy to collect
   - no experiment needed
   - easy to obtain a large sample
2. The results can be applied to the entire population

TYPES OF SAMPLES - selection according to a criterion
1. Individuals selected nonrandomly
2. Must fulfil some criteria
   - nuclear families
   - multigenerational families
   - parent-child trios
   - complicated family structure, individuals selected based on disease phenotypes

TYPES OF SAMPLES - selection according to a criterion
1. Predefined relationship structure
2. Inbred line cross (laboratory organisms)
   - intercross
   - backcross
   [Diagram: crossing schemes for the intercross and the backcross, each showing generations P, F1 and F2]

TYPES OF SAMPLES - case-control
1. Individuals within each category are randomly selected
2. The case group: subjected to an experimental factor
3. The control group: reference for comparison with the case group
Case group: e.g. sick persons, persons under medication
Control group: e.g. healthy persons, persons taking placebo

TYPES OF SAMPLES - blocked design
Case group: e.g. diet A
Control group: e.g. diet B
[Figure: individuals in the case and control groups]
1. Individuals in both categories are similar with respect to one or more traits that could potentially affect the test result (blocking)
2. Blocking reduces the influence of individual variability within a group, which gives higher power of testing
3. Such samples are often difficult to collect
4. Often we do not know which criteria the blocking should be based on
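
One possible way to build a blocked sample is to pair individuals that are similar for the blocking trait and to split each pair at random between the two diets. The sketch below assumes a single numeric blocking trait and blocks of size two. The function name, the identifiers and the body masses are hypothetical.

```python
import random


def blocked_assignment(trait_by_id, seed=0):
    """Pair individuals with similar trait values and split each pair
    at random between diet A and diet B."""
    rng = random.Random(seed)
    # Sort individual IDs by the blocking trait so that neighbours are similar
    ordered = sorted(trait_by_id, key=trait_by_id.get)
    assignment = {}
    # With an odd number of individuals the last one stays unassigned
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)                 # random allocation within the block
        assignment[pair[0]] = "diet A"
        assignment[pair[1]] = "diet B"
    return assignment


# Hypothetical body masses (kg) used as the blocking trait
body_mass = {"id1": 3.1, "id2": 5.4, "id3": 3.0, "id4": 7.9,
             "id5": 5.6, "id6": 8.2, "id7": 4.1, "id8": 4.3}
groups = blocked_assignment(body_mass)
print(groups)
```

Because each block contributes one individual to each diet, the two groups end up with similar distributions of the blocking trait.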

TYPES OF SAMPLES - cross-over
[Diagram: Stage 1 and Stage 2, each with diet A and diet B]
1. The same individuals appear in both groups
2. Variability between individuals is eliminated from the comparison (each individual serves as its own control)
3. Possible only for some types of studies (e.g. the trait is cheap to measure, can be measured in vivo, and the individuals are simple to keep)
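
The gain from a cross-over design can be illustrated by comparing the standard error of the estimated diet difference under two designs. The sketch assumes a simple additive model in which each measurement is the diet mean plus an individual effect (standard deviation `sd_between`) plus measurement noise (standard deviation `sd_within`), and it ignores period and carry-over effects. The function names and the numbers are hypothetical.

```python
from math import sqrt


def se_two_groups(sd_between, sd_within, n):
    """Standard error of the difference with n different individuals per diet."""
    return sqrt(2 * (sd_between ** 2 + sd_within ** 2) / n)


def se_cross_over(sd_within, n):
    """Standard error of the difference when the same n individuals get both diets.

    The individual effect cancels in each within-individual difference.
    """
    return sqrt(2 * sd_within ** 2 / n)


# Hypothetical variability: individuals differ more than repeated measurements do
se_independent = se_two_groups(sd_between=2.0, sd_within=1.0, n=20)
se_paired = se_cross_over(sd_within=1.0, n=20)
print(round(se_independent, 3), round(se_paired, 3))
```

A smaller standard error for the same number of measurements per diet means higher power of testing.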

TYPES OF SAMPLES - split plot
2 experimental factors
- diet A, diet B
- training A, training B
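
A split-plot layout for the two factors can be sketched as two levels of randomisation. The slides do not say which factor is applied to whole plots, so the sketch assumes that diet is assigned to whole plots (here hypothetical pens) and training to individuals within each pen. All names are illustrative.

```python
import random


def split_plot_layout(whole_plots, units_per_plot, seed=0):
    """Return (plot, unit, diet, training) tuples for a split-plot design.

    Assumes an even number of whole plots and an even number of units per plot.
    """
    rng = random.Random(seed)
    # Whole-plot factor: half of the plots get diet A, half diet B, in random order
    diets = ["diet A", "diet B"] * (len(whole_plots) // 2)
    rng.shuffle(diets)
    layout = []
    for plot, diet in zip(whole_plots, diets):
        # Subplot factor: training is randomised separately inside every plot
        trainings = ["training A", "training B"] * (units_per_plot // 2)
        rng.shuffle(trainings)
        for unit, training in enumerate(trainings, start=1):
            layout.append((plot, unit, diet, training))
    return layout


layout = split_plot_layout(["pen1", "pen2", "pen3", "pen4"], units_per_plot=4)
for row in layout:
    print(row)
```

Every pen receives a single diet, while both trainings occur within every pen.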

TAKING MEASUREMENTS

MEASUREMENTS - inaccuracy
1. Calibration of measuring devices
   - setting / checking the measuring devices by analysing samples with known values
   - repeated calibration while the measurements are being taken
2. Measurement inaccuracy
   - we cannot measure with infinite accuracy
   - precision should be the same for all samples in a given study
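
Calibration against samples with known values can be sketched as fitting a straight line to the device readings and then inverting it. This is only an illustration that assumes a linear device response. The function names, the reference values and the readings are hypothetical.

```python
def fit_calibration(known_values, readings):
    """Least-squares line: reading = intercept + slope * known_value."""
    n = len(known_values)
    mean_x = sum(known_values) / n
    mean_y = sum(readings) / n
    sxx = sum((x - mean_x) ** 2 for x in known_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(known_values, readings))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def corrected_value(reading, intercept, slope):
    """Convert a raw device reading back to the calibrated scale."""
    return (reading - intercept) / slope


# Hypothetical reference samples and the readings the device returned for them
standards = [0.0, 10.0, 20.0, 30.0]
device_readings = [0.5, 11.0, 21.5, 32.0]
intercept, slope = fit_calibration(standards, device_readings)
value = corrected_value(16.25, intercept, slope)
print(intercept, slope, value)
```

Refitting the line at intervals during a measurement session corresponds to the repeated calibration mentioned in the list.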

MEASUREMENTS - influence of the observer
- within an observer, e.g. fatigue, change in subjective assessment
- between observers, e.g. differences in subjective assessments
Rules for making observations:
- do not make too many observations at once
- do not use simplified shorthand
- make backup copies of the data
- keep protocols of the course of the experiment
- use electronic forms and databases

MEASUREMENTS - examples of traits
- easy to quantify, e.g. height
- difficult to quantify, e.g. behavioural observations

? PLANNING EXPERIMENTS