# Experimental Design

What are the three main principles of experimental design?
Randomisation, replication and blocking.
What are experiments used to explore?
Processes: how do inputs (treatment factors) affect outputs (responses)?
What makes a designed experiment?
The conditions are controlled. Only if the conditions are controlled can we draw conclusions about cause and effect. Observational data (such as surveys) can be used to identify associations but not causal relationships.
What are the six components of an experiment?
1. Questions of interest. 2. Experimental units. 3. Treatments. 4. Assigning treatments to units. 5. Measuring the response. 6. Analysis.
What is an experimental unit?
Experimental units are the objects to which the treatments are applied.
Why is replication important?
If each treatment is used only once, we cannot tell whether an observed difference is due to the treatments or just to random variability.
Define blocking
Experimental units are divided into subsets (called blocks) so that units within the same block are more similar than units from different blocks. If two units in the same block get different treatments, the treatments can be compared more precisely than if they were assigned to units in different blocks.
Describe randomisation
Allocation of experimental units to combinations of treatment factor levels should be determined by a random process.
Why is randomisation important?
First, it ensures that each treatment has the same probability of getting good (or bad) units and thus avoids systematic bias. Second, it justifies one of the key assumptions of the statistical analysis: that observations are independent. This is almost never strictly true in practice, but randomisation means that our estimates will behave as if they were based on independent observations.
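
The random allocation described above can be sketched in a few lines of Python. This is a minimal illustration for a completely randomised layout, assuming equal replication; the function name `randomise_crd`, the unit labels and the treatment labels are all hypothetical.

```python
import random

def randomise_crd(units, treatments, reps, seed=None):
    """Randomly allocate each treatment to `reps` units (completely randomised)."""
    if len(units) != len(treatments) * reps:
        raise ValueError("need exactly len(treatments) * reps units")
    rng = random.Random(seed)
    # Build the list of treatment labels, each repeated `reps` times,
    # then shuffle so that every allocation is equally likely.
    labels = [t for t in treatments for _ in range(reps)]
    rng.shuffle(labels)
    return dict(zip(units, labels))

# Hypothetical example: 12 units, 3 treatments, 4 replicates each.
allocation = randomise_crd(units=list(range(1, 13)),
                           treatments=["A", "B", "C"], reps=4, seed=1)
```

Fixing the seed only makes the sketch reproducible; in a real experiment the allocation would be generated once and recorded.
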
Which three design components shape the statistical analysis of an experiment?
Block structure: describes the experimental units. Treatment structure: describes the treatments. Allocation: how the treatments are assigned to the experimental units.
What is the ultimate goal of analysis?
The ultimate goal of the analysis is to answer the questions of interest. To accomplish this we need to estimate the difference in the response between different treatments and calculate standard errors for these estimates. This allows us to determine which differences signal real effects and which can be explained by experimental variability.
What is the defining feature of a completely randomised design?
The experimental units are homogeneous. This means that any pair of units is as similar as any other pair.
What is the data model for a completely randomised design (CRD)?
response = signal + noise. The signal relates the expected value of the response to the treatment factors. The noise relates the variability in the response to the blocking factors. We will use fixed terms (constants) for the signal and random terms (random variables) to model the noise.
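
One common way to write this model for a CRD with a single treatment factor is sketched below. The symbols are assumptions, not taken from these notes: $y_{ij}$ is the response of the $j$-th unit receiving treatment $i$, $\mu$ is an overall mean, $\tau_i$ is the effect of treatment $i$, and $e_{ij}$ is the random unit term with variance $\sigma^2$.

```latex
y_{ij} = \underbrace{\mu + \tau_i}_{\text{signal (fixed)}}
       + \underbrace{e_{ij}}_{\text{noise (random)}},
\qquad E(e_{ij}) = 0, \quad \operatorname{Var}(e_{ij}) = \sigma^2 .
```
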
Explain, with an example, what replication means in experimental design and why it is important.
Replication means that each treatment is used more than once in an experiment, for example by applying each treatment to four units instead of one. It is important because it allows us to estimate the inherent variability in the data, which lets us judge whether an observed difference could be due to chance variation.
List a more modern technique than Bonferroni and Tukey
False discovery rate (FDR) control, a method used in multiple hypothesis testing to correct for multiple comparisons.
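
The notes do not name a specific FDR procedure. A widely used one is the Benjamini-Hochberg step-up procedure, and the sketch below assumes that is what is meant. The function name and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    # Sort the p-values, remembering their original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * q / m.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_k = rank
    # Reject the hypotheses with the largest_k smallest p-values.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected

# Illustrative p-values (made up for this sketch, not from any experiment).
p = [0.001, 0.012, 0.039, 0.041, 0.20]
flags = benjamini_hochberg(p, q=0.05)
```

With these five p-values the procedure rejects the first two hypotheses, whereas a Bonferroni threshold of 0.05 / 5 = 0.01 would reject only the first.
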
Define the response variable
The response is the quantity that we measure in the experiment that we think will help answer our question of interest.
Define a treatment factor
Treatment factors are variables in the experiment, within the control of the experimenter, that we think could affect the response.
Define a blocking factor
A blocking factor is a variable which we think could affect the response but is peripheral to the question of interest. That is, we wish to control for the effects of the blocking factors so that we can see the effects, if any, of the treatments.
An experimental unit:
Experimental units are the objects to which the treatments are applied. Example: In this experiment, the treatments are applied to the tubes of blood, so the tubes are the experimental units.
[10 marks] A crop scientist wants to examine the effect on yield of four different barley varieties in three different soil types. The scientist only has two fields available. She would like to use the fields as blocks, but each field can only be ploughed into three strips. (a) [3 marks] Explain why a split-plot design is better than a completely randomised design (CRD) or a randomised complete block design (RCBD) for this experiment.
There are 4 × 3 = 12 treatments, so we require 12 experimental units for a single run. If we regard the strips as plots, then we would either have to carry out the experiment over multiple time periods or have more fields in order to use a CRD. In order to use an RCBD, regarding the strips as blocks, we would need twelve units per block, and we would end up carrying out 72 runs. A split-plot design lets us replicate our treatments twice using only 24 runs in a single time period, hence it is the most efficient use of resources.
Explain what λ (lambda) means for a BIBD (balanced incomplete block design)
Each pair of treatments occurs together in the same block λ times.
Complete the formula for a BIBD: t × r = ?
b × k (both sides count the total number of experimental units)
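
The counting identity above can be checked numerically. The sketch below assumes the usual notation (t treatments, r replicates of each treatment, b blocks, k units per block) and also uses the standard companion identity λ(t − 1) = r(k − 1), which is not stated in these notes. The function name is hypothetical.

```python
def bibd_parameters(t, k, r):
    """Derive b and lambda for a candidate BIBD, or None if they are not whole numbers."""
    # Counting units two ways: t * r = b * k.
    b, b_rem = divmod(t * r, k)
    # Counting pairs that contain a given treatment: lambda * (t - 1) = r * (k - 1).
    lam, lam_rem = divmod(r * (k - 1), t - 1)
    if b_rem or lam_rem:
        return None
    return {"b": b, "lambda": lam}

print(bibd_parameters(t=4, k=3, r=3))  # {'b': 4, 'lambda': 2}
print(bibd_parameters(t=5, k=3, r=4))  # None
```

These conditions are necessary but not sufficient: whole-number values of b and λ do not guarantee that a design with those parameters exists.
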
What are the two key principles of a BIBD with regard to treatments?
Each treatment is applied to the same number of units. Each pair of treatments occurs together in the same block equally often. These ensure that we can compare any pair of treatments with the same precision.
What is an important principle of a BIBD with regard to blocks?
The block size is smaller than the number of treatments.
What is the data model of a BIBD?
$y_{ij} = E(y_{ij}) + b_i + e_{ij}$, where $b_i$ is the effect of block $i$ and $e_{ij}$ is the effect of unit $j$ in block $i$.
How many strata are in the BIBD ANOVA table? Describe them.
Two: blocks, and units in blocks.
What is a key difference between a BIBD and an RCBD?
The assignment of treatments to units. For an RCBD the entire set of treatments is used in each block; for a BIBD only some of the treatments are used in a given block. Therefore, for each block we need to specify the subset of treatments applied to its units.
Describe the two parts of randomisation in a BIBD
1. Randomly assign which set of treatments is used for each block. 2. Randomly assign treatments to units within blocks.
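
The two randomisation steps can be sketched as follows. This is a minimal illustration; the function name, block labels and treatment sets are hypothetical.

```python
import random

def randomise_bibd(treatment_sets, blocks, seed=None):
    """Two-stage randomisation for an incomplete block design.

    treatment_sets: list of treatment subsets, one per block.
    blocks: list of block labels (same length as treatment_sets).
    Returns {block: [treatment for unit 1, treatment for unit 2, ...]}.
    """
    if len(treatment_sets) != len(blocks):
        raise ValueError("need one treatment set per block")
    rng = random.Random(seed)
    # Stage 1: randomly decide which set goes to which block.
    sets = [list(s) for s in treatment_sets]
    rng.shuffle(sets)
    # Stage 2: randomly order the treatments over the units within each block.
    layout = {}
    for block, subset in zip(blocks, sets):
        rng.shuffle(subset)
        layout[block] = subset
    return layout

# Hypothetical design: 4 treatments in 4 blocks of size 3.
sets = [("A", "B", "C"), ("A", "B", "D"), ("A", "C", "D"), ("B", "C", "D")]
layout = randomise_bibd(sets, blocks=[1, 2, 3, 4], seed=42)
```

The two stages use separate shuffles so that neither the set-to-block assignment nor the within-block order is chosen by the experimenter.
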
With regard to a BIBD, why do we want the efficiency factor e to be as large as possible?
This means more of the information is in the units-in-blocks stratum, which should have a smaller mean square than the blocks stratum.
In the BIBD ANOVA table, where is most of the information about treatments? Which stratum has the smallest E(mean square)?
Most of the information is in the units-in-blocks stratum. The units-in-blocks stratum also has the smallest expected mean square.
What R command do you use for a BIBD to look at estimated treatment effects?
dummy.coef
What is the key thing you have to remember when using LSD or TSR for a BIBD?
Use the effective replication, reps × e, rather than the number of reps alone.
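
As a rough sketch of how the effective replication enters the calculation: this assumes the standard BIBD efficiency factor e = t(k − 1) / (k(t − 1)) and the usual standard error of a difference based on the units-in-blocks residual mean square. Neither formula is stated in these notes, and the function names are hypothetical.

```python
import math

def efficiency_factor(t, k):
    """Efficiency factor of a BIBD with t treatments in blocks of size k."""
    return t * (k - 1) / (k * (t - 1))

def sed(residual_ms, r, e):
    """Standard error of a difference between two treatment means,
    using the effective replication r * e in place of r."""
    return math.sqrt(2 * residual_ms / (r * e))

# For t = 4 treatments in blocks of size k = 3, e = 8/9.
e = efficiency_factor(t=4, k=3)
```

The LSD is then the appropriate t quantile multiplied by this standard error, with degrees of freedom taken from the units-in-blocks residual.
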
In a BIBD, how do you calculate the number of all possible sets of a given size, what is the formula, and what do you use the number of sets for?
Compute t choose k, using the formula t! / (k!(t − k)!). This gives the number of blocks in a design that uses every possible set once.
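
The counting above can be checked by listing the sets directly. The sketch below builds the design that uses every possible set of k treatments once and confirms that it is balanced. The treatment labels and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb

def unreduced_design(treatments, k):
    """Blocks formed by taking every possible set of k treatments once."""
    return list(combinations(treatments, k))

treatments = ["A", "B", "C", "D"]
blocks = unreduced_design(treatments, k=3)

# The number of blocks matches t choose k = t! / (k! (t - k)!).
assert len(blocks) == comb(len(treatments), 3)

# Balance check: count how often each treatment and each pair of treatments appears.
replication = Counter(trt for block in blocks for trt in block)
pair_counts = Counter(pair for block in blocks for pair in combinations(block, 2))
```

Here every treatment appears in three blocks and every pair of treatments appears together in two blocks, so both balance conditions hold.
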
With a BIBD, what are you trying to do with the blocks with regard to within-block variability?
Make it as small as possible.
In a BIBD, each treatment occurs the same number of times. What else occurs the same number of times with regard to the blocks?
Each pair of treatments occurs together in the same block the same number of times.
Between which strata is the information about treatments split in a BIBD?
Between the blocks stratum and the blocks:units stratum. Balance ensures that the split is the same for all treatment comparisons (contrasts).
What is the key idea of the RCBD and why do we do this?
Group experimental units into subsets such that there is less variability between members of the same subset than between members of different subsets. We do this to increase the precision of between-treatment comparisons.
How many blocking factors do we need for an RCBD to describe the blocking structure?
We need two: blocks, to identify which block each unit is in, and units, to distinguish between units within blocks.
How do you describe the block structure for an RCBD?
blocks/units: units are nested in blocks. The units within each block are assumed to be homogeneous.
What are the three types of variability we need to consider with regard to the experimental units in an RCBD?
1. The variance of an individual unit. 2. The covariance of two units from the same block. 3. The covariance of two units from different blocks.
What is the variance of an individual unit? (Also worded as the between-block variability.)
N × block + unit, where block and unit are the block and unit variance components. This is the residual mean square in the top stratum.
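What is the covariance Cov(X, Y)?
$\mathrm{Cov}(X, Y) = E\{(X - \mu_X)(Y - \mu_Y)\}$, where $\mu_X$ and $\mu_Y$ are the means of X and Y.
What is the covariance Cov(X, X)? (Also worded as the estimate between units within a block.)
Cov(X, X) = Var(X). This is the unit variance component, estimated by the residual mean square on the bottom line of the ANOVA table.
What is the correlation between two units from the same block?
block / (unit + block), where block and unit are the estimated block and unit variance components.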
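
The correlation above can be computed from the two residual mean squares. The sketch assumes the standard relationship E(blocks mean square) = unit + (units per block) × block, and truncates a negative block estimate at zero, which is a common convention rather than something stated in these notes. The function names and mean squares are made up for illustration.

```python
def variance_components(ms_blocks, ms_units, units_per_block):
    """Estimate block and unit variance components from the two residual mean squares."""
    unit_var = ms_units
    # E(MS blocks) = unit_var + units_per_block * block_var, so solve for block_var.
    block_var = max((ms_blocks - ms_units) / units_per_block, 0.0)
    return block_var, unit_var

def within_block_correlation(block_var, unit_var):
    """Correlation between two units in the same block."""
    return block_var / (block_var + unit_var)

# Illustrative mean squares (made up for this sketch, not from any experiment).
block_var, unit_var = variance_components(ms_blocks=14.0, ms_units=2.0, units_per_block=4)
rho = within_block_correlation(block_var, unit_var)
```

With these illustrative values the block component is 3.0, the unit component is 2.0, and the within-block correlation is 0.6.
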
Explain the yield identity