## Normal Distribution

This may well be the same as a page entitled t and Normal Distribution.

The CIE Syllabus says:

(i) formulate hypotheses and apply a hypothesis test concerning the population mean using a small sample drawn from a normal population of unknown variance, using a t-test

(ii) calculate a pooled estimate of a population variance from two samples (and do calculations too)

(iii) formulate hypotheses  concerning the difference of population means, and apply as appropriate: a) a  2-sample (pooled) t-test,  b) a paired sample t-test c) a test using a Normal distribution  - including choosing the right test.

(iv) determine a confidence interval for a population mean based on a small sample drawn from a normal population of unknown variance, using a t-distribution

(v) determine a confidence interval for a difference of population means using a t-distribution or a Normal distribution as appropriate.

We have established that the means of samples from a distribution are clustered as noting the use of the ‘hat’ on the sigma. Our two assumptions are that the sample is random and that the variable is normally distributed (the parent population is assumed to be Normal) and that if we look at two samples then the items could be paired or pooled.

Paired test: We have a set of n differences for X in two conditions. Call this variable U.
Then the sample mean
can be found, as can the sample variance S². We can estimate σ̂ ²

using the relationship  nS² = (n-1)  σ̂ ² .

We recognize that the degrees of freedom is one less than n, so provided that U is distributed normally in the population then Unpaired test: We have two sets of data, Xand X  taken from a parent population assumed to be Normally distributed. The samples sizes are n and n, the sample means are X̅₁ and  X̅₂ while the sample variances are    S₁² and S₂². Then and adding variances because X² is multiplied by -1.

But the difference of the means should be zero, so the standardised distribution is but we use the t-distribution because the samples are relatively small, with the number of degrees of freedom n+n-2. We must be careful with our alternative hypothesis to distinguish between the one and two tail tests; similarly we must be aware which figures our data tables give, one or two tailed. A diagram helps dramatically to show that if the test statistic is bigger than the tabulated figure, we will reject H0, the null hypothesis.

At all times you are assuming  that your samples are unbiased and that the underlying population is Normally distributed.

1  The heights of a random sample of Gaoxin policemen were measured as 176, 179, 179, 180, 181, 183. The heights of a similar sample of eleven in Xi’an city gave Σy= 1991, Σ(y²)= 54, where all heights are measured to the nearest centimeter. Test, at the 5% level, whether the city policemen are shorter than those in Gaoxin. Assume that both forces have a normal distribution of height and that they have a common population variance.

2  A large number of tomato plants are grown under similar controlled conditions. A new fertilizer is applied to half the plants chosen randomly and the others are treated with the standard fertilizer. Random samples of a hundred are chosen from each half and records are kept of the weight of tomato collected from each plant. Both samples can be treated as being large samples from a normal distribution with a common variance. The standard treatment produces Σx= 990, Σx²=10079. The new treatment produces Σy= 1030, Σy²=11046.
a. Obtain a 96% symmetric confidence interval for the mean mass of the ‘new’ crop.
b. Calculate a pooled estimate of the variance
c. Assuming that the new fertilizer cannot be any worse than the old one, and stating your hypotheses clearly, test at the 3% level the suggestion that the new fertilizer implies a greater crop mass.

3  There is a theory that the foot size is connected to intelligence, on the grounds that both are indicators of good genes and feeding. Sampling first year undergraduates at Oxford University by measuring feet to the nearest cm, I obtained these figures:

Length     24, 25, 26, 27, 28, 29, 30

Frequency 1,  2,   3,   9,   6,   5,   1

a.  Find the sample mean and the unbiased estimate of the population variance.

b.  Calculate a 96% confidence interval for the mean length of foot for such people.

A collection of 48 similar students from Oxford Brookes University is also measured. We can assume that these students are of medium intelligence.  The mean foot length is 26.6cm and Σ(y²)= 123.20. If we assume that the two samples are drawn at random from independent normal distributions with a common variance, then

c.  obtain an unbiased two-sample estimate of this common variance

d.  test the genetic theory at the 1% significance level, clearly stating your hypotheses. Make a comment on the theory in the light of your test, including discussing how your conclusion changes with significance level.

4  A machine is used to fill bags with powdered ink and a random sample of twenty bags taken from a week’s production gave a mean mass of 499.1g and a std dev (r.m.s.d.) of 0.63g. A week later a sample of 25 bags gave 500.2g and 0.48g. Stating the conditions you require as assumptions, test

a.  whether the mean has changed significantly during the week, and, both at the 5% level,

b.  whether the mean in the second week could be 500g.

Editing this page in 2017, I discovered that Σ(y²)= appeared consistently as Σ(y-)=.  I do not understand, but I guessed that I had intended the index rather than a mean, y̅, which I type as y followed by opt U+0305. Since discovery, I show index of two unsing the special character ², U+00B2. I am disappointed that ¹²³ are positioned differently from ⁴⁵⁶⁷⁸⁹⁰ as shown ⁰¹²³⁴⁵⁶⁷⁸⁹. There is, however, consistency with subscripts, ₀₁₂₃₄₅₆₇₈₉.

1.   Gaoxin: mean 179.67, S²=4.556    Xian: 181, 5.422. test t is -1.13.   Accept H0.

2.   4.41, (9.87, 10.73), 3.61, z = 1.49.

3.   Mean,     Variance 2.00.  (26.77, 27.39) z= 1.97 Accept H0.

4.    |t| = 6.496, Yes; t= 2.041, Yes