Re-testing the hypothesis about the independence of logarithmic returns. Testing simple hypotheses using the Pearson chi-square test in MS EXCEL Methods for testing a sample for normality

Let us look at how to apply the Pearson chi-square test in MS EXCEL to test simple hypotheses.

After experimental data (a sample) have been obtained, one must choose the distribution law that best describes the random variable represented by that sample. How well the experimental data are described by the chosen theoretical distribution is checked with goodness-of-fit tests. The null hypothesis is usually that the distribution of the random variable equals a certain theoretical law.

First we look at applying the Pearson X² (chi-square) goodness-of-fit test to simple hypotheses (the parameters of the theoretical distribution are assumed to be known). Then we look at the case where only the form of the distribution is specified, and both the parameters of that distribution and the value of the X² statistic are estimated/calculated from the same sample.

Note: In the English-language literature, the procedure of applying Pearson's X² goodness-of-fit test is called the chi-square goodness of fit test.

Let us recall the procedure for testing hypotheses:

  • from the sample, the value of a statistic is calculated that matches the type of hypothesis being tested. For example, the t-statistic is used when the standard deviation is unknown;
  • provided the null hypothesis is true, the distribution of this statistic is known and can be used to calculate probabilities (for example, for the t-statistic it is Student's t-distribution);
  • the value of the statistic calculated from the sample is compared with the critical value for the given significance level;
  • the null hypothesis is rejected if the value of the statistic exceeds the critical value (or if the probability of obtaining this value of the statistic, the p-value, is less than the significance level, which is an equivalent approach).

Let us test hypotheses for different distributions.

Discrete case

Suppose two people are playing dice. Each has their own set of dice. The players take turns rolling 3 dice at a time. Each round is won by whoever rolls more sixes at once. The results are recorded. After 100 rounds, one of the players suspects that his opponent's dice are asymmetric, because the opponent wins often (often rolls sixes). He decides to analyze how likely such an outcome is.

Note: Since there are 3 dice, one can roll 0, 1, 2 or 3 sixes at a time, i.e. the random variable can take 4 values.

From probability theory we know that if the dice are symmetric, the number of sixes rolled follows the binomial distribution. Therefore, after 100 rounds, the expected frequencies of the number of sixes can be calculated with the formula
=BINOM.DIST(A7,3,1/6,FALSE)*100

The formula assumes that cell A7 contains the corresponding number of sixes rolled in one round.
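The same expected frequencies can be reproduced outside Excel. The following is a minimal Python sketch of the binomial calculation above; the function name `expected_six_counts` is an illustrative assumption, not something from the example file.

```python
from math import comb

def expected_six_counts(rounds=100, dice=3, p_six=1 / 6):
    """Expected number of rounds showing k sixes (k = 0..dice) for fair dice."""
    return [
        rounds * comb(dice, k) * p_six**k * (1 - p_six) ** (dice - k)
        for k in range(dice + 1)
    ]

expected = expected_six_counts()
for k, e in enumerate(expected):
    print(f"{k} sixes: {e:.2f} rounds expected")
```

Each list element plays the role of one cell filled by the BINOM.DIST formula, with `k` standing in for the value in A7.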

Note: The calculations are given in the example file on the Discrete sheet.

A histogram is a convenient way to compare the observed (Observed) and theoretical (Expected) frequencies.

If the observed frequencies deviate significantly from the theoretical distribution, the null hypothesis that the random variable follows the theoretical law should be rejected. That is, if the opponent's dice are asymmetric, the observed frequencies will "differ significantly" from the binomial distribution.

In our case, the frequencies look fairly close at first glance, and without calculations it is hard to draw an unambiguous conclusion. We apply the Pearson X² goodness-of-fit test so that the subjective statement "differ significantly", which could be made by comparing histograms, is replaced by a mathematically correct statement.

We use the fact that, by the law of large numbers, the observed frequency (Observed) tends, as the sample size n grows, to the probability given by the theoretical law (in our case, the binomial law). In our case the sample size n is 100.

We introduce the test statistic, denoted X²:

$$X^2 = \sum_{l=1}^{L} \frac{(O_l - E_l)^2}{E_l}$$

where O_l is the observed frequency of the event that the random variable took a given admissible value, and E_l is the corresponding theoretical frequency (Expected). L is the number of values the random variable can take (in our case it is 4).

As the formula shows, this statistic measures how close the observed frequencies are to the theoretical ones, i.e. it estimates the "distance" between these frequencies. If the sum of these "distances" is "too large", the frequencies "differ significantly". Clearly, if our dice are symmetric (i.e. the binomial law applies), the probability that the sum of "distances" is "too large" will be small. To calculate this probability, we need to know the distribution of the X² statistic (the X² statistic is calculated from a random sample, so it is itself a random variable and has its own probability distribution).
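As a rough illustration of the "sum of distances", here is a small Python sketch of the X² statistic. The observed counts below are hypothetical and chosen only for illustration (the article's own observed data are in its example file); the expected counts are the rounded binomial frequencies for 100 rounds of 3 fair dice.

```python
def chi_square_statistic(observed, expected):
    """Pearson X^2: sum over categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical observed counts of 0, 1, 2, 3 sixes (NOT the article's data).
observed = [48, 35, 15, 2]
# Rounded expected counts for fair dice over 100 rounds.
expected = [57.87, 34.72, 6.94, 0.46]

x2 = chi_square_statistic(observed, expected)
print(f"X^2 = {x2:.3f} with {len(observed) - 1} degrees of freedom")
```

Categories with a small expected frequency contribute disproportionately to the sum, which is one reason small frequencies are usually merged before the test is applied.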

From the multidimensional analogue of the de Moivre–Laplace integral theorem it is known that, as n→∞, our random variable X² is asymptotically chi-square distributed with L - 1 degrees of freedom.

So, if the calculated value of the X² statistic (the sum of "distances" between frequencies) is greater than a certain limit value, we have grounds to reject the null hypothesis. As when testing parametric hypotheses, the limit value is set via the significance level. If the probability that the X² statistic takes a value greater than or equal to the calculated one (the p-value) is less than the significance level, the null hypothesis can be rejected.

In our case, the value of the statistic is 22.757. The probability that the X² statistic takes a value greater than or equal to 22.757 is very small (0.000045) and can be calculated with the formulas
=CHISQ.DIST.RT(22.757,4-1) or
=CHISQ.TEST(Observed,Expected)

Note: The CHISQ.TEST() function was created specifically to test the relationship between two categorical variables.

The probability 0.000045 is much smaller than the usual significance level of 0.05. So the player has every reason to suspect his opponent of dishonesty (the null hypothesis of his honesty is rejected).
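The p-value can also be checked without Excel. The sketch below is one possible way to do it in plain Python, using the closed-form recurrence for the upper tail of the chi-square distribution with an integer number of degrees of freedom; the function name `chi2_sf` is an assumption made for this example.

```python
import math

def chi2_sf(x, df):
    """P(X^2 >= x) for a chi-square distribution with integer df >= 1."""
    y = x / 2
    if df % 2 == 0:
        prob, a = math.exp(-y), 1.0              # upper tail for df = 2
    else:
        prob, a = math.erfc(math.sqrt(y)), 0.5   # upper tail for df = 1
    # Recurrence Q(a + 1, y) = Q(a, y) + y^a * e^(-y) / Gamma(a + 1)
    while a < df / 2:
        prob += y**a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return prob

p_value = chi2_sf(22.757, 4 - 1)
print(f"p-value = {p_value:.6f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

For the statistic 22.757 with 3 degrees of freedom this gives a probability of about 0.000045, in line with the CHISQ.DIST.RT result quoted above.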

When applying the X² goodness-of-fit test, make sure that the sample size n is large enough; otherwise approximating the distribution of the X² statistic by the chi-square distribution is not justified. It is usually considered sufficient for the observed frequencies (Observed) to be greater than 5. If this is not the case, small frequencies are merged into one or added to other frequencies, and the merged value is assigned the total probability; the number of degrees of freedom of the X² distribution decreases accordingly.

To improve the quality of the X² goodness-of-fit test (its power), the partition intervals should be made narrower (increasing L and, accordingly, the number of degrees of freedom); however, this is limited by the number of observations that fall into each interval (it should be > 5).

Continuous case

The Pearson X² goodness-of-fit test can be applied in the same way to the continuous case.

Consider a sample consisting of 200 values. The null hypothesis states that the sample is drawn from the standard normal distribution.

Note: The random values in the example file on the Continuous sheet are generated with the formula =NORM.S.INV(RAND()). Therefore new sample values are generated every time the sheet is recalculated.

Whether this data set looks normally distributed can be assessed visually.

As the charts show, the sample values fit the straight line quite well. However, as in the discrete case, we apply the Pearson X² goodness-of-fit test to test the hypothesis.

To do this, we divide the range of the random variable into intervals with a step of 0.5 and calculate the observed and theoretical frequencies. The observed frequencies are calculated with the FREQUENCY() function, and the theoretical ones with the NORM.S.DIST() function.
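The binning step can be sketched in Python as follows. The interval edges from -2.0 to 2.0 (giving 10 intervals including the two open tails) are an assumption, since the article does not list them; the sample is simulated with a fixed seed purely for illustration, and `binned_frequencies` is a hypothetical helper name.

```python
import random
from bisect import bisect_left
from statistics import NormalDist

def binned_frequencies(sample, edges):
    """Observed and expected counts per interval under N(0, 1).

    Intervals are (-inf, e0], (e0, e1], ..., (e_last, +inf), which mirrors
    how Excel's FREQUENCY() assigns values to bins.
    """
    n = len(sample)
    observed = [0] * (len(edges) + 1)
    for x in sample:
        observed[bisect_left(edges, x)] += 1
    std_normal = NormalDist()
    cdf = [0.0] + [std_normal.cdf(e) for e in edges] + [1.0]
    expected = [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]
    return observed, expected

edges = [-2.0 + 0.5 * i for i in range(9)]        # assumed edges: -2.0 .. 2.0
rng = random.Random(0)                             # fixed seed, illustration only
sample = [rng.gauss(0, 1) for _ in range(200)]     # stand-in for NORM.S.INV(RAND())
observed, expected = binned_frequencies(sample, edges)
for o, e in zip(observed, expected):
    print(f"observed {o:3d}   expected {e:6.2f}")
```

With these assumed edges, each of the two tail intervals expects only about 4.5 values out of 200, so in practice they might be merged with their neighbours to respect the "more than 5" rule.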

Note: As in the discrete case, make sure that the sample is large enough and that more than 5 values fall into each interval.

We calculate the X² statistic and compare it with the critical value for the given significance level (0.05). Since we divided the range of the random variable into 10 intervals, the number of degrees of freedom is 9. The critical value can be calculated with the formula
=CHISQ.INV.RT(0.05,9) or
=CHISQ.INV(1-0.05,9)
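Where no chi-square quantile function is at hand, the critical value can be approximated. The sketch below uses the Wilson–Hilferty approximation, which is a swapped-in technique and not what Excel's CHISQ.INV does internally; the function name is an assumption.

```python
from statistics import NormalDist

def chi2_critical_approx(alpha, df):
    """Approximate upper-tail chi-square critical value (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1 - alpha)   # standard normal quantile
    c = 2 / (9 * df)
    return df * (1 - c + z * c**0.5) ** 3

critical = chi2_critical_approx(0.05, 9)
print(f"approximate critical value: {critical:.2f}")
```

For 9 degrees of freedom at the 0.05 level the approximation lands close to the tabulated 16.92; for exact work, a table or a proper quantile function is preferable.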

The chart above shows that the value of the statistic is 8.19, which is well below the critical value, so the null hypothesis is not rejected.

Below is a chart in which the sample happened to take an unlikely value and, based on the Pearson X² goodness-of-fit test, the null hypothesis was rejected (even though the random values were generated with the formula =NORM.S.INV(RAND()), which yields a sample from the standard normal distribution).

The null hypothesis is rejected, although visually the data lie fairly close to a straight line.

As another example, let us take a sample from U(-3; 3). In this case it is obvious even from the chart that the null hypothesis should be rejected.

The Pearson X² goodness-of-fit test also confirms that the null hypothesis should be rejected.

The criterion is applied in two cases:

1) to compare an empirical distribution with a theoretical one (uniform, normal or some other);

2) to compare two empirical distributions of the same attribute.

The criterion answers the question of whether different values of the attribute occur with equal frequency in the empirical and theoretical distributions, or in two empirical distributions.

The attribute can be measured on any scale, even a nominal one.

Limitations:

2) the theoretical frequency for each cell of the table must not be less than 5: f ≥ 5. This means that if the number of categories is fixed in advance and cannot be changed, the χ² method cannot be applied without accumulating a certain minimum number of observations. Thus, if the number of categories (k) is fixed in advance, the minimum number of observations (n_min) is given by the formula n_min = 5k;

3) the selected categories must "exhaust" the entire distribution, i.e. cover the whole range of variation of the attribute. The grouping into categories must be the same in all distributions being compared;

4) a continuity correction must be made when comparing distributions of attributes that take only 2 values. When the correction is applied, the value of χ² decreases;

5) the categories must not overlap: once an observation has been assigned to one category, it can no longer be assigned to any other category.

Calculating the criterion:

1) Comparing an empirical distribution with the theoretical uniform distribution. It is convenient to carry out the calculations using Table 34.

Table 34

Category | f_ej | f_t | f_ej - f_t | (f_ej - f_t)² | (f_ej - f_t)² / f_t

Here, column 1 gives the names of the categories,

column 2 gives the empirical frequency f_ej for each category, where j runs from 1 to k,

column 3 gives the theoretical frequency, which is the same for every category and is calculated by the formula f_t = n/k,

column 4 gives the difference between the empirical and theoretical frequencies for each category,

column 5 gives the value of column 4 squared, for each category,

column 6 gives the value of column 5 divided by the theoretical frequency, for each category.

If χ² > χ²_0.01, the empirical distribution differs from the uniform one; if χ² ≤ χ²_0.05, the empirical distribution does not differ from the uniform one; if χ²_0.05 < χ² ≤ χ²_0.01, the difference between the empirical distribution and the uniform one is significant at the 5% level.

Table 35

Distribution of students by the cognitive style "differentiality-integrality" and calculation of the χ² criterion

Example. Early adolescents (60 people aged 13-14) were assessed for the cognitive style "differentiality-integrality" using the method of G.A. Berulava. Each style has three strategies: theoretical, activity-based, emotional. The distribution of students by style is presented in Table 35. Can it be said that all these styles are represented equally in this group of students?

Solution: n = 60.

Let us formulate the experimental hypothesis: the distribution of students across the "differentiality-integrality" styles with three strategies differs from the uniform one.

k = 6, so f_t = 60/6 = 10.

For ν = k - 1 = 6 - 1 = 5

χ²_0.05 = 11.070, χ²_0.01 = 15.086

χ² > χ²_0.01, so the experimental hypothesis is accepted.

Answer: the distribution of students across the "differentiality-integrality" styles with three strategies differs from the uniform one.

2) Comparing two empirical distributions:

The calculations are again carried out using a table (Table 36).

Table 36

Category | f_e1j | f_e2j | f_e1j + f_e2j | f_t1j | f_t2j | (f_e1j - f_t1j)² / f_t1j | (f_e2j - f_t2j)² / f_t2j

Here, column 1 records the names of the categories,

column 2 records the corresponding frequencies of the first empirical distribution (f_e1j), where j runs from 1 to k,

column 3 records the corresponding frequencies of the second empirical distribution (f_e2j),

column 4 gives the sum of the empirical frequencies of the first and second distributions for each category (f_e1j + f_e2j),

column 7 gives the squared difference between the empirical frequency of the first distribution and its theoretical frequency for each category, divided by that theoretical frequency ((f_e1j - f_t1j)² / f_t1j),

column 8 gives the squared difference between the empirical frequency of the second distribution and its theoretical frequency for each category, divided by that theoretical frequency ((f_e2j - f_t2j)² / f_t2j).

The value of the criterion is the sum of all the values in columns 7 and 8, i.e.

$$\chi^2 = \sum_{j=1}^{k} \frac{(f_{e1j} - f_{t1j})^2}{f_{t1j}} + \sum_{j=1}^{k} \frac{(f_{e2j} - f_{t2j})^2}{f_{t2j}}$$

If χ² > χ²_0.01, one empirical distribution differs from the other; if χ² ≤ χ²_0.05, the first empirical distribution does not differ from the second; if χ²_0.05 < χ² ≤ χ²_0.01, the difference between the two empirical distributions is significant at the 5% level.
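The two-sample calculation can be sketched in Python as below. The text does not spell out how the theoretical frequencies in columns 5 and 6 are obtained; this sketch assumes the usual proportional allocation, f_t1j = (f_e1j + f_e2j)·n_1/(n_1 + n_2) and likewise for the second sample. The function name and the data are hypothetical.

```python
def two_sample_chi_square(f1, f2):
    """Chi-square statistic for two empirical frequency distributions.

    Assumes theoretical frequencies are the category totals split in
    proportion to the two sample sizes. Returns (chi2, degrees_of_freedom).
    """
    if len(f1) != len(f2):
        raise ValueError("both distributions need the same categories")
    n1, n2 = sum(f1), sum(f2)
    chi2 = 0.0
    for a, b in zip(f1, f2):
        total = a + b                      # column 4
        t1 = total * n1 / (n1 + n2)        # column 5 (assumed formula)
        t2 = total * n2 / (n1 + n2)        # column 6 (assumed formula)
        chi2 += (a - t1) ** 2 / t1 + (b - t2) ** 2 / t2   # columns 7 and 8
    return chi2, len(f1) - 1

# Hypothetical frequencies for two groups over three categories.
chi2, df = two_sample_chi_square([10, 20, 30], [25, 20, 15])
print(f"chi-square = {chi2:.3f}, degrees of freedom = {df}")
```

The resulting value is then compared with χ²_0.05 and χ²_0.01 for the given number of degrees of freedom, following the three-way rule stated above.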

Example. Among early adolescents from a general school (25 people) and residents of an orphanage (25 people), features of the self-image were studied using the method "How I see myself". The results yielded 7 categories of statements. The data are presented in Table 36. Do the distributions of statements across categories differ between the orphanage children and the general-school children?

Solution: n_1 = 88 (the number of statements made by the general-school children), n_2 = 111 (the number of statements made by the orphanage children). n_1, n_2 > 30, so the χ² criterion is applicable.

Let us formulate the experimental hypothesis: the distributions of statements made by the orphanage children and by the general-school children across the categories differ from each other.

The empirical value of the criterion is calculated in Table 37.

Table 37

Number of self-descriptive statements by orphanage and general-school children and calculation of the χ² criterion

Statement category no. | f_1 | f_2 | f_1 + f_2 | f_t1 | f_t2 | (f_1 - f_t1)² / f_t1 | (f_2 - f_t2)² / f_t2
1 | | | | 13.27 | 16.73 | 0.81 | 0.53
2 | | | | 19.45 | 24.54 | 0.33 | 0.26
3 | | | | 8.84 | 11.15 | 1.67 | 1.33
4 | | | | 10.17 | 12.83 | 8.27 | 6.55
5 | | | | 12.38 | 15.62 | 4.69 | 3.72
6 | | | | 15.48 | 19.52 | 0.01 | 0.01
7 | | | | 8.4 | 10.59 | 5.19 | 4.1

Categories: 1) formal biographical information; 2) relationships with other people; 3) attitude to one's age, maturity, independence; 4) skills, interests, abilities, intellect; 5) behavior; 6) personality traits; 7) appearance, standing among peers in terms of status.

χ²_em = 0.81 + 0.33 + 1.67 + 8.27 + 4.69 + 0.01 + 5.19 + 0.53 + 0.26 + 1.33 + 6.55 + 3.72 + 0.01 + 4.1 = 37.47;

The number of degrees of freedom is ν = 7 - 1 = 6.

For ν = 6, χ²_0.01 = 16.812; χ²_0.05 = 12.592.

χ²_em > χ²_0.01, so the experimental hypothesis is accepted.

Answer: the distributions of statements across categories differ between the orphanage children and the general-school children.

The continuity correction is applied only if ν = 1. The formula then looks like this:

$$\chi^2 = \sum_{j} \frac{\left(|f_{ej} - f_{tj}| - 0.5\right)^2}{f_{tj}}$$

Example. First-year students of a pedagogical university (faculties of physics and mathematics, biology, philology) were assessed for the cognitive style "field dependence-field independence" using Gottschaldt's "Hidden Figures" method. The results of the study are presented in Table 37. Do young men and women differ in their affiliation with these styles?

Solution: n_1 = 49 (number of young men), n_2 = 53 (number of young women), n_1, n_2 > 30, so the χ² criterion is applicable.

Let us formulate the experimental hypothesis: male and female students differ in their affiliation with the cognitive style "field dependence-field independence".

The empirical value of the criterion is calculated in Table 38.

Table 38

Distribution of young women and young men by their affiliation with the "field dependence-field independence" style and calculation of the value of the χ² criterion

k = 2, hence ν = 1.

For this ν, χ²_0.01 = 6.635; χ²_0.05 = 3.841.

χ²_em > χ²_0.01 ⇒ the experimental hypothesis is accepted.

Answer: young men and young women differ in their affiliation with the cognitive style "field dependence-field independence".

Let H_0 be the hypothesis that F(x) = F_0(x); the alternative hypothesis is H_1: F(x) ≠ F_0(x). In the Pearson criterion, the test statistic is the random variable χ², whose empirical value is given by the formula

$$\chi^2_{emp} = \sum_{i=1}^{k} \frac{(m_i - n p_i)^2}{n p_i}$$

where k is the number of intervals into which the range of values of the random variable X is divided; m_i is the frequency of the i-th interval; p_i is the probability that X falls into the i-th interval, calculated for the theoretical distribution law.

As n → ∞, this random variable tends to the χ² distribution with l = k - r - 1 degrees of freedom, where k is the number of intervals and r is the number of parameters of the theoretical distribution estimated from the experimental data.

The requirement n → ∞ is essential. In practice it is considered sufficient that n ≥ 50 and that the number of observations in each interval, m_i, is not less than 5. If m_i < 5 in some interval, it makes sense to merge adjacent intervals.

Let us set out the algorithm for applying the χ² criterion.

1. Find the value χ²_emp.

2. For the chosen significance level α, find from Appendix VI the critical value χ²_cr, where l = k - r - 1.

3. If χ²_emp ≤ χ²_cr, the hypothesis H_0 is accepted, i.e. the theoretical and empirical distribution laws can be considered to agree; if χ²_emp > χ²_cr, the hypothesis H_0 is rejected.

Example 29.2. When sowing flax, an important indicator is the depth at which the seeds are placed. To evaluate it, 100 measurements were taken. The results are shown in Table 29.3.

Table 29.3.

Using the χ² criterion, test the hypothesis H_0 that the random variable X, the seed placement depth, is normally distributed, at significance level α = 0.01.

Solution. Find the sample mean and S from the sample data.

Since the extreme intervals have m_i < 5, we merge them.

Table 29.4.

1. Find the probabilities p_i that X falls into the i-th interval using the formula

The values are found from Table II of the appendices.

Check: .

Calculated value:

2. l = k - r - 1 = 5 - 2 - 1 = 2. From Table II we find χ²_cr = 9.21.

3. Since χ²_emp < χ²_cr, there are no grounds to reject the hypothesis H_0 that X is normally distributed.

§ 30. Testing hypotheses about the homogeneity of samples (non-parametric criteria).

Let there be two independent samples drawn from general populations whose distribution laws are unknown. The hypothesis H_0: F_1(x) = F_2(x) is tested, where F_1(x) and F_2(x) are unknown distribution functions. The alternative hypothesis is H_1: F_1(x) ≠ F_2(x).

Kolmogorov–Smirnov criterion. This criterion is applied when the functions F_1(x) and F_2(x) can be assumed to be continuous.

The test statistic is the value

$$D = \sqrt{\frac{n_1 n_2}{n_1 + n_2}} \; \max_x \left| F_{1,E}(x) - F_{2,E}(x) \right|$$

where n_1, n_2 are the sizes of the first and second samples, and F_{1,E}(x), F_{2,E}(x) are the empirical distribution functions of the first and second samples.

If the hypothesis H_0 is true, then for large samples (n_1 ≥ 50, n_2 ≥ 50) the distribution of D converges to the Kolmogorov distribution (Table VII of the appendices). For small samples, special tables are used to find D_cr.

The hypothesis H_0 is tested as follows. If
D > D_cr, the hypothesis is rejected; otherwise it is accepted.
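A minimal Python sketch of the two-sample statistic follows. It assumes the scaled form, in which the largest gap between the two empirical distribution functions is multiplied by sqrt(n_1·n_2/(n_1 + n_2)) so that it can be compared with a Kolmogorov quantile such as 1.36 at the 0.05 level. The function name and the data are hypothetical.

```python
from bisect import bisect_right
from math import sqrt

def ks_two_sample(x, y):
    """Scaled two-sample Kolmogorov-Smirnov statistic."""
    xs, ys = sorted(x), sorted(y)
    n1, n2 = len(xs), len(ys)
    # Both empirical CDFs are step functions, so the largest gap is
    # attained at one of the observed values.
    d_max = max(
        abs(bisect_right(xs, v) / n1 - bisect_right(ys, v) / n2)
        for v in xs + ys
    )
    return sqrt(n1 * n2 / (n1 + n2)) * d_max

# Hypothetical small samples, for illustration only.
stat = ks_two_sample([1.2, 2.3, 2.9, 3.1, 4.8], [0.7, 1.9, 2.2, 3.5, 3.9, 5.0])
print(f"statistic = {stat:.3f}")
```

Samples this small are outside the n ≥ 50 range in which the limiting Kolmogorov distribution applies; they are used here only to keep the example short.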

Example 30.1. To study the effect of a certain drug on the growth of piglets, an experiment was carried out, the results of which are shown in Table 30.1.

Table 30.1.

At the same time, piglets from the control group were raised without the drug (Table 30.2).

Table 30.2.

Test the hypothesis H_0, at significance level α = 0.05, that both samples are described by the same distribution function, i.e. that the drug has no substantial effect on the growth of the piglets.

Solution. The calculations are arranged in a table, taking into account that
n_1 = 100, n_2 = 200.

Table 30.3.

Using Table VII of the appendices, we find

D_cr = D_{1-α} = D_0.95 ≈ K_0.95 = 1.36.

Since the calculated statistic does not exceed D_cr, the hypothesis H_0 should be accepted, i.e. the drug has no substantial effect on the growth of the piglets.

When the samples are small, it is convenient to use the Wilcoxon–Mann–Whitney test.

Let us formulate the rule for applying it (n_1 ≤ 25, n_2 ≤ 25). To test the hypothesis H_0: F_1(x) = F_2(x) against the alternative hypothesis H_1: F_1(x) ≠ F_2(x), proceed as follows:

1. Combine the two samples into one, arrange the values in ascending order, and calculate W, the sum of the ordinal numbers (ranks) of the values from the smaller sample.

2. Find from Table VIII of the appendices w_lower.cr = w(α/2, n_1, n_2) and w_upper.cr = (n_1 + n_2 + 1) n_1 - w_lower.cr.

If w_lower.cr < W < w_upper.cr, there are no grounds to reject the hypothesis; otherwise the hypothesis H_0 is rejected.

Remark 30.1. If several values coincide, each of them is assigned a rank equal to the arithmetic mean of the ordinal numbers of the coinciding values.

Remark 30.2. The Wilcoxon test can also be used for larger samples. In that case the way w_lower.cr and w_upper.cr are determined changes.

Example 30.2. To assess wages (in conventional units) at two enterprises, two samples were taken, of sizes n_1 = 8 and n_2 = 9:

Enterprise I: 330, 390, 400, 410, 420, 450, 460, 470

Enterprise II: 340, 400, 410, 420, 430, 440, 460, 480, 490

Using the Wilcoxon–Mann–Whitney test, test the null hypothesis H_0 that wages are the same at the two enterprises against the hypothesis H_1 that wages differ (α = 0.05).

Solution. Let us form the combined ordered series and assign ranks:

330; 340; 390; 400; 400; 410; 410; 420; 420; 430; 440; 450; 460; 460; 470; 480; 490

1; 2; 3; 4.5; 4.5; 6.5; 6.5; 8.5; 8.5; 10; 11; 12; 13.5; 13.5; 15; 16; 17

To apply the test, we take the sample with the smaller size, n_1 = 8.

To find W, we determine the ranks of the values of the smaller sample and sum them:

W = 1 + 3 + 4.5 + 6.5 + 8.5 + 12 + 13.5 + 15 = 64.

We find w_lower.cr = w(0.025; 8; 9) = 51.

We find w_upper.cr = (n_1 + n_2 + 1) n_1 - w_lower.cr = (8 + 9 + 1)·8 - 51 = 93.

Since w_lower.cr < W < w_upper.cr (51 < 64 < 93), there are no grounds to reject the hypothesis H_0, i.e. wages at enterprises I and II differ insignificantly.
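The ranking with tied values is easy to get wrong by hand, so here is a short Python sketch that recomputes W for the data of Example 30.2. The lower critical value 51 is taken from the example (Table VIII); the function name `rank_sum` is an assumption.

```python
def rank_sum(smaller, larger):
    """Sum of ranks of `smaller` in the pooled sample, with average ranks for ties."""
    pooled = sorted(smaller + larger)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Positions i+1 .. j (1-based) share the mean of those ranks.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    return sum(rank_of[v] for v in smaller)

first = [330, 390, 400, 410, 420, 450, 460, 470]
second = [340, 400, 410, 420, 430, 440, 460, 480, 490]

W = rank_sum(first, second)
w_lower = 51                                        # from the example's table lookup
w_upper = (len(first) + len(second) + 1) * len(first) - w_lower
print(W, w_lower, w_upper)
print("no grounds to reject H0" if w_lower < W < w_upper else "reject H0")
```

The critical values still come from a table; only the rank sum and the upper bound are computed here.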

Lecture 6 Analysis of two samples

6.1 Parametric criteria

6.1.2 Student's t-test (t-test)

6.1.3 F – Fisher criterion

6.2 Non-parametric criteria

6.2.1 Sign criterion (G-criterion)

Let us turn to the problems of statistical analysis that arise once the basic (sample) characteristics have been determined and a single sample has been analysed, namely the joint analysis of several samples. The most important question in the analysis of two samples is whether there are differences between them. To answer it, statistical hypotheses are tested about whether both samples belong to the same population or whether their means are equal.

If the type of distribution or the distribution function of the sample is given, the differences between two groups of independent observations can be assessed with parametric statistical criteria: either Student's test (t), if the samples are compared by their means (X and Y), or Fisher's criterion (F), if the samples are compared by their variances.

Using parametric statistical criteria without first checking the type of distribution can lead to errors when testing the working hypothesis.

To overcome these difficulties in pedagogical research, one should use non-parametric statistical criteria, such as the sign criterion, the two-sample Wilcoxon criterion, the van der Waerden criterion and the Spearman criterion. Choosing them does not require a large sample or knowledge of the type of distribution, but it still depends on a number of conditions.

Non-parametric statistical criteria are free of assumptions about the distribution law of the samples and rely on the assumption that the observations are independent.

6.1 Parametric criteria

The group of parametric criteria of mathematical statistics includes methods for calculating descriptive statistics, plotting graphs to check the normality of a distribution, and testing hypotheses about whether two samples belong to the same population. These methods rest on the assumption that the sample follows the normal (Gaussian) distribution law. Among the parametric criteria, we will consider the Student and Fisher tests.

6.1.1 Methods for checking samples for normality

To determine whether we are dealing with a normal distribution, the following methods can be used:

1) on the same axes, draw the frequency polygon (empirical distribution function) and the normal curve based on the research data. By examining the shapes of the normal curve and of the empirical graph, one can identify the parameters in which the latter differs from the former;

2) calculate the mean, median and mode and use them to judge the deviation from the normal distribution. If the mode, median and arithmetic mean do not differ significantly from one another, we are dealing with a normal distribution. If the median differs significantly from the mean, we are dealing with an asymmetric sample;

3) the kurtosis of the distribution curve should equal 0. Curves with positive kurtosis are considerably more peaked than the normal curve. Curves with negative kurtosis are flatter than the normal curve;

4) after determining the mean of the frequency distribution and the standard deviation, find the following four intervals of the distribution and compare them with the actual data of the series:

a) - an interval that should contain about 25% of the frequencies of the population,

b) - an interval that should contain about 50% of the frequencies of the population,

c) - an interval that should contain about 75% of the frequencies of the population,

d) - an interval that should contain about 100% of the frequencies of the population.
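The interval check in item 4 can be sketched in Python. Because the interval bounds are not given in the text, the sketch derives, from the normal distribution itself, how many standard deviations around the mean hold a given share of a normal population, and then measures the actual share in a sample. Both function names are assumptions.

```python
from statistics import NormalDist, mean, stdev

def central_half_width(coverage):
    """z such that mean ± z * sigma holds `coverage` of a normal distribution."""
    return NormalDist().inv_cdf(0.5 + coverage / 2)

def share_within(sample, z):
    """Fraction of the sample lying within mean ± z * standard deviation."""
    m, s = mean(sample), stdev(sample)
    return sum(abs(x - m) <= z * s for x in sample) / len(sample)

for coverage in (0.25, 0.50, 0.75):
    z = central_half_width(coverage)
    print(f"{coverage:.0%} of a normal population lies within ±{z:.2f} sigma")
```

Comparing `share_within(sample, z)` with the nominal 25%, 50% and 75% gives a quick, informal indication of how far a sample departs from normality; it is not a formal test.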

6.1.2 Student's t-test ( t-test)

The criterion makes it possible to find the probability that both sample means belong to the same population. It is most often used to test the hypothesis: "The means of two samples belong to the same population."

Two cases can arise when using the criterion. In the first case, it is applied to test the hypothesis that the general means of two independent, unrelated samples are equal (the so-called two-sample t-test). Here there is a control group and an experimental group, and the number of subjects in the groups may differ.

In the second case, when one and the same group of objects produces the numerical material for testing hypotheses about the means, the so-called paired t-test is used. The samples are then called dependent, or related.

a) the case of independent samples

The test statistic for unrelated, independent samples is:

$$t = \frac{\bar{x} - \bar{y}}{s_{x-y}} \qquad (1)$$

where $\bar{x}$, $\bar{y}$ are the arithmetic means in the experimental and control groups,

and $s_{x-y}$ is the standard error of the difference of the arithmetic means. It is found from the formula:

$$s_{x-y} = \sqrt{\frac{\sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2}{n_1 + n_2 - 2} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \qquad (2)$$

where n_1 and n_2 are the sizes of the first and second samples respectively.

If n_1 = n_2, the standard error of the difference of the arithmetic means is calculated by the formula:

$$s_{x-y} = \sqrt{\frac{\sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2}{(n - 1)\, n}} \qquad (3)$$

where n is the sample size.

The number of degrees of freedom is calculated by the formula:

k = n_1 + n_2 - 2. (4)

For samples of equal size, k = 2n - 2.

Next, the calculated value t_emp is compared with the theoretical value of Student's t-distribution (see the appendices to statistics textbooks). If t_emp < t_crit, the hypothesis H_0 is accepted; otherwise the null hypothesis is rejected and the alternative hypothesis is accepted.

Let us look at an example of using Student's t-test for unrelated samples of unequal size.

Example 1. In two groups of students, experimental and control, the following results were obtained in an academic subject (test scores; see Table 1).

Table 1. Experimental results

First group (experimental), N_1 = 11 people: 12 14 13 16 11 9 13 15 15 18 14

Second group (control), N_2 = 9 people

Sample sizes: n_1 = 11, n_2 = 9.

Arithmetic means: X_mean = 13.636; Y_mean = 9.444

Standard deviations: s_x = 2.460; s_y = 2.186

Using formula (2), with $\sum (x_i - \bar{x})^2 = (n_1 - 1) s_x^2$ and $\sum (y_i - \bar{y})^2 = (n_2 - 1) s_y^2$, we calculate the standard error of the difference of the arithmetic means:

$$s_{x-y} = \sqrt{\frac{10 \cdot 2.460^2 + 8 \cdot 2.186^2}{11 + 9 - 2} \left( \frac{1}{11} + \frac{1}{9} \right)} \approx 1.053$$

We calculate the test statistic:

$$t = \frac{13.636 - 9.444}{1.053} \approx 3.981$$

The value of t obtained in the experiment is compared with the tabulated value, taking the number of degrees of freedom from formula (4), i.e. the number of subjects minus two (18).

The tabulated value t_crit is 2.1, allowing a risk of an erroneous judgment in five cases out of a hundred (significance level = 5%, or 0.05).

If the empirical value obtained in the experiment exceeds the tabulated one, there are grounds to accept the alternative hypothesis (H_1) that the students of the experimental group show a higher average level of knowledge. In the experiment t = 3.981, the tabulated t = 2.10, and 3.981 > 2.10, which points to the superiority of the experimental teaching.
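The arithmetic of Example 1 can be re-checked from the summary statistics alone. The Python sketch below assumes the pooled form of formula (2), with the sums of squared deviations recovered as (n - 1)·s²; the function name `pooled_t` is hypothetical.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample Student t statistic from means, standard deviations and sizes."""
    # (n - 1) * s^2 recovers each sample's sum of squared deviations.
    sum_sq = (n1 - 1) * sd1**2 + (n2 - 1) * sd2**2
    std_err = sqrt(sum_sq / (n1 + n2 - 2) * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, n1 + n2 - 2

t_emp, k = pooled_t(13.636, 2.460, 11, 9.444, 2.186, 9)
t_crit = 2.1                       # tabulated value quoted in the example
print(f"t = {t_emp:.2f}, degrees of freedom = {k}")
print("accept H1" if abs(t_emp) > t_crit else "keep H0")
```

Because the inputs are rounded to three decimals, the result agrees with the example's t = 3.981 only to about two decimal places.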

Some questions may arise here:

1. What if the value of t obtained in the experiment turns out to be smaller than the tabulated one? Then the null hypothesis must be accepted.

2. Has the superiority of the experimental method been proven? Not so much proven as shown, because from the very beginning a risk of error in five cases out of a hundred (p = 0.05) was allowed. Our experiment could have been one of those five cases. But 95% of possible cases speak in favour of the alternative hypothesis, and that is a fairly convincing argument in statistical proof.

3. What if the results in the control group turn out to be better than in the experimental group? Let us, for example, swap the arithmetic means of the experimental group and the control group:

$$t = \frac{9.444 - 13.636}{1.053} \approx -3.981$$

It follows that the new method has not yet proven itself, possibly for various reasons. Since the absolute value 3.9811 > 2.1, the second alternative hypothesis (H_2), that the traditional method is superior, is accepted.

b) the case of related (paired) samples

For related samples with an equal number of measurements in each, a simpler formula for Student's t-test can be used.

The value of t is calculated by the following formula:

$$t = \frac{\bar{d}}{S_d} \qquad (5)$$

where d_i are the differences between the corresponding values of variable X and variable Y, and $\bar{d}$ is the mean of these differences;

S_d is calculated by the following formula:

$$S_d = \sqrt{\frac{\sum d_i^2 - \frac{\left(\sum d_i\right)^2}{n}}{n\,(n - 1)}} \qquad (6)$$

The number of degrees of freedom k is given by the formula k = n - 1.

If t_emp < t_crit, the null hypothesis is accepted; otherwise the alternative hypothesis is accepted.

Let us look at an example of using Student's t-test for related and, obviously, equal-sized samples.
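Formulas (5) and (6) can be sketched in Python as follows. The scores are hypothetical (the individual values of the article's Table 2 are not reproduced here), and the differences are taken as "after minus before", which is an assumption about the sign convention.

```python
from math import sqrt

def paired_t(before, after):
    """Paired Student t statistic and degrees of freedom."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    d = [b - a for a, b in zip(before, after)]     # assumed: after - before
    n = len(d)
    d_mean = sum(d) / n
    s_d = sqrt((sum(v * v for v in d) - sum(d) ** 2 / n) / (n * (n - 1)))
    return d_mean / s_d, n - 1

# Hypothetical scores before and after an intervention.
t_emp, k = paired_t([1, 2, 3, 4], [2, 4, 5, 8])
print(f"t = {t_emp:.3f}, degrees of freedom = {k}")
```

If every difference is identical, S_d is zero and the statistic is undefined, so a real implementation would need to guard against that case.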

Butt 2. There has been a growing trend of students' orientation towards artistic and aesthetic values. By intensifying the formation of this orientation, conversations were held in the experimental group, exhibitions of baby babies were organized, museums and art galleries were organized, interviews were held with musicians, artists and in. The food supply is regular: what is the effectiveness of the work? By checking the effectiveness of this work before the beginning of the experiment and after the test was given. The results of a small number of tests are summarized in Table 2 methodically.

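Table 2. Experimental results

| Students (n = 10) | Score before the experiment (X) | Score at the end of the experiment (Y) | Auxiliary calculation: d | Auxiliary calculation: d² |
|---|---|---|---|---|
| Ivanov | | | | |
| Novikov | | | | |
| Sidorov | | | | |
| Pirogov | | | | |
| Agapov | | | | |
| Suvorov | | | | |
| Rizhikiv | | | | |
| Serov | | | | |
| Sokir | | | | |
| Bistrov | | | | |
| Mean | 14.8 | 21.1 | | |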

First we compute the mean difference from the two means in Table 2:

$$\bar d = \frac{\sum d_i}{n} = 21.1 - 14.8 = 6.3$$

Then we apply formula (6) to obtain $S_d$, and finally formula (5), which gives $t_{emp} = 6.678$.

Number of degrees of freedom: k = 10 - 1 = 9. From the table in Appendix 1 we find t_crit = 2.262. The experimental value t = 6.678 exceeds it, so the alternative hypothesis (H1) of a reliable difference between the arithmetic means can be accepted, that is, we conclude that the experimental intervention was effective.

In terms of statistical hypotheses the result reads as follows: at the 5% level, hypothesis H0 is rejected and hypothesis H1 is accepted.

6.1.3 F – Fisher's test

Fisher's test compares the sample variances of two independent samples. To calculate F_emp, take the ratio of the variances of the two samples so that the larger variance is in the numerator and the smaller one in the denominator. The formula for Fisher's test is as follows:

$$F = \frac{s_1^2}{s_2^2} \qquad (8)$$

where $s_1^2$ and $s_2^2$ are the variances of the first and second samples respectively.

Since, by the condition of the test, the numerator must be greater than or equal to the denominator, the value of F_emp is always greater than or equal to one.

The number of degrees of freedom is determined just as simply:

k1 = n1 - 1 for the first sample (that is, the sample whose variance is larger) and k2 = n2 - 1 for the second sample.

In Appendix 1 the critical values of Fisher's test are found from the values k1 (top row of the table) and k2 (left column of the table).

If F_emp > F_crit, the null hypothesis is rejected and the alternative is accepted; otherwise the null hypothesis is retained.

Example 3. In two third-grade classes, ten students were tested for mental development using the TURMSH test. The mean values did not differ reliably, but the psychologist is interested in whether the classes differ in how homogeneous their mental development scores are.

Solution. For Fisher's test we need to compare the variances of the test scores in the two classes. The test results are presented in the table:

Table 3.

| Student No. | First class | Second class |
|---|---|---|
| Sums | | |
| Mean | 60.6 | 63.6 |

Calculating the variances for the variables X and Y, we obtain:

s_x² = 572.83; s_y² = 174.04

Then, by formula (8) for Fisher's F test, we find:

$$F_{emp} = \frac{572.83}{174.04} \approx 3.29$$

From Table 3 of Appendix 1, for the F test with degrees of freedom equal to k = 10 - 1 = 9 in both cases, we find F_crit = 3.18 (< 3.29). Consequently, in terms of statistical hypotheses, H0 (the hypothesis of similarity) can be rejected at the 5% level, and H1 is accepted. The researcher may state that the samples from the two classes differ in the degree of homogeneity of an indicator such as mental development.
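The arithmetic of this example can be checked with a short Python sketch. The variances and the critical value 3.18 are the ones quoted in the text; the function name `fisher_f` is hypothetical, and the function simply places the larger variance in the numerator as the test requires.

```python
def fisher_f(var_a, var_b, n_a, n_b):
    """Return (F, k1, k2) with the larger variance in the numerator."""
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# Values quoted in Example 3: two classes of ten students each.
f_emp, k1, k2 = fisher_f(572.83, 174.04, 10, 10)
f_crit = 3.18  # critical value quoted in the text for k1 = k2 = 9
reject_h0 = f_emp > f_crit
print(round(f_emp, 2), k1, k2, reject_h0)
```

Passing the variances in either order gives the same F, which is why F_emp can never fall below one.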

6.2 Non-parametric tests

When results before and after some intervention are compared by eye (from percentages), the researcher tends to conclude that any visible difference reflects a real difference between the samples. Such an approach is categorically unacceptable, because for percentages the reliability of the differences cannot be determined. Percentages taken by themselves do not allow statistically reliable conclusions. To demonstrate the effectiveness of an intervention, a statistically significant trend in the shift of the indicators must be identified. For such tasks the researcher can use a number of difference tests. Two non-parametric tests are discussed below: the sign test and the chi-square test.

6.2.1 Sign test (G-test)

The test is intended for comparing the state of some property in the members of two dependent samples on the basis of measurements made on a scale no lower than ordinal.

Suppose there are two series of observations of the random variables X and Y, obtained from two dependent samples. From them N pairs of the form (x_i, y_i) are formed, where x_i and y_i are the results of measuring the same property twice on the same object.

In pedagogical research the objects of study may be students, teachers, or school administrations. Here x_i and y_i may be, for example, the scores given by a teacher for two performances of the same or of different tasks by the same group of students, before and after some teaching aid is applied.

The elements of each pair x_i, y_i are compared in magnitude, and the pair is assigned the sign «+» if x_i < y_i, the sign «-» if x_i > y_i, and «0» if x_i = y_i.

The null hypothesis is formulated as follows: there are no significant differences in the state of the studied property between the first and second measurements. The alternative hypothesis: the distribution laws of X and Y are different, that is, the state of the studied property differs substantially in the same population between the first and second measurements.

The test statistic (T) is determined as follows:

Suppose that among the N pairs (x_i, y_i) there are some pairs in which the values x_i and y_i are equal. Such pairs are marked with the sign «0» and are not counted when calculating T. Suppose that after subtracting the pairs marked «0» from N, n pairs remain. Among these n pairs we count the pairs marked with the sign «+», that is, the pairs in which x_i < y_i. The value of T equals the number of pairs with a plus sign, which is how T is counted in the examples below.

The null hypothesis is accepted at the 0.05 significance level if the observed value T < n - t_α, where the value n - t_α is taken from the statistical table for the sign test in Appendix 2.
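The counting step of the sign test can be sketched as follows. This is an illustration only: the function names and the scores are hypothetical, and T is taken as the number of «+» pairs (x_i < y_i), which is the convention used in the worked examples of this section. The critical value n - t_α still has to be read from the table in Appendix 2.

```python
def sign_test_counts(x, y):
    """Return (T, n): T = number of '+' pairs (x_i < y_i), n = number of non-tied pairs."""
    if len(x) != len(y):
        raise ValueError("the two series must have the same length")
    plus = sum(1 for a, b in zip(x, y) if a < b)
    minus = sum(1 for a, b in zip(x, y) if a > b)
    # Pairs with x_i == y_i get the sign '0' and are excluded from n.
    return plus, plus + minus

def reject_h0(t, critical):
    """Decision rule from the text: H0 is rejected when T > n - t_alpha."""
    return t > critical

# Hypothetical marks for the first and second performance of a task.
first = [3, 2, 4, 3, 3, 2]
second = [4, 3, 4, 2, 4, 3]
T, n = sign_test_counts(first, second)
print(T, n)
```

For instance, with the values reported in Example 4 (T = 10 and a tabulated critical value of 9), `reject_h0(10, 9)` returns `True`.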

Example 4. Students completed a test designed to check their mastery of a certain concept. Fifteen students were then given an electronic textbook designed to form this concept in students with a low level of learning ability. After studying the textbook, the students took the same test again, and it was graded on a five-point scale.

The results of the two performances of the test are measurements on an ordinal scale (the five-point scale). Under these conditions the sign test can be applied to detect a trend in the state of the students' knowledge after studying the textbook, since all the assumptions of the test are satisfied.

The results of the two performances of the test (in points) by the 15 students are written in the form of a table (see Table 4).

Table 4.

| Student (No.) | First performance | Second performance | Sign of the difference in marks |
|---|---|---|---|

The hypothesis being tested is H0: the students' level of knowledge did not improve after studying the textbook. The alternative hypothesis: the students' level of knowledge improved after studying the textbook.

We calculate the value of the test statistic T, equal to the number of positive differences between the marks received by the students. According to the data in Table 4, T = 10 and n = 12.

To determine the critical value n - t_α of the test statistic we use the table in Appendix 2. For the significance level α = 0.05 and n = 12, the value n - t_α = 9. Hence the inequality T > n - t_α (10 > 9) holds. Therefore, according to the decision rule, the null hypothesis is rejected at the 0.05 significance level and the alternative hypothesis is accepted, which allows us to conclude that the students' knowledge improved after they studied the textbook on their own.
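The table lookup can be cross-checked with an exact calculation. The text itself uses the table in Appendix 2; the sketch below swaps in a different technique, the exact one-sided binomial tail, under the assumption that under H0 the signs «+» and «-» are equally likely, so that T follows a Binomial(n, 1/2) distribution. The function name `upper_tail` is hypothetical.

```python
from math import comb

def upper_tail(t, n):
    """P(T >= t) under H0, where T ~ Binomial(n, 1/2)."""
    return sum(comb(n, k) for k in range(t, n + 1)) / 2 ** n

# Values reported in Example 4: T = 10 positive differences among n = 12 pairs.
p_value = upper_tail(10, 12)
print(f"P(T >= 10) = {p_value:.4f}")
```

The tail probability is below 0.05, which agrees with rejecting the null hypothesis in this example, while `upper_tail(9, 12)` is above 0.05, consistent with a critical value of 9.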

Example 5. It is assumed that studying a mathematics course helps students form one of the techniques of logical thinking (for example, generalization) even if this technique is not taught deliberately. To test this assumption, the following experiment was carried out.

Students of class VII were given 5 problems whose solution relies on this technique. A student was considered to have mastered the technique if he or she gave the correct answer to 3 or more problems.

The following measurement scale was developed: 1 or 2 problems solved correctly gives the score «0»; 3 problems solved correctly gives the score «1»; 4 problems solved correctly gives the score «2»; 5 problems solved correctly gives the score «3».

The test was given twice: at the end of September and at the end of May of the following year. It was written by the same 35 students, chosen by random selection from 7 different schools. The results of the two performances are recorded in the form of a table (see Table 5).

In accordance with the aims of the experiment, we formulate the null hypothesis as follows: H0, studying mathematics does not contribute to forming the technique of thinking under study. The alternative hypothesis is then H1: studying mathematics contributes to mastering this technique of thinking.

Table 5.

According to the data in Table 5, the value of the statistic is T = 15, the number of differences with the «+» sign. Of the 35 pairs, 12 have the sign «0»; hence n = 35 - 12 = 23.

From the table in Appendix 2, for n = 23 and the significance level 0.025, we find that the critical value of the test statistic is 16. So the inequality T < n - t_α (15 < 16) holds.

Therefore, according to the decision rule, we must conclude that the results obtained do not give sufficient grounds for rejecting the null hypothesis. In other words, we do not have sufficient grounds to reject the claim that studying mathematics does not by itself contribute to mastering the selected technique of thinking.

6.2.2 The χ² (chi-square) test

The χ² (chi-square) test is used to compare the distributions of objects of two populations on the basis of measurements on a nominal scale in two independent samples.

Suppose that the state of the property being studied (for example, the performance of a certain task) is measured for each object on a nominal scale that has only two mutually exclusive categories (for example: performed correctly, performed incorrectly). From the results of measuring the state of the property in the objects of the two samples, a fourfold 2X2 table is compiled (see Table 6).

Table 6.

In this table O_ij is the number of objects in the i-th sample that fell into the j-th category of the state of the studied property; i = 1, 2 indexes the samples; j = 1, 2 indexes the categories; N is the total number of observations, equal to O_11 + O_12 + O_21 + O_22, or n_1 + n_2.

Then, on the basis of the data in the 2X2 table (see Table 6), one can test the null hypothesis that objects of the first and second populations have equal probabilities of falling into the first (second) category of the measurement scale of the property, for example the hypothesis that students of control and experimental classes are equally likely to perform a certain task correctly.

When testing null hypotheses it is not necessary for the values of the probabilities p_1 and p_2 to be known, since the hypotheses only establish certain relations between them (equality, greater than, or less than).

To test the null hypotheses considered above from the data in the 2X2 table (see Table 6), the value of the test statistic T is calculated by the following formula:

$$T = \frac{N\left(\left|O_{11}O_{22} - O_{12}O_{21}\right| - \frac{N}{2}\right)^2}{n_1 n_2 \left(O_{11} + O_{21}\right)\left(O_{12} + O_{22}\right)} \qquad (9)$$

where n_1, n_2 are the sample sizes and N = n_1 + n_2 is the total number of observations.

The hypothesis H0: p_1 ≤ p_2 is tested against the alternative H1: p_1 > p_2. Let α be the accepted significance level. The value of the statistic T obtained from the experimental data is compared with the critical value x_{1-2α}, which is found from the χ² table with one degree of freedom (see Appendix 2) for the chosen value of α. If the inequality T < x_{1-2α} holds, the null hypothesis is accepted at the level α. If this inequality does not hold, there are not sufficient grounds to accept the null hypothesis.

Because replacing the exact distribution of the statistic T by the χ² distribution with one degree of freedom gives a good approximation only for large samples, the use of the test is limited by certain conditions. The test is not recommended if:

1) the sum of the sizes of the two samples is less than 20;

2) at least one of the absolute frequencies in the 2X2 table compiled from the experimental data is less than 5.
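The 2X2 calculation, together with the two restrictions above, can be sketched in Python. This sketch assumes that formula (9) is the continuity-corrected statistic T = N(|O11·O22 - O12·O21| - N/2)² / (n1·n2·(O11 + O21)·(O12 + O22)). The function name `chi2_2x2` and the counts are hypothetical; the counts are chosen for illustration with row totals of 20 and 15 and are not taken from any table in the text.

```python
def chi2_2x2(o11, o12, o21, o22):
    """Continuity-corrected statistic for a 2x2 table (assumed form of formula (9))."""
    n1 = o11 + o12          # size of the first sample
    n2 = o21 + o22          # size of the second sample
    n = n1 + n2             # total number of observations
    if n < 20 or min(o11, o12, o21, o22) < 5:
        raise ValueError("test not recommended: N < 20 or a cell count below 5")
    col1 = o11 + o21        # objects in the first category
    col2 = o12 + o22        # objects in the second category
    numerator = n * (abs(o11 * o22 - o12 * o21) - n / 2) ** 2
    return numerator / (n1 * n2 * col1 * col2)

# Hypothetical counts: rows are samples, columns are categories.
t_obs = chi2_2x2(15, 5, 7, 8)
t_crit = 3.84  # chi-square table value for one degree of freedom at 0.05
print(round(t_obs, 2), t_obs > t_crit)
```

Raising an error when the applicability conditions fail is a design choice of this sketch; a warning would serve equally well.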

Example 6. An experiment was carried out to identify the better of two textbooks written by two teams of authors in accordance with the aims of teaching geometry and the content of the class IX program. For the experiment, two districts were chosen by random selection; most of their schools were rural. Students of the first district (20 classes) were taught from textbook No. 1, and students of the second district (15 classes) were taught from textbook No. 2.

Let us consider the method of comparing the answers of teachers from the experimental schools of the two districts to one of the questions of a questionnaire: «Is the textbook as a whole accessible for independent reading, and does it help in mastering material that the teacher did not explain in class?» (Answer: yes or no.)

The attitude of the teachers to the studied property of the textbooks is measured on a nominal scale with two categories: yes, no. Both samples of teachers are random and independent.

The answers of the 20 teachers of the first district and the 15 teachers of the second district are divided into two categories and written in the form of a 2X2 table (Table 7).

Table 7.

All values in Table 7 are not less than 5, so, in accordance with the conditions for using the χ² test, the test statistic is calculated by formula (9).

From the table in Appendix 2, for one degree of freedom (v = 1) and the significance level α = 0.05, we find x_{1-α} = T_crit = 3.84. Hence the inequality T_obs < T_crit (1.86 < 3.84) holds. According to the decision rule for the χ² test, this result does not give sufficient grounds for rejecting the null hypothesis. That is, the survey of teachers in the two experimental districts does not give sufficient grounds for rejecting the assumption that textbooks 1 and 2 are equally accessible for independent reading by students.

The chi-square test can also be used when the objects of two samples from two populations fall into more than two categories of the studied property. For example, the students of experimental and control classes are divided into four categories according to the marks (in points: 2, 3, 4, 5) they received for a certain test.

The results of measuring the state of the studied property in the objects of each sample are divided into C categories. From these data a 2XC table is compiled, with two rows (one per population considered) and C columns (one per category of the state of the studied property adopted in the study).

Table 8

On the basis of the data in Table 8, one can test the null hypothesis that objects of the first and second populations have equal probabilities of falling into each of the i (i = 1, 2, ..., C) categories, that is, test that all of the following equalities hold: p_11 = p_21, p_12 = p_22, …, p_1C = p_2C. One can, for example, test the hypothesis that students of control and experimental classes are equally likely to receive the marks «5», «4», «3» and «2» for a certain task.

To test the null hypothesis with the χ² test from the data in the 2XC table, the value of the test statistic T is calculated by the following formula:

$$T = \frac{1}{n_1 n_2} \sum_{i=1}^{C} \frac{\left(n_1 O_{2i} - n_2 O_{1i}\right)^2}{O_{1i} + O_{2i}} \qquad (10)$$

where n_1 and n_2 are the sample sizes.

The value of T obtained from the experimental data is compared with the critical value x_{1-α}, which is found from the χ² table with k = C - 1 degrees of freedom for the chosen significance level α. If the inequality T > x_{1-α} holds, the null hypothesis is rejected at the level α and the alternative hypothesis is accepted. This means that the distribution of objects over the C categories of the studied property is different in the two populations considered.
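The 2XC calculation can be sketched in the same way. The sketch assumes that formula (10) is T = (1/(n1·n2)) · Σ (n1·O2i - n2·O1i)² / (O1i + O2i), which is algebraically the same as the usual sum of (observed - expected)² / expected over all cells. The function name `chi2_2xc` and the counts are hypothetical.

```python
def chi2_2xc(row1, row2):
    """Return (T, k) for a 2xC table; k = C - 1 degrees of freedom."""
    if len(row1) != len(row2) or len(row1) < 2:
        raise ValueError("both rows need the same number (>= 2) of categories")
    n1, n2 = sum(row1), sum(row2)
    total = 0.0
    for o1, o2 in zip(row1, row2):
        if o1 + o2 == 0:
            raise ValueError("a category with no observations must be dropped or merged")
        total += (n1 * o2 - n2 * o1) ** 2 / (o1 + o2)
    return total / (n1 * n2), len(row1) - 1

# Hypothetical marks of two samples of 50 students in four categories.
first = [5, 15, 20, 10]
second = [10, 20, 15, 5]
t_obs, k = chi2_2xc(first, second)
t_crit = 7.815  # chi-square table value for k = 3 at the 0.05 level
print(round(t_obs, 3), k, t_obs > t_crit)
```

With these illustrative counts T stays below the critical value, so the null hypothesis of equal distributions would not be rejected.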

Example 7. Let us consider the method of comparing the results of a written test that checked the mastery of one section of the course by students of the first and second districts.

By random selection, a sample of 50 people was drawn from the students of the first district who wrote the test, and a sample of 50 people from the students of the second district. According to specially developed criteria for assessing the quality of the work, each student's work could be placed in one of four categories: bad, mediocre, good, excellent. The results of the two samples of students are used to test the hypothesis that one of the textbooks leads to better mastery, that is, that students of one district receive, on average, higher ratings than students of the other district.

The results of the test for the students of both samples are written in a 2X4 table (Table 9).

Table 9.

In accordance with the conditions for using the χ² test, the test statistic is calculated by the corrected formula (10).

In accordance with the conditions for applying the two-sided chi-square test, from the table in Appendix 2 for one degree of freedom (k …

Grabar M.I., Krasnyanska K.A. Application of Mathematical Statistics in Pedagogical Research. Nonparametric Methods. Moscow: Pedagogika, 1977, p. 54.

Grabar M.I., Krasnyanska K.A. Application of Mathematical Statistics in Pedagogical Research. Nonparametric Methods. Moscow: Pedagogika, 1977, p. 57.

Definition. Empirical frequencies are the frequencies actually observed.

TESTING THE HYPOTHESIS ABOUT THE DISTRIBUTION OF THE GENERAL POPULATION. PEARSON'S CRITERION

As stated earlier, an assumption about the form of the distribution may be put forward from theoretical considerations. However, no matter how well the theoretical distribution law is chosen, discrepancies between the empirical and theoretical distributions are inevitable. A natural question arises: are these discrepancies due only to chance, or do they mean that the theoretical distribution law was chosen poorly? Goodness-of-fit tests serve to answer this question.

Definition. A goodness-of-fit test is a test of the hypothesis about the assumed law of an unknown distribution.

For each goodness-of-fit test, depending on the distribution, tables have been compiled from which the critical point k_cr is found (see the appendices). Once the critical point has been found, the observed value of the criterion K_obs is calculated from the sample data. If K_obs > k_cr, the null hypothesis is rejected; otherwise it is accepted.

Let us describe the application of Pearson's criterion to testing the hypothesis that the general population is normally distributed. Pearson's criterion answers the question: is the discrepancy between the empirical and theoretical frequencies due to chance?

Pearson's criterion, like any criterion, does not prove that a hypothesis is valid. It only establishes, at the accepted significance level, whether the hypothesis agrees or disagrees with the observed data.

So, suppose that an empirical distribution has been obtained from a sample of size n. At the significance level α we need to test the null hypothesis: the general population is normally distributed.

As the criterion for testing the null hypothesis we take the random variable

$$\chi^2 = \sum_i \frac{\left(n_i - n_i'\right)^2}{n_i'}$$

where $n_i$ are the empirical frequencies and $n_i'$ are the theoretical frequencies.

This random variable has a χ² distribution with k degrees of freedom. The number of degrees of freedom is k = m - r - 1, where m is the number of partial intervals of the sample and r is the number of parameters of the assumed distribution. For a normal distribution r = 2 (a and σ), so k = m - 3.

To test, at a given significance level, the null hypothesis that the general population is normally distributed, you need to:

1. Calculate the sample mean $\bar x_B$ and the sample standard deviation $\sigma_B$.

2. Calculate the theoretical frequencies

$$n_i' = \frac{n h}{\sigma_B}\,\varphi(u_i)$$

where n is the sample size; h is the step (the difference between two neighbouring variants); $u_i = \frac{x_i - \bar x_B}{\sigma_B}$; the values of the function $\varphi(u) = \frac{1}{\sqrt{2\pi}} e^{-u^2/2}$ are found from the table in the appendix.

3. Compare the empirical and theoretical frequencies using Pearson's criterion. To do this:

a) find the observed value of the criterion $\chi^2_{obs}$;

b) from the table of critical points of the χ² distribution, for the given significance level α and the number of degrees of freedom k, find the critical point $\chi^2_{cr}(\alpha; k)$.

If $\chi^2_{obs} < \chi^2_{cr}$, there are no grounds to reject the null hypothesis. If $\chi^2_{obs} > \chi^2_{cr}$, the null hypothesis is rejected.

Remark. Small frequencies (n_i < 5) should be combined; in that case the corresponding theoretical frequencies must also be added together. If frequencies were combined, then when determining the number of degrees of freedom, m should be taken as the number of sample groups remaining after the frequencies were combined.
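The whole procedure, including the merging of small frequencies, can be sketched in Python. Several things here are assumptions of the sketch rather than statements of the text: the sample standard deviation divides by n, small groups are merged with their neighbours from left to right, the function names are hypothetical, and the frequencies are made-up illustrative data.

```python
import math

def normal_theoretical_freqs(xs, ns):
    """Theoretical frequencies n*h/sigma * phi(u_i) for equally spaced variants xs."""
    n = sum(ns)
    mean = sum(x * f for x, f in zip(xs, ns)) / n
    # Assumption: the sample standard deviation divides by n, not n - 1.
    sigma = math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(xs, ns)) / n)
    h = xs[1] - xs[0]  # step between neighbouring variants
    def phi(u):
        return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)
    return [n * h / sigma * phi((x - mean) / sigma) for x in xs]

def merge_small(emp, theo, min_count=5):
    """Merge neighbouring groups until every empirical frequency is >= min_count."""
    merged_e, merged_t = [], []
    acc_e = acc_t = 0.0
    for e, t in zip(emp, theo):
        acc_e += e
        acc_t += t
        if acc_e >= min_count:
            merged_e.append(acc_e)
            merged_t.append(acc_t)
            acc_e = acc_t = 0.0
    if acc_e > 0 or acc_t > 0:  # leftover tail joins the last group
        if merged_e:
            merged_e[-1] += acc_e
            merged_t[-1] += acc_t
        else:
            merged_e.append(acc_e)
            merged_t.append(acc_t)
    return merged_e, merged_t

def pearson_chi2(emp, theo):
    """Observed value: sum of (n_i - n_i')^2 / n_i'."""
    return sum((e - t) ** 2 / t for e, t in zip(emp, theo))

# Hypothetical variants and empirical frequencies (n = 100).
xs = [1, 2, 3, 4, 5, 6, 7]
ns = [3, 8, 20, 38, 20, 8, 3]
theo = normal_theoretical_freqs(xs, ns)
emp_m, theo_m = merge_small(ns, theo)
chi2_obs = pearson_chi2(emp_m, theo_m)
k = len(emp_m) - 3          # m - 3, with m counted after merging
chi2_crit = 5.991           # chi-square table value for alpha = 0.05, k = 2
reject_h0 = chi2_obs > chi2_crit
print(emp_m, k, reject_h0)
```

After merging, five groups remain, so k = 5 - 3 = 2, and for these illustrative data the observed value stays below the critical point.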