diff --git a/source/statistics/tests/parametric/ztest.rst b/source/statistics/tests/parametric/ztest.rst
index 75b2807..740a232 100644
--- a/source/statistics/tests/parametric/ztest.rst
+++ b/source/statistics/tests/parametric/ztest.rst
@@ -1,8 +1,8 @@
 Z-Test
 -------
 
-The z-test is used to assess if the mean :math:`\overline{x}` of sample :math:`X` significantly differ from the one of a known population.
-The *significance level* is determined by a *p-value* threshold chosen prior doing the test.
+The z-test is used to assess if the mean :math:`\overline{x}` of sample :math:`X` differs from the one of a known population.
+The *significance level* of this difference is determined by a *p-value* threshold chosen prior doing the test.
 
 Conditions for using a z-test:
 
@@ -16,8 +16,8 @@ Conditions for using a z-test:
 
 
 
-To perform a z-test, you should compute the *standard score* (or *z-score*) of your sample.
-It characterizes how far from the population mean :math:`\mu` your sample mean :math:`\overline{x}` is, in unit of standard deviation :math:`\sigma`.
+To perform a z-test, you should compute the *standard score* (or *z-score*) of your sample :math:`X`.
+The *z-score*, noted :math:`Z`, characterizes how far from the population mean :math:`\mu` your sample mean :math:`\overline{x}` is, in unit of standard deviation :math:`\sigma`.
 It is computed as follow:
 
 .. math::
@@ -33,24 +33,32 @@ It is computed as follow:
 However, if :math:`n` is sufficiently large, the distribution followed by :math:`Z` is very close to a normal one.
 So close that, using z-test in place of the student test to compute *p-values* leads to nominal differences (`source <https://stats.stackexchange.com/questions/625578/why-is-the-sample-standard-deviation-used-in-the-z-test>`__).
 
-From :math:`Z`, the z-test *p-value* can be derived using the :math:`\mathcal{N}(0,1)` :ref:`CDF <CDF>`. 
-That *p-value* is computed as follow:
+From :math:`Z`, a *p-value* can be derived using the :math:`\mathcal{N}(0,1)` :ref:`CDF <CDF>` noted :math:`\Phi_{0,1}(x)`:
 
 * Left "tail" of the :math:`\mathcal{N}(0,1)` distribution:
 
 .. math::
-    \alpha=P(\mathcal{N}(0,1)<Z\sigma)=P(\mathcal{N}(0,1)<Z\times 1)=P(\mathcal{N}(0,1)<Z)
+    \alpha &= P(\mathcal{N}(0,1)<Z\sigma)
+
+    &=P(\mathcal{N}(0,1)<Z\times 1)
+
+    &=P(\mathcal{N}(0,1)<Z)=\Phi_{0,1}(Z)
 
 * Right "tail" of the :math:`\mathcal{N}(0,1)` distribution:
 
 .. math::
-    \alpha=1-P(\mathcal{N}(0,1)<Z\sigma)=1-P(\mathcal{N}(0,1)<Z\times 1)=1-P(\mathcal{N}(0,1)<Z)
+    \alpha &= 1-P(\mathcal{N}(0,1)<Z\sigma)
+
+    &=1-P(\mathcal{N}(0,1)<Z\times 1)
+
+    &=1-P(\mathcal{N}(0,1)<Z)=1-\Phi_{0,1}(Z)
 
 .. image:: ../../figures/normal_law_tails.svg
     :align: center
-
-If a z-test is done over one tail (left or right) it is called a **one-tailed** z-test.
-If a z-test is done over both tails (left and right) it is called a **two-tailed** z-test.
+
+|
+| If the test is done over one tail (left OR right) it is called a **one-tailed** z-test.
+| If the test is done over both tails (left AND right) it is called a **two-tailed** z-test.
 
 The following code shows you how to obtain the *p-value* in R:
 
@@ -64,7 +72,7 @@ Output example:
     Alpha approximated is 0.0359588035958804
     Alpha from built-in CDF 0.0359303191129258
 
-If the :math:`\alpha` value given by the test is lower or equal to the *p-value* threshold chosen prior the test,
+If the :math:`\alpha` value given by the test is lower or equal to the *p-value* threshold chosen initially,
 :math:`H_0` is rejected and :math:`H_1` is considered accepted.
 
 An alternative way of doing the z-test is to build a **rejection region** from the *p-value*.