To this end, we can define the following function:
\begin{equation}\label{eq:cost}
g_w(x_1,x_2) = \frac{1}{1+e^{-h_w(x_1,x_2)}}
\end{equation}
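As a quick sanity check (the =sigmoid= helper below is purely illustrative), the logistic function maps $h_w=0$ to $0.5$, the threshold later used to draw the decision boundary:

#+BEGIN_SRC python
import numpy as np

def sigmoid(h):
    # Logistic function: maps any real value into (0, 1).
    return 1.0 / (1.0 + np.exp(-h))

print(sigmoid(0.0))   # 0.5    -> the decision threshold
print(sigmoid(4.0))   # ~0.982 -> confidently class 1
print(sigmoid(-4.0))  # ~0.018 -> confidently class 0
#+END_SRC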
The next step is to define a cost function. A common approach in binary logistic regression is to use
the *Cross-Entropy* loss function. It is much more convenient than the classical Mean Squared Error
used in polynomial regression. Indeed, its gradient remains strong even for small errors and, unlike
the MSE gradient, it does not vanish when the sigmoid saturates (see [[https://www.youtube.com/watch?v=gIx974WtVb4&t=110s][here]] for more information).
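In its standard binary form (the $\frac{1}{n}$ averaging is an assumption about the exact convention used here), it looks like the following:
\begin{equation}
J(w) = -\frac{1}{n}\sum_{i=1}^{n}\left[y^{(i)}\log\left(g_w(x_1^{(i)},x_2^{(i)})\right)+\left(1-y^{(i)}\right)\log\left(1-g_w(x_1^{(i)},x_2^{(i)})\right)\right]
\end{equation}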
With $n$ the number of observations, $x_j^{(i)}$ is the value of the $j^{th}$ independent variable
associated with the observation $y^{(i)}$. The next step is to solve $\min_w J(w)$, i.e. to minimize the cost
with respect to each weight $w_i$ by gradient descent (see [[https://towardsdatascience.com/gradient-descent-demystified-bc30b26e432a][here]]). Thus, we compute the partial derivative of $J(w)$ with respect to each weight.
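Assuming the linear hypothesis $h_w(x_1,x_2)=w_1+w_2x_1+w_3x_2$ (the same parameterization used below for the decision boundary), these derivatives take the usual form:
\begin{equation}
\frac{\partial J(w)}{\partial w_1}=\frac{1}{n}\sum_{i=1}^{n}\left(g_w(x_1^{(i)},x_2^{(i)})-y^{(i)}\right),
\qquad
\frac{\partial J(w)}{\partial w_{j+1}}=\frac{1}{n}\sum_{i=1}^{n}\left(g_w(x_1^{(i)},x_2^{(i)})-y^{(i)}\right)x_j^{(i)},\qquad j\in\{1,2\}
\end{equation}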
For more information on binary logistic regression, here are some useful links:
- [[https://ml-cheatsheet.readthedocs.io/en/latest/logistic_regression.html][Logistic Regression -- ML Glossary documentation]]
- [[https://math.stackexchange.com/questions/2503428/derivative-of-binary-cross-entropy-why-are-my-signs-not-right][Derivative of the Binary Cross Entropy]]
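To make the procedure concrete, here is a minimal NumPy sketch of the whole fit (gradient descent on the cross-entropy with the linear hypothesis assumed above); the function name, learning rate and iteration count are arbitrary illustrative choices:

#+BEGIN_SRC python
import numpy as np

def fit_logistic(x1, x2, y, lr=0.1, n_iter=5000):
    """Gradient descent on the binary cross-entropy for h_w(x1,x2) = w1 + w2*x1 + w3*x2."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    X = np.column_stack([np.ones(n), x1, x2])  # column of ones for the bias weight w1
    w = np.zeros(3)                            # w = (w1, w2, w3)
    for _ in range(n_iter):
        g = 1.0 / (1.0 + np.exp(-X @ w))       # sigmoid predictions g_w(x1, x2)
        grad = X.T @ (g - y) / n               # partial derivatives of J(w)
        w -= lr * grad                         # gradient descent update
    return w
#+END_SRC

The weights returned by this sketch can then be plugged into the decision boundary computation described below.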
The method used here is similar to the one used [[https://scipython.com/blog/plotting-the-decision-boundary-of-a-logistic-regression-model/][here]]. In binary logistic regression, the decision
boundary is located where:\\ \[g_w(x_1,x_2)=0.5 \implies h_w(x_1,x_2)=0\]
In addition, we know that our decision boundary has the following form: \[x_2=ax_1+b\]
We can thus easily deduce $b$: if $x_1=0$, then $x_2=a\times 0 + b \implies x_2=b$. Hence:
\begin{equation}
h_w(0,x_2)=w_1 + w_3x_2=0 \implies x_2=\frac{-w_1}{w_3}
\end{equation}
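For instance (with made-up weights, purely to illustrate), $w_1=3$ and $w_3=2$ give an intercept $b=\frac{-3}{2}=-1.5$: the boundary crosses the $x_2$ axis at $x_2=-1.5$.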
Deducing the $a$ coefficient is slightly more complicated. If we know two points $(x_1^a,x_2^a)$ and $(x_1^b,x_2^b)$
on the decision boundary line, then $a=\frac{x_2^b-x_2^a}{x_1^b-x_1^a}$. Thus, if we compute: