  • Differentiating a scalar or a vector with respect to a matrix
    Mathematics/Linear algebra 2023. 5. 3. 19:04

    1. Differentiating a scalar with respect to a matrix

    This case applies when the input is a matrix and the output of the function is a scalar. $$\textbf{x}=\left [ \begin{matrix}
    x_{11} & x_{12} \\
    x_{21} & x_{22} \\
    \end{matrix} \right ]$$
    $$d\textbf{x}=\left [ \begin{matrix}
    dx_{11} & dx_{12} \\
    dx_{21} & dx_{22} \\
    \end{matrix} \right ]$$

    Here we want to differentiate a scalar function $f$ with respect to the matrix $\textbf{x}$.

     

    To obtain the desired derivative, the differential $df$ (the instantaneous change in $f$) must be expressed in terms of $ d\textbf{x}=\left [ \begin{matrix}
    dx_{11} & dx_{12} \\
    dx_{21} & dx_{22} \\
    \end{matrix} \right ]$ and $\frac{\partial f}{\partial \textbf{x}^{T}}=\left [ \begin{matrix}
    \frac{\partial f}{\partial x_{11}} & \frac{\partial f}{\partial x_{21}} \\
    \frac{\partial f}{\partial x_{12}} & \frac{\partial f}{\partial x_{22}} \\
    \end{matrix} \right ]$.

    $$df=\frac{\partial f}{\partial x_{11}}dx_{11}+\frac{\partial f}{\partial x_{12}}dx_{12}+\frac{\partial f}{\partial x_{21}}dx_{21}+\frac{\partial f}{\partial x_{22}}dx_{22}$$

     

    $$d\textbf{x}\frac{\partial f}{\partial \textbf{x}^{T}}=\left [ \begin{matrix}
    dx_{11} & dx_{12} \\
    dx_{21} & dx_{22} \\
    \end{matrix} \right ]\left [ \begin{matrix}
    \frac{\partial f}{\partial x_{11}} & \frac{\partial f}{\partial x_{21}} \\
    \frac{\partial f}{\partial x_{12}} & \frac{\partial f}{\partial x_{22}} \\
    \end{matrix} \right ]=\left [ \begin{matrix}
    \frac{\partial f}{\partial x_{11}}dx_{11}+ \frac{\partial f}{\partial x_{12}}dx_{12} & \frac{\partial f}{\partial x_{21}}dx_{11}+ \frac{\partial f}{\partial x_{22}}dx_{12} \\
    \frac{\partial f}{\partial x_{11}}dx_{21}+ \frac{\partial f}{\partial x_{12}}dx_{22} & \frac{\partial f}{\partial x_{21}}dx_{21}+ \frac{\partial f}{\partial x_{22}}dx_{22} \\
    \end{matrix} \right ] $$

    The terms of $df$ are exactly the diagonal entries of the matrix product above, so $df$ can be written as their sum, the trace.

    $$\therefore df=\mathrm{tr}\left(d\textbf{x}\frac{\partial f}{\partial \textbf{x}^{T}}\right)$$
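
The trace identity above can be checked numerically. The following is a minimal sketch using NumPy; the function `f` and its gradient `grad_f` are hypothetical examples chosen only for illustration, and the perturbation `dX` is arbitrary.

```python
import numpy as np

def f(X):
    # Hypothetical scalar function of a 2x2 matrix (illustration only).
    return np.sum(X ** 2) + X[0, 0] * X[1, 1]

def grad_f(X):
    # Analytic partial derivatives laid out like X: entry (i, j) is df/dx_ij.
    G = 2.0 * X
    G[0, 0] += X[1, 1]
    G[1, 1] += X[0, 0]
    return G

X = np.array([[1.0, 2.0], [3.0, 4.0]])
dX = 1e-6 * np.array([[1.0, -2.0], [0.5, 3.0]])  # small perturbation of X

df_trace = np.trace(dX @ grad_f(X).T)  # tr(dx * df/dx^T)
df_actual = f(X + dX) - f(X)           # actual change in f

print(df_trace, df_actual)
```

The two printed values should agree up to second-order terms in `dX`, which is what the first-order relation predicts.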

    2. Differentiating a matrix with respect to a matrix

    If you know how to differentiate a vector with respect to a vector, you can also differentiate a matrix with respect to a matrix.

    The derivation used above does not carry over directly, however. Instead, each matrix is first converted into a vector (vectorized), and the resulting vector is then differentiated.

    <vectorize>

    To convert a matrix into a vector, its rows are laid end to end to form a single row vector.

    $$\textrm{vec}\left [ \begin{pmatrix}
    x_{11} & x_{12} \\
    x_{21} & x_{22} \\
    \end{pmatrix} \right ]=\left [ \begin{matrix}
    x_{11} & x_{12} & x_{21} & x_{22} \\
    \end{matrix} \right ]$$

    $$\textrm{vec}\left [ \begin{pmatrix}
    y_{11} & y_{12} \\
    y_{21} & y_{22} \\
    \end{pmatrix} \right ]=\left [ \begin{matrix}
    y_{11} & y_{12} & y_{21} & y_{22} \\
    \end{matrix} \right ]$$

    $$d\textrm{vec}\left( \textbf{F}\right) =d\textrm{vec}\left( \textbf{x}\right) \dfrac{\partial \textrm{vec}\left( \textbf{F}\right) }{\partial \textrm{vec}^{T}\left( \textbf{x}\right) }$$
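
As a rough illustration of the vectorized relation above, the sketch below builds the Jacobian of a matrix-valued function by central differences and checks that $d\textrm{vec}(\textbf{F})$ is approximately $d\textrm{vec}(\textbf{x})$ times that Jacobian. The function `F` and the helper `jacobian_vec` are hypothetical names introduced here, not part of the original post. NumPy's `reshape(-1)` flattens in row-major order, which matches the row-by-row vectorization used above.

```python
import numpy as np

def F(X):
    # Hypothetical matrix-valued function (illustration only).
    return X @ X

def jacobian_vec(F, X, eps=1e-6):
    # Row k holds d vec(F) / d vec(X)[k], so that d vec(F) = d vec(X) @ J.
    x = X.reshape(-1)  # row-major vec: [x11, x12, x21, x22]
    J = np.zeros((x.size, F(X).size))
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        Xp = (x + step).reshape(X.shape)
        Xm = (x - step).reshape(X.shape)
        J[k] = (F(Xp) - F(Xm)).reshape(-1) / (2 * eps)
    return J

X = np.array([[1.0, 2.0], [3.0, 4.0]])
J = jacobian_vec(F, X)

dX = 1e-5 * np.array([[1.0, -1.0], [2.0, 0.5]])
predicted = dX.reshape(-1) @ J                # d vec(x) times the Jacobian
actual = (F(X + dX) - F(X)).reshape(-1)       # actual change in vec(F)

print(predicted)
print(actual)
```

With a 2x2 input and a 2x2 output, the Jacobian is 4x4: one row per entry of vec(x) and one column per entry of vec(F).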

    3. Differentiating a vector with respect to a matrix

    This case can be handled the same way: vectorize the matrix and apply vector differentiation as is.

    For example, what happens when $\textbf{y}=\textbf{x}\textbf{W}$ is differentiated with respect to $\textbf{W}$?

    Here $\textbf{y}=\left [ \begin{matrix}
    y_{1} & y_{2} \\
    \end{matrix} \right ],\textbf{x}=\left [ \begin{matrix}
    x_{1} & x_{2} \\
    \end{matrix} \right ],\textbf{W}=\left [ \begin{matrix}
    w_{11} & w_{12} \\
    w_{21} & w_{22} \\
    \end{matrix} \right ]$.

    Vectorizing the matrix $\textbf{W}$ gives $\textrm{vec}(\mathbf{W})=\left [ \begin{matrix}
    w_{11} & w_{12} & w_{21} & w_{22} \\
    \end{matrix} \right ]$.

    $$\mathbf{y}=\left [ \begin{matrix}
    y_{1} & y_{2} \\
    \end{matrix} \right ]=\left [ \begin{matrix}
    x_{1} & x_{2} \\
    \end{matrix} \right ]\left [ \begin{matrix}
    w_{11} & w_{12} \\
    w_{21} & w_{22} \\
    \end{matrix} \right ]=\left [ \begin{matrix}
    x_{1}w_{11}+x_{2}w_{21} & x_{1}w_{12}+x_{2}w_{22} \\
    \end{matrix} \right ]$$

    The derivative therefore follows by differentiating each $y_{j}$ with respect to each entry of $\textrm{vec}(\textbf{W})$, one row per entry of $\textrm{vec}(\textbf{W})$ and one column per $y_{j}$.

    $$\frac{d\textbf{y}}{d\textbf{W}}=\frac{\partial \textbf{y}}{\partial \textrm{vec}^{T}(\textbf{W})}=\left [ \begin{matrix}
    x_{1} & 0 \\
    0 & x_{1} \\
    x_{2} & 0 \\
    0 & x_{2} \\
    \end{matrix} \right ]=\textbf{x}^{T}\otimes I_{2} \quad \textrm{(Kronecker product)}$$
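
The Kronecker-product form can be reproduced with `numpy.kron`. This is a small sketch with arbitrary example values for `x`, `W`, and the perturbation `dW`. Because $\textbf{y}=\textbf{x}\textbf{W}$ is linear in $\textbf{W}$, the predicted and actual changes should match up to floating-point rounding.

```python
import numpy as np

x = np.array([[2.0, 3.0]])                # 1x2 row vector
W = np.array([[1.0, -1.0], [0.5, 4.0]])   # 2x2 matrix

# Jacobian in the layout used above: rows follow vec(W) = [w11, w12, w21, w22],
# columns follow [y1, y2].
J = np.kron(x.T, np.eye(2))

dW = 1e-3 * np.array([[1.0, 2.0], [-1.0, 0.5]])
dy_predicted = dW.reshape(1, -1) @ J      # d vec(W) times the Jacobian
dy_actual = x @ (W + dW) - x @ W          # actual change in y

print(J)
print(dy_predicted, dy_actual)
```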
