neural_network/numpy_nn/gradient_chain_rule_10.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "96242cbd",
   "metadata": {},
   "source": [
    "## 1. Gradient"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ebf88971",
   "metadata": {},
   "source": [
    "We reuse the function from the previous chapter, whose partial derivatives with respect to $x$, $y$, and $z$ we already know."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "80ee0f3b",
   "metadata": {},
   "source": [
    "$$f(x,y,z)=3x^3z-y^2+5z+2yz$$  \n",
    "$$\\frac{\\partial}{\\partial x}f(x,y,z)=9x^2z$$  \n",
    "$$\\frac{\\partial}{\\partial y}f(x,y,z)=-2y + 2z$$  \n",
    "$$\\frac{\\partial}{\\partial z}f(x,y,z)=3x^3+5+2y$$"
   ]
  },
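  {
   "cell_type": "markdown",
   "id": "d4a4c9bf",
   "metadata": {},
   "source": [
    "So what is the gradient? It is simply the vector of the partial derivatives with respect to each variable. The symbol $\\nabla$ is read *nabla* and looks like an inverted delta $\\Delta$."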
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9eda3400",
   "metadata": {},
   "source": [
    "$$\\nabla f(x, y, z) = \n",
    " \\left[\n",
    " \\begin{matrix}\n",
    "   \\frac{\\partial}{\\partial x}f(x, y, z) \\\\\n",
    "   \\frac{\\partial}{\\partial y}f(x, y, z) \\\\\n",
    "   \\frac{\\partial}{\\partial z}f(x, y, z)\n",
    "  \\end{matrix}\n",
    "  \\right] =  \\left[\n",
    " \\begin{matrix}\n",
    "   \\frac{\\partial}{\\partial x} \\\\\n",
    "   \\frac{\\partial}{\\partial y} \\\\\n",
    "   \\frac{\\partial}{\\partial z}\n",
    "  \\end{matrix}\n",
    "  \\right]f(x, y, z) = \\left[\n",
    " \\begin{matrix}\n",
    "   9x^2z \\\\\n",
    "   -2y + 2z \\\\\n",
    "   3x^3+5+2y\n",
    "  \\end{matrix}\n",
    "  \\right]\n",
    "$$"
   ]
  },
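  {
   "cell_type": "markdown",
   "id": "b7e41c90",
   "metadata": {},
   "source": [
    "The gradient above can be checked numerically. The cell below is a minimal sketch, assuming $f(x,y,z)=3x^3z-y^2+5z+2yz$ (the form consistent with the partial derivatives listed above). The helper names `grad_f` and `numerical_grad`, the step size `eps`, and the sample point are illustrative choices, not part of the original notebook. It compares the analytic gradient with a central finite-difference estimate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c3d9a512",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "\n",
    "def f(x, y, z):\n",
    "    # Assumed form of f, chosen to match the partial derivatives listed above.\n",
    "    return 3 * x**3 * z - y**2 + 5 * z + 2 * y * z\n",
    "\n",
    "\n",
    "def grad_f(x, y, z):\n",
    "    # Analytic gradient: the vector of partial derivatives (df/dx, df/dy, df/dz).\n",
    "    return np.array([9 * x**2 * z, -2 * y + 2 * z, 3 * x**3 + 5 + 2 * y])\n",
    "\n",
    "\n",
    "def numerical_grad(func, point, eps=1e-6):\n",
    "    # Central finite differences: perturb one coordinate at a time.\n",
    "    point = np.asarray(point, dtype=float)\n",
    "    grad = np.zeros_like(point)\n",
    "    for i in range(point.size):\n",
    "        step = np.zeros_like(point)\n",
    "        step[i] = eps\n",
    "        grad[i] = (func(*(point + step)) - func(*(point - step))) / (2 * eps)\n",
    "    return grad\n",
    "\n",
    "\n",
    "point = [1.0, 2.0, 3.0]  # arbitrary sample point\n",
    "analytic = grad_f(*point)\n",
    "numeric = numerical_grad(f, point)\n",
    "print(analytic)\n",
    "print(numeric)"
   ]
  },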
  {
   "cell_type": "markdown",
   "id": "62034238",
   "metadata": {},
   "source": [
    "## 2. Chain rule"
   ]
  },
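  {
   "cell_type": "markdown",
   "id": "5ab8fbb8",
   "metadata": {},
   "source": [
    "In a neural network, the input data is multiplied by the weights, the bias is added, and an activation function produces the layer's output. That output then becomes the input to the next layer, where the same process repeats. This necessarily involves nested functions."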
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7cacc6fa",
   "metadata": {},
   "source": [
    "$z = g(x)$, where $x$ is the data, $g$ is a function, and $z$ is its output  \n",
    "$y = f(z)$, where $z$ is the input to $f$, which finally gives $y$  \n",
    "$y = f(g(x))$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1c8fb16",
   "metadata": {},
   "source": [
    "Differentiating with respect to $x$ gives:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "540f6541",
   "metadata": {},
   "source": [
    "$$\\frac{d}{d x}f(g(x))=\\frac{df(g(x))}{dg(x)}\\cdot\\frac{dg(x)}{dx}=f'(g(x))\\cdot g'(x)$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "352fd3be",
   "metadata": {},
   "source": [
    "With several variables, use partial-derivative notation.\n",
    "$$\\frac{\\partial}{\\partial x}f(g(y, h(x, z))) = \\frac{\\partial f(g(y, h(x, z)))}{\\partial g(y, h(x, z))}\\cdot \\frac{\\partial g(y, h(x, z))}{\\partial h(x, z)}\\cdot \\frac{\\partial h(x, z)}{\\partial x}$$"
   ]
  },
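  {
   "cell_type": "markdown",
   "id": "d81f6a27",
   "metadata": {},
   "source": [
    "To make the multivariable chain rule concrete, the cell below is a small sketch using hypothetical functions that do not appear in the original notebook: $h(x,z)=x^2z$, $g(y,v)=y+3v$, and $f(u)=\\sin(u)$. It multiplies the three factors from the formula above and compares the product with a central finite-difference estimate of $\\frac{\\partial}{\\partial x}f(g(y,h(x,z)))$. The sample point and step size `eps` are arbitrary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e52b7f38",
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "\n",
    "# Hypothetical functions, chosen only to illustrate the formula.\n",
    "def h(x, z):\n",
    "    return x**2 * z\n",
    "\n",
    "\n",
    "def g(y, v):\n",
    "    return y + 3 * v\n",
    "\n",
    "\n",
    "def f(u):\n",
    "    return math.sin(u)\n",
    "\n",
    "\n",
    "def composite(x, y, z):\n",
    "    return f(g(y, h(x, z)))\n",
    "\n",
    "\n",
    "def d_composite_dx(x, y, z):\n",
    "    v = h(x, z)\n",
    "    u = g(y, v)\n",
    "    df_du = math.cos(u)  # derivative of f with respect to its input g\n",
    "    dg_dv = 3.0          # partial derivative of g with respect to its input h\n",
    "    dh_dx = 2 * x * z    # partial derivative of h with respect to x\n",
    "    return df_du * dg_dv * dh_dx\n",
    "\n",
    "\n",
    "x, y, z = 0.5, 1.0, 2.0  # arbitrary sample point\n",
    "eps = 1e-6\n",
    "chain_value = d_composite_dx(x, y, z)\n",
    "numeric_value = (composite(x + eps, y, z) - composite(x - eps, y, z)) / (2 * eps)\n",
    "print(chain_value, numeric_value)"
   ]
  },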
  {
   "cell_type": "markdown",
   "id": "05be67ba",
   "metadata": {},
   "source": [
    "For example:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a4186e2b",
   "metadata": {},
   "source": [
    "$$h(x) = f(g(x)) = 3(2x^2)^5$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5d4bfbd5",
   "metadata": {},
   "source": [
    "First, treat $g(x)=2x^2$ as a single variable and differentiate the outer function $f(u)=3u^5$.\n",
    "$$f'(g(x))=3\\cdot 5(2x^2)^{5-1}=15(2x^2)^4$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d6ecd4f7",
   "metadata": {},
   "source": [
    "Then differentiate $g(x)$.\n",
    "$$g'(x)=2\\cdot 2 x^{2-1}=4\\cdot x^1$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6ea9a4cd",
   "metadata": {},
   "source": [
    "$$h'(x)=f'(g(x))\\cdot g'(x)=15(2x^2)^4\\cdot 4\\cdot x^1 = 960x^9$$"
   ]
  },
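  {
   "cell_type": "markdown",
   "id": "f04a8c6b",
   "metadata": {},
   "source": [
    "The worked example can be verified in NumPy. The cell below is a minimal sketch: it evaluates the chain-rule product $15(2x^2)^4\\cdot 4x$, the simplified form $960x^9$, and a central finite-difference estimate of $h'(x)$ at a few sample points. The names `h_prime_chain`, `h_prime_closed`, `xs`, and `eps` are illustrative assumptions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a96d0e7c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "\n",
    "def h(x):\n",
    "    # h(x) = f(g(x)) with g(x) = 2x^2 and f(u) = 3u^5\n",
    "    return 3 * (2 * x**2) ** 5\n",
    "\n",
    "\n",
    "def h_prime_chain(x):\n",
    "    # f'(g(x)) * g'(x), kept as two separate factors\n",
    "    outer = 15 * (2 * x**2) ** 4\n",
    "    inner = 4 * x\n",
    "    return outer * inner\n",
    "\n",
    "\n",
    "def h_prime_closed(x):\n",
    "    # Simplified result from the derivation above\n",
    "    return 960 * x**9\n",
    "\n",
    "\n",
    "xs = np.array([0.5, 1.0, 1.5])  # arbitrary sample points\n",
    "eps = 1e-6\n",
    "numeric = (h(xs + eps) - h(xs - eps)) / (2 * eps)\n",
    "\n",
    "print(h_prime_chain(xs))\n",
    "print(h_prime_closed(xs))\n",
    "print(numeric)"
   ]
  },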
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3f230095",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}