{
"cells": [
{
"cell_type": "markdown",
"id": "96242cbd",
"metadata": {},
"source": [
"## 1. Gradient"
]
},
{
"cell_type": "markdown",
"id": "ebf88971",
"metadata": {},
"source": [
"Using the function from the previous chapter, we already know its partial derivatives with respect to $x$, $y$, and $z$."
]
},
{
"cell_type": "markdown",
"id": "80ee0f3b",
"metadata": {},
"source": [
"$$f(x,y,z)=3x^3z-y^2+5z+2yz$$ \n",
"$$\\frac{\\partial}{\\partial x}f(x,y,z)=9x^2z$$ \n",
"$$\\frac{\\partial}{\\partial y}f(x,y,z)=-2y + 2z$$ \n",
"$$\\frac{\\partial}{\\partial z}f(x,y,z)=3x^3+5+2y$$"
]
},
{
"cell_type": "markdown",
"id": "d4a4c9bf",
"metadata": {},
"source": [
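"So what is the gradient? It is simply the vector of the partial derivatives with respect to each variable. The symbol $\\nabla$ is read as nabla and is an inverted delta, $\\Delta$."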
]
},
{
"cell_type": "markdown",
"id": "9eda3400",
"metadata": {},
"source": [
"$$\\nabla f(x, y, z) = \n",
" \\left[\n",
" \\begin{matrix}\n",
" \\frac{\\partial}{\\partial x}f(x, y, z) \\\\\n",
" \\frac{\\partial}{\\partial y}f(x, y, z) \\\\\n",
" \\frac{\\partial}{\\partial z}f(x, y, z)\n",
" \\end{matrix}\n",
" \\right] = \\left[\n",
" \\begin{matrix}\n",
" \\frac{\\partial}{\\partial x} \\\\\n",
" \\frac{\\partial}{\\partial y} \\\\\n",
" \\frac{\\partial}{\\partial z}\n",
" \\end{matrix}\n",
" \\right]f(x, y, z) = \\left[\n",
" \\begin{matrix}\n",
" 9x^2z \\\\\n",
" -2y + 2z \\\\\n",
" 3x^3+5+2y\n",
" \\end{matrix}\n",
" \\right]\n",
"$$"
]
},
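{
"cell_type": "markdown",
"id": "grad-eval-example",
"metadata": {},
"source": [
"As a quick sanity check, the gradient above can be evaluated at a concrete point. The point $(x, y, z) = (1, 2, 3)$ is an arbitrary choice made for illustration and is not part of the original example.\n",
"$$\\nabla f(1, 2, 3) = \n",
" \\left[\n",
" \\begin{matrix}\n",
" 9\\cdot 1^2\\cdot 3 \\\\\n",
" -2\\cdot 2 + 2\\cdot 3 \\\\\n",
" 3\\cdot 1^3+5+2\\cdot 2\n",
" \\end{matrix}\n",
" \\right] = \\left[\n",
" \\begin{matrix}\n",
" 27 \\\\\n",
" 2 \\\\\n",
" 12\n",
" \\end{matrix}\n",
" \\right]\n",
"$$"
]
},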
{
"cell_type": "markdown",
"id": "62034238",
"metadata": {},
"source": [
"## 2. Chain rule"
]
},
{
"cell_type": "markdown",
"id": "5ab8fbb8",
"metadata": {},
"source": [
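"In a neural network, the input data is multiplied by the weights, the bias is added, and an activation function produces the output. That output then becomes the input to the next layer, and the process repeats. This inevitably involves nested functions."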
]
},
{
"cell_type": "markdown",
"id": "7cacc6fa",
"metadata": {},
"source": [
"z = f(x) # x is the data, f is a function, and z is the output of f \n",
"y = g(z) # z is the input to g, which finally yields y \n",
"y = g(f(x))"
]
},
{
"cell_type": "markdown",
"id": "c1c8fb16",
"metadata": {},
"source": [
"Differentiating a composition such as $f(g(x))$ with respect to $x$ gives the chain rule:"
]
},
{
"cell_type": "markdown",
"id": "540f6541",
"metadata": {},
"source": [
"$$\\frac{d}{d x}f(g(x))=\\frac{df(g(x))}{dg(x)}\\cdot\\frac{dg(x)}{dx}=f'(g(x))\\cdot g'(x)$$"
]
},
{
"cell_type": "markdown",
"id": "352fd3be",
"metadata": {},
"source": [
"With multiple variables, we use partial-derivative notation instead.\n",
"$$\\frac{\\partial}{\\partial x}f(g(y, h(x, z))) = \\frac{\\partial f(g(y, h(x, z)))}{\\partial g(y, h(x, z))}\\cdot \\frac{\\partial g(y, h(x, z))}{\\partial h(x, z)}\\cdot \\frac{\\partial h(x, z)}{\\partial x}$$"
]
},
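{
"cell_type": "markdown",
"id": "multivar-chain-example",
"metadata": {},
"source": [
"To make the pattern concrete, here is a small worked case. The functions $h(x, z) = xz$, $g(y, h) = y + h^2$, and $f(g) = 3g$ are hypothetical, chosen only for illustration.\n",
"$$\\frac{\\partial f}{\\partial g} = 3,\\quad \\frac{\\partial g}{\\partial h} = 2h = 2xz,\\quad \\frac{\\partial h}{\\partial x} = z$$ \n",
"$$\\frac{\\partial}{\\partial x}f(g(y, h(x, z))) = 3\\cdot 2xz\\cdot z = 6xz^2$$ \n",
"Expanding first gives $f = 3(y + x^2z^2)$, whose partial derivative with respect to $x$ is also $6xz^2$."
]
},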
{
"cell_type": "markdown",
"id": "05be67ba",
"metadata": {},
"source": [
"Here is an example."
]
},
{
"cell_type": "markdown",
"id": "a4186e2b",
"metadata": {},
"source": [
"$$h(x) = f(g(x)) = 3(2x^2)^5$$"
]
},
{
"cell_type": "markdown",
"id": "5d4bfbd5",
"metadata": {},
"source": [
"First, treat $g(x) = 2x^2$ as a single variable and differentiate the outer function $f(u) = 3u^5$.\n",
"$$f'(g(x))=3\\cdot 5(2x^2)^{5-1}=15(2x^2)^4$$"
]
},
{
"cell_type": "markdown",
"id": "d6ecd4f7",
"metadata": {},
"source": [
"Next, differentiate $g(x)$ itself.\n",
"$$g'(x)=2\\cdot 2 x^{2-1}=4\\cdot x^1$$"
]
},
{
"cell_type": "markdown",
"id": "6ea9a4cd",
"metadata": {},
"source": [
"$$h'(x)=f'(g(x))\\cdot g'(x)=15(2x^2)^4\\cdot 4\\cdot x^1 = 960x^9$$"
]
},
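{
"cell_type": "markdown",
"id": "chain-rule-check",
"metadata": {},
"source": [
"One way to check this result, sketched here as an addition to the derivation above, is to expand $h(x)$ first and then apply the power rule directly.\n",
"$$h(x) = 3(2x^2)^5 = 3\\cdot 2^5 x^{10} = 96x^{10}$$ \n",
"$$h'(x) = 96\\cdot 10x^{9} = 960x^9$$ \n",
"Both routes give the same derivative."
]
},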
{
"cell_type": "code",
"execution_count": null,
"id": "3f230095",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}