Repository: Kevin-free/chatgpt-prompt-engineering-for-developers
Branch: main
Commit: a07a9368720e
Files: 20
Total size: 329.6 KB
Directory structure:
gitextract_v09w7ej1/
├── .gitignore
├── README.md
├── notebooks-en/
│ ├── l2-guidelines.ipynb
│ ├── l3-iterative-prompt-development.ipynb
│ ├── l4-summarizing.ipynb
│ ├── l5-inferring.ipynb
│ ├── l6-transforming.ipynb
│ ├── l7-expanding.ipynb
│ └── l8-chatbot.ipynb
├── notebooks-zh/
│ ├── 1. 引言.md
│ ├── 2. 指南 Guidelines.ipynb
│ ├── 3. 迭代 Iterative.ipynb
│ ├── 4. 摘要 Summarizing.ipynb
│ ├── 5. 推断 Inferring.ipynb
│ ├── 6. 转换 Transforming.ipynb
│ ├── 7. 扩展 Expanding.ipynb
│ ├── 8. 聊天机器人 Chatbot.ipynb
│ └── 9. 总结.md
├── notes/
│ └── Prompt Engineering 提示工程 @Kevin的学堂.xmind
└── tutorial/
└── 20230723-ChatGPT最新注册教程.md
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
pdf/
tnews-finetuning*
contribute.md
*.drawio
*.db
content/metric.csv
.DS_Store
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
.python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
.idea
================================================
FILE: README.md
================================================

# "ChatGPT Prompt Engineering for Developers" (English & Chinese)
This project hosts the notebooks and notes for Andrew Ng's course "ChatGPT Prompt Engineering for Developers".
If you find any problems, or would like to contribute material, you are welcome to submit a Pull Request.
Project folders:
- `images`: images used by this project;
- `notebooks-en`: the English notebooks copied locally from the course;
- `notebooks-zh`: the Chinese notebooks copied locally from the course;
- `notes`: notes, including a mind map.
## Course Introduction
Andrew Ng's course "ChatGPT Prompt Engineering for Developers" teaches developers how to construct prompts and build new LLM-based applications on top of the OpenAI API, covering:
> Principles for writing prompts;
> Text summarizing (e.g., summarizing user reviews);
> Text inferring (e.g., sentiment classification, topic extraction);
> Text transforming (e.g., translation, automatic proofreading);
> Expanding (e.g., drafting emails);
- **Chinese video: [面向开发者的 ChatGPT 提示词工程](https://space.bilibili.com/15467823/channel/seriesdetail?sid=3247315&ctype=0)**
- **English original: [ChatGPT Prompt Engineering for Developers](https://learn.deeplearning.ai)**
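Every lesson in the course is driven by one small helper built on the Chat Completions endpoint. The request that helper sends reduces to a JSON body of the following shape; this is a minimal standard-library sketch for orientation, where the model name and `temperature` value mirror the notebooks' helper and the example prompt is purely illustrative:

```python
def build_chat_request(prompt, model="gpt-3.5-turbo", temperature=0):
    """Body for POST https://api.openai.com/v1/chat/completions."""
    return {
        "model": model,
        # A single-turn conversation: one user message holding the prompt
        "messages": [{"role": "user", "content": prompt}],
        # 0 makes the output as deterministic as the API allows
        "temperature": temperature,
    }

body = build_chat_request("Summarize the review in 30 words.")
print(body["messages"][0]["role"])  # user
```

In the notebooks this body is sent via `openai.ChatCompletion.create(...)` and the reply is read from `response.choices[0].message["content"]`.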
## Why This Project
LLMs are gradually changing people's lives, and for developers, learning to build applications on top of LLM APIs quickly and conveniently, with more novel and more practical capabilities, is an essential skill. "ChatGPT Prompt Engineering for Developers", created by Andrew Ng in collaboration with OpenAI, is aimed at developers getting started with LLMs. It gives an accessible introduction to constructing prompts and implementing common functions such as summarizing, inferring, and transforming with the OpenAI API, and is a classic entry point to LLM development. We have therefore translated the course into Chinese and reproduced its example code, so that Chinese learners can use it directly and learn LLM development more easily.
## Intended Audience
All developers with basic Python skills who want to get started with LLMs.
## Project Highlights
As the official tutorial jointly produced by Andrew Ng and OpenAI, "ChatGPT Prompt Engineering for Developers" will remain an important introductory LLM tutorial for the foreseeable future. However, it is currently available only in English and is hard to access from mainland China, so a Chinese version that can be accessed smoothly within China is of real value.
## Content Outline

## Acknowledgments
**Core contributors**
- [Kevin](https://github.com/Kevin-free)
**Other resources**
1. https://github.com/datawhalechina/prompt-engineering-for-developers
2. https://github.com/GitHubDaily/ChatGPT-Prompt-Engineering-for-Developers-in-Chinese
3. https://github.com/ZhangHanDong/rustchat
## Follow Us
<div align=center>
<p>Scan the QR code below to follow the WeChat official account: Kevin的学堂</p>

</div>
Kevin的学堂 focuses on sharing knowledge about data structures and algorithms, backend development, and ChatGPT & AI, along with experience from work and life. Follow along and grow together with Kevin!
## LICENSE
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://img.shields.io/badge/license-CC%20BY--NC--SA%204.0-lightgrey" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.
================================================
FILE: notebooks-en/l2-guidelines.ipynb
================================================
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"[Open In Colab](https://colab.research.google.com/github/GitHubDaily/ChatGPT-Prompt-Engineering-for-Developers-in-Chinese/blob/master/assets/2-Guidelines/l2-guidelines.ipynb)\n",
"# Guidelines for Prompting\n",
"In this lesson, you'll practice two prompting principles and their related tactics in order to write effective prompts for large language models.\n",
"\n",
"## Setup\n",
"#### Load the API key and relevant Python libraries."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Basic config\n",
"# Install the openai package and set the API key\n",
"!pip install openai\n",
"# Note: `!export` runs in a subshell and does not persist; use %env in a notebook\n",
"%env OPENAI_API_KEY=sk-..."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"In this course, we've provided some code that loads the OpenAI API key for you."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"import os\n",
"\n",
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv())\n",
"\n",
"openai.api_key = os.getenv('OPENAI_API_KEY')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### helper function\n",
"Throughout this course, we will use OpenAI's `gpt-3.5-turbo` model and the [chat completions endpoint](https://platform.openai.com/docs/guides/chat). \n",
"\n",
"This helper function will make it easier to use prompts and look at the generated outputs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_completion(prompt, model=\"gpt-3.5-turbo\"):\n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=0, # this is the degree of randomness of the model's output\n",
" )\n",
" return response.choices[0].message[\"content\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prompting Principles\n",
"- **Principle 1: Write clear and specific instructions**\n",
"- **Principle 2: Give the model time to “think”**\n",
"\n",
"### Tactics\n",
"\n",
"#### Tactic 1: Use delimiters to clearly indicate distinct parts of the input\n",
"- Delimiters can be anything like: ```, \"\"\", < >, `<tag> </tag>`, `:`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"text = f\"\"\"\n",
"You should express what you want a model to do by \\ \n",
"providing instructions that are as clear and \\ \n",
"specific as you can possibly make them. \\ \n",
"This will guide the model towards the desired output, \\ \n",
"and reduce the chances of receiving irrelevant \\ \n",
"or incorrect responses. Don't confuse writing a \\ \n",
"clear prompt with writing a short prompt. \\ \n",
"In many cases, longer prompts provide more clarity \\ \n",
"and context for the model, which can lead to \\ \n",
"more detailed and relevant outputs.\n",
"\"\"\"\n",
"prompt = f\"\"\"\n",
"Summarize the text delimited by triple backticks \\ \n",
"into a single sentence.\n",
"```{text}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Tactic 2: Ask for a structured output\n",
"- JSON, HTML"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Generate a list of three made-up book titles along \\ \n",
"with their authors and genres. \n",
"Provide them in JSON format with the following keys: \n",
"book_id, title, author, genre.\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Tactic 3: Ask the model to check whether conditions are satisfied"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"text_1 = f\"\"\"\n",
"Making a cup of tea is easy! First, you need to get some \\ \n",
"water boiling. While that's happening, \\ \n",
"grab a cup and put a tea bag in it. Once the water is \\ \n",
"hot enough, just pour it over the tea bag. \\ \n",
"Let it sit for a bit so the tea can steep. After a \\ \n",
"few minutes, take out the tea bag. If you \\ \n",
"like, you can add some sugar or milk to taste. \\ \n",
"And that's it! You've got yourself a delicious \\ \n",
"cup of tea to enjoy.\n",
"\"\"\"\n",
"prompt = f\"\"\"\n",
"You will be provided with text delimited by triple quotes. \n",
"If it contains a sequence of instructions, \\ \n",
"re-write those instructions in the following format:\n",
"\n",
"Step 1 - ...\n",
"Step 2 - …\n",
"…\n",
"Step N - …\n",
"\n",
"If the text does not contain a sequence of instructions, \\ \n",
"then simply write \\\"No steps provided.\\\"\n",
"\n",
"\\\"\\\"\\\"{text_1}\\\"\\\"\\\"\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(\"Completion for Text 1:\")\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"text_2 = f\"\"\"\n",
"The sun is shining brightly today, and the birds are \\\n",
"singing. It's a beautiful day to go for a \\ \n",
"walk in the park. The flowers are blooming, and the \\ \n",
"trees are swaying gently in the breeze. People \\ \n",
"are out and about, enjoying the lovely weather. \\ \n",
"Some are having picnics, while others are playing \\ \n",
"games or simply relaxing on the grass. It's a \\ \n",
"perfect day to spend time outdoors and appreciate the \\ \n",
"beauty of nature.\n",
"\"\"\"\n",
"prompt = f\"\"\"\n",
"You will be provided with text delimited by triple quotes. \n",
"If it contains a sequence of instructions, \\ \n",
"re-write those instructions in the following format:\n",
"\n",
"Step 1 - ...\n",
"Step 2 - …\n",
"…\n",
"Step N - …\n",
"\n",
"If the text does not contain a sequence of instructions, \\ \n",
"then simply write \\\"No steps provided.\\\"\n",
"\n",
"\\\"\\\"\\\"{text_2}\\\"\\\"\\\"\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(\"Completion for Text 2:\")\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Tactic 4: \"Few-shot\" prompting"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Your task is to answer in a consistent style.\n",
"\n",
"<child>: Teach me about patience.\n",
"\n",
"<grandparent>: The river that carves the deepest \\ \n",
"valley flows from a modest spring; the \\ \n",
"grandest symphony originates from a single note; \\ \n",
"the most intricate tapestry begins with a solitary thread.\n",
"\n",
"<child>: Teach me about resilience.\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Principle 2: Give the model time to “think” \n",
"\n",
"#### Tactic 1: Specify the steps required to complete a task"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"text = f\"\"\"\n",
"In a charming village, siblings Jack and Jill set out on \\ \n",
"a quest to fetch water from a hilltop \\ \n",
"well. As they climbed, singing joyfully, misfortune \\ \n",
"struck—Jack tripped on a stone and tumbled \\ \n",
"down the hill, with Jill following suit. \\ \n",
"Though slightly battered, the pair returned home to \\ \n",
"comforting embraces. Despite the mishap, \\ \n",
"their adventurous spirits remained undimmed, and they \\ \n",
"continued exploring with delight.\n",
"\"\"\"\n",
"# example 1\n",
"prompt_1 = f\"\"\"\n",
"Perform the following actions: \n",
"1 - Summarize the following text delimited by triple \\\n",
"backticks with 1 sentence.\n",
"2 - Translate the summary into French.\n",
"3 - List each name in the French summary.\n",
"4 - Output a json object that contains the following \\\n",
"keys: french_summary, num_names.\n",
"\n",
"Separate your answers with line breaks.\n",
"\n",
"Text:\n",
"```{text}```\n",
"\"\"\"\n",
"response = get_completion(prompt_1)\n",
"print(\"Completion for prompt 1:\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Ask for output in a specified format"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt_2 = f\"\"\"\n",
"Your task is to perform the following actions: \n",
"1 - Summarize the following text delimited by \n",
" <> with 1 sentence.\n",
"2 - Translate the summary into French.\n",
"3 - List each name in the French summary.\n",
"4 - Output a json object that contains the \n",
" following keys: french_summary, num_names.\n",
"\n",
"Use the following format:\n",
"Text: <text to summarize>\n",
"Summary: <summary>\n",
"Translation: <summary translation>\n",
"Names: <list of names in French summary>\n",
"Output JSON: <json with summary and num_names>\n",
"\n",
"Text: <{text}>\n",
"\"\"\"\n",
"response = get_completion(prompt_2)\n",
"print(\"\\nCompletion for prompt 2:\")\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Tactic 2: Instruct the model to work out its own solution before rushing to a conclusion"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Determine if the student's solution is correct or not.\n",
"\n",
"Question:\n",
"I'm building a solar power installation and I need \\\n",
" help working out the financials. \n",
"- Land costs $100 / square foot\n",
"- I can buy solar panels for $250 / square foot\n",
"- I negotiated a contract for maintenance that will cost \\ \n",
"me a flat $100k per year, and an additional $10 / square \\\n",
"foot\n",
"What is the total cost for the first year of operations \n",
"as a function of the number of square feet.\n",
"\n",
"Student's Solution:\n",
"Let x be the size of the installation in square feet.\n",
"Costs:\n",
"1. Land cost: 100x\n",
"2. Solar panel cost: 250x\n",
"3. Maintenance cost: 100,000 + 100x\n",
"Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Note that the student's solution is actually not correct.\n",
"#### We can fix this by instructing the model to work out its own solution first."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Your task is to determine if the student's solution \\\n",
"is correct or not.\n",
"To solve the problem do the following:\n",
"- First, work out your own solution to the problem. \n",
"- Then compare your solution to the student's solution \\ \n",
"and evaluate if the student's solution is correct or not. \n",
"Don't decide if the student's solution is correct until \n",
"you have done the problem yourself.\n",
"\n",
"Use the following format:\n",
"Question:\n",
"```\n",
"question here\n",
"```\n",
"Student's solution:\n",
"```\n",
"student's solution here\n",
"```\n",
"Actual solution:\n",
"```\n",
"steps to work out the solution and your solution here\n",
"```\n",
"Is the student's solution the same as actual solution \\\n",
"just calculated:\n",
"```\n",
"yes or no\n",
"```\n",
"Student grade:\n",
"```\n",
"correct or incorrect\n",
"```\n",
"\n",
"Question:\n",
"```\n",
"I'm building a solar power installation and I need help \\\n",
"working out the financials. \n",
"- Land costs $100 / square foot\n",
"- I can buy solar panels for $250 / square foot\n",
"- I negotiated a contract for maintenance that will cost \\\n",
"me a flat $100k per year, and an additional $10 / square \\\n",
"foot\n",
"What is the total cost for the first year of operations \\\n",
"as a function of the number of square feet.\n",
"``` \n",
"Student's solution:\n",
"```\n",
"Let x be the size of the installation in square feet.\n",
"Costs:\n",
"1. Land cost: 100x\n",
"2. Solar panel cost: 250x\n",
"3. Maintenance cost: 100,000 + 100x\n",
"Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000\n",
"```\n",
"Actual solution:\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Model Limitations: Hallucinations\n",
"- Boie is a real company, the product name is not real."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Tell me about AeroGlide UltraSlim Smart Toothbrush by Boie\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Notes on using the OpenAI API outside of this classroom\n",
"\n",
"To install the OpenAI Python library:\n",
"```\n",
"!pip install openai\n",
"```\n",
"\n",
"The library needs to be configured with your account's secret key, which is available on the [website](https://platform.openai.com/account/api-keys). \n",
"\n",
"You can either set it as the `OPENAI_API_KEY` environment variable before using the library:\n",
" ```\n",
" !export OPENAI_API_KEY='sk-...'\n",
" ```\n",
"\n",
"Or, set `openai.api_key` to its value:\n",
"\n",
"```\n",
"import openai\n",
"openai.api_key = \"sk-...\"\n",
"```"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### A note about the backslash\n",
"- In the course, we are using a backslash `\\` to make the text fit on the screen without inserting newline '\\n' characters.\n",
"- GPT-3 isn't really affected whether you insert newline characters or not. But when working with LLMs in general, you may consider whether newline characters in your prompt may affect the model's performance."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "gpt_index",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
================================================
FILE: notebooks-en/l3-iterative-prompt-development.ipynb
================================================
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"[Open In Colab](https://colab.research.google.com/github/GitHubDaily/ChatGPT-Prompt-Engineering-for-Developers-in-Chinese/blob/master/assets/3-Iterative/l3-iterative-prompt-development.ipynb)\n",
"# Iterative Prompt Development\n",
"In this lesson, you'll iteratively analyze and refine your prompts to generate marketing copy from a product fact sheet.\n",
"\n",
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Basic config\n",
"# Install the openai package and set the API key\n",
"!pip install openai\n",
"# Note: `!export` runs in a subshell and does not persist; use %env in a notebook\n",
"%env OPENAI_API_KEY=sk-..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"import os\n",
"\n",
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv()) # read local .env file\n",
"\n",
"openai.api_key = os.getenv('OPENAI_API_KEY')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_completion(prompt, model=\"gpt-3.5-turbo\"):\n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=0, # this is the degree of randomness of the model's output\n",
" )\n",
" return response.choices[0].message[\"content\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Generate a marketing product description from a product fact sheet"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fact_sheet_chair = \"\"\"\n",
"OVERVIEW\n",
"- Part of a beautiful family of mid-century inspired office furniture, \n",
"including filing cabinets, desks, bookcases, meeting tables, and more.\n",
"- Several options of shell color and base finishes.\n",
"- Available with plastic back and front upholstery (SWC-100) \n",
"or full upholstery (SWC-110) in 10 fabric and 6 leather options.\n",
"- Base finish options are: stainless steel, matte black, \n",
"gloss white, or chrome.\n",
"- Chair is available with or without armrests.\n",
"- Suitable for home or business settings.\n",
"- Qualified for contract use.\n",
"\n",
"CONSTRUCTION\n",
"- 5-wheel plastic coated aluminum base.\n",
"- Pneumatic chair adjust for easy raise/lower action.\n",
"\n",
"DIMENSIONS\n",
"- WIDTH 53 CM | 20.87”\n",
"- DEPTH 51 CM | 20.08”\n",
"- HEIGHT 80 CM | 31.50”\n",
"- SEAT HEIGHT 44 CM | 17.32”\n",
"- SEAT DEPTH 41 CM | 16.14”\n",
"\n",
"OPTIONS\n",
"- Soft or hard-floor caster options.\n",
"- Two choices of seat foam densities: \n",
" medium (1.8 lb/ft3) or high (2.8 lb/ft3)\n",
"- Armless or 8 position PU armrests \n",
"\n",
"MATERIALS\n",
"SHELL BASE GLIDER\n",
"- Cast Aluminum with modified nylon PA6/PA66 coating.\n",
"- Shell thickness: 10 mm.\n",
"SEAT\n",
"- HD36 foam\n",
"\n",
"COUNTRY OF ORIGIN\n",
"- Italy\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Your task is to help a marketing team create a \n",
"description for a retail website of a product based \n",
"on a technical fact sheet.\n",
"\n",
"Write a product description based on the information \n",
"provided in the technical specifications delimited by \n",
"triple backticks.\n",
"\n",
"Technical specifications: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Issue 1: The text is too long \n",
"- Limit the number of words/sentences/characters."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Your task is to help a marketing team create a \n",
"description for a retail website of a product based \n",
"on a technical fact sheet.\n",
"\n",
"Write a product description based on the information \n",
"provided in the technical specifications delimited by \n",
"triple backticks.\n",
"\n",
"Use at most 50 words.\n",
"\n",
"Technical specifications: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"len(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Issue 2. Text focuses on the wrong details\n",
"- Ask it to focus on the aspects that are relevant to the intended audience."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Your task is to help a marketing team create a \n",
"description for a retail website of a product based \n",
"on a technical fact sheet.\n",
"\n",
"Write a product description based on the information \n",
"provided in the technical specifications delimited by \n",
"triple backticks.\n",
"\n",
"The description is intended for furniture retailers, \n",
"so should be technical in nature and focus on the \n",
"materials the product is constructed from.\n",
"\n",
"Use at most 50 words.\n",
"\n",
"Technical specifications: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response) "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Your task is to help a marketing team create a \n",
"description for a retail website of a product based \n",
"on a technical fact sheet.\n",
"\n",
"Write a product description based on the information \n",
"provided in the technical specifications delimited by \n",
"triple backticks.\n",
"\n",
"The description is intended for furniture retailers, \n",
"so should be technical in nature and focus on the \n",
"materials the product is constructed from.\n",
"\n",
"At the end of the description, include every 7-character \n",
"Product ID in the technical specification.\n",
"\n",
"Use at most 50 words.\n",
"\n",
"Technical specifications: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response) "
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Issue 3. Description needs a table of dimensions\n",
"- Ask it to extract information and organize it in a table."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Your task is to help a marketing team create a \n",
"description for a retail website of a product based \n",
"on a technical fact sheet.\n",
"\n",
"Write a product description based on the information \n",
"provided in the technical specifications delimited by \n",
"triple backticks.\n",
"\n",
"The description is intended for furniture retailers, \n",
"so should be technical in nature and focus on the \n",
"materials the product is constructed from.\n",
"\n",
"At the end of the description, include every 7-character \n",
"Product ID in the technical specification.\n",
"\n",
"After the description, include a table that gives the \n",
"product's dimensions. The table should have two columns.\n",
"In the first column include the name of the dimension. \n",
"In the second column include the measurements in inches only.\n",
"\n",
"Give the table the title 'Product Dimensions'.\n",
"\n",
"Format everything as HTML that can be used in a website. \n",
"Place the description in a <div> element.\n",
"\n",
"Technical specifications: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load Python libraries to view HTML"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import display, HTML"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"display(HTML(response))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "gpt_index",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
================================================
FILE: notebooks-en/l4-summarizing.ipynb
================================================
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"[Open In Colab](https://colab.research.google.com/github/GitHubDaily/ChatGPT-Prompt-Engineering-for-Developers-in-Chinese/blob/master/assets/4-Summarizing/l4-summarizing.ipynb)\n",
"# Summarizing\n",
"In this lesson, you will summarize text with a focus on specific topics.\n",
"\n",
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Basic config\n",
"# Install the openai package and set the API key\n",
"!pip install openai\n",
"# Note: `!export` runs in a subshell and does not persist; use %env in a notebook\n",
"%env OPENAI_API_KEY=sk-..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"import os\n",
"\n",
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv()) # read local .env file\n",
"\n",
"openai.api_key = os.getenv('OPENAI_API_KEY')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_completion(prompt, model=\"gpt-3.5-turbo\"): # Andrew mentioned that the prompt/completion paradigm is preferable for this class\n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=0, # this is the degree of randomness of the model's output\n",
" )\n",
" return response.choices[0].message[\"content\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Text to summarize"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prod_review = \"\"\"\n",
"Got this panda plush toy for my daughter's birthday, \\\n",
"who loves it and takes it everywhere. It's soft and \\ \n",
"super cute, and its face has a friendly look. It's \\ \n",
"a bit small for what I paid though. I think there \\ \n",
"might be other options that are bigger for the \\ \n",
"same price. It arrived a day earlier than expected, \\ \n",
"so I got to play with it myself before I gave it \\ \n",
"to her.\n",
"\"\"\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summarize with a word/sentence/character limit"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Your task is to generate a short summary of a product \\\n",
"review from an ecommerce site. \n",
"\n",
"Summarize the review below, delimited by triple \n",
"backticks, in at most 30 words. \n",
"\n",
"Review: ```{prod_review}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summarize with a focus on shipping and delivery"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Your task is to generate a short summary of a product \\\n",
"review from an ecommerce site to give feedback to the \\\n",
"Shipping department. \n",
"\n",
"Summarize the review below, delimited by triple \n",
"backticks, in at most 30 words, and focusing on any aspects \\\n",
"that mention shipping and delivery of the product. \n",
"\n",
"Review: ```{prod_review}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summarize with a focus on price and value"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Your task is to generate a short summary of a product \\\n",
"review from an ecommerce site to give feedback to the \\\n",
"pricing department, responsible for determining the \\\n",
"price of the product. \n",
"\n",
"Summarize the review below, delimited by triple \n",
"backticks, in at most 30 words, and focusing on any aspects \\\n",
"that are relevant to the price and perceived value. \n",
"\n",
"Review: ```{prod_review}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Comment\n",
"- Summaries include topics that are not related to the topic of focus."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Try \"extract\" instead of \"summarize\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Your task is to extract relevant information from \\\n",
"a product review from an ecommerce site to give \\\n",
"feedback to the Shipping department. \n",
"\n",
"From the review below, delimited by triple backticks, \\\n",
"extract the information relevant to shipping and \\\n",
"delivery. Limit to 30 words. \n",
"\n",
"Review: ```{prod_review}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summarize multiple product reviews"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"review_1 = prod_review \n",
"\n",
"# review for a standing lamp\n",
"review_2 = \"\"\"\n",
"Needed a nice lamp for my bedroom, and this one \\\n",
"had additional storage and not too high of a price \\\n",
"point. Got it fast - arrived in 2 days. The string \\\n",
"to the lamp broke during the transit and the company \\\n",
"happily sent over a new one. Came within a few days \\\n",
"as well. It was easy to put together. Then I had a \\\n",
"missing part, so I contacted their support and they \\\n",
"very quickly got me the missing piece! Seems to me \\\n",
"to be a great company that cares about their customers \\\n",
"and products. \n",
"\"\"\"\n",
"\n",
"# review for an electric toothbrush\n",
"review_3 = \"\"\"\n",
"My dental hygienist recommended an electric toothbrush, \\\n",
"which is why I got this. The battery life seems to be \\\n",
"pretty impressive so far. After initial charging and \\\n",
"leaving the charger plugged in for the first week to \\\n",
"condition the battery, I've unplugged the charger and \\\n",
"been using it for twice daily brushing for the last \\\n",
"3 weeks all on the same charge. But the toothbrush head \\\n",
"is too small. I’ve seen baby toothbrushes bigger than \\\n",
"this one. I wish the head was bigger with different \\\n",
"length bristles to get between teeth better because \\\n",
"this one doesn’t. Overall if you can get this one \\\n",
"around the $50 mark, it's a good deal. The manufactuer's \\\n",
"replacements heads are pretty expensive, but you can \\\n",
"get generic ones that're more reasonably priced. This \\\n",
"toothbrush makes me feel like I've been to the dentist \\\n",
"every day. My teeth feel sparkly clean! \n",
"\"\"\"\n",
"\n",
"# review for a blender\n",
"review_4 = \"\"\"\n",
"So, they still had the 17 piece system on seasonal \\\n",
"sale for around $49 in the month of November, about \\\n",
"half off, but for some reason (call it price gouging) \\\n",
"around the second week of December the prices all went \\\n",
"up to about anywhere from between $70-$89 for the same \\\n",
"system. And the 11 piece system went up around $10 or \\\n",
"so in price also from the earlier sale price of $29. \\\n",
"So it looks okay, but if you look at the base, the part \\\n",
"where the blade locks into place doesn’t look as good \\\n",
"as in previous editions from a few years ago, but I \\\n",
"plan to be very gentle with it (example, I crush \\\n",
"very hard items like beans, ice, rice, etc. in the \\\n",
"blender first then pulverize them in the serving size \\\n",
"I want in the blender then switch to the whipping \\\n",
"blade for a finer flour, and use the cross cutting blade \\\n",
"first when making smoothies, then use the flat blade \\\n",
"if I need them finer/less pulpy). Special tip when making \\\n",
"smoothies, finely cut and freeze the fruits and \\\n",
"vegetables (if using spinach-lightly stew soften the \\\n",
"spinach then freeze until ready for use-and if making \\\n",
"sorbet, use a small to medium sized food processor) \\\n",
"that you plan to use that way you can avoid adding so \\\n",
"much ice if at all-when making your smoothie. \\\n",
"After about a year, the motor was making a funny noise. \\\n",
"I called customer service but the warranty expired \\\n",
"already, so I had to buy another one. FYI: The overall \\\n",
"quality has gone done in these types of products, so \\\n",
"they are kind of counting on brand recognition and \\\n",
"consumer loyalty to maintain sales. Got it in about \\\n",
"two days.\n",
"\"\"\"\n",
"\n",
"reviews = [review_1, review_2, review_3, review_4]\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for i in range(len(reviews)):\n",
" prompt = f\"\"\"\n",
" Your task is to generate a short summary of a product \\\n",
" review from an ecommerce site. \n",
"\n",
" Summarize the review below, delimited by triple \\\n",
" backticks in at most 20 words. \n",
"\n",
" Review: ```{reviews[i]}```\n",
" \"\"\"\n",
"\n",
" response = get_completion(prompt)\n",
" print(i, response, \"\\n\")"
]
},
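{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"To reuse the summaries later instead of only printing them, you can collect them into a list. A minimal sketch; `summaries` is a name introduced here for illustration:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"summaries = []\n",
"for review in reviews:\n",
"    prompt = f\"\"\"\n",
"    Summarize the review below, delimited by triple \\\n",
"    backticks, in at most 20 words. \n",
"\n",
"    Review: ```{review}```\n",
"    \"\"\"\n",
"    summaries.append(get_completion(prompt))\n",
"\n",
"for summary in summaries:\n",
"    print(summary)"
]
},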
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "gpt_index",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
================================================
FILE: notebooks-en/l5-inferring.ipynb
================================================
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"[Open In Colab](https://colab.research.google.com/github/GitHubDaily/ChatGPT-Prompt-Engineering-for-Developers-in-Chinese/blob/master/assets/5-Inferring/l5-inferring.ipynb)\n",
"# Inferring\n",
"In this lesson, you will infer sentiment and topics from product reviews and news articles.\n",
"\n",
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Basic config\n",
"# Install the openai package and set your API key\n",
"# Note: this notebook uses the pre-1.0 openai library (ChatCompletion API)\n",
"!pip install \"openai<1\"\n",
"# `!export` runs in a subshell and does not persist across cells; use %env instead\n",
"%env OPENAI_API_KEY=sk-..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"import os\n",
"\n",
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv()) # read local .env file\n",
"\n",
"openai.api_key = os.getenv('OPENAI_API_KEY')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_completion(prompt, model=\"gpt-3.5-turbo\"):\n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=0, # this is the degree of randomness of the model's output\n",
" )\n",
" return response.choices[0].message[\"content\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Product review text"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"lamp_review = \"\"\"\n",
"Needed a nice lamp for my bedroom, and this one had \\\n",
"additional storage and not too high of a price point. \\\n",
"Got it fast. The string to our lamp broke during the \\\n",
"transit and the company happily sent over a new one. \\\n",
"Came within a few days as well. It was easy to put \\\n",
"together. I had a missing part, so I contacted their \\\n",
"support and they very quickly got me the missing piece! \\\n",
"Lumina seems to me to be a great company that cares \\\n",
"about their customers and products!!\n",
"\"\"\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sentiment (positive/negative)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"What is the sentiment of the following product review, \n",
"which is delimited with triple backticks?\n",
"\n",
"Review text: '''{lamp_review}'''\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"What is the sentiment of the following product review, \n",
"which is delimited with triple backticks?\n",
"\n",
"Give your answer as a single word, either \"positive\" \\\n",
"or \"negative\".\n",
"\n",
"Review text: '''{lamp_review}'''\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Identify types of emotions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Identify a list of emotions that the writer of the \\\n",
"following review is expressing. Include no more than \\\n",
"five items in the list. Format your answer as a list of \\\n",
"lower-case words separated by commas.\n",
"\n",
"Review text: '''{lamp_review}'''\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Identify anger"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Is the writer of the following review expressing anger? \\\n",
"The review is delimited with triple backticks. \\\n",
"Give your answer as either yes or no.\n",
"\n",
"Review text: '''{lamp_review}'''\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extract product and company name from customer reviews"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Identify the following items from the review text: \n",
"- Item purchased by reviewer\n",
"- Company that made the item\n",
"\n",
"The review is delimited with triple backticks. \\\n",
"Format your response as a JSON object with \\\n",
"\"Item\" and \"Brand\" as the keys. \n",
"If the information isn't present, use \"unknown\" \\\n",
"as the value.\n",
"Make your response as short as possible.\n",
" \n",
"Review text: '''{lamp_review}'''\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Doing multiple tasks at once"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Identify the following items from the review text: \n",
"- Sentiment (positive or negative)\n",
"- Is the reviewer expressing anger? (true or false)\n",
"- Item purchased by reviewer\n",
"- Company that made the item\n",
"\n",
"The review is delimited with triple backticks. \\\n",
"Format your response as a JSON object with \\\n",
"\"Sentiment\", \"Anger\", \"Item\" and \"Brand\" as the keys.\n",
"If the information isn't present, use \"unknown\" \\\n",
"as the value.\n",
"Make your response as short as possible.\n",
"Format the Anger value as a boolean.\n",
"\n",
"Review text: '''{lamp_review}'''\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Inferring topics"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"story = \"\"\"\n",
"In a recent survey conducted by the government, \n",
"public sector employees were asked to rate their level \n",
"of satisfaction with the department they work at. \n",
"The results revealed that NASA was the most popular \n",
"department with a satisfaction rating of 95%.\n",
"\n",
"One NASA employee, John Smith, commented on the findings, \n",
"stating, \"I'm not surprised that NASA came out on top. \n",
"It's a great place to work with amazing people and \n",
"incredible opportunities. I'm proud to be a part of \n",
"such an innovative organization.\"\n",
"\n",
"The results were also welcomed by NASA's management team, \n",
"with Director Tom Johnson stating, \"We are thrilled to \n",
"hear that our employees are satisfied with their work at NASA. \n",
"We have a talented and dedicated team who work tirelessly \n",
"to achieve our goals, and it's fantastic to see that their \n",
"hard work is paying off.\"\n",
"\n",
"The survey also revealed that the \n",
"Social Security Administration had the lowest satisfaction \n",
"rating, with only 45% of employees indicating they were \n",
"satisfied with their job. The government has pledged to \n",
"address the concerns raised by employees in the survey and \n",
"work towards improving job satisfaction across all departments.\n",
"\"\"\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Infer 5 topics"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Determine five topics that are being discussed in the \\\n",
"following text, which is delimited by triple backticks.\n",
"\n",
"Make each item one or two words long. \n",
"\n",
"Format your response as a list of items separated by commas.\n",
"\n",
"Text sample: '''{story}'''\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response.split(sep=',')"
]
},
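{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The raw split leaves a leading space on each item; strip the whitespace if you want clean topic strings:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Strip stray whitespace around each comma-separated topic\n",
"[topic.strip() for topic in response.split(',')]"
]
},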
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"topic_list = [\n",
" \"nasa\", \"local government\", \"engineering\", \n",
" \"employee satisfaction\", \"federal government\"\n",
"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Make a news alert for certain topics"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Determine whether each item in the following list of \\\n",
"topics is a topic in the text below, which\n",
"is delimited with triple backticks.\n",
"\n",
"Give your answer as a list with 0 or 1 for each topic, \\\n",
"one topic per line, in the form topic: 0 or 1.\n",
"\n",
"List of topics: {\", \".join(topic_list)}\n",
"\n",
"Text sample: '''{story}'''\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Parse lines of the form \"topic: 0/1\"; skip any line without ': '\n",
"topic_dict = {i.split(': ')[0]: int(i.split(': ')[1]) for i in response.split(sep='\\n') if ': ' in i}\n",
"if topic_dict.get('nasa') == 1:\n",
"    print(\"ALERT: New NASA story!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "gpt_index",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
================================================
FILE: notebooks-en/l6-transforming.ipynb
================================================
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"[Open In Colab](https://colab.research.google.com/github/GitHubDaily/ChatGPT-Prompt-Engineering-for-Developers-in-Chinese/blob/master/assets/6-Transforming/l6-transforming.ipynb)\n",
"# Transforming\n",
"\n",
"In this notebook, we will explore how to use Large Language Models for text transformation tasks such as language translation, spelling and grammar checking, tone adjustment, and format conversion.\n",
"\n",
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Basic config\n",
"# Install the openai package and set your API key\n",
"# Note: this notebook uses the pre-1.0 openai library (ChatCompletion API)\n",
"!pip install \"openai<1\"\n",
"# `!export` runs in a subshell and does not persist across cells; use %env instead\n",
"%env OPENAI_API_KEY=sk-..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"import os\n",
"\n",
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv()) # read local .env file\n",
"\n",
"openai.api_key = os.getenv('OPENAI_API_KEY')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_completion(prompt, model=\"gpt-3.5-turbo\", temperature=0): \n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=temperature, \n",
" )\n",
" return response.choices[0].message[\"content\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Translation\n",
"\n",
"ChatGPT is trained with sources in many languages. This gives the model the ability to do translation. Here are some examples of how to use this capability."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Translate the following English text to Spanish: \\\n",
"```Hi, I would like to order a blender```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Tell me which language this is: \n",
"```Combien coûte le lampadaire?```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Translate the following text to French and Spanish\n",
"and English pirate: \\\n",
"```I want to order a basketball```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Translate the following text to Spanish in both the \\\n",
"formal and informal forms: \n",
"'Would you like to order a pillow?'\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Universal Translator\n",
"Imagine you are in charge of IT at a large multinational e-commerce company. Users are messaging you with IT issues in all their native languages. Your staff is from all over the world and speaks only their native languages. You need a universal translator!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"user_messages = [\n",
" \"La performance du système est plus lente que d'habitude.\", # System performance is slower than normal \n",
" \"Mi monitor tiene píxeles que no se iluminan.\", # My monitor has pixels that are not lighting\n",
" \"Il mio mouse non funziona\", # My mouse is not working\n",
" \"Mój klawisz Ctrl jest zepsuty\", # My keyboard has a broken control key\n",
" \"我的屏幕在闪烁\" # My screen is flashing\n",
"] "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for issue in user_messages:\n",
" prompt = f\"Tell me what language this is: ```{issue}```\"\n",
" lang = get_completion(prompt)\n",
" print(f\"Original message ({lang}): {issue}\")\n",
"\n",
" prompt = f\"\"\"\n",
" Translate the following text to English \\\n",
" and Korean: ```{issue}```\n",
" \"\"\"\n",
" response = get_completion(prompt)\n",
" print(response, \"\\n\")"
]
},
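{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As a variation, you can combine language detection and translation into a single request per message, halving the number of API calls. A sketch of the same loop:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for issue in user_messages:\n",
"    prompt = f\"\"\"\n",
"    Name the language of the text delimited by triple backticks, \\\n",
"    then translate it to English and Korean.\n",
"    ```{issue}```\n",
"    \"\"\"\n",
"    print(get_completion(prompt), \"\\n\")"
]
},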
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Try it yourself!\n",
"Try some translations on your own!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tone Transformation\n",
"Writing can vary based on the intended audience. ChatGPT can produce different tones.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"Translate the following from slang to a business letter: \n",
"'Dude, This is Joe, check out this spec on this standing lamp.'\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Format Conversion\n",
"ChatGPT can translate between formats. The prompt should describe the input and output formats."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"data_json = { \"restaurant employees\": [\n",
" {\"name\":\"Shyam\", \"email\":\"shyamjaiswal@gmail.com\"},\n",
" {\"name\":\"Bob\", \"email\":\"bob32@gmail.com\"},\n",
" {\"name\":\"Jai\", \"email\":\"jai87@gmail.com\"}\n",
"]}\n",
"\n",
"prompt = f\"\"\"\n",
"Translate the following python dictionary from JSON to an HTML \\\n",
"table with column headers and title: {data_json}\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import display, Markdown, Latex, HTML, JSON\n",
"display(HTML(response))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Spellcheck/Grammar check.\n",
"\n",
"Here are some examples of common grammar and spelling problems and the LLM's response. \n",
"\n",
"To signal to the LLM that you want it to proofread your text, you instruct the model to 'proofread' or 'proofread and correct'."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"text = [ \n",
" \"The girl with the black and white puppies have a ball.\", # The girl has a ball.\n",
" \"Yolanda has her notebook.\", # ok\n",
" \"Its going to be a long day. Does the car need it’s oil changed?\", # Homonyms\n",
" \"Their goes my freedom. There going to bring they’re suitcases.\", # Homonyms\n",
" \"Your going to need you’re notebook.\", # Homonyms\n",
" \"That medicine effects my ability to sleep. Have you heard of the butterfly affect?\", # Homonyms\n",
" \"This phrase is to cherck chatGPT for speling abilitty\" # spelling\n",
"]\n",
"for t in text:\n",
" prompt = f\"\"\"Proofread and correct the following text\n",
" and rewrite the corrected version. If you don't find\n",
" any errors, just say \"No errors found\". Don't use \n",
" any punctuation around the text:\n",
" ```{t}```\"\"\"\n",
" response = get_completion(prompt)\n",
" print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"text = f\"\"\"\n",
"Got this for my daughter for her birthday cuz she keeps taking \\\n",
"mine from my room. Yes, adults also like pandas too. She takes \\\n",
"it everywhere with her, and it's super soft and cute. One of the \\\n",
"ears is a bit lower than the other, and I don't think that was \\\n",
"designed to be asymmetrical. It's a bit small for what I paid for it \\\n",
"though. I think there might be other options that are bigger for \\\n",
"the same price. It arrived a day earlier than expected, so I got \\\n",
"to play with it myself before I gave it to my daughter.\n",
"\"\"\"\n",
"prompt = f\"proofread and correct this review: ```{text}```\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install redlines\n",
"from redlines import Redlines\n",
"\n",
"diff = Redlines(text,response)\n",
"display(Markdown(diff.output_markdown))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"proofread and correct this review. Make it more compelling. \n",
"Ensure it follows APA style guide and targets an advanced reader. \n",
"Output in markdown format.\n",
"Text: ```{text}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"display(Markdown(response))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Try it yourself!\n",
"Try changing the instructions to form your own review."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Thanks to the following sites:\n",
"\n",
"https://writingprompts.com/bad-grammar-examples/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "gpt_index",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
================================================
FILE: notebooks-en/l7-expanding.ipynb
================================================
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"[Open In Colab](https://colab.research.google.com/github/GitHubDaily/ChatGPT-Prompt-Engineering-for-Developers-in-Chinese/blob/master/assets/7-Expanding/l7-expanding.ipynb)\n",
"# Expanding\n",
"In this lesson, you will generate customer service emails that are tailored to each customer's review.\n",
"\n",
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Basic config\n",
"# Install the openai package and set your API key\n",
"# Note: this notebook uses the pre-1.0 openai library (ChatCompletion API)\n",
"!pip install \"openai<1\"\n",
"# `!export` runs in a subshell and does not persist across cells; use %env instead\n",
"%env OPENAI_API_KEY=sk-..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"import os\n",
"\n",
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv()) # read local .env file\n",
"\n",
"openai.api_key = os.getenv('OPENAI_API_KEY')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_completion(prompt, model=\"gpt-3.5-turbo\", temperature=0): # Andrew mentioned that the prompt/completion paradigm is preferable for this class\n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=temperature, # this is the degree of randomness of the model's output\n",
" )\n",
" return response.choices[0].message[\"content\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Customize the automated reply to a customer email"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# given the sentiment from the lesson on \"inferring\",\n",
"# and the original customer message, customize the email\n",
"sentiment = \"negative\"\n",
"\n",
"# review for a blender\n",
"review = f\"\"\"\n",
"So, they still had the 17 piece system on seasonal \\\n",
"sale for around $49 in the month of November, about \\\n",
"half off, but for some reason (call it price gouging) \\\n",
"around the second week of December the prices all went \\\n",
"up to about anywhere from between $70-$89 for the same \\\n",
"system. And the 11 piece system went up around $10 or \\\n",
"so in price also from the earlier sale price of $29. \\\n",
"So it looks okay, but if you look at the base, the part \\\n",
"where the blade locks into place doesn’t look as good \\\n",
"as in previous editions from a few years ago, but I \\\n",
"plan to be very gentle with it (example, I crush \\\n",
"very hard items like beans, ice, rice, etc. in the \\\n",
"blender first then pulverize them in the serving size \\\n",
"I want in the blender then switch to the whipping \\\n",
"blade for a finer flour, and use the cross cutting blade \\\n",
"first when making smoothies, then use the flat blade \\\n",
"if I need them finer/less pulpy). Special tip when making \\\n",
"smoothies, finely cut and freeze the fruits and \\\n",
"vegetables (if using spinach-lightly stew soften the \\\n",
"spinach then freeze until ready for use-and if making \\\n",
"sorbet, use a small to medium sized food processor) \\\n",
"that you plan to use that way you can avoid adding so \\\n",
"much ice if at all-when making your smoothie. \\\n",
"After about a year, the motor was making a funny noise. \\\n",
"I called customer service but the warranty expired \\\n",
"already, so I had to buy another one. FYI: The overall \\\n",
"quality has gone done in these types of products, so \\\n",
"they are kind of counting on brand recognition and \\\n",
"consumer loyalty to maintain sales. Got it in about \\\n",
"two days.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = f\"\"\"\n",
"You are a customer service AI assistant.\n",
"Your task is to send an email reply to a valued customer.\n",
"Given the customer email delimited by ```, \\\n",
"Generate a reply to thank the customer for their review.\n",
"If the sentiment is positive or neutral, thank them for \\\n",
"their review.\n",
"If the sentiment is negative, apologize and suggest that \\\n",
"they can reach out to customer service. \n",
"Make sure to use specific details from the review.\n",
"Write in a concise and professional tone.\n",
"Sign the email as `AI customer agent`.\n",
"Customer review: ```{review}```\n",
"Review sentiment: {sentiment}\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
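{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The reply above was generated with `temperature=0`, so it is deterministic. Raising the temperature gives more varied wording on each run; try the same prompt at `temperature=0.7`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = get_completion(prompt, temperature=0.7)\n",
"print(response)"
]
},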
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "gpt_index",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
================================================
FILE: notebooks-en/l8-chatbot.ipynb
================================================
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"[Open In Colab](https://colab.research.google.com/github/GitHubDaily/ChatGPT-Prompt-Engineering-for-Developers-in-Chinese/blob/master/assets/8-Chatbot/l8-chatbot.ipynb)\n",
"# The Chat Format\n",
"\n",
"In this notebook, you will explore how you can utilize the chat format to have extended conversations with chatbots personalized or specialized for specific tasks or behaviors.\n",
"\n",
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Basic config\n",
"# Install basic package and set key\n",
"!pip install openai\n",
"# Note: `!export` runs in a subshell and does not affect the kernel;\n",
"# use %env (or os.environ) to set the key inside a notebook\n",
"%env OPENAI_API_KEY=sk-..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import openai\n",
"from dotenv import load_dotenv, find_dotenv\n",
"_ = load_dotenv(find_dotenv()) # read local .env file\n",
"\n",
"openai.api_key = os.getenv('OPENAI_API_KEY')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_completion(prompt, model=\"gpt-3.5-turbo\"):\n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=0, # this is the degree of randomness of the model's output\n",
" )\n",
" return response.choices[0].message[\"content\"]\n",
"\n",
"def get_completion_from_messages(messages, model=\"gpt-3.5-turbo\", temperature=0):\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=temperature, # this is the degree of randomness of the model's output\n",
" )\n",
"# print(str(response.choices[0].message))\n",
" return response.choices[0].message[\"content\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"messages = [ \n",
"{'role':'system', 'content':'You are an assistant that speaks like Shakespeare.'}, \n",
"{'role':'user', 'content':'tell me a joke'}, \n",
"{'role':'assistant', 'content':'Why did the chicken cross the road'}, \n",
"{'role':'user', 'content':'I don\\'t know'} ]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = get_completion_from_messages(messages, temperature=1)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"messages = [ \n",
"{'role':'system', 'content':'You are a friendly chatbot.'}, \n",
"{'role':'user', 'content':'Hi, my name is Isa'} ]\n",
"response = get_completion_from_messages(messages, temperature=1)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"messages = [ \n",
"{'role':'system', 'content':'You are a friendly chatbot.'}, \n",
"{'role':'user', 'content':'Yes, can you remind me, what is my name?'} ]\n",
"response = get_completion_from_messages(messages, temperature=1)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"messages = [ \n",
"{'role':'system', 'content':'You are a friendly chatbot.'},\n",
"{'role':'user', 'content':'Hi, my name is Isa'},\n",
"{'role':'assistant', 'content': \"Hi Isa! It's nice to meet you. \\\n",
"Is there anything I can help you with today?\"},\n",
"{'role':'user', 'content':'Yes, can you remind me, what is my name?'} ]\n",
"response = get_completion_from_messages(messages, temperature=1)\n",
"print(response)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# OrderBot\n",
"We can automate the collection of user prompts and assistant responses to build an OrderBot. The OrderBot will take orders at a pizza restaurant."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install panel jupyter_bokeh"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def collect_messages(_):\n",
" prompt = inp.value_input\n",
" inp.value = ''\n",
" context.append({'role':'user', 'content':f\"{prompt}\"})\n",
" response = get_completion_from_messages(context) \n",
" context.append({'role':'assistant', 'content':f\"{response}\"})\n",
" panels.append(\n",
" pn.Row('User:', pn.pane.Markdown(prompt, width=600)))\n",
" panels.append(\n",
" pn.Row('Assistant:', pn.pane.Markdown(response, width=600, style={'background-color': '#F6F6F6'})))\n",
" \n",
" return pn.Column(*panels)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import panel as pn # GUI\n",
"pn.extension()\n",
"\n",
"panels = [] # collect display \n",
"\n",
"context = [ {'role':'system', 'content':\"\"\"\n",
"You are OrderBot, an automated service to collect orders for a pizza restaurant. \\\n",
"You first greet the customer, then collect the order, \\\n",
"and then ask if it's a pickup or delivery. \\\n",
"You wait to collect the entire order, then summarize it and check for a final \\\n",
"time if the customer wants to add anything else. \\\n",
"If it's a delivery, you ask for an address. \\\n",
"Finally you collect the payment. \\\n",
"Make sure to clarify all options, extras and sizes to uniquely \\\n",
"identify the item from the menu. \\\n",
"You respond in a short, very conversational friendly style. \\\n",
"The menu includes \\\n",
"pepperoni pizza 12.95, 10.00, 7.00 \\\n",
"cheese pizza 10.95, 9.25, 6.50 \\\n",
"eggplant pizza 11.95, 9.75, 6.75 \\\n",
"fries 4.50, 3.50 \\\n",
"greek salad 7.25 \\\n",
"Toppings: \\\n",
"extra cheese 2.00, \\\n",
"mushrooms 1.50 \\\n",
"sausage 3.00 \\\n",
"canadian bacon 3.50 \\\n",
"AI sauce 1.50 \\\n",
"peppers 1.00 \\\n",
"Drinks: \\\n",
"coke 3.00, 2.00, 1.00 \\\n",
"sprite 3.00, 2.00, 1.00 \\\n",
"bottled water 5.00 \\\n",
"\"\"\"} ] # accumulate messages\n",
"\n",
"\n",
"inp = pn.widgets.TextInput(value=\"Hi\", placeholder='Enter text here…')\n",
"button_conversation = pn.widgets.Button(name=\"Chat!\")\n",
"\n",
"interactive_conversation = pn.bind(collect_messages, button_conversation)\n",
"\n",
"dashboard = pn.Column(\n",
" inp,\n",
" pn.Row(button_conversation),\n",
" pn.panel(interactive_conversation, loading_indicator=True, height=300),\n",
")\n",
"\n",
"dashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"messages = context.copy()\n",
"messages.append(\n",
"{'role':'system', 'content':'Create a json summary of the previous food order. Itemize the price for each item.\\\n",
" The fields should be 1) pizza, include size 2) list of toppings 3) list of drinks, include size 4) list of sides include size 5)total price '}, \n",
")\n",
" #The fields should be 1) pizza, price 2) list of toppings 3) list of drinks, include size include price 4) list of sides include size include price, 5)total price '}, \n",
"\n",
"response = get_completion_from_messages(messages, temperature=0)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "gpt_index",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
================================================
FILE: notebooks-zh/1. 引言.md
================================================
# 简介
**作者 吴恩达教授**
欢迎来到本课程,我们将为开发人员介绍 ChatGPT 提示工程。本课程由 Isa Fulford 教授和我一起授课。Isa Fulford 是 OpenAI 的技术团队成员,曾开发过受欢迎的 ChatGPT 检索插件,并且在教授人们如何在产品中使用 LLM 或 LLM 技术方面做出了很大贡献。她还参与编写了教授人们使用 Prompt 的 OpenAI cookbook。
互联网上有很多有关提示的材料,例如《30 prompts everyone has to know》之类的文章。这些文章主要集中在 ChatGPT Web 用户界面上,许多人在使用它执行特定的、通常是一次性的任务。但是,我认为 LLM 或大型语言模型作为开发人员的更强大功能是使用 API 调用到 LLM,以快速构建软件应用程序。我认为这方面还没有得到充分的重视。实际上,我们在 DeepLearning.AI 的姊妹公司 AI Fund 的团队一直在与许多初创公司合作,将这些技术应用于许多不同的应用程序上。看到 LLM API 能够让开发人员非常快速地构建应用程序,这真是令人兴奋。
在本课程中,我们将与您分享一些可能性以及如何实现它们的最佳实践。
随着大型语言模型(LLM)的发展,LLM 大致可以分为两种类型,即基础LLM和指令微调LLM。基础LLM是基于文本训练数据,训练出预测下一个单词能力的模型,其通常是在互联网和其他来源的大量数据上训练的。例如,如果你以“从前有一只独角兽”作为提示,基础LLM可能会继续预测“生活在一个与所有独角兽朋友的神奇森林中”。但是,如果你以“法国的首都是什么”为提示,则基础LLM可能会根据互联网上的文章,将答案预测为“法国最大的城市是什么?法国的人口是多少?”,因为互联网上的文章很可能是有关法国国家的问答题目列表。
许多 LLMs 的研究和实践的动力正在指令调整的 LLMs 上。指令调整的 LLMs 已经被训练来遵循指令。因此,如果你问它,“法国的首都是什么?”,它更有可能输出“法国的首都是巴黎”。指令调整的 LLMs 的训练通常是从已经训练好的基本 LLMs 开始,该模型已经在大量文本数据上进行了训练。然后,使用输入是指令、输出是其应该返回的结果的数据集来对其进行微调,要求它遵循这些指令。然后通常使用一种称为 RLHF(reinforcement learning from human feedback,人类反馈强化学习)的技术进行进一步改进,使系统更能够有帮助地遵循指令。
因为指令调整的 LLMs 已经被训练成有益、诚实和无害的,所以与基础LLMs相比,它们更不可能输出有问题的文本,如有害输出。许多实际使用场景已经转向指令调整的LLMs。您在互联网上找到的一些最佳实践可能更适用于基础LLMs,但对于今天的大多数实际应用,我们建议将注意力集中在指令调整的LLMs上,这些LLMs更容易使用,而且由于OpenAI和其他LLM公司的工作,它们变得更加安全和更加协调。
因此,本课程将重点介绍针对指令微调 LLM 的最佳实践,我们也建议您在大多数应用程序中使用这类模型。在继续之前,我想感谢 OpenAI 和 DeepLearning.AI 团队为 Isa 和我准备教学材料所作出的贡献。我非常感激 OpenAI 的 Andrew Main、Joe Palermo、Boris Power、Ted Sanders 和 Lillian Weng,他们参与了材料的头脑风暴与审核,并为这个短期课程编制了课程大纲。我也感谢 DeepLearning.AI 方面的 Geoff Ladwig、Eddy Shyu 和 Tommy Nelson 的工作。
当您使用指令微调 LLM 时,可以将其类比为向另一个人提供指令:假设对方很聪明,但不了解您任务的具体细节。当 LLM 无法正常工作时,有时是因为指令不够清晰。例如,如果您说“请为我写一些关于阿兰·图灵的东西”,那么清楚地表明您希望文本专注于他的科学工作、个人生活还是历史角色,会更有帮助。此外,您还可以指定文本的语调,例如采用专业记者的写作风格,或者更像写给朋友的随笔。
当然,如果你想象一下让一位新毕业的大学生为你完成这个任务,你甚至可以提前指定他们应该阅读哪些文本片段来撰写关于 Alan Turing 的文本,这能够帮助他们更好地完成这项任务。下一章你会看到创建提示的第一个重要原则:如何让提示清晰明确;你还会学到第二个原则:给 LLM 时间去思考。
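上面关于“清晰、具体的指令”的建议,可以用一个最小的 Python 示意来对比(提示词内容均为本文假设的示例,并非课程原文):

```python
# 示意:对比模糊提示与清晰、具体的提示(内容均为假设的示例)
vague_prompt = "请为我写一些关于阿兰·图灵的东西。"

# 清晰的提示:明确主题范围、语调与篇幅
specific_prompt = (
    "请以专业记者的语调,写一段约 100 字的短文,"
    "专注介绍阿兰·图灵在计算机科学上的贡献"
    "(例如图灵机与图灵测试),并以一句话总结收尾。"
)

# 两个提示都可以直接发给指令微调 LLM;
# 后者指定了主题、语调和篇幅,更可能得到想要的输出
print(specific_prompt)
```

两个字符串都可以作为后续章节中 `get_completion` 函数的输入;差别只在于第二个提示把期望说清楚了。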
================================================
FILE: notebooks-zh/2. 指南 Guidelines.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 第二章 编写 Prompt 的原则\n",
"\n",
" 本章的主要内容为编写 Prompt 的原则,在本章中,我们将给出两个编写 Prompt 的原则与一些相关的策略,你将练习基于这两个原则来编写有效的 Prompt,从而便捷而有效地使用 LLM。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 一、环境配置"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"本教程使用 OpenAI 所开放的 ChatGPT API,因此你需要首先拥有一个 ChatGPT 的 API_KEY(也可以直接访问官方网址在线测试),然后需要安装 openai 的第三方库"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"首先需要安装所需第三方库:\n",
"\n",
"openai:\n",
"\n",
"```bash\n",
"pip install openai\n",
"```\n",
"\n",
"dotenv:\n",
"\n",
"```bash\n",
"pip install -U python-dotenv\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"# 将自己的 API-KEY 导入系统环境变量\n",
"# 注意:!export 在子 shell 中执行,不会影响当前内核;\n",
"# 在 notebook 中应使用 %env(或 os.environ)进行设置\n",
"%env OPENAI_API_KEY=api-key"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"import os\n",
"from dotenv import load_dotenv, find_dotenv\n",
"# 导入第三方库\n",
"\n",
"_ = load_dotenv(find_dotenv())\n",
"# 读取系统中的环境变量\n",
"\n",
"openai.api_key = os.getenv('OPENAI_API_KEY')\n",
"# 设置 API_KEY"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"我们将在后续课程中深入探究 OpenAI 提供的 ChatCompletion API 的使用方法,在此处,我们先将它封装成一个函数,你无需知道其内部机理,仅需知道调用该函数输入 Prompt 其将会给出对应的 Completion 即可。"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# 一个封装 OpenAI 接口的函数,参数为 Prompt,返回对应结果\n",
"def get_completion(prompt, model=\"gpt-3.5-turbo\"):\n",
" '''\n",
" prompt: 对应的提示\n",
" model: 调用的模型,默认为 gpt-3.5-turbo(ChatGPT),有内测资格的用户可以选择 gpt-4\n",
" '''\n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=0, # 模型输出的温度系数,控制输出的随机程度\n",
" )\n",
" # 调用 OpenAI 的 ChatCompletion 接口\n",
" return response.choices[0].message[\"content\"]\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 二、两个基本原则"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 原则一:编写清晰、具体的指令\n",
"\n",
"你应该通过提供尽可能清晰和具体的指令来表达您希望模型执行的操作。这将引导模型给出正确的输出,并减少你得到无关或不正确响应的可能。编写清晰的指令不意味着简短的指令,因为在许多情况下,更长的提示实际上更清晰且提供了更多上下文,这实际上可能导致更详细更相关的输出。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**策略一:使用分隔符清晰地表示输入的不同部分**,分隔符可以是:```,\"\",<>,<tag>,</tag> 等\n",
"\n",
"你可以使用任何明显的标点符号将特定的文本部分与提示的其余部分分开。这可以是任何可以使模型明确知道这是一个单独部分的标记。使用分隔符是一种可以避免提示注入的有用技术。提示注入是指如果用户将某些输入添加到提示中,则可能会向模型提供与您想要执行的操作相冲突的指令,从而使其遵循冲突的指令而不是执行您想要的操作。即,输入里面可能包含其他指令,会覆盖掉你的指令。对此,使用分隔符是一个不错的策略。\n",
"\n",
"以下是一个例子,我们给出一段话并要求 GPT 进行总结,在该示例中我们使用 ``` 来作为分隔符\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Clear and specific instructions should be provided to guide a model towards the desired output, and longer prompts can provide more clarity and context for the model, leading to more detailed and relevant outputs.\n"
]
}
],
"source": [
"# 中文版见下一个 cell\n",
"text = f\"\"\"\n",
"You should express what you want a model to do by \\ \n",
"providing instructions that are as clear and \\ \n",
"specific as you can possibly make them. \\ \n",
"This will guide the model towards the desired output, \\ \n",
"and reduce the chances of receiving irrelevant \\ \n",
"or incorrect responses. Don't confuse writing a \\ \n",
"clear prompt with writing a short prompt. \\ \n",
"In many cases, longer prompts provide more clarity \\ \n",
"and context for the model, which can lead to \\ \n",
"more detailed and relevant outputs.\n",
"\"\"\"\n",
"prompt = f\"\"\"\n",
"Summarize the text delimited by triple backticks \\ \n",
"into a single sentence.\n",
"```{text}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"提供清晰具体的指示,避免无关或不正确响应,不要混淆写清晰和写简短,更长的提示可以提供更多清晰度和上下文信息,导致更详细和相关的输出。\n"
]
}
],
"source": [
"text = f\"\"\"\n",
"你应该提供尽可能清晰、具体的指示,以表达你希望模型执行的任务。\\\n",
"这将引导模型朝向所需的输出,并降低收到无关或不正确响应的可能性。\\\n",
"不要将写清晰的提示与写简短的提示混淆。\\\n",
"在许多情况下,更长的提示可以为模型提供更多的清晰度和上下文信息,从而导致更详细和相关的输出。\n",
"\"\"\"\n",
"# 需要总结的文本内容\n",
"prompt = f\"\"\"\n",
"把用三个反引号括起来的文本总结成一句话。\n",
"```{text}```\n",
"\"\"\"\n",
"# 指令内容,使用 ``` 来分隔指令和待总结的内容\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**策略二:要求一个结构化的输出**,可以是 Json、HTML 等格式\n",
"\n",
"第二个策略是要求生成一个结构化的输出,这可以使模型的输出更容易被我们解析,例如,你可以在 Python 中将其读入字典或列表中。\n",
"\n",
"在以下示例中,我们要求 GPT 生成三本书的标题、作者和类别,并要求 GPT 以 Json 的格式返回给我们,为便于解析,我们指定了 Json 的键。"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[\n",
" {\n",
" \"book_id\": 1,\n",
" \"title\": \"The Lost City of Zorath\",\n",
" \"author\": \"Aria Blackwood\",\n",
" \"genre\": \"Fantasy\"\n",
" },\n",
" {\n",
" \"book_id\": 2,\n",
" \"title\": \"The Last Survivors\",\n",
" \"author\": \"Ethan Stone\",\n",
" \"genre\": \"Science Fiction\"\n",
" },\n",
" {\n",
" \"book_id\": 3,\n",
" \"title\": \"The Secret Life of Bees\",\n",
" \"author\": \"Lila Rose\",\n",
" \"genre\": \"Romance\"\n",
" }\n",
"]\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Generate a list of three made-up book titles along \\ \n",
"with their authors and genres. \n",
"Provide them in JSON format with the following keys: \n",
"book_id, title, author, genre.\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"books\": [\n",
" {\n",
" \"book_id\": 1,\n",
" \"title\": \"The Shadow of the Wind\",\n",
" \"author\": \"Carlos Ruiz Zafón\",\n",
" \"genre\": \"Mystery\"\n",
" },\n",
" {\n",
" \"book_id\": 2,\n",
" \"title\": \"The Name of the Wind\",\n",
" \"author\": \"Patrick Rothfuss\",\n",
" \"genre\": \"Fantasy\"\n",
" },\n",
" {\n",
" \"book_id\": 3,\n",
" \"title\": \"The Hitchhiker's Guide to the Galaxy\",\n",
" \"author\": \"Douglas Adams\",\n",
" \"genre\": \"Science Fiction\"\n",
" }\n",
" ]\n",
"}\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"请生成包括书名、作者和类别的三本虚构书籍清单,\\\n",
"并以 JSON 格式提供,其中包含以下键:book_id、title、author、genre。\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**策略三:要求模型检查是否满足条件**\n",
"\n",
"如果任务做出的假设不一定满足,我们可以告诉模型先检查这些假设,如果不满足,指示并停止执行。你还可以考虑潜在的边缘情况以及模型应该如何处理它们,以避免意外的错误或结果。\n",
"\n",
"在如下示例中,我们将分别给模型两段文本,分别是制作茶的步骤以及一段没有明确步骤的文本。我们将要求模型判断其是否包含一系列指令,如果包含则按照给定格式重新编写指令,不包含则回答未提供步骤。"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Completion for Text 1:\n",
"Step 1 - Get some water boiling.\n",
"Step 2 - Grab a cup and put a tea bag in it.\n",
"Step 3 - Once the water is hot enough, pour it over the tea bag.\n",
"Step 4 - Let it sit for a bit so the tea can steep.\n",
"Step 5 - After a few minutes, take out the tea bag.\n",
"Step 6 - Add some sugar or milk to taste.\n",
"Step 7 - Enjoy your delicious cup of tea!\n",
"\n",
"\n"
]
}
],
"source": [
"text_1 = f\"\"\"\n",
"Making a cup of tea is easy! First, you need to get some \\ \n",
"water boiling. While that's happening, \\ \n",
"grab a cup and put a tea bag in it. Once the water is \\ \n",
"hot enough, just pour it over the tea bag. \\ \n",
"Let it sit for a bit so the tea can steep. After a \\ \n",
"few minutes, take out the tea bag. If you \\ \n",
"like, you can add some sugar or milk to taste. \\ \n",
"And that's it! You've got yourself a delicious \\ \n",
"cup of tea to enjoy.\n",
"\"\"\"\n",
"prompt = f\"\"\"\n",
"You will be provided with text delimited by triple quotes. \n",
"If it contains a sequence of instructions, \\ \n",
"re-write those instructions in the following format:\n",
"\n",
"Step 1 - ...\n",
"Step 2 - …\n",
"…\n",
"Step N - …\n",
"\n",
"If the text does not contain a sequence of instructions, \\ \n",
"then simply write \\\"No steps provided.\\\"\n",
"\n",
"\\\"\\\"\\\"{text_1}\\\"\\\"\\\"\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(\"Completion for Text 1:\")\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Completion for Text 2:\n",
"No steps provided.\n"
]
}
],
"source": [
"text_2 = f\"\"\"\n",
"The sun is shining brightly today, and the birds are \\\n",
"singing. It's a beautiful day to go for a \\ \n",
"walk in the park. The flowers are blooming, and the \\ \n",
"trees are swaying gently in the breeze. People \\ \n",
"are out and about, enjoying the lovely weather. \\ \n",
"Some are having picnics, while others are playing \\ \n",
"games or simply relaxing on the grass. It's a \\ \n",
"perfect day to spend time outdoors and appreciate the \\ \n",
"beauty of nature.\n",
"\"\"\"\n",
"prompt = f\"\"\"You will be provided with text delimited by triple quotes. \n",
"If it contains a sequence of instructions, \\ \n",
"re-write those instructions in the following format:\n",
"Step 1 - ...\n",
"Step 2 - …\n",
"…\n",
"Step N - …\n",
"\n",
"If the text does not contain a sequence of instructions, \\ \n",
"then simply write \\\"No steps provided.\\\"\n",
"\n",
"\\\"\\\"\\\"{text_2}\\\"\\\"\\\"\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(\"Completion for Text 2:\")\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Text 1 的总结:\n",
"第一步 - 把水烧开。\n",
"第二步 - 拿一个杯子并把茶包放进去。\n",
"第三步 - 把烧开的水倒在茶包上。\n",
"第四步 - 等待几分钟,让茶叶浸泡。\n",
"第五步 - 取出茶包。\n",
"第六步 - 如果你愿意,可以加一些糖或牛奶调味。\n",
"第七步 - 就这样,你可以享受一杯美味的茶了。\n"
]
}
],
"source": [
"# 有步骤的文本\n",
"text_1 = f\"\"\"\n",
"泡一杯茶很容易。首先,需要把水烧开。\\\n",
"在等待期间,拿一个杯子并把茶包放进去。\\\n",
"一旦水足够热,就把它倒在茶包上。\\\n",
"等待一会儿,让茶叶浸泡。几分钟后,取出茶包。\\\n",
"如果你愿意,可以加一些糖或牛奶调味。\\\n",
"就这样,你可以享受一杯美味的茶了。\n",
"\"\"\"\n",
"prompt = f\"\"\"\n",
"您将获得由三个引号括起来的文本。\\\n",
"如果它包含一系列的指令,则需要按照以下格式重新编写这些指令:\n",
"\n",
"第一步 - ...\n",
"第二步 - …\n",
"…\n",
"第N步 - …\n",
"\n",
"如果文本中不包含一系列的指令,则直接写“未提供步骤”。\n",
"\\\"\\\"\\\"{text_1}\\\"\\\"\\\"\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(\"Text 1 的总结:\")\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Text 2 的总结:\n",
"未提供步骤。\n"
]
}
],
"source": [
"# 无步骤的文本\n",
"text_2 = f\"\"\"\n",
"今天阳光明媚,鸟儿在歌唱。\\\n",
"这是一个去公园散步的美好日子。\\\n",
"鲜花盛开,树枝在微风中轻轻摇曳。\\\n",
"人们外出享受着这美好的天气,有些人在野餐,有些人在玩游戏或者在草地上放松。\\\n",
"这是一个完美的日子,可以在户外度过并欣赏大自然的美景。\n",
"\"\"\"\n",
"prompt = f\"\"\"\n",
"您将获得由三个引号括起来的文本。\\\n",
"如果它包含一系列的指令,则需要按照以下格式重新编写这些指令:\n",
"\n",
"第一步 - ...\n",
"第二步 - …\n",
"…\n",
"第N步 - …\n",
"\n",
"如果文本中不包含一系列的指令,则直接写“未提供步骤”。\n",
"\\\"\\\"\\\"{text_2}\\\"\\\"\\\"\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(\"Text 2 的总结:\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**策略四:提供少量示例**\n",
"\n",
"即在要求模型执行实际任务之前,提供给它少量成功执行任务的示例。\n",
"\n",
"例如,在以下的示例中,我们告诉模型其任务是以一致的风格回答问题,并先给它一个孩子和一个祖父之间的对话的例子。孩子说,“教我耐心”,祖父用这些隐喻回答。因此,由于我们已经告诉模型要以一致的语气回答,现在我们说“教我韧性”,由于模型已经有了这个少样本示例,它将以类似的语气回答下一个任务。"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<grandparent>: Resilience is like a tree that bends with the wind but never breaks. It is the ability to bounce back from adversity and keep moving forward, even when things get tough. Just like a tree that grows stronger with each storm it weathers, resilience is a quality that can be developed and strengthened over time.\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Your task is to answer in a consistent style.\n",
"\n",
"<child>: Teach me about patience.\n",
"\n",
"<grandparent>: The river that carves the deepest \\ \n",
"valley flows from a modest spring; the \\ \n",
"grandest symphony originates from a single note; \\ \n",
"the most intricate tapestry begins with a solitary thread.\n",
"\n",
"<child>: Teach me about resilience.\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<祖父母>: 韧性就像是一棵树,它需要经历风吹雨打、寒冬酷暑,才能成长得更加坚强。在生活中,我们也需要经历各种挫折和困难,才能锻炼出韧性。记住,不要轻易放弃,坚持下去,你会发现自己变得更加坚强。\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"你的任务是以一致的风格回答问题。\n",
"\n",
"<孩子>: 教我耐心。\n",
"\n",
"<祖父母>: 挖出最深峡谷的河流源于一处不起眼的泉眼;最宏伟的交响乐从单一的音符开始;最复杂的挂毯以一根孤独的线开始编织。\n",
"\n",
"<孩子>: 教我韧性。\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 原则二:给模型时间去思考\n",
"\n",
"如果模型匆忙地得出了错误的结论,您应该尝试重新构思查询,请求模型在提供最终答案之前进行一系列相关的推理。换句话说,如果您给模型一个在短时间或用少量文字无法完成的任务,它可能会猜测错误。这种情况对人来说也是一样的。如果您让某人在没有时间计算出答案的情况下完成复杂的数学问题,他们也可能会犯错误。因此,在这些情况下,您可以指示模型花更多时间思考问题,这意味着它在任务上花费了更多的计算资源。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**策略一:指定完成任务所需的步骤**\n",
"\n",
"接下来我们将通过给定一个复杂任务,给出完成该任务的一系列步骤,来展示这一策略的效果"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"首先我们描述了杰克和吉尔的故事,并给出一个指令。该指令是执行以下操作。首先,用一句话概括三个反引号限定的文本。第二,将摘要翻译成法语。第三,在法语摘要中列出每个名称。第四,输出包含以下键的 JSON 对象:法语摘要和名称数。然后我们要用换行符分隔答案。"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Completion for prompt 1:\n",
"Two siblings, Jack and Jill, go on a quest to fetch water from a well on a hilltop, but misfortune strikes and they both tumble down the hill, returning home slightly battered but with their adventurous spirits undimmed.\n",
"\n",
"Deux frères et sœurs, Jack et Jill, partent en quête d'eau d'un puits sur une colline, mais un malheur frappe et ils tombent tous les deux de la colline, rentrant chez eux légèrement meurtris mais avec leurs esprits aventureux intacts. \n",
"Noms: Jack, Jill.\n",
"\n",
"{\n",
" \"french_summary\": \"Deux frères et sœurs, Jack et Jill, partent en quête d'eau d'un puits sur une colline, mais un malheur frappe et ils tombent tous les deux de la colline, rentrant chez eux légèrement meurtris mais avec leurs esprits aventureux intacts.\",\n",
" \"num_names\": 2\n",
"}\n"
]
}
],
"source": [
"text = f\"\"\"\n",
"In a charming village, siblings Jack and Jill set out on \\ \n",
"a quest to fetch water from a hilltop \\ \n",
"well. As they climbed, singing joyfully, misfortune \\ \n",
"struck—Jack tripped on a stone and tumbled \\ \n",
"down the hill, with Jill following suit. \\ \n",
"Though slightly battered, the pair returned home to \\ \n",
"comforting embraces. Despite the mishap, \\ \n",
"their adventurous spirits remained undimmed, and they \\ \n",
"continued exploring with delight.\n",
"\"\"\"\n",
"# example 1\n",
"prompt_1 = f\"\"\"\n",
"Perform the following actions: \n",
"1 - Summarize the following text delimited by triple \\\n",
"backticks with 1 sentence.\n",
"2 - Translate the summary into French.\n",
"3 - List each name in the French summary.\n",
"4 - Output a json object that contains the following \\\n",
"keys: french_summary, num_names.\n",
"\n",
"Separate your answers with line breaks.\n",
"\n",
"Text:\n",
"```{text}```\n",
"\"\"\"\n",
"response = get_completion(prompt_1)\n",
"print(\"Completion for prompt 1:\")\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"prompt 1:\n",
"1-兄妹在山顶井里打水时发生意外,但仍然保持冒险精神。\n",
"2-Dans un charmant village, les frère et sœur Jack et Jill partent chercher de l'eau dans un puits au sommet de la montagne. Malheureusement, Jack trébuche sur une pierre et tombe de la montagne, suivi de près par Jill. Bien qu'ils soient légèrement blessés, ils retournent chez eux chaleureusement. Malgré cet accident, leur esprit d'aventure ne diminue pas et ils continuent à explorer joyeusement.\n",
"3-Jack, Jill\n",
"4-{\n",
" \"French_summary\": \"Dans un charmant village, les frère et sœur Jack et Jill partent chercher de l'eau dans un puits au sommet de la montagne. Malheureusement, Jack trébuche sur une pierre et tombe de la montagne, suivi de près par Jill. Bien qu'ils soient légèrement blessés, ils retournent chez eux chaleureusement. Malgré cet accident, leur esprit d'aventure ne diminue pas et ils continuent à explorer joyeusement.\",\n",
" \"num_names\": 2\n",
"}\n"
]
}
],
"source": [
"text = f\"\"\"\n",
"在一个迷人的村庄里,兄妹杰克和吉尔出发去一个山顶井里打水。\\\n",
"他们一边唱着欢乐的歌,一边往上爬,\\\n",
"然而不幸降临——杰克绊了一块石头,从山上滚了下来,吉尔紧随其后。\\\n",
"虽然略有些摔伤,但他们还是回到了温馨的家中。\\\n",
"尽管出了这样的意外,他们的冒险精神依然没有减弱,继续充满愉悦地探索。\n",
"\"\"\"\n",
"# example 1\n",
"prompt_1 = f\"\"\"\n",
"执行以下操作:\n",
"1-用一句话概括下面用三个反引号括起来的文本。\n",
"2-将摘要翻译成法语。\n",
"3-在法语摘要中列出每个人名。\n",
"4-输出一个 JSON 对象,其中包含以下键:French_summary,num_names。\n",
"\n",
"请用换行符分隔您的答案。\n",
"\n",
"Text:\n",
"```{text}```\n",
"\"\"\"\n",
"response = get_completion(prompt_1)\n",
"print(\"prompt 1:\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"上述输出仍然存在一定问题,例如,键“姓名”会被替换为法语,因此,我们给出一个更好的 Prompt,该 Prompt 指定了输出的格式"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Completion for prompt 2:\n",
"Summary: 兄妹杰克和吉尔在山顶井里打水时发生意外,但他们仍然保持冒险精神继续探索。\n",
"Translation: Jack and Jill, deux frères et sœurs, ont eu un accident en allant chercher de l'eau dans un puits de montagne, mais ils ont continué à explorer avec un esprit d'aventure.\n",
"Names: Jack, Jill\n",
"Output JSON: {\"french_summary\": \"Jack and Jill, deux frères et sœurs, ont eu un accident en allant chercher de l'eau dans un puits de montagne, mais ils ont continué à explorer avec un esprit d'aventure.\", \"num_names\": 2}\n"
]
}
],
"source": [
"prompt_2 = f\"\"\"\n",
"Your task is to perform the following actions: \n",
"1 - Summarize the following text delimited by <> with 1 sentence.\n",
"2 - Translate the summary into French.\n",
"3 - List each name in the French summary.\n",
"4 - Output a json object that contains the \n",
"following keys: french_summary, num_names.\n",
"\n",
"Use the following format:\n",
"Text: <text to summarize>\n",
"Summary: <summary>\n",
"Translation: <summary translation>\n",
"Names: <list of names in French summary>\n",
"Output JSON: <json with summary and num_names>\n",
"\n",
"Text: <{text}>\n",
"\"\"\"\n",
"response = get_completion(prompt_2)\n",
"print(\"\\nCompletion for prompt 2:\")\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"prompt 2:\n",
"摘要:兄妹杰克和吉尔在迷人的村庄里冒险,不幸摔伤后回到家中,但仍然充满冒险精神。\n",
"翻译:In a charming village, siblings Jack and Jill set out to fetch water from a mountaintop well. While climbing and singing, Jack trips on a stone and tumbles down the mountain, with Jill following closely behind. Despite some bruises, they make it back home safely. Their adventurous spirit remains undiminished as they continue to explore with joy.\n",
"名称:Jack,Jill\n",
"输出 JSON:{\"English_summary\": \"In a charming village, siblings Jack and Jill set out to fetch water from a mountaintop well. While climbing and singing, Jack trips on a stone and tumbles down the mountain, with Jill following closely behind. Despite some bruises, they make it back home safely. Their adventurous spirit remains undiminished as they continue to explore with joy.\", \"num_names\": 2}\n"
]
}
],
"source": [
"prompt_2 = f\"\"\"\n",
"1-用一句话概括下面用<>括起来的文本。\n",
"2-将摘要翻译成英语。\n",
"3-在英语摘要中列出每个名称。\n",
"4-输出一个 JSON 对象,其中包含以下键:English_summary,num_names。\n",
"\n",
"请使用以下格式:\n",
"文本:<要总结的文本>\n",
"摘要:<摘要>\n",
"翻译:<摘要的翻译>\n",
"名称:<英语摘要中的名称列表>\n",
"输出 JSON:<带有 English_summary 和 num_names 的 JSON>\n",
"\n",
"Text: <{text}>\n",
"\"\"\"\n",
"response = get_completion(prompt_2)\n",
"print(\"\\nprompt 2:\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**策略二:指导模型在下结论之前找出一个自己的解法**\n",
"\n",
"有时候,在明确指导模型在做决策之前要思考解决方案时,我们会得到更好的结果。\n",
"\n",
"接下来我们会给出一个问题和一个学生的解答,要求模型判断解答是否正确"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The student's solution is correct.\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Determine if the student's solution is correct or not.\n",
"\n",
"Question:\n",
"I'm building a solar power installation and I need \\\n",
" help working out the financials. \n",
"- Land costs $100 / square foot\n",
"- I can buy solar panels for $250 / square foot\n",
"- I negotiated a contract for maintenance that will cost \\ \n",
"me a flat $100k per year, and an additional $10 / square \\\n",
"foot\n",
"What is the total cost for the first year of operations \n",
"as a function of the number of square feet.\n",
"\n",
"Student's Solution:\n",
"Let x be the size of the installation in square feet.\n",
"Costs:\n",
"1. Land cost: 100x\n",
"2. Solar panel cost: 250x\n",
"3. Maintenance cost: 100,000 + 100x\n",
"Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"学生的解决方案是正确的。\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"判断学生的解决方案是否正确。\n",
"\n",
"问题:\n",
"我正在建造一个太阳能发电站,需要帮助计算财务。\n",
"\n",
" 土地费用为 100美元/平方英尺\n",
" 我可以以 250美元/平方英尺的价格购买太阳能电池板\n",
" 我已经谈判好了维护合同,每年需要支付固定的10万美元,并额外支付每平方英尺10美元\n",
" 作为平方英尺数的函数,首年运营的总费用是多少。\n",
"\n",
"学生的解决方案:\n",
"设x为发电站的大小,单位为平方英尺。\n",
"费用:\n",
"\n",
" 土地费用:100x\n",
" 太阳能电池板费用:250x\n",
" 维护费用:100,000美元+100x\n",
" 总费用:100x+250x+100,000美元+100x=450x+100,000美元\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"但是注意,学生的解决方案实际上是错误的。\n",
"\n",
"我们可以通过指导模型先自行找出一个解法来解决这个问题。\n",
"\n",
"在接下来这个 Prompt 中,我们要求模型先自行解决这个问题,再根据自己的解法与学生的解法进行对比,从而判断学生的解法是否正确。同时,我们给定了输出的格式要求。通过明确步骤,让模型有更多时间思考,有时可以获得更准确的结果。在这个例子中,学生的答案是错误的,但如果我们没有先让模型自己计算,那么可能会被误导以为学生是正确的。"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Let x be the size of the installation in square feet.\n",
"\n",
"Costs:\n",
"1. Land cost: 100x\n",
"2. Solar panel cost: 250x\n",
"3. Maintenance cost: 100,000 + 10x\n",
"\n",
"Total cost: 100x + 250x + 100,000 + 10x = 360x + 100,000\n",
"\n",
"Is the student's solution the same as actual solution just calculated:\n",
"No\n",
"\n",
"Student grade:\n",
"Incorrect\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Your task is to determine if the student's solution \\\n",
"is correct or not.\n",
"To solve the problem do the following:\n",
"- First, work out your own solution to the problem. \n",
"- Then compare your solution to the student's solution \\\n",
"and evaluate if the student's solution is correct or not. \n",
"Don't decide if the student's solution is correct until \n",
"you have done the problem yourself.\n",
"\n",
"Use the following format:\n",
"Question:\n",
"```\n",
"question here\n",
"```\n",
"Student's solution:\n",
"```\n",
"student's solution here\n",
"```\n",
"Actual solution:\n",
"```\n",
"steps to work out the solution and your solution here\n",
"```\n",
"Is the student's solution the same as actual solution \\\n",
"just calculated:\n",
"```\n",
"yes or no\n",
"```\n",
"Student grade:\n",
"```\n",
"correct or incorrect\n",
"```\n",
"\n",
"Question:\n",
"```\n",
"I'm building a solar power installation and I need help \\\n",
"working out the financials. \n",
"- Land costs $100 / square foot\n",
"- I can buy solar panels for $250 / square foot\n",
"- I negotiated a contract for maintenance that will cost \\\n",
"me a flat $100k per year, and an additional $10 / square \\\n",
"foot\n",
"What is the total cost for the first year of operations \\\n",
"as a function of the number of square feet.\n",
"``` \n",
"Student's solution:\n",
"```\n",
"Let x be the size of the installation in square feet.\n",
"Costs:\n",
"1. Land cost: 100x\n",
"2. Solar panel cost: 250x\n",
"3. Maintenance cost: 100,000 + 100x\n",
"Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000\n",
"```\n",
"Actual solution:\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"正确的解决方案和步骤:\n",
" 1. 计算土地费用:100美元/平方英尺 * x平方英尺 = 100x美元\n",
" 2. 计算太阳能电池板费用:250美元/平方英尺 * x平方英尺 = 250x美元\n",
" 3. 计算维护费用:10万美元 + 10美元/平方英尺 * x平方英尺 = 10万美元 + 10x美元\n",
" 4. 计算总费用:100x美元 + 250x美元 + 10万美元 + 10x美元 = 360x + 10万美元\n",
"\n",
"学生的解决方案和实际解决方案是否相同:否\n",
"\n",
"学生的成绩:不正确\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"请判断学生的解决方案是否正确,请通过如下步骤解决这个问题:\n",
"\n",
"步骤:\n",
"\n",
" 首先,自己解决问题。\n",
" 然后将你的解决方案与学生的解决方案进行比较,并评估学生的解决方案是否正确。在自己完成问题之前,请勿决定学生的解决方案是否正确。\n",
"\n",
"使用以下格式:\n",
"\n",
" 问题:问题文本\n",
" 学生的解决方案:学生的解决方案文本\n",
" 实际解决方案和步骤:实际解决方案和步骤文本\n",
" 学生的解决方案和实际解决方案是否相同:是或否\n",
" 学生的成绩:正确或不正确\n",
"\n",
"问题:\n",
"\n",
" 我正在建造一个太阳能发电站,需要帮助计算财务。 \n",
" - 土地费用为每平方英尺100美元\n",
" - 我可以以每平方英尺250美元的价格购买太阳能电池板\n",
" - 我已经谈判好了维护合同,每年需要支付固定的10万美元,并额外支付每平方英尺10美元\n",
" 作为平方英尺数的函数,首年运营的总费用是多少。\n",
"\n",
"学生的解决方案:\n",
"\n",
" 设x为发电站的大小,单位为平方英尺。\n",
" 费用:\n",
" 1. 土地费用:100x\n",
" 2. 太阳能电池板费用:250x\n",
" 3. 维护费用:100,000+100x\n",
" 总费用:100x+250x+100,000+100x=450x+100,000\n",
"\n",
"实际解决方案和步骤:\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 三、局限性"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**虚假知识**:模型偶尔会生成一些看似真实、实则编造的知识。\n",
"\n",
"尽管模型在训练过程中接触了大量知识,但它并没有完全记住所见的信息,因此并不清楚自己知识的边界。这意味着它可能会尝试回答关于晦涩主题的问题,并编造出听起来合理但实际上并不正确的答案。我们将这些编造的内容称为“幻觉”(Hallucination)。\n",
"\n",
"例如在下面的示例中,我们要求模型介绍 Boie 公司生产的 AeroGlide UltraSlim Smart Toothbrush 这款产品。事实上,这家公司是真实存在的,但产品是编造的,而模型仍会一本正经地向我们介绍它编造出来的知识。\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The AeroGlide UltraSlim Smart Toothbrush by Boie is a high-tech toothbrush that uses advanced sonic technology to provide a deep and thorough clean. It features a slim and sleek design that makes it easy to hold and maneuver, and it comes with a range of smart features that help you optimize your brushing routine.\n",
"\n",
"One of the key features of the AeroGlide UltraSlim Smart Toothbrush is its advanced sonic technology, which uses high-frequency vibrations to break up plaque and bacteria on your teeth and gums. This technology is highly effective at removing even the toughest stains and buildup, leaving your teeth feeling clean and fresh.\n",
"\n",
"In addition to its sonic technology, the AeroGlide UltraSlim Smart Toothbrush also comes with a range of smart features that help you optimize your brushing routine. These include a built-in timer that ensures you brush for the recommended two minutes, as well as a pressure sensor that alerts you if you're brushing too hard.\n",
"\n",
"Overall, the AeroGlide UltraSlim Smart Toothbrush by Boie is a highly advanced and effective toothbrush that is perfect for anyone looking to take their oral hygiene to the next level. With its advanced sonic technology and smart features, it provides a deep and thorough clean that leaves your teeth feeling fresh and healthy.\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Tell me about AeroGlide UltraSlim Smart Toothbrush by Boie\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Boie公司生产的AeroGlide UltraSlim Smart Toothbrush是一款智能牙刷,具有以下特点:\n",
"\n",
"1. 超薄设计:刷头仅有0.8毫米的厚度,可以更容易地进入口腔深处,清洁更彻底。\n",
"\n",
"2. 智能感应:牙刷配备了智能感应技术,可以自动识别刷头的位置和方向,确保每个部位都得到充分的清洁。\n",
"\n",
"3. 高效清洁:牙刷采用了高速振动技术,每分钟可达到40000次,可以有效去除牙菌斑和污渍。\n",
"\n",
"4. 轻松携带:牙刷采用了便携式设计,可以轻松放入口袋或旅行包中,随时随地进行口腔清洁。\n",
"\n",
"5. 环保材料:牙刷采用了环保材料制造,不含有害物质,对环境友好。\n",
"\n",
"总之,Boie公司生产的AeroGlide UltraSlim Smart Toothbrush是一款高效、智能、环保的牙刷,可以帮助用户轻松保持口腔健康。\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"告诉我 Boie 公司生产的 AeroGlide UltraSlim Smart Toothbrush 的相关信息\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"模型会输出看上去非常真实的编造知识,这有时会很危险。因此,请确保使用我们在本节中介绍的一些技巧,以尝试在构建自己的应用程序时避免这种情况。这是模型已知的一个弱点,也是我们正在积极努力解决的问题。在你希望模型根据文本生成答案的情况下,另一种减少幻觉的策略是先要求模型找到文本中的任何相关引用,然后要求它使用这些引用来回答问题,这种追溯源文档的方法通常对减少幻觉非常有帮助。"
]
},
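{
"cell_type": "markdown",
"metadata": {},
"source": [
"下面是一个示意性的例子(假设:沿用本章前面定义的 `get_completion` 函数,文本与问题均为虚构):先让模型从给定文本中找出相关引用,再要求它仅基于这些引用回答问题,从而便于追溯答案来源、减少幻觉。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 减少幻觉的一种示意性做法(假设 get_completion 已在前文定义):\n",
"# 先要求模型找出原文引用,再要求它仅基于引用作答\n",
"text = \"\"\"\n",
"熊猫是中国特有的珍稀动物,主要栖息在四川的山区,以竹子为主要食物。\n",
"\"\"\"\n",
"prompt = f\"\"\"\n",
"请先从三个反引号分隔的文本中找出与问题相关的原文引用,\n",
"然后仅基于这些引用回答问题;如果文本中没有相关信息,请回答“文中未提及”。\n",
"\n",
"问题:熊猫主要吃什么?\n",
"\n",
"文本:```{text}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},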
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**说明:在本教程中,我们使用 \\ 来使文本适应屏幕大小以提高阅读体验,GPT 并不受 \\ 的影响,但在你调用其他大模型时,需额外考虑 \\ 是否会影响模型性能**"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": true
}
},
"nbformat": 4,
"nbformat_minor": 4
}
================================================
FILE: notebooks-zh/3. 迭代 Iterative.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 迭代式提示开发\n",
"\n",
"当使用 LLM 构建应用程序时,我从来没有在第一次尝试中就成功使用最终应用程序中所需的 Prompt。但这并不重要,只要您有一个好的迭代过程来不断改进您的 Prompt,那么你就能够得到一个适合任务的 Prompt。我认为在提示方面,第一次成功的几率可能会高一些,但正如上所说,第一个提示是否有效并不重要。最重要的是为您的应用程序找到有效提示的过程。\n",
"\n",
"因此,在本章中,我们将以从产品说明书中生成营销文案这一示例,展示一些框架,以提示你思考如何迭代地分析和完善你的 Prompt。\n",
"\n",
"如果您之前与我一起上过机器学习课程,可能见过我用一张图表来说明机器学习开发的流程:通常先有一个想法,然后去实现它,包括编写代码、获取数据、训练模型,得到实验结果;接着查看输出、进行错误分析,找出它在哪里起作用或不起作用,甚至调整要解决问题的具体思路或方法;然后修改实现、再做实验,如此反复迭代,最终得到有效的机器学习模型。\n",
"\n",
"编写 Prompt 以使用 LLM 开发应用程序的过程与此非常相似:您先有一个关于任务的想法,尝试编写第一个 Prompt,并满足上一章所说的两个原则,即清晰明确,并给模型足够的时间思考;然后运行它并查看结果。如果第一次效果不好,迭代的过程就是找出指令哪里不够清晰、或为什么没有给模型足够的思考时间,据此改进想法、改进提示,如此循环多次,直到找到适合您的应用程序的 Prompt。\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 环境配置\n",
"\n",
"同上一章,我们首先需要配置使用 OpenAI API 的环境"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"import os\n",
"from dotenv import load_dotenv, find_dotenv\n",
"# 导入第三方库\n",
"\n",
"_ = load_dotenv(find_dotenv())\n",
"# 读取系统中的环境变量\n",
"\n",
"openai.api_key = os.getenv('OPENAI_API_KEY')\n",
"# 设置 API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# 一个封装 OpenAI 接口的函数,参数为 Prompt,返回对应结果\n",
"def get_completion(prompt, model=\"gpt-3.5-turbo\"):\n",
" '''\n",
" prompt: 对应的提示\n",
" model: 调用的模型,默认为 gpt-3.5-turbo(ChatGPT),有内测资格的用户可以选择 gpt-4\n",
" '''\n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=0, # 模型输出的温度系数,控制输出的随机程度\n",
" )\n",
" # 调用 OpenAI 的 ChatCompletion 接口\n",
" return response.choices[0].message[\"content\"]\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 任务——从产品说明书生成一份营销产品描述"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"这里有一份椅子的产品说明书,描述它属于一个中世纪风格(mid-century)办公家具系列,涵盖了结构、尺寸、椅子选项、材料等信息,产地是意大利。假设您想使用这份说明书,帮助营销团队为在线零售网站撰写一段营销风格的产品描述。"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# 示例:产品说明书\n",
"fact_sheet_chair = \"\"\"\n",
"OVERVIEW\n",
"- Part of a beautiful family of mid-century inspired office furniture, \n",
"including filing cabinets, desks, bookcases, meeting tables, and more.\n",
"- Several options of shell color and base finishes.\n",
"- Available with plastic back and front upholstery (SWC-100) \n",
"or full upholstery (SWC-110) in 10 fabric and 6 leather options.\n",
"- Base finish options are: stainless steel, matte black, \n",
"gloss white, or chrome.\n",
"- Chair is available with or without armrests.\n",
"- Suitable for home or business settings.\n",
"- Qualified for contract use.\n",
"\n",
"CONSTRUCTION\n",
"- 5-wheel plastic coated aluminum base.\n",
"- Pneumatic chair adjust for easy raise/lower action.\n",
"\n",
"DIMENSIONS\n",
"- WIDTH 53 CM | 20.87”\n",
"- DEPTH 51 CM | 20.08”\n",
"- HEIGHT 80 CM | 31.50”\n",
"- SEAT HEIGHT 44 CM | 17.32”\n",
"- SEAT DEPTH 41 CM | 16.14”\n",
"\n",
"OPTIONS\n",
"- Soft or hard-floor caster options.\n",
"- Two choices of seat foam densities: \n",
"medium (1.8 lb/ft3) or high (2.8 lb/ft3)\n",
"- Armless or 8 position PU armrests \n",
"\n",
"MATERIALS\n",
"SHELL BASE GLIDER\n",
"- Cast Aluminum with modified nylon PA6/PA66 coating.\n",
"- Shell thickness: 10 mm.\n",
"SEAT\n",
"- HD36 foam\n",
"\n",
"COUNTRY OF ORIGIN\n",
"- Italy\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Introducing our stunning mid-century inspired office chair, the perfect addition to any home or business setting. Part of a beautiful family of office furniture, including filing cabinets, desks, bookcases, meeting tables, and more, this chair is available in several options of shell color and base finishes to suit your style. Choose from plastic back and front upholstery (SWC-100) or full upholstery (SWC-110) in 10 fabric and 6 leather options.\n",
"\n",
"The chair is constructed with a 5-wheel plastic coated aluminum base and features a pneumatic chair adjust for easy raise/lower action. It is available with or without armrests and is qualified for contract use. The base finish options are stainless steel, matte black, gloss white, or chrome.\n",
"\n",
"Measuring at a width of 53 cm, depth of 51 cm, and height of 80 cm, with a seat height of 44 cm and seat depth of 41 cm, this chair is designed for ultimate comfort. You can also choose between soft or hard-floor caster options and two choices of seat foam densities: medium (1.8 lb/ft3) or high (2.8 lb/ft3). The armrests are available in either an armless or 8 position PU option.\n",
"\n",
"The materials used in the construction of this chair are of the highest quality. The shell base glider is made of cast aluminum with modified nylon PA6/PA66 coating and has a shell thickness of 10 mm. The seat is made of HD36 foam, ensuring maximum comfort and durability.\n",
"\n",
"This chair is made in Italy and is the perfect combination of style and functionality. Upgrade your workspace with our mid-century inspired office chair today!\n"
]
}
],
"source": [
"# 提示:基于说明书生成营销描述\n",
"prompt = f\"\"\"\n",
"Your task is to help a marketing team create a \n",
"description for a retail website of a product based \n",
"on a technical fact sheet.\n",
"\n",
"Write a product description based on the information \n",
"provided in the technical specifications delimited by \n",
"triple backticks.\n",
"\n",
"Technical specifications: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# 示例:产品说明书\n",
"fact_sheet_chair = \"\"\"\n",
"概述\n",
"\n",
" 美丽的中世纪风格办公家具系列的一部分,包括文件柜、办公桌、书柜、会议桌等。\n",
" 多种外壳颜色和底座涂层可选。\n",
" 可选塑料前后靠背装饰(SWC-100)或10种面料和6种皮革的全面装饰(SWC-110)。\n",
" 底座涂层选项为:不锈钢、哑光黑色、光泽白色或铬。\n",
" 椅子可带或不带扶手。\n",
" 适用于家庭或商业场所。\n",
" 符合合同使用资格。\n",
"\n",
"结构\n",
"\n",
" 五个轮子的塑料涂层铝底座。\n",
" 气动椅子调节,方便升降。\n",
"\n",
"尺寸\n",
"\n",
" 宽度53厘米|20.87英寸\n",
" 深度51厘米|20.08英寸\n",
" 高度80厘米|31.50英寸\n",
" 座椅高度44厘米|17.32英寸\n",
" 座椅深度41厘米|16.14英寸\n",
"\n",
"选项\n",
"\n",
" 软地板或硬地板滚轮选项。\n",
" 两种座椅泡沫密度可选:中等(1.8磅/立方英尺)或高(2.8磅/立方英尺)。\n",
" 无扶手或8个位置PU扶手。\n",
"\n",
"材料\n",
"外壳底座滑动件\n",
"\n",
" 改性尼龙PA6/PA66涂层的铸铝。\n",
" 外壳厚度:10毫米。\n",
" 座椅\n",
" HD36泡沫\n",
"\n",
"原产国\n",
"\n",
" 意大利\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"产品描述:\n",
"\n",
"我们自豪地推出美丽的中世纪风格办公家具系列,其中包括文件柜、办公桌、书柜、会议桌等。我们的产品采用多种外壳颜色和底座涂层,以满足您的个性化需求。您可以选择塑料前后靠背装饰(SWC-100)或10种面料和6种皮革的全面装饰(SWC-110),以使您的办公室更加舒适和时尚。\n",
"\n",
"我们的底座涂层选项包括不锈钢、哑光黑色、光泽白色或铬,以满足您的不同需求。椅子可带或不带扶手,适用于家庭或商业场所。我们的产品符合合同使用资格,为您提供更加可靠的保障。\n",
"\n",
"我们的产品采用五个轮子的塑料涂层铝底座,气动椅子调节,方便升降。尺寸为宽度53厘米|20.87英寸,深度51厘米|20.08英寸,高度80厘米|31.50英寸,座椅高度44厘米|17.32英寸,座椅深度41厘米|16.14英寸,为您提供舒适的使用体验。\n",
"\n",
"我们的产品还提供软地板或硬地板滚轮选项,两种座椅泡沫密度可选:中等(1.8磅/立方英尺)或高(2.8磅/立方英尺),以及无扶手或8个位置PU扶手,以满足您的不同需求。\n",
"\n",
"我们的产品采用改性尼龙PA6/PA66涂层的铸铝外壳底座滑动件,外壳厚度为10毫米,座椅采用HD36泡沫,为您提供更加舒适的使用体验。我们的产品原产国为意大利,为您提供更加优质的品质保证。\n"
]
}
],
"source": [
"# 提示:基于说明书创建营销描述\n",
"prompt = f\"\"\"\n",
"你的任务是帮助营销团队基于技术说明书创建一个产品的营销描述。\n",
"\n",
"根据```标记的技术说明书中提供的信息,编写一个产品描述。\n",
"\n",
"技术说明: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 问题一:生成文本太长\n",
"\n",
"它似乎很好地完成了要求,即基于技术说明书编写产品描述,介绍了一款出色的中世纪风格办公椅。但在我看来,这段描述实在太长了。\n",
"\n",
"于是我有了一个想法:我写了一个提示并得到了结果,但对结果并不满意,因为文本太长,所以我会进一步明确我的提示,要求最多使用50个单词。\n",
"\n",
"因此,我通过要求模型限制生成文本的长度来解决这一问题"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Introducing our beautiful medieval-style office furniture collection, including filing cabinets, desks, bookcases, and conference tables. Choose from a variety of shell colors and base coatings, with optional plastic or fabric/leather decoration. The chair features a plastic-coated aluminum base with five wheels and pneumatic height adjustment. Perfect for home or commercial use. Made in Italy.\n"
]
}
],
"source": [
"# 优化后的 Prompt,要求生成描述不多于 50 词\n",
"prompt = f\"\"\"\n",
"Your task is to help a marketing team create a \n",
"description for a retail website of a product based \n",
"on a technical fact sheet.\n",
"\n",
"Write a product description based on the information \n",
"provided in the technical specifications delimited by \n",
"triple backticks.\n",
"\n",
"Use at most 50 words.\n",
"\n",
"Technical specifications: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"取出回答并按空格拆分后共54个单词,与最多50个单词的要求基本接近,较好地完成了任务。"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"54\n"
]
}
],
"source": [
"lst = response.split()\n",
"print(len(lst))"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"中世纪风格办公家具系列,包括文件柜、办公桌、书柜、会议桌等。多种颜色和涂层可选,可带或不带扶手。底座涂层选项为不锈钢、哑光黑色、光泽白色或铬。适用于家庭或商业场所,符合合同使用资格。意大利制造。\n"
]
}
],
"source": [
"# 优化后的 Prompt,要求生成描述不多于 50 词\n",
"prompt = f\"\"\"\n",
"您的任务是帮助营销团队基于技术说明书创建一个产品的零售网站描述。\n",
"\n",
"根据```标记的技术说明书中提供的信息,编写一个产品描述。\n",
"\n",
"使用最多50个词。\n",
"\n",
"技术规格:```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"97"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 由于中文需要分词,此处直接计算整体长度\n",
"len(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"LLM 在遵循非常精确的字数限制方面表现尚可,但并不十分出色:有时它会输出60或65个单词,不过仍在合理范围内。原因在于 LLM 使用分词器(tokenizer)来处理文本,按 token 而非字符来计数,因此在精确统计字符或单词数量方面表现一般。要控制输出的长度,还可以尝试指定句子数、字符数等多种不同的方法。"
]
},
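{
"cell_type": "markdown",
"metadata": {},
"source": [
"如果需要更精确地控制长度,可以直接按 token 计数。下面是一个简单示意(假设已安装 OpenAI 的 `tiktoken` 库,编码名称以其官方文档为准):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 按模型对应的编码统计 token 数(示意,需先 pip install tiktoken)\n",
"import tiktoken\n",
"\n",
"enc = tiktoken.encoding_for_model(\"gpt-3.5-turbo\")\n",
"text = \"这款办公椅采用铸铝底座和HD36泡沫座椅,适用于家庭或商业场所。\"\n",
"num_tokens = len(enc.encode(text))\n",
"print(num_tokens)  # token 数通常不等于字符数 len(text)"
]
},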
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 问题二:文本关注在错误的细节上\n",
"\n",
"我们会发现的第二个问题是,这个网站并不是直接向消费者销售,它实际上旨在向家具零售商销售家具,他们会更关心椅子的技术细节和材料。在这种情况下,你可以修改这个提示,让它更精确地描述椅子的技术细节。\n",
"\n",
"解决方法:要求它专注于与目标受众相关的方面。"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Introducing our beautiful medieval-style office furniture collection, including file cabinets, desks, bookcases, and conference tables. Available in multiple shell colors and base coatings, with optional plastic or fabric/leather upholstery. Features a plastic-coated aluminum base with five wheels and pneumatic chair adjustment. Suitable for home or commercial use and made with high-quality materials, including cast aluminum with a modified nylon coating and HD36 foam. Made in Italy.\n"
]
}
],
"source": [
"# 优化后的 Prompt,说明面向对象,应具有什么性质且侧重于什么方面\n",
"prompt = f\"\"\"\n",
"Your task is to help a marketing team create a \n",
"description for a retail website of a product based \n",
"on a technical fact sheet.\n",
"\n",
"Write a product description based on the information \n",
"provided in the technical specifications delimited by \n",
"triple backticks.\n",
"\n",
"The description is intended for furniture retailers, \n",
"so should be technical in nature and focus on the \n",
"materials the product is constructed from.\n",
"\n",
"Use at most 50 words.\n",
"\n",
"Technical specifications: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"这款中世纪风格办公家具系列包括文件柜、办公桌、书柜和会议桌等,适用于家庭或商业场所。可选多种外壳颜色和底座涂层,底座涂层选项为不锈钢、哑光黑色、光泽白色或铬。椅子可带或不带扶手,可选软地板或硬地板滚轮,两种座椅泡沫密度可选。外壳底座滑动件采用改性尼龙PA6/PA66涂层的铸铝,座椅采用HD36泡沫。原产国为意大利。\n"
]
}
],
"source": [
"# 优化后的 Prompt,说明面向对象,应具有什么性质且侧重于什么方面\n",
"prompt = f\"\"\"\n",
"您的任务是帮助营销团队基于技术说明书创建一个产品的零售网站描述。\n",
"\n",
"根据```标记的技术说明书中提供的信息,编写一个产品描述。\n",
"\n",
"该描述面向家具零售商,因此应具有技术性质,并侧重于产品的材料构造。\n",
"\n",
"使用最多50个单词。\n",
"\n",
"技术规格: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"我们可能还希望在描述的结尾包含产品ID。因此,可以进一步改进这个提示,要求在描述末尾包含技术说明中每个7个字符的产品ID。"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Introducing our beautiful medieval-style office furniture collection, featuring file cabinets, desks, bookshelves, and conference tables. Available in multiple shell colors and base coatings, with optional plastic or fabric/leather decorations. The chair comes with or without armrests and has a plastic-coated aluminum base with five wheels and pneumatic height adjustment. Suitable for home or commercial use. Made in Italy.\n",
"\n",
"Product IDs: SWC-100, SWC-110\n"
]
}
],
"source": [
"# 更进一步,要求在描述末尾包含 7个字符的产品ID\n",
"prompt = f\"\"\"\n",
"Your task is to help a marketing team create a \n",
"description for a retail website of a product based \n",
"on a technical fact sheet.\n",
"\n",
"Write a product description based on the information \n",
"provided in the technical specifications delimited by \n",
"triple backticks.\n",
"\n",
"The description is intended for furniture retailers, \n",
"so should be technical in nature and focus on the \n",
"materials the product is constructed from.\n",
"\n",
"At the end of the description, include every 7-character \n",
"Product ID in the technical specification.\n",
"\n",
"Use at most 50 words.\n",
"\n",
"Technical specifications: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"这款中世纪风格的办公家具系列包括文件柜、办公桌、书柜和会议桌等,适用于家庭或商业场所。可选多种外壳颜色和底座涂层,底座涂层选项为不锈钢、哑光黑色、光泽白色或铬。椅子可带或不带扶手,可选塑料前后靠背装饰或10种面料和6种皮革的全面装饰。座椅采用HD36泡沫,可选中等或高密度,座椅高度44厘米,深度41厘米。外壳底座滑动件采用改性尼龙PA6/PA66涂层的铸铝,外壳厚度为10毫米。原产国为意大利。产品ID:SWC-100/SWC-110。\n"
]
}
],
"source": [
"# 更进一步\n",
"prompt = f\"\"\"\n",
"您的任务是帮助营销团队基于技术说明书创建一个产品的零售网站描述。\n",
"\n",
"根据```标记的技术说明书中提供的信息,编写一个产品描述。\n",
"\n",
"该描述面向家具零售商,因此应具有技术性质,并侧重于产品的材料构造。\n",
"\n",
"在描述末尾,包括技术规格中每个7个字符的产品ID。\n",
"\n",
"使用最多50个单词。\n",
"\n",
"技术规格: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 问题三:需要一个表格形式的描述\n",
"\n",
"以上是许多开发人员通常会经历的迭代提示开发的简短示例。我的建议是,像上一章中所演示的那样,Prompt 应该保持清晰和明确,并在必要时给模型一些思考时间。在这些要求的基础上,通常值得首先尝试编写 Prompt ,看看会发生什么,然后从那里开始迭代地完善 Prompt,以逐渐接近所需的结果。因此,许多成功的Prompt都是通过这种迭代过程得出的。我将向您展示一个更复杂的提示示例,可能会让您对ChatGPT的能力有更深入的了解。\n",
"\n",
"这里我添加了一些额外的说明,要求它抽取信息并组织成表格,并指定表格的列、表名和格式,还要求它将所有内容格式化为可以在网页使用的 HTML。"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<div>\n",
" <p>Introducing our beautiful collection of medieval-style office furniture, including file cabinets, desks, bookcases, and conference tables. Choose from a variety of shell colors and base coatings. You can opt for plastic front and backrest decoration (SWC-100) or full decoration with 10 fabrics and 6 leathers (SWC-110). Base coating options include stainless steel, matte black, glossy white, or chrome. The chair is available with or without armrests and is suitable for both home and commercial settings. It is contract eligible.</p>\n",
" <p>The structure features a plastic-coated aluminum base with five wheels. The chair is pneumatically adjustable for easy height adjustment.</p>\n",
" <p>Product IDs: SWC-100, SWC-110</p>\n",
" <table>\n",
" <caption>Product Dimensions</caption>\n",
" <tr>\n",
" <td>Width</td>\n",
" <td>20.87 inches</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Depth</td>\n",
" <td>20.08 inches</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Height</td>\n",
" <td>31.50 inches</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Seat Height</td>\n",
" <td>17.32 inches</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Seat Depth</td>\n",
" <td>16.14 inches</td>\n",
" </tr>\n",
" </table>\n",
" <p>Options include soft or hard floor casters. You can choose from two seat foam densities: medium (1.8 pounds/cubic foot) or high (2.8 pounds/cubic foot). The chair is available with or without 8-position PU armrests.</p>\n",
" <p>Materials:</p>\n",
" <ul>\n",
" <li>Shell, base, and sliding parts: cast aluminum coated with modified nylon PA6/PA66. Shell thickness: 10mm.</li>\n",
" <li>Seat: HD36 foam</li>\n",
" </ul>\n",
" <p>Made in Italy.</p>\n",
"</div>\n"
]
}
],
"source": [
"# 要求它抽取信息并组织成表格,并指定表格的列、表名和格式\n",
"prompt = f\"\"\"\n",
"Your task is to help a marketing team create a \n",
"description for a retail website of a product based \n",
"on a technical fact sheet.\n",
"\n",
"Write a product description based on the information \n",
"provided in the technical specifications delimited by \n",
"triple backticks.\n",
"\n",
"The description is intended for furniture retailers, \n",
"so should be technical in nature and focus on the \n",
"materials the product is constructed from.\n",
"\n",
"At the end of the description, include every 7-character \n",
"Product ID in the technical specification.\n",
"\n",
"After the description, include a table that gives the \n",
"product's dimensions. The table should have two columns.\n",
"In the first column include the name of the dimension. \n",
"In the second column include the measurements in inches only.\n",
"\n",
"Give the table the title 'Product Dimensions'.\n",
"\n",
"Format everything as HTML that can be used in a website. \n",
"Place the description in a <div> element.\n",
"\n",
"Technical specifications: ```{fact_sheet_chair}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
" <p>Introducing our beautiful collection of medieval-style office furniture, including file cabinets, desks, bookcases, and conference tables. Choose from a variety of shell colors and base coatings. You can opt for plastic front and backrest decoration (SWC-100) or full decoration with 10 fabrics and 6 leathers (SWC-110). Base coating options include stainless steel, matte black, glossy white, or chrome. The chair is available with or without armrests and is suitable for both home and commercial settings. It is contract eligible.</p>\n",
" <p>The structure features a plastic-coated aluminum base with five wheels. The chair is pneumatically adjustable for easy height adjustment.</p>\n",
" <p>Product IDs: SWC-100, SWC-110</p>\n",
" <table>\n",
" <caption>Product Dimensions</caption>\n",
" <tr>\n",
" <td>Width</td>\n",
" <td>20.87 inches</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Depth</td>\n",
" <td>20.08 inches</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Height</td>\n",
" <td>31.50 inches</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Seat Height</td>\n",
" <td>17.32 inches</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Seat Depth</td>\n",
" <td>16.14 inches</td>\n",
" </tr>\n",
" </table>\n",
" <p>Options include soft or hard floor casters. You can choose from two seat foam densities: medium (1.8 pounds/cubic foot) or high (2.8 pounds/cubic foot). The chair is available with or without 8-position PU armrests.</p>\n",
" <p>Materials:</p>\n",
" <ul>\n",
" <li>Shell, base, and sliding parts: cast aluminum coated with modified nylon PA6/PA66. Shell thickness: 10mm.</li>\n",
" <li>Seat: HD36 foam</li>\n",
" </ul>\n",
" <p>Made in Italy.</p>\n",
"</div>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# 表格是以 HTML 格式呈现的,加载出来\n",
"from IPython.display import display, HTML\n",
"\n",
"display(HTML(response))"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<div>\n",
"<h2>中世纪风格办公家具系列椅子</h2>\n",
"<p>这款椅子是中世纪风格办公家具系列的一部分,适用于家庭或商业场所。它有多种外壳颜色和底座涂层可选,包括不锈钢、哑光黑色、光泽白色或铬。您可以选择带或不带扶手的椅子,以及软地板或硬地板滚轮选项。此外,您可以选择两种座椅泡沫密度:中等(1.8磅/立方英尺)或高(2.8磅/立方英尺)。</p>\n",
"<p>椅子的外壳底座滑动件是改性尼龙PA6/PA66涂层的铸铝,外壳厚度为10毫米。座椅采用HD36泡沫,底座是五个轮子的塑料涂层铝底座,可以进行气动椅子调节,方便升降。此外,椅子符合合同使用资格,是您理想的选择。</p>\n",
"<p>产品ID:SWC-100</p>\n",
"</div>\n",
"\n",
"<table>\n",
" <caption>产品尺寸</caption>\n",
" <tr>\n",
" <th>宽度</th>\n",
" <td>20.87英寸</td>\n",
" </tr>\n",
" <tr>\n",
" <th>深度</th>\n",
" <td>20.08英寸</td>\n",
" </tr>\n",
" <tr>\n",
" <th>高度</th>\n",
" <td>31.50英寸</td>\n",
" </tr>\n",
" <tr>\n",
" <th>座椅高度</th>\n",
" <td>17.32英寸</td>\n",
" </tr>\n",
" <tr>\n",
" <th>座椅深度</th>\n",
" <td>16.14英寸</td>\n",
" </tr>\n",
"</table>\n"
]
}
],
"source": [
"# 要求它抽取信息并组织成表格,并指定表格的列、表名和格式\n",
"prompt = f\"\"\"\n",
"您的任务是帮助营销团队基于技术说明书创建一个产品的零售网站描述。\n",
"\n",
"根据```标记的技术说明书中提供的信息,编写一个产品描述。\n",
"\n",
"该描述面向家具零售商,因此应具有技术性质,并侧重于产品的材料构造。\n",
"\n",
"在描述末尾,包括技术规格中每个7个字符的产品ID。\n",
"\n",
"在描述之后,包括一个表格,提供产品的尺寸。表格应该有两列。第一列包括尺寸的名称。第二列只包括英寸的测量值。\n",
"\n",
"给表格命名为“产品尺寸”。\n",
"\n",
"将所有内容格式化为可用于网站的HTML格式。将描述放在<div>元素中。\n",
"\n",
"技术规格:```{fact_sheet_chair}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<h2>中世纪风格办公家具系列椅子</h2>\n",
"<p>这款椅子是中世纪风格办公家具系列的一部分,适用于家庭或商业场所。它有多种外壳颜色和底座涂层可选,包括不锈钢、哑光黑色、光泽白色或铬。您可以选择带或不带扶手的椅子,以及软地板或硬地板滚轮选项。此外,您可以选择两种座椅泡沫密度:中等(1.8磅/立方英尺)或高(2.8磅/立方英尺)。</p>\n",
"<p>椅子的外壳底座滑动件是改性尼龙PA6/PA66涂层的铸铝,外壳厚度为10毫米。座椅采用HD36泡沫,底座是五个轮子的塑料涂层铝底座,可以进行气动椅子调节,方便升降。此外,椅子符合合同使用资格,是您理想的选择。</p>\n",
"<p>产品ID:SWC-100</p>\n",
"</div>\n",
"\n",
"<table>\n",
" <caption>产品尺寸</caption>\n",
" <tr>\n",
" <th>宽度</th>\n",
" <td>20.87英寸</td>\n",
" </tr>\n",
" <tr>\n",
" <th>深度</th>\n",
" <td>20.08英寸</td>\n",
" </tr>\n",
" <tr>\n",
" <th>高度</th>\n",
" <td>31.50英寸</td>\n",
" </tr>\n",
" <tr>\n",
" <th>座椅高度</th>\n",
" <td>17.32英寸</td>\n",
" </tr>\n",
" <tr>\n",
" <th>座椅深度</th>\n",
" <td>16.14英寸</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# 表格是以 HTML 格式呈现的,加载出来\n",
"from IPython.display import display, HTML\n",
"\n",
"display(HTML(response))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"本章的核心内容是 LLM 应用开发中的迭代式提示开发流程:开发者先尝试编写第一版提示,然后通过迭代逐步完善,直至得到所需的结果。关键在于掌握一套有效开发 Prompt 的流程,而不是一开始就写出完美的 Prompt。对于更复杂的应用,可以在多个样本上迭代开发提示并进行评估;在更成熟的应用中,还可以测试多个 Prompt 在多个样本上的平均或最差性能。使用本 Jupyter 笔记本中的示例时,请尝试不同的变化并观察结果。"
]
},
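{
"cell_type": "markdown",
"metadata": {},
"source": [
"作为流程上的示意(假设:沿用前文定义的 `get_completion` 与 `fact_sheet_chair`,样本集合与评估指标需按具体任务自行定义),可以把多个候选 Prompt 批量应用到多个样本上,再比较它们的表现:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 在多个样本上比较多个候选 Prompt(示意,假设 get_completion 已定义)\n",
"samples = [fact_sheet_chair]  # 实际应用中应准备多份说明书样本\n",
"prompt_templates = [\n",
"    \"基于```标记的技术说明书编写产品描述,最多50个单词:```{sample}```\",\n",
"    \"面向家具零售商、侧重材料构造编写产品描述,最多50个单词:```{sample}```\",\n",
"]\n",
"for template in prompt_templates:\n",
"    for sample in samples:\n",
"        response = get_completion(template.format(sample=sample))\n",
"        # 此处可记录长度、关键词覆盖率等指标,再比较各 Prompt 的平均或最差表现\n",
"        print(len(response), response[:50])"
]
},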
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": true
}
},
"nbformat": 4,
"nbformat_minor": 4
}
================================================
FILE: notebooks-zh/4. 摘要 Summarizing.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"id": "b58204ea",
"metadata": {},
"source": [
"# 文本概括 Summarizing"
]
},
{
"cell_type": "markdown",
"id": "b70ad003",
"metadata": {},
"source": [
"## 1 引言"
]
},
{
"cell_type": "markdown",
"id": "12fa9ea4",
"metadata": {},
"source": [
"当今世界上有太多的文本信息,几乎没有人能够拥有足够的时间去阅读所有我们想了解的东西。但令人感到欣喜的是,目前LLM在文本概括任务上展现了强大的水准,也已经有不少团队将这项功能插入了自己的软件应用中。\n",
"\n",
"本章节将介绍如何使用编程的方式,调用API接口来实现“文本概括”功能。"
]
},
{
"cell_type": "markdown",
"id": "1de4fd1e",
"metadata": {},
"source": [
"首先,我们需要导入 openai 包,加载 API 密钥,并定义 get_completion 函数。"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "9f679f1f",
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"import os\n",
"OPENAI_API_KEY = os.environ.get(\"OPENAI_API_KEY\")\n",
"openai.api_key = OPENAI_API_KEY\n",
"\n",
"def get_completion(prompt, model=\"gpt-3.5-turbo\"): \n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=0, # lower values make the output less random\n",
" )\n",
" return response.choices[0].message[\"content\"]"
]
},
{
"cell_type": "markdown",
"id": "9cca835b",
"metadata": {},
"source": [
"## 2 Summarizing a Single Text"
]
},
{
"cell_type": "markdown",
"id": "0c1e1b92",
"metadata": {},
"source": [
"Here we use a product review as an example. E-commerce platforms host enormous numbers of reviews reflecting what customers think. A tool that condenses these long, voluminous reviews lets us scan far more of them, understand customer preferences, and help the platform and merchants improve their service."
]
},
{
"cell_type": "markdown",
"id": "9dc2e2bc",
"metadata": {},
"source": [
"**Input text**"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "4d9c0eeb",
"metadata": {},
"outputs": [],
"source": [
"prod_review = \"\"\"\n",
"Got this panda plush toy for my daughter's birthday, \\\n",
"who loves it and takes it everywhere. It's soft and \\\n",
"super cute, and its face has a friendly look. It's \\\n",
"a bit small for what I paid though. I think there \\\n",
"might be other options that are bigger for the \\\n",
"same price. It arrived a day earlier than expected, \\\n",
"so I got to play with it myself before I gave it \\ \n",
"to her.\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"id": "aad5bd2a",
"metadata": {},
"source": [
"**Input text (Chinese translation)**"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "43b5dd25",
"metadata": {},
"outputs": [],
"source": [
"prod_review_zh = \"\"\"\n",
"这个熊猫公仔是我给女儿的生日礼物,她很喜欢,去哪都带着。\n",
"公仔很软,超级可爱,面部表情也很和善。但是相比于价钱来说,\n",
"它有点小,我感觉在别的地方用同样的价钱能买到更大的。\n",
"快递比预期提前了一天到货,所以在送给女儿之前,我自己玩了会。\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"id": "662c9cd2",
"metadata": {},
"source": [
"### 2.1 Limiting the Output Length"
]
},
{
"cell_type": "markdown",
"id": "a6d10814",
"metadata": {},
"source": [
"We try limiting the summary to at most 30 words."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "02208fbc",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Soft and cute panda plush toy loved by daughter, but a bit small for the price. Arrived early.\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Your task is to generate a short summary of a product \\\n",
"review from an ecommerce site. \n",
"\n",
"Summarize the review below, delimited by triple \n",
"backticks, in at most 30 words. \n",
"\n",
"Review: ```{prod_review}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "0df0eb90",
"metadata": {},
"source": [
"Chinese version"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "bf4b39f9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"可爱软熊猫公仔,女儿喜欢,面部表情和善,但价钱有点小贵,快递提前一天到货。\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"你的任务是从电子商务网站上生成一个产品评论的简短摘要。\n",
"\n",
"请对三个反引号之间的评论文本进行概括,最多30个词汇。\n",
"\n",
"评论: ```{prod_review_zh}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "e9ab145e",
"metadata": {},
"source": [
"### 2.2 Focusing on a Specific Aspect"
]
},
{
"cell_type": "markdown",
"id": "f84d0123",
"metadata": {},
"source": [
"Different audiences care about different aspects of the same text. For product reviews, the logistics team cares most about delivery speed, merchants care about price and product quality, and the platform cares about the overall service experience.\n",
"\n",
"We can add a hint to the prompt to emphasize a particular aspect."
]
},
{
"cell_type": "markdown",
"id": "d6f8509a",
"metadata": {},
"source": [
"**Focusing on shipping**"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "9d8a32a6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The panda plush toy arrived a day earlier than expected, but the customer felt it was a bit small for the price paid.\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Your task is to generate a short summary of a product \\\n",
"review from an ecommerce site to give feedback to the \\\n",
"Shipping department. \n",
"\n",
"Summarize the review below, delimited by triple \n",
"backticks, in at most 30 words, and focusing on any aspects \\\n",
"that mention shipping and delivery of the product. \n",
"\n",
"Review: ```{prod_review}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "0bd4243a",
"metadata": {},
"source": [
"Chinese version"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "80636c3e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"快递提前到货,熊猫公仔软可爱,但有点小,价钱不太划算。\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"你的任务是从电子商务网站上生成一个产品评论的简短摘要。\n",
"\n",
"请对三个反引号之间的评论文本进行概括,最多30个词汇,并且聚焦在产品运输上。\n",
"\n",
"评论: ```{prod_review_zh}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "76c97fea",
"metadata": {},
"source": [
"Note that the output opens with the early-delivery information, reflecting the emphasis on shipping speed."
]
},
{
"cell_type": "markdown",
"id": "83275907",
"metadata": {},
"source": [
"**Focusing on price and quality**"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "767f252c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The panda plush toy is soft, cute, and loved by the recipient, but the price may be too high for its size compared to other options.\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Your task is to generate a short summary of a product \\\n",
"review from an ecommerce site to give feedback to the \\\n",
"pricing department, responsible for determining the \\\n",
"price of the product. \n",
"\n",
"Summarize the review below, delimited by triple \n",
"backticks, in at most 30 words, and focusing on any aspects \\\n",
"that are relevant to the price and perceived value. \n",
"\n",
"Review: ```{prod_review}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "cf54fac4",
"metadata": {},
"source": [
"Chinese version"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "728d6c57",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"可爱软熊猫公仔,面部表情友好,但价钱有点高,尺寸较小。快递提前一天到货。\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"你的任务是从电子商务网站上生成一个产品评论的简短摘要。\n",
"\n",
"请对三个反引号之间的评论文本进行概括,最多30个词汇,并且聚焦在产品价格和质量上。\n",
"\n",
"评论: ```{prod_review_zh}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "972dbb1b",
"metadata": {},
"source": [
"Note that the output opens with quality, price, and size, reflecting the emphasis on product price and quality."
]
},
{
"cell_type": "markdown",
"id": "b3ed53d2",
"metadata": {},
"source": [
"### 2.3 Extracting Key Information"
]
},
{
"cell_type": "markdown",
"id": "ba6f5c25",
"metadata": {},
"source": [
"In Section 2.2, adding an aspect-focused hint made the summary emphasize one angle, but the result still retained other information; for example, the price-and-quality summary still mentioned the early delivery. Sometimes that extra information is helpful, but if we want only the information for one angle and everything else filtered out, we can ask the LLM to \"extract\" rather than \"summarize\"."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "2d60dc58",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\"The product arrived a day earlier than expected.\"\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Your task is to extract relevant information from \\\n",
"a product review from an ecommerce site to give \\\n",
"feedback to the Shipping department. \n",
"\n",
"From the review below, delimited by triple backticks, \\\n",
"extract the information relevant to shipping and \\\n",
"delivery. Limit to 30 words. \n",
"\n",
"Review: ```{prod_review}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "0339b877",
"metadata": {},
"source": [
"Chinese version"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "c845ccab",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"快递比预期提前了一天到货。\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"你的任务是从电子商务网站上的产品评论中提取相关信息。\n",
"\n",
"请从以下三个反引号之间的评论文本中提取产品运输相关的信息,最多30个词汇。\n",
"\n",
"评论: ```{prod_review_zh}```\n",
"\"\"\"\n",
"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "50498a2b",
"metadata": {},
"source": [
"## 3 Summarizing Multiple Texts"
]
},
{
"cell_type": "markdown",
"id": "a291541a",
"metadata": {},
"source": [
"In a real workflow we often have a great many review texts. The example below loops over a list of reviews, summarizing and printing each in turn. Of course, with millions of reviews, a plain for loop is impractical; batching the reviews or distributing the calls may be needed for efficiency."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "ee7caa78",
"metadata": {},
"outputs": [],
"source": [
"review_1 = prod_review \n",
"\n",
"# review for a standing lamp\n",
"review_2 = \"\"\"\n",
"Needed a nice lamp for my bedroom, and this one \\\n",
"had additional storage and not too high of a price \\\n",
"point. Got it fast - arrived in 2 days. The string \\\n",
"to the lamp broke during the transit and the company \\\n",
"happily sent over a new one. Came within a few days \\\n",
"as well. It was easy to put together. Then I had a \\\n",
"missing part, so I contacted their support and they \\\n",
"very quickly got me the missing piece! Seems to me \\\n",
"to be a great company that cares about their customers \\\n",
"and products. \n",
"\"\"\"\n",
"\n",
"# review for an electric toothbrush\n",
"review_3 = \"\"\"\n",
"My dental hygienist recommended an electric toothbrush, \\\n",
"which is why I got this. The battery life seems to be \\\n",
"pretty impressive so far. After initial charging and \\\n",
"leaving the charger plugged in for the first week to \\\n",
"condition the battery, I've unplugged the charger and \\\n",
"been using it for twice daily brushing for the last \\\n",
"3 weeks all on the same charge. But the toothbrush head \\\n",
"is too small. I’ve seen baby toothbrushes bigger than \\\n",
"this one. I wish the head was bigger with different \\\n",
"length bristles to get between teeth better because \\\n",
"this one doesn’t. Overall if you can get this one \\\n",
"around the $50 mark, it's a good deal. The manufactuer's \\\n",
"replacements heads are pretty expensive, but you can \\\n",
"get generic ones that're more reasonably priced. This \\\n",
"toothbrush makes me feel like I've been to the dentist \\\n",
"every day. My teeth feel sparkly clean! \n",
"\"\"\"\n",
"\n",
"# review for a blender\n",
"review_4 = \"\"\"\n",
"So, they still had the 17 piece system on seasonal \\\n",
"sale for around $49 in the month of November, about \\\n",
"half off, but for some reason (call it price gouging) \\\n",
"around the second week of December the prices all went \\\n",
"up to about anywhere from between $70-$89 for the same \\\n",
"system. And the 11 piece system went up around $10 or \\\n",
"so in price also from the earlier sale price of $29. \\\n",
"So it looks okay, but if you look at the base, the part \\\n",
"where the blade locks into place doesn’t look as good \\\n",
"as in previous editions from a few years ago, but I \\\n",
"plan to be very gentle with it (example, I crush \\\n",
"very hard items like beans, ice, rice, etc. in the \\\n",
"blender first then pulverize them in the serving size \\\n",
"I want in the blender then switch to the whipping \\\n",
"blade for a finer flour, and use the cross cutting blade \\\n",
"first when making smoothies, then use the flat blade \\\n",
"if I need them finer/less pulpy). Special tip when making \\\n",
"smoothies, finely cut and freeze the fruits and \\\n",
"vegetables (if using spinach-lightly stew soften the \\\n",
"spinach then freeze until ready for use-and if making \\\n",
"sorbet, use a small to medium sized food processor) \\\n",
"that you plan to use that way you can avoid adding so \\\n",
"much ice if at all-when making your smoothie. \\\n",
"After about a year, the motor was making a funny noise. \\\n",
"I called customer service but the warranty expired \\\n",
"already, so I had to buy another one. FYI: The overall \\\n",
"quality has gone done in these types of products, so \\\n",
"they are kind of counting on brand recognition and \\\n",
"consumer loyalty to maintain sales. Got it in about \\\n",
"two days.\n",
"\"\"\"\n",
"\n",
"reviews = [review_1, review_2, review_3, review_4]"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "9d1aa5ac",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 Soft and cute panda plush toy loved by daughter, but a bit small for the price. Arrived early. \n",
"\n",
"1 Affordable lamp with storage, fast shipping, and excellent customer service. Easy to assemble and missing parts were quickly replaced. \n",
"\n",
"2 Good battery life, small toothbrush head, but effective cleaning. Good deal if bought around $50. \n",
"\n",
"3 The product was on sale for $49 in November, but the price increased to $70-$89 in December. The base doesn't look as good as previous editions, but the reviewer plans to be gentle with it. A special tip for making smoothies is to freeze the fruits and vegetables beforehand. The motor made a funny noise after a year, and the warranty had expired. Overall quality has decreased. \n",
"\n"
]
}
],
"source": [
"for i in range(len(reviews)):\n",
" prompt = f\"\"\"\n",
" Your task is to generate a short summary of a product \\\n",
" review from an ecommerce site. \n",
"\n",
" Summarize the review below, delimited by triple \\\n",
" backticks in at most 20 words. \n",
"\n",
" Review: ```{reviews[i]}```\n",
" \"\"\"\n",
" response = get_completion(prompt)\n",
" print(i, response, \"\\n\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "eb878522",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": true
}
},
"nbformat": 4,
"nbformat_minor": 5
}
================================================
FILE: notebooks-zh/5. 推断 Inferring.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"id": "3630c235-f891-4874-bd0a-5277d4d6aa82",
"metadata": {},
"source": [
"# Inferring\n",
"\n",
"In this lesson, you will infer sentiment and topics from product reviews and news articles.\n",
"\n",
"These tasks can be viewed as the model taking text as input and performing some kind of analysis: extracting labels, extracting entities, understanding the sentiment of the text, and so on. To extract positive or negative sentiment from text with a traditional machine-learning workflow, you would need to collect a labeled dataset, train a model, and figure out how to deploy it in the cloud for inference. That can work reasonably well, but it is a lot of effort, and every task, such as sentiment analysis or entity extraction, requires training and deploying its own model.\n",
"\n",
"A very nice property of large language models is that for many of these tasks, you only need to write a prompt to start getting results, with no heavy lifting. This dramatically speeds up application development. You can also use a single model and a single API for many different tasks, instead of figuring out how to train and deploy many separate models.\n",
"\n",
"\n",
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "a821d943",
"metadata": {
"height": 132
},
"outputs": [],
"source": [
"import openai\n",
"import os\n",
"\n",
"OPENAI_API_KEY = os.environ.get(\"OPENAI_API_KEY2\")\n",
"openai.api_key = OPENAI_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "e82f5577",
"metadata": {
"height": 164
},
"outputs": [],
"source": [
"def get_completion(prompt, model=\"gpt-3.5-turbo\"):\n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=0, # this is the degree of randomness of the model's output\n",
" )\n",
" return response.choices[0].message[\"content\"]"
]
},
{
"cell_type": "markdown",
"id": "51d2fdfa-c99f-4750-8574-dba7712cd7f0",
"metadata": {},
"source": [
"## Product Review Text\n",
"\n",
"Here is a review of a lamp."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "b0f3b49b",
"metadata": {
"height": 200
},
"outputs": [],
"source": [
"lamp_review = \"\"\"\n",
"Needed a nice lamp for my bedroom, and this one had \\\n",
"additional storage and not too high of a price point. \\\n",
"Got it fast. The string to our lamp broke during the \\\n",
"transit and the company happily sent over a new one. \\\n",
"Came within a few days as well. It was easy to put \\\n",
"together. I had a missing part, so I contacted their \\\n",
"support and they very quickly got me the missing piece! \\\n",
"Lumina seems to me to be a great company that cares \\\n",
"about their customers and products!!\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "bc6260f0",
"metadata": {},
"outputs": [],
"source": [
"# Chinese version\n",
"lamp_review_zh = \"\"\"\n",
"我需要一盏漂亮的卧室灯,这款灯具有额外的储物功能,价格也不算太高。\\\n",
"我很快就收到了它。在运输过程中,我们的灯绳断了,但是公司很乐意寄送了一个新的。\\\n",
"几天后就收到了。这款灯很容易组装。我发现少了一个零件,于是联系了他们的客服,他们很快就给我寄来了缺失的零件!\\\n",
"在我看来,Lumina 是一家非常关心顾客和产品的优秀公司!\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"id": "30d6e4bd-3337-45a3-8c99-a734cdd06743",
"metadata": {},
"source": [
"## Sentiment (positive/negative)\n",
"\n",
"Now let's write a prompt to classify the sentiment of this review. To have the system tell us the sentiment, we just write \"What is the sentiment of the following product review,\" together with the usual delimiters and the review text.\n",
"\n",
"Running this, the model says the sentiment of the review is positive, which seems right. The lamp isn't perfect, but the customer sounds quite satisfied with what appears to be a company that cares about its customers and products, so positive looks like the correct answer."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "e3157601",
"metadata": {
"height": 149
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The sentiment of the product review is positive.\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"What is the sentiment of the following product review, \n",
"which is delimited with triple backticks?\n",
"\n",
"Review text: ```{lamp_review}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "ac5b0bb9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"情感是积极的/正面的。\n"
]
}
],
"source": [
"# Chinese version\n",
"prompt = f\"\"\"\n",
"以下用三个反引号分隔的产品评论的情感是什么?\n",
"\n",
"评论文本: ```{lamp_review_zh}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "76be2320",
"metadata": {},
"source": [
"If you want a more concise answer that is easier to post-process, you can take the prompt above and add an instruction to answer with a single word, \"positive\" or \"negative\". The model then prints just the word \"positive\", which makes the output much easier to consume downstream."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "acf9ca16",
"metadata": {
"height": 200
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"positive\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"What is the sentiment of the following product review, \n",
"which is delimited with triple backticks?\n",
"\n",
"Give your answer as a single word, either \"positive\" \\\n",
"or \"negative\".\n",
"\n",
"Review text: ```{lamp_review}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "84a761b3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"正面\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"以下用三个反引号分隔的产品评论的情感是什么?\n",
"\n",
"用一个单词回答:「正面」或「负面」。\n",
"\n",
"评论文本: ```{lamp_review_zh}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "81d2a973-1fa4-4a35-ae35-a2e746c0e91b",
"metadata": {},
"source": [
"## Identifying Types of Emotion\n",
"\n",
"Let's look at another prompt, still using the lamp review. This time we ask it to identify a list of emotions the review's author is expressing, with no more than five items."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "8aa7934b",
"metadata": {
"height": 183
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"satisfied, grateful, impressed, content, pleased\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Identify a list of emotions that the writer of the \\\n",
"following review is expressing. Include no more than \\\n",
"five items in the list. Format your answer as a list of \\\n",
"lower-case words separated by commas.\n",
"\n",
"Review text: ```{lamp_review}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "e615c13a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"满意,感激,信任,赞扬,愉快\n"
]
}
],
"source": [
"# Chinese version\n",
"prompt = f\"\"\"\n",
"识别以下评论的作者表达的情感。包含不超过五个项目。将答案格式化为以逗号分隔的单词列表。\n",
"\n",
"评论文本: ```{lamp_review_zh}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "cc4444f7",
"metadata": {},
"source": [
"Large language models are very good at extracting specific things from a piece of text. In the example above, the emotions the review expresses could help you understand how customers feel about a particular product."
]
},
{
"cell_type": "markdown",
"id": "a428d093-51c9-461c-b41e-114e80876409",
"metadata": {},
"source": [
"## Identifying Anger\n",
"\n",
"For many businesses it is important to know whether a particular customer is very angry. So you might have a classification problem like: is the author of the following review expressing anger? If someone is really angry, it may merit extra attention, having customer support or customer success reach out to understand the situation and resolve the problem."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "dba1a538",
"metadata": {
"height": 166
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"No\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Is the writer of the following review expressing anger? \\\n",
"The review is delimited with triple backticks. \\\n",
"Give your answer as either yes or no.\n",
"\n",
"Review text: ```{lamp_review}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "85bad324",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"否\n"
]
}
],
"source": [
"# Chinese version\n",
"prompt = f\"\"\"\n",
"以下评论的作者是否表达了愤怒?评论用三个反引号分隔。给出是或否的答案。\n",
"\n",
"评论文本: ```{lamp_review_zh}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "11ca57a2",
"metadata": {},
"source": [
"In this example the customer is not angry. Note that with conventional supervised learning, there is no way you could build all of these classifiers in a few minutes. We encourage you to try varying some of these prompts, perhaps asking whether the customer expresses delight, or whether any parts are missing, and see whether you can get the prompt to make different inferences about this lamp review."
]
},
{
"cell_type": "markdown",
"id": "936a771e-ca78-4e55-8088-2da6f3820ddc",
"metadata": {},
"source": [
"## Extracting the Product and Company Name from Customer Reviews\n",
"\n",
"Next, let's extract richer information from the customer review. Information extraction is the part of natural language processing (NLP) concerned with pulling specific things you want to know out of text. In this prompt, we ask it to identify the item that was purchased and the name of the company that made it.\n",
"\n",
"Again, if you are trying to summarize the many reviews on an online shopping site, it can be useful to figure out, for each review, what the item was, who made it, and whether the sentiment is positive or negative, so you can track sentiment trends for a particular item or manufacturer.\n",
"\n",
"In the example below, we ask it to format the response as a JSON object with the item and brand as keys."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "a13bea1b",
"metadata": {
"height": 285
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"Item\": \"lamp with additional storage\",\n",
" \"Brand\": \"Lumina\"\n",
"}\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Identify the following items from the review text: \n",
"- Item purchased by reviewer\n",
"- Company that made the item\n",
"\n",
"The review is delimited with triple backticks. \\\n",
"Format your response as a JSON object with \\\n",
"\"Item\" and \"Brand\" as the keys. \n",
"If the information isn't present, use \"unknown\" \\\n",
"as the value.\n",
"Make your response as short as possible.\n",
" \n",
"Review text: ```{lamp_review}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "e9ffe056",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"物品\": \"卧室灯\",\n",
" \"品牌\": \"Lumina\"\n",
"}\n"
]
}
],
"source": [
"# Chinese version\n",
"prompt = f\"\"\"\n",
"从评论文本中识别以下项目:\n",
"- 评论者购买的物品\n",
"- 制造该物品的公司\n",
"\n",
"评论文本用三个反引号分隔。将你的响应格式化为以 “物品” 和 “品牌” 为键的 JSON 对象。\n",
"如果信息不存在,请使用 “未知” 作为值。\n",
"让你的回应尽可能简短。\n",
" \n",
"评论文本: ```{lamp_review_zh}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "954d125d",
"metadata": {},
"source": [
"As shown above, it says the item is a bedroom lamp and the brand is Lumina. You can easily load this into a Python dictionary and do further processing on the output."
]
},
{
"cell_type": "markdown",
"id": "a38880a5-088f-4609-9913-f8fa41fb7ba0",
"metadata": {},
"source": [
"## Doing Multiple Tasks at Once\n",
"\n",
"Extracting all of the information above took 3 or 4 separate prompts, but you can actually write a single prompt that extracts all of it at once."
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "e7dda9e5",
"metadata": {
"height": 336
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"Sentiment\": \"positive\",\n",
" \"Anger\": false,\n",
" \"Item\": \"lamp with additional storage\",\n",
" \"Brand\": \"Lumina\"\n",
"}\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Identify the following items from the review text: \n",
"- Sentiment (positive or negative)\n",
"- Is the reviewer expressing anger? (true or false)\n",
"- Item purchased by reviewer\n",
"- Company that made the item\n",
"\n",
"The review is delimited with triple backticks. \\\n",
"Format your response as a JSON object with \\\n",
"\"Sentiment\", \"Anger\", \"Item\" and \"Brand\" as the keys.\n",
"If the information isn't present, use \"unknown\" \\\n",
"as the value.\n",
"Make your response as short as possible.\n",
"Format the Anger value as a boolean.\n",
"\n",
"Review text: ```{lamp_review}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "939c2b0e",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"Sentiment\": \"正面\",\n",
" \"Anger\": false,\n",
" \"Item\": \"卧室灯\",\n",
" \"Brand\": \"Lumina\"\n",
"}\n"
]
}
],
"source": [
"# Chinese version\n",
"prompt = f\"\"\"\n",
"从评论文本中识别以下项目:\n",
"- 情绪(正面或负面)\n",
"- 评论者是否表达了愤怒?(是或否)\n",
"- 评论者购买的物品\n",
"- 制造该物品的公司\n",
"\n",
"评论用三个反引号分隔。将您的响应格式化为 JSON 对象,以 “Sentiment”、“Anger”、“Item” 和 “Brand” 作为键。\n",
"如果信息不存在,请使用 “未知” 作为值。\n",
"让你的回应尽可能简短。\n",
"将 Anger 值格式化为布尔值。\n",
"\n",
"评论文本: ```{lamp_review_zh}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "5e09a673",
"metadata": {},
"source": [
"In this example we told it to format the anger value as a boolean, and it output a JSON object. Try your own variations, or even entirely different reviews, and see whether it still extracts these fields accurately."
]
},
{
"cell_type": "markdown",
"id": "235fc223-2c89-49ec-ac2d-78a8e74a43ac",
"metadata": {},
"source": [
"## Inferring Topics\n",
"\n",
"One cool application of large language models is inferring topics: given a long piece of text, what is it about? What topics does it cover?"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "8a74cc3e",
"metadata": {
"height": 472
},
"outputs": [],
"source": [
"story = \"\"\"\n",
"In a recent survey conducted by the government, \n",
"public sector employees were asked to rate their level \n",
"of satisfaction with the department they work at. \n",
"The results revealed that NASA was the most popular \n",
"department with a satisfaction rating of 95%.\n",
"\n",
"One NASA employee, John Smith, commented on the findings, \n",
"stating, \"I'm not surprised that NASA came out on top. \n",
"It's a great place to work with amazing people and \n",
"incredible opportunities. I'm proud to be a part of \n",
"such an innovative organization.\"\n",
"\n",
"The results were also welcomed by NASA's management team, \n",
"with Director Tom Johnson stating, \"We are thrilled to \n",
"hear that our employees are satisfied with their work at NASA. \n",
"We have a talented and dedicated team who work tirelessly \n",
"to achieve our goals, and it's fantastic to see that their \n",
"hard work is paying off.\"\n",
"\n",
"The survey also revealed that the \n",
"Social Security Administration had the lowest satisfaction \n",
"rating, with only 45% of employees indicating they were \n",
"satisfied with their job. The government has pledged to \n",
"address the concerns raised by employees in the survey and \n",
"work towards improving job satisfaction across all departments.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "811ff13f",
"metadata": {},
"outputs": [],
"source": [
"# Chinese version\n",
"story_zh = \"\"\"\n",
"在政府最近进行的一项调查中,要求公共部门的员工对他们所在部门的满意度进行评分。\n",
"调查结果显示,NASA 是最受欢迎的部门,满意度为 95%。\n",
"\n",
"一位 NASA 员工 John Smith 对这一发现发表了评论,他表示:\n",
"“我对 NASA 排名第一并不感到惊讶。这是一个与了不起的人们和令人难以置信的机会共事的好地方。我为成为这样一个创新组织的一员感到自豪。”\n",
"\n",
"NASA 的管理团队也对这一结果表示欢迎,主管 Tom Johnson 表示:\n",
"“我们很高兴听到我们的员工对 NASA 的工作感到满意。\n",
"我们拥有一支才华横溢、忠诚敬业的团队,他们为实现我们的目标不懈努力,看到他们的辛勤工作得到回报是太棒了。”\n",
"\n",
"调查还显示,社会保障管理局的满意度最低,只有 45%的员工表示他们对工作满意。\n",
"政府承诺解决调查中员工提出的问题,并努力提高所有部门的工作满意度。\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"id": "a8ea91d6-e841-4ee2-bed9-ca4a36df177f",
"metadata": {},
"source": [
"## Inferring Five Topics\n",
"\n",
"Above is a fictitious newspaper article about how government workers feel about the agencies they work for. We can ask the model to determine five topics being discussed, describe each in one or two words, and format the output as a comma-separated list."
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "5c267cbe",
"metadata": {
"height": 217
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"government survey, public sector employees, job satisfaction, NASA, Social Security Administration\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Determine five topics that are being discussed in the \\\n",
"following text, which is delimited by triple backticks.\n",
"\n",
"Make each item one or two words long. \n",
"\n",
"Format your response as a list of items separated by commas.\n",
"\n",
"Text sample: ```{story}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "f92f90fe",
"metadata": {
"height": 30,
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"['government survey',\n",
" ' public sector employees',\n",
" ' job satisfaction',\n",
" ' NASA',\n",
" ' Social Security Administration']"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response.split(sep=',')"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "cab27b65",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"调查结果, NASA, 社会保障管理局, 员工满意度, 政府承诺\n"
]
}
],
"source": [
"# Chinese version\n",
"prompt = f\"\"\"\n",
"确定以下给定文本中讨论的五个主题。\n",
"\n",
"每个主题用1-2个单词概括。\n",
"\n",
"输出时用逗号分割每个主题。\n",
"\n",
"给定文本: ```{story_zh}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "34be1d2a-1309-4512-841a-b6f67338938b",
"metadata": {},
"source": [
"## Building a News Alert for Particular Topics\n",
"\n",
"Suppose we run a news website and these are the topics we track: NASA, local government, engineering, employee satisfaction, federal government. Given a news article, we want to figure out which of these topics it covers. We can use a prompt like: determine whether each item in the following list of topics is a topic in the text below; give your answer as a list of 0s and 1s for each topic."
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "94b8fa65",
"metadata": {
"height": 81
},
"outputs": [],
"source": [
"topic_list = [\n",
" \"nasa\", \"local government\", \"engineering\", \n",
" \"employee satisfaction\", \"federal government\"\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "626c5b8e",
"metadata": {
"height": 234
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"nasa: 1\n",
"local government: 0\n",
"engineering: 0\n",
"employee satisfaction: 1\n",
"federal government: 1\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"Determine whether each item in the following list of \\\n",
"topics is a topic in the text below, which\n",
"is delimited with triple backticks.\n",
"\n",
"Give your answer as a list with 0 or 1 for each topic.\n",
"\n",
"List of topics: {\", \".join(topic_list)}\n",
"\n",
"Text sample: ```{story}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "902a7c74",
"metadata": {
"height": 79
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ALERT: New NASA story!\n"
]
}
],
"source": [
"topic_dict = {i.split(': ')[0]: int(i.split(': ')[1]) for i in response.split(sep='\\n')}\n",
"if topic_dict['nasa'] == 1:\n",
" print(\"ALERT: New NASA story!\")"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "9f53d337",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"美国航空航天局:1\n",
"地方政府:1\n",
"工程:0\n",
"员工满意度:1\n",
"联邦政府:1\n"
]
}
],
"source": [
"# Chinese version\n",
"prompt = f\"\"\"\n",
"判断主题列表中的每一项是否是给定文本中的一个话题,\n",
"\n",
"以列表的形式给出答案,每个主题用 0 或 1。\n",
"\n",
"主题列表:美国航空航天局、地方政府、工程、员工满意度、联邦政府\n",
"\n",
"给定文本: ```{story_zh}```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "08247dbf",
"metadata": {},
"source": [
"So this story is about NASA, not about local government, and not about engineering. It is about employee satisfaction and about the federal government. In machine learning this is sometimes called a zero-shot setup, because we gave it no labeled training data; from the prompt alone, it determined which topics the news article covers.\n",
"\n",
"We can also use this process to generate a news alert. Suppose I really like the work NASA does; I can build a system that prints an alert whenever NASA news comes up."
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "53bf1abd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"提醒: 关于美国航空航天局的新消息\n"
]
}
],
"source": [
"# 将“主题:0/1”格式的回复逐行解析为字典(注意这里分隔符是全角冒号)\n",
"topic_dict = {i.split(':')[0]: int(i.split(':')[1]) for i in response.split(sep='\\n')}\n",
"if topic_dict['美国航空航天局'] == 1:\n",
" print(\"提醒: 关于美国航空航天局的新消息\")"
]
},
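上面两个单元格里的字典推导式假设模型严格按照“主题: 0/1”的格式逐行输出;一旦回复里出现空行、额外说明文字或不同的冒号,`int(i.split(...)[1])` 就会抛出异常。下面是一个更稳健的解析草稿(仅为示意,函数名 `parse_topic_flags` 是本文假设的,并非课程代码):

```python
def parse_topic_flags(response: str) -> dict:
    """把“主题: 0/1”格式的多行回复解析为字典,自动跳过无法解析的行。"""
    flags = {}
    for line in response.splitlines():
        # 同时兼容半角冒号和全角冒号
        for sep in (':', ':'):
            if sep in line:
                topic, _, value = line.partition(sep)
                value = value.strip()
                if value in ('0', '1'):
                    flags[topic.strip()] = int(value)
                break
    return flags

print(parse_topic_flags("nasa: 1\nlocal government: 0\n美国航空航天局:1"))
```

这样即使模型偶尔改变输出格式,后续的告警逻辑也不会直接崩溃。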
{
"cell_type": "markdown",
"id": "76ccd189",
"metadata": {},
"source": [
"这就是关于推断的全部内容。仅用几分钟,我们就构建了多个对文本进行推理的系统,而在过去,这往往需要熟练的机器学习开发人员花费数天甚至数周。无论对资深开发者还是新手而言,这都令人兴奋:借助 prompt,可以非常快速地构建并上手相当复杂的自然语言处理任务。"
]
},
{
"cell_type": "markdown",
"id": "f88408ae-469a-4b02-a043-f6b4f0b14bf9",
"metadata": {},
"source": [
"## 自己动手试试吧!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1bd3553f",
"metadata": {
"height": 30
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {
"height": "calc(100% - 180px)",
"left": "10px",
"top": "150px",
"width": "256px"
},
"toc_section_display": true,
"toc_window_display": true
}
},
"nbformat": 4,
"nbformat_minor": 5
}
================================================
FILE: notebooks-zh/6. 转换 Transforming.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"id": "78624add",
"metadata": {},
"source": [
"## 1 引言"
]
},
{
"cell_type": "markdown",
"id": "2fac57c2",
"metadata": {},
"source": [
"LLM非常擅长将输入转换成不同的格式,例如多语种文本翻译、拼写及语法纠正、语气调整、格式转换等。\n",
"\n",
"本章节将介绍如何使用编程的方式,调用API接口来实现“文本转换”功能。"
]
},
{
"cell_type": "markdown",
"id": "f7816496",
"metadata": {},
"source": [
"首先,我们导入 OpenAI 包,加载 API 密钥,并定义 get_completion 函数。"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "ac57ad72",
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"import os\n",
"OPENAI_API_KEY = os.environ.get(\"OPENAI_API_KEY2\")\n",
"openai.api_key = OPENAI_API_KEY\n",
"\n",
"def get_completion(prompt, model=\"gpt-3.5-turbo\", temperature=0): \n",
" messages = [{\"role\": \"user\", \"content\": prompt}]\n",
" response = openai.ChatCompletion.create(\n",
" model=model,\n",
" messages=messages,\n",
" temperature=temperature, # 值越低则输出文本随机性越低\n",
" )\n",
" return response.choices[0].message[\"content\"]"
]
},
{
"cell_type": "markdown",
"id": "bf3733d4",
"metadata": {},
"source": [
"## 2 文本翻译"
]
},
{
"cell_type": "markdown",
"id": "1b418e32",
"metadata": {},
"source": [
"**中文转西班牙语**"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8a5bee0c",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hola, me gustaría ordenar una batidora.\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"将以下中文翻译成西班牙语: \n",
"```您好,我想订购一个搅拌机。```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "e3e922b4",
"metadata": {},
"source": [
"**识别语种**"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "c2c66002",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"这是法语。\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"请告诉我以下文本是什么语种: \n",
"```Combien coûte le lampadaire?```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "c1841354",
"metadata": {},
"source": [
"**多语种翻译**"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "b0c4fa41",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"中文:我想订购一个篮球。\n",
"英文:I want to order a basketball.\n",
"法语:Je veux commander un ballon de basket.\n",
"西班牙语:Quiero pedir una pelota de baloncesto.\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"请将以下文本分别翻译成中文、英文、法语和西班牙语: \n",
"```I want to order a basketball.```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "68723ba5",
"metadata": {},
"source": [
"**翻译+正式语气**"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "2c52ca54",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"正式语气:请问您需要订购枕头吗?\n",
"非正式语气:你要不要订一个枕头?\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"请将以下文本翻译成中文,分别展示成正式与非正式两种语气: \n",
"```Would you like to order a pillow?```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "b2dc4c56",
"metadata": {},
"source": [
"**通用翻译器**"
]
},
{
"cell_type": "markdown",
"id": "54b00aa4",
"metadata": {},
"source": [
"随着全球化与跨境商务的发展,交流的用户可能来自各个不同的国家,使用不同的语言,因此我们需要一个通用翻译器,识别各个消息的语种,并翻译成目标用户的母语,从而实现更方便的跨国交流。"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "21f3af91",
"metadata": {},
"outputs": [],
"source": [
"user_messages = [\n",
" \"La performance du système est plus lente que d'habitude.\", # System performance is slower than normal \n",
" \"Mi monitor tiene píxeles que no se iluminan.\", # My monitor has pixels that are not lighting\n",
" \"Il mio mouse non funziona\", # My mouse is not working\n",
" \"Mój klawisz Ctrl jest zepsuty\", # My keyboard has a broken control key\n",
" \"我的屏幕在闪烁\" # My screen is flashing\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "6a884190",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"原始消息 (法语): La performance du système est plus lente que d'habitude.\n",
"\n",
"中文翻译:系统性能比平时慢。\n",
"英文翻译:The system performance is slower than usual. \n",
"=========================================\n",
"原始消息 (西班牙语): Mi monitor tiene píxeles que no se iluminan.\n",
"\n",
"中文翻译:我的显示器有一些像素点不亮。\n",
"英文翻译:My monitor has pixels that don't light up. \n",
"=========================================\n",
"原始消息 (意大利语): Il mio mouse non funziona\n",
"\n",
"中文翻译:我的鼠标不工作了。\n",
"英文翻译:My mouse is not working. \n",
"=========================================\n",
"原始消息 (波兰语): Mój klawisz Ctrl jest zepsuty\n",
"\n",
"中文翻译:我的Ctrl键坏了\n",
"英文翻译:My Ctrl key is broken. \n",
"=========================================\n",
"原始消息 (中文): 我的屏幕在闪烁\n",
"\n",
"中文翻译:我的屏幕在闪烁。\n",
"英文翻译:My screen is flickering. \n",
"=========================================\n"
]
}
],
"source": [
"for issue in user_messages:\n",
" prompt = f\"告诉我以下文本是什么语种,直接输出语种,如法语,无需输出标点符号: ```{issue}```\"\n",
" lang = get_completion(prompt)\n",
" print(f\"原始消息 ({lang}): {issue}\\n\")\n",
"\n",
" prompt = f\"\"\"\n",
" 将以下消息分别翻译成英文和中文,并写成\n",
" 中文翻译:xxx\n",
" 英文翻译:yyy\n",
" 的格式:\n",
" ```{issue}```\n",
" \"\"\"\n",
" response = get_completion(prompt)\n",
" print(response, \"\\n=========================================\")"
]
},
{
"cell_type": "markdown",
"id": "6ab558a2",
"metadata": {},
"source": [
"## 3 语气/风格调整"
]
},
{
"cell_type": "markdown",
"id": "b85ae847",
"metadata": {},
"source": [
"写作的语气往往会根据受众对象而有所调整。例如,对于工作邮件,我们常常需要使用正式语气与书面用词,而对同龄朋友的微信聊天,可能更多地会使用轻松、口语化的语气。"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "84ce3099",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"尊敬的XXX(收件人姓名):\n",
"\n",
"您好!我是XXX(发件人姓名),在此向您咨询一个问题。上次我们交流时,您提到我们部门需要采购显示器,但我忘记了您所需的尺寸是多少英寸。希望您能够回复我,以便我们能够及时采购所需的设备。\n",
"\n",
"谢谢您的帮助!\n",
"\n",
"此致\n",
"\n",
"敬礼\n",
"\n",
"XXX(发件人姓名)\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"将以下文本改写成商务信函的格式: \n",
"```小老弟,我小羊,上回你说咱部门要采购的显示器是多少寸来着?```\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "98df9009",
"metadata": {},
"source": [
"## 4 格式转换"
]
},
{
"cell_type": "markdown",
"id": "0bf9c074",
"metadata": {},
"source": [
"ChatGPT 非常擅长在不同格式之间转换,例如 JSON 到 HTML、XML、Markdown 等。在下面的例子中,我们有一个包含餐厅员工姓名和电子邮件列表的 JSON,希望将其转换为 HTML 表格。"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "fad3f358",
"metadata": {},
"outputs": [],
"source": [
"data_json = { \"resturant employees\" :[ \n",
" {\"name\":\"Shyam\", \"email\":\"shyamjaiswal@gmail.com\"},\n",
" {\"name\":\"Bob\", \"email\":\"bob32@gmail.com\"},\n",
" {\"name\":\"Jai\", \"email\":\"jai87@gmail.com\"}\n",
"]}"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "f54e7398",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<table>\n",
" <caption>resturant employees</caption>\n",
" <thead>\n",
" <tr>\n",
" <th>name</th>\n",
" <th>email</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>Shyam</td>\n",
" <td>shyamjaiswal@gmail.com</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Bob</td>\n",
" <td>bob32@gmail.com</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Jai</td>\n",
" <td>jai87@gmail.com</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n"
]
}
],
"source": [
"prompt = f\"\"\"\n",
"将以下Python字典从JSON转换为HTML表格,保留表格标题和列名:{data_json}\n",
"\"\"\"\n",
"response = get_completion(prompt)\n",
"print(response)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "a0026f3c",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
" <caption>resturant employees</caption>\n",
" <thead>\n",
" <tr>\n",
" <th>name</th>\n",
" <th>email</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>Shyam</td>\n",
" <td>shyamjaiswal@gmail.com</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Bob</td>\n",
" <td>bob32@gmail.com</td>\n",
" </tr>\n",
" <tr>\n",
" <td>Jai</td>\n",
" <td>jai87@gmail.com</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from IPython.display import display, Markdown, Latex, HTML, JSON\n",
"display(HTML(response))"
]
},
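让模型做格式转换虽然方便,但输出并不保证每次完全一致。如果转换规则是固定的,也可以直接用标准库确定性地生成 HTML,再与模型输出对照(以下只是一个示意草稿,函数 `dict_to_html_table` 是本文假设的,并非课程代码):

```python
import html

def dict_to_html_table(caption: str, rows: list) -> str:
    """把 [{列名: 值}, ...] 形式的数据确定性地渲染为 HTML 表格,可用于核对模型的转换结果。"""
    headers = list(rows[0].keys())
    head = ''.join(f'<th>{html.escape(h)}</th>' for h in headers)
    body = ''.join(
        '<tr>' + ''.join(f'<td>{html.escape(str(row[h]))}</td>' for h in headers) + '</tr>'
        for row in rows
    )
    return (f'<table><caption>{html.escape(caption)}</caption>'
            f'<thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>')

# 与上文 data_json 相同结构的示例数据("resturant" 保留了原数据中的拼写)
employees = [{"name": "Shyam", "email": "shyamjaiswal@gmail.com"},
             {"name": "Bob", "email": "bob32@gmail.com"}]
print(dict_to_html_table("resturant employees", employees))
```

对于这类规则明确的机械转换,确定性代码往往比调用模型更便宜、更可靠;模型的优势在于规则模糊或输入不规整的场景。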
{
"cell_type": "markdown",
"id": "29b7167b",
"metadata": {},
"source": [
"## 5 拼写及语法纠正"
]
},
{
"cell_type": "markdown",
"id": "22776140",
"metadata": {},
"source": [
"拼写及语法的检查与纠正是一个十分常见的需求。特别是在使用非母语写作时(例如撰写英文论文),这一点尤为重要。\n",
"\n",
"下面给出一个例子:有一个句子列表,其中有些句子存在拼写或语法问题,有些则没有。我们循环遍历每个句子,要求模型校对文本:如果正确则输出“未发现错误”,如果有误则输出纠正后的文本。"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "b7d04bc0",
"metadata": {},
"outputs": [],
"source": [
"text = [ \n",
" \"The girl with the black and white puppies have a ball.\", # The girl has a ball.\n",
" \"Yolanda has her notebook.\", # ok\n",
" \"Its going to be a long day. Does the car need it’s oil changed?\", # Homonyms\n",
" \"Their goes my freedom. There going to bring they’re suitcases.\", # Homonyms\n",
" \"Your going to need you’re notebook.\", # Homonyms\n",
" \"That medicine effects my ability to sleep. Have you heard of the butterfly affect?\", # Homonyms\n",
" \"This phrase is to cherck chatGPT for speling abilitty\" # spelling\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "1ef55b7b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 The girl with the black and white puppies has a ball.\n",
"1 未发现错误。\n",
"2 It's going to be a long day. Does the car need its oil changed?\n",
"3 Their goes my freedom. They're going to bring their suitcases.\n",
"4 输出:You're going to need your notebook.\n",
"5 That medicine affects my ability to sleep. Have you heard of the butterfly effect?\n",
"6 This phrase is to check chatGPT for spelling ability.\n"
]
}
],
"source": [
"for i in range(len(text)):\n",
" prompt = f\"\"\"请校对并更正以下文本,注意纠正文本保持原始语种,无需输出原始文本。\n",
" 如果您没有发现任何错误,请说“未发现错误”。\n",
" \n",
" 例如:\n",
" 输入:I are happy.\n",
" 输出:I am happy.\n",
" ```{text[i]}```\"\"\"\n",
" response = get_completion(prompt)\n",
" print(i, response)"
]
},
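注意上面第 4 条结果里,模型把示例中的“输出:”标签也一并带了出来。做后处理时可以先把这类前缀剥掉(示意草稿,函数名 `strip_label` 为本文假设):

```python
def strip_label(response: str) -> str:
    """去掉模型偶尔照抄的“输出:”前缀,其余内容原样返回。"""
    for label in ('输出:', '输出:'):
        if response.startswith(label):
            return response[len(label):].strip()
    return response.strip()

print(strip_label("输出:You're going to need your notebook."))
# → You're going to need your notebook.
```

更彻底的做法是在 prompt 中明确要求“只输出纠正后的句子本身,不要任何标签”,后处理只作为兜底。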
{
"cell_type": "markdown",
"id": "538181e0",
"metadata": {},
"source": [
"以下是一个简单的类 Grammarly 纠错示例:输入原始文本,输出纠正后的文本,并基于 Redlines 展示纠错过程。"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "6696b06a",
"metadata": {},
"outputs": [],
"source": [
"text = f\"\"\"\n",
"Got this for my daughter for her birthday cuz she keeps taking \\\n",
"mine from my room. Yes, adults also like pandas too. She takes \\\n",
"it everywhere with her, and it's super soft and cute. One of the \\\n",
"ears is a bit lower than the other, and I don't think that was \\\n",
"designed to be asymmetrical. It's a bit small for what I paid for it \\\n",
"though. I think there might be other options that are bigger for \\\n",
"the same price. It arrived a day earlier than expected, so I got \\\n",
"to play with it myself before I gave it to my daughter.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "50cca36e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"I got this for my daughter's birthday because she keeps taking mine from my room. Yes, adults also like pandas too. She takes it everywhere with her, and it's super soft and cute. However, one of the ears is a bit lower than the other, and I don't think that was designed to be asymmetrical. It's also a bit smaller than I expected for the price. I think there might be other options that are bigger for the same price. On the bright side, it arrived a day earlier than expected, so I got to play with it myself before giving it to my daughter.\n"
]
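除了基于 Redlines 的红线对照之外,也可以用标准库 difflib 按词对比原文与纠正后的文本,列出具体改动(以下为示意草稿,函数名 `show_edits` 为本文假设,并非 Redlines 的 API):

```python
import difflib

def show_edits(original: str, corrected: str) -> list:
    """按词对比两段文本,返回 (操作, 原词, 新词) 列表,近似 Redlines 的红线效果。"""
    a, b = original.split(), corrected.split()
    sm = difflib.SequenceMatcher(None, a, b)
    edits = []
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        if op != 'equal':
            edits.append((op, ' '.join(a[i1:i2]), ' '.join(b[j1:j2])))
    return edits

print(show_edits("I are happy today", "I am happy today"))
# → [('replace', 'are', 'am')]
```

把上文的评论原文 `text` 和模型返回的 `response` 传入,即可得到逐词的修改清单,便于人工复核模型到底改了什么。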