[Deep Learning | Transformer] Transformers Tutorial: One-Call Prediction with pipeline
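1. Introduction
Transformers is a library of pretrained, state-of-the-art models for natural language processing (NLP), computer vision, and audio and speech processing tasks. The library contains not only Transformer models but also non-Transformer models, such as modern convolutional networks for computer vision tasks.
pipeline() loads a model for a given task and makes inference simple: even if you have no experience with a particular modality, or are unfamiliar with the underlying code behind a model, you can still run inference with it through pipeline().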
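Conceptually, a pipeline chains three stages: preprocessing the raw input, running a model, and postprocessing the raw scores into labeled predictions. The sketch below mimics that flow in plain Python with a stand-in scoring function. The names `toy_pipeline`, `fake_model`, and `LABELS` are hypothetical and are not part of the Transformers API; the real library replaces each stage with a tokenizer or image processor, a neural network, and task-specific decoding.

```python
import math

LABELS = ["NEGATIVE", "POSITIVE"]  # hypothetical label set


def preprocess(text):
    # Stand-in for tokenization: lowercase and split on whitespace.
    return text.lower().split()


def fake_model(tokens):
    # Stand-in for a neural network: returns one raw score (logit) per label.
    positive_words = {"best", "great", "good"}
    hits = sum(token.strip("!.,") in positive_words for token in tokens)
    return [-float(hits), float(hits)]


def postprocess(logits):
    # Softmax turns the logits into probabilities that sum to 1.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": LABELS[best], "score": round(probs[best], 4)}


def toy_pipeline(text):
    return postprocess(fake_model(preprocess(text)))


result = toy_pipeline("Hugging Face is the best thing since sliced bread!")
print(result)
```

The point of the sketch is only the shape of the flow: raw input goes in, a dictionary with `label` and `score` comes out, and the caller never touches the intermediate tensors.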
2. Computer vision
2.1 Image classification
Image classification assigns an image a label from a predefined set of classes.
from transformers import pipeline
classifier = pipeline(task="image-classification")
preds = classifier(
"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
print(*preds, sep="\n")
The output is:
{'score': 0.4335, 'label': 'lynx, catamount'}
{'score': 0.0348, 'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor'}
{'score': 0.0324, 'label': 'snow leopard, ounce, Panthera uncia'}
{'score': 0.0239, 'label': 'Egyptian cat'}
{'score': 0.0229, 'label': 'tiger cat'}
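Because the classifier returns a list of dictionaries with `score` and `label` keys, ordinary Python is enough to post-process it. The sketch below reuses the scores listed above as literal data, so it runs without downloading a model. The helper name `confident_labels` and the `min_score` threshold of 0.03 are illustrative assumptions, not library defaults.

```python
# Predictions copied from the output listed above.
preds = [
    {"score": 0.4335, "label": "lynx, catamount"},
    {"score": 0.0348, "label": "cougar, puma, catamount, mountain lion, painter, panther, Felis concolor"},
    {"score": 0.0324, "label": "snow leopard, ounce, Panthera uncia"},
    {"score": 0.0239, "label": "Egyptian cat"},
    {"score": 0.0229, "label": "tiger cat"},
]


def confident_labels(preds, min_score=0.03):
    # Keep only predictions at or above the (assumed) threshold,
    # and shorten each label to its first synonym.
    return [
        (pred["label"].split(",")[0], pred["score"])
        for pred in preds
        if pred["score"] >= min_score
    ]


top = max(preds, key=lambda pred: pred["score"])
kept = confident_labels(preds)
print(top["label"])
print(kept)
```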
2.2 Object detection
Object detection identifies the objects in an image and where each object is located within the image.
from transformers import pipeline
detector = pipeline(task="object-detection")
preds = detector(
"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)
preds = [{"score": round(pred["score"], 4), "label": pred["label"], "box": pred["box"]} for pred in preds]
The output is:
[{'score': 0.9865,
'label': 'cat',
'box': {'xmin': 178, 'ymin': 154, 'xmax': 882, 'ymax': 598}}]
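The `box` entry gives the corner coordinates of the bounding box in pixels. As a small sketch of how to work with it, the code below derives the width, height, area, and center from the values shown above. The helper `box_geometry` is a hypothetical name, and the code assumes the usual image convention that `xmin`/`ymin` is the top-left corner and `xmax`/`ymax` the bottom-right corner.

```python
# Prediction copied from the output listed above.
pred = {
    "score": 0.9865,
    "label": "cat",
    "box": {"xmin": 178, "ymin": 154, "xmax": 882, "ymax": 598},
}


def box_geometry(box):
    # Width and height follow directly from the corner coordinates.
    width = box["xmax"] - box["xmin"]
    height = box["ymax"] - box["ymin"]
    center = (box["xmin"] + width / 2, box["ymin"] + height / 2)
    return {"width": width, "height": height, "area": width * height, "center": center}


geometry = box_geometry(pred["box"])
print(geometry)
```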
2.3 Image segmentation
Image segmentation is a pixel-level task that assigns every pixel in an image to a class.
from transformers import pipeline
segmenter = pipeline(task="image-segmentation")
preds = segmenter(
"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)
preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
print(*preds, sep="\n")
The output is:
{'score': 0.9879, 'label': 'LABEL_184'}
{'score': 0.9973, 'label': 'snow'}
{'score': 0.9972, 'label': 'cat'}
2.4 Depth estimation
Depth estimation predicts the distance of every pixel in an image from the camera.
from transformers import pipeline
depth_estimator = pipeline(task="depth-estimation")
preds = depth_estimator(
"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
)
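A per-pixel depth map is usually easier to inspect once it is rescaled to grayscale. The sketch below shows one common way to do that, min-max normalization into the 0..255 range, on a tiny hand-written grid. The function `to_grayscale` and the sample values are made up for illustration; the code does not touch the pipeline's actual return value, and whether larger numbers mean nearer or farther depends on the model.

```python
def to_grayscale(depth_map):
    # Min-max scale every value into the 0..255 range.
    flat = [value for row in depth_map for value in row]
    low, high = min(flat), max(flat)
    span = (high - low) or 1.0  # avoid dividing by zero on a flat map
    return [[round(255 * (value - low) / span) for value in row] for row in depth_map]


# Hypothetical 2x3 depth map with made-up values.
depth_map = [[0.5, 1.0, 1.5], [2.0, 2.5, 3.0]]
gray = to_grayscale(depth_map)
print(gray)
```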
3. NLP
3.1 Text classification
Text classification assigns a sequence of text a label from a predefined set of classes.
from transformers import pipeline
classifier = pipeline(task="sentiment-analysis")
preds = classifier("Hugging Face is the best thing since sliced bread!")
3.2 Token classification
Token classification assigns each token a label from a predefined set of classes.
from transformers import pipeline
classifier = pipeline(task="ner")
preds = classifier("Hugging Face is a French company based in New York City.")
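Token-level predictions are often merged into whole entities before use. The sketch below illustrates the idea: adjacent tokens with the same entity type are joined using their character offsets. The `tokens` list is a hand-written sample in the general shape of token-classification output (an `entity` tag plus `start` and `end` offsets), not real model output, and `group_entities` is a hypothetical helper. The library offers its own grouping through the `aggregation_strategy` argument; this code only shows the principle.

```python
text = "Hugging Face is a French company based in New York City."

# Hand-written sample (entity tag plus character offsets); not real model output.
tokens = [
    {"entity": "I-ORG", "start": 0, "end": 7},
    {"entity": "I-ORG", "start": 8, "end": 12},
    {"entity": "I-MISC", "start": 18, "end": 24},
    {"entity": "I-LOC", "start": 42, "end": 45},
    {"entity": "I-LOC", "start": 46, "end": 50},
    {"entity": "I-LOC", "start": 51, "end": 55},
]


def group_entities(text, tokens):
    spans = []
    for token in tokens:
        kind = token["entity"].split("-")[-1]  # drop the B-/I- prefix
        last = spans[-1] if spans else None
        # Extend the previous span when the type matches and the tokens are
        # adjacent (at most one character, such as a space, between them).
        if last and last["type"] == kind and token["start"] - last["end"] <= 1:
            last["end"] = token["end"]
        else:
            spans.append({"type": kind, "start": token["start"], "end": token["end"]})
    return [(span["type"], text[span["start"]:span["end"]]) for span in spans]


entities = group_entities(text, tokens)
print(entities)
```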
3.3 Question answering
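Question answering returns an answer to a question, sometimes with context (open-domain) and sometimes without context (closed-domain).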
from transformers import pipeline
question_answerer = pipeline(task="question-answering")
preds = question_answerer(
question="What is the name of the repository?",
context="The name of the repository is huggingface/transformers",
)
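When a context is supplied, the answer is extracted as a span of that context, and the prediction carries `start` and `end` character offsets alongside the `answer` text. The sketch below checks that relationship on a hand-built stand-in for a prediction: the offsets are computed with `str.find` rather than produced by a model, and `answer_in_context` is a hypothetical helper.

```python
context = "The name of the repository is huggingface/transformers"

# Hand-built stand-in for a prediction: the offsets are computed with
# str.find rather than produced by a model.
answer = "huggingface/transformers"
start = context.find(answer)
pred = {"answer": answer, "start": start, "end": start + len(answer)}


def answer_in_context(pred, context):
    # For extractive QA the answer should equal the context slice.
    return context[pred["start"]:pred["end"]] == pred["answer"]


print(pred["start"], pred["end"], answer_in_context(pred, context))
```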
3.4 Summarization
Summarization creates a shorter version of a longer text while trying to preserve most of the meaning of the original document.
from transformers import pipeline
summarizer = pipeline(task="summarization")
summarizer(
"In this work, we presented the Transformer, the first sequence transduction model based entirely on attention, replacing the recurrent layers most commonly used in encoder-decoder architectures with multi-headed self-attention. For translation tasks, the Transformer can be trained significantly faster than architectures based on recurrent or convolutional layers. On both WMT 2014 English-to-German and WMT 2014 English-to-French translation tasks, we achieve a new state of the art. In the former task our best model outperforms even all previously reported ensembles."
)
3.5 Translation
Translation converts text in one language into another language.
from transformers import pipeline
text = "translate English to French: Hugging Face is a community-based open-source platform for machine learning."
translator = pipeline(task="translation", model="t5-small")
preds = translator(text)
3.6 Language modeling
3.6.1 Predicting the next word in a sequence
from transformers import pipeline
prompt = "Hugging Face is a community-based open-source platform for machine learning."
generator = pipeline(task="text-generation")
preds = generator(prompt)
3.6.2 Predicting a masked token in a sequence
from transformers import pipeline
text = "Hugging Face is a community-based open-source <mask> for machine learning."
fill_mask = pipeline(task="fill-mask")
preds = fill_mask(text, top_k=1)