# Advanced Usage

This document covers some of Requests' more advanced features.


## Custom Authentication

Requests allows you to specify your own authentication mechanism.

Any callable which is passed as the auth argument to a request method
will have a chance to modify the request before it is dispatched.

Custom authentication mechanisms are implemented as subclasses of
requests.auth.AuthBase, and are easy to define.

Requests provides two common authentication scheme implementations in
requests.auth: HTTPBasicAuth and HTTPDigestAuth.

Let's pretend that we have a web service that will only respond if the
X-Pizza header is set to a password value. Unlikely, but just go with
it.

from requests.auth import AuthBase

class PizzaAuth(AuthBase):
    """Attaches HTTP Pizza Authentication to the given Request object."""
    def __init__(self, username):
        # setup any auth-related data here
        self.username = username

    def __call__(self, r):
        # modify and return the request
        r.headers['X-Pizza'] = self.username
        return r

Then, we can make a request using our PizzaAuth:

>>> requests.get('http://pizzabin.org/admin', auth=PizzaAuth('kenneth'))
<Response [200]>

## Example: Specific SSL Version

The Requests team has made a specific choice to use whatever SSL version
is default in the underlying library
(urllib3). Normally this is fine,
but from time to time, you might find yourself needing to connect to a
service-endpoint that uses a version that isn’t compatible with the
default.

You can use Transport Adapters for this by taking most of the existing
implementation of HTTPAdapter, and adding a parameter ssl_version
that gets passed-through to urllib3. We’ll make a TA that instructs
the library to use SSLv3:

import ssl

from requests.adapters import HTTPAdapter
from requests.packages.urllib3.poolmanager import PoolManager


class Ssl3HttpAdapter(HTTPAdapter):
    """"Transport adapter" that allows us to use SSLv3."""

    def init_poolmanager(self, connections, maxsize, block=False):
        self.poolmanager = PoolManager(num_pools=connections,
                                       maxsize=maxsize,
                                       block=block,
                                       ssl_version=ssl.PROTOCOL_SSLv3)
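
SSLv3 has since been removed from most Python builds, so as a hedged,
modern analogue here is the same adapter pattern pinned to TLS 1.2 and
mounted onto a session (the hostname is a placeholder; on newer
urllib3 releases ssl_version is deprecated in favour of
ssl_minimum_version):

```python
import ssl

import requests
from requests.adapters import HTTPAdapter
from urllib3.poolmanager import PoolManager


class Tls12HttpAdapter(HTTPAdapter):
    """"Transport adapter" pinned to TLS 1.2, same pattern as above."""

    def init_poolmanager(self, connections, maxsize, block=False):
        self.poolmanager = PoolManager(num_pools=connections,
                                       maxsize=maxsize,
                                       block=block,
                                       ssl_version=ssl.PROTOCOL_TLSv1_2)


s = requests.Session()
s.mount('https://legacy.example.com', Tls12HttpAdapter())
# Requests picks the adapter with the longest matching URL prefix
print(type(s.get_adapter('https://legacy.example.com/api')).__name__)
```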

## Streaming Requests

With requests.Response.iter_lines() you can easily iterate over
streaming APIs such as the Twitter Streaming API. Simply set stream to
True and iterate over the response with iter_lines():

import json
import requests

r = requests.get('http://httpbin.org/stream/20', stream=True)

for line in r.iter_lines():

    # filter out keep-alive new lines
    if line:
        print(json.loads(line))

## HTTP Verbs (with examples)

Requests provides access to almost the full range of HTTP verbs: GET,
OPTIONS, HEAD, POST, PUT, PATCH and DELETE. The following gives
detailed examples of using these verbs in Requests, using the GitHub
API.

We will begin with the verb most commonly used: GET. HTTP GET is an
idempotent method that returns a resource from a given URL. As a
result, it is the verb you ought to use when attempting to retrieve
data from a web location. An example usage would be attempting to get
information about a specific commit from GitHub. Suppose we wanted
commit a050faf on Requests. We would get it like so:

>>> import requests
>>> r = requests.get('https://api.github.com/repos/kennethreitz/requests/git/commits/a050faf084662f3a352dd1a941f2c7c9f886d4ad')

We should confirm that GitHub responded correctly. If it has, we want
to work out what type of content it returned. Do this like so:

>>> if (r.status_code == requests.codes.ok):
...     print r.headers['content-type']
...
application/json; charset=utf-8

We can see that GitHub returned JSON. Great, that means we can use the
r.json method to parse it into Python objects.

>>> commit_data = r.json()
>>> print commit_data.keys()
[u'committer', u'author', u'url', u'tree', u'sha', u'parents', u'message']
>>> print commit_data[u'committer']
{u'date': u'2012-05-10T11:10:50-07:00', u'email': u'me@kennethreitz.com', u'name': u'Kenneth Reitz'}
>>> print commit_data[u'message']
makin' history

So far, so simple. Well, let's investigate the GitHub API a little bit.
Now, we could look at the documentation, but we might have a little
more fun if we use Requests instead. We can take advantage of the
Requests OPTIONS verb to see what kinds of HTTP methods are supported
on the URL we just used.

>>> verbs = requests.options(r.url)
>>> verbs.status_code
500

Uh, what? That's unhelpful! Turns out GitHub, like many API providers,
doesn't actually implement the OPTIONS method. This is an annoying
oversight, but it's OK, we can just use the boring documentation. If
GitHub had correctly implemented OPTIONS, however, it would return the
allowed methods in the response headers, e.g.

>>> verbs = requests.options('http://a-good-website.com/api/cats')
>>> print verbs.headers['allow']
GET,HEAD,POST,OPTIONS

Turning to the documentation, we see that the only other method allowed
for commits is POST, which creates a new commit. As we're using the
Requests repo, we should probably avoid making ham-handed POSTs to it.
Instead, let's play with the Issues feature of GitHub.

This documentation was added in response to Issue #482. Given that this
issue already exists, we will use it as an example. Let's start by
getting it.

>>> r = requests.get('https://api.github.com/repos/kennethreitz/requests/issues/482')
>>> r.status_code
200
>>> issue = json.loads(r.text)
>>> print issue[u'title']
Feature any http verb in docs
>>> print issue[u'comments']
3

Cool, we have three comments. Let's take a look at the last one.

>>> r = requests.get(r.url + u'/comments')
>>> r.status_code
200
>>> comments = r.json()
>>> print comments[0].keys()
[u'body', u'url', u'created_at', u'updated_at', u'user', u'id']
>>> print comments[2][u'body']
Probably in the "advanced" section

Well, that seems like a silly place. Let's post a comment telling the
poster that he's silly. Who is the poster, anyway?

>>> print comments[2][u'user'][u'login']
kennethreitz

OK, so let's tell this Kenneth guy that we think this example should go
in the quickstart guide instead. According to the GitHub API docs, the
way to do this is to POST to the thread. Let's do it.

>>> body = json.dumps({u"body": u"Sounds great! I'll get right on it!"})
>>> url = u"https://api.github.com/repos/kennethreitz/requests/issues/482/comments"
>>> r = requests.post(url=url, data=body)
>>> r.status_code
404

Huh, that's weird. We probably need to authenticate. That'll be a pain,
right? Wrong. Requests makes it easy to use many forms of
authentication, including the very common Basic Auth.

>>> from requests.auth import HTTPBasicAuth
>>> auth = HTTPBasicAuth('fake@example.com', 'not_a_real_password')
>>> r = requests.post(url=url, data=body, auth=auth)
>>> r.status_code
201
>>> content = r.json()
>>> print(content[u'body'])
Sounds great! I'll get right on it.

Awesome. Oh, wait, no! I actually meant to say it would take me a
while, because I had to go feed my cat. If only I could edit this
comment! Happily, GitHub allows us to use another HTTP verb, PATCH, to
edit a comment. Let's do that.

>>> print(content[u"id"])
5804413
>>> body = json.dumps({u"body": u"Sounds great! I'll get right on it once I feed my cat."})
>>> url = u"https://api.github.com/repos/kennethreitz/requests/issues/comments/5804413"
>>> r = requests.patch(url=url, data=body, auth=auth)
>>> r.status_code
200

Excellent. Now, just to torture this Kenneth guy, I've decided to let
him sweat and not tell him that I'm working on this. That means I want
to delete this comment. GitHub lets us delete comments using the
incredibly aptly named DELETE method. Let's get rid of it.

>>> r = requests.delete(url=url, auth=auth)
>>> r.status_code
204
>>> r.headers['status']
'204 No Content'

Excellent. All gone. The last thing I want to know is how much of my
ratelimit I've used. Let's find out. GitHub sends that information in
the response headers, so rather than download the whole page I'll send
a HEAD request to get just the headers.

>>> r = requests.head(url=url, auth=auth)
>>> print r.headers
...
'x-ratelimit-remaining': '4995'
'x-ratelimit-limit': '5000'
...

Excellent. Time to write a Python program that abuses the GitHub API in
all kinds of exciting ways, 4995 more times.


## SSL Cert Verification

Requests can verify SSL certificates for HTTPS requests, just like a
web browser. To check a host's SSL certificate, you can use the verify
argument:

>>> requests.get('https://kennethreitz.com', verify=True)
requests.exceptions.SSLError: hostname 'kennethreitz.com' doesn't match either of '*.herokuapp.com', 'herokuapp.com'

SSL isn't set up on that domain, so the request fails. But GitHub's is:

>>> requests.get('https://github.com', verify=True)
<Response [200]>

For private certs, you can also pass the path to a CA_BUNDLE file to
verify. You can also set the REQUESTS_CA_BUNDLE environment variable.

Requests can also ignore verifying the SSL certificate if you set
verify to False.

>>> requests.get('https://kennethreitz.com', verify=False)
<Response [200]>

By default, verify is set to True. The verify option only applies to
host certs.

You can also specify a local cert to use as a client-side certificate,
as a single file (containing the private key and the certificate) or as
a tuple of both files' paths:

>>> requests.get('https://kennethreitz.com', cert=('/path/server.crt', '/path/key'))
<Response [200]>

If you specify a wrong path or an invalid cert:

>>> requests.get('https://kennethreitz.com', cert='/wrong_path/server.pem')
SSLError: [Errno 336265225] _ssl.c:347: error:140B0009:SSL routines:SSL_CTX_use_PrivateKey_file:PEM lib

## Blocking Or Non-Blocking?

With the default Transport Adapter in place, Requests does not provide
any kind of non-blocking IO. The
Response.content
property will block until the entire response has been downloaded. If
you require more granularity, the streaming features of the library (see
流式请求)
allow you to retrieve smaller quantities of the response at a time.
However, these calls will still block.

If you are concerned about the use of blocking IO, there are lots of
projects out there that combine Requests with one of Python’s
asynchronicity frameworks. Two excellent examples are
grequests and
requests-futures.


## Keep-Alive

Good news: thanks to urllib3, keep-alive is 100% automatic within a
session! Any requests that you make within a session will automatically
reuse the appropriate connection!

Note that connections are only released back to the pool for reuse once
all body data has been read; be sure to either set stream to False or
read the content property of the Response object.


## Session Objects

The Session object allows you to persist certain parameters across
requests. It also persists cookies across all requests made from the
same Session instance.

A Session object has all the methods of the main Requests API.

Let's persist some cookies across requests:

s = requests.Session()

s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get("http://httpbin.org/cookies")

print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'

Sessions can also be used to provide default data to the request
methods. This is done by providing data to the properties of a Session
object:

s = requests.Session()
s.auth = ('user', 'pass')
s.headers.update({'x-test': 'true'})

# both 'x-test' and 'x-test2' are sent
s.get('http://httpbin.org/headers', headers={'x-test2': 'true'})

Any dictionaries that you pass to a request method will be merged with
the session-level values that are set. The method-level parameters
override session parameters.
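
You can watch the merge happen without touching the network by
preparing a request through the session; a small sketch (the
httpbin.org URL is purely illustrative, nothing is sent):

```python
import requests

s = requests.Session()
s.headers.update({'x-test': 'true'})

# prepare_request merges session-level and method-level settings locally
req = requests.Request('GET', 'http://httpbin.org/headers',
                       headers={'x-test2': 'true'})
prepped = s.prepare_request(req)

print(prepped.headers['x-test'], prepped.headers['x-test2'])  # true true
```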

Remove a Value From a Dict Parameter

Sometimes you'll want to omit session-level keys from a dict parameter.
To do this, you simply set that key's value to None in the method-level
parameter. It will automatically be omitted.
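
The same prepare-through-the-session trick shows the None behaviour in
action, again with nothing actually sent:

```python
import requests

s = requests.Session()
s.headers.update({'x-test': 'true'})

# Setting the key to None at the method level omits it entirely
req = requests.Request('GET', 'http://httpbin.org/headers',
                       headers={'x-test': None})
prepped = s.prepare_request(req)

print('x-test' in prepped.headers)  # False
```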

All values that are contained within a session are directly available
to you. See the Session API docs to learn more.


## Body Content Workflow

By default, when you make a request, the body of the response is
downloaded immediately. You can override this behaviour and defer
downloading the response body until you access the Response.content
attribute, with the stream parameter:

tarball_url = 'https://github.com/kennethreitz/requests/tarball/master'
r = requests.get(tarball_url, stream=True)

At this point only the response headers have been downloaded and the
connection remains open, hence allowing us to make content retrieval
conditional:

if int(r.headers['content-length']) < TOO_LONG:
    content = r.content
    ...

You can further control the workflow by use of the
Response.iter_content and Response.iter_lines methods. Alternatively,
you can read the undecoded body from the underlying urllib3
urllib3.HTTPResponse at Response.raw.
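
iter_content can be exercised entirely offline by attaching any
file-like object as Response.raw; note this sketch pokes at internals
(a bare Response with a hand-set raw attribute) purely for
illustration:

```python
import io

import requests

r = requests.models.Response()
r.raw = io.BytesIO(b'abcdefghij')  # stand-in for the live socket

# The body is read in fixed-size chunks instead of all at once
chunks = list(r.iter_content(chunk_size=4))
print(chunks)  # [b'abcd', b'efgh', b'ij']
```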

If you set stream to True when making a request, Requests cannot
release the connection back to the pool unless you consume all the data
or call Response.close, which can make connections inefficient. If you
find yourself partially reading request bodies (or not reading them at
all) while using stream=True, you should consider using
contextlib.closing, like this:

from contextlib import closing

with closing(requests.get('http://httpbin.org/get', stream=True)) as r:
    # Do things with the response here.
    pass

## Proxies

If you need to use a proxy, you can configure individual requests with
the proxies argument to any request method:

import requests

proxies = {
  "http": "http://10.10.1.10:3128",
  "https": "http://10.10.1.10:1080",
}

requests.get("http://example.org", proxies=proxies)

You can also configure proxies through the HTTP_PROXY and HTTPS_PROXY
environment variables.

$ export HTTP_PROXY="http://10.10.1.10:3128"
$ export HTTPS_PROXY="http://10.10.1.10:1080"
$ python

>>> import requests
>>> requests.get("http://example.org")

To use HTTP Basic Auth with your proxy, use the
http://user:password@host/ syntax:

proxies = {
    "http": "http://user:pass@10.10.1.10:3128/",
}
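
Proxies can also be set once on a Session, so that every request made
through it consults them; a minimal sketch (the addresses are the same
placeholders as above, nothing is sent):

```python
import requests

s = requests.Session()
s.proxies.update({
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
})

# Every request made through s will now consult these proxies
print(s.proxies['http'])  # http://10.10.1.10:3128
```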

## Link Headers

Many HTTP APIs feature Link headers. They make APIs more
self-describing and discoverable.

GitHub uses these for pagination in its API, for example:

>>> url = 'https://api.github.com/users/kennethreitz/repos?page=1&per_page=10'
>>> r = requests.head(url=url)
>>> r.headers['link']
'<https://api.github.com/users/kennethreitz/repos?page=2&per_page=10>; rel="next", <https://api.github.com/users/kennethreitz/repos?page=6&per_page=10>; rel="last"'

Requests will automatically parse these link headers and make them
easily consumable:

>>> r.links["next"]
{'url': 'https://api.github.com/users/kennethreitz/repos?page=2&per_page=10', 'rel': 'next'}

>>> r.links["last"]
{'url': 'https://api.github.com/users/kennethreitz/repos?page=7&per_page=10', 'rel': 'last'}
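
Under the hood, Response.links is built from the raw header value by
requests.utils.parse_header_links, which you can call directly on a
stored header string, no request needed:

```python
from requests.utils import parse_header_links

header = ('<https://api.github.com/users/kennethreitz/repos?page=2&per_page=10>; rel="next", '
          '<https://api.github.com/users/kennethreitz/repos?page=7&per_page=10>; rel="last"')

# Key the parsed entries by their rel value, as Response.links does
links = {link['rel']: link for link in parse_header_links(header)}
print(links['next']['url'])
```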

## Encodings

When you receive a response, Requests makes a guess at the encoding to
use for decoding the response when you call the Response.text method.
Requests will first check for an encoding in the HTTP headers, and if
none is present, will use charade to attempt to guess the encoding.

The only time Requests will not guess the encoding is if no explicit
charset is present in the HTTP headers and the Content-Type header
contains text. In this situation, RFC 2616 specifies that the default
charset must be ISO-8859-1. Requests follows the specification in this
case. If you require a different encoding, you can manually set the
Response.encoding property, or use the raw Response.content (see the
Response Content section of the quickstart guide).
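
The effect of Response.encoding on Response.text can be seen offline;
this sketch sets the private _content attribute directly, purely for
illustration:

```python
import requests

r = requests.models.Response()
r._content = 'café'.encode('utf-8')  # pretend this arrived over the wire

r.encoding = 'ISO-8859-1'
print(r.text)  # cafÃ© (the UTF-8 bytes mis-decoded as Latin-1)

r.encoding = 'utf-8'
print(r.text)  # café
```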


## POST Multiple Multipart-Encoded Files

You can send multiple files in one request. For example, suppose you
want to upload image files to an HTML form with a multiple file field
'images':

<input type="file" name="images" multiple="true" required="true"/>

To do that, just set files to a list of tuples of (form_field_name,
file_info):

>>> url = 'http://httpbin.org/post'
>>> multiple_files = [('images', ('foo.png', open('foo.png', 'rb'), 'image/png')),
                      ('images', ('bar.png', open('bar.png', 'rb'), 'image/png'))]
>>> r = requests.post(url, files=multiple_files)
>>> r.text
{
  ...
  'files': {'images': 'data:image/png;base64,iVBORw ....'}
  'Content-Type': 'multipart/form-data; boundary=3131623adb2043caaeb5538cc7aa0b3a',
  ...
}
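
The multipart body itself is assembled during prepare, so you can
inspect it without uploading anything; here in-memory bytes stand in
for real open file handles:

```python
from requests import Request

# bytes stand in for open file handles
multiple_files = [('images', ('foo.png', b'fake png bytes', 'image/png')),
                  ('images', ('bar.png', b'more fake bytes', 'image/png'))]

prepped = Request('POST', 'http://httpbin.org/post',
                  files=multiple_files).prepare()

print(prepped.headers['Content-Type'])       # multipart/form-data; boundary=...
print(prepped.body.count(b'name="images"'))  # 2 (one part per tuple)
```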

## Timeouts

Most requests to external servers should have a timeout attached, in
case the server is not responding in a timely manner. Without a timeout,
your code may hang for minutes or more.

The connect timeout is the number of seconds Requests will wait for
your client to establish a connection to a remote machine
(corresponding to the connect() call on the socket). It's a good
practice to set connect timeouts to slightly larger than a multiple of
3, which is the default TCP packet retransmission window.

Once your client has connected to the server and sent the HTTP request,
the read timeout is the number of seconds the client will wait for the
server to send a response. (Specifically, it’s the number of seconds
that the client will wait between bytes sent from the server. In 99.9%
of cases, this is the time before the server sends the first byte).

If you specify a single value for the timeout, like this:

r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read
timeouts. Specify a tuple if you would like to set the values
separately:

r = requests.get('https://github.com', timeout=(3.05, 27))

If the remote server is very slow, you can tell Requests to wait forever
for a response, by passing None as a timeout value and then retrieving a
cup of coffee.

r = requests.get('https://github.com', timeout=None)
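
When either limit is exceeded, Requests raises distinct exceptions
that share a common base, so one handler can catch both kinds; this
offline sketch simply checks the hierarchy:

```python
from requests.exceptions import (ConnectTimeout, ReadTimeout,
                                 Timeout, RequestException)

# Both timeout flavours derive from Timeout, which derives from
# RequestException, so `except Timeout:` catches either kind.
print(issubclass(ConnectTimeout, Timeout))    # True
print(issubclass(ReadTimeout, Timeout))       # True
print(issubclass(Timeout, RequestException))  # True
```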

## CA Certificates

By default Requests bundles a set of root CAs that it trusts, sourced
from the Mozilla trust
store
.
However, these are only updated once for each Requests version. This
means that if you pin a Requests version your certificates can become
extremely out of date.

From Requests version 2.4.0 onwards, Requests will attempt to use
certificates from certifi if it is present on the
system. This allows for users to update their trusted certificates
without having to change the code that runs on their system.

For the sake of security we recommend upgrading certifi frequently!
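
If certifi is installed (it is a dependency of modern Requests
releases), you can locate the CA bundle it provides:

```python
import os

import certifi

# Path to the PEM bundle that Requests will trust by default
bundle = certifi.where()
print(bundle)
print(os.path.exists(bundle))  # True
```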




## Chunk-Encoded Requests

Requests also supports Chunked transfer encoding for outgoing and
incoming requests. To send a chunk-encoded request, simply provide a
generator (or any iterator without a length) for your body:

def gen():
    yield 'hi'
    yield 'there'

requests.post('http://some.url/chunked', data=gen())
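
You can confirm the chunked framing without a live server: when the
body has no determinable length, prepare() sets the Transfer-Encoding
header instead of Content-Length (some.url is the example's placeholder
host):

```python
from requests import Request

def gen():
    yield b'hi'
    yield b'there'

# A generator has no length, so Requests switches to chunked framing
prepped = Request('POST', 'http://some.url/chunked', data=gen()).prepare()
print(prepped.headers.get('Transfer-Encoding'))  # chunked
print('Content-Length' in prepped.headers)       # False
```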

## Request and Response Objects

Whenever a call is made to requests.*() you are doing two major things.
First, you are constructing a Request object which will be sent off to
a server to request or query some resource. Second, a Response object
is generated once requests gets a response back from the server. The
Response object contains all of the information returned by the server
and also contains the Request object you created originally. Here is a
simple request to get some very important information from Wikipedia's
servers:

>>> r = requests.get('http://en.wikipedia.org/wiki/Monty_Python')

If we want to access the headers the server sent back to us, we do
this:

>>> r.headers
{'content-length': '56170', 'x-content-type-options': 'nosniff', 'x-cache':
'HIT from cp1006.eqiad.wmnet, MISS from cp1010.eqiad.wmnet', 'content-encoding':
'gzip', 'age': '3080', 'content-language': 'en', 'vary': 'Accept-Encoding,Cookie',
'server': 'Apache', 'last-modified': 'Wed, 13 Jun 2012 01:33:50 GMT',
'connection': 'close', 'cache-control': 'private, s-maxage=0, max-age=0,
must-revalidate', 'date': 'Thu, 14 Jun 2012 12:59:39 GMT', 'content-type':
'text/html; charset=UTF-8', 'x-cache-lookup': 'HIT from cp1006.eqiad.wmnet:3128,
MISS from cp1010.eqiad.wmnet:80'}

However, if we want to get the headers we sent the server, we simply
access the request, and then the request's headers:

>>> r.request.headers
{'Accept-Encoding': 'identity, deflate, compress, gzip',
'Accept': '*/*', 'User-Agent': 'python-requests/0.13.1'}

## Compliance

Requests is intended to be compliant with all relevant specifications
and RFCs where that compliance will not cause difficulties for users.
This attention to the specification can lead to some behaviour that may
seem unusual to those not familiar with it.

## Streaming Uploads

Requests supports streaming uploads, which allow you to send large
streams or files without reading them into memory. To stream and
upload, simply provide a file-like object for your body:

with open('massive-body') as f:
    requests.post('http://some.url/streamed', data=f)

## Transport Adapters

As of v1.0.0, Requests has moved to a modular internal design. Part of
the reason this was done was to implement Transport Adapters, originally
described
here
.
Transport Adapters provide a mechanism to define interaction methods for
an HTTP service. In particular, they allow you to apply per-service
configuration.

Requests ships with a single Transport Adapter, the HTTPAdapter. This
adapter provides the default Requests interaction with HTTP and HTTPS
using the powerful urllib3 library.
Whenever a Requests
Session
is initialized, one of these is attached to the
Session
object for HTTP, and one for HTTPS.

Requests enables users to create and use their own Transport Adapters
that provide specific functionality. Once created, a Transport Adapter
can be mounted to a Session object, along with an indication of which
web services it should apply to.

>>> s = requests.Session()
>>> s.mount('http://www.github.com', MyAdapter())

The mount call registers a specific instance of a Transport Adapter to a
prefix. Once mounted, any HTTP request made using that session whose URL
starts with the given prefix will use the given Transport Adapter.

Many of the details of implementing a Transport Adapter are beyond the
scope of this documentation, but take a look at the next example for a
simple SSL use- case. For more than that, you might look at subclassing
requests.adapters.BaseAdapter.

## Event Hooks

Requests has a hook system that you can use to manipulate portions of
the request process, or signal event handling.

Available hooks:

response:

The response generated from a Request.

You can assign a hook function on a per-request basis by passing a
{hook_name: callback_function} dictionary to the hooks request
parameter:

hooks=dict(response=print_url)

callback_function will receive a chunk of data as its first argument.

def print_url(r):
    print(r.url)

If an error occurs while executing your callback, a warning is given.

If the callback function returns a value, it is assumed that it is to
replace the data that was passed in. If the function doesn't return
anything, nothing else is affected.
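
This replace-or-pass-through rule is implemented by
requests.hooks.dispatch_hook, which you can call directly to see it in
isolation (an internal helper, so treat this as illustrative):

```python
from requests.hooks import dispatch_hook

def shout(data, **kwargs):
    return data.upper()  # returns a value: replaces the data

def watch(data, **kwargs):
    return None          # returns nothing: data passes through

print(dispatch_hook('response', {'response': shout}, 'ok'))  # OK
print(dispatch_hook('response', {'response': watch}, 'ok'))  # ok
```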

Let's print some request method arguments at runtime:

>>> requests.get('http://httpbin.org', hooks=dict(response=print_url))
http://httpbin.org
<Response [200]>


## Prepared Requests

Whenever you receive a Response object from an API call or a Session
call, the request attribute is actually the PreparedRequest that was
used. In some cases you may wish to do some extra work to the body or
headers (or anything else really) before sending the request. The
simple recipe for this is the following:

from requests import Request, Session

s = Session()
req = Request('GET', url,
    data=data,
    headers=header
)
prepped = req.prepare()

# do something with prepped.body
# do something with prepped.headers

resp = s.send(prepped,
    stream=stream,
    verify=verify,
    proxies=proxies,
    cert=cert,
    timeout=timeout
)

print(resp.status_code)

Since you are not doing anything special with the Request object, you
prepare it immediately and modify the PreparedRequest object. You then
send that with the other parameters you would have sent to requests.*
or Session.*.
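
Inspecting a PreparedRequest needs no network at all, which makes it
handy for testing; a small sketch (the httpbin URL and header are
purely illustrative):

```python
from requests import Request

req = Request('POST', 'http://httpbin.org/post',
              data={'key': 'value'}, headers={'X-Demo': '1'})
prepped = req.prepare()

# prepare() has already encoded the body and filled in headers
print(prepped.method)                   # POST
print(prepped.body)                     # key=value
print(prepped.headers['Content-Type'])  # application/x-www-form-urlencoded
```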

However, the above code will lose some of the advantages of having a
Requests Session object. In particular, Session-level state such as
cookies will not get applied to your request. To get a PreparedRequest
with that state applied, replace the call to Request.prepare() with a
call to Session.prepare_request(), like this:

from requests import Request, Session

s = Session()
req = Request('GET', url,
    data=data,
    headers=headers
)

prepped = s.prepare_request(req)

# do something with prepped.body
# do something with prepped.headers

resp = s.send(prepped,
    stream=stream,
    verify=verify,
    proxies=proxies,
    cert=cert,
    timeout=timeout
)

print(resp.status_code)
