# Advanced Usage

This document covers some of Requests' more advanced features.


## Session Objects

The Session object allows you to persist certain parameters across requests. It also persists cookies across all requests made from the same Session instance.

A Session object has all the methods of the main Requests API.

Let's persist some cookies across requests:

s = requests.Session()

s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get("http://httpbin.org/cookies")

print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'

Sessions can also be used to provide default data to the request methods. This is done by providing data to the properties of a Session object:

s = requests.Session()
s.auth = ('user', 'pass')
s.headers.update({'x-test': 'true'})

# both 'x-test' and 'x-test2' are sent
s.get('http://httpbin.org/headers', headers={'x-test2': 'true'})

Any dictionaries that you pass to a request method will be merged with the session-level values that are set. The method-level parameters override the session parameters.

### Remove a Value From a Dict Parameter

Sometimes you'll want to omit session-level keys from a dict parameter. To do this, you simply set that key's value to None in the method-level parameter. It will then be automatically omitted.
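As a sketch of how this works (httpbin.org is used as a placeholder URL, and nothing is actually sent), you can watch the key disappear by inspecting the PreparedRequest that a Session builds:

```python
import requests

s = requests.Session()
s.headers.update({'x-test': 'true'})

# Setting the key to None at the method level removes it from the merged headers
req = requests.Request('GET', 'http://httpbin.org/headers',
                       headers={'x-test': None})
prepped = s.prepare_request(req)

print('x-test' in prepped.headers)   # the session-level key was omitted
```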

All values that are contained within a session are directly available to you. See the
Session API docs to learn more.


## Request and Response Objects

Whenever you call requests.*() you are doing two major things. First, you are constructing a
Request object which will be sent off to a server to request or query some resource. Second, once
Requests gets a response back from the server, a Response
object is generated. The Response object contains all of the information returned by the server and also contains the Request
object you created originally. Here is a simple request to get some very important information from Wikipedia's servers:

>>> r = requests.get('http://en.wikipedia.org/wiki/Monty_Python')

If we want to access the headers the server sent back to us, we do this:

>>> r.headers
{'content-length': '56170', 'x-content-type-options': 'nosniff', 'x-cache':
'HIT from cp1006.eqiad.wmnet, MISS from cp1010.eqiad.wmnet', 'content-encoding':
'gzip', 'age': '3080', 'content-language': 'en', 'vary': 'Accept-Encoding,Cookie',
'server': 'Apache', 'last-modified': 'Wed, 13 Jun 2012 01:33:50 GMT',
'connection': 'close', 'cache-control': 'private, s-maxage=0, max-age=0,
must-revalidate', 'date': 'Thu, 14 Jun 2012 12:59:39 GMT', 'content-type':
'text/html; charset=UTF-8', 'x-cache-lookup': 'HIT from cp1006.eqiad.wmnet:3128,
MISS from cp1010.eqiad.wmnet:80'}

However, if we want to get the headers we sent the server, we simply access the request, and then the request's headers:

>>> r.request.headers
{'Accept-Encoding': 'identity, deflate, compress, gzip',
'Accept': '*/*', 'User-Agent': 'python-requests/0.13.1'}

## Prepared Requests

Whenever you receive a Response object from an API call or a Session call, the request attribute is actually the PreparedRequest that was used. In some cases you may wish to do some extra work to the body or headers (or anything else really) before sending a request. The simple recipe for this is the following:

from requests import Request, Session

s = Session()
req = Request('GET', url,
    data=data,
    headers=headers
)
prepped = req.prepare()

# do something with prepped.body
# do something with prepped.headers

resp = s.send(prepped,
    stream=stream,
    verify=verify,
    proxies=proxies,
    cert=cert,
    timeout=timeout
)

print(resp.status_code)

Since you are not doing anything special with the Request object, you prepare it immediately and modify the
PreparedRequest object. You then send that with the other parameters you would have sent to requests.* or
Session.*.

However, the above code will lose some of the advantages of having a Requests
Session object. In particular, Session-level state such as cookies will not get applied to your request. To get a
PreparedRequest with that state applied, replace the call to Request.prepare() with a call to
Session.prepare_request(), like this:

from requests import Request, Session

s = Session()
req = Request('GET', url,
    data=data,
    headers=headers
)

prepped = s.prepare_request(req)

# do something with prepped.body
# do something with prepped.headers

resp = s.send(prepped,
    stream=stream,
    verify=verify,
    proxies=proxies,
    cert=cert,
    timeout=timeout
)

print(resp.status_code)

## SSL Cert Verification

Requests can verify SSL certificates for HTTPS requests, just like a web browser. To check a host's SSL certificate, you can use the
verify argument:

>>> requests.get('https://kennethreitz.com', verify=True)
requests.exceptions.SSLError: hostname 'kennethreitz.com' doesn't match either of '*.herokuapp.com', 'herokuapp.com'

SSL isn't set up on that domain, so the request fails. GitHub, however, has SSL set up:

>>> requests.get('https://github.com', verify=True)
<Response [200]>

For private certs, you can also pass the path to a CA_BUNDLE file to verify.
You can also set the REQUESTS_CA_BUNDLE environment variable.

Requests can also ignore verifying the SSL certificate if you set verify to False.

>>> requests.get('https://kennethreitz.com', verify=False)
<Response [200]>

By default, verify is set to True. The verify
option only applies to host certs.

You can also specify a local cert to use as a client-side certificate, as a single file (containing the private key and the certificate) or as a tuple of both files' paths:

>>> requests.get('https://kennethreitz.com', cert=('/path/server.crt', '/path/key'))
<Response [200]>

If you specify a wrong path or an invalid cert:

>>> requests.get('https://kennethreitz.com', cert='/wrong_path/server.pem')
SSLError: [Errno 336265225] _ssl.c:347: error:140B0009:SSL routines:SSL_CTX_use_PrivateKey_file:PEM lib

## Body Content Workflow

By default, when you make a request, the body of the response is downloaded immediately. You can override this behaviour and defer downloading the response body until you access the
Response.content attribute with the
stream parameter:

tarball_url = 'https://github.com/kennethreitz/requests/tarball/master'
r = requests.get(tarball_url, stream=True)

At this point only the response headers have been downloaded and the connection remains open, hence allowing us to make content retrieval conditional:

if int(r.headers['content-length']) < TOO_LONG:
    content = r.content
    ...

You can further control the workflow by use of the
Response.iter_content
and
Response.iter_lines
methods. Alternatively, you can read the undecoded body from the underlying urllib3
urllib3.HTTPResponse at
Response.raw.
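For example, iter_content is commonly used to stream a large download to disk in fixed-size chunks. The sketch below is illustrative (the chunk size and function name are arbitrary choices), and it uses contextlib.closing as this document recommends below; no request is made until you call it:

```python
from contextlib import closing

import requests


def download(url, path, chunk_size=8192):
    """Stream a response body to a file without holding it all in memory."""
    with closing(requests.get(url, stream=True)) as r:
        r.raise_for_status()
        with open(path, 'wb') as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                if chunk:  # skip keep-alive chunks
                    f.write(chunk)
```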

Note that if you set stream to True when making a request, Requests cannot release the connection back to the pool unless you consume all the data or call Response.close. This can make connections inefficient. If you find yourself partially reading response bodies (or not reading them at all) while using
stream=True, you should consider using contextlib.closing, like this:

from contextlib import closing

with closing(requests.get('http://httpbin.org/get', stream=True)) as r:
    # Do things with the response here.

## Keep-Alive (Persistent Connections)

Good news: thanks to urllib3, keep-alive is 100% automatic within a session! Any requests that you make within a session will automatically reuse the appropriate connection!

Note that connections are only released back to the pool for reuse once all body data has been read; be sure to either set
stream to False or read the content
property of the Response object.


## Streaming Uploads

Requests supports streaming uploads, which allow you to send large streams or files without reading them into memory. To stream and upload, simply provide a file-like object for your request body:

with open('massive-body', 'rb') as f:
    requests.post('http://some.url/streamed', data=f)

## Chunk-Encoded Requests

Requests also supports chunked transfer encoding for outgoing and incoming requests. To send a chunk-encoded request, simply provide a generator (or any iterator without a length) for your request body:

def gen():
    yield 'hi'
    yield 'there'

requests.post('http://some.url/chunked', data=gen())

## POST Multiple Multipart-Encoded Files

You can send multiple files in one request. For example, suppose you want to upload image files to an HTML form with a multiple-file field 'images':

<input type="file" name="images" multiple="true" required="true"/>

To do that, just set files to a list of tuples of (form_field_name,
file_info):

>>> url = 'http://httpbin.org/post'
>>> multiple_files = [('images', ('foo.png', open('foo.png', 'rb'), 'image/png')),
                      ('images', ('bar.png', open('bar.png', 'rb'), 'image/png'))]
>>> r = requests.post(url, files=multiple_files)
>>> r.text
{
  ...
  'files': {'images': 'data:image/png;base64,iVBORw ....'}
  'Content-Type': 'multipart/form-data; boundary=3131623adb2043caaeb5538cc7aa0b3a',
  ...
}

## Event Hooks

Requests has a hook system that you can use to manipulate portions of the request process, or signal event handling.

Available hooks:

response:

    The response generated from a Request

You can assign a hook function on a per-request basis by passing a {hook_name: callback_function} dictionary to the hooks
request parameter:

hooks=dict(response=print_url)

That callback_function will receive a chunk of data as its first argument.

def print_url(r):
    print(r.url)

If an error occurs while executing your callback, a warning is given.

If the callback function returns a value, it is assumed that it is to replace the data that was passed in. If the function doesn't return anything,
nothing else is affected.

Let's print some request method arguments at runtime:

>>> requests.get('http://httpbin.org', hooks=dict(response=print_url))
http://httpbin.org
<Response [200]>
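Hooks can also be registered on a Session so that they run for every request made through it. The sketch below only registers the callback and inspects the session state; no request is sent. (The extra *args/**kwargs in the signature are there because session hooks may be called with additional keyword arguments.)

```python
import requests


def print_url(r, *args, **kwargs):
    print(r.url)


s = requests.Session()
# A Session keeps a list of callbacks per hook name
s.hooks['response'].append(print_url)

print(print_url in s.hooks['response'])  # the callback is now registered
```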

## Custom Authentication

Requests allows you to specify your own authentication mechanism.

Any callable which is passed as the auth
argument to a request method will have the opportunity to modify the request before it is dispatched.

Authentication implementations are subclasses of requests.auth.AuthBase,
and are easy to define.

Requests provides two common authentication scheme implementations in requests.auth:
HTTPBasicAuth and HTTPDigestAuth.

Let's pretend that we have a web service that will only respond if the X-Pizza
header is set to a password value. Unlikely, but just go with it:

from requests.auth import AuthBase

class PizzaAuth(AuthBase):
    """Attaches HTTP Pizza Authentication to the given Request object."""
    def __init__(self, username):
        # setup any auth-related data here
        self.username = username

    def __call__(self, r):
        # modify and return the request
        r.headers['X-Pizza'] = self.username
        return r

Then, we can make a request using our PizzaAuth:

>>> requests.get('http://pizzabin.org/admin', auth=PizzaAuth('kenneth'))
<Response [200]>

## Streaming Requests

With
requests.Response.iter_lines()
you can easily iterate over streaming APIs such as the
Twitter Streaming API. Simply set stream to True and iterate over the response with
iter_lines():

import json
import requests

r = requests.get('http://httpbin.org/stream/20', stream=True)

for line in r.iter_lines():

    # filter out keep-alive new lines
    if line:
        print(json.loads(line))

## Proxies

If you need to use a proxy, you can configure individual requests with the proxies
argument to any request method:

import requests

proxies = {
  "http": "http://10.10.1.10:3128",
  "https": "http://10.10.1.10:1080",
}

requests.get("http://example.org", proxies=proxies)

You can also configure proxies through the environment variables HTTP_PROXY and HTTPS_PROXY:

$ export HTTP_PROXY="http://10.10.1.10:3128"
$ export HTTPS_PROXY="http://10.10.1.10:1080"
$ python

>>> import requests
>>> requests.get("http://example.org")

To use HTTP Basic Auth with your proxy, use the
http://user:password@host/ syntax:

proxies = {
    "http": "http://user:pass@10.10.1.10:3128/",
}

## Compliance

Requests is intended to be compliant with all relevant specifications and RFCs, where that compliance will not cause difficulties for users. This attention to the specification
can lead to some behaviour that may seem unusual to those not familiar with the relevant specification.

### Encodings

When you receive a response, Requests makes a guess at the encoding to use for decoding the response when you call the
Response.text method.
Requests will first check for an encoding in the HTTP headers, and if none is present, will use
charade to attempt to guess the encoding.

The only time Requests will not guess the encoding is if no explicit charset is present in the HTTP headers and the Content-Type header contains a
text value.

In this situation, RFC 2616 specifies that the default charset must be ISO-8859-1.
Requests follows the specification in this case. If you require a different encoding, you can manually set the
Response.encoding property, or use the raw Response.content
(see the Response Content section of the quickstart guide).
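To make the effect concrete without a network call, the sketch below builds a Response by hand (it sets the private `_content` attribute, which is done here purely for illustration) and shows how changing Response.encoding changes what Response.text decodes:

```python
import requests

r = requests.Response()
r._content = b'caf\xc3\xa9'   # UTF-8 bytes for "café"; private attr, demo only

r.encoding = 'ISO-8859-1'     # the RFC 2616 default for text/* without a charset
print(r.text)                 # mojibake: 'cafÃ©'

r.encoding = 'utf-8'          # manually override once you know the real encoding
print(r.text)                 # 'café'
```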


## HTTP Verbs (With Extra Examples)

Requests provides access to almost the full range of HTTP verbs: GET, OPTIONS,
HEAD, POST, PUT, PATCH and DELETE.
The following provides detailed examples of using these various verbs in Requests, using the GitHub API.

We will begin with the verb most commonly used: GET. HTTP
GET is an idempotent method that returns a resource from a given URL. As a result,
it is the verb you ought to use when attempting to retrieve data from a web location. An example usage would be attempting to get
information about a specific commit from GitHub. Suppose we wanted commit a050faf
on Requests. We would get it like so:

>>> import requests
>>> r = requests.get('https://api.github.com/repos/kennethreitz/requests/git/commits/a050faf084662f3a352dd1a941f2c7c9f886d4ad')

We should confirm that GitHub responded correctly. If it has, we want to work out what type of content it is, like so:

>>> if (r.status_code == requests.codes.ok):
...     print(r.headers['content-type'])
...
application/json; charset=utf-8

As you can see, GitHub returns JSON. Great: we can use the r.json
method to parse it into Python objects.

>>> commit_data = r.json()
>>> print(commit_data.keys())
[u'committer', u'author', u'url', u'tree', u'sha', u'parents', u'message']
>>> print(commit_data[u'committer'])
{u'date': u'2012-05-10T11:10:50-07:00', u'email': u'me@kennethreitz.com', u'name': u'Kenneth Reitz'}
>>> print(commit_data[u'message'])
makin' history

So far, so simple. Well, let's investigate the GitHub API a little bit. Now, we could look at the documentation,
but we might have a little more fun if we use Requests instead. We can take advantage of the Requests OPTIONS verb to see what kinds of HTTP methods
are supported on the url we just used:

>>> verbs = requests.options(r.url)
>>> verbs.status_code
500

Uh, what? That's unhelpful! It turns out that GitHub, like many API providers, doesn't actually implement the OPTIONS method.
This is an annoying oversight, but it's OK: we can just use the boring documentation. If GitHub had correctly implemented OPTIONS, however,
it should return the allowed methods in the response headers, e.g.:

>>> verbs = requests.options('http://a-good-website.com/api/cats')
>>> print(verbs.headers['allow'])
GET,HEAD,POST,OPTIONS

Turning to the documentation, we see that the only other method allowed for commits is POST, which creates a new commit.
As we're using the Requests repo, we should probably avoid making ham-handed POSTs to it. Instead, let's
play with GitHub's Issues feature.

This documentation was added in response to Issue
#482.
Given that this issue already exists, we will use it as an example. Let's start by getting it:

>>> r = requests.get('https://api.github.com/repos/kennethreitz/requests/issues/482')
>>> r.status_code
200
>>> issue = json.loads(r.text)
>>> print(issue[u'title'])
Feature any http verb in docs
>>> print(issue[u'comments'])
3

Cool, we have three comments. Let's take a look at the last of them.

>>> r = requests.get(r.url + u'/comments')
>>> r.status_code
200
>>> comments = r.json()
>>> print(comments[0].keys())
[u'body', u'url', u'created_at', u'updated_at', u'user', u'id']
>>> print(comments[2][u'body'])
Probably in the "advanced" section

Well, that seems like a silly place. Let's post a comment telling the poster that he's silly. Who is the poster, anyway?

>>> print(comments[2][u'user'][u'login'])
kennethreitz

OK, so let's tell this Kenneth guy that we think this example should go in the quickstart guide instead. According to the GitHub
API docs, the way to do this is to POST to the thread. Let's do it:

>>> body = json.dumps({u"body": u"Sounds great! I'll get right on it!"})
>>> url = u"https://api.github.com/repos/kennethreitz/requests/issues/482/comments"
>>> r = requests.post(url=url, data=body)
>>> r.status_code
404

Huh, that's weird. We probably need to authenticate. That'll be a pain, right? Wrong. Requests makes it easy to use many forms of authentication,
including the very common Basic Auth.

>>> from requests.auth import HTTPBasicAuth
>>> auth = HTTPBasicAuth('fake@example.com', 'not_a_real_password')
>>> r = requests.post(url=url, data=body, auth=auth)
>>> r.status_code
201
>>> content = r.json()
>>> print(content[u'body'])
Sounds great! I'll get right on it.

Brilliant! Oh, wait, no! I meant to say it would take me a while, because I have to go feed my cat. If only I could edit this comment!
Happily, GitHub lets us use another HTTP verb, PATCH, to edit this comment. Let's do that:

>>> print(content[u"id"])
5804413
>>> body = json.dumps({u"body": u"Sounds great! I'll get right on it once I feed my cat."})
>>> url = u"https://api.github.com/repos/kennethreitz/requests/issues/comments/5804413"
>>> r = requests.patch(url=url, data=body, auth=auth)
>>> r.status_code
200

Excellent. Now, just to torture this Kenneth guy, I've decided to let him sweat and not tell him that I'm working on this.
That means I want to delete this comment. GitHub lets us delete comments using the aptly named DELETE method. Let's get rid of it:

>>> r = requests.delete(url=url, auth=auth)
>>> r.status_code
204
>>> r.headers['status']
'204 No Content'

Very good. All gone. The last thing I want to know is how much of my ratelimit I've used. Let's find out. GitHub sends that information in the response headers,
so rather than downloading the whole page, I'll send a HEAD request to get just the headers:

>>> r = requests.head(url=url, auth=auth)
>>> print(r.headers)
...
'x-ratelimit-remaining': '4995'
'x-ratelimit-limit': '5000'
...

Excellent. Time to write a Python program that abuses the GitHub API in all kinds of exciting ways, 4995 more times.


## Link Headers

Many HTTP
APIs feature Link headers. They make APIs more self-describing and discoverable.

GitHub uses these for pagination
in its API, for example:

>>> url = 'https://api.github.com/users/kennethreitz/repos?page=1&per_page=10'
>>> r = requests.head(url=url)
>>> r.headers['link']
'<https://api.github.com/users/kennethreitz/repos?page=2&per_page=10>; rel="next", <https://api.github.com/users/kennethreitz/repos?page=6&per_page=10>; rel="last"'

Requests will automatically parse these link headers and make them easily consumable:

>>> r.links["next"]
{'url': 'https://api.github.com/users/kennethreitz/repos?page=2&per_page=10', 'rel': 'next'}

>>> r.links["last"]
{'url': 'https://api.github.com/users/kennethreitz/repos?page=7&per_page=10', 'rel': 'last'}

## Transport Adapters

As of v1.0.0, Requests has moved to a modular internal design. Part of
the reason this was done was to implement Transport Adapters, originally
described
here.
Transport Adapters provide a mechanism to define interaction methods for
an HTTP service. In particular, they allow you to apply per-service
configuration.

Requests ships with a single Transport Adapter, the HTTPAdapter. This
adapter provides the default Requests interaction with HTTP and HTTPS
using the powerful urllib3 library.
Whenever a Requests
Session
is initialized, one of these is attached to the
Session
object for HTTP, and one for HTTPS.

Requests enables users to create and use their own Transport Adapters
that provide specific functionality. Once created, a Transport Adapter
can be mounted to a Session object, along with an indication of which
web services it should apply to.

>>> s = requests.Session()
>>> s.mount('http://www.github.com', MyAdapter())

The mount call registers a specific instance of a Transport Adapter to a
prefix. Once mounted, any HTTP request made using that session whose URL
starts with the given prefix will use the given Transport Adapter.

Many of the details of implementing a Transport Adapter are beyond the
scope of this documentation, but take a look at the next example for a
simple SSL use-case. For more than that, you might look at subclassing
requests.adapters.BaseAdapter.

### Example: Specific SSL Version

The Requests team has made a specific choice to use whatever SSL version
is default in the underlying library
(urllib3). Normally this is fine,
but from time to time, you might find yourself needing to connect to a
service-endpoint that uses a version that isn’t compatible with the
default.

You can use Transport Adapters for this by taking most of the existing
implementation of HTTPAdapter, and adding a parameter ssl_version
that gets passed-through to urllib3. We’ll make a TA that instructs
the library to use SSLv3:

import ssl

from requests.adapters import HTTPAdapter
from requests.packages.urllib3.poolmanager import PoolManager


class Ssl3HttpAdapter(HTTPAdapter):
    """"Transport adapter" that allows us to use SSLv3."""

    def init_poolmanager(self, connections, maxsize, block=False):
        self.poolmanager = PoolManager(num_pools=connections,
                                       maxsize=maxsize,
                                       block=block,
                                       ssl_version=ssl.PROTOCOL_SSLv3)
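Mounting and prefix matching can be seen without any network traffic. The adapter below is a trivial stand-in for Ssl3HttpAdapter (any HTTPAdapter subclass mounts the same way), and the host name is hypothetical:

```python
import requests
from requests.adapters import HTTPAdapter


class MyAdapter(HTTPAdapter):
    """Hypothetical adapter, standing in for Ssl3HttpAdapter."""


s = requests.Session()
s.mount('https://legacy.example.com', MyAdapter())

# The most specific matching prefix wins; other URLs keep the default adapter
print(type(s.get_adapter('https://legacy.example.com/api')).__name__)
print(type(s.get_adapter('https://github.com')).__name__)
```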

## Blocking Or Non-Blocking?

With the default Transport Adapter in place, Requests does not provide
any kind of non-blocking IO. The
Response.content
property will block until the entire response has been downloaded. If
you require more granularity, the streaming features of the library (see
Streaming Requests)
allow you to retrieve smaller quantities of the response at a time.
However, these calls will still block.

If you are concerned about the use of blocking IO, there are lots of
projects out there that combine Requests with one of Python’s
asynchronicity frameworks. Two excellent examples are
grequests and
requests-futures.


## Timeouts

Most requests to external servers should have a timeout attached, in
case the server is not responding in a timely manner. Without a timeout,
your code may hang for minutes or more.

The connect timeout is the number of seconds Requests will wait for your
client to establish a connection to a remote machine (corresponding to
the connect() call on the socket).
It's a good practice to set connect timeouts to slightly larger than a
multiple of 3, which is the default TCP packet retransmission
window.

Once your client has connected to the server and sent the HTTP request,
the read timeout is the number of seconds the client will wait for the
server to send a response. (Specifically, it’s the number of seconds
that the client will wait between bytes sent from the server. In 99.9%
of cases, this is the time before the server sends the first byte).

If you specify a single value for the timeout, like this:

r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read
timeouts. Specify a tuple if you would like to set the values
separately:

r = requests.get('https://github.com', timeout=(3.05, 27))

If the remote server is very slow, you can tell Requests to wait forever
for a response, by passing None as a timeout value and then retrieving a
cup of coffee.

r = requests.get('https://github.com', timeout=None)

## CA Certificates

By default Requests bundles a set of root CAs that it trusts, sourced
from the Mozilla trust
store.
However, these are only updated once for each Requests version. This
means that if you pin a Requests version your certificates can become
extremely out of date.

From Requests version 2.4.0 onwards, Requests will attempt to use
certificates from certifi if it is present on the
system. This allows for users to update their trusted certificates
without having to change the code that runs on their system.

For the sake of security we recommend upgrading certifi frequently!
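If certifi is installed (recent Requests releases depend on it), you can locate its bundle directly and pass it around yourself; a small sketch (the requests.get call is shown but not executed here):

```python
import certifi

# Path to the CA bundle file that certifi ships
bundle = certifi.where()
print(bundle)

# You could pass it explicitly to verify, e.g.:
# requests.get('https://example.org', verify=bundle)
```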



Translated by Akkuman (hacktech.cn | akkuman.cnblogs.com).

