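Requests is a very practical HTTP client library for Python. It comes up constantly when writing crawlers and when testing server responses, and it is fair to say that Requests covers the needs of today's web.

Everything below comes from the official documentation:
http://docs.python-requests.org/en/master/

The usual way to install it is:
pip install requests
For other installation methods, see the official documentation.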

 

HTTP – requests

 

import requests

 

GET requests

r = requests.get('http://httpbin.org/get')

 

Passing parameters

>>> payload = {'key1': 'value1', 'key2': 'value2', 'key3': None}
>>> r = requests.get('http://httpbin.org/get', params=payload)
>>> print(r.url)
http://httpbin.org/get?key2=value2&key1=value1

Note that any dictionary key whose value is None will not be added to the URL's query string.

 

A parameter value can also be a list

>>> payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
>>> r = requests.get('http://httpbin.org/get', params=payload)
>>> print(r.url)
http://httpbin.org/get?key1=value1&key2=value2&key2=value3
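The two rules above (None values are dropped, list values are repeated) can be checked without any network access. The sketch below builds the request with requests.Request(...).prepare() instead of sending it; the key names are only illustrative.

```python
import requests

# Build the request without sending it; prepare() applies the same
# query-string encoding that requests.get() would use.
payload = {"key1": "value1", "key2": None, "key3": ["a", "b"]}
prepared = requests.Request(
    "GET", "http://httpbin.org/get", params=payload
).prepare()

# key2 is dropped because its value is None; key3 appears once per list item.
print(prepared.url)
```

Preparing a request this way is a convenient way to inspect what would be sent before anything goes on the wire.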

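r.text returns the body decoded with the encoding inferred from the headers. You can change the decoding by setting r.encoding = 'gbk'.

r.content returns the body as raw bytes.

r.json() returns the body parsed as JSON. It may raise an exception.

r.status_code is the HTTP status code.

r.raw returns the raw socket response. This requires passing stream=True.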

 

>>> r = requests.get('https://api.github.com/events', stream=True)

>>> r.raw
<requests.packages.urllib3.response.HTTPResponse object at 0x101194810>

>>> r.raw.read(10)
'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
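To see how r.content, r.encoding and r.text relate, here is a minimal sketch that runs against a throwaway local server rather than a real site. The handler class, the GBK sample text and the loopback URL are all assumptions made for the example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

BODY = "你好".encode("gbk")  # sample GBK-encoded payload (assumption)


class GbkHandler(BaseHTTPRequestHandler):
    """Hypothetical local endpoint: serves GBK text without declaring a charset."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the example quiet


server = HTTPServer(("127.0.0.1", 0), GbkHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for the loopback call
r = session.get("http://127.0.0.1:%d/" % server.server_port)

status = r.status_code  # integer HTTP status
raw_bytes = r.content   # undecoded body
r.encoding = "gbk"      # override the inferred encoding before reading r.text
decoded = r.text

server.shutdown()
server.server_close()
print(status, raw_bytes, decoded)
```

Setting r.encoding only changes how r.text decodes the bytes; r.content stays the same.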

To save the result to a file, use r.iter_content()

with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size):
        fd.write(chunk)

 

Passing headers

>>> headers = {'user-agent': 'my-app/0.0.1'}
>>> r = requests.get(url, headers=headers)

 

Passing cookies

>>> url = 'http://httpbin.org/cookies'
>>> r = requests.get(url, cookies=dict(cookies_are='working'))
>>> r.text
'{"cookies": {"cookies_are": "working"}}'
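As a rough offline check of the two snippets above, the following sketch prepares a request carrying both a custom header and a cookie and prints what would be sent. Nothing is transmitted; the URL is simply the one used in the text.

```python
import requests

prepared = requests.Request(
    "GET",
    "http://httpbin.org/cookies",
    headers={"user-agent": "my-app/0.0.1"},
    cookies={"cookies_are": "working"},
).prepare()

# Header lookup on a prepared request is case-insensitive.
print(prepared.headers["User-Agent"])
# The cookies dict is serialised into a single Cookie header.
print(prepared.headers["Cookie"])
```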

 

 

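POST requests

Sending form data

r = requests.post('http://httpbin.org/post', data={'key': 'value'})

Typically, you want to send some form-encoded data, much like an HTML form. To do this, simply pass a dictionary to the data argument. Your dictionary of data will automatically be form-encoded when the request is made: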

 

 

>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.post("http://httpbin.org/post", data=payload)
>>> print(r.text)
{
  ...
  "form": {
    "key2": "value2",
    "key1": "value1"
  },
  ...
}

There are many times when you want to send data that is not form-encoded. If you pass a string instead of a dict, that data will be posted directly.

>>> import json
>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}

>>> r = requests.post(url, data=json.dumps(payload))

Or

>>> r = requests.post(url, json=payload)
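The difference between the three ways of passing a body is easiest to see on the prepared request. This is a small sketch with an illustrative payload; it only prepares the requests and sends nothing.

```python
import json

import requests

url = "http://httpbin.org/post"
payload = {"key1": "value1", "key2": "value2"}

# dict via data=  -> form-encoded body
form = requests.Request("POST", url, data=payload).prepare()
# str via data=   -> body passed through as-is, no Content-Type added
raw = requests.Request("POST", url, data=json.dumps(payload)).prepare()
# dict via json=  -> JSON body with a JSON Content-Type
as_json = requests.Request("POST", url, json=payload).prepare()

print(form.headers["Content-Type"], form.body)
print(raw.headers.get("Content-Type"), raw.body)
print(as_json.headers["Content-Type"], as_json.body)
```

One practical consequence: when posting a pre-serialised JSON string through data=, you would normally set the Content-Type header yourself, whereas json= does it for you.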

 

 

Uploading files

>>> url = 'http://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}

>>> r = requests.post(url, files=files)

You can set the filename, content_type and headers explicitly:

files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}

A string can also be sent as the file content:

files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}
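To get a feel for what files= produces, the sketch below prepares (but does not send) the string-as-file upload from the last line and prints the resulting multipart body.

```python
import requests

files = {"file": ("report.csv", "some,data,to,send\nanother,row,to,send\n")}
prepared = requests.Request(
    "POST", "http://httpbin.org/post", files=files
).prepare()

# Requests picks a random boundary and builds a multipart/form-data body.
content_type = prepared.headers["Content-Type"]
print(content_type)
print(prepared.body.decode("utf-8"))
```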

 

Responses

r.status_code

r.headers

r.cookies

 

 

Redirects

By default Requests will perform location redirection for all verbs except HEAD.

>>> r = requests.get('http://httpbin.org/cookies/set?k2=v2&k1=v1')

>>> r.url
'http://httpbin.org/cookies'

>>> r.status_code
200

>>> r.history
[<Response [302]>]

If you're using HEAD, you can enable redirection as well:

r = requests.head('http://httpbin.org/cookies/set?k2=v2&k1=v1', allow_redirects=True)
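The redirect behaviour can be reproduced against a small local server. In this sketch the handler, the /old and /new paths, and the loopback URL are all made up for illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical local endpoint: /old answers 302 and points at /new."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"arrived"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables

followed = session.get(base + "/old")                             # follows the 302
not_followed = session.get(base + "/old", allow_redirects=False)  # stops at the 302

server.shutdown()
server.server_close()
print(followed.url, followed.history, not_followed.status_code)
```

Passing allow_redirects=False is handy when you want to inspect the Location header yourself.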

 

You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter:

requests.get('http://github.com', timeout=0.001)

 

 

Advanced features

Source:
<http://docs.python-requests.org/en/master/user/advanced/#advanced>

A Session stores cookies automatically and lets you set request parameters that are carried over to subsequent requests.

 

s = requests.Session()

s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get('http://httpbin.org/cookies')

print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'

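A Session can also provide default data for requests. Request-level data is merged with session-level data; when a key appears in both, the request-level value overrides the session-level one. To drop a session-level value for one request, pass a dict with the same key and a value of None.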

 

s = requests.Session()
s.auth = ('user', 'pass')  # authentication credentials
s.headers.update({'x-test': 'true'})

# both 'x-test' and 'x-test2' are sent
s.get('http://httpbin.org/headers', headers={'x-test2': 'true'})

Data passed as request arguments is used only once; it is not saved on the session.

For example, these cookies apply to this request only:

r = s.get('http://httpbin.org/cookies', cookies={'from-my': 'browser'})
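The merge rules for sessions can be verified offline with Session.prepare_request, which applies the session defaults to a request without sending it. This is only a sketch; the header and cookie names are the ones used in the text.

```python
import requests

s = requests.Session()
s.headers.update({"x-test": "true"})
url = "http://httpbin.org/headers"

# Request-level headers are merged with the session defaults.
merged = s.prepare_request(
    requests.Request("GET", url, headers={"x-test2": "true"})
)

# A value of None removes the session-level key for this request only.
removed = s.prepare_request(
    requests.Request("GET", url, headers={"x-test": None})
)

# Request-level cookies are sent once and are not stored on the session.
with_cookie = s.prepare_request(
    requests.Request("GET", url, cookies={"from-my": "browser"})
)

print(merged.headers["x-test"], merged.headers["x-test2"])
print("x-test" in removed.headers, s.headers["x-test"])
print(with_cookie.headers["Cookie"], len(s.cookies))
```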

 

A Session can also be closed automatically by using it as a context manager:

with requests.Session() as s:
    s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')

 

The response object contains not only all of the response information but also the request information:

r = requests.get('http://en.wikipedia.org/wiki/Monty_Python')

r.headers

r.request.headers

 

 

SSL certificate verification

Requests can verify SSL certificates for HTTPS requests, just like a web browser. To check a host's SSL certificate, use the verify argument:

 

 

>>> requests.get('https://kennethreitz.com', verify=True)
requests.exceptions.SSLError: hostname 'kennethreitz.com' doesn't match either of '*.herokuapp.com', 'herokuapp.com'

SSL is not set up for this domain, so the request fails. GitHub, however, does have SSL set up:

>>> requests.get('https://github.com', verify=True)
<Response [200]>

For private certificates, you can also pass the path to a CA_BUNDLE file to verify. You can also set the REQUESTS_CA_BUNDLE environment variable.

>>> requests.get('https://github.com', verify='/path/to/certfile')
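As a rough illustration of how the environment variable feeds into verify, the sketch below asks a Session which settings it would use for a request, via Session.merge_environment_settings, without opening a connection. The certificate path is the placeholder from the text, and calling this method directly is only done here for inspection.

```python
import os

import requests

os.environ["REQUESTS_CA_BUNDLE"] = "/path/to/certfile"  # placeholder path
try:
    s = requests.Session()
    # verify left unset: the environment variable is picked up.
    from_env = s.merge_environment_settings(
        "https://github.com", {}, None, None, None
    )["verify"]
    # verify=False passed explicitly: the environment variable is not consulted.
    explicit = s.merge_environment_settings(
        "https://github.com", {}, None, False, None
    )["verify"]
finally:
    del os.environ["REQUESTS_CA_BUNDLE"]

print(from_env, explicit)
```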

 

If you set verify to False, Requests will skip SSL certificate verification.

>>> requests.get('https://kennethreitz.com', verify=False)
<Response [200]>

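By default, verify is set to True. The verify option applies only to host certificates.

You can also specify a local certificate to use as a client-side certificate, either as a single file (containing the private key and the certificate) or as a tuple of the two files' paths:

>>> requests.get('https://kennethreitz.com', cert=('/path/server.crt', '/path/key'))
<Response [200]>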

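Body content workflow

By default, when you make a request, the body of the response is downloaded immediately. You can override this behaviour with the stream parameter and defer downloading the response body until you access the Response.content attribute: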

 

tarball_url = 'https://github.com/kennethreitz/requests/tarball/master'
r = requests.get(tarball_url, stream=True)

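At this point only the response headers have been downloaded and the connection remains open, which lets us retrieve the content conditionally:

if int(r.headers['content-length']) < TOO_LONG:
    content = r.content
    ...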

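If you set stream to True, the connection will not be closed unless you read all of the data or call Response.close.

You can use contextlib.closing to close the connection automatically: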

 

 

import requests
from contextlib import closing

tarball_url = 'https://github.com/kennethreitz/requests/tarball/master'
file = r'D:\Documents\WorkSpace\Python\Test\Python34Test\test.tar.gz'

with closing(requests.get(tarball_url, stream=True)) as r:
    with open(file, 'wb') as f:
        for data in r.iter_content(1024):
            f.write(data)
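The same pattern (check the headers first, then stream the body to disk) can be tried locally. In this sketch the local server, the 5000-byte payload and the TOO_LONG limit are assumptions standing in for a real tarball download.

```python
import os
import tempfile
import threading
from contextlib import closing
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

PAYLOAD = b"x" * 5000  # stand-in for a tarball (assumption)
TOO_LONG = 1_000_000   # hypothetical size limit in bytes


class FileHandler(BaseHTTPRequestHandler):
    """Hypothetical local endpoint that serves PAYLOAD."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()
        self.wfile.write(PAYLOAD)

    def log_message(self, *args):
        pass  # keep the example quiet


server = HTTPServer(("127.0.0.1", 0), FileHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables
target = os.path.join(tempfile.mkdtemp(), "download.bin")

with closing(session.get(url, stream=True)) as r:
    # Only the headers have been read so far; decide before downloading.
    declared = int(r.headers["content-length"])
    if declared < TOO_LONG:
        with open(target, "wb") as f:
            for chunk in r.iter_content(1024):
                f.write(chunk)

server.shutdown()
server.server_close()
print(declared, os.path.getsize(target))
```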

 

Keep-Alive

 


 

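Within a session, any requests you make will automatically reuse the appropriate connection!

Note: a connection is released back to the pool only once all of the response body has been read, so make sure to either set stream to False or read the content property of the Response object.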

 

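Streaming uploads

Requests supports streaming uploads, which lets you send large streams or files without first reading them into memory. To stream an upload, simply provide a file-like object for your request body:

Open the file in binary mode so that Requests generates the correct Content-Length.

with open('massive-body', 'rb') as f:
    requests.post('http://some.url/streamed', data=f)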

 

Chunked transfer encoding

Requests also supports chunked transfer encoding for outgoing and incoming requests. To send a chunk-encoded request, simply provide a generator for your request body.

Note that the generator should yield bytes.

def gen():
    yield b'hi'
    yield b'there'

requests.post('http://some.url/chunked', data=gen())

For chunked encoded responses, it's best to iterate over the data using Response.iter_content(). In an ideal situation you'll have set stream=True on the request, in which case you can iterate chunk-by-chunk by calling iter_content with a chunk size parameter of None. If you want to set a maximum size of the chunk, you can set a chunk size parameter to any integer.
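How Requests chooses between Content-Length and chunked encoding for an upload can be seen on the prepared request. In this sketch an in-memory io.BytesIO stands in for an open binary file, and the URL is the placeholder from the text; nothing is sent.

```python
import io

import requests

url = "http://some.url/streamed"  # placeholder URL from the text


def gen():
    yield b"hi"
    yield b"there"


# File-like object with a known size: Content-Length is set.
from_file = requests.Request("POST", url, data=io.BytesIO(b"x" * 1024)).prepare()
# Generator with no known size: Transfer-Encoding: chunked is used instead.
from_gen = requests.Request("POST", url, data=gen()).prepare()

print(from_file.headers.get("Content-Length"), from_file.headers.get("Transfer-Encoding"))
print(from_gen.headers.get("Content-Length"), from_gen.headers.get("Transfer-Encoding"))
```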

POST Multiple Multipart-Encoded Files

 


 

Suppose an HTML form has a multiple-file field named images:

<input type="file" name="images" multiple="true" required="true"/>

To do that, just set files to a list of tuples of (form_field_name, file_info):

>>> url = 'http://httpbin.org/post'
>>> multiple_files = [
...     ('images', ('foo.png', open('foo.png', 'rb'), 'image/png')),
...     ('images', ('bar.png', open('bar.png', 'rb'), 'image/png'))]
>>> r = requests.post(url, files=multiple_files)
>>> r.text
{
  ...
  'files': {'images': ' ....'}
  'Content-Type': 'multipart/form-data; boundary=3131623adb2043caaeb5538cc7aa0b3a',
  ...
}
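A quick way to confirm that both files end up under the same form field is to prepare the request and inspect the multipart body. In this sketch short byte strings stand in for the contents of foo.png and bar.png, so no files need to exist on disk.

```python
import requests

# In-memory bytes stand in for the real image files (assumption).
multiple_files = [
    ("images", ("foo.png", b"fake png bytes", "image/png")),
    ("images", ("bar.png", b"more fake bytes", "image/png")),
]
prepared = requests.Request(
    "POST", "http://httpbin.org/post", files=multiple_files
).prepare()

# Each tuple becomes its own part, all sharing the field name "images".
parts = prepared.body.count(b'name="images"')
print(parts)
```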

Custom Authentication

Requests allows you to specify your own authentication mechanism.

Any callable which is passed as the auth argument to a request method will have the opportunity to modify the request before it is dispatched.

Authentication implementations are subclasses of requests.auth.AuthBase, and are easy to define. Requests provides two common authentication scheme implementations in requests.auth: HTTPBasicAuth and HTTPDigestAuth.

Let's pretend that we have a web service that will only respond if the X-Pizza header is set to a password value. Unlikely, but just go with it.

from requests.auth import AuthBase

class PizzaAuth(AuthBase):
    """Attaches HTTP Pizza Authentication to the given Request object."""

    def __init__(self, username):
        # setup any auth-related data here
        self.username = username

    def __call__(self, r):
        # modify and return the request
        r.headers['X-Pizza'] = self.username
        return r

Then, we can make a request using our Pizza Auth:

>>> requests.get('http://pizzabin.org/admin', auth=PizzaAuth('kenneth'))
<Response [200]>
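Since pizzabin.org is a made-up service, one way to check that such an auth class does what it should is to prepare a request and inspect the header it attaches, without sending anything. A minimal sketch:

```python
import requests
from requests.auth import AuthBase


class PizzaAuth(AuthBase):
    """Attaches the X-Pizza header to the outgoing request."""

    def __init__(self, username):
        self.username = username

    def __call__(self, r):
        r.headers["X-Pizza"] = self.username
        return r


# prepare() invokes the auth callable on the prepared request.
prepared = requests.Request(
    "GET", "http://pizzabin.org/admin", auth=PizzaAuth("kenneth")
).prepare()
print(prepared.headers["X-Pizza"])
```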

 


 

Streaming requests

r = requests.get('http://httpbin.org/stream/20', stream=True)

for line in r.iter_lines():
    if line:  # skip keep-alive blank lines
        print(line)

 

Proxies

If you need to use a proxy, you can configure individual requests with the proxies argument to any request method:

import requests

proxies = {
  'http': 'http://10.10.1.10:3128',
  'https': 'http://10.10.1.10:1080',
}

requests.get('http://example.org', proxies=proxies)

To use HTTP Basic Auth with your proxy, use the http://user:password@host/ syntax:

proxies = {'http': 'http://user:pass@10.10.1.10:3128/'}

 

Timeouts

If you specify a single value for the timeout, like this:

r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:

r = requests.get('https://github.com', timeout=(3.05, 27))

If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.

r = requests.get('https://github.com', timeout=None)
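To see a read timeout fire without depending on a slow remote host, the sketch below uses a deliberately slow local server. The handler, its 0.5 second delay and the 0.1 second read timeout are all assumptions chosen for the example.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical local endpoint that waits before answering."""

    def do_GET(self):
        time.sleep(0.5)
        try:
            self.send_response(200)
            self.send_header("Content-Length", "0")
            self.end_headers()
        except OSError:
            pass  # the client has already given up

    def log_message(self, *args):
        pass  # keep the example quiet


server = HTTPServer(("127.0.0.1", 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables

try:
    session.get(url, timeout=(3.05, 0.1))  # (connect, read) in seconds
    outcome = "completed"
except requests.exceptions.Timeout as exc:
    outcome = type(exc).__name__

server.shutdown()
server.server_close()
print(outcome)
```

Catching requests.exceptions.Timeout covers both connect and read timeouts; the concrete exception class tells you which one occurred.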

 
