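How to make HTTP requests in Python

Python offers several ways to make HTTP requests. The most common are the requests library, the http.client module, and the urllib package. Each has trade-offs: requests is a third-party library valued for its concise, feature-rich API and suits most use cases; http.client is part of the standard library and suits lightweight, low-level HTTP work with no external dependencies; urllib is also part of the standard library and covers URL handling and basic network requests. The sections below walk through all three, then cover error handling, headers and parameters, response parsing, and a few more advanced requests features.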

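I. Using the `requests` library

requests is a very popular third-party HTTP library with a simple API for sending requests and handling responses.

1. Installing `requests`

requests is not part of the standard library, so install it first with pip: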

pip install requests

2. Sending a GET request

GET is the most common HTTP method and is used to retrieve data from a server. With requests, a GET request takes a single call:

import requests
response = requests.get('https://api.example.com/data')
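print(response.status_code)  # HTTP status code of the response
print(response.json())       # response body parsed as JSON

Here, requests.get() sends a GET request to the given URL and returns a response object. response.status_code holds the HTTP status code, and response.json() parses the response body as JSON and returns the resulting Python object. It raises an exception if the body is not valid JSON.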

3. Sending a POST request

POST is typically used to submit data to a server. requests provides a matching method:

import requests
data = {'key1': 'value1', 'key2': 'value2'}
response = requests.post('https://api.example.com/submit', data=data)
print(response.status_code)
print(response.text)

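The data parameter carries the payload. When it is a dict, requests.post() encodes it as form data (application/x-www-form-urlencoded). To send a JSON body instead, pass the payload through the json parameter.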
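The difference between a form-encoded body and a JSON body is easy to see without sending anything. The sketch below only prepares two requests locally and prints what would go on the wire. The URL is the article's placeholder endpoint, and the variable names are illustrative.

```python
import requests

# Placeholder endpoint; nothing is sent, the requests are only prepared locally.
URL = "https://api.example.com/submit"
payload = {"key1": "value1", "key2": "value2"}

# data= with a dict produces a form-encoded body.
form_request = requests.Request("POST", URL, data=payload).prepare()
print(form_request.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_request.body)                     # key1=value1&key2=value2

# json= serializes the dict as JSON and sets the matching Content-Type.
json_request = requests.Request("POST", URL, json=payload).prepare()
print(json_request.headers["Content-Type"])  # application/json
print(json_request.body)
```

Which form to use depends on what the server expects, so check the API's documentation before choosing between data and json.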

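II. Using the `http.client` module

http.client is part of the Python standard library and provides a low-level HTTP client interface.

1. Sending a GET request

With http.client, you create a connection object first and then issue the request on it: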

import http.client
conn = http.client.HTTPSConnection('api.example.com')
conn.request('GET', '/data')
response = conn.getresponse()
print(response.status, response.reason)
data = response.read()
print(data)
conn.close()

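Here, http.client.HTTPSConnection() creates an HTTPS connection object, request() sends the request, and getresponse() returns the response. response.read() returns the body as bytes, so decode it before treating it as text.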
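Because read() returns raw bytes, a JSON body has to be decoded and parsed by hand. The following is a minimal sketch of that step. So that it runs without network access, it talks to a throwaway server on 127.0.0.1 over plain HTTP; DemoHandler, get_json and the JSON payload are all made up for the illustration, and a real HTTPS API would use HTTPSConnection instead.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a real API: answers every GET with JSON."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "ok": True}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def get_json(host, port, path, timeout=5):
    """GET a path over plain HTTP and return (status, parsed JSON body)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Accept": "application/json"})
        response = conn.getresponse()
        raw = response.read()  # bytes, not str
        return response.status, json.loads(raw.decode("utf-8"))
    finally:
        conn.close()  # always release the socket


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    status, payload = get_json("127.0.0.1", server.server_address[1], "/data")
    print(status, payload)
finally:
    server.shutdown()
    server.server_close()
```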

2. Sending a POST request

A POST request works the same way: specify the method and path, and pass the body and headers:

import http.client
conn = http.client.HTTPSConnection('api.example.com')
headers = {'Content-type': 'application/x-www-form-urlencoded'}
params = 'key1=value1&key2=value2'
conn.request('POST', '/submit', params, headers)
response = conn.getresponse()
print(response.status, response.reason)
data = response.read()
print(data)
conn.close()

For a POST request, the headers argument must include a Content-Type header so that the server knows how the body is encoded.
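A hand-written body string such as key1=value1&key2=value2 breaks as soon as a value contains a space, & or =. One way around this, sketched below, is to build the body with urllib.parse.urlencode(). The example posts to a throwaway local server so that it runs offline; EchoFormHandler and post_form are hypothetical names, and the field values are invented.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlencode


class EchoFormHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: records the form fields it receives."""

    received = {}

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        EchoFormHandler.received = parse_qs(body)
        self.send_response(204)  # success, no response body
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def post_form(host, port, path, fields, timeout=5):
    """POST a dict as form data and return the response status code."""
    body = urlencode(fields)  # escapes spaces, '&', '=' and similar characters
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("POST", path, body=body, headers=headers)
        response = conn.getresponse()
        response.read()  # drain the response before closing
        return response.status
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), EchoFormHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    fields = {"name": "Ada Lovelace", "note": "a&b=c"}
    status = post_form("127.0.0.1", server.server_address[1], "/submit", fields)
    print(status, EchoFormHandler.received)
finally:
    server.shutdown()
    server.server_close()
```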

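III. Using the `urllib` package

urllib is also part of the standard library. It is a package of modules for working with URLs and making network requests.

1. Sending a GET request

A GET request goes through the urllib.request module: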

import urllib.request
with urllib.request.urlopen('https://api.example.com/data') as response:
    data = response.read()
    print(data)

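Here, urllib.request.urlopen() opens the URL and returns a response object. Used in a with statement, the response is closed automatically, and read() returns the body as bytes.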
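urlopen() raises urllib.error.HTTPError when the server answers with a 4xx or 5xx status, and urllib.error.URLError when no response arrives at all. HTTPError is a subclass of URLError, so it has to be caught first. The sketch below shows one possible wrapper; fetch_text and DemoHandler are hypothetical names, and the local server exists only so that the example runs without network access.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical server: serves /data and answers 404 for anything else."""

    def do_GET(self):
        if self.path == "/data":
            body = "hello".encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_text(url, timeout=5):
    """Return (status, text); status is None when no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # Use the charset declared by the server, falling back to UTF-8.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.status, response.read().decode(charset)
    except urllib.error.HTTPError as exc:  # server answered with 4xx or 5xx
        return exc.code, str(exc.reason)
    except urllib.error.URLError as exc:   # DNS failure, refused connection, ...
        return None, str(exc.reason)


server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_address[1]}"
try:
    found = fetch_text(base + "/data")
    missing = fetch_text(base + "/nope")
    print(found, missing)
finally:
    server.shutdown()
    server.server_close()
```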

2. Sending a POST request

For a POST request, encode the data with the urllib.parse module first:

import urllib.request
import urllib.parse
url = 'https://api.example.com/submit'
data = {'key1': 'value1', 'key2': 'value2'}
data = urllib.parse.urlencode(data).encode('utf-8')
req = urllib.request.Request(url, data)
with urllib.request.urlopen(req) as response:
    result = response.read()
    print(result)

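Here, urllib.parse.urlencode() encodes the dict as form data, and encode('utf-8') converts the result to bytes, which is what the data argument requires. urllib.request.Request() builds the request object. Because data is supplied, the request is sent as a POST.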
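The same Request class can carry a JSON body if the Content-Type header is set explicitly, and its method argument makes the HTTP method visible instead of implied. The sketch below only builds the request object and inspects it, so it runs offline; build_json_post is a hypothetical helper, and the URL is the article's placeholder.

```python
import json
import urllib.request


def build_json_post(url, payload):
    """Build (but do not send) a POST request with a JSON body."""
    body = json.dumps(payload).encode("utf-8")  # data must be bytes
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # explicit, although supplying data already implies POST
    )


req = build_json_post("https://api.example.com/submit", {"key1": "value1"})
print(req.get_method())                # POST
print(req.get_header("Content-type"))  # application/json
print(req.data)
```

Passing req to urllib.request.urlopen() would then send it, exactly as in the form-data example.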

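IV. Things to watch for when making HTTP requests

1. Exception handling

An HTTP request can fail in many ways, for example through a network error or a timeout. Add exception handling so that such failures do not crash the program.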

import requests
try:
    response = requests.get('https://api.example.com/data', timeout=5)
    response.raise_for_status()  # raises HTTPError for 4xx and 5xx status codes
except requests.exceptions.RequestException as e:
    print(f"HTTP request failed: {e}")
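It helps to know exactly which status codes make raise_for_status() raise. The quick check below builds Response objects by hand, so no network is involved. raises_http_error is a hypothetical helper written for this illustration.

```python
import requests


def raises_http_error(status_code):
    """Return True if raise_for_status() raises for this status code."""
    response = requests.Response()  # built by hand, nothing is sent
    response.status_code = status_code
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError:
        return True
    return False


results = {code: raises_http_error(code) for code in (200, 204, 301, 404, 500)}
print(results)  # {200: False, 204: False, 301: False, 404: True, 500: True}
```

Only the 4xx and 5xx codes raise. Other success codes such as 204, and redirects such as 301, pass through, so code that needs exactly 200 still has to compare status_code itself.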

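2. SSL certificate verification

requests verifies the server's SSL certificate by default for HTTPS requests. The verify parameter controls this behavior:

import requests
response = requests.get('https://api.example.com/data', verify=False)

Passing verify=False disables certificate verification and exposes the connection to man-in-the-middle attacks. Use it only in a development environment, if at all, and keep verification enabled in production.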
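The standard-library clients covered earlier control verification through an ssl.SSLContext rather than a verify flag. As a minimal sketch, the context returned by ssl.create_default_context() already requires a valid certificate and checks the host name, and it can be passed to HTTPSConnection explicitly. The host name is the article's placeholder, and nothing is sent here because the connection object does not open a socket until a request is made.

```python
import http.client
import ssl

# Default context: certificate required, host name checked.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True

# Creating the connection object does not open a socket yet.
conn = http.client.HTTPSConnection("api.example.com", context=context, timeout=5)
conn.close()
```

For a server signed by a private certificate authority, a safer alternative to disabling verification is to point the client at that authority's certificate: create_default_context() accepts a cafile argument, and the verify parameter in requests also accepts a path to a CA bundle.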

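V. Using request headers and parameters

1. Custom request headers

Some requests need custom headers, for example to add authentication credentials or set the user agent. Pass them through the headers parameter:

import requests
headers = {'Authorization': 'Bearer YOUR_ACCESS_TOKEN', 'User-Agent': 'my-app'}
response = requests.get('https://api.example.com/data', headers=headers)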

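2. URL parameters

For a GET request, pass query-string parameters through the params parameter:

import requests
params = {'key1': 'value1', 'key2': 'value2'}
response = requests.get('https://api.example.com/data', params=params)

requests URL-encodes the parameters and appends them to the URL as a query string.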
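To see what that encoding produces, the request can be prepared without being sent. The sketch below uses invented parameter values, including a space and a list, and then parses the query string back to confirm that it round-trips.

```python
import requests
from urllib.parse import parse_qs, urlsplit

# Illustrative values: a space, a list (sent as a repeated key) and an int.
params = {"q": "hello world", "tags": ["a", "b"], "page": 2}

# prepare() builds the final URL locally; nothing is sent.
prepared = requests.Request(
    "GET", "https://api.example.com/data", params=params
).prepare()
print(prepared.url)

# Decode the query string again to check what the server would see.
decoded = parse_qs(urlsplit(prepared.url).query)
print(decoded)
```

Building the query string by hand with string concatenation skips this escaping, which is why passing a dict through params is generally the safer habit.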

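VI. Parsing the response body

An HTTP response usually carries a body, which may be JSON, XML, HTML, or another format, and handling the response generally means parsing that body.

1. Parsing a JSON response

For a JSON response, the json() method parses the body directly:

import requests
response = requests.get('https://api.example.com/data')
data = response.json()
print(data['key'])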
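Two things can go wrong in that snippet: the body may not be JSON at all, and the key may be missing. The sketch below shows one defensive pattern. It relies on the JSON decode errors raised by json() being subclasses of ValueError, which holds for the standard json module and, to my understanding, for current requests releases as well. extract_key and StubResponse are hypothetical; the stub only mimics the json() method so that the example runs offline.

```python
import json


def extract_key(response, key, default=None):
    """Return response.json()[key], or default if the body is not usable."""
    try:
        data = response.json()
    except ValueError:  # body is empty or not valid JSON
        return default
    if isinstance(data, dict):
        return data.get(key, default)
    return default  # valid JSON, but not an object (for example a list)


class StubResponse:
    """Hypothetical stand-in that exposes only .json(), for offline use."""

    def __init__(self, text):
        self.text = text

    def json(self):
        return json.loads(self.text)


print(extract_key(StubResponse('{"key": "value"}'), "key"))      # value
print(extract_key(StubResponse("<html>error</html>"), "key"))    # None
print(extract_key(StubResponse('{"other": 1}'), "key", "n/a"))   # n/a
```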

2. Parsing an HTML response

For an HTML response, the BeautifulSoup library can parse the body:

import requests
from bs4 import BeautifulSoup
response = requests.get('https://example.com')
soup = BeautifulSoup(response.content, 'html.parser')
print(soup.title.string)

BeautifulSoup must be installed beforehand:

pip install beautifulsoup4
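The 'html.parser' argument in the snippet above names Python's built-in HTML parser. For very simple extraction, and when installing a third-party package is not an option, that parser can also be used directly. The sketch below is a swapped-in standard-library technique, not BeautifulSoup: TitleParser and extract_title are hypothetical names, and the HTML string is invented.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title = (self.title or "") + data


def extract_title(html):
    """Return the page title, or None if the document has no <title>."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip() if parser.title is not None else None


page = "<html><head><title> Example Domain </title></head><body><p>hi</p></body></html>"
print(extract_title(page))                 # Example Domain
print(extract_title("<p>no title</p>"))    # None
```

Unlike soup.title.string, this helper returns None instead of raising when the document has no title. For anything beyond such small tasks, BeautifulSoup remains far more convenient.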

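VII. Advanced techniques

1. Session objects

A session object persists settings such as headers and cookies across requests and reuses the underlying connections through a connection pool.

import requests
session = requests.Session()
session.headers.update({'Authorization': 'Bearer YOUR_ACCESS_TOKEN'})
response = session.get('https://api.example.com/data')
print(response.json())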
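How session-level settings combine with per-request settings can be checked offline with Session.prepare_request(), which merges the two without sending anything. In the sketch below, the api_version query parameter and the Accept header are made-up values for illustration.

```python
import requests

# Using the session as a context manager closes its connections on exit.
with requests.Session() as session:
    session.headers.update({"Authorization": "Bearer YOUR_ACCESS_TOKEN"})
    session.params = {"api_version": "2"}  # hypothetical default query parameter

    request = requests.Request(
        "GET",
        "https://api.example.com/data",
        headers={"Accept": "application/json"},  # per-request header
    )
    prepared = session.prepare_request(request)  # merged, but not sent

print(prepared.headers["Authorization"])  # Bearer YOUR_ACCESS_TOKEN
print(prepared.headers["Accept"])         # application/json
print(prepared.url)                       # https://api.example.com/data?api_version=2
```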

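2. Streaming requests

For large downloads, use a streaming request so that the file is not loaded into memory all at once.

import requests
# The with statement releases the connection even if the download fails midway.
with requests.get('https://example.com/largefile', stream=True, timeout=10) as response:
    response.raise_for_status()
    with open('largefile', 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)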

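VIII. Summary

Making HTTP requests is a routine task in Python, and the requests library's simple API makes it the usual first choice. It covers GET and POST requests, custom headers, response parsing, and more. http.client and urllib ship with the standard library and provide the basic HTTP functionality, so choose between them and requests according to your needs, for example when you cannot add a third-party dependency. Whichever approach you use, handle exceptions and keep certificate verification enabled so that the program stays robust and secure.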