How to Get Download Progress in Python

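Python offers several ways to track download progress, typically by pairing an HTTP request library with a progress bar library. Common approaches are streaming the download with the requests library, displaying a progress bar with tqdm, and using the standard-library urllib package. Combining requests with tqdm makes it easy to obtain and display progress and suits most download tasks. The sections below describe each of these approaches in detail.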

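1. Getting Download Progress with the requests Library

requests is a very popular HTTP library for Python. It supports streaming downloads, which lets us read a large file incrementally and compute the progress as the data arrives.

Streaming downloads

Passing stream=True to the request enables a streaming download. The response body is then not loaded into memory all at once; it is read from the connection piece by piece, so progress can be computed as each piece is read.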

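import requests

def download_file(url, destination):
    # stream=True defers reading the body until iter_content() is consumed
    response = requests.get(url, stream=True)
    # Content-Length may be missing (e.g. chunked transfer); 0 means "unknown"
    total_size = int(response.headers.get('content-length', 0))
    downloaded_size = 0
    with open(destination, 'wb') as file:
        for data in response.iter_content(1024):
            downloaded_size += len(data)
            file.write(data)
            if total_size > 0:
                print(f"Download progress: {downloaded_size / total_size * 100:.2f}%")
            else:
                # Avoid dividing by zero when the total size is unknown
                print(f"Downloaded {downloaded_size} bytes")

url = 'http://example.com/largefile.zip'
destination = 'largefile.zip'
download_file(url, destination)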

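In this example, response.iter_content(1024) reads the data incrementally in chunks of up to 1024 bytes, and progress is computed as the number of bytes downloaded so far divided by the total file size.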
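
The percentage calculation depends on the Content-Length header, which a server may omit. The following is a minimal sketch of how the header parsing and the progress formatting could be separated so that a missing header does not cause a division by zero. The helper names parse_total_size and format_progress are hypothetical and are not part of requests. The result is clamped to 100% because requests decompresses gzip or deflate responses, so the number of decoded bytes can exceed the Content-Length value.

```python
from typing import Optional


def parse_total_size(content_length: Optional[str]) -> Optional[int]:
    """Return the size in bytes, or None if the header is missing or invalid."""
    if content_length is None:
        return None
    try:
        size = int(content_length)
    except ValueError:
        return None
    # Treat zero or negative values as "unknown"
    return size if size > 0 else None


def format_progress(downloaded: int, total: Optional[int]) -> str:
    """Format progress as a percentage, or as a byte count if the total is unknown."""
    if total is None:
        return f"{downloaded} bytes downloaded"
    # Clamp to 100% in case more decoded bytes arrive than Content-Length announced
    percent = min(downloaded / total * 100, 100.0)
    return f"{percent:.2f}%"
```

Inside a download loop, one would call parse_total_size(response.headers.get('content-length')) once and then print format_progress(downloaded_size, total_size) after each chunk.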

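Exception handling and performance tuning

In practice, an interrupted network connection, insufficient disk space, and similar problems can cause a download to fail, so exception handling is important. Adjusting the size of each chunk can also improve download performance.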

import requests

def download_file_with_error_handling(url, destination):
    try:
        response = requests.get(url, stream=True)
        response.raise_for_status()  # Raise an exception for 4xx/5xx responses
        total_size = int(response.headers.get('content-length', 0))
        downloaded_size = 0
        with open(destination, 'wb') as file:
            for data in response.iter_content(4096):  # Larger chunks mean fewer loop iterations
                downloaded_size += len(data)
                file.write(data)
                if total_size > 0:  # Avoid dividing by zero when Content-Length is missing
                    print(f"Download progress: {downloaded_size / total_size * 100:.2f}%")
    except requests.exceptions.RequestException as e:
        print(f"A network error occurred: {e}")
    except OSError as e:
        # Covers local file problems such as a full disk or a missing directory
        print(f"A file error occurred: {e}")

url = 'http://example.com/largefile.zip'
destination = 'largefile.zip'
download_file_with_error_handling(url, destination)

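This improved example adds exception handling and raises the chunk size to 4096 bytes, which reduces the number of read and write calls needed for the same file.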
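
Printing a line for every chunk is itself a cost on large files. As one possible refinement, the sketch below reports progress only when it crosses a new percentage step. The name make_throttled_reporter and its parameters are hypothetical, and the sketch assumes the total size is known and positive.

```python
from typing import Callable


def make_throttled_reporter(
    total: int,
    step_percent: int = 10,
    emit: Callable[[str], None] = print,
) -> Callable[[int], None]:
    """Return a callback that emits a message only when progress enters a new step."""
    if total <= 0:
        raise ValueError("total must be positive")
    if step_percent <= 0:
        raise ValueError("step_percent must be positive")
    last_step = -1  # No step has been reported yet

    def report(downloaded: int) -> None:
        nonlocal last_step
        percent = min(downloaded * 100 // total, 100)
        current_step = percent // step_percent
        if current_step > last_step:
            last_step = current_step
            emit(f"Download progress: {percent}%")

    return report
```

In a download loop, the callback would be created once with the total size and then called with downloaded_size after each chunk is written. Passing a custom emit function, such as a list's append method, makes the reporter easy to test without capturing printed output.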

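2. Displaying Download Progress with tqdm

tqdm is a Python library for displaying progress bars. It can be combined with requests to present download progress in a cleaner and more readable way.

Installing tqdm

First, make sure tqdm is installed. It can be installed with the following command: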

pip install tqdm

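Showing a progress bar with tqdm

The tqdm function can wrap any iterable to display a progress bar, and a bar can also be advanced manually with its update method.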

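from tqdm import tqdm
import requests

def download_file_with_tqdm(url, destination):
    response = requests.get(url, stream=True)
    total_size = int(response.headers.get('content-length', 0))
    # total=None makes tqdm show a running byte count without a percentage
    with open(destination, 'wb') as file, tqdm(
        total=total_size or None, unit='B', unit_scale=True, desc=destination
    ) as progress_bar:
        for data in response.iter_content(1024):
            file.write(data)
            progress_bar.update(len(data))  # Advance by the bytes actually received

url = 'http://example.com/largefile.zip'
destination = 'largefile.zip'
download_file_with_tqdm(url, destination)

In this example, the progress bar is created with the total size in bytes, the unit 'B' with unit_scale=True so that tqdm displays KB or MB as appropriate, and the description desc. Calling update(len(data)) advances the bar by the number of bytes actually received, so it stays accurate even when the last chunk is shorter than 1024 bytes, which a chunk count of total_size//1024 would not account for.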

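3. Getting Download Progress with urllib

Besides requests, Python's standard library includes urllib, a package for handling URLs and HTTP requests that can also be used to download a file and track progress.

Downloading with urllib

The urlopen function in urllib.request can be used to download a file, and progress can be computed while reading the data stream.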

import urllib.request

def download_with_urllib(url, destination):
    # The context manager closes the connection when the download ends
    with urllib.request.urlopen(url) as response:
        total_size = int(response.headers.get('content-length', 0))
        downloaded_size = 0
        with open(destination, 'wb') as file:
            while True:
                data = response.read(1024)
                if not data:  # An empty bytes object signals the end of the stream
                    break
                downloaded_size += len(data)
                file.write(data)
                if total_size > 0:  # Avoid dividing by zero when Content-Length is missing
                    print(f"Download progress: {downloaded_size / total_size * 100:.2f}%")

url = 'http://example.com/largefile.zip'
destination = 'largefile.zip'
download_with_urllib(url, destination)

In this example, response.read(1024) reads the data stream incrementally, and the progress is computed after each read.
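
The standard library also provides urllib.request.urlretrieve, which accepts a reporthook callback and calls it with the block number, the block size, and the total size (-1 when the size is unknown). The sketch below shows one way such a hook might be written. The names percent_from_blocks, show_progress, and download_with_reporthook are hypothetical. Note that the Python documentation describes urlretrieve as a legacy interface, so the manual read loop may be the more future-proof choice.

```python
import urllib.request
from typing import Optional


def percent_from_blocks(block_num: int, block_size: int, total_size: int) -> Optional[float]:
    """Convert reporthook arguments to a percentage, or None if the size is unknown."""
    if total_size <= 0:
        return None
    # The last block is usually partial, so clamp the estimate to 100%
    return min(block_num * block_size / total_size * 100, 100.0)


def show_progress(block_num: int, block_size: int, total_size: int) -> None:
    """Reporthook that prints a percentage or an approximate byte count."""
    percent = percent_from_blocks(block_num, block_size, total_size)
    if percent is None:
        print(f"Downloaded about {block_num * block_size} bytes")
    else:
        print(f"Download progress: {percent:.2f}%")


def download_with_reporthook(url: str, destination: str) -> None:
    """Download url to destination, reporting progress through show_progress."""
    urllib.request.urlretrieve(url, destination, reporthook=show_progress)
```

A call such as download_with_reporthook('http://example.com/largefile.zip', 'largefile.zip') would then print one line per block transferred.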

Adding a progress bar

tqdm can also be combined with urllib to add a progress bar to the download.

from tqdm import tqdm
import urllib.request

def download_with_urllib_and_tqdm(url, destination):
    with urllib.request.urlopen(url) as response:
        total_size = int(response.headers.get('content-length', 0))
        # total=None makes tqdm show a running byte count without a percentage
        with open(destination, 'wb') as file, tqdm(
            total=total_size or None, unit='B', unit_scale=True, desc=destination
        ) as progress_bar:
            # Loop until the stream is empty so the final partial chunk is written too
            while True:
                data = response.read(1024)
                if not data:
                    break
                file.write(data)
                progress_bar.update(len(data))

url = 'http://example.com/largefile.zip'
destination = 'largefile.zip'
download_with_urllib_and_tqdm(url, destination)
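
A read loop is safest when it is driven by the data itself and not by a precomputed chunk count, because file sizes are rarely an exact multiple of the chunk size. The following sketch illustrates this with an in-memory stream standing in for a network response. The helper name iter_chunks and the 2500-byte payload are illustrative assumptions.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 1024) -> Iterator[bytes]:
    """Yield chunks until the stream is exhausted, including a final short chunk."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


# A 2500-byte payload is not a multiple of 1024, so the last chunk is short
payload = io.BytesIO(b"x" * 2500)
chunk_lengths = [len(chunk) for chunk in iter_chunks(payload)]
```

Here 2500 // 1024 is 2, so a loop limited to that many reads would stop before the final 452 bytes, while iter_chunks returns all three chunks.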