How to Read a File Line by Line in Python

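Python offers several ways to read a file line by line. The most common are calling open() directly, using a with statement, and iterating over the file object. The with statement is the recommended approach: it is concise and closes the file automatically, even when an error occurs, so file handles are not leaked. The sections below cover this approach first, followed by the alternatives and the format-specific tools.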

1. Reading a file line by line with the with statement

The with statement ensures that the file is closed properly once you are done with it:

with open('example.txt', 'r') as file:
    for line in file:
        print(line.strip())

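In this example, with open('example.txt', 'r') as file: opens a file named "example.txt" in read-only text mode ('r'). The for line in file: loop then iterates over the file one line at a time. Each line keeps its trailing newline; line.strip() removes it along with any other leading or trailing whitespace, so use line.rstrip('\n') if you want to remove only the newline. When no encoding argument is given, the default depends on the platform and Python version, so passing encoding='utf-8' explicitly is safer for UTF-8 files.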
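The difference between strip() and rstrip('\n') can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper name numbered_lines and the sample file are assumptions made up for this example.

```python
import tempfile
from pathlib import Path


def numbered_lines(path, encoding="utf-8"):
    """Return (line_number, text) pairs, removing only the trailing newline."""
    pairs = []
    with open(path, "r", encoding=encoding) as file:
        for number, line in enumerate(file, start=1):
            # rstrip("\n") keeps indentation and trailing spaces intact.
            pairs.append((number, line.rstrip("\n")))
    return pairs


# Hypothetical sample file, created here so the sketch is self-contained.
sample = Path(tempfile.mkdtemp()) / "example.txt"
sample.write_text("  indented\nplain  \n", encoding="utf-8")

pairs = numbered_lines(sample)
for number, text in pairs:
    print(number, repr(text))
```

With strip() instead of rstrip("\n"), the leading spaces of the first line and the trailing spaces of the second would be lost as well.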

2. Reading a file line by line with open()

You can also call open() directly and close the file yourself:

file = open('example.txt', 'r')
for line in file:
    print(line.strip())
file.close()

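This works, but you have to remember to close the file yourself. If an exception is raised before file.close() runs, the file stays open until the object is garbage-collected, which can leak file handles.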
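If you do manage the file by hand, wrapping the work in try/finally is one way to make sure close() always runs. The sketch below assumes a hypothetical helper read_stripped and a sample file created only for illustration.

```python
import tempfile
from pathlib import Path


def read_stripped(path):
    """Read all lines, closing the file even if reading fails."""
    file = open(path, "r", encoding="utf-8")
    try:
        lines = [line.strip() for line in file]
    finally:
        file.close()  # runs on success and on error
    return lines, file.closed


# Hypothetical sample file for the demonstration.
sample = Path(tempfile.mkdtemp()) / "example.txt"
sample.write_text("alpha\nbeta\n", encoding="utf-8")

lines, was_closed = read_stripped(sample)
print(lines, was_closed)
```

This is essentially what the with statement does for you, which is why with is usually preferred.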

3. Reading a file line by line with readlines()

The readlines() method reads every line of the file into a list, which you can then process one item at a time:

with open('example.txt', 'r') as file:
    lines = file.readlines()
    for line in lines:
        print(line.strip())

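This approach suits small files only, because it loads the entire file into memory at once. In return you get a list, so lines can be indexed, counted, or traversed more than once.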
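The following sketch shows the kind of random access that a list from readlines() makes possible. The sample file and variable names are assumptions for illustration only.

```python
import tempfile
from pathlib import Path

# Hypothetical sample file for the demonstration.
sample = Path(tempfile.mkdtemp()) / "example.txt"
sample.write_text("first\nsecond\nthird\n", encoding="utf-8")

with open(sample, "r", encoding="utf-8") as file:
    lines = file.readlines()  # every element still ends with "\n"

line_count = len(lines)
last_line = lines[-1].rstrip("\n")
reversed_lines = [line.rstrip("\n") for line in reversed(lines)]
print(line_count, last_line, reversed_lines)
```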

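4. Reading a file line by line with an iterator

Python file objects are iterable, and a file object is its own iterator, so you can also obtain the iterator explicitly: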

with open('example.txt', 'r') as file:
    iterator = iter(file)
    for line in iterator:
        print(line.strip())

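This is equivalent to writing for line in file: directly, because iter(file) returns the file object itself. Holding the iterator explicitly is useful mainly when you want to call next() on it, for example to consume a header line before looping over the rest.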
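A small sketch of that header-skipping pattern is shown here. The helper name split_header and the sample data are hypothetical.

```python
import tempfile
from pathlib import Path


def split_header(path):
    """Return the first line and the remaining lines separately."""
    with open(path, "r", encoding="utf-8") as file:
        iterator = iter(file)
        # next() consumes one line; the default "" covers an empty file.
        header = next(iterator, "")
        body = [line.rstrip("\n") for line in iterator]
    return header.rstrip("\n"), body


# Hypothetical sample file for the demonstration.
sample = Path(tempfile.mkdtemp()) / "example.txt"
sample.write_text("name,score\nalice,90\nbob,85\n", encoding="utf-8")

header, body = split_header(sample)
print(header, body)
```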

5. Reading large files line by line

For very large files, reading one line at a time avoids running out of memory:

with open('example.txt', 'r') as file:
    while True:
        line = file.readline()
        if not line:
            break
        print(line.strip())

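This loop calls readline() repeatedly until it returns an empty string, which signals the end of the file. A blank line in the file comes back as '\n', not as an empty string, so it does not end the loop. Iterating with for line in file: is just as memory-efficient; the explicit readline() form is useful when you need finer control over when each line is read.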
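On Python 3.8 or later, the same readline() loop can be written more compactly with an assignment expression. This is only a sketch; the function count_nonblank_lines and the sample file are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def count_nonblank_lines(path):
    """Count lines that contain something other than whitespace."""
    count = 0
    with open(path, "r", encoding="utf-8") as file:
        # readline() returns "" only at end of file, which ends the loop.
        while line := file.readline():
            if line.strip():
                count += 1
    return count


# Hypothetical sample file with one blank line in the middle.
sample = Path(tempfile.mkdtemp()) / "example.txt"
sample.write_text("a\n\nb\n", encoding="utf-8")

nonblank = count_nonblank_lines(sample)
print(nonblank)
```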

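6. Reading a file line by line with a generator

A generator produces each line only when the caller asks for it, which keeps memory use low: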

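def read_file_in_chunks(file_name):
    # Yield the file's lines one at a time, stripped of surrounding whitespace.
    with open(file_name, 'r') as file:
        while True:
            line = file.readline()
            if not line:  # an empty string means end of file
                break
            yield line.strip()


for line in read_file_in_chunks('example.txt'):
    print(line)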

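The function uses yield, so each line is produced on demand, which makes it suitable for large files. Despite its name, read_file_in_chunks yields one stripped line at a time, not fixed-size chunks.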
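Because a generator is lazy, it combines well with tools such as itertools.islice when only part of a file is needed. The sketch below assumes a hypothetical generator iter_lines and a sample file created for the demonstration.

```python
import tempfile
from itertools import islice
from pathlib import Path


def iter_lines(path):
    """Yield lines lazily, without their trailing newline."""
    with open(path, "r", encoding="utf-8") as file:
        for line in file:
            yield line.rstrip("\n")


# Hypothetical sample file with five lines.
sample = Path(tempfile.mkdtemp()) / "example.txt"
sample.write_text("".join(f"line {i}\n" for i in range(1, 6)), encoding="utf-8")

# Only the first two lines are requested from the generator.
first_two = list(islice(iter_lines(sample), 2))
print(first_two)
```

One caveat: if the generator is not run to completion, the file stays open until the generator is closed or garbage-collected.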

7. Using contextlib's contextmanager

The contextmanager decorator from contextlib lets you write a custom context manager as a generator function:

from contextlib import contextmanager


@contextmanager
def open_file(file_name, mode):
    file = open(file_name, mode)
    try:
        yield file  # the body of the with block runs here
    finally:
        file.close()  # always runs, even if the with block raises


with open_file('example.txt', 'r') as file:
    for line in file:
        print(line.strip())

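This creates a custom context manager for file handling. The built-in open() already supports the with statement, so a wrapper like this is worthwhile only when you need extra setup or cleanup steps around the file.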

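8. Reading a file line by line with the pathlib module

pathlib is a standard-library module introduced in Python 3.4 for working with file and directory paths as objects: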

from pathlib import Path
file_path = Path('example.txt')
with file_path.open('r') as file:
    for line in file:
        print(line.strip())

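pathlib provides a more concise, object-oriented way to handle file paths. Path.open() accepts the same mode and encoding arguments as the built-in open().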
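For small files, Path also offers read_text(), which opens, reads, and closes the file in one call. The sketch below uses a hypothetical sample file; note that this reads the whole file into memory, so it is not a streaming approach.

```python
import tempfile
from pathlib import Path

# Hypothetical sample file for the demonstration.
file_path = Path(tempfile.mkdtemp()) / "example.txt"
file_path.write_text("first\nsecond\n", encoding="utf-8")

# splitlines() drops the line endings and returns a list of strings.
lines = file_path.read_text(encoding="utf-8").splitlines()
print(file_path.name, lines)
```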

9. Reading multiple files with the fileinput module

If you need to read several files, the fileinput module is useful:

import fileinput
for line in fileinput.input(files=['example1.txt', 'example2.txt']):
    print(line.strip())

fileinput.input() reads the listed files one after another and yields their lines as a single stream.
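When lines from several files arrive in one stream, it often helps to know where each line came from. The sketch below uses the filename() and filelineno() methods of the object returned by fileinput.input(); the two sample files are created here purely for illustration.

```python
import fileinput
import os
import tempfile
from pathlib import Path

# Hypothetical sample files for the demonstration.
folder = Path(tempfile.mkdtemp())
first = folder / "example1.txt"
second = folder / "example2.txt"
first.write_text("a\nb\n")
second.write_text("c\n")

records = []
# The with block closes the input stream when the loop is done.
with fileinput.input(files=[str(first), str(second)]) as stream:
    for line in stream:
        name = os.path.basename(stream.filename())
        # filelineno() restarts at 1 for each new file.
        records.append((name, stream.filelineno(), line.rstrip("\n")))

print(records)
```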

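10. Reading a file row by row with pandas

If you need to process files in a specific tabular format (such as CSV), the third-party pandas library is a powerful option: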

import pandas as pd
df = pd.read_csv('example.csv')
for index, row in df.iterrows():
    print(row)

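pandas offers powerful data-processing capabilities and suits complex data analysis tasks. Note that read_csv loads the whole file into a DataFrame before iterrows() runs, so this is row-wise iteration over in-memory data, not streaming. For files too large for memory, read_csv also accepts a chunksize argument.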

11. Reading a file row by row with numpy

The third-party numpy library can also read files of numeric data:

import numpy as np
data = np.loadtxt('example.txt', delimiter=',')
for row in data:
    print(row)

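numpy suits files that contain large amounts of numeric data. loadtxt expects every row to have the same number of values and loads the whole file into an array.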

12. Reading a CSV file row by row with the csv module

The csv module in the standard library is designed for CSV files:

import csv
# newline='' lets the csv module handle line endings inside quoted fields.
with open('example.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

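csv.reader parses the CSV file and returns each row as a list of strings. The csv documentation recommends opening the file with newline='' so that newlines embedded in quoted fields are handled correctly.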
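When the first row holds column names, csv.DictReader may be more convenient, since it returns each row as a mapping from column name to value. The sample file and its columns below are assumptions for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical sample CSV file with a header row.
sample = Path(tempfile.mkdtemp()) / "example.csv"
sample.write_text("name,score\nalice,90\nbob,85\n", encoding="utf-8")

with open(sample, "r", newline="", encoding="utf-8") as file:
    # The first row supplies the keys; all values are read as strings.
    rows = list(csv.DictReader(file))

for row in rows:
    print(row["name"], row["score"])
```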

13. Reading JSON files with the json module

If the file is in JSON format, the json module comes in handy:

import json
with open('example.json', 'r') as file:
    data = json.load(file)
    for entry in data:
        print(entry)

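json.load parses the whole file into a Python object in a single call; it does not read line by line. The loop then iterates over the parsed result: the elements if the top level is a list, or the keys if it is an object (dict).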
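True line-by-line reading applies to the JSON Lines layout, where every line is a separate JSON document. A minimal sketch, assuming such a file and a hypothetical helper iter_json_lines:

```python
import json
import tempfile
from pathlib import Path


def iter_json_lines(path):
    """Yield one parsed object per non-empty line."""
    with open(path, "r", encoding="utf-8") as file:
        for line in file:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


# Hypothetical JSON Lines file: one JSON object per line.
sample = Path(tempfile.mkdtemp()) / "example.jsonl"
sample.write_text(
    '{"id": 1, "name": "alice"}\n\n{"id": 2, "name": "bob"}\n',
    encoding="utf-8",
)

entries = list(iter_json_lines(sample))
print(entries)
```

This does not work for a regular JSON file whose single document spans several lines; json.load remains the right tool there.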

14. Reading a compressed file line by line

If the file is gzip-compressed, you can use the gzip module:

import gzip
with gzip.open('example.txt.gz', 'rt') as file:
    for line in file:
        print(line.strip())

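gzip.open decompresses the file on the fly as it is read. The 'rt' mode opens it as text, so each line is a str; the default 'rb' mode returns bytes instead.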
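A self-contained round trip might look like the sketch below, which first writes a small compressed file and then reads it back line by line. The file name and contents are assumptions for illustration.

```python
import gzip
import tempfile
from pathlib import Path

# Hypothetical compressed file, created here for the demonstration.
archive = Path(tempfile.mkdtemp()) / "example.txt.gz"
with gzip.open(archive, "wt", encoding="utf-8") as file:
    file.write("first\nsecond\n")

# "rt" decodes the decompressed bytes into str, one line at a time.
with gzip.open(archive, "rt", encoding="utf-8") as file:
    lines = [line.rstrip("\n") for line in file]

print(lines)
```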

15. Reading a binary file line by line

For binary files, open the file in rb mode:

with open('example.bin', 'rb') as file:
    while True:
        line = file.readline()
        if not line:
            break
        print(line)

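In rb mode the file is read as raw bytes with no decoding. readline() still splits on the newline byte b'\n' and returns each line as a bytes object, so this loop only makes sense for binary data that actually contains newline-delimited records.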
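For binary data without meaningful newlines, reading fixed-size chunks is usually a better fit. The sketch below uses the two-argument form of iter(), which keeps calling file.read until it returns the sentinel b''. The helper name iter_chunks and the tiny chunk size are assumptions chosen to keep the example readable.

```python
import tempfile
from functools import partial
from pathlib import Path


def iter_chunks(path, chunk_size=4):
    """Yield successive blocks of at most chunk_size bytes."""
    with open(path, "rb") as file:
        # iter(callable, sentinel) stops when read() returns b"" at end of file.
        for chunk in iter(partial(file.read, chunk_size), b""):
            yield chunk


# Hypothetical binary file for the demonstration.
sample = Path(tempfile.mkdtemp()) / "example.bin"
sample.write_bytes(b"abcdefghij")

chunks = list(iter_chunks(sample))
print(chunks)
```

In practice a much larger chunk size, such as 64 KiB, would be typical.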

16. Reading a file line by line with the io module

The io module contains the implementation behind Python's file objects:

import io
with io.open('example.txt', 'r', encoding='utf-8') as file:
    for line in file:
        print(line.strip())

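In Python 3, io.open is an alias for the built-in open(), so it accepts exactly the same options, including encoding. It is mostly seen in code that also had to run on Python 2, where io.open provided the Python 3 behavior.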
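A more distinctive use of the io module is io.StringIO, which wraps an in-memory string in a file-like object that can be read line by line just like a real file. The sketch below also checks the alias relationship, which holds in CPython 3; the sample text is an assumption for illustration.

```python
import io

# In CPython 3 the built-in open and io.open are the same function object.
same_function = io.open is open

# StringIO behaves like a text file opened for reading.
buffer = io.StringIO("first\nsecond\n")
lines = [line.rstrip("\n") for line in buffer]

print(same_function, lines)
```

This is handy for testing line-processing code without touching the file system.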

17. Reading configuration files with configparser

The configparser module is designed for reading configuration files:

import configparser
config = configparser.ConfigParser()
config.read('example.ini')
for section in config.sections():
    print(section)
    for key, value in config.items(section):
        print(key, value)

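configparser is well suited to INI-style configuration files. It parses the file as a whole and returns every value as a string. Note that config.read() silently skips files it cannot find.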
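Since values come back as strings, the typed getters and the fallback argument are often useful. The sketch below parses an in-memory string with read_string() so that it is self-contained; the section and option names are hypothetical.

```python
import configparser

config = configparser.ConfigParser()
# Hypothetical INI content, parsed from a string instead of a file.
config.read_string("[server]\nhost = localhost\nport = 8080\n")

host = config.get("server", "host")
port = config.getint("server", "port")  # converted to int
# The option "timeout" is missing, so the fallback value is returned.
timeout = config.getint("server", "timeout", fallback=30)

print(host, port, timeout)
```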

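18. Reading YAML files with the yaml module

If the file is in YAML format, you can use the yaml module, which is provided by the third-party PyYAML package: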

import yaml
with open('example.yaml', 'r') as file:
    data = yaml.safe_load(file)
    for entry in data:
        print(entry)

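yaml.safe_load parses the YAML file into plain Python objects such as dicts, lists, and strings, and is the safer choice for input you do not fully trust.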

19. Reading XML files with xml.etree.ElementTree

The xml.etree.ElementTree module is suited to XML files:

import xml.etree.ElementTree as ET
tree = ET.parse('example.xml')
root = tree.getroot()
for child in root:
    print(child.tag, child.attrib)

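ElementTree provides a straightforward API for XML processing. ET.parse reads the whole document into memory, and iterating over root visits its direct child elements.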
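For XML files too large to hold in memory, ET.iterparse processes elements incrementally as they are parsed. This is a sketch under stated assumptions: the helper iter_item_names, the item tag, and the name attribute are made up for this example.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def iter_item_names(path):
    """Yield the name attribute of each <item> as soon as it is parsed."""
    for _event, element in ET.iterparse(str(path), events=("end",)):
        if element.tag == "item":
            yield element.get("name")
            element.clear()  # release the element's contents


# Hypothetical sample XML file for the demonstration.
sample = Path(tempfile.mkdtemp()) / "example.xml"
sample.write_text(
    "<items><item name='a'/><item name='b'/></items>", encoding="utf-8"
)

names = list(iter_item_names(sample))
print(names)
```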

20. Parsing HTML files with html.parser

The html.parser module is suited to HTML files:

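from html.parser import HTMLParser


class MyHTMLParser(HTMLParser):
    # Each handler is called by the parser as it meets the matching construct.
    def handle_starttag(self, tag, attrs):
        print("Start tag:", tag)

    def handle_endtag(self, tag):
        print("End tag  :", tag)

    def handle_data(self, data):
        print("Data     :", data)


parser = MyHTMLParser()
with open('example.html', 'r') as file:
    for line in file:
        parser.feed(line)  # feed() accepts partial input, one line at a time
parser.close()  # process any data still buffered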
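Instead of printing, a parser subclass can collect what it finds for later use. The sketch below gathers the href values of links while being fed one line at a time. The class name LinkCollector, the in-memory lines, and the example.com URLs are all assumptions for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs.
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


# Hypothetical HTML content, given as lines to mimic reading a file.
html_lines = [
    "<html><body>\n",
    '<a href="https://example.com/one">One</a>\n',
    '<a href="https://example.com/two">Two</a>\n',
    "</body></html>\n",
]

collector = LinkCollector()
for line in html_lines:
    collector.feed(line)
collector.close()

print(collector.links)
```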