How to count multi-line comment lines in Python

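To count the lines taken up by multi-line comments in Python code, you can combine file reading, string handling, and regular expressions. Strictly speaking, Python has no dedicated multi-line comment syntax; what is usually meant is a triple-quoted string literal (''' or """) that stands alone as a statement, such as a docstring. The steps are: read the file, match the triple-quoted blocks with a regular expression, and count the lines in each match.

The sections below walk through how to use a regular expression to identify multi-line comments and count their lines.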

1. Read the file contents

First, read the contents of the file. Python's built-in file handling is enough for this.

def read_file(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read()
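
If the script might declare an encoding other than UTF-8, the standard library's tokenize.open can pick the encoding from a BOM or a coding line at the top of the file. A minimal sketch, assuming that behaviour is wanted; read_source is a hypothetical name and not part of the original code:

```python
import tokenize

def read_source(file_path):
    # tokenize.open opens the file read-only and decodes it with the encoding
    # found in a BOM or a "# -*- coding: ... -*-" line, falling back to UTF-8
    with tokenize.open(file_path) as file:
        return file.read()
```

For files that are known to be UTF-8, the plain open call above behaves the same way.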

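2. Use a regular expression to identify multi-line comments

Multi-line comments in Python are usually written with triple quotes (''' or """). A regular expression can match them: the re.DOTALL flag lets . match newlines, and the non-greedy .*? stops at the first closing delimiter.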

import re
def find_multiline_comments(content):
    pattern = re.compile(r'\'\'\'(.*?)\'\'\'|\"\"\"(.*?)\"\"\"', re.DOTALL)
    return pattern.findall(content)
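
To make the return value concrete, here is a small sketch that runs the same pattern on an invented sample string. Because the pattern has two capture groups, findall returns one tuple per match and leaves the unused group empty:

```python
import re

# Same pattern as in find_multiline_comments, repeated so the snippet runs alone
pattern = re.compile(r'\'\'\'(.*?)\'\'\'|\"\"\"(.*?)\"\"\"', re.DOTALL)

# Invented sample: one '''...''' block spanning two lines, one """...""" block
sample = "'''first\nblock'''\nx = 1\n\"\"\"second\"\"\"\n"
matches = pattern.findall(sample)
print(matches)
# [('first\nblock', ''), ('', 'second')]
```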

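3. Count the comment lines

Once all multi-line comments are found, count the lines in each one by splitting its text on newlines. Because the pattern has two capture groups, each match is a tuple in which the unused group is an empty string.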

def count_comment_lines(comments):
    total_lines = 0
    for comment in comments:
        # Comment is a tuple, one element will be an empty string
        comment_text = comment[0] if comment[0] else comment[1]
        total_lines += len(comment_text.split('\n'))
    return total_lines
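
The choice of split('\n') over splitlines() matters when a comment body ends with a newline, that is, when the closing quotes sit on a line of their own. A short sketch with an invented body string shows the difference:

```python
body_with_trailing_newline = "Nested comment\n"

# split('\n') keeps the empty piece after the final newline, so the line that
# holds only the closing quotes is counted too
print(len(body_with_trailing_newline.split('\n')))    # 2
# splitlines() drops that trailing empty piece
print(len(body_with_trailing_newline.splitlines()))   # 1
# The split-based count is the number of newlines plus one
print(body_with_trailing_newline.count('\n') + 1)     # 2
```

With split('\n') the result equals the number of physical lines the block spans in the file.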

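4. Put the steps together

The steps above can be combined into a single script that goes from reading the file to reporting the total number of multi-line comment lines.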

import re
def read_file(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read()
def find_multiline_comments(content):
    pattern = re.compile(r'\'\'\'(.*?)\'\'\'|\"\"\"(.*?)\"\"\"', re.DOTALL)
    return pattern.findall(content)
def count_comment_lines(comments):
    total_lines = 0
    for comment in comments:
        comment_text = comment[0] if comment[0] else comment[1]
        total_lines += len(comment_text.split('\n'))
    return total_lines
def count_multiline_comment_lines(file_path):
    content = read_file(file_path)
    comments = find_multiline_comments(content)
    return count_comment_lines(comments)
if __name__ == "__main__":
    file_path = 'your_python_script.py'  # replace with the actual file path
    line_count = count_multiline_comment_lines(file_path)
    print(f"Total multi-line comment lines: {line_count}")

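5. Handle edge cases

In practice, some special cases come up, such as multi-line comments that appear to be nested, or triple quotes that occur inside a # comment or inside another string. Triple-quoted strings do not nest: the first matching closing delimiter ends the block. Handling these cases requires more careful matching and string-processing logic.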
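
One way to sidestep several of these cases is to let Python's own tokenizer find the string literals instead of a regular expression. The following is a sketch of that alternative, not the method used in the rest of this article; count_triple_quoted_lines is a hypothetical name, and the approach assumes the input is valid enough Python for tokenize to process (it raises an error otherwise):

```python
import io
import tokenize

def count_triple_quoted_lines(source):
    """Count physical lines spanned by triple-quoted string tokens."""
    total = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        # Strip an optional prefix such as r, b or u before checking the quotes
        literal = tok.string.lstrip('rRbBuUfF')
        if literal.startswith(("'''", '"""')):
            # tok.start and tok.end are (row, column) pairs with 1-based rows
            total += tok.end[0] - tok.start[0] + 1
    return total

# Invented sample: triple quotes inside a # comment and inside a normal string
source = "# a ''' inside an ordinary comment\n'''real\nblock'''\ns = \"'''\"\n"
print(count_triple_quoted_lines(source))  # 2
```

Only the real two-line block is counted here. A purely textual pattern would instead pair the ''' inside the # comment with the next ''' it finds.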

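6. Refine the regular expression

A regular expression cannot resolve every such case, but it can be made to handle escaped quotes inside a comment. One way is to combine non-greedy matching with a character set that excludes the backslash, so that an escaped character is always consumed as a pair.

def find_multiline_comments(content):
    # Consume backslash escapes as pairs so an escaped quote cannot close the block early
    body = r'((?:\\.|[^\\])*?)'
    pattern = re.compile("'''" + body + "'''" + '|' + '"""' + body + '"""', re.DOTALL)
    return pattern.findall(content)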
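
A pattern also cannot tell a comment-like block from a triple-quoted string that is assigned to a variable. If only standalone blocks (including docstrings) should count, the ast module can make that distinction. This is a sketch under the assumption that the file parses as valid Python 3.8 or later; count_standalone_string_lines is a hypothetical name:

```python
import ast

def count_standalone_string_lines(source):
    """Count lines of triple-quoted strings that stand alone as statements."""
    total = 0
    for node in ast.walk(ast.parse(source)):
        # An ast.Expr holding a str constant is a string used as a statement:
        # a docstring or a comment-like block, not an assigned value
        if (isinstance(node, ast.Expr)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)):
            segment = ast.get_source_segment(source, node.value) or ''
            if segment.lstrip('rRuU').startswith(("'''", '"""')):
                total += node.end_lineno - node.lineno + 1
    return total

# Invented sample: one assigned string (ignored) and one standalone block
source = 'text = """assigned\nvalue"""\n\n"""comment-like\nblock"""\n'
print(count_standalone_string_lines(source))  # 2
```

The trade-off is that ast.parse rejects files with syntax errors, whereas the regular expression works on any text.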

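7. Test and verify

After implementing the above, test it to confirm that the code counts multi-line comment lines correctly. A few unit test cases that cover the edge cases are a good starting point.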

def test_count_multiline_comment_lines():
    # Each case is (file content, expected number of lines spanned by triple-quoted blocks)
    test_cases = [
        ('''\'\'\'This is a test comment\'\'\'\nprint("Hello World")''', 1),
        ('''\"\"\"Another test comment\nWith multiple lines\"\"\"\nprint("Hello Again")''', 2),
        # Triple quotes do not nest: the pattern sees two separate blocks of 2 lines each
        ('''\'\'\'Nested comment\n\'\'\'Inner comment\'\'\'\nEnd of nested\'\'\'\nprint("Nested Test")''', 4),
    ]
    for content, expected in test_cases:
        with open('test_script.py', 'w', encoding='utf-8') as file:
            file.write(content)
        assert count_multiline_comment_lines('test_script.py') == expected

if __name__ == "__main__":
    test_count_multiline_comment_lines()
    print("All test cases passed")
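
The test above leaves test_script.py behind in the working directory. A variant that writes into a temporary directory avoids that. This is a sketch only: count_for_content is a hypothetical helper, and the counting function is restated in compact form so the snippet runs on its own:

```python
import os
import re
import tempfile

def count_multiline_comment_lines(file_path):
    # Compact restatement of the combined script so this snippet is self-contained
    with open(file_path, 'r', encoding='utf-8') as file:
        content = file.read()
    pattern = re.compile(r"'''(.*?)'''|\"\"\"(.*?)\"\"\"", re.DOTALL)
    return sum(len((a or b).split('\n')) for a, b in pattern.findall(content))

def count_for_content(content):
    # The temporary directory and the file in it are deleted when the block exits
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, 'sample.py')
        with open(path, 'w', encoding='utf-8') as file:
            file.write(content)
        return count_multiline_comment_lines(path)

# A few extra edge cases: empty file, no comments, three-line block
assert count_for_content('') == 0
assert count_for_content('x = 1\n') == 0
assert count_for_content('"""a\nb\nc"""\n') == 3
```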