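How to remove consecutive repeated words in Python

In Python, consecutive repeated words can be removed with basic string operations and a list. Split the text into a list of words, iterate over it, and compare each word with the one before it; append the word to a result list only when the two differ. This collapses each run of repeated words to a single occurrence while preserving the original word order.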

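1. Splitting the string and initializing variables

The first step is to split the text into a list of words, which Python's split() method does. Called without arguments, it splits on any run of whitespace. The loop also needs two variables initialized up front: a result list and a variable that holds the previous word.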

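def remove_consecutive_words(text):
    # Split on whitespace into a list of words
    words = text.split()
    result = []           # words to keep, in original order
    previous_word = None  # no word has been seen yet
    for word in words:
        # Keep a word only when it differs from the one just before it
        if word != previous_word:
            result.append(word)
        previous_word = word
    return ' '.join(result)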
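
As a small illustration of the splitting step, the sketch below shows how split() with no arguments treats mixed whitespace, and why joining the words back with a single space normalizes the spacing. The sample string is made up for this example.

```python
# str.split() with no arguments splits on any run of whitespace
# and ignores leading and trailing whitespace.
sample = "  this   is\tis a\ntest  "
words = sample.split()
print(words)  # ['this', 'is', 'is', 'a', 'test']

# Joining with a single space therefore collapses the original spacing.
normalized = " ".join(words)
print(normalized)  # this is is a test
```

One consequence is that the function does not preserve tabs, newlines, or repeated spaces from the input.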

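2. Iterating over the word list

While iterating, compare the current word with the previous one. If they differ, append the current word to the result list. In either case, update the previous word before moving on. As a result, only the first word of each consecutive run is kept.

Example code: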

def remove_consecutive_words(text):
    words = text.split()
    result = []
    previous_word = None
    for word in words:
        if word != previous_word:
            result.append(word)
        previous_word = word
    return ' '.join(result)

text = "this is is a test test text"
print(remove_consecutive_words(text))
# Output: this is a test text
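
The same result can also be sketched with itertools.groupby from the standard library, which groups runs of equal adjacent items. This is an alternative technique, not the loop shown above, and the function name remove_consecutive_words_groupby is chosen only for this example.

```python
from itertools import groupby


def remove_consecutive_words_groupby(text):
    # groupby yields one (key, group) pair per run of equal adjacent words,
    # so taking each key keeps exactly one word per run.
    return " ".join(key for key, _group in groupby(text.split()))


print(remove_consecutive_words_groupby("this is is a test test text"))
# this is a test text
```

The explicit loop is easier to extend with extra conditions, while groupby keeps the basic case to a single expression.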

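3. Handling edge cases

In practice the input may be an empty string, a single word, or the same word repeated throughout. The function should handle all of these without raising an error. The main loop already copes with these three cases, so the explicit checks below act as early exits that make the intent visible rather than being strictly required.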

def remove_consecutive_words(text):
    # Empty input: nothing to process
    if not text:
        return text
    words = text.split()
    # A single word cannot repeat, so return the input unchanged
    if len(words) == 1:
        return text
    result = []
    previous_word = None
    for word in words:
        if word != previous_word:
            result.append(word)
        previous_word = word
    return ' '.join(result)
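
To check the edge cases concretely, the sketch below runs the plain loop, without the early-exit checks, against a few inputs. The inputs and expected values in the cases dictionary are made up for this example.

```python
def remove_consecutive_words(text):
    # Plain loop version, with no special-case checks
    result = []
    previous_word = None
    for word in text.split():
        if word != previous_word:
            result.append(word)
        previous_word = word
    return " ".join(result)


# Each key is an input; each value is the result the loop should give.
cases = {
    "": "",                  # empty string
    "   ": "",               # whitespace only
    "hello": "hello",        # single word
    "same same same": "same",  # one word repeated throughout
    "a  a\tb": "a b",        # mixed whitespace between repeats
}

for text, expected in cases.items():
    actual = remove_consecutive_words(text)
    print(repr(text), "->", repr(actual), "ok" if actual == expected else "MISMATCH")
```

Every case should print ok, because an empty word list simply joins to an empty string.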

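4. Going further: removing specific words and ignoring case

Some use cases call for dropping particular words altogether, or for comparing words without regard to case. Both can be added as extensions of the original function.

Ignoring case: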

def remove_consecutive_words(text, ignore_case=False):
    if not text:
        return text
    words = text.split()
    if len(words) == 1:
        return text
    result = []
    previous_word = None
    for word in words:
        # Lowercase both sides for the comparison only; the output keeps the original spelling
        comparison_word = word.lower() if ignore_case else word
        comparison_previous_word = previous_word.lower() if ignore_case and previous_word else previous_word
        if comparison_word != comparison_previous_word:
            result.append(word)
        previous_word = word
    return ' '.join(result)

text = "This is is a Test test text"
print(remove_consecutive_words(text, ignore_case=True))
# Output: This is a Test text
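
For text that is not plain ASCII, lower() does not always make two spellings of the same word compare equal. A possible variant, sketched below, uses str.casefold() instead, which is designed for caseless matching. The function name remove_consecutive_words_casefold and the German sample sentence are assumptions made for this example.

```python
def remove_consecutive_words_casefold(text):
    result = []
    previous_key = None
    for word in text.split():
        # casefold() is a more aggressive lower(): for example "ß" becomes "ss"
        key = word.casefold()
        if key != previous_key:
            result.append(word)  # keep the original spelling in the output
        previous_key = key
    return " ".join(result)


print("Straße".lower() == "STRASSE".lower())        # False
print("Straße".casefold() == "STRASSE".casefold())  # True
print(remove_consecutive_words_casefold("Straße STRASSE ist ist lang"))
# Straße ist lang
```

Storing the folded key of the previous word also avoids folding it again on every iteration.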

Removing specific words:

def remove_consecutive_words(text, ignore_case=False, remove_words=None):
    if not text:
        return text
    words = text.split()
    if len(words) == 1:
        return text
    if remove_words:
        # Normalize the words to remove the same way as the words being compared
        remove_words = [word.lower() for word in remove_words] if ignore_case else remove_words
    result = []
    previous_word = None
    for word in words:
        comparison_word = word.lower() if ignore_case else word
        comparison_previous_word = previous_word.lower() if ignore_case and previous_word else previous_word
        # Keep the word only if it differs from the previous word and is not in remove_words
        if comparison_word != comparison_previous_word and (not remove_words or comparison_word not in remove_words):
            result.append(word)
        previous_word = word
    return ' '.join(result)

text = "this is is a test test text"
remove_words = ["is", "test"]
print(remove_consecutive_words(text, ignore_case=True, remove_words=remove_words))
# Output: this a text
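
One detail worth noting: the function above updates previous_word even for words it removes, so an input such as "a is a b" with remove_words=["is"] comes out as "a a b", where the removal has created a new adjacent repeat. If that is not wanted, a possible variant is to skip removed words before they can become the previous word. The sketch below assumes that behavior is the goal; the name filter_then_dedupe is hypothetical.

```python
def filter_then_dedupe(text, remove_words=(), ignore_case=False):
    # Choose how words are normalized for comparison
    normalize = str.lower if ignore_case else (lambda w: w)
    # A set gives fast membership checks for the words to drop
    blocked = {normalize(w) for w in remove_words}

    result = []
    previous_key = None
    for word in text.split():
        key = normalize(word)
        if key in blocked:
            continue  # dropped words never become the "previous" word
        if key != previous_key:
            result.append(word)
        previous_key = key
    return " ".join(result)


print(filter_then_dedupe("a is a b", remove_words=["is"]))
# a b
print(filter_then_dedupe("this is is a test test text", remove_words=["is", "test"]))
# this a text
```

Which behavior is right depends on whether repeats should be judged before or after the specific words are removed.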