Sometimes the goal isn't to remove duplicates but to remove every line that contains a specific word, domain, or pattern. This is line filtering, a different operation from deduplication. Here's how to do it.
Common Use Cases
- Remove all emails from a specific domain (e.g., all @competitor.com addresses)
- Filter out keywords containing a specific term
- Remove lines with numbers from a name list
- Keep only lines matching a pattern
Using remove-lines.com for Built-in Filters
remove-lines.com has several built-in content filters: Emails Only (keep only valid email addresses), Remove Numbers (remove lines containing digits), and Strip Domain (remove the @domain part from emails). These cover many common use cases without any scripting.
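For readers who want the same three operations in a script, here is a minimal Python sketch. It is not the site's implementation: the function names, the sample data, and the deliberately simple email pattern are all assumptions made for illustration, and the behavior is inferred only from the filter descriptions above.

```python
import re

# Deliberately simple email shape: something@something.tld, no spaces.
# This is an assumption for illustration, not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def emails_only(lines):
    """Keep only lines that look like a single email address."""
    return [line for line in lines if EMAIL_RE.match(line.strip())]


def remove_lines_with_digits(lines):
    """Drop every line that contains at least one digit."""
    return [line for line in lines if not re.search(r"\d", line)]


def strip_domain(lines):
    """Cut each line at the first '@', leaving the part before it."""
    return [line.split("@", 1)[0] for line in lines]


sample = ["alice@example.com", "not an email", "Bob 42", "carol@example.org"]
print(emails_only(sample))
print(remove_lines_with_digits(sample))
print(strip_domain(emails_only(sample)))
```

The functions take and return plain lists, so they can be chained in any order, as the last line does.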
Grep: The Command-Line Filter
For custom word/pattern filtering on Linux/Mac, grep is the standard tool:
# Remove lines containing "spam"
grep -vi "spam" input.txt > output.txt
# Keep only lines containing "@gmail.com" (-F matches the text literally, so "." is not a wildcard)
grep -iF "@gmail.com" input.txt > output.txt
# Remove lines matching a pattern
grep -vE "^[0-9]" input.txt > output.txt # remove lines starting with a number
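On systems without grep, the same three filters can be approximated in Python. The sketch below is a rough analogue rather than a reimplementation: `grep_lines` and its parameters are hypothetical names, and Python's `re` syntax differs from grep's extended regular expressions in some details.

```python
import re


def grep_lines(lines, pattern, invert=False, ignore_case=False):
    """Return lines that match `pattern`, or those that do not when invert=True.

    Rough analogue of grep -E, with invert standing in for -v
    and ignore_case for -i.
    """
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    return [line for line in lines if bool(regex.search(line)) != invert]


lines = ["Buy SPAM now", "hello@gmail.com", "42 apples", "plain text"]

# Remove lines containing "spam", ignoring case
print(grep_lines(lines, "spam", invert=True, ignore_case=True))
# Keep only lines containing the literal text "@gmail.com"
print(grep_lines(lines, re.escape("@gmail.com"), ignore_case=True))
# Remove lines starting with a number
print(grep_lines(lines, r"^[0-9]", invert=True))
```

`re.escape` is used for the literal search so that the dot in "@gmail.com" is not treated as a wildcard.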
Python: Custom Filter
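# Read the file into a list of lines (without line endings)
with open('input.txt') as f:
    lines = f.read().splitlines()

# Remove lines containing "unsubscribed" (case-insensitive)
filtered = [l for l in lines if 'unsubscribed' not in l.lower()]

# Write the remaining lines back out
with open('output.txt', 'w') as f:
    f.write('\n'.join(filtered))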
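For very large lists, a variant that processes one line at a time avoids loading the whole file into memory. This is a sketch under stated assumptions: the function name `filter_file`, its parameters, and the choice of UTF-8 encoding are not from the original script.

```python
def filter_file(src_path, dst_path, term):
    """Copy src to dst, skipping lines that contain `term` (case-insensitive).

    Returns the number of lines removed.
    """
    needle = term.lower()
    removed = 0
    # UTF-8 is an assumption; adjust the encoding to match your file.
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:  # one line at a time, line ending included
            if needle in line.lower():
                removed += 1
            else:
                dst.write(line)
    return removed
```

Calling `filter_file("input.txt", "output.txt", "unsubscribed")` would write the filtered list and return how many lines were dropped, which is a handy sanity check before using the result.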
Notepad++ with Regex
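In Notepad++, press Ctrl+H, select "Regular expression" under Search Mode and leave ". matches newline" unchecked, enter your pattern in the "Find what" field, leave "Replace with" empty, and click Replace All. To remove entire lines containing a word, use ^.*WORD.*\r?\n as the find pattern. The \r? lets the pattern match both Windows (CRLF) and Unix (LF) line endings, and including the line ending in the match means no blank line is left behind.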
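The same whole-line regex idea can be tried outside the editor. The Python sketch below mirrors the find-and-replace approach; `remove_matching_lines` is a hypothetical helper, and the `\Z` alternative is an addition so that a final line without a trailing line break is also removed.

```python
import re


def remove_matching_lines(text, word):
    """Delete every line of `text` that contains `word`, line break included."""
    # With re.MULTILINE, ^ anchors at the start of each line.
    # \r?\n consumes the line break (CRLF or LF); \Z covers a last line
    # that has no trailing line break.
    pattern = re.compile(
        r"^.*" + re.escape(word) + r".*(?:\r?\n|\Z)", re.MULTILINE
    )
    return pattern.sub("", text)


text = "keep me\r\ndrop WORD here\r\nkeep me too\r\n"
print(repr(remove_matching_lines(text, "WORD")))
```

`re.escape` treats the word as literal text; pass a raw pattern instead only if you intend regex syntax in the search term.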
Multiple Filter Conditions
To apply multiple filters (e.g., remove lines containing "test" OR "spam"), chain grep commands:
grep -vi "test" input.txt | grep -vi "spam" > output.txt
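The same OR logic can be sketched in Python by checking each line against a list of terms. The function name and sample data are illustrative assumptions. As with the grep commands, matching is by substring, so "contest" is removed because it contains "test".

```python
def remove_lines_with_any(lines, terms):
    """Drop lines containing any of `terms`, ignoring case."""
    needles = [t.lower() for t in terms]
    return [
        line for line in lines
        if not any(n in line.lower() for n in needles)
    ]


lines = ["Test account", "real@example.com", "SPAM offer", "contest winner"]
print(remove_lines_with_any(lines, ["test", "spam"]))
```

Adding another condition only means adding another entry to the list, instead of another command in the pipeline.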