Timing with random strings of ASCII printables:
from inspect import getsource
from random import sample
import re
from string import printable
from timeit import timeit

# Patterns compiled once, so the compilation cost is not paid on every call
pattern_single = re.compile(r'[\W]')
pattern_repeat = re.compile(r'[\W]+')
# Table that deletes every non-alphanumeric character among code points 0..255
translation_tb = str.maketrans('', '', ''.join(c for c in map(chr, range(256)) if not c.isalnum()))


def generate_test_string(length):
    # Random string of distinct ASCII printable characters
    return ''.join(sample(printable, length))


def main():
    for i in range(0, 60, 10):
        for test in [
            lambda: ''.join(c for c in generate_test_string(i) if c.isalnum()),
            lambda: ''.join(filter(str.isalnum, generate_test_string(i))),
            lambda: re.sub(r'[\W]', '', generate_test_string(i)),
            lambda: re.sub(r'[\W]+', '', generate_test_string(i)),
            lambda: pattern_single.sub('', generate_test_string(i)),
            lambda: pattern_repeat.sub('', generate_test_string(i)),
            lambda: generate_test_string(i).translate(translation_tb),
        ]:
            print(timeit(test), i, getsource(test).lstrip(' lambda: ').rstrip(',\n'), sep='\t')


if __name__ == '__main__':
    main()
Result (Python 3.7):
Time                Length  Code
6.3716264850008880  00      ''.join(c for c in generate_test_string(i) if c.isalnum())
5.7285426190064750  00      ''.join(filter(str.isalnum, generate_test_string(i)))
8.1875841680011940  00      re.sub(r'[\W]', '', generate_test_string(i))
8.0002205439959650  00      re.sub(r'[\W]+', '', generate_test_string(i))
5.5290945199958510  00      pattern_single.sub('', generate_test_string(i))
5.4417179649972240  00      pattern_repeat.sub('', generate_test_string(i))
4.6772285089973590  00      generate_test_string(i).translate(translation_tb)
23.574712151996210  10      ''.join(c for c in generate_test_string(i) if c.isalnum())
22.829975890002970  10      ''.join(filter(str.isalnum, generate_test_string(i)))
27.210196289997840  10      re.sub(r'[\W]', '', generate_test_string(i))
27.203713296003116  10      re.sub(r'[\W]+', '', generate_test_string(i))
24.008979928999906  10      pattern_single.sub('', generate_test_string(i))
23.945240008994006  10      pattern_repeat.sub('', generate_test_string(i))
21.830899796994345  10      generate_test_string(i).translate(translation_tb)
38.731336012999236  20      ''.join(c for c in generate_test_string(i) if c.isalnum())
37.942474347000825  20      ''.join(filter(str.isalnum, generate_test_string(i)))
42.169366310001350  20      re.sub(r'[\W]', '', generate_test_string(i))
41.933375883003464  20      re.sub(r'[\W]+', '', generate_test_string(i))
38.899814646996674  20      pattern_single.sub('', generate_test_string(i))
38.636144253003295  20      pattern_repeat.sub('', generate_test_string(i))
36.201238164998360  20      generate_test_string(i).translate(translation_tb)
49.377356811004574  30      ''.join(c for c in generate_test_string(i) if c.isalnum())
48.408927293996385  30      ''.join(filter(str.isalnum, generate_test_string(i)))
53.901889764994850  30      re.sub(r'[\W]', '', generate_test_string(i))
52.130339455994545  30      re.sub(r'[\W]+', '', generate_test_string(i))
50.061149017004940  30      pattern_single.sub('', generate_test_string(i))
49.366573111998150  30      pattern_repeat.sub('', generate_test_string(i))
46.649754120997386  30      generate_test_string(i).translate(translation_tb)
63.107938601999194  40      ''.join(c for c in generate_test_string(i) if c.isalnum())
65.116287978999030  40      ''.join(filter(str.isalnum, generate_test_string(i)))
71.477421126997800  40      re.sub(r'[\W]', '', generate_test_string(i))
66.027950693998720  40      re.sub(r'[\W]+', '', generate_test_string(i))
63.315361931003280  40      pattern_single.sub('', generate_test_string(i))
62.342320287003530  40      pattern_repeat.sub('', generate_test_string(i))
58.249303059004890  40      generate_test_string(i).translate(translation_tb)
73.810345625002810  50      ''.join(c for c in generate_test_string(i) if c.isalnum())
72.593953348005020  50      ''.join(filter(str.isalnum, generate_test_string(i)))
76.048324580995540  50      re.sub(r'[\W]', '', generate_test_string(i))
75.106637657001560  50      re.sub(r'[\W]+', '', generate_test_string(i))
74.681338128997600  50      pattern_single.sub('', generate_test_string(i))
72.430461594005460  50      pattern_repeat.sub('', generate_test_string(i))
69.394243567003290  50      generate_test_string(i).translate(translation_tb)
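str.maketrans & str.translate is fastest, but the table built above only covers code points 0 to 255, so every character beyond that range is left in the string. re.compile & pattern.sub is slower, though in these runs it stays roughly on par with ''.join & filter and sometimes edges ahead.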
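The caveat about the translation table can be checked directly. The following is a minimal sketch that rebuilds the same table as the benchmark; the names sample_text, via_translate and via_filter are illustrative and not part of the original answer.

```python
# Same construction as in the benchmark: delete every non-alphanumeric
# character among code points 0..255.
translation_tb = str.maketrans(
    '', '', ''.join(c for c in map(chr, range(256)) if not c.isalnum())
)

# Illustrative input: a hyphen, an en dash (U+2013) and a euro sign (U+20AC).
sample_text = 'a-b\u2013c\u20ac1'

via_translate = sample_text.translate(translation_tb)
via_filter = ''.join(filter(str.isalnum, sample_text))

print(repr(via_translate))  # the en dash and euro sign are not in the table, so they survive
print(repr(via_filter))     # only letters and digits remain
```

If the input may contain characters above code point 255, the filter or regular-expression variants are the safer choice, at the cost of some speed.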
Created: May-28, 2021
Alphanumeric characters are the 26 letters of the alphabet together with the digits 0 to 9. Non-alphanumeric characters are all characters that are not letters or digits, like + and @.
In this tutorial, we will discuss how to remove non-alphanumeric characters from a string in Python.
Use the isalnum() Method to Remove All Non-Alphanumeric Characters in Python String
We can use the isalnum() method to check whether a given character or string is alphanumeric. We check each character of the string individually and, if it is alphanumeric, keep it; the join() function then combines the kept characters into a new string.
For example,
string_value = "alphanumeric@123__"
s = ''.join(ch for ch in string_value if ch.isalnum())
print(s)
Output:
alphanumeric123
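Note that str.isalnum() is Unicode-aware, so accented letters also count as alphanumeric. The sketch below contrasts that behavior with an ASCII-only variant; the helper names keep_alnum and keep_ascii_alnum are hypothetical, and str.isascii() assumes Python 3.7 or later.

```python
def keep_alnum(text):
    """Keep every character that str.isalnum() accepts (Unicode-aware)."""
    return ''.join(ch for ch in text if ch.isalnum())


def keep_ascii_alnum(text):
    """Keep only ASCII letters and digits (str.isascii needs Python 3.7+)."""
    return ''.join(ch for ch in text if ch.isascii() and ch.isalnum())


string_value = "na\u00efve caf\u00e9 #42"  # "naive cafe #42" with accented i and e

print(keep_alnum(string_value))        # accented letters are kept
print(keep_ascii_alnum(string_value))  # accented letters are dropped
```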
Use the filter() Function to Remove All Non-Alphanumeric Characters in Python String
The filter() function constructs an iterator from those elements of an iterable for which a given function returns true.
For our problem, the string is the iterable, and we pass str.isalnum as the function, so each character is checked individually. The join() function then combines the remaining characters and returns a string.
For example,
string_value = "alphanumeric@123__"
s = ''.join(filter(str.isalnum, string_value))
print(s)
Output:
alphanumeric123
This method works in Python 3, where filter() returns an iterator rather than a string, which is why the result is passed to join().
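In Python 3, filter() returns a lazy iterator rather than a string, and join() is what turns it back into text. A small sketch of that behavior, reusing the example string from above (the variable name filtered is illustrative):

```python
string_value = "alphanumeric@123__"

filtered = filter(str.isalnum, string_value)
# In Python 3 this is a lazy filter object, not a string.
print(type(filtered).__name__)

s = ''.join(filtered)
print(s)

# The iterator is exhausted after one pass, so joining it again gives ''.
second_pass = ''.join(filtered)
print(repr(second_pass))
```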
Use Regular Expressions to Remove All Non-Alphanumeric Characters in Python String
A regular expression is a special sequence of characters that describes a pattern, which lets you match strings or sets of strings using a dedicated syntax. To use regular expressions, we import the re module.
We can use the sub() function from this module to replace every substring that matches non-alphanumeric characters with an empty string.
For example,
import re
string_value = "alphanumeric@123__"
s = re.sub(r'[\W_]+', '', string_value)
print(s)
Output:
alphanumeric123
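The underscore inside the character class matters: \W alone treats _ as a word character and leaves it in place, and by default \W is Unicode-aware. The sketch below compares the variants on an illustrative input; the variable names are assumptions, while re.ASCII is a standard flag of the re module.

```python
import re

string_value = "alphanumeric@123__ caf\u00e9"  # ends with an accented e

keeps_underscore = re.sub(r'\W+', '', string_value)
drops_underscore = re.sub(r'[\W_]+', '', string_value)
ascii_only = re.sub(r'[\W_]+', '', string_value, flags=re.ASCII)

print(keeps_underscore)  # underscores survive, because _ is a word character
print(drops_underscore)  # underscores removed, accented letter kept
print(ascii_only)        # with re.ASCII the accented letter is removed too
```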
Alternatively, we can use the following pattern, which keeps only ASCII letters and digits.
import re
string_value = "alphanumeric@123__"
s = re.sub(r'[^a-zA-Z0-9]', '', string_value)
print(s)
Output:
alphanumeric123
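When the same pattern is applied to many strings, it can be compiled once and reused. The following is a minimal sketch under that assumption; the names NON_ALNUM_ASCII and strip_non_alnum are hypothetical and not from the article.

```python
import re

# Compiled once at import time and reused for every call.
NON_ALNUM_ASCII = re.compile(r'[^a-zA-Z0-9]+')


def strip_non_alnum(text):
    """Remove every run of characters other than ASCII letters and digits."""
    return NON_ALNUM_ASCII.sub('', text)


for value in ["alphanumeric@123__", "user-name_01", "caf\u00e9 #7"]:
    print(strip_non_alnum(value))
```

The trailing + in the pattern matches a whole run of unwanted characters at once, which does not change the result but means fewer individual substitutions.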