How do you select only alpha characters in python?

Timing with random strings of ASCII printables:

from inspect import getsource
from random import sample
import re
from string import printable
from timeit import timeit

pattern_single = re.compile(r'[\W]')
pattern_repeat = re.compile(r'[\W]+')
translation_tb = str.maketrans('', '', ''.join(c for c in map(chr, range(256)) if not c.isalnum()))


def generate_test_string(length):
    return ''.join(sample(printable, length))


def main():
    for i in range(0, 60, 10):
        for test in [
            lambda: ''.join(c for c in generate_test_string(i) if c.isalnum()),
            lambda: ''.join(filter(str.isalnum, generate_test_string(i))),
            lambda: re.sub(r'[\W]', '', generate_test_string(i)),
            lambda: re.sub(r'[\W]+', '', generate_test_string(i)),
            lambda: pattern_single.sub('', generate_test_string(i)),
            lambda: pattern_repeat.sub('', generate_test_string(i)),
            lambda: generate_test_string(i).translate(translation_tb),

        ]:
            print(timeit(test), i, getsource(test).lstrip('            lambda: ').rstrip(',\n'), sep='\t')


if __name__ == '__main__':
    main()

Result (Python 3.7):

       Time       Length                           Code                           
6.3716264850008880  00  ''.join(c for c in generate_test_string(i) if c.isalnum())
5.7285426190064750  00  ''.join(filter(str.isalnum, generate_test_string(i)))
8.1875841680011940  00  re.sub(r'[\W]', '', generate_test_string(i))
8.0002205439959650  00  re.sub(r'[\W]+', '', generate_test_string(i))
5.5290945199958510  00  pattern_single.sub('', generate_test_string(i))
5.4417179649972240  00  pattern_repeat.sub('', generate_test_string(i))
4.6772285089973590  00  generate_test_string(i).translate(translation_tb)
23.574712151996210  10  ''.join(c for c in generate_test_string(i) if c.isalnum())
22.829975890002970  10  ''.join(filter(str.isalnum, generate_test_string(i)))
27.210196289997840  10  re.sub(r'[\W]', '', generate_test_string(i))
27.203713296003116  10  re.sub(r'[\W]+', '', generate_test_string(i))
24.008979928999906  10  pattern_single.sub('', generate_test_string(i))
23.945240008994006  10  pattern_repeat.sub('', generate_test_string(i))
21.830899796994345  10  generate_test_string(i).translate(translation_tb)
38.731336012999236  20  ''.join(c for c in generate_test_string(i) if c.isalnum())
37.942474347000825  20  ''.join(filter(str.isalnum, generate_test_string(i)))
42.169366310001350  20  re.sub(r'[\W]', '', generate_test_string(i))
41.933375883003464  20  re.sub(r'[\W]+', '', generate_test_string(i))
38.899814646996674  20  pattern_single.sub('', generate_test_string(i))
38.636144253003295  20  pattern_repeat.sub('', generate_test_string(i))
36.201238164998360  20  generate_test_string(i).translate(translation_tb)
49.377356811004574  30  ''.join(c for c in generate_test_string(i) if c.isalnum())
48.408927293996385  30  ''.join(filter(str.isalnum, generate_test_string(i)))
53.901889764994850  30  re.sub(r'[\W]', '', generate_test_string(i))
52.130339455994545  30  re.sub(r'[\W]+', '', generate_test_string(i))
50.061149017004940  30  pattern_single.sub('', generate_test_string(i))
49.366573111998150  30  pattern_repeat.sub('', generate_test_string(i))
46.649754120997386  30  generate_test_string(i).translate(translation_tb)
63.107938601999194  40  ''.join(c for c in generate_test_string(i) if c.isalnum())
65.116287978999030  40  ''.join(filter(str.isalnum, generate_test_string(i)))
71.477421126997800  40  re.sub(r'[\W]', '', generate_test_string(i))
66.027950693998720  40  re.sub(r'[\W]+', '', generate_test_string(i))
63.315361931003280  40  pattern_single.sub('', generate_test_string(i))
62.342320287003530  40  pattern_repeat.sub('', generate_test_string(i))
58.249303059004890  40  generate_test_string(i).translate(translation_tb)
73.810345625002810  50  ''.join(c for c in generate_test_string(i) if c.isalnum())
72.593953348005020  50  ''.join(filter(str.isalnum, generate_test_string(i)))
76.048324580995540  50  re.sub(r'[\W]', '', generate_test_string(i))
75.106637657001560  50  re.sub(r'[\W]+', '', generate_test_string(i))
74.681338128997600  50  pattern_single.sub('', generate_test_string(i))
72.430461594005460  50  pattern_repeat.sub('', generate_test_string(i))
69.394243567003290  50  generate_test_string(i).translate(translation_tb)

str.maketrans & str.translate is fastest, but includes all non-ASCII characters. re.compile & pattern.sub is slower, but is somehow faster than ''.join & filter.

  1. HowTo
  2. Python How-To's
  3. Remove Non-Alphanumeric Characters From Python String

Created: May-28, 2021

  1. Use the isalnum() Method to Remove All Non-Alphanumeric Characters in Python String
  2. Use the filter() Function to Remove All Non-Alphanumeric Characters in Python String
  3. Use Regular Expressions to Remove All Non-Alphanumeric Characters in Python String

Alphanumeric characters contain the blend of the 26 characters of the letter set and the numbers 0 to 9. Non-alphanumeric characters include characters that are not letters or digits, like + and @.

In this tutorial, we will discuss how to remove non-alphanumeric characters from a string in Python.

Use the isalnum() Method to Remove All Non-Alphanumeric Characters in Python String

We can use the isalnum() method to check whether a given character or string is alphanumeric or not. We can compare each character individually from a string, and if it is alphanumeric, then we combine it using the join() function.

For example,

string_value = "alphanumeric@123__"
s = ''.join(ch for ch in string_value if ch.isalnum())
print(s)

Output:

alphanumeric123

Use the filter() Function to Remove All Non-Alphanumeric Characters in Python String

The filter() function is used to construct an iterator from components of the iterable object and filters the object’s elements using a function.

For our problem, the string is our object, and we will use the isalnum() function, which checks whether a given string contains alphanumeric characters or not by checking each character. The join() function combines all the characters to return a string.

For example,

string_value = "alphanumeric@123__"
s = ''.join(filter(str.isalnum, string_value))
print(s)

Output:

alphanumeric123

This method does not work with Python 3.

Use Regular Expressions to Remove All Non-Alphanumeric Characters in Python String

A regular expression is an exceptional grouping of characters that helps you match different strings or sets of strings, utilizing a specific syntax in a pattern. To use regular expressions, we import the re module.

We can use the sub() function from this module to replace all the string that matches a non-alphanumeric character by an empty character.

For example,

import re
string_value = "alphanumeric@123__"
s=re.sub(r'[\W_]+', '', string_value)
print(s)

Output:

alphanumeric123

Alternatively, we can also use the following pattern.

import re
string_value = "alphanumeric@123__"
s = re.sub(r'[^a-zA-Z0-9]', '', string_value)
print(s)

Output:

alphanumeric123

Write for us

DelftStack articles are written by software geeks like you. If you also would like to contribute to DelftStack by writing paid articles, you can check the write for us page.

Related Article - Python String

  • Remove Commas From String in Python
  • Check a String Is Empty in a Pythonic Way
  • Convert a String to Variable Name in Python
  • Remove Whitespace From a String in Python
  • How do you select only alpha characters in python?

    How do you only find alphanumeric characters in Python?

    Python string isalnum() function returns True if it's made of alphanumeric characters only. A character is alphanumeric if it's either an alpha or a number. If the string is empty, then isalnum() returns False .

    How do you find alphanumeric values in Python?

    Python String isalnum() Method The isalnum() method returns True if all the characters are alphanumeric, meaning alphabet letter (a-z) and numbers (0-9). Example of characters that are not alphanumeric: (space)!

    How do you make a string only alphanumeric in Python?

    Using string isalnum() and string join() functions You can use the string isalnum() function along with the string join() function to create a string with only alphanumeric characters. You can see that the resulting string doesn't have any non alphanumeric characters.

    How do I ignore non alphabetic characters in Python?

    Use the isalnum() Method to Remove All Non-Alphanumeric Characters in Python String. We can use the isalnum() method to check whether a given character or string is alphanumeric or not. We can compare each character individually from a string, and if it is alphanumeric, then we combine it using the join() function.