How do you split text into chunks in Python?

Contents

  • Introduction
  • Sample Code Snippet
  • Example 1: Split String into Chunks
  • Example 2: Split String by Length
  • Example 3: Split String with 0 Chunk Length
  • Example 4: Split String into Chunks using While Loop
  • Summary

To split a string into chunks of a specific length, use a list comprehension over the string. All the chunks are returned as a list.

We can also use a while loop to split a string into chunks of a specific length.

In this tutorial, we shall learn how to split a string into chunks of a specific length, with the help of detailed example Python programs.

Sample Code Snippet

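The following is a quick code snippet that splits a given string str into chunks of a specific length n using a list comprehension. (Naming a variable str shadows the built-in type; it is used here only to keep the examples short.)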

n = 3 # chunk length
chunks = [str[i:i+n] for i in range(0, len(str), n)]

Example 1: Split String into Chunks

In this example, we take a string str and split it into chunks of length 3 using a list comprehension.

Python Program

str = 'CarBadBoxNumKeyValRayCppSan'

n = 3
chunks = [str[i:i+n] for i in range(0, len(str), n)]
print(chunks)

Output

['Car', 'Bad', 'Box', 'Num', 'Key', 'Val', 'Ray', 'Cpp', 'San']

The string is split into a list of strings, each of the specified length, i.e., 3. You can try different lengths and different string values.

Example 2: Split String by Length

In this example we will split a string into chunks of length 4. Also, we have taken a string such that its length is not exactly divisible by chunk length. In that case, the last chunk contains characters whose count is less than the chunk size we provided.

Python Program

str = 'Welcome to Python Examples'

n = 4
chunks = [str[i:i+n] for i in range(0, len(str), n)]
print(chunks)

Output

['Welc', 'ome ', 'to P', 'ytho', 'n Ex', 'ampl', 'es']

Example 3: Split String with 0 Chunk Length

In this example, we test a negative scenario with a chunk size of 0 and check the output. The range() function raises ValueError if zero is given as its third argument.

Python Program

str = 'Welcome to Python Examples'

#chunk size
n = 0

chunks = [str[i:i+n] for i in range(0, len(str), n)]
print(chunks)

Output

Traceback (most recent call last):
  File "example1.py", line 4, in <module>
    chunks = [str[i:i+n] for i in range(0, len(str), n)]
ValueError: range() arg 3 must not be zero

Chunk length must not be zero, and hence we got a ValueError for range().
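
One way to make this failure easier to diagnose is to wrap the list comprehension in a small helper that checks the chunk length up front. The sketch below is only illustrative: the function name split_chunks and its error message are assumptions, not part of the examples above.

```python
def split_chunks(text, n):
    """Return consecutive chunks of text, each at most n characters long."""
    # Hypothetical helper: reject zero and negative lengths before range() sees them.
    if n <= 0:
        raise ValueError("chunk length must be a positive integer")
    return [text[i:i + n] for i in range(0, len(text), n)]


print(split_chunks('Welcome to Python Examples', 4))
# ['Welc', 'ome ', 'to P', 'ytho', 'n Ex', 'ampl', 'es']
```

The explicit check also covers negative lengths, for which range() would otherwise quietly produce an empty list.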

Example 4: Split String into Chunks using While Loop

In this example, we split a string into chunks using a Python while loop.

Python Program

str = 'Welcome to Python Examples'
n = 5

chunks = []

i = 0
while i < len(str):
    if i+n < len(str):
        chunks.append(str[i:i+n])
    else:
        chunks.append(str[i:len(str)])
    i += n
print(chunks)

Output

['Welco', 'me to', ' Pyth', 'on Ex', 'ample', 's']
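
Because Python slicing stops at the end of the string when the end index runs past it, the if/else branch in the loop above is not strictly required. A shorter sketch of the same loop, relying on that behaviour (the variable name text is chosen here for illustration):

```python
text = 'Welcome to Python Examples'
n = 5

chunks = []
i = 0
while i < len(text):
    # text[i:i + n] simply ends at the last character when i + n is too large
    chunks.append(text[i:i + n])
    i += n

print(chunks)
# ['Welco', 'me to', ' Pyth', 'on Ex', 'ample', 's']
```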

Summary

In this tutorial of Python Examples, we learned how to split a string by length in Python with the help of detailed examples.

Is it possible to split a string every nth character?

For example, suppose I have a string containing the following:

'1234567890'

How can I get it to look like this:

['12','34','56','78','90']

Georgy

asked Feb 28, 2012 at 1:48

>>> line = '1234567890'
>>> n = 2
>>> [line[i:i+n] for i in range(0, len(line), n)]
['12', '34', '56', '78', '90']

answered Feb 28, 2012 at 2:02

Just to be complete, you can do this with a regex:

>>> import re
>>> re.findall('..','1234567890')
['12', '34', '56', '78', '90']

For odd number of chars you can do this:

>>> import re
>>> re.findall('..?', '123456789')
['12', '34', '56', '78', '9']

You can also do the following, to simplify the regex for longer chunks:

>>> import re
>>> re.findall('.{1,2}', '123456789')
['12', '34', '56', '78', '9']

And you can use re.finditer if the string is long to generate chunk by chunk.
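
A minimal sketch of that re.finditer idea follows. The generator name iter_chunks is hypothetical, and the re.DOTALL flag is an addition made here so that newline characters are not skipped.

```python
import re


def iter_chunks(text, n):
    """Lazily yield chunks of up to n characters using re.finditer."""
    # re.DOTALL lets '.' match newlines too, so no character is left out.
    pattern = re.compile('.{1,%d}' % n, re.DOTALL)
    for match in pattern.finditer(text):
        yield match.group(0)


print(list(iter_chunks('123456789', 2)))
# ['12', '34', '56', '78', '9']
```

Since the chunks are produced one at a time, the full list never has to be held in memory unless the caller asks for it.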

Georgy

answered Feb 28, 2012 at 6:31

the wolf

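Python's standard library already has a function that handles this case, textwrap.wrap. Note that wrap() is meant for wrapping prose: it breaks at whitespace and drops whitespace at line ends, so it returns exact fixed-width chunks only for strings that contain no whitespace.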

>>> from textwrap import wrap
>>> s = '1234567890'
>>> wrap(s, 2)
['12', '34', '56', '78', '90']

This is what the docstring for wrap says:

>>> help(wrap)
'''
Help on function wrap in module textwrap:

wrap(text, width=70, **kwargs)
    Wrap a single paragraph of text, returning a list of wrapped lines.

    Reformat the single paragraph in 'text' so it fits in lines of no
    more than 'width' columns, and return a list of wrapped lines.  By
    default, tabs in 'text' are expanded with string.expandtabs(), and
    all other whitespace characters (including newline) are converted to
    space.  See TextWrapper class for available keyword args to customize
    wrapping behaviour.
'''
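
As the docstring suggests, wrap() treats its input as a paragraph of words rather than a raw character sequence. The short comparison below illustrates the difference on a string that contains spaces; the sample string and variable names are chosen here for illustration.

```python
from textwrap import wrap

text = 'ab cd ef'

# wrap() breaks at whitespace and drops it at the ends of lines
wrapped = wrap(text, 3)
print(wrapped)
# ['ab', 'cd', 'ef']

# plain slicing keeps every character, spaces included
sliced = [text[i:i + 3] for i in range(0, len(text), 3)]
print(sliced)
# ['ab ', 'cd ', 'ef']
```

For input without whitespace, such as '1234567890', both approaches give the same chunks.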

answered Feb 19, 2018 at 6:57

Another common way of grouping elements into n-length groups (on Python 3, map() returns an iterator, so it is wrapped in list() here):

>>> s = '1234567890'
>>> list(map(''.join, zip(*[iter(s)]*2)))
['12', '34', '56', '78', '90']

This method comes straight from the docs for zip().
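
One caveat with this idiom is that zip() stops as soon as any of its inputs is exhausted, so an incomplete final group is discarded. A small illustration with a nine-character string chosen for the purpose:

```python
s = '123456789'  # nine characters, so the last pair is incomplete

pairs = list(map(''.join, zip(*[iter(s)] * 2)))
print(pairs)
# ['12', '34', '56', '78']   (the trailing '9' is silently dropped)
```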

answered Feb 28, 2012 at 2:25

Andrew Clark

I think this is shorter and more readable than the itertools version:

def split_by_n(seq, n):
    '''A generator to divide a sequence into chunks of n units.'''
    while seq:
        yield seq[:n]
        seq = seq[n:]

print(list(split_by_n('1234567890', 2)))
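
Rebinding seq = seq[n:] copies the rest of the sequence on every step, which can add up for very long strings. An index-based variant avoids the repeated copies; the name split_by_n_indexed is hypothetical, and unlike the generator above it needs a sequence that supports len() and slicing.

```python
def split_by_n_indexed(seq, n):
    """Yield chunks of n units from seq without re-copying the remainder."""
    for start in range(0, len(seq), n):
        yield seq[start:start + n]


print(list(split_by_n_indexed('1234567890', 2)))
# ['12', '34', '56', '78', '90']
```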

answered Feb 28, 2012 at 1:53

Russell Borogove

Using more-itertools from PyPI:

>>> from more_itertools import sliced
>>> list(sliced('1234567890', 2))
['12', '34', '56', '78', '90']

answered Jun 22, 2017 at 10:19

Tim Diels

I like this solution:

s = '1234567890'
o = []
while s:
    o.append(s[:2])
    s = s[2:]

answered Sep 12, 2015 at 23:14

vlk

You could use the grouper() recipe from itertools:

Python 2.x:

from itertools import izip_longest    

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

Python 3.x:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

These functions are memory-efficient and work with any iterables.
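
Note that grouper() yields tuples of characters rather than strings. To get string chunks, each tuple can be joined; passing fillvalue='' is one way to keep a shorter final chunk without padding characters. A sketch for Python 3, repeating the recipe so the example is self-contained:

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# fillvalue='' pads the last group with empty strings, so joining it
# leaves a shorter final chunk instead of visible padding.
chunks = [''.join(group) for group in grouper('123456789', 2, fillvalue='')]
print(chunks)
# ['12', '34', '56', '78', '9']
```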

answered Oct 3, 2015 at 20:16

Eugene Yarmash

This can be achieved by a simple for loop.

a = '1234567890a'
result = []

for i in range(0, len(a), 2):
    result.append(a[i : i + 2])
print(result)

The output looks like ['12', '34', '56', '78', '90', 'a']

answered May 22, 2020 at 18:02

Kasem777

I was stuck in the same scenario.

This worked for me:

x = "1234567890"
n = 2
chunks = []  # avoid naming this variable "list", which would shadow the built-in
for i in range(0, len(x), n):
    chunks.append(x[i:i+n])
print(chunks)

Output

['12', '34', '56', '78', '90']

answered Nov 28, 2019 at 14:54

Strick

Try the following code:

from itertools import islice

def split_every(n, iterable):
    i = iter(iterable)
    piece = list(islice(i, n))
    while piece:
        yield piece
        piece = list(islice(i, n))

s = '1234567890'
# print() is a function on Python 3; each piece is a list of characters, e.g. ['1', '2']
print(list(split_every(2, list(s))))

answered Feb 28, 2012 at 1:52

enderskill

Try this:

s='1234567890'
print([s[idx:idx+2] for idx,val in enumerate(s) if idx%2 == 0])

Output:

['12', '34', '56', '78', '90']

answered Jul 10, 2018 at 3:46

U12-Forward

>>> from functools import reduce
>>> from operator import add
>>> # On Python 3 the built-in zip is already lazy, so itertools.izip (Python 2 only) is not needed
>>> x = iter('1234567890')
>>> [reduce(add, tup) for tup in zip(x, x)]
['12', '34', '56', '78', '90']
>>> x = iter('1234567890')
>>> [reduce(add, tup) for tup in zip(x, x, x)]
['123', '456', '789']

answered Feb 28, 2012 at 1:56

ben w

As always, for those who love one-liners:

n = 2  
line = "this is a line split into n characters"  
line = [line[i * n:i * n+n] for i,blah in enumerate(line[::n])]

answered May 20, 2016 at 20:00

Sqripter

more_itertools.sliced has been mentioned before. Here are four more options from the more_itertools library:

import more_itertools as mit

s = "1234567890"

# recent releases take (iterable, n); older ones used (n, iterable)
["".join(c) for c in mit.grouper(s, 2)]

["".join(c) for c in mit.chunked(s, 2)]

["".join(c) for c in mit.windowed(s, 2, step=2)]

["".join(c) for c in mit.split_after(s, lambda x: int(x) % 2 == 0)]

Each of these options produces the following output:

['12', '34', '56', '78', '90']

Documentation for discussed options: grouper, chunked, windowed, split_after

answered Feb 9, 2018 at 1:16

pylang

A simple recursive solution for short strings (note that a trailing piece shorter than n is dropped):

def split(s, n):
    if len(s) < n:
        return []
    else:
        return [s[:n]] + split(s[n:], n)

print(split('1234567890', 2))

Or in such a form:

def split(s, n):
    if len(s) < n:
        return []
    elif len(s) == n:
        return [s]
    else:
        return split(s[:n], n) + split(s[n:], n)

This form illustrates the typical divide-and-conquer pattern of the recursive approach more explicitly, though in practice it is not necessary to do it this way.
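
If the shorter final piece should be kept rather than dropped, the base case can test for an empty string instead. This is a sketch under that assumption; the name split_keep_rest is hypothetical, n must be positive, and Python's recursion limit still restricts it to fairly short strings.

```python
def split_keep_rest(s, n):
    """Recursively split s into chunks of n, keeping a shorter final chunk."""
    if not s:
        return []
    return [s[:n]] + split_keep_rest(s[n:], n)


print(split_keep_rest('123456789', 2))
# ['12', '34', '56', '78', '9']
```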

answered Oct 22, 2018 at 10:25

englealuze

A solution with groupby:

from itertools import groupby, chain, repeat, cycle

text = "wwworldggggreattecchemggpwwwzaz"
n = 3
c = cycle(chain(repeat(0, n), repeat(1, n)))
res = ["".join(g) for _, g in groupby(text, lambda x: next(c))]
print(res)

Output:

['www', 'orl', 'dgg', 'ggr', 'eat', 'tec', 'che', 'mgg', 'pww', 'wza', 'z']
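
The key function above depends on hidden state (the cycling 0/1 counter), which can be hard to follow. A possible alternative that may be easier to read groups the characters by their index divided by n. This is a sketch, not the original answer's method.

```python
from itertools import groupby

text = "wwworldggggreattecchemggpwwwzaz"
n = 3

# Characters whose index has the same integer quotient by n share a group:
# indices 0,1,2 -> group 0; indices 3,4,5 -> group 1; and so on.
res = ["".join(ch for _, ch in group)
       for _, group in groupby(enumerate(text), key=lambda pair: pair[0] // n)]
print(res)
# ['www', 'orl', 'dgg', 'ggr', 'eat', 'tec', 'che', 'mgg', 'pww', 'wza', 'z']
```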

answered Jul 23, 2021 at 23:08

TigerTV.ru

These answers are all nice and working and all, but the syntax is so cryptic... Why not write a simple function?

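def SplitEvery(string, length):
    if len(string) <= length: return [string]
    # integer ceiling division (plain / gives a float on Python 3, which range() rejects);
    # rounding up keeps a shorter final chunk instead of dropping it
    sections = (len(string) + length - 1) // length
    lines = []
    start = 0
    for i in range(sections):
        line = string[start:start+length]
        lines.append(line)
        start += length
    return lines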

And call it simply:

text = '1234567890'
lines = SplitEvery(text, 2)
print(lines)

# output: ['12', '34', '56', '78', '90']

answered Jul 22 at 9:12

How do you split a string into parts in Python?

The split() method splits a string into a list. You can specify the separator; the default separator is any whitespace. Note: when maxsplit is specified, the list will contain at most that number of elements plus one.

Can you split words in Python?

Splitting a string in Python is pretty simple. You can achieve this using the built-in str.split() method, which by default separates the words in a string at whitespace and returns them as a list; pass a separator such as ',' to split on commas instead.
