How do I get the length of a column in Python?

This article describes how to get the number of rows, columns, and total number of elements (size) of pandas.DataFrame and pandas.Series.

  • pandas.DataFrame
    • Display number of rows, columns, etc.: df.info()
    • Get the number of rows: len(df)
    • Get the number of columns: len(df.columns)
    • Get the number of rows and columns: df.shape
    • Get the number of elements: df.size
    • Notes when specifying index
  • pandas.Series
    • Get the number of elements: len(s), s.size

The examples below use the Titanic survivor data, which can be downloaded from Kaggle.

import pandas as pd

df = pd.read_csv('data/src/titanic_train.csv')

print(df.head())
#    PassengerId  Survived  Pclass  \
# 0            1         0       3   
# 1            2         1       1   
# 2            3         1       3   
# 3            4         1       1   
# 4            5         0       3   
# 
#                                                 Name     Sex   Age  SibSp  \
# 0                            Braund, Mr. Owen Harris    male  22.0      1   
# 1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   
# 2                             Heikkinen, Miss. Laina  female  26.0      0   
# 3       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   
# 4                           Allen, Mr. William Henry    male  35.0      0   
# 
#    Parch            Ticket     Fare Cabin Embarked  
# 0      0         A/5 21171   7.2500   NaN        S  
# 1      0          PC 17599  71.2833   C85        C  
# 2      0  STON/O2. 3101282   7.9250   NaN        S  
# 3      0            113803  53.1000  C123        S  
# 4      0            373450   8.0500   NaN        S  

Get the number of rows, columns, elements of pandas.DataFrame

Display number of rows, columns, etc.: df.info()

The info() method of pandas.DataFrame can display information such as the number of rows and columns, the total memory usage, the data type of each column, and the number of non-NaN elements.

df.info()
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 891 entries, 0 to 890
# Data columns (total 12 columns):
# PassengerId    891 non-null int64
# Survived       891 non-null int64
# Pclass         891 non-null int64
# Name           891 non-null object
# Sex            891 non-null object
# Age            714 non-null float64
# SibSp          891 non-null int64
# Parch          891 non-null int64
# Ticket         891 non-null object
# Fare           891 non-null float64
# Cabin          204 non-null object
# Embarked       889 non-null object
# dtypes: float64(2), int64(5), object(5)
# memory usage: 83.6+ KB

info() prints this summary to standard output and returns None, so the result cannot be assigned to a variable directly.
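If the summary is needed as text rather than as printed output, one option is to pass a writable buffer through the buf parameter of info(). The following is a minimal sketch; df_small is a hypothetical three-row frame standing in for the Titanic data.

```python
import io

import pandas as pd

# Hypothetical three-row frame standing in for the Titanic data.
df_small = pd.DataFrame({'Age': [22.0, None, 26.0],
                         'Sex': ['male', 'female', 'female']})

buffer = io.StringIO()
result = df_small.info(buf=buffer)  # the summary is written to the buffer, not stdout
info_text = buffer.getvalue()       # the same summary as an ordinary string

print(result)
# None

print('3 entries' in info_text)
# True
```

The captured string can then be logged or searched like any other text.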

Get the number of rows: len(df)

The number of rows of pandas.DataFrame can be obtained with the Python built-in function len().

print(len(df))
# 891

The example displays the result with print(), but len() returns an integer, so the value can also be assigned to a variable or used in calculations.
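Because len() returns a plain int, the row count can be used directly in arithmetic. The sketch below computes the share of rows with a missing Age value; df_small and its values are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Hypothetical frame with one missing Age value.
df_small = pd.DataFrame({'Age': [22.0, None, 26.0, 35.0]})

n_rows = len(df_small)                         # plain Python int
n_missing = int(df_small['Age'].isna().sum())  # number of NaN entries in the column
missing_ratio = n_missing / n_rows

print(n_rows)
# 4

print(missing_ratio)
# 0.25
```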

Get the number of columns: len(df.columns)

The number of columns of pandas.DataFrame can be obtained by applying len() to the columns attribute.

print(len(df.columns))
# 12

Get the number of rows and columns: df.shape

The shape attribute of pandas.DataFrame stores the number of rows and columns as a tuple (number of rows, number of columns).

print(df.shape)
# (891, 12)

print(df.shape[0])
# 891

print(df.shape[1])
# 12

It is also possible to unpack and store them in separate variables.

  • Unpack a tuple and list in Python

row, col = df.shape
print(row)
# 891

print(col)
# 12

Get the number of elements: df.size

The total number of elements of pandas.DataFrame is stored in the size attribute. It equals the number of rows multiplied by the number of columns.

print(df.size)
# 10692

print(df.shape[0] * df.shape[1])
# 10692
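Note that size counts every cell, including missing values. If only the non-missing cells are of interest, count() can be used instead. A minimal sketch with a hypothetical frame (df_small) containing a few missing values:

```python
import pandas as pd

# Hypothetical frame: 3 rows x 2 columns, with 3 missing cells.
df_small = pd.DataFrame({'Age': [22.0, None, 26.0],
                         'Cabin': [None, 'C85', None]})

print(df_small.size)
# 6

# count() returns the number of non-missing values per column;
# summing those counts gives the total number of non-missing cells.
total_non_missing = int(df_small.count().sum())
print(total_non_missing)
# 3
```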

Notes when specifying index

When columns are assigned to the index with the set_index() method, they are removed from the data body (the values attribute) and are no longer counted as columns.

df_multiindex = df.set_index(['Sex', 'Pclass', 'Embarked', 'PassengerId'])

print(len(df_multiindex))
# 891

print(len(df_multiindex.columns))
# 8

print(df_multiindex.shape)
# (891, 8)

print(df_multiindex.size)
# 7128

See the following article for set_index().

  • pandas: Assign existing column to the DataFrame index with set_index()
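The effect on the counts can be checked on a small scale. In this sketch, df_small is a hypothetical frame; index.nlevels reports how many columns were moved into the index, and reset_index() moves them back.

```python
import pandas as pd

# Hypothetical three-column frame.
df_small = pd.DataFrame({
    'Sex': ['male', 'female', 'female'],
    'Pclass': [3, 1, 3],
    'Fare': [7.25, 71.2833, 7.925],
})

df_indexed = df_small.set_index(['Sex', 'Pclass'])

print(df_indexed.shape)
# (3, 1)

print(df_indexed.index.nlevels)  # number of index levels
# 2

print(df_indexed.reset_index().shape)  # index levels become columns again
# (3, 3)
```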

Get the number of elements of pandas.Series

As an example of pandas.Series, select one column from pandas.DataFrame.

s = df['PassengerId']
print(s.head())
# 0    1
# 1    2
# 2    3
# 3    4
# 4    5
# Name: PassengerId, dtype: int64

Get the number of elements: len(s), s.size

Since pandas.Series is one-dimensional, you can get the total number of elements (size) with either len() or size attribute.

Note that the shape attribute is a tuple with one element.

print(len(s))
# 891

print(s.size)
# 891

print(s.shape)
# (891,)
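Because the shape of a Series has only one element, it is unpacked with a trailing comma, and there is no second axis to index. A minimal sketch with a hypothetical Series (s_small):

```python
import pandas as pd

s_small = pd.Series([1, 2, 3])  # hypothetical example data

(n_elements,) = s_small.shape   # one-element tuple: note the trailing comma
print(n_elements)
# 3

# Unlike a DataFrame, a Series has no shape[1].
try:
    s_small.shape[1]
except IndexError:
    has_second_axis = False
else:
    has_second_axis = True

print(has_second_axis)
# False
```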

Older versions of pandas have no info() method on pandas.Series; Series.info() was added in pandas 1.4.0.

How do you calculate columns in python?

Use DataFrame.sum() to total the values of a DataFrame along either axis. By default (axis=0) it sums down the rows and returns one total per column. Pass axis=1 to sum across the columns and get one total per row.
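The two axes are easy to mix up, so a small check can help. In this sketch, df_small is a hypothetical frame with two numeric columns.

```python
import pandas as pd

df_small = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})  # hypothetical data

column_totals = df_small.sum()     # axis=0 (default): one total per column
row_totals = df_small.sum(axis=1)  # axis=1: one total per row

print(column_totals.tolist())
# [6, 60]

print(row_totals.tolist())
# [11, 22, 33]
```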

How do you find the length of a series in python?

The built-in len() function returns the number of elements of a Series. The size attribute returns the same count, and the shape attribute returns it as a one-element tuple.

How do I find the length of a string in a data frame?

To find the length of each string in a DataFrame, use the len() method of the str accessor. It must be called on the column (a Series) that contains the string data, for example df['Name'].str.len(), not on the DataFrame itself.
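As a minimal sketch, the following applies str.len() to a string column. df_small is a hypothetical two-row frame whose names are taken from the Titanic sample shown earlier.

```python
import pandas as pd

# Hypothetical two-row frame with a string column.
df_small = pd.DataFrame({'Name': ['Braund, Mr. Owen Harris',
                                  'Heikkinen, Miss. Laina']})

# str.len() works element-wise and returns one length per row.
name_lengths = df_small['Name'].str.len()

print(name_lengths.tolist())
# [23, 22]
```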

How do I find the size of a DataFrame in python?

The size, shape, and ndim attributes return the size, shape, and number of dimensions of a DataFrame or Series.

  • Syntax: dataframe.size
    Returns: the total number of elements in the DataFrame or Series.
  • Syntax: dataframe.shape
    Returns: a tuple giving the shape, (rows, columns) for a DataFrame.
  • Syntax: dataframe.ndim
    Returns: the number of dimensions, 2 for a DataFrame and 1 for a Series.
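The three attributes can be compared side by side on a DataFrame and on a Series taken from it. This is a minimal sketch with hypothetical data (df_small, s_small).

```python
import pandas as pd

df_small = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})  # hypothetical data
s_small = df_small['a']                                    # one column as a Series

print(df_small.size, df_small.shape, df_small.ndim)
# 6 (3, 2) 2

print(s_small.size, s_small.shape, s_small.ndim)
# 3 (3,) 1
```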