plus: change the precision of the numbers in a Dataframe.
(You can find all the code for this post on my GitHub.)
Hello all! Continuing my Pandas tips series (the last post was about Groupby tips), I will explain how to display all the columns and rows of a Pandas Dataframe. I will also explain how to show all the values of a list stored inside a Dataframe and how to choose the precision of the numbers in a Dataframe, all with the same tool.
In this tutorial, I am using the top 250 IMDB movies dataset, downloaded from Data World. The dataset has 250 rows and 37 columns.
Problem: Pandas truncates information
Sometimes I read a Dataframe with many rows or columns, and when I display it in Jupyter some of the rows and columns are hidden (highlighted in the red boxes):
import pandas as pd
movies = pd.read_csv("data/IMDB_Top250movies2_OMDB_Detailed.csv")
movies
I understand that they are hidden to avoid displaying too much information. But sometimes I want to see all the columns and rows! So how can we print them all?
We can play with the Options parameters in Pandas. Let’s see.
Options
Pandas has an options system that lets you change the display settings of your Dataframe (and more).
All you need to do is select an option (by its string name) and get, set, or reset its value. These functions accept a regex pattern, so passing a substring of the name also works (unless it matches more than one option).
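To make the get/set/reset cycle concrete, here is a minimal sketch using display.max_columns with its full name. The value 50 and the variable names are just picked for illustration:

```python
import pandas as pd

# Read the current value of an option by its full name.
default_max_columns = pd.get_option("display.max_columns")

# Change the option, then read it back.
pd.set_option("display.max_columns", 50)
changed_max_columns = pd.get_option("display.max_columns")

# Go back to the default value.
pd.reset_option("display.max_columns")
restored_max_columns = pd.get_option("display.max_columns")

print(default_max_columns, changed_max_columns, restored_max_columns)
```

I used the full name here on purpose, since a full name can never be ambiguous.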
Columns
The display.max_columns option controls the number of columns to be printed. It receives an int or None (to print all the columns):
pd.set_option('display.max_columns', None)
movies.head()
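You can also use the string max_columns instead of display.max_columns (remember that it accepts a regex). Be aware that recent Pandas versions also have a styler.render.max_columns option, so there the short name matches more than one option and raises an OptionError, and you need the full name:
pd.set_option('max_columns', None)  # older Pandas versions only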
Passing a number instead of None:
pd.set_option('display.max_columns', 2)
movies.head()
To go back to the default value, you need to reset the option:
pd.reset_option("display.max_columns")
movies.head()
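If you only want the change for a single display, Pandas also offers pd.option_context, which sets options inside a with block and restores them afterwards. The sketch below uses a small made-up Dataframe (wide, with hypothetical column names col00 to col29) in place of movies:

```python
import pandas as pd

# Made-up stand-in for the movies Dataframe: 1 row and 30 columns.
wide = pd.DataFrame([range(30)], columns=[f"col{i:02d}" for i in range(30)])

before = pd.get_option("display.max_columns")

# Inside the block every column is printed.
with pd.option_context("display.max_columns", None):
    inside = pd.get_option("display.max_columns")
    full_text = repr(wide)

# Outside the block the previous value is back, with no reset needed.
after = pd.get_option("display.max_columns")

print(full_text)
```

This saves the reset_option call when the wider view is only needed once.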
Column width
You can change the width of the columns with the max_colwidth option. For example, the Plot column has many characters and is originally displayed truncated:
You can increase the width by passing an int (or remove the limit by passing None):
pd.set_option("display.max_colwidth", None)
movies[["Title", "Plot"]].head()
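Here is a small sketch of the same effect without the dataset. The 120-character string is made up to play the role of a long Plot value, and 50 is used as the limit for the truncated case:

```python
import pandas as pd

# Made-up stand-in for the Plot column: one string with 120 characters.
long_plot = "x" * 120
plots = pd.DataFrame({"Plot": [long_plot]})

# With a limit of 50 characters, the cell is cut and ends with "...".
with pd.option_context("display.max_colwidth", 50):
    truncated_text = repr(plots)

# With None, the whole string is printed.
with pd.option_context("display.max_colwidth", None):
    full_text = repr(plots)

print(truncated_text)
print(full_text)
```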
Rows
To change the number of rows you need to change the max_rows option.
pd.set_option("display.max_columns", 2)  # Showing only two columns
pd.set_option("display.max_rows", None)
movies
Two settings are related to rows: max_rows and min_rows. When the number of rows is greater than max_rows, the Dataframe is truncated and only min_rows rows are shown.
For example, let’s print the movies Dataframe again, along with the default values of max_rows and min_rows:
pd.reset_option("display.max_rows")  # undo the None set above
print("Default max_rows: {} and min_rows: {}".format(
    pd.get_option("display.max_rows"), pd.get_option("display.min_rows")))
movies
As the Dataframe has 250 rows (more than the max_rows value of 60), only 10 rows (the min_rows value) are shown: the first 5 and the last 5.
If we change min_rows to 2 it will only display the first and the last rows:
pd.set_option("display.min_rows", 2)
movies
If we use the head command with a value below the max_rows value (60), all the rows are shown. For example, using head with value 20:
movies.head(20)
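The three cases above can be checked with a short sketch. It uses a made-up Dataframe with 250 rows and hypothetical labels row000 to row249 instead of movies, and the helper shown_labels is my own, not part of Pandas:

```python
import pandas as pd

# Made-up stand-in for the movies Dataframe: 250 rows with readable labels.
labels = [f"row{i:03d}" for i in range(250)]
frame = pd.DataFrame({"value": range(250)}, index=labels)

# 250 rows is more than max_rows (60), so only min_rows (10) rows are printed.
with pd.option_context("display.max_rows", 60, "display.min_rows", 10):
    ten_rows_text = repr(frame)

# Same Dataframe with min_rows set to 2: only the first and the last row.
with pd.option_context("display.max_rows", 60, "display.min_rows", 2):
    two_rows_text = repr(frame)

# 20 rows is below max_rows, so nothing is truncated.
with pd.option_context("display.max_rows", 60, "display.min_rows", 10):
    head_text = repr(frame.head(20))


def shown_labels(text):
    """Return the row labels that appear in a printed Dataframe."""
    return [label for label in labels if label in text]


print(shown_labels(ten_rows_text))
print(shown_labels(two_rows_text))
print(len(shown_labels(head_text)))
```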
Sequence of items
Sequences of items (lists) are also truncated if they have many characters:
# Create "my_list" column and put a list of 100 values in each row
movies['my_list'] = [[1]*100] * 250
movies.head()
The option to change this behavior is max_seq_items, but we also have to change max_colwidth. First, changing max_colwidth (the lists will still be truncated):
pd.set_option("display.max_colwidth", None)
movies.head()
Then you change max_seq_items:
pd.set_option("display.max_seq_items", None)
movies.head()
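To see how the two options interact, here is a sketch with a made-up Dataframe holding a single list of 200 integers. The limit of 5 items is only chosen to make the truncation easy to spot:

```python
import pandas as pd

# Made-up stand-in for the my_list column: one list of 200 integers.
lists = pd.DataFrame({"my_list": [list(range(200))]})

# No width limit, but at most 5 items per sequence are printed.
with pd.option_context("display.max_colwidth", None,
                       "display.max_seq_items", 5):
    short_text = repr(lists)

# No width limit and no item limit: the whole list is printed.
with pd.option_context("display.max_colwidth", None,
                       "display.max_seq_items", None):
    full_text = repr(lists)

print(short_text)
print(full_text)
```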
Bonus: Precision of numbers
Another useful option is the float precision, the number of places after the decimal point, which is set with the precision option.
# Adding more decimal places to the imdbRating column
movies['imdbRating'] = movies['imdbRating'] + 0.11111
movies[['imdbRating']].head()
pd.set_option('display.precision', 2)
movies[['imdbRating']].head()
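One detail worth checking is that the precision option only changes what is printed, not the stored values. A minimal sketch, with two made-up ratings in place of the real imdbRating column:

```python
import pandas as pd

# Made-up ratings standing in for the imdbRating column.
ratings = pd.DataFrame({"imdbRating": [9.41111, 9.31111]})

# Two decimal places in the printed output.
with pd.option_context("display.precision", 2):
    rounded_text = repr(ratings)

# Six decimal places (the Pandas default) for comparison.
with pd.option_context("display.precision", 6):
    default_text = repr(ratings)

# The stored value is untouched by the display option.
stored_value = ratings.loc[0, "imdbRating"]

print(rounded_text)
print(default_text)
print(stored_value)
```

If you need the numbers themselves rounded, that is a job for round, not for the display options.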
Sources:
- Options and settings — pandas 1.0.1 documentation
- //stackoverflow.com/questions/19124601/pretty-print-an-entire-pandas-series-dataframe
- //stackoverflow.com/questions/52580111/how-do-i-set-the-column-width-when-using-pandas-dataframe-to-html/52580495
Thank you, and feel free to add your comments.
PS: There is a Brazilian movie in the IMDB top 250: City of God. It’s a very good movie =]