How do you find the top 5 in python?

Question

I am new to pandas and I would like to filter a dataframe in pandas which includes top5 values in the list. What is the best way to get the 5 values from the list with that code?

Nội dung chính Show

Step 1: Create Sample DataFrame
Step 2: Get Top 10 biggest/lowest values for single column
Step 3: Get Top 10 biggest/lowest values - duplicates
Step 4: Get Top N values in multiple columns
Step 5: How do nsmallest and nlargest work
Step 6: Display Top values for all numeric columns in DataFrame
Step 7: nsmallest and nlargest and datetime
How do you find the top 5 highest value in pandas?
How do you print top 5 rows in Python?
How do I print the first 5 rows in Python?
How do you sort Top 5 in Python?

My Code:

cheese_top5 = cheese[cheese.year >= 2016]

AMC

2,5747 gold badges12 silver badges34 bronze badges

asked Nov 23, 2017 at 20:07

6

I think what you are looking for is:

cheese.sort_values(by=['Name of column']).head(5)

to say anything more we need to see a sample of your data.

answered May 1, 2019 at 20:36

MarkMark

84610 silver badges25 bronze badges

dataframe_name['field_name'].value_counts.head(5)

Tomerikoo

16.6k15 gold badges37 silver badges54 bronze badges

answered Sep 18, 2020 at 17:28

1

import pandas as pd
df = pd.read_csv('911.csv')
df['zip'].value_counts().head(5)

StupidWolf

42.8k17 gold badges32 silver badges65 bronze badges

answered Nov 29, 2020 at 9:50

To get the top 5 most occuring values use

df['column'].value_counts().head(n)

The solution provided by @lux7

df['column'].nlargest(n=5)

would result in the top 5 values from a column(their values not how many times they have appeared).

horace_vr

2,8275 gold badges27 silver badges45 bronze badges

answered Mar 26 at 10:48

In this short guide, I'll show you how to get top 5, 10 or N values in Pandas DataFrame. You can find also how to print top/bottom values for all columns in a DataFrame.

If you need the most frequent values in a column or DataFrame you can check: How to Get Most Frequent Values in Pandas Dataframe

To start, here is the syntax that you may apply in order to get top 10 biggest numeric values in your DataFrame:

df['Magnitude'].nlargest(n=10)

To list the top 10 lowest values in DataFrame you can use:

df.nlargest(n=5, columns=['Magnitude', 'Depth'])

In the next section, I’ll show you more examples and other techniques in order to get top/bottom values in DataFrame.

Step 1: Create Sample DataFrame

In this example we are going to use real data available from Kaggle:Significant Earthquakes, 1965-2016.

To learn more about reading Kaggle data with Python and Pandas: How to Search and Download Kaggle Dataset to Pandas DataFrame

import pandas as pd

cols = ['Date', 'Time', 'Latitude', 'Longitude', 'Depth', 'Magnitude Type', 'Type', 'ID', 'Magnitude']
df = pd.read_csv(f'../data/earthquakes_1965_2016_database.csv.zip')[cols]

data:

Date	Latitude	Longitude	Depth	Magnitude Type	Magnitude
01/02/1965	19.246	145.616	131.6	MW	6.0
01/04/1965	1.863	127.352	80.0	MW	5.8
01/05/1965	-20.579	-173.972	20.0	MW	6.2
01/08/1965	-59.076	-23.557	15.0	MW	5.8
01/09/1965	11.938	126.427	15.0	MW	5.8

Step 2: Get Top 10 biggest/lowest values for single column

You can use functions:

nsmallest - return the first n rows ordered by columns in ascending order
nlargest - return the first n rows ordered by columns in descending order

to get the top N highest or lowest values in Pandas DataFrame.

So let's say that we would like to find the strongest earthquakes by magnitude. From the data above we can use the following syntax:

df['Magnitude'].nlargest(n=10)

the result is:

17083    9.1
20501    9.1
19928    8.8
16       8.7
17329    8.6
21219    8.6
...
Name: Magnitude, dtype: float64

How does it return the top 10 values? It orders all values in ascending or descending order. Then will return 5, 10 or N values from this order.

Step 3: Get Top 10 biggest/lowest values - duplicates

As you can see in Step 2 there are duplicates in the results.

nsmallest / nlargest have a parameter:

keep{‘first’, ‘last’, ‘all’}, default ‘first’

which helps to deal with duplicates.

Let's say that you would like to get 5 smallest values by Depth of the earthquake:

df.nsmallest(n=5, columns=['Depth'])

result:

Date	Time	Depth	Magnitude Type	Type	ID	Magnitude
06/28/1992	12:00:45	-1.100	ML	Earthquake	CI3043549	5.77
06/28/1992	11:57:34	-0.097	MW	Earthquake	CI3031111	7.30
07/21/1986	22:07:16	-0.076	ML	Earthquake	NC95459	5.60
02/16/1973	05:02:58	0.000	MB	Explosion	USP00000JC	5.60
07/23/1973	01:22:58	0.000	MB	Nuclear Explosion	USP00002TP	6.30

It seems that we have many records with Depth 0. In order to get all records which have Depth 0 we can use:

df.nsmallest(n=5, columns=['Depth'], keep='all')

the result will return the previous ones plus the duplicated:

173 rows × 7 columns

While if we use keep='last' we will get again 5 rows as a result.

Step 4: Get Top N values in multiple columns

You can get top 5, 10 or N records from multiple columns with:

nsmallest
nlargest

To do so you can use the following syntax:

df.nlargest(n=5, columns=['Magnitude', 'Depth'])

result:

Date	Time	Depth	Magnitude Type	Type	Magnitude
12/26/2004	00:58:53	30.0	MW	Earthquake	9.1
03/11/2011	05:46:24	29.0	MWW	Earthquake	9.1
02/27/2010	06:34:12	22.9	MWW	Earthquake	8.8
02/04/1965	05:01:22	30.3	MW	Earthquake	8.7
03/28/2005	16:09:37	30.0	MWW	Earthquake	8.6

Step 5: How do nsmallest and nlargest work

So to summarise the algorithm:

order all values - ascending or descending
find top N values
filter out duplicates ( depends on parameter keep )
return N top values

Both methods will work only with numeric types. If you try to use them on columns with dtype object you will get an error:

TypeError: Column 'Magnitude Type' has dtype object, cannot use method 'nlargest' with this dtype

Step 6: Display Top values for all numeric columns in DataFrame

If you like to get top values for all columns you can use the next syntax:

from pandas.api.types import is_numeric_dtype

dfs = []

for col in df.columns:
    top_values = []
    if is_numeric_dtype(df[col]):
        top_values = df[col].nlargest(n=7)
        dfs.append(pd.DataFrame({col: top_values}).reset_index(drop=True))
pd.concat(dfs, axis=1)

This will return the top values per column as a new DataFrame:

Depth	Magnitude
700.0	9.1
691.6	9.1
690.0	8.8
688.0	8.7
687.6	8.6

Step 7: nsmallest and nlargest and datetime

It's possible to use methods nsmallest and nlargest with datetime. You need to be sure that the target column is from type: datetime64[ns, UTC]

If you need to convert string to datetime column you can use:

df['Date'] = pd.to_datetime(df['Date'], utc=True)

and then use the nlargest:

df['Date'].nlargest()

Resources

Notebook
pandas.DataFrame.nlargest
pandas.DataFrame.nsmallest

How do you find the top 5 highest value in pandas?

How to Get Top 10 Highest or Lowest Values in Pandas.

Step 1: Create Sample DataFrame. ... .

Step 2: Get Top 10 biggest/lowest values for single column. ... .

Step 3: Get Top 10 biggest/lowest values - duplicates. ... .

Step 4: Get Top N values in multiple columns. ... .

Step 5: How do nsmallest and nlargest work..

How do you print top 5 rows in Python?

head() prints the indexes. dataframe. to_string(index=False,max_rows=10) prints the first 5 and last 5 rows..

df. head(). to_string(index=False) returns a string with "\n" as row delimiters,.

split() method then returns a list of single rows,.

[print(row) for row in ...] then prints every row..

How do I print the first 5 rows in Python?

DataFrame. head(n) to get the first n rows of the DataFrame. It takes one optional argument n (number of rows you want to get from the start). By default n = 5, it return first 5 rows if value of n is not passed to the method.

How do you sort Top 5 in Python?

You can use sort_values() function to sort the dataframe by column 'Age'. To sort in the descending order, you need to add parameter 'ascending=False' to this function. Then you can apply iloc to the sorted dataframe to select the top 5 rows.

programming python Sort_values Pandas select top Nlargest Pandas head

How do you find the top 5 in python?

Step 1: Create Sample DataFrame

Step 2: Get Top 10 biggest/lowest values for single column

Step 3: Get Top 10 biggest/lowest values - duplicates

Step 4: Get Top N values in multiple columns

Step 5: How do nsmallest and nlargest work

Step 6: Display Top values for all numeric columns in DataFrame

Step 7: nsmallest and nlargest and datetime

Resources

How do you find the top 5 highest value in pandas?

How do you print top 5 rows in Python?

How do I print the first 5 rows in Python?

How do you sort Top 5 in Python?

Bài Viết Liên Quan

Quảng Cáo

Có thể bạn quan tâm

Toplist được quan tâm

Quảng cáo

Xem Nhiều

Quảng cáo

Chúng tôi

Điều khoản

Trợ giúp

Mạng xã hội