Guide to finding the mode in Python
Introduction

When we're trying to describe and summarize a sample of data, we usually start by finding the mean (or average), the median, and the mode of the data. These are measures of central tendency and are often our first look at a dataset.

In this tutorial, we'll learn how to find or compute the mean, the median, and the mode in Python. We'll first code a Python function for each measure and then use Python's statistics module to find the same measures quickly.
With this knowledge, we'll be able to take a quick look at our datasets and get an idea of the general tendency of the data.

Calculating the Mean of a Sample

If we have a sample of numeric values, then its mean or average is the sum of the values (or observations) divided by the number of values. For a sample with observations x_1, x_2, ..., x_n, the mean is (x_1 + x_2 + ... + x_n) / n.
The mean (arithmetic mean) is a general description of our data. Suppose you buy 10 pounds of tomatoes. When you count the tomatoes at home, you get 25 tomatoes. In this case, you can say that the average weight of a tomato is 0.4 pounds. That would be a good description of your tomatoes.

The mean can also be a poor description of a sample of data. Say you're analyzing a group of dogs. If you take the combined weight of all the dogs and divide it by the number of dogs, the result would probably be a poor description of the weight of an individual dog, as different breeds can have vastly different sizes and weights.

How well the mean describes a sample depends on how spread out the data is. In the case of the tomatoes, they're almost the same weight each, and the mean is a good description of them. In the case of the dogs, there is no typical dog. They can range from a tiny Chihuahua to a giant German Mastiff. So, the mean by itself isn't a good description in this case.

Now it's time to get into action and learn how we can calculate the mean using Python.

Calculating the Mean With Python

To calculate the mean of a sample of numeric data, we'll use two of Python's built-in functions: one to calculate the total sum of the values and another to calculate the length of the sample.

The first function is sum(). It takes an iterable of numeric values and returns their total.

The second function is len(). It returns the number of items in an object such as a list or a tuple.

Together, these two functions are all we need to calculate the mean.
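A minimal sketch of that calculation follows. The function name my_mean, the empty-sample check, and the sample values are illustrative assumptions, not details taken from the original text.

```python
def my_mean(sample):
    """Return the arithmetic mean of a non-empty sample of numbers."""
    if len(sample) == 0:
        # Dividing by zero would fail anyway; raise a clearer error instead.
        raise ValueError("my_mean() requires at least one observation")
    # Total of the observations divided by the number of observations.
    return sum(sample) / len(sample)


sample = [4, 8, 6, 5, 3, 2, 8, 9, 2, 5]  # illustrative values
print(my_mean(sample))  # 5.2
```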
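We first sum the values in the sample using sum(). Then, we divide that sum by the length of the sample, which is the result of len().

Using Python's mean()

Since calculating the mean is a common operation, Python includes this functionality in the statistics module of the standard library. Its mean() function takes a sample of numeric data and returns the mean.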
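As a short usage sketch of statistics.mean(), the snippet below passes it a list of integers. The sample values are illustrative and were chosen only for this example.

```python
import statistics

sample = [4, 8, 6, 5, 3, 2, 8, 9, 2, 5]  # illustrative values

# statistics.mean() accepts a sequence or iterable of numbers.
result = statistics.mean(sample)
print(result)  # 5.2
```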
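We just need to import the statistics module and then call mean() with our sample as an argument. That returns the mean of the sample and is a quick way of finding the mean in Python.

Finding the Median of a Sample

The median of a sample of numeric data is the value that lies in the middle when we sort the data. The data may be sorted in ascending or descending order; the median remains the same.

To find the median, we need to:

1. Sort the sample
2. Locate the value in the middle of the sorted sample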
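When locating the number in the middle of a sorted sample, we can face two kinds of situations:

1. If the sample has an odd number of observations, then the middle value in the sorted sample is the median
2. If the sample has an even number of observations, then the median is the mean of the two middle values in the sorted sample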
For example, the sorted sample 1, 2, 3, 4, 5 has the single middle value 3, so its median is 3. On the other hand, the sorted sample 1, 2, 3, 4, 5, 6 has two middle values, 3 and 4, so its median is (3 + 4) / 2 = 3.5.

Let's take a look at how we can use Python to calculate the median.

Finding the Median With Python

To find the median, we first need to sort the values in our sample. We can achieve that using the built-in sorted() function, which takes an iterable and returns a new sorted list containing the same values.

The second step is to locate the value that lies in the middle of the sorted sample. To locate that value in a sample with an odd number of observations, we can divide the number of observations by 2. The result will be the index of the value in the middle of the sorted sample.

Since the division operator (/) returns a float, we need to use the floor division operator (//) to get an integer that we can use as an index.

If the sample has an even number of observations, then we need to locate the two middle values. Floor-dividing the length by 2 gives the index of the upper-middle value, and subtracting 1 from that index gives the index of the lower-middle value. In a sorted sample of six observations, for instance, 6 // 2 is 3, so the two middle values sit at indices 2 and 3.

Let's put all of this together in a function that calculates the median of a sample.
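The steps above can be sketched as follows. The function name my_median, the variable names, the empty-sample check, and the sample values are assumptions made for this illustration.

```python
def my_median(sample):
    """Return the median of a non-empty sample of numbers."""
    n = len(sample)
    if n == 0:
        raise ValueError("my_median() requires at least one observation")
    ordered = sorted(sample)  # step 1: sort the sample
    index = n // 2            # middle (or upper-middle) index
    if n % 2:
        # Odd number of observations: one value sits in the middle.
        return ordered[index]
    # Even number of observations: average the two middle values.
    # The slice covers index - 1 and index; the end index is excluded.
    return sum(ordered[index - 1:index + 1]) / 2


print(my_median([3, 5, 1, 4, 2]))     # 3
print(my_median([3, 5, 1, 4, 2, 6]))  # 3.5
```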
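This function takes a sample of numeric values and returns its median. We first find the length of the sample. Then, we floor-divide that length by 2 to get the index of the middle (or upper-middle) value. We also sort the sample. The if statement checks whether the sample has an odd number of observations. If so, the median is the value at that index.

The final return statement runs if the sample has an even number of observations. In that case, we find the median by calculating the mean of the two middle values.

Note that the slicing operation, which starts at the lower-middle index and stops just past the upper-middle index, gets exactly two values because a slice excludes its end index.

Using Python's median()

Python's statistics.median() takes a sample of data and returns its median.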
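A brief usage sketch of statistics.median() with one odd-length and one even-length sample. Both samples are illustrative values chosen for this example.

```python
import statistics

odd_sample = [3, 5, 1, 4, 2]       # illustrative values, 5 observations
even_sample = [3, 5, 1, 4, 2, 6]   # illustrative values, 6 observations

# The input does not need to be sorted beforehand.
odd_median = statistics.median(odd_sample)
even_median = statistics.median(even_sample)

print(odd_median)   # 3
print(even_median)  # 3.5
```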
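Note that median() automatically handles the calculation of the median for samples with either an odd or an even number of observations.

Finding the Mode of a Sample

The mode is the most frequent observation (or observations) in a sample. For example, in the sample 4, 1, 2, 2, 3, 5 the mode is 2, because 2 appears twice while every other value appears once.

The mode doesn't have to be unique. Some samples have more than one mode. For example, the sample 4, 1, 2, 2, 3, 5, 4 has two modes, 2 and 4, because both appear twice and no value appears more often.

The mode is commonly used for categorical data, such as boolean values (true or false), nominal labels (for example, nationalities), or ordinal ratings (for example, few, some, many).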
When we're analyzing a dataset of categorical data, we can use the mode to know which category is the most common in our data.

We can find samples that don't have a mode. If all the observations are unique (there are no repeated observations), then the sample won't have a mode.

Now that we know the basics about the mode, let's take a look at how we can find it using Python.

Finding the Mode With Python

To find the mode with Python, we'll start by counting the number of occurrences of each value in the sample at hand. Then, we'll get the value(s) with the highest number of occurrences.

Since counting objects is a common operation, Python provides the collections.Counter class, which is designed for counting objects.
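As a quick illustration of the counting step, the sketch below builds a Counter from a small categorical sample. The sample values are made up for this example.

```python
from collections import Counter

sample = ["few", "some", "many", "some", "few", "some"]  # illustrative values

# Counter maps each distinct observation to its number of occurrences.
counts = Counter(sample)

print(counts["some"])  # 3
print(counts["few"])   # 2
print(counts["rare"])  # 0 (missing keys count as zero)
```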
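The Counter class provides a method called .most_common([n]). This method returns a list of two-item tuples with the n most common elements and their respective counts. If n is omitted or None, then .most_common() returns all of the elements.

Let's use Counter and .most_common() to code a function that takes a sample of data and returns its mode.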
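One way such a function might look is sketched below. The function name my_mode, the variable names, the handling of an empty sample, and the sample values are assumptions made for this illustration.

```python
from collections import Counter


def my_mode(sample):
    """Return a list with the mode(s) of a sample, in first-seen order."""
    counts = Counter(sample)
    if not counts:
        return []  # an empty sample has no mode
    # most_common(1) returns [(observation, count)]; keep only the count.
    highest = counts.most_common(1)[0][1]
    # Collect every observation whose count matches the highest count.
    return [value for value, count in counts.items() if count == highest]


print(my_mode([4, 1, 2, 2, 3, 5]))     # [2]
print(my_mode([4, 1, 2, 2, 3, 5, 4]))  # [4, 2]
```

Returning a list keeps the result type the same for single-mode and multi-mode samples.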
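We first count the observations in the sample using a Counter object. Then, we use a list comprehension to create a list containing the observations that share the highest count.

Since .most_common(1) returns a list with one tuple of the form (observation, count), we need to get the tuple at index 0 in the list and then the item at index 1 in that tuple. Indexing the result of .most_common(1) with [0][1] does exactly that. The resulting value is the count of the most common observation.

Note that the comprehension's condition compares the count of each observation with the count of the most common observation. This lets us get multiple observations with the same count in the case of a multi-mode sample.

Using Python's mode()

Python's statistics.mode() takes some data and returns its (first) mode.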
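A short usage sketch of statistics.mode() follows. The samples are illustrative, and the results for the multi-mode samples assume Python 3.8 or later, where mode() returns the first mode it encounters.

```python
import statistics

single = statistics.mode([4, 1, 2, 2, 3, 5])
print(single)  # 2

# Two values (4 and 2) are tied; the first one encountered is returned.
numeric_tie = statistics.mode([4, 1, 2, 2, 3, 5, 4])
print(numeric_tie)  # 4

# mode() also works with non-numeric (categorical) data.
label_tie = statistics.mode(["few", "few", "many", "some", "many"])
print(label_tie)  # few
```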
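With a single-mode sample, Python's mode() returns the most common value. With a multi-mode sample, however, it returns only the first mode it encounters, so other values that occur equally often aren't included. (Before Python 3.8, mode() raised StatisticsError in that case.)

Since Python 3.8, we can also use statistics.multimode(), which accepts an iterable and returns a list of modes.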
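A brief usage sketch of statistics.multimode(), which requires Python 3.8 or later. The samples are illustrative values chosen for this example.

```python
import statistics

one_mode = statistics.multimode([4, 1, 2, 2, 3, 5])
print(one_mode)  # [2]

# Tied modes are listed in the order they first appear in the data.
two_modes = statistics.multimode([4, 1, 2, 2, 3, 5, 4])
print(two_modes)  # [4, 2]

# An empty input yields an empty list rather than an error.
no_mode = statistics.multimode([])
print(no_mode)  # []
```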
Note: multimode() always returns a list, even if we pass a single-mode sample.

Conclusion

The mean (or average), the median, and the mode are commonly our first looks at a sample of data when we're trying to understand the central tendency of the data.

In this tutorial, we've learned how to find or compute the mean, the median, and the mode using Python. We first covered, step by step, how to create our own functions to compute them, and then how to use Python's statistics module as a quick way to find these measures.