What relies on looking at past records or data sets to look for interesting patterns or relationships?
What is data mining?
Data mining is the process of sorting through large data sets to identify patterns and relationships that can help solve business problems through data analysis. Data mining techniques and tools enable enterprises to predict future trends and make more-informed business decisions.
Data mining is a key part of data analytics overall and one of the core disciplines in data science, which uses advanced analytics techniques to find useful information in data sets. At a more granular level, data mining is a step in the knowledge discovery in databases (KDD) process, a data science methodology for gathering, processing and analyzing data. Data mining and KDD are sometimes referred to interchangeably, but they're more commonly seen as distinct things.
Why is data mining important?
Data mining is a crucial component of successful analytics initiatives in organizations. The information it generates can be used in business intelligence (BI) and advanced analytics applications that involve analysis of historical data, as well as real-time analytics applications that examine streaming data as it's created or collected.
Effective data mining aids in various aspects of planning business strategies and managing operations. That includes customer-facing functions such as marketing, advertising, sales and customer support, plus manufacturing, supply chain management, finance and HR. Data mining supports fraud detection, risk management, cybersecurity planning and many other critical business use cases. It also plays an important role in healthcare, government, scientific research, mathematics, sports and more.
Data mining process: How does it work?
Data mining is typically done by data scientists and other skilled BI and analytics professionals. But it can also be performed by data-savvy business analysts, executives and workers who function as citizen data scientists in an organization.
Its core elements include machine learning and statistical analysis, along with data management tasks done to prepare data for analysis. The use of machine learning algorithms and artificial intelligence (AI) tools has automated more of the process and made it easier to mine massive data sets, such as customer databases, transaction records and log files from web servers, mobile apps and sensors.
The data mining process can be broken down into these four primary stages:
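As a rough illustration of how data preparation feeds the analysis step, the following minimal Python sketch cleans a handful of transaction records and then counts which items appear together most often. The sample records and the helper names `prepare` and `pair_counts` are hypothetical and are not taken from any specific data mining tool.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw transaction records (illustrative only).
raw_transactions = [
    ["Bread", "milk ", "eggs"],
    ["bread", "Milk"],
    ["milk", "eggs", ""],
    ["Bread", "milk", "butter"],
]


def prepare(transactions):
    """Normalize item names and drop blanks and duplicates within each record."""
    cleaned = []
    for record in transactions:
        items = {item.strip().lower() for item in record if item.strip()}
        if items:
            # Sort so every pair is always generated in the same order.
            cleaned.append(sorted(items))
    return cleaned


def pair_counts(transactions):
    """Count how often each pair of items appears in the same record."""
    counts = Counter()
    for items in transactions:
        counts.update(combinations(items, 2))
    return counts


prepared = prepare(raw_transactions)
top_pairs = pair_counts(prepared).most_common(2)
print(top_pairs)
```

On real data, the preparation step would usually be far more involved, and a dedicated library would replace the hand-written counting.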
Types of data mining techniques
Various techniques can be used to mine data for different data science applications. Pattern recognition is a common data mining use case that's enabled by multiple techniques, as is anomaly detection, which aims to identify outlier values in data sets. Popular data mining techniques include the following types:
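Anomaly detection, mentioned above, can be sketched in a few lines. The example below flags outlier values with a modified z-score based on the median and the median absolute deviation (MAD), using a conventional cutoff of 3.5. The sample data, the `find_outliers` name and the choice of this particular method are assumptions made for illustration; data mining tools typically offer several detection algorithms.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    center = median(values)
    # Median absolute deviation: a spread measure that resists outliers.
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # No measurable spread, so no value can be scored as an outlier.
        return []
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if 0.6745 * abs(v - center) / mad > threshold]


# Hypothetical daily order totals with one unusual value.
daily_order_totals = [102, 98, 101, 97, 100, 99, 103, 250]
print(find_outliers(daily_order_totals))
```

A median-based score is used here instead of the mean and standard deviation because a single extreme value can inflate the standard deviation enough to hide itself in a small sample.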
Data mining software and tools
Data mining tools are available from a large number of vendors, typically as part of software platforms that also include other types of data science and advanced analytics tools. Key features provided by data mining software include data preparation capabilities, built-in algorithms, predictive modeling support, a GUI-based development environment, and tools for deploying models and scoring how they perform.
Vendors that offer tools for data mining include Alteryx, AWS, Databricks, Dataiku, DataRobot, Google, H2O.ai, IBM, Knime, Microsoft, Oracle, RapidMiner, SAP, SAS Institute and Tibco Software, among others.
A variety of free open source technologies can also be used to mine data, including DataMelt, Elki, Orange, Rattle, scikit-learn and Weka. Some software vendors provide open source options, too. For example, Knime combines an open source analytics platform with commercial software for managing data science applications, while companies such as Dataiku and H2O.ai offer free versions of their tools.
Benefits of data mining
In general, the business benefits of data mining come from the increased ability to uncover hidden patterns, trends, correlations and anomalies in data sets. That information can be used to improve business decision-making and strategic planning through a combination of conventional data analysis and predictive analytics.
Specific data mining benefits include the following:
Ultimately, data mining initiatives can lead to higher revenue and profits, as well as competitive advantages that set companies apart from their business rivals.
Industry examples of data mining
Here's how organizations in some industries use data mining as part of analytics applications:
Data mining vs. data analytics and data warehousing
Data mining is sometimes viewed as being synonymous with data analytics. But it's predominantly seen as a specific aspect of data analytics that automates the analysis of large data sets to discover information that otherwise couldn't be detected. That information can then be used in the data science process and in other BI and analytics applications.
Data warehousing supports data mining efforts by providing repositories for the data sets. Traditionally, historical data has been stored in enterprise data warehouses or smaller data marts built for individual business units or to hold specific subsets of data. Now, though, data mining applications are often served by data lakes that store both historical and streaming data and are based on big data platforms like Hadoop and Spark, NoSQL databases or cloud object storage services.
Data mining history and origins
Data warehousing, BI and analytics technologies began to emerge in the late 1980s and early 1990s, providing an increased ability to analyze the growing amounts of data that organizations were creating and collecting. The term data mining was in use by 1995, when the First International Conference on Knowledge Discovery and Data Mining was held in Montreal.
The event was sponsored by the Association for the Advancement of Artificial Intelligence, or AAAI, which also held the conference annually for the next three years. Since 1999, the conference, popularly known as KDD followed by the year (KDD 2021, for example), has been organized primarily by SIGKDD, the special interest group on knowledge discovery and data mining within the Association for Computing Machinery.
A technical journal, Data Mining and Knowledge Discovery, published its first issue in 1997. Initially a quarterly, it's now published bimonthly and contains peer-reviewed articles on data mining and knowledge discovery theories, techniques and practices.
Another publication, the American Journal of Data Mining and Knowledge Discovery, was launched in 2016.
This was last updated in September 2021.
In what type of research does a researcher compare multiple segments of the population at the same time?
Cross-sectional research compares multiple segments of a population at a single time.
Which type of research looks at data administered over an extended period of time?
Longitudinal research is a research design in which data-gathering is administered repeatedly over an extended period of time.
What is the only type of research method that allows for causation to be determined?
A controlled experiment is the only research method that can establish a cause-and-effect relationship.
What is archival research in psychology?
Archival research is the use of books, journals, historical documents, and other existing records or data available in storage in scientific research. It allows for unobtrusive observation of human activity in natural settings and permits the study of phenomena that otherwise cannot easily be investigated.