Introduction to the CRISP-DM Framework

CRISP-DM (Cross-Industry Standard Process for Data Mining) is one of the most popular methodologies in the data mining community. It was created in 1996 and is still used today. It gives anyone building a machine learning model a streamlined approach to follow and helps ensure that none of the required steps are missed.



The CRISP-DM methodology provides a structured approach to planning a data mining project.

CRISP-DM consists of the following phases, which aren't mutually exclusive and can occur in parallel:
  1. Business Understanding
  2. Data Understanding
  3. Data Preparation
  4. Modeling
  5. Evaluation
  6. Deployment
Below, I outline the key steps and significance of each of these phases.
  
Business understanding: This phase is often handled by domain experts. Usually, a business person formulates the business problem, such as selling more units of a certain product.

Data understanding: This phase may also require input from domain experts, but a technical specialist usually needs to be more involved than in the business understanding phase. The domain expert may be proficient with spreadsheet programs but have trouble with complicated data. This phase is often called exploration.
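
To make the exploration step more concrete, here is a minimal sketch using only the Python standard library. The records (advertising spend and units sold) and the `summarize` helper are made up for illustration; a real project would more likely use a dataframe library.

```python
import statistics

# Hypothetical records: (advertising spend, units sold).
# None marks a missing value that exploration should surface.
records = [
    (10.0, 25),
    (20.0, 41),
    (None, 38),
    (30.0, 62),
    (40.0, 79),
]


def summarize(values):
    """Return basic descriptive statistics for one column."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "min": min(present),
        "max": max(present),
    }


spend_summary = summarize([r[0] for r in records])
units_summary = summarize([r[1] for r in records])
print(spend_summary)
print(units_summary)
```

Even a summary this small shows which columns have gaps and what range the values cover, which is the kind of question this phase is meant to answer.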

Data preparation: In this phase, too, a domain expert who knows only Microsoft Excel may not be able to help. This is where we create our training and test datasets. This phase is often called preprocessing.
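
As a rough illustration of creating training and test datasets, the sketch below drops incomplete rows and then splits the rest at random. The data, the `train_test_split` helper, and the 25% test fraction are assumptions for this example, not part of CRISP-DM itself.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical (feature, target) rows, plus one row with a missing feature.
rows = [(float(x), 2.0 * x + 5.0) for x in range(1, 9)]
rows_with_gaps = rows + [(None, 12.0)]

# Simple cleaning step: keep only rows without missing values.
clean = [r for r in rows_with_gaps if None not in r]

train, test = train_test_split(clean)
print(len(train), "training rows,", len(test), "test rows")
```

Keeping the test rows separate from the start means the later evaluation is done on data the model has never seen.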

Modeling: This is the phase most people associate with machine learning. In this phase, we formulate a model and fit it to our data.
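
A very small example of formulating a model and fitting it: the sketch below assumes a straight-line model, y = slope * x + intercept, and fits it with ordinary least squares. The training values and the `fit_line` helper are hypothetical; in practice a library such as scikit-learn would usually do this work.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: advertising spend vs. units sold.
train_x = [10.0, 20.0, 30.0, 40.0]
train_y = [25.0, 45.0, 65.0, 85.0]

slope, intercept = fit_line(train_x, train_y)
print("slope:", slope, "intercept:", intercept)
```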

Evaluation: In this phase, we evaluate how well the model fits the data to check whether we were able to solve our business problem. 
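
One way to sketch this check is to score predictions on held-out test data and compare the error with a target agreed with the business. The model parameters, the test values, and the target of 2.0 units below are all made up for illustration.

```python
def predict(x, slope=2.0, intercept=5.0):
    """Hypothetical fitted model: a straight line with assumed parameters."""
    return slope * x + intercept


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical held-out test data that was not used for fitting.
test_x = [15.0, 35.0]
test_y = [36.0, 73.0]

predictions = [predict(x) for x in test_x]
mae = mean_absolute_error(test_y, predictions)

# Assumed business target: be off by at most 2 units on average.
target_error = 2.0
meets_target = mae <= target_error
print("MAE:", mae, "meets target:", meets_target)
```

Tying the metric to a target like this keeps the evaluation focused on the business problem rather than on model fit alone.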

Deployment: This phase usually involves setting up the system in a production environment. Typically, this is done by a specialized team.
