Giving our computers the ability to exhibit intelligence artificially is known as artificial intelligence. Artificial intelligence, or machine intelligence, in a broader sense tries to mimic human intelligence and can sometimes perform even better than humans once the learning stage is complete.
One fantastic example of machines surpassing human intelligence is the analysis of a huge chunk of data. Suppose a human is studying a large dataset and, while doing so, observes 500 similar samples. They will probably become biased and skip going through every sample after that, which can sometimes lead to mistakes. Machines, on the other hand, will analyze every sample thoroughly and provide insights that are less error-prone.
After going through this blog, we will be able to understand the following things:
So let's start without any further delay.
In the current era of software engineering, we mainly have traditional Information Technology (IT) infrastructure where business teams (or experts) make decisions and convey them to developers. Developers write the logic as programs and integrate it with the IT system, which produces output in the form of decisions. At the current stage of development, ML techniques cannot replace this entire process, but they can complement it. For this, there can be two scenarios,
In this case, IT experts define rules based on their experience. Our Machine Learning models analyze the data and provide their insights, which can support the rules made by experts or even discard them.
For example, in traditional banking systems, bank experts make a rule that if a farmer has not repaid the previous loan on time, they should not be granted another loan, as there is a higher chance they will not repay it. But suppose our machine learning model analyzes that farmer's previous loan repayment history and finds that whenever the farmer takes a loan in the spring season, he repays it very quickly because he makes a much larger profit from spring crops. In such a scenario, the ML model helps the bank experts rectify their decision.
In this case, IT experts cannot define explicit rules for some tasks. Our ML algorithms try to mimic the cognitive intelligence possessed by humans.
For example, a bank manager is reading a customer's mail and trying to identify whether the customer is happy with the banking services or not. He has made a rule that if the content of the mail includes the word "fantastic", the customer is satisfied. Suppose the customer writes in the mail,
The banking services are so fantastic that every issue gets solved within 3 to 4 years.
Although the mail includes words like "fantastic" and phrases like "issues getting solved", the bank manager can identify the sarcasm and see that the customer is unhappy. We cannot express this intelligence in hard-coded rules. Here, machines try to learn humans' cognitive intelligence.
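To make this concrete, here is a minimal, hypothetical sketch of such a hard-coded keyword rule in Python; the mail text and the rule itself are illustrative assumptions, not part of any real banking system.

```python
# A hypothetical hard-coded rule: flag the customer as "happy"
# whenever the mail contains the word "fantastic".
def is_customer_happy(mail_text: str) -> bool:
    return "fantastic" in mail_text.lower()

mail = ("The banking services are so fantastic that "
        "every issue gets solved within 3 to 4 years.")

# The rule reports a happy customer, although the mail is clearly sarcastic.
print(is_customer_happy(mail))  # True
```

A model trained on many labelled mails has a chance to pick up such sarcasm from context, which is exactly what a fixed keyword rule cannot do.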
If we look closely, we see that decision-supportive use cases generally involve structured data with rules defined in numerical format. In cognitive use cases, on the other hand, the data is not in numbers but in the form of images, videos, or text. Computers understand only numbers, so we need to convert these data samples into a computer-readable format.
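As a small illustration of this conversion, the sketch below turns two made-up text samples into a numeric bag-of-words matrix; scikit-learn's CountVectorizer is just one of many possible encodings and is used here only as an assumption for demonstration.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Two made-up text samples that a computer cannot use directly.
texts = [
    "the customer is happy with the service",
    "the customer is unhappy and wants a refund",
]

# Bag-of-words: each sentence becomes a row of word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

print(vectorizer.get_feature_names_out())  # the learned vocabulary
print(X.toarray())                         # the numeric representation
```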
In computer science, and specifically in the IT sector, business and technology need a high level of coordination. The gap between the "business demand" and the "feasibility of that demand via technology" is known as the IT error. The greater the coordination between business experts and coders, the smaller the IT error.
Machine Learning solutions need the closest coordination between the technology and business verticals. In decision-supportive use cases, the technology requires rules formulated by business experts and the corresponding data to support those rules or derive new ones. In cognitive use cases, a close watch on the learning process is required, as humans have to evaluate the model's accuracy.
Any Machine Learning project coming from business experts has to pass through mainly seven different verticals or phases. All seven are shown in the image above. Let's discuss them one by one.
We might have heard that a problem clearly identified is a problem half solved. Technology is developed to solve a particular problem, and this process requires a "clear definition" of the problem we want to solve. It is more like goal finalization.
For example, suppose we are the head of a banking firm and want our customers to be happy. So far, we have only roughly stated the central objective, but this cannot be a problem statement that technology would target. Machine learning techniques could enhance the customer experience in multiple ways; one could be alerting customers in case of fraudulent transactions.
Now that we have a clear problem statement in mind, we can start thinking from the requirements perspective. We want to solve the defined problem using machine learning, and the core requirement for this technique is historical data. So let's move to our next stage, Data Gathering.
There can be three scenarios,
In this case, we can start utilizing the available data and find out whether it is present in a supervised, unsupervised, or semi-supervised format. Based on this, we will begin analyzing the dataset further.
In the second case, we need to analyze the data and check whether the available attributes are sufficient to build the model or not. This stage involves trial and error. We can start categorizing the data and check whether ML techniques can fulfill our objective. If not, we will have to capture a sufficient amount of data from scratch.
In such a scenario, we will have to gather or record the data from scratch. We might need some domain knowledge about the problem statement or may consult some experts. For example, suppose we are trying to detect whether a transaction is fraudulent. What attributes (previous month's transaction history, type of shopping, etc.) would be essential to learn the mapping function successfully?
Note: There are multiple platforms like Kaggle or government data portals where we can easily find datasets for academic purposes.
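As a minimal sketch, assuming we have downloaded such a dataset as a CSV file named transactions.csv (a hypothetical name, with hypothetical column names), loading and inspecting the candidate attributes with pandas could look like this:

```python
import pandas as pd

# Hypothetical fraud-detection dataset downloaded from Kaggle or a data portal.
df = pd.read_csv("transactions.csv")

# Inspect what we actually collected: number of samples, available attributes
# (e.g. last month's transaction count, type of shopping, amount) and the label.
print(df.shape)
print(df.columns)
print(df.head())
print(df["is_fraud"].value_counts())  # hypothetical label column
```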
Once we collect the dataset, we need to process it. We cannot expect our data to be accurate, as multiple dependencies exist. So let's process it to make it usable.
The collected data is in a raw format and cannot be used directly in the model-building process. We need to pass this raw data through a "set of defined processes", known as data pre-processing steps. These steps vary based on the project requirements and the quality of the raw data. Some popular data pre-processing steps for structured data are,
After this step, we should have rich data in a machine-readable format. Unstructured data, like text, images, and videos, needs different processing steps to make it understandable to our systems. We have also covered text data pre-processing in our blog.
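For structured data, a minimal pre-processing sketch with scikit-learn could look like the following; the column names and values are hypothetical, and imputation, one-hot encoding, and scaling are only a few of the many possible steps.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical raw structured data with missing values and a text category.
df = pd.DataFrame({
    "amount": [120.0, None, 560.5, 40.0],
    "shopping_type": ["grocery", "online", None, "grocery"],
})

numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # fill missing numbers
    ("scale", StandardScaler()),                   # bring values to a common scale
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),  # categories -> numbers
])

preprocess = ColumnTransformer([
    ("num", numeric, ["amount"]),
    ("cat", categorical, ["shopping_type"]),
])

X = preprocess.fit_transform(df)  # machine-readable numeric matrix
print(X)
```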
After this step, we have the machine-readable data, and we can start using machine learning algorithms to learn the dependencies.
There are many algorithms in the market, so which algorithm should we target? Model selection mainly depends upon these factors,
Once the model is finalized, we need various sets to check whether our selection is good or not. Let's discuss how to do this data splitting.
Once we have decided on the algorithm we want to train, we need to make three sets from the processed data: the training, validation, and test datasets. Let's first learn about these three datasets.
In some online solutions, we might find no validation or test data. People generally get confused about the use cases of these two datasets, making it one of the most frequently asked basic questions in ML interviews. While training a Machine Learning model, we use the validation dataset to tune the hyperparameters. So we can say that, indirectly, our model knows which set of hyperparameters achieves better results on the validation dataset. The test data, on the other hand, is unseen data used only once the model development phase is complete.
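Here is a minimal sketch of creating the three sets with scikit-learn; the toy dataset and the 70/15/15 split ratio are assumptions made only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Toy stand-in for the pre-processed features and labels from earlier steps.
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# First carve out the training set, then split the rest into validation and test.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.30, random_state=42
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, random_state=42
)

# Training data fits the model, validation data tunes hyperparameters,
# and the test data stays unseen until the very end.
print(len(X_train), len(X_val), len(X_test))
```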
There can be mainly three ways of doing so,
Once we have made the final model, we need to evaluate it on several metrics. Let's understand them.
Once our model is ready, we must evaluate it rigorously on different metrics and check whether it beats previous methodologies. The evaluation metrics can vary based on project requirements.
Sometimes we even need to design our own evaluation metric, but that is beyond the scope of this blog. If our model performs as per expectations, the next phase is rolling it out to the world.
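For a classification task such as fraud detection, a minimal evaluation sketch with scikit-learn could look like this; the model choice (logistic regression) and the toy data are assumptions made purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split

# Toy stand-in for the prepared dataset.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Accuracy alone can be misleading on imbalanced data (e.g. rare frauds),
# so precision, recall and F1 are usually reported alongside it.
print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("f1       :", f1_score(y_test, y_pred))
```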
Once our model clears the evaluation criteria, we integrate it with our mobile apps and website APIs. It will start giving predictions on user data in real time. But there can be several scenarios in which we might need to retrain our ML algorithms. Some of these scenarios are:
There can be more reasons, and as we discussed, this technology demands close coordination with the business side so that whenever something new is needed, retraining can solve the problem.
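Coming back to the rollout step itself, one common way to ship a trained scikit-learn model is to serialize it and load it inside the serving code of an app or API; the sketch below assumes this approach, with a hypothetical file name and hypothetical feature values.

```python
import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train a toy model standing in for the finalized, evaluated model.
X, y = make_classification(n_samples=500, n_features=4, random_state=1)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Persist the model so a website or mobile backend can load it later.
joblib.dump(model, "fraud_model.joblib")  # hypothetical file name

# Inside the serving code: load once, then predict on incoming user data.
served_model = joblib.load("fraud_model.joblib")
new_transaction = [[0.2, -1.3, 0.7, 0.05]]  # hypothetical feature values
print(served_model.predict(new_transaction))
```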
In this article, we have learned about the different use cases of Machine Learning technology. We also discussed the complete pathway, including problem formulation, data gathering, pre-processing, model selection, data splitting, model building, performance evaluation, and finally, the model rollout. These steps need to be followed while building any Machine Learning project. We hope you enjoyed it.