Machine learning is a field of artificial intelligence that uses statistical techniques to give computer systems the ability to “learn” from data, without being explicitly programmed.
The name machine learning was coined in 1959 by Arthur Samuel. Machine learning explores the study and construction of algorithms that can learn from and make predictions on data; such algorithms overcome strictly static program instructions by making data-driven predictions or decisions, building a model from sample inputs. Machine learning is employed in a range of computing tasks where designing and programming explicit algorithms with good performance is difficult or infeasible; example applications include email filtering, detection of network intruders, and computer vision.
It has strong ties to mathematical optimization, which delivers methods, theory and application domains to the field. Machine learning is sometimes conflated with data mining.
Broadly, this work can be divided into data organisation and data analytics: frameworks such as Hadoop are used to organise big data, and data analytics then works on that data with the help of different methods such as machine learning or deep learning.
Machine learning tasks
Machine learning tasks are typically classified into several broad categories:
- Supervised learning: The computer is presented with example inputs and their desired outputs, given by a “teacher”, and the goal is to learn a general rule that maps inputs to outputs. As special cases, the input signal can be only partially available, or restricted to special feedback.
- Semi-supervised learning: The computer is given only an incomplete training signal: a training set with some (often many) of the target outputs missing.
- Active learning: The computer can only obtain training labels for a limited set of instances (based on a budget), and also has to optimize its choice of objects to acquire labels for. When used interactively, these can be presented to the user for labeling.
- Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end (feature learning).
- Reinforcement learning: Data (in the form of rewards and punishments) is given only as feedback to the program’s actions in a dynamic environment, such as driving a vehicle or playing a game against an opponent.
List of Common Machine Learning Algorithms
Here is a list of commonly used machine learning algorithms. These algorithms can be applied to almost any data problem:
- Linear Regression
- Logistic Regression
- Decision Tree
- Naive Bayes
- Random Forest
- Dimensionality Reduction Algorithms
- Gradient Boosting algorithms
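As a minimal, illustrative sketch of the first algorithm in the list, the snippet below fits a simple linear regression to toy data using the closed-form least-squares solution (the data and function names are made up for illustration):

```python
# Sketch: simple linear regression y = a*x + b fitted by ordinary least
# squares on a toy data set.

def fit_linear_regression(xs, ys):
    """Return slope a and intercept b minimising the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Toy data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_linear_regression(xs, ys)
```

In practice one would reach for a library implementation, but the closed-form version makes the underlying computation visible.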
Machine Learning Applications
1. Image Recognition
One of the most common uses of machine learning is image recognition. There are many situations where the task is to classify an object represented as a digital image. For digital images, the measurements describe the outputs of each pixel in the image.
In the case of a black and white image, the intensity of each pixel serves as one measurement. So if a black and white image has N*N pixels, the total number of pixels, and hence of measurements, is N².
In a colored image, each pixel is considered as providing 3 measurements: the intensities of the 3 main color components, i.e. RGB. So for an N*N colored image there are 3N² measurements.
- For face detection – The categories might be face versus no face present. There might be a separate category for each person in a database of several individuals.
- For character recognition – We can segment a piece of writing into smaller images, each containing a single character. The categories might consist of the 26 letters of the English alphabet, the 10 digits, and some special characters.
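The pixel-counting argument above can be sketched in a few lines: an image is flattened into a vector of measurements, giving N² numbers for grayscale and 3N² for RGB (the tiny 2×2 images below are made up for illustration):

```python
# Sketch: turning images into feature vectors ("measurements").
# A grayscale N x N image yields N^2 numbers; an RGB image yields 3*N^2.

def grayscale_features(image):
    """Flatten an N x N grid of pixel intensities into one vector."""
    return [pixel for row in image for pixel in row]

def rgb_features(image):
    """Flatten an N x N grid of (R, G, B) triples: 3 measurements per pixel."""
    return [channel for row in image for pixel in row for channel in pixel]

n = 2
gray = [[0, 255], [128, 64]]                  # 2 x 2 grayscale image
color = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]      # 2 x 2 RGB image

assert len(grayscale_features(gray)) == n * n       # N^2 measurements
assert len(rgb_features(color)) == 3 * n * n        # 3 * N^2 measurements
```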
2. Speech Recognition
Speech recognition (SR) is the translation of spoken words into text. It is also known as “automatic speech recognition” (ASR), “computer speech recognition”, or “speech to text” (STT).
In speech recognition, a software application recognizes spoken words. The measurements in this application might be a set of numbers that represent the speech signal. We can segment the signal into portions that contain distinct words or phonemes. In each segment, we can represent the speech signal by the intensities or energy in different time-frequency bands.
Although the details of signal representation are outside the scope of this program, we can represent the signal by a set of real values.
Speech recognition applications include voice user interfaces such as voice dialing, call routing, and domotic (home automation) appliance control. It can also be used for simple data entry, preparation of structured documents, speech-to-text processing, and voice input in aircraft.
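As a highly simplified sketch of the representation described above: real recognisers use energies in time-frequency bands (e.g. a spectrogram), but even the energy of each fixed-length time frame already turns a raw signal into a small set of real values (the toy signal below is synthetic):

```python
import math

# Sketch: representing a signal as a set of real values by computing
# the energy of each fixed-length time frame.

def frame_energies(signal, frame_len):
    """Split the signal into frames and return the energy of each frame."""
    return [sum(s * s for s in signal[i:i + frame_len])
            for i in range(0, len(signal), frame_len)]

# Toy "signal": silence followed by a loud sine burst.
signal = [0.0] * 100 + [math.sin(0.3 * t) for t in range(100)]
energies = frame_energies(signal, 50)

# The later frames carry far more energy than the silent ones.
assert energies[0] == 0.0 and energies[-1] > energies[0]
```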
3. Medical Diagnosis
ML provides methods, techniques, and tools that can help solve diagnostic and prognostic problems in a variety of medical domains. It is being used for the analysis of the importance of clinical parameters and of their combinations for prognosis, e.g. prediction of disease progression, for the extraction of medical knowledge for outcomes research, for therapy planning and support, and for overall patient management. ML is also being used for data analysis, such as detection of regularities in the data by appropriately dealing with imperfect data, interpretation of continuous data used in the Intensive Care Unit, and for intelligent alarming resulting in effective and efficient monitoring.
It is argued that the successful implementation of ML methods can help the integration of computer-based systems in the healthcare environment providing opportunities to facilitate and enhance the work of medical experts and ultimately to improve the efficiency and quality of medical care.
In medical diagnosis, the main interest is in establishing the existence of a disease, followed by its accurate identification. There is a separate category for each disease under consideration and one category for cases where no disease is present. Here, machine learning improves the accuracy of medical diagnosis by analyzing patient data.
The measurements in this application are typically the results of certain medical tests (e.g. blood pressure, temperature, and various blood tests) or medical diagnostics (such as medical images), the presence/absence/intensity of various symptoms, and basic physical information about the patient (age, sex, weight, etc.). On the basis of these measurements, doctors narrow down the disease afflicting the patient.
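A minimal sketch of this setup (illustrative only, not medical advice): a patient's measurement vector is assigned to a disease category using a 1-nearest-neighbour rule learned from labelled past cases. The measurements and labels below are entirely made up:

```python
# Sketch: classifying a patient's measurements (temperature, systolic BP)
# into "disease" / "no disease" with a 1-nearest-neighbour rule.

def nearest_label(train, point):
    """Return the label of the training example closest to `point`."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda ex: dist2(ex[0], point))[1]

# Made-up labelled cases: (temperature in C, systolic BP) -> category.
train = [((36.8, 118), "no disease"),
         ((37.0, 122), "no disease"),
         ((39.5, 140), "disease"),
         ((39.9, 145), "disease")]

assert nearest_label(train, (39.7, 142)) == "disease"
assert nearest_label(train, (36.9, 120)) == "no disease"
```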
4. Statistical Arbitrage
In finance, statistical arbitrage refers to automated trading strategies that are typically short-term and involve a large number of securities. In such strategies, the user tries to implement a trading algorithm for a set of securities on the basis of quantities such as historical correlations and general economic variables. These measurements can be cast as a classification or estimation problem. The basic assumption is that prices will move towards a historical average.
We apply machine learning methods to obtain an index arbitrage strategy. In particular, we employ linear regression and support vector regression (SVR) on the prices of an exchange-traded fund and a stream of stocks. By using principal component analysis (PCA) to reduce the dimension of the feature space, we observe the benefits and note the issues in the application of SVR. To generate trading signals, we model the residuals from the previous regression as a mean-reverting process.
In the case of classification, the categories might be sell, buy, or do nothing for each security. In the case of estimation, one might try to predict the expected return of each security over a future time horizon. In that case, one typically needs to use the estimates of the expected return to make a trading decision (buy, sell, etc.).
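The classification framing above can be sketched as follows (a toy illustration, not trading advice): the residual is the gap between a security's latest price and its historical mean, and a standardised residual beyond a threshold triggers a signal under the mean-reversion assumption:

```python
# Sketch: generating buy / sell / do-nothing signals from a
# mean-reverting residual.

def trading_signal(prices, threshold=1.0):
    """Classify the latest price as 'buy', 'sell', or 'do nothing'."""
    n = len(prices)
    mean = sum(prices) / n
    var = sum((p - mean) ** 2 for p in prices) / n
    std = var ** 0.5
    z = (prices[-1] - mean) / std          # standardised residual
    if z > threshold:
        return "sell"                      # price far above its mean
    if z < -threshold:
        return "buy"                       # price far below its mean
    return "do nothing"

prices = [100, 101, 99, 100, 101, 99, 100, 110]   # last price spikes high
assert trading_signal(prices) == "sell"
```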
5. Learning Associations
Learning associations is the process of developing insights into the various associations between products. A good example is how seemingly unrelated products may reveal an association to one another when analyzed in relation to the buying behaviors of customers.
One application of machine learning is studying the associations between the products people buy, which is also known as basket analysis. If a buyer buys ‘X’, how likely is he or she to also buy ‘Y’, because of a relationship that can be identified between them? This is how relationships such as the one between fish and chips are discovered, and when a new product launches in the market, knowing these existing relationships helps in identifying new ones. Knowing these relationships could help in suggesting the associated product to the customer, for a higher likelihood of the customer buying it. It can also help in bundling products into a better package.
This learning of associations between products by a machine is learning associations. Once an association is found by examining a large amount of sales data, analysts can develop a rule to derive a conditional probability, such as the probability of a customer buying ‘Y’ given that they bought ‘X’.
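The conditional probability behind basket analysis can be estimated directly from sales data, as in this minimal sketch (the baskets below are made up for illustration): P(Y | X) = P(X and Y) / P(X).

```python
# Sketch: estimating P(customer buys y | customer buys x) from baskets.

def confidence(baskets, x, y):
    """Estimated conditional probability P(y | x) over the baskets."""
    with_x = [b for b in baskets if x in b]
    with_both = [b for b in with_x if y in b]
    return len(with_both) / len(with_x)

# Made-up sales baskets.
baskets = [{"fish", "chips"},
           {"fish", "chips", "soda"},
           {"fish", "lemon"},
           {"bread", "butter"}]

# 2 of the 3 baskets containing fish also contain chips.
assert abs(confidence(baskets, "fish", "chips") - 2 / 3) < 1e-9
```

In association-rule mining this quantity is called the confidence of the rule X → Y.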
6. Classification
Classification is the process of placing each individual from the population under study into one of many classes. Each individual is described by a set of independent variables (attributes).
Classification helps analysts to use measurements of an object to identify the category to which that object belongs. To establish an efficient rule, analysts use data. Data consists of many examples of objects with their correct classification.
For example, before a bank decides to disburse a loan, it assesses customers on their ability to repay the loan, by considering factors such as the customer’s earnings, age, savings, and financial history. This information is taken from past loan data and is used to create a relationship between customer attributes and the associated risks.
Consider the example of a bank computing the probability of any of its loan applicants defaulting on the loan repayment. To compute the probability of default, the system will first need to classify the available data into certain groups, described by a set of rules prescribed by the analysts.
Once the classification is done, we can compute the probability as needed. These probability computations can be carried out across all sectors for varied purposes.
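A minimal sketch of the bank example, with entirely made-up records: applicants are first classified into groups by a simple analyst-prescribed rule, then the default probability of each group is estimated from past outcomes:

```python
# Sketch: classify past loans into groups, then estimate each group's
# default probability from the outcomes.

def default_rate_by_group(records, group_of):
    """Map each group to the fraction of its past loans that defaulted."""
    totals, defaults = {}, {}
    for attrs, defaulted in records:
        g = group_of(attrs)
        totals[g] = totals.get(g, 0) + 1
        defaults[g] = defaults.get(g, 0) + int(defaulted)
    return {g: defaults[g] / totals[g] for g in totals}

# Made-up records: (earnings, savings) -> did the customer default?
records = [((60_000, 20_000), False), ((58_000, 15_000), False),
           ((20_000, 1_000), True),   ((22_000, 2_000), False)]

# Analyst-prescribed rule: group by earnings.
group = lambda a: "low risk" if a[0] >= 40_000 else "high risk"
rates = default_rate_by_group(records, group)
assert rates["low risk"] == 0.0 and rates["high risk"] == 0.5
```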
7. Prediction
Prediction is one of the hottest applications of machine learning. Take the example of retail: earlier, we were able to get insights like a sales report for last month / year / 5 years / Diwali / Christmas. This type of reporting is called historical reporting. But nowadays businesses are more interested in finding out what their sales will be next month / year / Diwali, etc., so that they can take the required decisions (related to procurement, stocks, etc.) on time.
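As a toy sketch of moving from historical reporting to prediction, the snippet below forecasts next month's sales by extrapolating a straight-line trend fitted to past months (the sales figures are made up; real forecasting models are far richer):

```python
# Sketch: forecast the next period by fitting y = a*t + b to past sales
# with least squares and extrapolating one step ahead.

def forecast_next(sales):
    """Fit a linear trend to past sales and predict the next period."""
    n = len(sales)
    ts = list(range(n))
    mean_t, mean_y = sum(ts) / n, sum(sales) / n
    a = (sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, sales))
         / sum((t - mean_t) ** 2 for t in ts))
    b = mean_y - a * mean_t
    return a * n + b

# Monthly sales growing by 10 units per month.
sales = [100, 110, 120, 130]
assert abs(forecast_next(sales) - 140) < 1e-9
```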
8. Extraction
Information Extraction (IE) is another application of machine learning. It is the process of extracting structured information from unstructured data such as web pages, articles, blogs, business reports, and e-mails. The output produced by information extraction is typically maintained in a relational database.
The extraction process takes a set of documents as input and produces structured data as output, in a summarized form such as a spreadsheet or a table in a relational database.
Now-a-days extraction is becoming a key in big data industry.
As we know, a huge volume of data is being generated, most of it unstructured. The first key challenge is the handling of this unstructured data: converting it to a structured form based on some pattern, so that it can be stored in an RDBMS.
Apart from this, the data collection mechanism is also changing. Earlier we collected data in batches, like End-of-Day (EOD), but now businesses want the data as soon as it is generated, i.e. in real time.
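A minimal sketch of extraction (the e-mail text and field patterns below are made up for illustration): regular expressions pull structured fields out of free text so the result can be stored as a row in a relational table. Real IE systems learn such patterns rather than hand-coding them:

```python
import re

# Sketch: extracting structured fields from unstructured text with
# hand-written regular expressions.

def extract_order(text):
    """Pull (order id, amount) out of a free-text message."""
    order = re.search(r"order\s+#(\d+)", text, re.IGNORECASE)
    amount = re.search(r"\$(\d+(?:\.\d{2})?)", text)
    return {"order_id": order.group(1) if order else None,
            "amount": float(amount.group(1)) if amount else None}

mail = "Hi, I was charged $49.99 twice for order #12345 last week."
row = extract_order(mail)
assert row == {"order_id": "12345", "amount": 49.99}
```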
9. Regression
We can apply machine learning to regression as well.
Assume that x = x1, x2, x3, …, xn are the input variables and y is the outcome variable. In this case, we can use machine learning technology to produce the output (y) on the basis of the input variables (x). You can use a model to express the relationship between the various parameters as follows:
y = g(x), where g is a function that depends on the specific characteristics of the model.
In regression, we can use the principle of machine learning to optimize the parameters, cutting the approximation error and calculating the closest possible outcome.
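As a sketch of this parameter optimization, the snippet below takes g(x) = a·x with a single parameter a (an assumed toy model) and minimises the squared approximation error by gradient descent:

```python
# Sketch: optimise the parameter a of the model y = a*x by gradient
# descent on the squared error sum((a*x - y)^2).

def fit_by_gradient_descent(xs, ys, lr=0.01, steps=500):
    """Minimise sum((a*x - y)^2) over a, starting from a = 0."""
    a = 0.0
    for _ in range(steps):
        # d/da of sum((a*x - y)^2) = sum(2 * (a*x - y) * x)
        grad = sum(2 * (a * x - y) * x for x, y in zip(xs, ys))
        a -= lr * grad
    return a

# Data generated from y = 3x, so the optimal parameter is a = 3.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [1.5, 3.0, 4.5, 6.0]
a = fit_by_gradient_descent(xs, ys)
assert abs(a - 3.0) < 1e-6
```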
We can also use machine learning for function optimization: we can choose to alter the inputs to get a better model, which gives us a new and improved model to work with. This is known as response surface design.
In conclusion, machine learning is an incredible breakthrough in the field of artificial intelligence. While it does have some frightening implications when you think about it, these machine learning applications are just a few of the many ways this technology can improve our lives.