2) Unsupervised Machine Learning
When data contains only inputs and no outputs, it is called unlabeled data. Identifying patterns or groups in such data, without any output to guide the process, is called Unsupervised Machine Learning.
Example: Student data (both columns are inputs; there is no output column)

| IQ  | CGPA |
|-----|------|
| 110 | 8.5  |
| 120 | 9.1  |
| 100 | 7.8  |
| 115 | 8.8  |
| 105 | 8.0  |
Because unsupervised learning has only input data and no output, it is used for tasks such as clustering, dimensionality reduction, anomaly detection, and association rule learning.
2.1 Types of Unsupervised Machine Learning:
******************** >< ********************
I. Clustering:
Clustering is an unsupervised machine learning technique that groups similar data points into clusters based on their features, without using any labeled data.
For example, suppose we have a dataset of IQ and CGPA. We plot this data on a 2D coordinate system, where the X-axis represents IQ and the Y-axis represents CGPA. A clustering algorithm detects groups of students, such as high IQ–high CGPA, high IQ–low CGPA, low IQ–high CGPA, and low IQ–low CGPA. In this way, students are grouped into categories. When a new student arrives, the algorithm places that student into the closest matching group, and we can assign labels like 1, 2, 3, or 4 to the groups.
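The grouping idea above can be sketched with k-means, one common clustering algorithm (the text does not name a specific one, so this choice is an assumption). The sketch uses the five IQ/CGPA rows from the student table, and it assumes k = 2 groups, standardized features, and a simple farthest-first starting rule; the helper names (`fit_scaler`, `kmeans`, `assign_group`) are illustrative only.

```python
import math

# IQ and CGPA pairs from the student table (inputs only, no labels).
students = [(110, 8.5), (120, 9.1), (100, 7.8), (115, 8.8), (105, 8.0)]

def fit_scaler(rows):
    """Return per-column means and (population) standard deviations."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def scale(row, means, stds):
    """Standardize one row so IQ and CGPA are on comparable scales."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def distance(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))

def initial_centroids(points, k):
    """Assumed start rule: first point, then the point farthest from those chosen."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(distance(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iterations=100):
    """Alternate between assigning points and recomputing cluster means."""
    centroids = initial_centroids(points, k)
    for _ in range(max_iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(c) / len(members) for c in zip(*members)]
                           if members else centroids[i])
        if updated == centroids:
            break
        centroids = updated
    return centroids

means, stds = fit_scaler(students)
points = [scale(s, means, stds) for s in students]
centroids = kmeans(points, k=2)
labels = [nearest(p, centroids) for p in points]

def assign_group(student):
    """Place a new (IQ, CGPA) student into the nearest existing group."""
    return nearest(scale(student, means, stds), centroids)

print(labels)
print(assign_group((118, 9.0)))
```

The group numbers themselves carry no meaning; they are only identifiers, which is why the labels 1, 2, 3, or 4 can be assigned afterwards. Standardizing first matters here because IQ spans a much larger numeric range than CGPA and would otherwise dominate the distance.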
******************** >< ********************
II. Dimensionality Reduction:
Dimensionality reduction in machine learning is the process of reducing the number of input dimensions (also called columns or features) in a dataset while preserving as much important information as possible.
Original Dataset (High Dimensions)
Suppose we collect the following data for each student:
1. IQ
2. Study Hours per Day
3. Attendance (%)
4. Internal Exam Marks
5. Assignment Score
6. Project Marks
7. Mid-Sem Marks
8. End-Sem Marks
Total: 8 features (8 dimensions)
Problem
- Many features are correlated (exam marks, assignments, projects)
- More features → more complex model, higher computation, risk of overfitting
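Apply PCA (Principal Component Analysis)
PCA combines correlated features into a smaller set of principal components. Each component is a weighted combination of the original features. In a simplified reading of this example, the components might be interpreted as:
- PC1 (Academic Performance): Internal Exam Marks, Assignment Score, Project Marks, Mid-Sem Marks, End-Sem Marks
- PC2 (Effort & Consistency): Study Hours per Day, Attendance (%)
- PC3 (Cognitive Ability): IQ
Result: reduced from 8 dimensions to 3 dimensions.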
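To make the idea concrete on a smaller scale, here is a minimal sketch that reduces the two columns of the earlier student table (IQ and CGPA) to a single principal component. It is not the 8-to-3 reduction described above, since no data for those eight features is given. The sketch assumes standardized features and uses power iteration to find the first component; the function names are illustrative.

```python
import math

# IQ and CGPA pairs from the student table.
students = [(110, 8.5), (120, 9.1), (100, 7.8), (115, 8.8), (105, 8.0)]

def standardize(rows):
    """Rescale every column to mean 0 and (population) standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def covariance_matrix(rows):
    """Covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]

def first_principal_component(cov, iterations=200):
    """Power iteration: repeatedly multiply by cov and renormalize."""
    d = len(cov)
    v = [1.0] + [0.0] * (d - 1)  # assumed starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance captured along v (the corresponding eigenvalue).
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, variance

scaled = standardize(students)
cov = covariance_matrix(scaled)
pc1, variance = first_principal_component(cov)

# Share of the total variance kept when 2 columns are replaced by 1.
explained = variance / sum(cov[i][i] for i in range(len(cov)))

# Each student is now described by one number instead of two.
scores = [sum(a * b for a, b in zip(row, pc1)) for row in scaled]

print(pc1)
print(explained)
print(scores)
```

Because IQ and CGPA are strongly correlated in this table, the single component keeps almost all of the variance, which is the same reasoning behind merging the correlated marks columns in the larger example.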
Uses of Dimensionality Reduction
- Reduces computational cost
- Removes irrelevant or redundant features
- Helps avoid overfitting
- Can improve model performance
- Enables data visualization (e.g., reducing to 2D or 3D)
******************** >< ********************
III. Anomaly Detection:
Anomaly detection is a machine learning technique used to identify rare, unusual, or abnormal data points that differ from normal data patterns.
In simple terms, anomaly detection is the process of finding data points that do not follow the expected pattern in a dataset.
Uses of Anomaly Detection (it helps detect):
· Fraud
· System failures
· Security attacks
· Medical abnormalities
· Data errors
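One simple way to find points that do not follow the expected pattern is a robust outlier score. The sketch below uses the modified z-score based on the median and the median absolute deviation (MAD), with the commonly used cutoff of 3.5. The technique and the transaction amounts are assumptions chosen for illustration; the text does not prescribe a method, and the data is made up.

```python
from statistics import median

def find_anomalies(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # At least half the values equal the median; the score is undefined,
        # so this sketch simply reports nothing.
        return []
    return [v for v in values
            if 0.6745 * abs(v - center) / mad > threshold]

# Hypothetical transaction amounts: six ordinary purchases and one unusual one.
amounts = [20, 22, 19, 21, 23, 20, 500]

print(find_anomalies(amounts))  # [500]
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value inflates the mean and standard deviation enough to hide itself in small datasets.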
******************** >< ********************
IV. Association Rule Learning:
Association Rule Learning is a machine learning technique used to discover hidden relationships, patterns, or associations between items that frequently occur together in large datasets.
Example: If customers buy Bread, then they are also likely to buy Milk.
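How strong a rule like Bread → Milk is can be measured with three standard quantities: support (how often the items appear together), confidence (how often Milk appears when Bread does), and lift (confidence relative to how common Milk is overall). The sketch below computes them on a small set of hypothetical shopping baskets; the transactions are made up for illustration.

```python
# Hypothetical shopping baskets (illustrative data only).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Milk", "Eggs"},
    {"Bread", "Milk", "Butter"},
    {"Bread", "Butter"},
    {"Eggs", "Butter"},
    {"Eggs"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Of the baskets with the antecedent, the fraction that also hold the consequent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence divided by the consequent's overall support; above 1 suggests a positive association."""
    return confidence(antecedent, consequent) / support(consequent)

rule_support = support({"Bread", "Milk"})
rule_confidence = confidence({"Bread"}, {"Milk"})
rule_lift = lift({"Bread"}, {"Milk"})

print(rule_support, rule_confidence, rule_lift)
```

In these baskets, Bread and Milk appear together in 3 of 6 transactions, and 3 of the 4 baskets containing Bread also contain Milk, so the rule has a support of about 0.5, a confidence of about 0.75, and a lift of about 1.5.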
******************** >< ********************




