Sample Solution

MMTE-007 Solved Assignment 2024 CS

SOFT COMPUTING AND ITS APPLICATIONS

(Valid from 1st January, 2024 to 31st December, 2024)

  1. a) Two sensors are compared based upon their detection levels and gain settings. The following table of gain settings and sensor detection levels, with a standard item being monitored, provides typical membership values to represent the detection levels for each of the sensors.
| Gain Setting | Sensor 1 detection levels | Sensor 2 detection levels |
| :---: | :---: | :---: |
| 0 | 0 | 0 |
| 20 | 0.5 | 0.35 |
| 40 | 0.65 | 0.5 |
| 60 | 0.85 | 0.75 |
| 80 | 1 | 0.90 |
| 100 | 1 | 1 |
The universe of discourse is $x=\{0,20,40,60,80,100\}$. Find the membership function for the two sensors. Also, verify De Morgan's laws for these membership functions.
b) Consider a subset of natural numbers from 1 to 30, as the universe of discourse, $U$. Define the fuzzy sets "small" and "medium" by enumeration.
2. a) Construct the $\alpha$-cut at $\alpha=0.4$ for the fuzzy sets defined in Q. 1(b).
b) Apply the "very" hedge on the fuzzy sets defined in Q. 1(b) to get the new modified fuzzy sets. Show the modified fuzzy sets through enumeration.
3. Let A and B be two fuzzy sets and $x \in U$. If $\mu_A(x)=0.4$ and $\mu_B(x)=0.8$, then find out the following membership values:
i) $\mu_{\overline{A}}(x)$,
ii) $\mu_{A \cap B}(x)$,
iii) $\mu_{\overline{A} \cup \overline{B}}(x)$,
iv) $\mu_{\overline{A} \cap \overline{B}}(x)$,
v) $\mu_{\overline{A}\,\overline{\overline{B}}}(x)$,
vi) $\mu_{\overline{A}\,\overline{\overline{B}}}(x)$.
  4. Consider a dataset of six points given in the following table, each of which has two features $f_1$ and $f_2$. Assuming the values of the parameters $c$ and $m$ as 2 and the initial cluster centers $v_1=(5,5)$ and $v_2=(10,10)$, apply the FCM algorithm to find the new cluster centers after one iteration.
| | $f_1$ | $f_2$ |
| :---: | :---: | :---: |
| $x_1$ | 3 | 11 |
| $x_2$ | 3 | 10 |
| $x_3$ | 8 | 12 |
| $x_4$ | 10 | 6 |
| $x_5$ | 13 | 6 |
| $x_6$ | 13 | 5 |
  5. a) Define Error Correction Learning with examples.
    b) Write the types of Neural Memory Models. Also, give one example of each.
  6. Consider the set of pattern vectors P. Obtain the connectivity matrix (CM) for the patterns in $P$ (four patterns).
$$\mathrm{p}=\left[\begin{array}{llllllllll} 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \end{array}\right]$$
  7. a) Define Kohonen networks with examples.
    b) Describe the Function Approximation in MLP. Also, explain Generalization of MLP.
  8. a) Find the length and order of the following schemata:
    i) $S_1 = (1\; *\; *\; 0\; 0\; *\; 1\; *\; *)$
    ii) $S_2 = (*\; 0\; 0\; *\; 1\; *\; *)$
    iii) $S_3 = (*\; *\; *\; 1\; *\; *\; *)$
    b) Let an activation function be defined as
$$\phi(v)=\frac{1}{1+e^{-av}}, \quad a>0$$
Show that $\frac{d\phi}{dv}=a\,\phi(v)[1-\phi(v)]$. What is the value of $\phi(v)$ at the origin? Also, find the value of $\phi(v)$ as $v$ approaches $+\infty$ and $-\infty$.
9. a) Consider the following travelling salesman problem involving 9 cities.
| Parent 1 | G | J | H | F | E | D | B | I | C |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Parent 2 | D | C | H | J | I | G | E | F | B |
Determine the children solutions using:
i) Order crossover #1, assuming the 4th and 7th sites as the crossover sites.
ii) Order crossover #2, assuming the 3rd, 5th and 7th sites as the key positions.
b) Consider the single-layer perceptron shown in the following figure,
[Figure: single-layer perceptron]
and the activation function of each unit is defined as
$$\Phi(v)=\begin{cases} 1, & \text{if } v \geq 0 \\ 0, & \text{otherwise.} \end{cases}$$
Calculate the output $y$ of the unit for each of the following input patterns:

| Patterns | $P_1$ | $P_2$ | $P_3$ | $P_4$ |
| :---: | :---: | :---: | :---: | :---: |
| $x_1$ | 1 | 0 | 1 | 1 |
| $x_2$ | 0 | 1 | 0 | 1 |
| $x_3$ | 0 | 1 | 1 | 1 |
Also, find the modified weights after one iteration.
  10. Which of the following statements are true or false? Give a short proof or a counterexample in support of your answers.
    a) There is a chance of premature convergence occurring in the Roulette-wheel selection scheme used in GA.
    b) Gradient based optimization methods are used when the objective function is not smooth and one needs efficient local optimization.
    c) The $\alpha$-cut of a fuzzy set A in $U$ is defined as $A_{\alpha_0}=\{x \in U \mid \mu_A(x) \leq \alpha_0\}$.
    d) A single perceptron with preprocessing is neither an auto-associative network nor a multiple-layer neural network.
    e) If $W(k_0)=W(k_0+1)=W(k_0+2)$, then the perceptron is non-linearly separable.

Expert Answer

Question:-1(a)

Two sensors are compared based upon their detection levels and gain settings. The following table of gain settings and sensor detection levels, with a standard item being monitored, provides typical membership values to represent the detection levels for each of the sensors.

| Gain Setting | Sensor 1 detection levels | Sensor 2 detection levels |
| :---: | :---: | :---: |
| 0 | 0 | 0 |
| 20 | 0.5 | 0.35 |
| 40 | 0.65 | 0.5 |
| 60 | 0.85 | 0.75 |
| 80 | 1 | 0.90 |
| 100 | 1 | 1 |
The universe of discourse is $x=\{0,20,40,60,80,100\}$. Find the membership function for the two sensors. Also, verify De Morgan's laws for these membership functions.

Answer:

Step 1: Define the Membership Functions

We are given the detection levels for two sensors at different gain settings. The gain settings are represented as the universe of discourse $x = \{0, 20, 40, 60, 80, 100\}$, and for each gain setting, we have membership values representing the detection levels for each sensor.
Let’s define the membership functions for Sensor 1 and Sensor 2.

Membership Function for Sensor 1, $\mu_{\text{S1}}(x)$

The detection levels for Sensor 1 at different gain settings are:
$$\mu_{\text{S1}}(0)=0,\quad \mu_{\text{S1}}(20)=0.5,\quad \mu_{\text{S1}}(40)=0.65,\quad \mu_{\text{S1}}(60)=0.85,\quad \mu_{\text{S1}}(80)=1,\quad \mu_{\text{S1}}(100)=1.$$
Thus, the membership function for Sensor 1 is:
$$\mu_{\text{S1}}(x) = \{(0, 0), (20, 0.5), (40, 0.65), (60, 0.85), (80, 1), (100, 1)\}$$

Membership Function for Sensor 2, $\mu_{\text{S2}}(x)$

The detection levels for Sensor 2 at different gain settings are:
$$\mu_{\text{S2}}(0)=0,\quad \mu_{\text{S2}}(20)=0.35,\quad \mu_{\text{S2}}(40)=0.5,\quad \mu_{\text{S2}}(60)=0.75,\quad \mu_{\text{S2}}(80)=0.90,\quad \mu_{\text{S2}}(100)=1.$$
Thus, the membership function for Sensor 2 is:
$$\mu_{\text{S2}}(x) = \{(0, 0), (20, 0.35), (40, 0.5), (60, 0.75), (80, 0.90), (100, 1)\}$$

Step 2: Verify De Morgan’s Laws

De Morgan's laws for fuzzy sets are similar to De Morgan's laws in classical set theory. For two fuzzy sets $A$ and $B$ with membership functions $\mu_A(x)$ and $\mu_B(x)$, De Morgan's laws are stated as:
  1. First Law:
    $\mu_{\neg(A \cup B)}(x) = \mu_{\neg A}(x) \cap \mu_{\neg B}(x)$
    This means:
    $\mu_{\neg(A \cup B)}(x) = \min(1 - \mu_A(x),\, 1 - \mu_B(x))$
  2. Second Law:
    $\mu_{\neg(A \cap B)}(x) = \mu_{\neg A}(x) \cup \mu_{\neg B}(x)$
    This means:
    $\mu_{\neg(A \cap B)}(x) = \max(1 - \mu_A(x),\, 1 - \mu_B(x))$
Let’s verify these laws using the membership functions for Sensor 1 and Sensor 2.

First Law: $\mu_{\neg(S1 \cup S2)}(x) = \mu_{\neg S1}(x) \cap \mu_{\neg S2}(x)$

  1. Compute $\mu_{S1 \cup S2}(x)$:
    The union of two fuzzy sets is the maximum of the membership values:
    $\mu_{S1 \cup S2}(x) = \max(\mu_{\text{S1}}(x), \mu_{\text{S2}}(x))$
    For each value of $x$:
    $$\begin{aligned} \mu_{S1 \cup S2}(0) &= \max(0, 0) = 0, \\ \mu_{S1 \cup S2}(20) &= \max(0.5, 0.35) = 0.5, \\ \mu_{S1 \cup S2}(40) &= \max(0.65, 0.5) = 0.65, \\ \mu_{S1 \cup S2}(60) &= \max(0.85, 0.75) = 0.85, \\ \mu_{S1 \cup S2}(80) &= \max(1, 0.90) = 1, \\ \mu_{S1 \cup S2}(100) &= \max(1, 1) = 1. \end{aligned}$$
  2. Compute $\mu_{\neg(S1 \cup S2)}(x)$:
    The complement of a fuzzy set is $1 - \mu(x)$:
    $\mu_{\neg(S1 \cup S2)}(x) = 1 - \mu_{S1 \cup S2}(x)$
    For each value of $x$:
    $$\begin{aligned} \mu_{\neg(S1 \cup S2)}(0) &= 1 - 0 = 1, \\ \mu_{\neg(S1 \cup S2)}(20) &= 1 - 0.5 = 0.5, \\ \mu_{\neg(S1 \cup S2)}(40) &= 1 - 0.65 = 0.35, \\ \mu_{\neg(S1 \cup S2)}(60) &= 1 - 0.85 = 0.15, \\ \mu_{\neg(S1 \cup S2)}(80) &= 1 - 1 = 0, \\ \mu_{\neg(S1 \cup S2)}(100) &= 1 - 1 = 0. \end{aligned}$$
  3. Compute $\mu_{\neg S1}(x) \cap \mu_{\neg S2}(x)$:
    The complements of Sensor 1 and Sensor 2 are:
    $\mu_{\neg S1}(x) = 1 - \mu_{\text{S1}}(x), \quad \mu_{\neg S2}(x) = 1 - \mu_{\text{S2}}(x)$
    For each value of $x$:
    $$\begin{aligned} \mu_{\neg S1}(0) &= 1 - 0 = 1, & \mu_{\neg S2}(0) &= 1 - 0 = 1, \\ \mu_{\neg S1}(20) &= 1 - 0.5 = 0.5, & \mu_{\neg S2}(20) &= 1 - 0.35 = 0.65, \\ \mu_{\neg S1}(40) &= 1 - 0.65 = 0.35, & \mu_{\neg S2}(40) &= 1 - 0.5 = 0.5, \\ \mu_{\neg S1}(60) &= 1 - 0.85 = 0.15, & \mu_{\neg S2}(60) &= 1 - 0.75 = 0.25, \\ \mu_{\neg S1}(80) &= 1 - 1 = 0, & \mu_{\neg S2}(80) &= 1 - 0.90 = 0.1, \\ \mu_{\neg S1}(100) &= 1 - 1 = 0, & \mu_{\neg S2}(100) &= 1 - 1 = 0. \end{aligned}$$
Now, compute the intersection (minimum):
$$\begin{aligned} \mu_{\neg S1}(0) \cap \mu_{\neg S2}(0) &= \min(1, 1) = 1, \\ \mu_{\neg S1}(20) \cap \mu_{\neg S2}(20) &= \min(0.5, 0.65) = 0.5, \\ \mu_{\neg S1}(40) \cap \mu_{\neg S2}(40) &= \min(0.35, 0.5) = 0.35, \\ \mu_{\neg S1}(60) \cap \mu_{\neg S2}(60) &= \min(0.15, 0.25) = 0.15, \\ \mu_{\neg S1}(80) \cap \mu_{\neg S2}(80) &= \min(0, 0.1) = 0, \\ \mu_{\neg S1}(100) \cap \mu_{\neg S2}(100) &= \min(0, 0) = 0. \end{aligned}$$
  4. Compare Results for the First Law:
    We can see that the results match for all values of $x$, so the first De Morgan law holds:
    $\mu_{\neg(S1 \cup S2)}(x) = \mu_{\neg S1}(x) \cap \mu_{\neg S2}(x)$

Second Law: $\mu_{\neg(S1 \cap S2)}(x) = \mu_{\neg S1}(x) \cup \mu_{\neg S2}(x)$

  1. Compute $\mu_{S1 \cap S2}(x)$:
    The intersection of two fuzzy sets is the minimum of the membership values:
    $\mu_{S1 \cap S2}(x) = \min(\mu_{\text{S1}}(x), \mu_{\text{S2}}(x))$
    For each value of $x$:
    $$\begin{aligned} \mu_{S1 \cap S2}(0) &= \min(0, 0) = 0, \\ \mu_{S1 \cap S2}(20) &= \min(0.5, 0.35) = 0.35, \\ \mu_{S1 \cap S2}(40) &= \min(0.65, 0.5) = 0.5, \\ \mu_{S1 \cap S2}(60) &= \min(0.85, 0.75) = 0.75, \\ \mu_{S1 \cap S2}(80) &= \min(1, 0.90) = 0.90, \\ \mu_{S1 \cap S2}(100) &= \min(1, 1) = 1. \end{aligned}$$
  2. Compute $\mu_{\neg(S1 \cap S2)}(x)$:
    The complement of $S1 \cap S2$ is:
    $\mu_{\neg(S1 \cap S2)}(x) = 1 - \mu_{S1 \cap S2}(x)$
    For each value of $x$:
    $$\begin{aligned} \mu_{\neg(S1 \cap S2)}(0) &= 1 - 0 = 1, \\ \mu_{\neg(S1 \cap S2)}(20) &= 1 - 0.35 = 0.65, \\ \mu_{\neg(S1 \cap S2)}(40) &= 1 - 0.5 = 0.5, \\ \mu_{\neg(S1 \cap S2)}(60) &= 1 - 0.75 = 0.25, \\ \mu_{\neg(S1 \cap S2)}(80) &= 1 - 0.90 = 0.10, \\ \mu_{\neg(S1 \cap S2)}(100) &= 1 - 1 = 0. \end{aligned}$$
  3. Compute $\mu_{\neg S1}(x) \cup \mu_{\neg S2}(x)$:
    The union (maximum) is:
    $\mu_{\neg S1}(x) \cup \mu_{\neg S2}(x) = \max(1 - \mu_{\text{S1}}(x),\, 1 - \mu_{\text{S2}}(x))$
    For each value of $x$:
    $$\begin{aligned} \mu_{\neg S1}(0) \cup \mu_{\neg S2}(0) &= \max(1, 1) = 1, \\ \mu_{\neg S1}(20) \cup \mu_{\neg S2}(20) &= \max(0.5, 0.65) = 0.65, \\ \mu_{\neg S1}(40) \cup \mu_{\neg S2}(40) &= \max(0.35, 0.5) = 0.5, \\ \mu_{\neg S1}(60) \cup \mu_{\neg S2}(60) &= \max(0.15, 0.25) = 0.25, \\ \mu_{\neg S1}(80) \cup \mu_{\neg S2}(80) &= \max(0, 0.1) = 0.1, \\ \mu_{\neg S1}(100) \cup \mu_{\neg S2}(100) &= \max(0, 0) = 0. \end{aligned}$$
  4. Compare Results for the Second Law:
    We can see that the results match for all values of $x$, so the second De Morgan law holds:
    $\mu_{\neg(S1 \cap S2)}(x) = \mu_{\neg S1}(x) \cup \mu_{\neg S2}(x)$

Final Answer:

The membership functions for the two sensors are:
  • For Sensor 1:
    $\mu_{\text{S1}}(x) = \{(0, 0), (20, 0.5), (40, 0.65), (60, 0.85), (80, 1), (100, 1)\}$
  • For Sensor 2:
    $\mu_{\text{S2}}(x) = \{(0, 0), (20, 0.35), (40, 0.5), (60, 0.75), (80, 0.90), (100, 1)\}$
We have also verified that both De Morgan’s laws hold for these membership functions.
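
The same verification can be scripted. The short Python sketch below (not part of the original solution; the dictionary names are illustrative) re-enters the table values and checks both laws pointwise.

```python
# Numerical check of both De Morgan's laws for the two sensor fuzzy sets.
X = [0, 20, 40, 60, 80, 100]
mu_s1 = {0: 0.0, 20: 0.5, 40: 0.65, 60: 0.85, 80: 1.0, 100: 1.0}
mu_s2 = {0: 0.0, 20: 0.35, 40: 0.5, 60: 0.75, 80: 0.90, 100: 1.0}

for x in X:
    a, b = mu_s1[x], mu_s2[x]
    # First law: complement of the union equals intersection of the complements.
    assert abs((1 - max(a, b)) - min(1 - a, 1 - b)) < 1e-9
    # Second law: complement of the intersection equals union of the complements.
    assert abs((1 - min(a, b)) - max(1 - a, 1 - b)) < 1e-9

print("Both De Morgan's laws hold at every point of the universe.")
```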




Question:-1(b)

Consider a subset of natural numbers from 1 to 30, as the universe of discourse, $U$. Define the fuzzy sets "small" and "medium" by enumeration.

Answer:

We are given a universe of discourse $U$, which is the set of natural numbers from 1 to 30:
$$U = \{1, 2, 3, \dots, 30\}$$
We are tasked with defining two fuzzy sets: "small" and "medium". A fuzzy set is characterized by a membership function that assigns each element in the universe a membership value between 0 and 1, indicating the degree to which that element belongs to the set.

Define the Fuzzy Set "Small"

The fuzzy set "small" should intuitively have higher membership values for smaller numbers and lower values as the numbers increase. A possible definition could be:
  • For numbers near 1, the membership value is close to 1.
  • For numbers near 30, the membership value is close to 0.
One possible membership function for "small" could be defined as:
$$\mu_{\text{small}}(x) = \begin{cases} 1 & \text{if } 1 \leq x \leq 10 \\ \frac{20 - x}{10} & \text{if } 11 \leq x \leq 20 \\ 0 & \text{if } 21 \leq x \leq 30 \end{cases}$$
Thus, by enumeration, the fuzzy set "small" is:
$$\mu_{\text{small}}(x) = \{(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1), (11, 0.9), (12, 0.8), (13, 0.7), (14, 0.6), (15, 0.5), (16, 0.4), (17, 0.3), (18, 0.2), (19, 0.1), (20, 0), (21, 0), \dots, (30, 0)\}$$

Define the Fuzzy Set "Medium"

The fuzzy set "medium" should intuitively have higher membership values for numbers in the middle range (e.g., around 15) and lower membership values for smaller and larger numbers.
One possible membership function for "medium" could be defined as:
  • The membership value is 0 for numbers below 5 and above 25.
  • The membership value increases from 0 to 1 as numbers go from 5 to 15.
  • The membership value decreases from 1 to 0 as numbers go from 15 to 25.
Mathematically, we can define the membership function for "medium" as:
$$\mu_{\text{medium}}(x) = \begin{cases} 0 & \text{if } x \leq 5 \text{ or } x \geq 25 \\ \frac{x - 5}{10} & \text{if } 6 \leq x \leq 15 \\ \frac{25 - x}{10} & \text{if } 16 \leq x \leq 24 \end{cases}$$
Thus, by enumeration, the fuzzy set "medium" is:
$$\mu_{\text{medium}}(x) = \{(1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0.1), (7, 0.2), (8, 0.3), (9, 0.4), (10, 0.5), (11, 0.6), (12, 0.7), (13, 0.8), (14, 0.9), (15, 1), (16, 0.9), (17, 0.8), (18, 0.7), (19, 0.6), (20, 0.5), (21, 0.4), (22, 0.3), (23, 0.2), (24, 0.1), (25, 0), (26, 0), \dots, (30, 0)\}$$

Final Answer

  • Fuzzy set "small": Membership values stay at 1 up to $x=10$ and then decrease as $x$ increases.
$$\mu_{\text{small}}(x) = \{(1, 1), (2, 1), (3, 1), \dots, (10, 1), (11, 0.9), \dots, (19, 0.1), (20, 0), (21, 0), \dots, (30, 0)\}$$
  • Fuzzy set "medium": Membership values peak around 15 and decrease towards the edges.
$$\mu_{\text{medium}}(x) = \{(1, 0), (2, 0), (3, 0), \dots, (6, 0.1), \dots, (10, 0.5), \dots, (15, 1), \dots, (20, 0.5), \dots, (25, 0), \dots, (30, 0)\}$$
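
Since the breakpoints above are one possible choice, it can help to see the enumeration generated programmatically. The following Python sketch encodes the same piecewise definitions; any other choice of breakpoints would change the output accordingly.

```python
# Enumerate the "small" and "medium" fuzzy sets on U = {1, ..., 30}
# using the piecewise definitions chosen above.

def mu_small(x: int) -> float:
    if x <= 10:
        return 1.0
    if x <= 20:
        return (20 - x) / 10
    return 0.0

def mu_medium(x: int) -> float:
    if x <= 5 or x >= 25:
        return 0.0
    if x <= 15:
        return (x - 5) / 10
    return (25 - x) / 10

U = range(1, 31)
small = {x: round(mu_small(x), 2) for x in U}
medium = {x: round(mu_medium(x), 2) for x in U}
print(small[11], medium[15], medium[21])  # 0.9 1.0 0.4, matching the enumerations above
```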




Question:-2(a)

Construct the $\alpha$-cut at $\alpha=0.4$ for the fuzzy sets defined in Q. 1(b).

Answer:

To construct the $\alpha$-cut for the fuzzy sets "small" and "medium" at $\alpha = 0.4$, we need to identify all the elements in the universe of discourse $U = \{1, 2, \dots, 30\}$ where the membership values are greater than or equal to 0.4.

Recap of the Membership Functions

Fuzzy Set "Small" Membership Function:

$$\mu_{\text{small}}(x) = \{(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1), (11, 0.9), (12, 0.8), (13, 0.7), (14, 0.6), (15, 0.5), (16, 0.4), (17, 0.3), (18, 0.2), (19, 0.1), (20, 0), (21, 0), \dots, (30, 0)\}$$

Fuzzy Set "Medium" Membership Function:

$$\mu_{\text{medium}}(x) = \{(1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0.1), (7, 0.2), (8, 0.3), (9, 0.4), (10, 0.5), (11, 0.6), (12, 0.7), (13, 0.8), (14, 0.9), (15, 1), (16, 0.9), (17, 0.8), (18, 0.7), (19, 0.6), (20, 0.5), (21, 0.4), (22, 0.3), (23, 0.2), (24, 0.1), (25, 0), (26, 0), \dots, (30, 0)\}$$

Definition of $\alpha$-cut

The $\alpha$-cut of a fuzzy set is the crisp set of all elements in the universe of discourse whose membership values are greater than or equal to $\alpha$.
$$A_{\alpha} = \{ x \in U \mid \mu_A(x) \geq \alpha \}$$
For $\alpha = 0.4$, we need to find all the elements $x$ such that their membership values in the fuzzy sets "small" and "medium" are greater than or equal to 0.4.

1. $\alpha$-cut for the Fuzzy Set "Small" at $\alpha = 0.4$

Looking at the fuzzy set "small", we select all the values of $x$ where $\mu_{\text{small}}(x) \geq 0.4$:
$$\{(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1), (11, 0.9), (12, 0.8), (13, 0.7), (14, 0.6), (15, 0.5), (16, 0.4)\}$$
So, the $\alpha = 0.4$ cut for the fuzzy set "small" is:
$$A_{0.4}^{\text{small}} = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16\}$$

2. $\alpha$-cut for the Fuzzy Set "Medium" at $\alpha = 0.4$

Looking at the fuzzy set "medium", we select all the values of $x$ where $\mu_{\text{medium}}(x) \geq 0.4$:
$$\{(9, 0.4), (10, 0.5), (11, 0.6), (12, 0.7), (13, 0.8), (14, 0.9), (15, 1), (16, 0.9), (17, 0.8), (18, 0.7), (19, 0.6), (20, 0.5), (21, 0.4)\}$$
So, the $\alpha = 0.4$ cut for the fuzzy set "medium" is:
$$A_{0.4}^{\text{medium}} = \{9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21\}$$

Final Answer

  • The $\alpha = 0.4$-cut for the fuzzy set "small" is:
    $A_{0.4}^{\text{small}} = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16\}$
  • The $\alpha = 0.4$-cut for the fuzzy set "medium" is:
    $A_{0.4}^{\text{medium}} = \{9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21\}$
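
As a quick cross-check, the $\alpha$-cut is easy to compute programmatically. The sketch below assumes the fuzzy sets are stored as dictionaries built from the piecewise definitions of Q. 1(b); the function name `alpha_cut` is illustrative.

```python
# Alpha-cut of an enumerated fuzzy set as a crisp set.
# 'small' and 'medium' are rebuilt here from the piecewise definitions of Q.1(b).

def alpha_cut(fuzzy_set: dict, alpha: float) -> set:
    """Return the crisp set of elements whose membership is >= alpha."""
    return {x for x, mu in fuzzy_set.items() if mu >= alpha}

small = {x: 1.0 if x <= 10 else max((20 - x) / 10, 0.0) for x in range(1, 31)}
medium = {x: max(min((x - 5) / 10, (25 - x) / 10, 1.0), 0.0) for x in range(1, 31)}

print(sorted(alpha_cut(small, 0.4)))   # [1, 2, ..., 16]
print(sorted(alpha_cut(medium, 0.4)))  # [9, 10, ..., 21]
```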




Question:-2(b)

Apply the "very" hedge on the fuzzy sets defined in Q. 1(b) to get the new modified fuzzy sets. Show the modified fuzzy sets through enumeration.

Answer:

In fuzzy logic, a hedge is a linguistic modifier that affects the membership values of a fuzzy set. The hedge "very" typically squares the membership values of a fuzzy set, making the set more restrictive (i.e., it reduces the membership values of elements that are not fully in the set).
For a fuzzy set $A$ with membership function $\mu_A(x)$, the modified fuzzy set under the "very" hedge has a membership function defined as:
$$\mu_{\text{very } A}(x) = (\mu_A(x))^2$$
We will apply this hedge to the fuzzy sets "small" and "medium" from the previous question.

Recap of the Membership Functions

Fuzzy Set "Small" Membership Function:

$$\mu_{\text{small}}(x) = \{(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1), (11, 0.9), (12, 0.8), (13, 0.7), (14, 0.6), (15, 0.5), (16, 0.4), (17, 0.3), (18, 0.2), (19, 0.1), (20, 0), (21, 0), \dots, (30, 0)\}$$

Fuzzy Set "Medium" Membership Function:

$$\mu_{\text{medium}}(x) = \{(1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0.1), (7, 0.2), (8, 0.3), (9, 0.4), (10, 0.5), (11, 0.6), (12, 0.7), (13, 0.8), (14, 0.9), (15, 1), (16, 0.9), (17, 0.8), (18, 0.7), (19, 0.6), (20, 0.5), (21, 0.4), (22, 0.3), (23, 0.2), (24, 0.1), (25, 0), (26, 0), \dots, (30, 0)\}$$

Applying the "Very" Hedge

We will now apply the "very" hedge, squaring each membership value.

1. Modified Fuzzy Set "Small" (Applying the "Very" Hedge)

For the fuzzy set "small", we square each membership value:
$$\mu_{\text{very small}}(x) = \left( \mu_{\text{small}}(x) \right)^2$$
Thus, the new membership values for the fuzzy set "very small" are:
$$\mu_{\text{very small}}(x) = \{(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1), (11, 0.81), (12, 0.64), (13, 0.49), (14, 0.36), (15, 0.25), (16, 0.16), (17, 0.09), (18, 0.04), (19, 0.01), (20, 0), (21, 0), \dots, (30, 0)\}$$

2. Modified Fuzzy Set "Medium" (Applying the "Very" Hedge)

For the fuzzy set "medium", we square each membership value:
μ very medium ( x ) = ( μ medium ( x ) ) 2 μ very medium ( x ) = μ medium ( x ) 2 mu_(“very medium”)(x)=(mu_(“medium”)(x))^(2)\mu_{\text{very medium}}(x) = \left( \mu_{\text{medium}}(x) \right)^2μvery medium(x)=(μmedium(x))2
Thus, the new membership values for the fuzzy set "very medium" are:
μ very medium ( x ) = { ( 1 , 0 ) , ( 2 , 0 ) , ( 3 , 0 ) , ( 4 , 0 ) , ( 5 , 0 ) , ( 6 , 0.01 ) , ( 7 , 0.04 ) , ( 8 , 0.09 ) , ( 9 , 0.16 ) , ( 10 , 0.25 ) , ( 11 , 0.36 ) , ( 12 , 0.49 ) , ( 13 , 0.64 ) , ( 14 , 0.81 ) , ( 15 , 1 ) , ( 16 , 0.81 ) , ( 17 , 0.64 ) , ( 18 , 0.49 ) , ( 19 , 0.36 ) , ( 20 , 0.25 ) , ( 21 , 0.16 ) , ( 22 , 0.09 ) , ( 23 , 0.04 ) , ( 24 , 0.01 ) , ( 25 , 0 ) , ( 26 , 0 ) , , ( 30 , 0 ) } μ very medium ( x ) = { ( 1 , 0 ) , ( 2 , 0 ) , ( 3 , 0 ) , ( 4 , 0 ) , ( 5 , 0 ) , ( 6 , 0.01 ) , ( 7 , 0.04 ) , ( 8 , 0.09 ) , ( 9 , 0.16 ) , ( 10 , 0.25 ) , ( 11 , 0.36 ) , ( 12 , 0.49 ) , ( 13 , 0.64 ) , ( 14 , 0.81 ) , ( 15 , 1 ) , ( 16 , 0.81 ) , ( 17 , 0.64 ) , ( 18 , 0.49 ) , ( 19 , 0.36 ) , ( 20 , 0.25 ) , ( 21 , 0.16 ) , ( 22 , 0.09 ) , ( 23 , 0.04 ) , ( 24 , 0.01 ) , ( 25 , 0 ) , ( 26 , 0 ) , , ( 30 , 0 ) } mu_(“very medium”)(x)={(1,0),(2,0),(3,0),(4,0),(5,0),(6,0.01),(7,0.04),(8,0.09),(9,0.16),(10,0.25),(11,0.36),(12,0.49),(13,0.64),(14,0.81),(15,1),(16,0.81),(17,0.64),(18,0.49),(19,0.36),(20,0.25),(21,0.16),(22,0.09),(23,0.04),(24,0.01),(25,0),(26,0),dots,(30,0)}\mu_{\text{very medium}}(x) = \{(1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0.01), (7, 0.04), (8, 0.09), (9, 0.16), (10, 0.25), (11, 0.36), (12, 0.49), (13, 0.64), (14, 0.81), (15, 1), (16, 0.81), (17, 0.64), (18, 0.49), (19, 0.36), (20, 0.25), (21, 0.16), (22, 0.09), (23, 0.04), (24, 0.01), (25, 0), (26, 0), \dots, (30, 0)\}μvery medium(x)={(1,0),(2,0),(3,0),(4,0),(5,0),(6,0.01),(7,0.04),(8,0.09),(9,0.16),(10,0.25),(11,0.36),(12,0.49),(13,0.64),(14,0.81),(15,1),(16,0.81),(17,0.64),(18,0.49),(19,0.36),(20,0.25),(21,0.16),(22,0.09),(23,0.04),(24,0.01),(25,0),(26,0),,(30,0)}

Final Answer: Modified Fuzzy Sets with the "Very" Hedge

  • Modified fuzzy set "very small":
    $\mu_{\text{very small}}(x) = \{(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1), (11, 0.81), (12, 0.64), (13, 0.49), (14, 0.36), (15, 0.25), (16, 0.16), (17, 0.09), (18, 0.04), (19, 0.01), (20, 0), (21, 0), \dots, (30, 0)\}$
  • Modified fuzzy set "very medium":
    $\mu_{\text{very medium}}(x) = \{(1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0.01), (7, 0.04), (8, 0.09), (9, 0.16), (10, 0.25), (11, 0.36), (12, 0.49), (13, 0.64), (14, 0.81), (15, 1), (16, 0.81), (17, 0.64), (18, 0.49), (19, 0.36), (20, 0.25), (21, 0.16), (22, 0.09), (23, 0.04), (24, 0.01), (25, 0), (26, 0), \dots, (30, 0)\}$
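
The hedge is a one-line transformation in code. The sketch below assumes the "small" set is stored as a dictionary as in Q. 1(b) and simply squares each membership value; the names are illustrative.

```python
# "Very" hedge: square every membership value of an enumerated fuzzy set.

def very(fuzzy_set: dict) -> dict:
    return {x: round(mu ** 2, 2) for x, mu in fuzzy_set.items()}

small = {x: 1.0 if x <= 10 else max((20 - x) / 10, 0.0) for x in range(1, 31)}
very_small = very(small)
print(very_small[11], very_small[15], very_small[16])  # 0.81 0.25 0.16
```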




Question:-3

Let A and B be two fuzzy sets and $x \in U$. If $\mu_A(x)=0.4$ and $\mu_B(x)=0.8$, then find out the following membership values:

i) $\mu_{\overline{A}}(x)$,
ii) $\mu_{A \cap B}(x)$,
iii) $\mu_{\overline{A} \cup \overline{B}}(x)$,
iv) $\mu_{\overline{A} \cap \overline{B}}(x)$,
v) $\mu_{\overline{A}\,\overline{\overline{B}}}(x)$,
vi) $\mu_{\overline{A}\,\overline{\overline{B}}}(x)$.

Answer:

Given two fuzzy sets $A$ and $B$, with membership functions:
$$\mu_A(x) = 0.4, \quad \mu_B(x) = 0.8$$
We are tasked with calculating several membership values based on these values.

i) $\mu_{\overline{A}}(x)$

This asks for the membership value of the complement of set $A$. The complement of a fuzzy set $A$ is given by:
$$\mu_{\overline{A}}(x) = 1 - \mu_A(x)$$
Substitute $\mu_A(x) = 0.4$:
$$\mu_{\overline{A}}(x) = 1 - 0.4 = 0.6$$

ii) $\mu_{A \cap B}(x)$

The intersection ($\cap$) of two fuzzy sets $A$ and $B$ is defined as the minimum of the membership values:
$$\mu_{A \cap B}(x) = \min(\mu_A(x), \mu_B(x))$$
Substitute the given values:
$$\mu_{A \cap B}(x) = \min(0.4, 0.8) = 0.4$$

iii) $\mu_{\overline{A} \cup \overline{B}}(x)$

The union ($\cup$) of two fuzzy sets is the maximum of the membership values. First compute the complements of $A$ and $B$, then take their union:
$$\mu_{\overline{A}}(x) = 1 - \mu_A(x) = 0.6, \quad \mu_{\overline{B}}(x) = 1 - \mu_B(x) = 0.2$$
Now compute the union:
$$\mu_{\overline{A} \cup \overline{B}}(x) = \max(\mu_{\overline{A}}(x), \mu_{\overline{B}}(x)) = \max(0.6, 0.2) = 0.6$$

iv) $\mu_{\overline{A} \cap \overline{B}}(x)$

The intersection ($\cap$) of the complements of $A$ and $B$ is the minimum of the membership values:
$$\mu_{\overline{A} \cap \overline{B}}(x) = \min(\mu_{\overline{A}}(x), \mu_{\overline{B}}(x)) = \min(0.6, 0.2) = 0.2$$

v) $\mu_{\overline{A}\,\overline{\overline{B}}}(x)$

This expression involves the double complement of $B$, which simplifies as $\overline{\overline{B}} = B$. The expression therefore reduces to the intersection of $\overline{A}$ and $B$:
$$\mu_{\overline{A} \cap B}(x) = \min(\mu_{\overline{A}}(x), \mu_B(x)) = \min(0.6, 0.8) = 0.6$$

vi) $\mu_{\overline{A}\,\overline{\overline{B}}}(x)$

This is identical to part (v) because $\overline{\overline{B}} = B$, so the result is the same:
$$\mu_{\overline{A} \cap B}(x) = 0.6$$

Final Answer:

  1. $\mu_{\overline{A}}(x) = 0.6$
  2. $\mu_{A \cap B}(x) = 0.4$
  3. $\mu_{\overline{A} \cup \overline{B}}(x) = 0.6$
  4. $\mu_{\overline{A} \cap \overline{B}}(x) = 0.2$
  5. $\mu_{\overline{A} \cap B}(x) = 0.6$
  6. $\mu_{\overline{A} \cap B}(x) = 0.6$ (same as v).
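
These results use only the standard fuzzy complement, min (intersection), and max (union) operators, so they are easy to check numerically. A minimal Python sketch of that check (assuming the usual max-min operators) is given below.

```python
# Minimal check of the membership values above using the standard
# complement, min (intersection) and max (union) fuzzy operators.
mu_A, mu_B = 0.4, 0.8

comp_A = 1 - mu_A                      # complement of A                       -> 0.6
comp_B = 1 - mu_B                      # complement of B                       -> 0.2
a_and_b = min(mu_A, mu_B)              # A intersect B                         -> 0.4
compA_or_compB = max(comp_A, comp_B)   # complement(A) union complement(B)     -> 0.6
compA_and_compB = min(comp_A, comp_B)  # complement(A) intersect complement(B) -> 0.2
compA_and_B = min(comp_A, mu_B)        # complement(A) intersect B (v and vi)  -> 0.6

print(comp_A, a_and_b, compA_or_compB, compA_and_compB, compA_and_B)
```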




Question:-4

Consider a dataset of six points given in the following table, each of which has two features $f_1$ and $f_2$. Assuming the values of the parameters $c$ and $m$ as 2 and the initial cluster centers $\mathrm{v}_1 = (5, 5)$ and $\mathrm{v}_2 = (10, 10)$, apply the FCM algorithm to find the new cluster centers after one iteration.

| | $f_1$ | $f_2$ |
| :---: | :---: | :---: |
| $x_1$ | 3 | 11 |
| $x_2$ | 3 | 10 |
| $x_3$ | 8 | 12 |
| $x_4$ | 10 | 6 |
| $x_5$ | 13 | 6 |
| $x_6$ | 13 | 5 |

Answer:

The Fuzzy C-Means (FCM) algorithm is a clustering method where each data point can belong to multiple clusters with varying degrees of membership. The objective is to update cluster centers iteratively based on the membership values of the points to the clusters.

Step-by-Step Process to Apply FCM Algorithm:

Given Information:

  • Two initial cluster centers:
    v 1 = ( 5 , 5 ) , v 2 = ( 10 , 10 ) v 1 = ( 5 , 5 ) , v 2 = ( 10 , 10 ) v_(1)=(5,5),quadv_(2)=(10,10)v_1 = (5, 5), \quad v_2 = (10, 10)v1=(5,5),v2=(10,10)
  • Parameter c = 2 c = 2 c=2c = 2c=2 (number of clusters),
  • Fuzziness parameter m = 2 m = 2 m=2m = 2m=2,
  • Dataset of six points with two features:
    x 1 = ( 3 , 11 ) , x 2 = ( 3 , 10 ) , x 3 = ( 8 , 12 ) , x 4 = ( 10 , 6 ) , x 5 = ( 13 , 6 ) , x 6 = ( 13 , 5 ) x 1 = ( 3 , 11 ) , x 2 = ( 3 , 10 ) , x 3 = ( 8 , 12 ) , x 4 = ( 10 , 6 ) , x 5 = ( 13 , 6 ) , x 6 = ( 13 , 5 ) {:[x_(1)=(3″,”11)”,”quadx_(2)=(3″,”10)”,”quadx_(3)=(8″,”12)”,”],[x_(4)=(10″,”6)”,”quadx_(5)=(13″,”6)”,”quadx_(6)=(13″,”5)]:}\begin{aligned} &\mathbf{x}_1 = (3, 11), \quad \mathbf{x}_2 = (3, 10), \quad \mathbf{x}_3 = (8, 12), \\ &\mathbf{x}_4 = (10, 6), \quad \mathbf{x}_5 = (13, 6), \quad \mathbf{x}_6 = (13, 5) \end{aligned}x1=(3,11),x2=(3,10),x3=(8,12),x4=(10,6),x5=(13,6),x6=(13,5)

Step 1: Calculate the Distance between Each Data Point and the Cluster Centers

The Euclidean distance between a data point $\mathbf{x} = (a_1, a_2)$ and a cluster center $\mathbf{v} = (b_1, b_2)$ is given by:
$$d(\mathbf{x}, \mathbf{v}) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}$$

Distance from points to v 1 = ( 5 , 5 ) v 1 = ( 5 , 5 ) v_(1)=(5,5)v_1 = (5, 5)v1=(5,5):

d ( x 1 , v 1 ) = ( 3 5 ) 2 + ( 11 5 ) 2 = 4 + 36 = 40 = 6.32 , d ( x 2 , v 1 ) = ( 3 5 ) 2 + ( 10 5 ) 2 = 4 + 25 = 29 = 5.39 , d ( x 3 , v 1 ) = ( 8 5 ) 2 + ( 12 5 ) 2 = 9 + 49 = 58 = 7.62 , d ( x 4 , v 1 ) = ( 10 5 ) 2 + ( 6 5 ) 2 = 25 + 1 = 26 = 5.10 , d ( x 5 , v 1 ) = ( 13 5 ) 2 + ( 6 5 ) 2 = 64 + 1 = 65 = 8.06 , d ( x 6 , v 1 ) = ( 13 5 ) 2 + ( 5 5 ) 2 = 64 = 8.00 . d ( x 1 , v 1 ) = ( 3 5 ) 2 + ( 11 5 ) 2 = 4 + 36 = 40 = 6.32 , d ( x 2 , v 1 ) = ( 3 5 ) 2 + ( 10 5 ) 2 = 4 + 25 = 29 = 5.39 , d ( x 3 , v 1 ) = ( 8 5 ) 2 + ( 12 5 ) 2 = 9 + 49 = 58 = 7.62 , d ( x 4 , v 1 ) = ( 10 5 ) 2 + ( 6 5 ) 2 = 25 + 1 = 26 = 5.10 , d ( x 5 , v 1 ) = ( 13 5 ) 2 + ( 6 5 ) 2 = 64 + 1 = 65 = 8.06 , d ( x 6 , v 1 ) = ( 13 5 ) 2 + ( 5 5 ) 2 = 64 = 8.00 . {:[d(x_(1)”,”v_(1))=sqrt((3-5)^(2)+(11-5)^(2))=sqrt(4+36)=sqrt40=6.32″,”],[d(x_(2)”,”v_(1))=sqrt((3-5)^(2)+(10-5)^(2))=sqrt(4+25)=sqrt29=5.39″,”],[d(x_(3)”,”v_(1))=sqrt((8-5)^(2)+(12-5)^(2))=sqrt(9+49)=sqrt58=7.62″,”],[d(x_(4)”,”v_(1))=sqrt((10-5)^(2)+(6-5)^(2))=sqrt(25+1)=sqrt26=5.10″,”],[d(x_(5)”,”v_(1))=sqrt((13-5)^(2)+(6-5)^(2))=sqrt(64+1)=sqrt65=8.06″,”],[d(x_(6)”,”v_(1))=sqrt((13-5)^(2)+(5-5)^(2))=sqrt64=8.00.]:}\begin{aligned} d(\mathbf{x}_1, v_1) &= \sqrt{(3 – 5)^2 + (11 – 5)^2} = \sqrt{4 + 36} = \sqrt{40} = 6.32, \\ d(\mathbf{x}_2, v_1) &= \sqrt{(3 – 5)^2 + (10 – 5)^2} = \sqrt{4 + 25} = \sqrt{29} = 5.39, \\ d(\mathbf{x}_3, v_1) &= \sqrt{(8 – 5)^2 + (12 – 5)^2} = \sqrt{9 + 49} = \sqrt{58} = 7.62, \\ d(\mathbf{x}_4, v_1) &= \sqrt{(10 – 5)^2 + (6 – 5)^2} = \sqrt{25 + 1} = \sqrt{26} = 5.10, \\ d(\mathbf{x}_5, v_1) &= \sqrt{(13 – 5)^2 + (6 – 5)^2} = \sqrt{64 + 1} = \sqrt{65} = 8.06, \\ d(\mathbf{x}_6, v_1) &= \sqrt{(13 – 5)^2 + (5 – 5)^2} = \sqrt{64} = 8.00. \end{aligned}d(x1,v1)=(35)2+(115)2=4+36=40=6.32,d(x2,v1)=(35)2+(105)2=4+25=29=5.39,d(x3,v1)=(85)2+(125)2=9+49=58=7.62,d(x4,v1)=(105)2+(65)2=25+1=26=5.10,d(x5,v1)=(135)2+(65)2=64+1=65=8.06,d(x6,v1)=(135)2+(55)2=64=8.00.

Distance from points to v 2 = ( 10 , 10 ) v 2 = ( 10 , 10 ) v_(2)=(10,10)v_2 = (10, 10)v2=(10,10):

d ( x 1 , v 2 ) = ( 3 10 ) 2 + ( 11 10 ) 2 = 49 + 1 = 50 = 7.07 , d ( x 2 , v 2 ) = ( 3 10 ) 2 + ( 10 10 ) 2 = 49 = 7.00 , d ( x 3 , v 2 ) = ( 8 10 ) 2 + ( 12 10 ) 2 = 4 + 4 = 8 = 2.83 , d ( x 4 , v 2 ) = ( 10 10 ) 2 + ( 6 10 ) 2 = 0 + 16 = 4.00 , d ( x 5 , v 2 ) = ( 13 10 ) 2 + ( 6 10 ) 2 = 9 + 16 = 25 = 5.00 , d ( x 6 , v 2 ) = ( 13 10 ) 2 + ( 5 10 ) 2 = 9 + 25 = 34 = 5.83 . d ( x 1 , v 2 ) = ( 3 10 ) 2 + ( 11 10 ) 2 = 49 + 1 = 50 = 7.07 , d ( x 2 , v 2 ) = ( 3 10 ) 2 + ( 10 10 ) 2 = 49 = 7.00 , d ( x 3 , v 2 ) = ( 8 10 ) 2 + ( 12 10 ) 2 = 4 + 4 = 8 = 2.83 , d ( x 4 , v 2 ) = ( 10 10 ) 2 + ( 6 10 ) 2 = 0 + 16 = 4.00 , d ( x 5 , v 2 ) = ( 13 10 ) 2 + ( 6 10 ) 2 = 9 + 16 = 25 = 5.00 , d ( x 6 , v 2 ) = ( 13 10 ) 2 + ( 5 10 ) 2 = 9 + 25 = 34 = 5.83 . {:[d(x_(1)”,”v_(2))=sqrt((3-10)^(2)+(11-10)^(2))=sqrt(49+1)=sqrt50=7.07″,”],[d(x_(2)”,”v_(2))=sqrt((3-10)^(2)+(10-10)^(2))=sqrt49=7.00″,”],[d(x_(3)”,”v_(2))=sqrt((8-10)^(2)+(12-10)^(2))=sqrt(4+4)=sqrt8=2.83″,”],[d(x_(4)”,”v_(2))=sqrt((10-10)^(2)+(6-10)^(2))=sqrt(0+16)=4.00″,”],[d(x_(5)”,”v_(2))=sqrt((13-10)^(2)+(6-10)^(2))=sqrt(9+16)=sqrt25=5.00″,”],[d(x_(6)”,”v_(2))=sqrt((13-10)^(2)+(5-10)^(2))=sqrt(9+25)=sqrt34=5.83.]:}\begin{aligned} d(\mathbf{x}_1, v_2) &= \sqrt{(3 – 10)^2 + (11 – 10)^2} = \sqrt{49 + 1} = \sqrt{50} = 7.07, \\ d(\mathbf{x}_2, v_2) &= \sqrt{(3 – 10)^2 + (10 – 10)^2} = \sqrt{49} = 7.00, \\ d(\mathbf{x}_3, v_2) &= \sqrt{(8 – 10)^2 + (12 – 10)^2} = \sqrt{4 + 4} = \sqrt{8} = 2.83, \\ d(\mathbf{x}_4, v_2) &= \sqrt{(10 – 10)^2 + (6 – 10)^2} = \sqrt{0 + 16} = 4.00, \\ d(\mathbf{x}_5, v_2) &= \sqrt{(13 – 10)^2 + (6 – 10)^2} = \sqrt{9 + 16} = \sqrt{25} = 5.00, \\ d(\mathbf{x}_6, v_2) &= \sqrt{(13 – 10)^2 + (5 – 10)^2} = \sqrt{9 + 25} = \sqrt{34} = 5.83. \end{aligned}d(x1,v2)=(310)2+(1110)2=49+1=50=7.07,d(x2,v2)=(310)2+(1010)2=49=7.00,d(x3,v2)=(810)2+(1210)2=4+4=8=2.83,d(x4,v2)=(1010)2+(610)2=0+16=4.00,d(x5,v2)=(1310)2+(610)2=9+16=25=5.00,d(x6,v2)=(1310)2+(510)2=9+25=34=5.83.

Step 2: Calculate Membership Values

For each point x i x i x_(i)\mathbf{x}_ixi and each cluster v 1 v 1 v_(1)v_1v1 and v 2 v 2 v_(2)v_2v2, we calculate the membership values using the formula:
$$\mu_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{d(\mathbf{x}_i, \mathbf{v}_j)}{d(\mathbf{x}_i, \mathbf{v}_k)} \right)^{\frac{2}{m-1}}}$$
where:
  • μ i j μ i j mu_(ij)\mu_{ij}μij is the membership of point x i x i x_(i)\mathbf{x}_ixi in cluster j j jjj,
  • m = 2 m = 2 m=2m = 2m=2 (the fuzziness parameter),
  • d ( x i , v j ) d ( x i , v j ) d(x_(i),v_(j))d(\mathbf{x}_i, v_j)d(xi,vj) is the distance between point x i x i x_(i)\mathbf{x}_ixi and cluster center v j v j v_(j)v_jvj,
  • c = 2 c = 2 c=2c = 2c=2 (the number of clusters).
We can compute this for each point for both clusters:

Membership Values for x 1 = ( 3 , 11 ) x 1 = ( 3 , 11 ) x_(1)=(3,11)\mathbf{x}_1 = (3, 11)x1=(3,11):

μ 1 , 1 = 1 ( 6.32 6.32 ) 2 + ( 6.32 7.07 ) 2 = 1 1 + 0.798 = 0.556 μ 1 , 1 = 1 6.32 6.32 2 + 6.32 7.07 2 = 1 1 + 0.798 = 0.556 mu_(1,1)=(1)/(((6.32)/(6.32))^(2)+((6.32)/(7.07))^(2))=(1)/(1+0.798)=0.556\mu_{1,1} = \frac{1}{\left( \frac{6.32}{6.32} \right)^2 + \left( \frac{6.32}{7.07} \right)^2 } = \frac{1}{1 + 0.798} = 0.556μ1,1=1(6.326.32)2+(6.327.07)2=11+0.798=0.556
μ 1 , 2 = 1 μ 1 , 1 = 0.444 μ 1 , 2 = 1 μ 1 , 1 = 0.444 mu_(1,2)=1-mu_(1,1)=0.444\mu_{1,2} = 1 – \mu_{1,1} = 0.444μ1,2=1μ1,1=0.444

Membership Values for x 2 = ( 3 , 10 ) x 2 = ( 3 , 10 ) x_(2)=(3,10)\mathbf{x}_2 = (3, 10)x2=(3,10):

μ 2 , 1 = 1 ( 5.39 5.39 ) 2 + ( 5.39 7.00 ) 2 = 1 1 + 0.594 = 0.627 μ 2 , 1 = 1 5.39 5.39 2 + 5.39 7.00 2 = 1 1 + 0.594 = 0.627 mu_(2,1)=(1)/(((5.39)/(5.39))^(2)+((5.39)/(7.00))^(2))=(1)/(1+0.594)=0.627\mu_{2,1} = \frac{1}{\left( \frac{5.39}{5.39} \right)^2 + \left( \frac{5.39}{7.00} \right)^2 } = \frac{1}{1 + 0.594} = 0.627μ2,1=1(5.395.39)2+(5.397.00)2=11+0.594=0.627
μ 2 , 2 = 1 μ 2 , 1 = 0.373 μ 2 , 2 = 1 μ 2 , 1 = 0.373 mu_(2,2)=1-mu_(2,1)=0.373\mu_{2,2} = 1 – \mu_{2,1} = 0.373μ2,2=1μ2,1=0.373

Membership Values for x 3 = ( 8 , 12 ) x 3 = ( 8 , 12 ) x_(3)=(8,12)\mathbf{x}_3 = (8, 12)x3=(8,12):

μ 3 , 1 = 1 ( 7.62 7.62 ) 2 + ( 7.62 2.83 ) 2 = 1 1 + 7.24 = 0.121 μ 3 , 1 = 1 7.62 7.62 2 + 7.62 2.83 2 = 1 1 + 7.24 = 0.121 mu_(3,1)=(1)/(((7.62)/(7.62))^(2)+((7.62)/(2.83))^(2))=(1)/(1+7.24)=0.121\mu_{3,1} = \frac{1}{\left( \frac{7.62}{7.62} \right)^2 + \left( \frac{7.62}{2.83} \right)^2 } = \frac{1}{1 + 7.24} = 0.121μ3,1=1(7.627.62)2+(7.622.83)2=11+7.24=0.121
μ 3 , 2 = 1 μ 3 , 1 = 0.879 μ 3 , 2 = 1 μ 3 , 1 = 0.879 mu_(3,2)=1-mu_(3,1)=0.879\mu_{3,2} = 1 – \mu_{3,1} = 0.879μ3,2=1μ3,1=0.879

Membership Values for x 4 = ( 10 , 6 ) x 4 = ( 10 , 6 ) x_(4)=(10,6)\mathbf{x}_4 = (10, 6)x4=(10,6):

μ 4 , 1 = 1 ( 5.10 5.10 ) 2 + ( 5.10 4.00 ) 2 = 1 1 + 1.625 = 0.381 μ 4 , 1 = 1 5.10 5.10 2 + 5.10 4.00 2 = 1 1 + 1.625 = 0.381 mu_(4,1)=(1)/(((5.10)/(5.10))^(2)+((5.10)/(4.00))^(2))=(1)/(1+1.625)=0.381\mu_{4,1} = \frac{1}{\left( \frac{5.10}{5.10} \right)^2 + \left( \frac{5.10}{4.00} \right)^2 } = \frac{1}{1 + 1.625} = 0.381μ4,1=1(5.105.10)2+(5.104.00)2=11+1.625=0.381
μ 4 , 2 = 1 μ 4 , 1 = 0.619 μ 4 , 2 = 1 μ 4 , 1 = 0.619 mu_(4,2)=1-mu_(4,1)=0.619\mu_{4,2} = 1 – \mu_{4,1} = 0.619μ4,2=1μ4,1=0.619

Membership Values for x 5 = ( 13 , 6 ) x 5 = ( 13 , 6 ) x_(5)=(13,6)\mathbf{x}_5 = (13, 6)x5=(13,6):

μ 5 , 1 = 1 ( 8.06 8.06 ) 2 + ( 8.06 5.00 ) 2 = 1 1 + 2.60 = 0.278 μ 5 , 1 = 1 8.06 8.06 2 + 8.06 5.00 2 = 1 1 + 2.60 = 0.278 mu_(5,1)=(1)/(((8.06)/(8.06))^(2)+((8.06)/(5.00))^(2))=(1)/(1+2.60)=0.278\mu_{5,1} = \frac{1}{\left( \frac{8.06}{8.06} \right)^2 + \left( \frac{8.06}{5.00} \right)^2 } = \frac{1}{1 + 2.60} = 0.278μ5,1=1(8.068.06)2+(8.065.00)2=11+2.60=0.278
μ 5 , 2 = 1 μ 5 , 1 = 0.722 μ 5 , 2 = 1 μ 5 , 1 = 0.722 mu_(5,2)=1-mu_(5,1)=0.722\mu_{5,2} = 1 – \mu_{5,1} = 0.722μ5,2=1μ5,1=0.722

Membership Values for x 6 = ( 13 , 5 ) x 6 = ( 13 , 5 ) x_(6)=(13,5)\mathbf{x}_6 = (13, 5)x6=(13,5):

μ 6 , 1 = 1 ( 8.00 8.00 ) 2 + ( 8.00 5.83 ) 2 = 1 1 + 1.88 = 0.347 μ 6 , 1 = 1 8.00 8.00 2 + 8.00 5.83 2 = 1 1 + 1.88 = 0.347 mu_(6,1)=(1)/(((8.00)/(8.00))^(2)+((8.00)/(5.83))^(2))=(1)/(1+1.88)=0.347\mu_{6,1} = \frac{1}{\left( \frac{8.00}{8.00} \right)^2 + \left( \frac{8.00}{5.83} \right)^2 } = \frac{1}{1 + 1.88} = 0.347μ6,1=1(8.008.00)2+(8.005.83)2=11+1.88=0.347
μ 6 , 2 = 1 μ 6 , 1 = 0.653 μ 6 , 2 = 1 μ 6 , 1 = 0.653 mu_(6,2)=1-mu_(6,1)=0.653\mu_{6,2} = 1 – \mu_{6,1} = 0.653μ6,2=1μ6,1=0.653

Step 3: Update the Cluster Centers

The new cluster centers are computed using the following formula for each cluster center v j v j v_(j)v_jvj:
$$\mathbf{v}_j = \frac{\sum_{i=1}^{n} \mu_{ij}^m\, \mathbf{x}_i}{\sum_{i=1}^{n} \mu_{ij}^m}$$
For cluster center v 1 v 1 v_(1)v_1v1:
  • Numerator:
    i = 1 n μ i , 1 m x i = 0.556 2 x 1 + 0.627 2 x 2 + 0.121 2 x 3 + 0.381 2 x 4 + 0.278 2 x 5 + 0.347 2 x 6 i = 1 n μ i , 1 m x i = 0.556 2 x 1 + 0.627 2 x 2 + 0.121 2 x 3 + 0.381 2 x 4 + 0.278 2 x 5 + 0.347 2 x 6 sum_(i=1)^(n)mu_(i,1)^(m)x_(i)=0.556^(2)x_(1)+0.627^(2)x_(2)+0.121^(2)x_(3)+0.381^(2)x_(4)+0.278^(2)x_(5)+0.347^(2)x_(6)\sum_{i=1}^{n} \mu_{i,1}^m \mathbf{x}_i = 0.556^2 \mathbf{x}_1 + 0.627^2 \mathbf{x}_2 + 0.121^2 \mathbf{x}_3 + 0.381^2 \mathbf{x}_4 + 0.278^2 \mathbf{x}_5 + 0.347^2 \mathbf{x}_6i=1nμi,1mxi=0.5562x1+0.6272x2+0.1212x3+0.3812x4+0.2782x5+0.3472x6
    = 0.309 ( 3 , 11 ) + 0.393 ( 3 , 10 ) + 0.015 ( 8 , 12 ) + 0.145 ( 10 , 6 ) + 0.077 ( 13 , 6 ) + 0.120 ( 13 , 5 ) = 0.309 ( 3 , 11 ) + 0.393 ( 3 , 10 ) + 0.015 ( 8 , 12 ) + 0.145 ( 10 , 6 ) + 0.077 ( 13 , 6 ) + 0.120 ( 13 , 5 ) =0.309*(3,11)+0.393*(3,10)+0.015*(8,12)+0.145*(10,6)+0.077*(13,6)+0.120*(13,5)= 0.309 \cdot (3, 11) + 0.393 \cdot (3, 10) + 0.015 \cdot (8, 12) + 0.145 \cdot (10, 6) + 0.077 \cdot (13, 6) + 0.120 \cdot (13, 5)=0.309(3,11)+0.393(3,10)+0.015(8,12)+0.145(10,6)+0.077(13,6)+0.120(13,5)
    = ( 0.927 , 3.399 ) + ( 1.179 , 3.930 ) + ( 0.118 , 0.180 ) + ( 1.450 , 0.870 ) + ( 1.001 , 0.462 ) + ( 1.560 , 0.600 ) = ( 0.927 , 3.399 ) + ( 1.179 , 3.930 ) + ( 0.118 , 0.180 ) + ( 1.450 , 0.870 ) + ( 1.001 , 0.462 ) + ( 1.560 , 0.600 ) =(0.927,3.399)+(1.179,3.930)+(0.118,0.180)+(1.450,0.870)+(1.001,0.462)+(1.560,0.600)= (0.927, 3.399) + (1.179, 3.930) + (0.118, 0.180) + (1.450, 0.870) + (1.001, 0.462) + (1.560, 0.600)=(0.927,3.399)+(1.179,3.930)+(0.118,0.180)+(1.450,0.870)+(1.001,0.462)+(1.560,0.600)
    = ( 6.235 , 9.441 ) = ( 6.235 , 9.441 ) =(6.235,9.441)= (6.235, 9.441)=(6.235,9.441)
  • Denominator:
    i = 1 n μ i , 1 m = 0.556 2 + 0.627 2 + 0.121 2 + 0.381 2 + 0.278 2 + 0.347 2 i = 1 n μ i , 1 m = 0.556 2 + 0.627 2 + 0.121 2 + 0.381 2 + 0.278 2 + 0.347 2 sum_(i=1)^(n)mu_(i,1)^(m)=0.556^(2)+0.627^(2)+0.121^(2)+0.381^(2)+0.278^(2)+0.347^(2)\sum_{i=1}^{n} \mu_{i,1}^m = 0.556^2 + 0.627^2 + 0.121^2 + 0.381^2 + 0.278^2 + 0.347^2i=1nμi,1m=0.5562+0.6272+0.1212+0.3812+0.2782+0.3472
    = 0.309 + 0.393 + 0.015 + 0.145 + 0.077 + 0.120 = 1.059 = 0.309 + 0.393 + 0.015 + 0.145 + 0.077 + 0.120 = 1.059 =0.309+0.393+0.015+0.145+0.077+0.120=1.059= 0.309 + 0.393 + 0.015 + 0.145 + 0.077 + 0.120 = 1.059=0.309+0.393+0.015+0.145+0.077+0.120=1.059
So, the new cluster center for v 1 v 1 v_(1)v_1v1 is:
v 1 = ( 6.235 , 9.441 ) 1.059 = ( 5.89 , 8.91 ) v 1 = ( 6.235 , 9.441 ) 1.059 = ( 5.89 , 8.91 ) v_(1)=((6.235,9.441))/(1.059)=(5.89,8.91)v_1 = \frac{(6.235, 9.441)}{1.059} = (5.89, 8.91)v1=(6.235,9.441)1.059=(5.89,8.91)
For cluster center v 2 v 2 v_(2)v_2v2:
  • Numerator:
    i = 1 n μ i , 2 m x i = 0.444 2 x 1 + 0.373 2 x 2 + 0.879 2 x 3 + 0.619 2 x 4 + 0.722 2 x 5 + 0.653 2 x 6 i = 1 n μ i , 2 m x i = 0.444 2 x 1 + 0.373 2 x 2 + 0.879 2 x 3 + 0.619 2 x 4 + 0.722 2 x 5 + 0.653 2 x 6 sum_(i=1)^(n)mu_(i,2)^(m)x_(i)=0.444^(2)x_(1)+0.373^(2)x_(2)+0.879^(2)x_(3)+0.619^(2)x_(4)+0.722^(2)x_(5)+0.653^(2)x_(6)\sum_{i=1}^{n} \mu_{i,2}^m \mathbf{x}_i = 0.444^2 \mathbf{x}_1 + 0.373^2 \mathbf{x}_2 + 0.879^2 \mathbf{x}_3 + 0.619^2 \mathbf{x}_4 + 0.722^2 \mathbf{x}_5 + 0.653^2 \mathbf{x}_6i=1nμi,2mxi=0.4442x1+0.3732x2+0.8792x3+0.6192x4+0.7222x5+0.6532x6
    = 0.197 ( 3 , 11 ) + 0.139 ( 3 , 10 ) + 0.773 ( 8 , 12 ) + 0.383 ( 10 , 6 ) + 0.521 ( 13 , 6 ) + 0.426 ( 13 , 5 ) = 0.197 ( 3 , 11 ) + 0.139 ( 3 , 10 ) + 0.773 ( 8 , 12 ) + 0.383 ( 10 , 6 ) + 0.521 ( 13 , 6 ) + 0.426 ( 13 , 5 ) =0.197*(3,11)+0.139*(3,10)+0.773*(8,12)+0.383*(10,6)+0.521*(13,6)+0.426*(13,5)= 0.197 \cdot (3, 11) + 0.139 \cdot (3, 10) + 0.773 \cdot (8, 12) + 0.383 \cdot (10, 6) + 0.521 \cdot (13, 6) + 0.426 \cdot (13, 5)=0.197(3,11)+0.139(3,10)+0.773(8,12)+0.383(10,6)+0.521(13,6)+0.426(13,5)
    = ( 0.591 , 2.167 ) + ( 0.417 , 1.386 ) + ( 6.184 , 9.276 ) + ( 3.830 , 2.298 ) + ( 6.773 , 3.126 ) + ( 5.538 , 2.130 ) = ( 0.591 , 2.167 ) + ( 0.417 , 1.386 ) + ( 6.184 , 9.276 ) + ( 3.830 , 2.298 ) + ( 6.773 , 3.126 ) + ( 5.538 , 2.130 ) =(0.591,2.167)+(0.417,1.386)+(6.184,9.276)+(3.830,2.298)+(6.773,3.126)+(5.538,2.130)= (0.591, 2.167) + (0.417, 1.386) + (6.184, 9.276) + (3.830, 2.298) + (6.773, 3.126) + (5.538, 2.130)=(0.591,2.167)+(0.417,1.386)+(6.184,9.276)+(3.830,2.298)+(6.773,3.126)+(5.538,2.130)
    = ( 23.333 , 20.383 ) = ( 23.333 , 20.383 ) =(23.333,20.383)= (23.333, 20.383)=(23.333,20.383)
  • Denominator:
    i = 1 n μ i , 2 m = 0.444 2 + 0.373 2 + 0.879 2 + 0.619 2 + 0.722 2 + 0.653 2 i = 1 n μ i , 2 m = 0.444 2 + 0.373 2 + 0.879 2 + 0.619 2 + 0.722 2 + 0.653 2 sum_(i=1)^(n)mu_(i,2)^(m)=0.444^(2)+0.373^(2)+0.879^(2)+0.619^(2)+0.722^(2)+0.653^(2)\sum_{i=1}^{n} \mu_{i,2}^m = 0.444^2 + 0.373^2 + 0.879^2 + 0.619^2 + 0.722^2 + 0.653^2i=1nμi,2m=0.4442+0.3732+0.8792+0.6192+0.7222+0.6532
    = 0.197 + 0.139 + 0.773 + 0.383 + 0.521 + 0.426 = 2.439 = 0.197 + 0.139 + 0.773 + 0.383 + 0.521 + 0.426 = 2.439 =0.197+0.139+0.773+0.383+0.521+0.426=2.439= 0.197 + 0.139 + 0.773 + 0.383 + 0.521 + 0.426 = 2.439=0.197+0.139+0.773+0.383+0.521+0.426=2.439
So, the new cluster center for v 2 v 2 v_(2)v_2v2 is:
v 2 = ( 23.333 , 20.383 ) 2.439 = ( 9.57 , 8.36 ) v 2 = ( 23.333 , 20.383 ) 2.439 = ( 9.57 , 8.36 ) v_(2)=((23.333,20.383))/(2.439)=(9.57,8.36)v_2 = \frac{(23.333, 20.383)}{2.439} = (9.57, 8.36)v2=(23.333,20.383)2.439=(9.57,8.36)

Final Answer: New Cluster Centers

After one iteration, the new cluster centers are:
v 1 = ( 5.89 , 8.91 ) , v 2 = ( 9.57 , 8.36 ) v 1 = ( 5.89 , 8.91 ) , v 2 = ( 9.57 , 8.36 ) v_(1)=(5.89,8.91),quadv_(2)=(9.57,8.36)v_1 = (5.89, 8.91), \quad v_2 = (9.57, 8.36)v1=(5.89,8.91),v2=(9.57,8.36)
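
The whole iteration is compact enough to verify with a few lines of NumPy. The sketch below reimplements Steps 1-3 above for the given data; it is a minimal check, not a full FCM implementation (no stopping criterion, no further iterations).

```python
import numpy as np

# One FCM iteration for the six points above, with c = 2 clusters and m = 2.
X = np.array([[3, 11], [3, 10], [8, 12], [10, 6], [13, 6], [13, 5]], dtype=float)
V = np.array([[5, 5], [10, 10]], dtype=float)   # initial centers v1, v2
m = 2

# Step 1: Euclidean distances of every point to every center (6 x 2 matrix).
D = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)

# Step 2: membership matrix, mu_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)).
U = 1.0 / np.sum((D[:, :, None] / D[:, None, :]) ** (2 / (m - 1)), axis=2)

# Step 3: center update, v_j = sum_i mu_ij^m x_i / sum_i mu_ij^m.
Um = U ** m
V_new = (Um.T @ X) / Um.sum(axis=0)[:, None]
print(np.round(V_new, 2))   # approximately [[5.89, 8.91], [9.57, 8.36]]
```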




Question:-5(a)

Define Error Correction Learning with examples.

Answer:

Error Correction Learning is a learning process in machine learning and neural networks where the model adjusts its parameters (such as weights) based on the difference between the actual output and the desired (or target) output. The main goal is to minimize the error, typically defined as the difference between the predicted output and the actual output, over a series of learning iterations.

Key Concept:

In Error Correction Learning, the model tries to correct its errors by adjusting its internal parameters using the error signal. The error signal is calculated as the difference between the desired output and the actual output. This process is iterative, meaning that the model repeatedly adjusts its parameters to reduce the error and eventually learn the correct mapping from inputs to outputs.

General Steps:

  1. Initial Prediction: The model makes an initial prediction using its current parameters (e.g., weights in a neural network).
  2. Compute Error: The error is computed by comparing the actual prediction to the desired (target) output: Error = Target Output Actual Output Error = Target Output Actual Output “Error”=”Target Output”-“Actual Output”\text{Error} = \text{Target Output} – \text{Actual Output}Error=Target OutputActual Output
  3. Parameter Adjustment: The model adjusts its parameters to reduce this error. This is typically done using a learning rule, such as gradient descent, where the parameters are updated proportionally to the error: New Parameter = Old Parameter + Learning Rate × Error × Input New Parameter = Old Parameter + Learning Rate × Error × Input “New Parameter”=”Old Parameter”+”Learning Rate”xx”Error”xx”Input”\text{New Parameter} = \text{Old Parameter} + \text{Learning Rate} \times \text{Error} \times \text{Input}New Parameter=Old Parameter+Learning Rate×Error×Input
  4. Repeat: The process is repeated until the error is minimized (ideally close to zero) or until a stopping criterion (e.g., number of iterations) is reached.
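
As a concrete illustration of this loop, the following minimal Python sketch applies the same update rule to a single linear neuron $y = wx + b$. The data and learning rate are illustrative values chosen for this sketch (the target relationship assumed here is $y = 2x + 1$), not part of the assignment.

```python
# Minimal sketch: error-correction learning for one linear neuron, y = w*x + b.
# Illustrative data only; the assumed target relationship is y = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (-1.0, -1.0)]
w, b, lr = 0.0, 0.0, 0.1          # initial parameters and learning rate

for epoch in range(50):
    for x, target in samples:
        y = w * x + b             # 1. prediction with the current parameters
        error = target - y        # 2. error = target output - actual output
        w += lr * error * x       # 3. adjust each parameter in proportion
        b += lr * error           #    to the error (and the input)

print(round(w, 2), round(b, 2))   # approaches w = 2, b = 1
```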

Types of Error Correction Learning Algorithms:

  • Perceptron Learning Rule: One of the simplest forms of error correction learning, used in a single-layer neural network called a perceptron. The weights are adjusted based on the error between the predicted class and the true class.
  • Delta Rule (Widrow-Hoff Rule): Used in gradient descent for adjusting weights in a linear neuron or perceptron to minimize the squared error between the actual output and target output.
  • Backpropagation: Used in multi-layer neural networks, where error correction is applied at each layer of the network using the gradient of the error with respect to the parameters (weights). This is the basis for modern deep learning.

Example 1: Perceptron Learning Algorithm (Error Correction in Binary Classification)

Consider a simple perceptron model used for binary classification where:
  • Inputs: x = [ x 1 , x 2 ] x = [ x 1 , x 2 ] x=[x_(1),x_(2)]\mathbf{x} = [x_1, x_2]x=[x1,x2]
  • Weights: w = [ w 1 , w 2 ] w = [ w 1 , w 2 ] w=[w_(1),w_(2)]\mathbf{w} = [w_1, w_2]w=[w1,w2]
  • Bias: b b bbb
The output y y yyy of the perceptron is given by the sign of the weighted sum of inputs:
y = sign ( w 1 x 1 + w 2 x 2 + b ) y = sign ( w 1 x 1 + w 2 x 2 + b ) y=”sign”(w_(1)x_(1)+w_(2)x_(2)+b)y = \text{sign}(w_1 x_1 + w_2 x_2 + b)y=sign(w1x1+w2x2+b)
Let’s assume the task is to classify points as either belonging to class +1 or -1. If the perceptron makes a mistake, the weights and bias are updated as:
w i w i + η ( t y ) x i (for each weight w i ) w i w i + η ( t y ) x i (for each weight w i ) w_(i)larrw_(i)+eta(t-y)x_(i)quad(for each weight (w_(i))”)”w_i \leftarrow w_i + \eta (t – y) x_i \quad \text{(for each weight \( w_i \))}wiwi+η(ty)xi(for each weight wi)
b b + η ( t y ) b b + η ( t y ) b larr b+eta(t-y)b \leftarrow b + \eta (t – y)bb+η(ty)
Where:
  • η η eta\etaη is the learning rate,
  • t t ttt is the target output (either +1 or -1),
  • y y yyy is the actual output.

Example with data:

Suppose we are classifying the following points:
| $x_1$ | $x_2$ | Target $t$ |
| :---: | :---: | :---: |
| 1 | 1 | +1 |
| -1 | -1 | -1 |
  1. Initial weights and bias: w 1 = 0 , w 2 = 0 , b = 0 w 1 = 0 , w 2 = 0 , b = 0 w_(1)=0,w_(2)=0,b=0w_1 = 0, w_2 = 0, b = 0w1=0,w2=0,b=0.
  2. First input: x = [ 1 , 1 ] x = [ 1 , 1 ] x=[1,1]\mathbf{x} = [1, 1]x=[1,1], target t = + 1 t = + 1 t=+1t = +1t=+1.
    • The weighted sum is 0 0 000 (since weights are zero), so the predicted output y = 0 y = 0 y=0y = 0y=0.
    • Update rule: w 1 0 + η ( 1 0 ) 1 = η w 1 0 + η ( 1 0 ) 1 = η w_(1)larr0+eta(1-0)1=etaw_1 \leftarrow 0 + \eta(1 – 0)1 = \etaw10+η(10)1=η, and similarly for w 2 w 2 w_(2)w_2w2.
    • So, w 1 = η w 1 = η w_(1)=etaw_1 = \etaw1=η, w 2 = η w 2 = η w_(2)=etaw_2 = \etaw2=η, b = η b = η b=etab = \etab=η.
  3. Second input: x = [ 1 , 1 ] x = [ 1 , 1 ] x=[-1,-1]\mathbf{x} = [-1, -1]x=[1,1], target t = 1 t = 1 t=-1t = -1t=1.
    • The weighted sum is η ( 1 ) + η ( 1 ) + η = η η ( 1 ) + η ( 1 ) + η = η eta(-1)+eta(-1)+eta=-eta\eta(-1) + \eta(-1) + \eta = -\etaη(1)+η(1)+η=η.
    • Predicted output y = 1 y = 1 y=-1y = -1y=1, which is correct. No weight update is necessary.
The weights are adjusted after each incorrect prediction, correcting the error step-by-step.
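
A minimal Python sketch of this trace is given below. It uses the same two training points, treats a weighted sum of exactly zero as output 0 (as in the trace above), and assumes $\eta = 1$ for concreteness.

```python
# Minimal sketch of the perceptron error-correction rule for the two points above.
def sign(s):
    return 1 if s > 0 else (-1 if s < 0 else 0)   # a sum of 0 is treated as output 0

data = [((1, 1), 1), ((-1, -1), -1)]
w = [0.0, 0.0]
b = 0.0
eta = 1.0                            # assumed learning rate

for (x1, x2), t in data:
    y = sign(w[0] * x1 + w[1] * x2 + b)
    if y != t:                       # update only on an incorrect prediction
        w[0] += eta * (t - y) * x1
        w[1] += eta * (t - y) * x2
        b += eta * (t - y)

print(w, b)   # [1.0, 1.0] 1.0 after the first point; no update on the second
```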

Example 2: Backpropagation in a Neural Network

In modern deep learning, backpropagation is an error correction learning method used to train multi-layer neural networks. Consider a simple feedforward neural network with:
  • One input layer,
  • One hidden layer,
  • One output layer.
  1. Forward Pass: The input data is passed through the network to produce an output.
  2. Compute Error: The error is computed using a loss function, such as mean squared error (MSE): E = 1 2 ( t y ) 2 E = 1 2 ( t y ) 2 E=(1)/(2)(t-y)^(2)E = \frac{1}{2}(t – y)^2E=12(ty)2where t t ttt is the target output, and y y yyy is the predicted output.
  3. Backpropagation: The error is propagated backward through the network. The weights in each layer are adjusted based on the gradient of the error with respect to each weight: w w η E w w w η E w w larr w-eta(del E)/(del w)w \leftarrow w – \eta \frac{\partial E}{\partial w}wwηEw
  4. Update Weights: The weights are updated, and the process is repeated for each training example.
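
A minimal sketch of one such forward/backward pass for a tiny 1-2-1 network (sigmoid hidden units, linear output, squared-error loss) is given below; the network size, inputs, and learning rate are illustrative assumptions.

```python
import numpy as np

# Minimal sketch: one backpropagation step for a 1-2-1 network with sigmoid
# hidden units, a linear output, and loss E = 0.5 * (t - y)^2 (illustrative).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 1)), np.zeros((2, 1))   # input -> hidden
W2, b2 = rng.normal(size=(1, 2)), np.zeros((1, 1))   # hidden -> output
x, t, eta = np.array([[0.5]]), np.array([[1.0]]), 0.1

# 1. Forward pass.
h = sigmoid(W1 @ x + b1)
y = W2 @ h + b2

# 2. Compute the error (loss).
E = 0.5 * float((t - y) ** 2)
print("loss before update:", round(E, 4))

# 3. Backpropagation: push dE/dy back through the layers.
dy = -(t - y)                        # dE/dy
dW2, db2 = dy @ h.T, dy
dh = W2.T @ dy
dz1 = dh * h * (1 - h)               # sigmoid derivative
dW1, db1 = dz1 @ x.T, dz1

# 4. Gradient-descent weight update, w <- w - eta * dE/dw.
W1 -= eta * dW1; b1 -= eta * db1
W2 -= eta * dW2; b2 -= eta * db2
```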

Example in Context:

Let’s assume you have a neural network that classifies handwritten digits. During training, if the network predicts a digit incorrectly (e.g., predicts "3" when the correct label is "8"), the backpropagation algorithm adjusts the network’s weights to reduce the error in future predictions. Over time, the model corrects its errors and becomes more accurate in classifying digits.

Conclusion:

Error Correction Learning is a fundamental concept used to update parameters in machine learning models, enabling them to reduce prediction errors iteratively. It forms the basis for algorithms such as perceptron learning, delta rule, and backpropagation in neural networks.




Question:-5(b)

Write the types of Neural Memory Models. Also, give one example of each.

Answer:

Neural memory models are neural networks designed to mimic memory functions in computational systems. These models store, retrieve, and process information in ways similar to how memory works in biological systems. Below are the main types of Neural Memory Models, along with an example of each:

1. Associative Memory Models

These models are designed to store patterns and recall them when a similar or noisy pattern is presented. They "associate" an input with a stored pattern, even if the input is incomplete or noisy.

Example: Hopfield Network

  • Hopfield Network is a type of recurrent neural network that stores patterns as stable states. When an input pattern is presented, the network iteratively updates until it converges to one of the stored patterns, even if the input is noisy or incomplete.
    Application: Pattern recognition, error correction, or auto-association tasks.

2. Content-Addressable Memory (CAM)

These models allow for data retrieval based on the content of the data itself rather than an explicit memory address, making them a type of associative memory that searches for stored patterns matching the input.

Example: Bidirectional Associative Memory (BAM)

  • BAM is a recurrent associative memory model capable of storing a set of pattern pairs, where one pattern can be used to recall its associated counterpart. It works bidirectionally: both forward and backward (i.e., pattern A A AAA retrieves pattern B B BBB and vice versa).
    Application: Paired pattern recognition (e.g., recalling a person’s face given their name or vice versa).

3. Recurrent Neural Networks (RNNs)

RNNs are neural networks with connections that form directed cycles, giving them the ability to maintain a "memory" of previous inputs through internal state loops. This makes RNNs particularly effective for tasks involving sequences or time-series data.

Example: Long Short-Term Memory (LSTM)

  • LSTM is a special kind of RNN capable of learning long-term dependencies. It uses memory cells to store information over time and has mechanisms (gates) to control the flow of information, solving the vanishing gradient problem common in traditional RNNs.
    Application: Time-series prediction, natural language processing (NLP), and speech recognition (e.g., predicting the next word in a sentence).

4. Self-Organizing Maps (SOMs)

SOMs are unsupervised learning models that project high-dimensional data into lower-dimensional spaces while preserving topological relationships. They store information by organizing neurons based on the similarity between input data.

Example: Kohonen Self-Organizing Map (SOM)

  • Kohonen SOM is a neural network that organizes and clusters data based on similarity. The map consists of a grid of neurons that adapt to input data, and nearby neurons in the grid represent similar input data.
    Application: Data clustering, visualization of high-dimensional data, such as customer segmentation or feature extraction.

5. Memory Networks

Memory networks are models that include an external memory component, allowing the network to store information in an explicit memory bank and retrieve it as needed. These models are commonly used in tasks that require both reasoning and the ability to handle large amounts of information.

Example: Neural Turing Machine (NTM)

  • NTM is a neural network model that includes an external memory matrix, enabling it to read from and write to the memory bank. The model learns to use this memory through backpropagation and is capable of complex tasks like copying or sorting sequences.
    Application: Reasoning tasks, algorithmic tasks like sorting or copying sequences, and solving problems that require both memory and computation.

6. Transformer-based Models

These models use attention mechanisms to handle long-range dependencies in data sequences by focusing on relevant parts of the input data, enabling parallel processing. They are particularly efficient for sequence-to-sequence tasks without needing a recurrent structure.

Example: BERT (Bidirectional Encoder Representations from Transformers)

  • BERT is a transformer-based model that uses attention mechanisms to learn contextual relationships between words in a text. It captures information from both directions (left and right context) in a sentence, enabling more accurate language understanding.
    Application: Natural language understanding, question-answering systems, and text classification.

Summary of Types and Examples:

  1. Associative Memory Models: Hopfield Network (Pattern recognition).
  2. Content-Addressable Memory (CAM): Bidirectional Associative Memory (BAM) (Pattern pair recognition).
  3. Recurrent Neural Networks (RNNs): LSTM (Time-series prediction, NLP).
  4. Self-Organizing Maps (SOMs): Kohonen SOM (Data clustering, visualization).
  5. Memory Networks: Neural Turing Machine (NTM) (Complex tasks with memory).
  6. Transformer-based Models: BERT (NLP tasks, text classification).
These neural memory models are fundamental in various applications, including pattern recognition, language processing, and reasoning tasks.




Question:-6

Consider the set of pattern vectors P. Obtain the connectivity matrix (CM) for the patterns in P P PPP (four patterns).

p = [ 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 1 1 0 1 0 1 0 1 0 1 0 ] p = 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 1 1 0 1 0 1 0 1 0 1 0 p=[[1,1,1,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,1,1,1],[1,1,1,0,0,0,0,0,0,1],[1,0,1,0,1,0,1,0,1,0]]\mathrm{p}=\left[\begin{array}{llllllllll} 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \end{array}\right]p=[1110000000000000011111100000011010101010]

Answer:

The Connectivity Matrix (CM), also known as the weight matrix in the context of associative memory (such as Hopfield Networks), represents the strength of connections between neurons (pattern elements). For a given set of pattern vectors, the connectivity matrix is symmetric, with each element of the matrix representing the degree of interaction between two neurons.

Steps to Calculate the Connectivity Matrix (CM):

  1. Given Pattern Set P P PPP:
    We are given four pattern vectors p 1 , p 2 , p 3 , p 4 p 1 , p 2 , p 3 , p 4 p_(1),p_(2),p_(3),p_(4)\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3, \mathbf{p}_4p1,p2,p3,p4, each consisting of 10 elements:
    p 1 = [ 1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ] p 1 = [ 1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ] p_(1)=[1,1,1,0,0,0,0,0,0,0]\mathbf{p}_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]p1=[1,1,1,0,0,0,0,0,0,0]
    p 2 = [ 0 , 0 , 0 , 0 , 0 , 0 , 0 , 1 , 1 , 1 ] p 2 = [ 0 , 0 , 0 , 0 , 0 , 0 , 0 , 1 , 1 , 1 ] p_(2)=[0,0,0,0,0,0,0,1,1,1]\mathbf{p}_2 = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]p2=[0,0,0,0,0,0,0,1,1,1]
    p 3 = [ 1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 1 ] p 3 = [ 1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 1 ] p_(3)=[1,1,1,0,0,0,0,0,0,1]\mathbf{p}_3 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]p3=[1,1,1,0,0,0,0,0,0,1]
    p 4 = [ 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 ] p 4 = [ 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 ] p_(4)=[1,0,1,0,1,0,1,0,1,0]\mathbf{p}_4 = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]p4=[1,0,1,0,1,0,1,0,1,0]
  2. Connectivity Matrix Formula:
    The connectivity matrix W W WWW for a set of N N NNN patterns, where each pattern has n n nnn elements, is calculated as:
    W = k = 1 N p k T p k W = k = 1 N p k T p k W=sum_(k=1)^(N)p_(k)^(T)p_(k)W = \sum_{k=1}^{N} \mathbf{p}_k^T \mathbf{p}_kW=k=1NpkTpk
    where p k T p k p k T p k p_(k)^(T)p_(k)\mathbf{p}_k^T \mathbf{p}_kpkTpk is the outer product of pattern p k p k p_(k)\mathbf{p}_kpk with itself.
    The matrix will have dimensions 10 × 10 10 × 10 10 xx1010 \times 1010×10, since each pattern vector has 10 elements.
  3. Outer Product Calculation:
    For each pattern vector p k p k p_(k)\mathbf{p}_kpk, compute the outer product p k T p k p k T p k p_(k)^(T)p_(k)\mathbf{p}_k^T \mathbf{p}_kpkTpk, which results in a 10 × 10 10 × 10 10 xx1010 \times 1010×10 matrix. The final weight matrix W W WWW is the sum of these outer products for all four patterns.

Step-by-Step Outer Product Calculations

For p 1 = [ 1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ] p 1 = [ 1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ] p_(1)=[1,1,1,0,0,0,0,0,0,0]\mathbf{p}_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]p1=[1,1,1,0,0,0,0,0,0,0]:

p 1 T p 1 = [ 1 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] p 1 T p 1 = 1 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 p_(1)^(T)p_(1)=[[1,1,1,0,0,0,0,0,0,0],[1,1,1,0,0,0,0,0,0,0],[1,1,1,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0]]\mathbf{p}_1^T \mathbf{p}_1 = \begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}p1Tp1=[1110000000111000000011100000000000000000000000000000000000000000000000000000000000000000000000000000]

For p 2 = [ 0 , 0 , 0 , 0 , 0 , 0 , 0 , 1 , 1 , 1 ] p 2 = [ 0 , 0 , 0 , 0 , 0 , 0 , 0 , 1 , 1 , 1 ] p_(2)=[0,0,0,0,0,0,0,1,1,1]\mathbf{p}_2 = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]p2=[0,0,0,0,0,0,0,1,1,1]:

p 2 T p 2 = [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 ] p 2 T p 2 = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 p_(2)^(T)p_(2)=[[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,1,1,1],[0,0,0,0,0,0,0,1,1,1],[0,0,0,0,0,0,0,1,1,1]]\mathbf{p}_2^T \mathbf{p}_2 = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \end{bmatrix}p2Tp2=[0000000000000000000000000000000000000000000000000000000000000000000000000000011100000001110000000111]

For p 3 = [ 1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 1 ] p 3 = [ 1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 1 ] p_(3)=[1,1,1,0,0,0,0,0,0,1]\mathbf{p}_3 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]p3=[1,1,1,0,0,0,0,0,0,1]:

p 3 T p 3 = [ 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 ] p 3 T p 3 = 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 p_(3)^(T)p_(3)=[[1,1,1,0,0,0,0,0,0,1],[1,1,1,0,0,0,0,0,0,1],[1,1,1,0,0,0,0,0,0,1],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0],[1,1,1,0,0,0,0,0,0,1]]\mathbf{p}_3^T \mathbf{p}_3 = \begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}p3Tp3=[1110000001111000000111100000010000000000000000000000000000000000000000000000000000000000001110000001]

For p 4 = [ 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 ] p 4 = [ 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 , 1 , 0 ] p_(4)=[1,0,1,0,1,0,1,0,1,0]\mathbf{p}_4 = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]p4=[1,0,1,0,1,0,1,0,1,0]:

p 4 T p 4 = [ 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 ] p 4 T p 4 = 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 p_(4)^(T)p_(4)=[[1,0,1,0,1,0,1,0,1,0],[0,0,0,0,0,0,0,0,0,0],[1,0,1,0,1,0,1,0,1,0],[0,0,0,0,0,0,0,0,0,0],[1,0,1,0,1,0,1,0,1,0],[0,0,0,0,0,0,0,0,0,0],[1,0,1,0,1,0,1,0,1,0],[0,0,0,0,0,0,0,0,0,0],[1,0,1,0,1,0,1,0,1,0],[0,0,0,0,0,0,0,0,0,0]]\mathbf{p}_4^T \mathbf{p}_4 = \begin{bmatrix} 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}p4Tp4=[1010101010000000000010101010100000000000101010101000000000001010101010000000000010101010100000000000]

Step 4: Sum of the Outer Products

Now, sum the outer products to get the final connectivity matrix W W WWW:
W = p 1 T p 1 + p 2 T p 2 + p 3 T p 3 + p 4 T p 4 W = p 1 T p 1 + p 2 T p 2 + p 3 T p 3 + p 4 T p 4 W=p_(1)^(T)p_(1)+p_(2)^(T)p_(2)+p_(3)^(T)p_(3)+p_(4)^(T)p_(4)W = \mathbf{p}_1^T \mathbf{p}_1 + \mathbf{p}_2^T \mathbf{p}_2 + \mathbf{p}_3^T \mathbf{p}_3 + \mathbf{p}_4^T \mathbf{p}_4W=p1Tp1+p2Tp2+p3Tp3+p4Tp4
Summing all the matrices element-wise, we get:
$$W = \begin{bmatrix}
3 & 2 & 3 & 0 & 1 & 0 & 1 & 0 & 1 & 1 \\
2 & 2 & 2 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
3 & 2 & 3 & 0 & 1 & 0 & 1 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 2 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 2
\end{bmatrix}$$

Final Answer: Connectivity Matrix W W WWW

The Connectivity Matrix (CM) is:
$$W = \begin{bmatrix}
3 & 2 & 3 & 0 & 1 & 0 & 1 & 0 & 1 & 1 \\
2 & 2 & 2 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
3 & 2 & 3 & 0 & 1 & 0 & 1 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 2 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 2
\end{bmatrix}$$
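
This element-wise sum is easy to verify mechanically; a minimal NumPy sketch is shown below. (If $W$ is to be used directly as a Hopfield weight matrix, the diagonal self-connections are usually zeroed afterwards.)

```python
import numpy as np

# Connectivity matrix as the sum of outer products of the four patterns.
P = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 1],
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
])

W = sum(np.outer(p, p) for p in P)   # W = sum_k p_k^T p_k, a 10 x 10 matrix
print(W)

# For a Hopfield-style weight matrix the self-connections are often removed:
# np.fill_diagonal(W, 0)
```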




Question:-7(a)

Define Kohonen networks with examples.

Answer:

Kohonen Networks, also known as Self-Organizing Maps (SOMs), are a type of unsupervised neural network used for clustering and visualizing high-dimensional data by projecting it into a lower-dimensional space, typically a two-dimensional grid. These networks are named after Teuvo Kohonen, who introduced this technique.

Key Characteristics of Kohonen Networks:

  • Unsupervised Learning: SOMs do not require labeled data. They automatically group similar data points together based on patterns in the input data.
  • Topological Preservation: The network attempts to maintain the relative distance of data points in the input space, meaning similar data points in the input will map to nearby nodes in the output map.
  • Dimensionality Reduction: High-dimensional input data is mapped onto a low-dimensional grid, often 2D, making it easier to visualize complex data patterns.

Structure of Kohonen Networks:

A Kohonen Network consists of:
  1. Input Layer: The input vector representing the features of the data.
  2. Output Layer (Map Units): A grid of neurons (or nodes), usually arranged in a 2D lattice. Each neuron in this grid has a weight vector of the same dimension as the input vector.
  3. Weight Vectors: Each node in the output layer is associated with a weight vector that is iteratively adjusted during the learning process to represent the input data.

Working of Kohonen Networks (SOM):

  1. Initialization: The weights of the neurons are initialized, typically with small random values.
  2. Input Vector Presentation: An input vector from the dataset is presented to the network.
  3. Best Matching Unit (BMU): The Euclidean distance between the input vector and each neuron’s weight vector is calculated. The neuron whose weight vector is closest to the input vector is called the Best Matching Unit (BMU).
    BMU = arg min j x w j BMU = arg min j x w j “BMU”=arg min _(j)||x-w_(j)||\text{BMU} = \arg \min_j \|\mathbf{x} – \mathbf{w}_j\|BMU=argminjxwj
    where x x x\mathbf{x}x is the input vector, and w j w j w_(j)\mathbf{w}_jwj is the weight vector of neuron j j jjj.
  4. Weight Update: The BMU and its neighboring neurons have their weights updated to become more similar to the input vector. The amount of update depends on the learning rate and the neighborhood function, which shrinks over time.
    w j ( t + 1 ) = w j ( t ) + η ( t ) h ( j , t ) ( x ( t ) w j ( t ) ) w j ( t + 1 ) = w j ( t ) + η ( t ) h ( j , t ) ( x ( t ) w j ( t ) ) w_(j)(t+1)=w_(j)(t)+eta(t)*h(j,t)*(x(t)-w_(j)(t))\mathbf{w}_j(t+1) = \mathbf{w}_j(t) + \eta(t) \cdot h(j,t) \cdot (\mathbf{x}(t) – \mathbf{w}_j(t))wj(t+1)=wj(t)+η(t)h(j,t)(x(t)wj(t))
    where:
    • w j ( t + 1 ) w j ( t + 1 ) w_(j)(t+1)\mathbf{w}_j(t+1)wj(t+1) is the updated weight vector,
    • η ( t ) η ( t ) eta(t)\eta(t)η(t) is the learning rate,
    • h ( j , t ) h ( j , t ) h(j,t)h(j,t)h(j,t) is the neighborhood function,
    • x ( t ) x ( t ) x(t)\mathbf{x}(t)x(t) is the input vector.
  5. Iteration: The process is repeated for multiple iterations and for all input vectors until the weights stabilize, and the network becomes "self-organized."

Example of Kohonen Network:

Dataset:

Consider a simple dataset of two-dimensional points:
| Data Point | Feature 1 (X) | Feature 2 (Y) |
| :---: | :---: | :---: |
| A | 0.1 | 0.6 |
| B | 0.8 | 0.7 |
| C | 0.4 | 0.3 |
| D | 0.9 | 0.2 |
We want to cluster this data using a Kohonen network with a 2 × 2 2 × 2 2xx22 \times 22×2 grid of neurons.

Steps:

  1. Initialization: Assume the weights of the neurons in the 2 × 2 2 × 2 2xx22 \times 22×2 grid are initialized as random 2D vectors:
    Weights of neurons: w 1 = ( 0.2 , 0.4 ) , w 2 = ( 0.6 , 0.1 ) , w 3 = ( 0.7 , 0.9 ) , w 4 = ( 0.5 , 0.5 ) Weights of neurons: w 1 = ( 0.2 , 0.4 ) , w 2 = ( 0.6 , 0.1 ) , w 3 = ( 0.7 , 0.9 ) , w 4 = ( 0.5 , 0.5 ) “Weights of neurons: “w_(1)=(0.2,0.4),w_(2)=(0.6,0.1),w_(3)=(0.7,0.9),w_(4)=(0.5,0.5)\text{Weights of neurons: } \mathbf{w}_1 = (0.2, 0.4), \mathbf{w}_2 = (0.6, 0.1), \mathbf{w}_3 = (0.7, 0.9), \mathbf{w}_4 = (0.5, 0.5)Weights of neurons: w1=(0.2,0.4),w2=(0.6,0.1),w3=(0.7,0.9),w4=(0.5,0.5)
  2. Input Vector Presentation: Present the first data point x A = ( 0.1 , 0.6 ) x A = ( 0.1 , 0.6 ) x_(A)=(0.1,0.6)\mathbf{x}_A = (0.1, 0.6)xA=(0.1,0.6).
  3. BMU Selection: Calculate the Euclidean distance between the input point x A x A x_(A)\mathbf{x}_AxA and each neuron’s weight:
    $d(\mathbf{x}_A, \mathbf{w}_1) = \sqrt{(0.1 - 0.2)^2 + (0.6 - 0.4)^2} = 0.224$
    $d(\mathbf{x}_A, \mathbf{w}_2) = \sqrt{(0.1 - 0.6)^2 + (0.6 - 0.1)^2} = 0.707$
    $d(\mathbf{x}_A, \mathbf{w}_3) = \sqrt{(0.1 - 0.7)^2 + (0.6 - 0.9)^2} = 0.671$
    $d(\mathbf{x}_A, \mathbf{w}_4) = \sqrt{(0.1 - 0.5)^2 + (0.6 - 0.5)^2} = 0.412$
    The neuron with the smallest distance is neuron 1, so it is the BMU.
  4. Update Weights: Update the weight of neuron 1 (the BMU) and its neighboring neurons. Assuming a learning rate η = 0.5 η = 0.5 eta=0.5\eta = 0.5η=0.5 and that neuron 4 is within the neighborhood, the updated weights are:
    w 1 = ( 0.2 , 0.4 ) + 0.5 ( ( 0.1 , 0.6 ) ( 0.2 , 0.4 ) ) = ( 0.15 , 0.5 ) w 1 = ( 0.2 , 0.4 ) + 0.5 ( ( 0.1 , 0.6 ) ( 0.2 , 0.4 ) ) = ( 0.15 , 0.5 ) w_(1)=(0.2,0.4)+0.5*((0.1,0.6)-(0.2,0.4))=(0.15,0.5)\mathbf{w}_1 = (0.2, 0.4) + 0.5 \cdot ((0.1, 0.6) – (0.2, 0.4)) = (0.15, 0.5)w1=(0.2,0.4)+0.5((0.1,0.6)(0.2,0.4))=(0.15,0.5)
    w 4 = ( 0.5 , 0.5 ) + 0.5 ( ( 0.1 , 0.6 ) ( 0.5 , 0.5 ) ) = ( 0.3 , 0.55 ) w 4 = ( 0.5 , 0.5 ) + 0.5 ( ( 0.1 , 0.6 ) ( 0.5 , 0.5 ) ) = ( 0.3 , 0.55 ) w_(4)=(0.5,0.5)+0.5*((0.1,0.6)-(0.5,0.5))=(0.3,0.55)\mathbf{w}_4 = (0.5, 0.5) + 0.5 \cdot ((0.1, 0.6) – (0.5, 0.5)) = (0.3, 0.55)w4=(0.5,0.5)+0.5((0.1,0.6)(0.5,0.5))=(0.3,0.55)
  5. Repeat: The process is repeated for all input vectors until the weights converge.
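
A minimal Python sketch of a single BMU selection and weight update, using the initial weights, the data point A, and the learning rate from the example above (the neighborhood here is assumed to be just the BMU and neuron 4, as stated in step 4):

```python
import numpy as np

# One Kohonen/SOM step for the worked example above.
weights = np.array([[0.2, 0.4], [0.6, 0.1], [0.7, 0.9], [0.5, 0.5]])
x_a = np.array([0.1, 0.6])
eta = 0.5

dists = np.linalg.norm(x_a - weights, axis=1)   # distances to every neuron
bmu = int(np.argmin(dists))                     # index 0, i.e. neuron 1

neighborhood = {bmu, 3}                         # assumed: BMU plus neuron 4
for j in neighborhood:
    weights[j] += eta * (x_a - weights[j])      # move the weights toward the input

print(np.round(dists, 3))     # [0.224 0.707 0.671 0.412]
print(np.round(weights, 3))   # neuron 1 -> [0.15 0.5], neuron 4 -> [0.3 0.55]
```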

Output:

After several iterations, the neurons will represent clusters of the input data. The data points that are closer in feature space will be mapped to nearby neurons in the 2D grid.

Applications of Kohonen Networks:

  1. Data Clustering: SOMs are widely used to cluster high-dimensional data into groups where similar data points are grouped together in the map. For example, clustering customer profiles based on purchasing behavior.
  2. Dimensionality Reduction: Kohonen networks can be used to reduce the dimensionality of data while preserving topological relationships. For example, reducing 100-dimensional data into a 2D map for visualization.
  3. Pattern Recognition: Kohonen networks can be applied to recognize patterns in input data, such as in speech recognition and image classification.
  4. Visualization: They are also used for visualizing complex relationships in high-dimensional datasets. For instance, visualizing a large number of genes in a 2D map based on gene expression data.

Conclusion:

Kohonen Networks (Self-Organizing Maps) are powerful tools for unsupervised learning that help cluster and visualize high-dimensional data by mapping it to a lower-dimensional grid. They are used in various fields like data mining, pattern recognition, and feature extraction due to their ability to preserve topological relationships and uncover hidden patterns in the data.




Question:-7(b)

Describe the Function Approximation in MLP. Also, explain Generalization of MLP.

Answer:

Function Approximation in Multi-Layer Perceptron (MLP)

A Multi-Layer Perceptron (MLP) is a type of feedforward neural network that can approximate complex, non-linear functions. In function approximation, the goal is to learn a mapping from input data (features) to output data (target values), based on a set of training examples. MLPs are capable of approximating both continuous and discrete functions by adjusting their weights and biases through training.

Key Components of Function Approximation in MLP:

  1. Layers:
    • Input Layer: Receives input data (features) x = [ x 1 , x 2 , , x n ] x = [ x 1 , x 2 , , x n ] x=[x_(1),x_(2),dots,x_(n)]\mathbf{x} = [x_1, x_2, \dots, x_n]x=[x1,x2,,xn].
    • Hidden Layers: One or more layers that process the inputs using weights, biases, and activation functions. These layers extract and learn complex patterns in the data.
    • Output Layer: Produces the final approximation of the function. The number of neurons in this layer corresponds to the number of outputs (e.g., a single neuron for scalar output or multiple neurons for vector outputs).
  2. Activation Functions:
    • Activation functions introduce non-linearity into the model, which allows MLPs to approximate complex functions. Common activation functions include sigmoid, ReLU (Rectified Linear Unit), and tanh.
  3. Training Process:
    • Feedforward: During the forward pass, the input data is propagated through the network layer by layer, using the weights and biases of the neurons to compute outputs.
    • Loss Function: The difference between the actual output (from the MLP) and the desired output (target) is calculated using a loss function, such as mean squared error (MSE) for regression tasks or cross-entropy for classification.
    • Backpropagation and Weight Update: The error is propagated back through the network, and the weights are updated using an optimization algorithm like gradient descent to minimize the error.
  4. Universal Approximation Theorem:
    The Universal Approximation Theorem states that an MLP with at least one hidden layer containing a sufficient number of neurons can approximate any continuous function to any desired accuracy, provided that an appropriate activation function (e.g., sigmoid or tanh) is used. This theorem emphasizes that MLPs are powerful tools for function approximation, capable of learning complex, non-linear mappings from inputs to outputs.

Example of Function Approximation in MLP:

Suppose we want to approximate the function f ( x ) = x 2 + 2 x + 1 f ( x ) = x 2 + 2 x + 1 f(x)=x^(2)+2x+1f(x) = x^2 + 2x + 1f(x)=x2+2x+1. Given a dataset of input-output pairs ( x , f ( x ) ) ( x , f ( x ) ) (x,f(x))(x, f(x))(x,f(x)), an MLP can be trained to learn this quadratic function. The MLP would adjust its weights and biases to minimize the error between the predicted and actual outputs. After training, the MLP can generalize and predict the output for unseen inputs, effectively approximating the function f ( x ) f ( x ) f(x)f(x)f(x).
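
A minimal sketch of this kind of function approximation is shown below, using scikit-learn's `MLPRegressor` as one possible implementation; the hyperparameters (two hidden layers of 32 tanh units, 500 training samples on $[-3, 3]$) are illustrative choices, not prescribed by the text.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Approximating f(x) = x^2 + 2x + 1 with a small MLP (illustrative settings).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))            # training inputs
y = X[:, 0] ** 2 + 2 * X[:, 0] + 1               # target values f(x)

mlp = MLPRegressor(hidden_layer_sizes=(32, 32),  # two hidden layers
                   activation="tanh",
                   max_iter=5000,
                   random_state=0)
mlp.fit(X, y)

x_test = np.array([[0.5], [2.0]])
print(np.round(mlp.predict(x_test), 2))          # should be close to [2.25, 9.0]
```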

Generalization of Multi-Layer Perceptron (MLP)

Generalization refers to the MLP’s ability to correctly predict or classify unseen data after being trained on a given dataset. In other words, a well-generalized MLP can apply what it has learned during training to new, previously unseen inputs and produce accurate predictions.

Key Concepts of Generalization:

  1. Training vs. Testing:
    • During training, the MLP adjusts its weights based on the given training data to minimize the error between predicted and actual outputs.
    • Generalization is evaluated on test data, which the MLP has never seen during training. If the model performs well on this unseen test data, it is said to generalize well.
  2. Overfitting:
    • Overfitting occurs when the MLP performs very well on the training data but poorly on the test data. This happens because the network has learned the noise and specific details in the training data rather than capturing the underlying pattern.
    • Overfitting can be recognized when the training error continues to decrease, but the test error starts to increase. It indicates that the model has become too complex, memorizing the training data rather than generalizing from it.
  3. Underfitting:
    • Underfitting occurs when the MLP is too simple (e.g., having too few neurons or layers) to capture the underlying patterns in the data. The model performs poorly on both the training and test datasets because it fails to learn the function properly.

Techniques to Improve Generalization:

  1. Regularization:
    • Regularization techniques like L2 regularization (weight decay) or L1 regularization (lasso) add a penalty term to the loss function to constrain the weights and prevent the model from becoming too complex.
    • The loss function is modified to include the regularization term: Loss = MSE + λ w i 2 Loss = MSE + λ w i 2 “Loss”=”MSE”+lambda sumw_(i)^(2)\text{Loss} = \text{MSE} + \lambda \sum w_i^2Loss=MSE+λwi2where λ λ lambda\lambdaλ is the regularization coefficient, and w i w i w_(i)w_iwi are the weights.
  2. Dropout:
    • Dropout is a regularization technique where, during each training iteration, random neurons are "dropped" (i.e., temporarily ignored) from the network. This prevents the network from becoming too reliant on specific neurons and encourages the model to learn more robust features.
  3. Early Stopping:
    • Early stopping involves monitoring the performance of the MLP on a validation dataset during training. If the validation error starts to increase while the training error continues to decrease, training is stopped to avoid overfitting.
  4. Cross-Validation:
    • K-fold cross-validation is used to ensure that the MLP generalizes well. The dataset is split into k k kkk subsets, and the model is trained on k 1 k 1 k-1k-1k1 subsets and tested on the remaining subset. This process is repeated k k kkk times, with a different subset used as the test set each time. The final result is averaged over all folds, providing a more robust measure of generalization.
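As an illustration (not part of the original solution), the following sketch shows the index bookkeeping behind k-fold cross-validation using only NumPy; the sample size and fold count are arbitrary.

```python
# Minimal sketch: k-fold index splitting for cross-validation using NumPy only.
import numpy as np

def kfold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Example: 10 samples, 5 folds -> every sample is tested exactly once.
for fold, (tr, te) in enumerate(kfold_indices(10, 5)):
    print(f"fold {fold}: train={tr.tolist()} test={te.tolist()}")
```

The model would be trained on each `train_idx` split and scored on the corresponding `test_idx` split, and the scores averaged over the folds.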

Example of Generalization in MLP:

Consider an MLP trained to classify handwritten digits (e.g., the MNIST dataset). After training the network on thousands of labeled digit images, the MLP’s generalization ability is tested on a separate test set containing different handwritten digits that the network has never seen before. If the MLP can correctly classify these unseen digits with high accuracy, it has generalized well.

Conclusion:

  • Function Approximation in an MLP refers to the network’s ability to learn a mapping from inputs to outputs by adjusting its weights based on training data. MLPs, through their non-linear activation functions and multiple layers, can approximate complex functions.
  • Generalization is the ability of the MLP to perform well on unseen data after training. Good generalization means that the MLP has learned the underlying patterns in the data and not just memorized the training examples. Techniques like regularization, dropout, and early stopping help to improve generalization and prevent overfitting.




Question:-8(a)

Find the length and order of the following schema:

i) S 1 = ( 1 00 1 ) S 1 = 1 00 1 S_(1)=(1^(****)00**1^(****))\mathrm{S}_1=\left(1^{* *} 00 * 1^{* *}\right)S1=(1001)
ii) S 2 = ( 00 1 ) S 2 = 00 1 S_(2)=(^(**)00^(**)1^(****))\mathrm{S}_2=\left({ }^* 00{ }^* 1^{* *}\right)S2=(001)
iii) S 3 = ( 1 ) S 3 = 1 S_(3)=(******1^(******))\mathrm{S}_3=\left(* * * 1^{* * *}\right)S3=(1)

Answer:

In the context of schema theory, we define the length and order of a schema as follows:
  • Length: The length of a schema is the distance between the first and the last fixed positions (positions that are either 0 0 000 or 1 1 111) in the string.
  • Order: The order of a schema is the number of fixed positions (positions that are either 0 0 000 or 1 1 111) in the string.
Now, let’s calculate the length and order for the given schemas.

i) Schema S 1 = ( 1 00 1 ) S 1 = 1 00 1 S_(1)=(1^(****)00**1^(****))\mathrm{S}_1 = \left(1^{**} 00 * 1^{**}\right)S1=(1001)

Order:

  • The fixed positions are:
    • Position 1: 1 1 111,
    • Positions 5 and 6: 00 00 000000,
    • Position 8: 1 1 111.
  • Therefore, the fixed positions are 1, 5, 6, and 8.
  • The number of fixed positions is 4.
Thus, the order of S 1 S 1 S_(1)\mathrm{S}_1S1 is:
Order of S 1 = 4 Order of S 1 = 4 “Order of “S_(1)=4\text{Order of } \mathrm{S}_1 = 4Order of S1=4

Length:

  • The first fixed position is at position 1 (the first 1 1 111).
  • The last fixed position is at position 8 (the last 1 1 111).
  • The distance between these positions is 8 1 = 7 8 1 = 7 8-1=78 – 1 = 781=7.
Thus, the length of S 1 S 1 S_(1)\mathrm{S}_1S1 is:
Length of S 1 = 7 Length of S 1 = 7 “Length of “S_(1)=7\text{Length of } \mathrm{S}_1 = 7Length of S1=7

ii) Schema S 2 = ( 00 1 ) S 2 = 00 1 S_(2)=(^(**)00^(**)1^(****))\mathrm{S}_2 = \left({ }^* 00 { }^* 1^{**}\right)S2=(001)

Order:

  • The fixed positions are:
    • Positions 2 and 3: 00 00 000000,
    • Position 6: 1 1 111.
  • Therefore, the fixed positions are 2, 3, and 6.
  • The number of fixed positions is 3.
Thus, the order of S 2 S 2 S_(2)\mathrm{S}_2S2 is:
Order of S 2 = 3 Order of S 2 = 3 “Order of “S_(2)=3\text{Order of } \mathrm{S}_2 = 3Order of S2=3

Length:

  • The first fixed position is at position 2 (the first 0 0 000).
  • The last fixed position is at position 6 (the 1 1 111).
  • The distance between these positions is 6 2 = 4 6 2 = 4 6-2=46 – 2 = 462=4.
Thus, the length of S 2 S 2 S_(2)\mathrm{S}_2S2 is:
Length of S 2 = 4 Length of S 2 = 4 “Length of “S_(2)=4\text{Length of } \mathrm{S}_2 = 4Length of S2=4

iii) Schema S 3 = ( 1 ) S 3 = 1 S_(3)=(******1^(******))\mathrm{S}_3 = \left(* * * 1^{***}\right)S3=(1)

Order:

  • The fixed position is:
    • Position 4: 1 1 111.
  • There is only one fixed position, which is position 4.
Thus, the order of S 3 S 3 S_(3)\mathrm{S}_3S3 is:
Order of S 3 = 1 Order of S 3 = 1 “Order of “S_(3)=1\text{Order of } \mathrm{S}_3 = 1Order of S3=1

Length:

  • Since there is only one fixed position (at position 4), the length of the schema is 0, because the distance between the first and last fixed positions is 0.
Thus, the length of S 3 S 3 S_(3)\mathrm{S}_3S3 is:
Length of S 3 = 0 Length of S 3 = 0 “Length of “S_(3)=0\text{Length of } \mathrm{S}_3 = 0Length of S3=0

Final Answer:

  • S 1 = ( 1 00 1 ) S 1 = 1 00 1 S_(1)=(1^(****)00**1^(****))\mathrm{S}_1 = \left(1^{**} 00 * 1^{**}\right)S1=(1001):
    • Order: 4
    • Length: 7
  • S 2 = ( 00 1 ) S 2 = 00 1 S_(2)=(^(**)00^(**)1^(****))\mathrm{S}_2 = \left({ }^* 00 { }^* 1^{**}\right)S2=(001):
    • Order: 3
    • Length: 4
  • S 3 = ( 1 ) S 3 = 1 S_(3)=(******1^(******))\mathrm{S}_3 = \left(* * * 1^{***}\right)S3=(1):
    • Order: 1
    • Length: 0
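These results can be double-checked with a small helper. The sketch below is not part of the original solution, and the schema strings passed to it are written to match the fixed positions used in the calculations above (an assumption, since the printed schemas are partly garbled).

```python
# Sketch: order and defining length of a schema written as a string of 0, 1 and *.
def schema_order(schema: str) -> int:
    """Number of fixed (non-*) positions."""
    return sum(1 for c in schema if c in "01")

def schema_length(schema: str) -> int:
    """Distance between the first and last fixed positions (0 if at most one)."""
    fixed = [i for i, c in enumerate(schema) if c in "01"]
    return fixed[-1] - fixed[0] if len(fixed) > 1 else 0

# Assumed string forms, chosen to match the fixed positions used above.
for s in ("1***00*1", "*00**1*", "***1***"):
    print(s, "order =", schema_order(s), "length =", schema_length(s))
# Prints order/length 4/7, 3/4 and 1/0 respectively.
```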




Question:-8(b)

Let an activation function be defined as

ϕ ( v ) = 1 1 + e a v , a > 0 ϕ ( v ) = 1 1 + e a v , a > 0 phi(v)=(1)/(1+e^(-av)),quad a > 0\phi(v) = \frac{1}{1 + e^{-av}}, \quad a > 0ϕ(v)=11+eav,a>0
Show that d ϕ d v = a ϕ ( v ) [ 1 ϕ ( v ) ] d ϕ d v = a ϕ ( v ) [ 1 ϕ ( v ) ] (d phi)/(dv)=a phi(v)[1-phi(v)]\frac{d \phi}{d v}=a \phi(v)[1-\phi(v)]dϕdv=aϕ(v)[1ϕ(v)]. What is the value of ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) at the origin? Also, find the value of ϕ ( v ) ϕ ( v ) phi(v)\phi(\mathrm{v})ϕ(v) as v v vvv approaches + + +oo+\infty+ and -oo-\infty.

Answer:

We are given the activation function:
ϕ ( v ) = 1 1 + e a v , a > 0 ϕ ( v ) = 1 1 + e a v , a > 0 phi(v)=(1)/(1+e^(-av)),quad a > 0\phi(v) = \frac{1}{1 + e^{-av}}, \quad a > 0ϕ(v)=11+eav,a>0
where a a aaa is a positive constant, and we need to:
  1. Show that d ϕ d v = a ϕ ( v ) [ 1 ϕ ( v ) ] d ϕ d v = a ϕ ( v ) [ 1 ϕ ( v ) ] (d phi)/(dv)=a phi(v)[1-phi(v)]\frac{d \phi}{d v} = a \phi(v) [1 – \phi(v)]dϕdv=aϕ(v)[1ϕ(v)],
  2. Find the value of ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) at the origin ( v = 0 v = 0 v=0v = 0v=0),
  3. Find the value of ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) as v v vvv approaches + + +oo+\infty+ and -oo-\infty.

1. Differentiating ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) with respect to v v vvv

The given activation function is:
ϕ ( v ) = 1 1 + e a v . ϕ ( v ) = 1 1 + e a v . phi(v)=(1)/(1+e^(-av)).\phi(v) = \frac{1}{1 + e^{-av}}.ϕ(v)=11+eav.
Let’s differentiate ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) with respect to v v vvv:
d ϕ d v = d d v ( 1 1 + e a v ) . d ϕ d v = d d v 1 1 + e a v . (d phi)/(dv)=(d)/(dv)((1)/(1+e^(-av))).\frac{d \phi}{d v} = \frac{d}{dv} \left( \frac{1}{1 + e^{-av}} \right).dϕdv=ddv(11+eav).
Using the chain rule, let’s first differentiate the denominator. The denominator is 1 + e a v 1 + e a v 1+e^(-av)1 + e^{-av}1+eav, so the derivative of this term with respect to v v vvv is:
d d v ( 1 + e a v ) = a e a v . d d v 1 + e a v = a e a v . (d)/(dv)(1+e^(-av))=-ae^(-av).\frac{d}{dv} \left( 1 + e^{-av} \right) = -a e^{-av}.ddv(1+eav)=aeav.
Now, applying the quotient rule for differentiation:
d ϕ d v = 0 ( 1 + e a v ) ( 1 ) ( a e a v ) ( 1 + e a v ) 2 = a e a v ( 1 + e a v ) 2 . d ϕ d v = 0 ( 1 + e a v ) ( 1 ) ( a e a v ) ( 1 + e a v ) 2 = a e a v ( 1 + e a v ) 2 . (d phi)/(dv)=(0*(1+e^(-av))-(1)*(-ae^(-av)))/((1+e^(-av))^(2))=(ae^(-av))/((1+e^(-av))^(2)).\frac{d \phi}{d v} = \frac{0 \cdot (1 + e^{-av}) – (1) \cdot (-a e^{-av})}{(1 + e^{-av})^2} = \frac{a e^{-av}}{(1 + e^{-av})^2}.dϕdv=0(1+eav)(1)(aeav)(1+eav)2=aeav(1+eav)2.
Since ϕ ( v ) = 1 1 + e a v ϕ ( v ) = 1 1 + e a v phi(v)=(1)/(1+e^(-av))\phi(v) = \frac{1}{1 + e^{-av}}ϕ(v)=11+eav, we can express e a v e a v e^(-av)e^{-av}eav as:
e a v = 1 ϕ ( v ) ϕ ( v ) . e a v = 1 ϕ ( v ) ϕ ( v ) . e^(-av)=(1-phi(v))/(phi(v)).e^{-av} = \frac{1 – \phi(v)}{\phi(v)}.eav=1ϕ(v)ϕ(v).
Substitute this expression for e a v e a v e^(-av)e^{-av}eav into the derivative:
d ϕ d v = a 1 ϕ ( v ) ϕ ( v ) ( 1 + 1 ϕ ( v ) ϕ ( v ) ) 2 = a ϕ ( v ) [ 1 ϕ ( v ) ] . d ϕ d v = a 1 ϕ ( v ) ϕ ( v ) ( 1 + 1 ϕ ( v ) ϕ ( v ) ) 2 = a ϕ ( v ) [ 1 ϕ ( v ) ] . (d phi)/(dv)=(a(1-phi(v))/(phi(v)))/((1+(1-phi(v))/(phi(v)))^(2))=a phi(v)[1-phi(v)].\frac{d \phi}{d v} = \frac{a \frac{1 – \phi(v)}{\phi(v)}}{(1 + \frac{1 – \phi(v)}{\phi(v)})^2} = a \phi(v) [1 – \phi(v)].dϕdv=a1ϕ(v)ϕ(v)(1+1ϕ(v)ϕ(v))2=aϕ(v)[1ϕ(v)].
Thus, we have shown that:
d ϕ d v = a ϕ ( v ) [ 1 ϕ ( v ) ] . d ϕ d v = a ϕ ( v ) [ 1 ϕ ( v ) ] . (d phi)/(dv)=a phi(v)[1-phi(v)].\frac{d \phi}{d v} = a \phi(v) [1 – \phi(v)].dϕdv=aϕ(v)[1ϕ(v)].
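As a quick numerical check (not part of the original derivation), the sketch below compares the analytic derivative aφ(v)[1 - φ(v)] with a central finite-difference estimate; the value of a is arbitrary.

```python
# Numerical sanity check of d(phi)/dv = a * phi(v) * (1 - phi(v)).
import numpy as np

a = 2.0                                   # any a > 0
phi = lambda v: 1.0 / (1.0 + np.exp(-a * v))

v = np.linspace(-4, 4, 9)
h = 1e-6
numeric = (phi(v + h) - phi(v - h)) / (2 * h)   # central finite difference
analytic = a * phi(v) * (1 - phi(v))            # formula derived above

print(np.max(np.abs(numeric - analytic)))        # should be about 1e-10 or smaller
```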

2. Value of ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) at the origin ( v = 0 v = 0 v=0v = 0v=0)

Substitute v = 0 v = 0 v=0v = 0v=0 into the expression for ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v):
ϕ ( 0 ) = 1 1 + e a 0 = 1 1 + 1 = 1 2 . ϕ ( 0 ) = 1 1 + e a 0 = 1 1 + 1 = 1 2 . phi(0)=(1)/(1+e^(-a*0))=(1)/(1+1)=(1)/(2).\phi(0) = \frac{1}{1 + e^{-a \cdot 0}} = \frac{1}{1 + 1} = \frac{1}{2}.ϕ(0)=11+ea0=11+1=12.
Thus, the value of ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) at the origin is:
ϕ ( 0 ) = 1 2 . ϕ ( 0 ) = 1 2 . phi(0)=(1)/(2).\phi(0) = \frac{1}{2}.ϕ(0)=12.

3. Value of ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) as v v vvv approaches + + +oo+\infty+ and -oo-\infty

As v + v + v rarr+oov \to +\inftyv+:

When v v vvv approaches + + +oo+\infty+, e a v e a v e^(-av)e^{-av}eav approaches 0 because e a v e a v e^(-av)e^{-av}eav decays exponentially. So, we have:
ϕ ( v ) = 1 1 + e a v as v + . ϕ ( v ) = 1 1 + e a v as v + . phi(v)=(1)/(1+e^(-av))quad”as”quad v rarr+oo.\phi(v) = \frac{1}{1 + e^{-av}} \quad \text{as} \quad v \to +\infty.ϕ(v)=11+eavasv+.
As e a v 0 e a v 0 e^(-av)rarr0e^{-av} \to 0eav0, we get:
ϕ ( v ) 1 1 + 0 = 1. ϕ ( v ) 1 1 + 0 = 1. phi(v)rarr(1)/(1+0)=1.\phi(v) \to \frac{1}{1 + 0} = 1.ϕ(v)11+0=1.
Thus, as v + v + v rarr+oov \to +\inftyv+, the value of ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) approaches:
lim v + ϕ ( v ) = 1. lim v + ϕ ( v ) = 1. lim_(v rarr+oo)phi(v)=1.\lim_{v \to +\infty} \phi(v) = 1.limv+ϕ(v)=1.

As v v v rarr-oov \to -\inftyv:

When v approaches negative infinity, the exponent -av tends to positive infinity (since a > 0), so e^{-av} grows without bound. So, we have:
ϕ ( v ) = 1 1 + e a v as v . ϕ ( v ) = 1 1 + e a v as v . phi(v)=(1)/(1+e^(-av))quad”as”quad v rarr-oo.\phi(v) = \frac{1}{1 + e^{-av}} \quad \text{as} \quad v \to -\infty.ϕ(v)=11+eavasv.
As e a v e a v e^(-av)rarr ooe^{-av} \to \inftyeav, we get:
ϕ ( v ) 1 1 + = 0. ϕ ( v ) 1 1 + = 0. phi(v)rarr(1)/(1+oo)=0.\phi(v) \to \frac{1}{1 + \infty} = 0.ϕ(v)11+=0.
Thus, as v v v rarr-oov \to -\inftyv, the value of ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) approaches:
lim v ϕ ( v ) = 0. lim v ϕ ( v ) = 0. lim_(v rarr-oo)phi(v)=0.\lim_{v \to -\infty} \phi(v) = 0.limvϕ(v)=0.

Final Results:

  1. We have shown that:
    d ϕ d v = a ϕ ( v ) [ 1 ϕ ( v ) ] . d ϕ d v = a ϕ ( v ) [ 1 ϕ ( v ) ] . (d phi)/(dv)=a phi(v)[1-phi(v)].\frac{d \phi}{d v} = a \phi(v) [1 – \phi(v)].dϕdv=aϕ(v)[1ϕ(v)].
  2. The value of ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) at the origin is:
    ϕ ( 0 ) = 1 2 . ϕ ( 0 ) = 1 2 . phi(0)=(1)/(2).\phi(0) = \frac{1}{2}.ϕ(0)=12.
  3. The value of ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) as v + v + v rarr+oov \to +\inftyv+ is:
    lim v + ϕ ( v ) = 1. lim v + ϕ ( v ) = 1. lim_(v rarr+oo)phi(v)=1.\lim_{v \to +\infty} \phi(v) = 1.limv+ϕ(v)=1.
  4. The value of ϕ ( v ) ϕ ( v ) phi(v)\phi(v)ϕ(v) as v v v rarr-oov \to -\inftyv is:
    lim v ϕ ( v ) = 0. lim v ϕ ( v ) = 0. lim_(v rarr-oo)phi(v)=0.\lim_{v \to -\infty} \phi(v) = 0.limvϕ(v)=0.




Question:-9(a)

Consider the following travelling salesman problem involving 9 cities.

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Parent 1 | G | J | H | F | E | D | B | I | C |
| Parent 2 | D | C | H | J | I | G | E | F | B |
Determine the children solution using:
i) Order crossover #1, assuming 4 th 4 th 4^(“th “)4^{\text {th }}4th and 7 th 7 th 7^(“th “)7^{\text {th }}7th sites as the crossover sites.
ii) Order crossover #2, assuming 3 rd , 5 th 3 rd , 5 th 3^(“rd “),5^(“th “)3^{\text {rd }}, 5^{\text {th }}3rd ,5th and 7 th 7 th 7^(“th “)7^{\text {th }}7th as the key positions.

Answer:

i) Order Crossover #1

In Order Crossover #1 (OX1), we generate two child solutions from two parent solutions by selecting a subset of the tour (sequence of cities) from each parent and preserving the relative order of the remaining cities.
Given:
  • Parent 1: G , J , H , F , E , D , B , I , C G , J , H , F , E , D , B , I , C G,J,H,F,E,D,B,I,CG, J, H, F, E, D, B, I, CG,J,H,F,E,D,B,I,C
  • Parent 2: D , C , H , J , I , G , E , F , B D , C , H , J , I , G , E , F , B D,C,H,J,I,G,E,F,BD, C, H, J, I, G, E, F, BD,C,H,J,I,G,E,F,B
  • Crossover points are at the 4th and 7th sites.

Steps for OX1:

  1. Copy the segment between the crossover points from Parent 1 to Child 1 and from Parent 2 to Child 2:
    • Child 1: Copy cities from positions 4 to 7 in Parent 1, i.e., [ F , E , D , B ] [ F , E , D , B ] [F,E,D,B][F, E, D, B][F,E,D,B].
    • Child 2: Copy cities from positions 4 to 7 in Parent 2, i.e., [ J , I , G , E ] [ J , I , G , E ] [J,I,G,E][J, I, G, E][J,I,G,E].
  2. Preserve the relative order of the remaining cities from the other parent:
    • Fill the empty positions (those outside the copied segment) from left to right, taking the remaining cities in the order they appear in the other parent and skipping any city already present in the child.
    • For Child 1, we need to place the cities D , C , H , J , I , G D , C , H , J , I , G D,C,H,J,I,GD, C, H, J, I, GD,C,H,J,I,G (in that order from Parent 2) in the empty spots, skipping those already in Child 1.
    • For Child 2, we need to place the cities G , J , H , F , E , D , B , I , C G , J , H , F , E , D , B , I , C G,J,H,F,E,D,B,I,CG, J, H, F, E, D, B, I, CG,J,H,F,E,D,B,I,C (in that order from Parent 1), skipping those already in Child 2.

Construction of Child 1:

  • Initial segment from Parent 1: [ F , E , D , B ] [ F , E , D , B ] [F,E,D,B][F, E, D, B][F,E,D,B].
  • Now, add the remaining cities from Parent 2 [ D , C , H , J , I , G ] [ D , C , H , J , I , G ] [D,C,H,J,I,G][D, C, H, J, I, G][D,C,H,J,I,G], skipping the ones already in Child 1 [ D , E , F , B ] [ D , E , F , B ] [D,E,F,B][D, E, F, B][D,E,F,B]:
    • First available city: C C CCC
    • Then: H , J , I , G H , J , I , G H,J,I,GH, J, I, GH,J,I,G
So, Child 1 becomes:
C , H , J , F , E , D , B , I , G C , H , J , F , E , D , B , I , G C,H,J,F,E,D,B,I,GC, H, J, F, E, D, B, I, GC,H,J,F,E,D,B,I,G

Construction of Child 2:

  • Initial segment from Parent 2: [ J , I , G , E ] [ J , I , G , E ] [J,I,G,E][J, I, G, E][J,I,G,E].
  • Now, add the remaining cities from Parent 1 [ G , J , H , F , E , D , B , I , C ] [ G , J , H , F , E , D , B , I , C ] [G,J,H,F,E,D,B,I,C][G, J, H, F, E, D, B, I, C][G,J,H,F,E,D,B,I,C], skipping the ones already in Child 2 [ J , I , G , E ] [ J , I , G , E ] [J,I,G,E][J, I, G, E][J,I,G,E]:
    • First available city: H H HHH
    • Then: F , D , B , C F , D , B , C F,D,B,CF, D, B, CF,D,B,C
So, Child 2 becomes:
H , F , D , J , I , G , E , B , C H , F , D , J , I , G , E , B , C H,F,D,J,I,G,E,B,CH, F, D, J, I, G, E, B, CH,F,D,J,I,G,E,B,C

ii) Order Crossover #2

In Order Crossover #2 (OX2), instead of choosing a single segment, we select specific key positions from one parent to be passed to the child. The remaining cities are filled in the order they appear in the other parent, skipping any that are already included.
Given:
  • Parent 1: G , J , H , F , E , D , B , I , C G , J , H , F , E , D , B , I , C G,J,H,F,E,D,B,I,CG, J, H, F, E, D, B, I, CG,J,H,F,E,D,B,I,C
  • Parent 2: D , C , H , J , I , G , E , F , B D , C , H , J , I , G , E , F , B D,C,H,J,I,G,E,F,BD, C, H, J, I, G, E, F, BD,C,H,J,I,G,E,F,B
  • The key positions are the 3rd, 5th, and 7th sites.

Steps for OX2:

  1. Copy the cities at the key positions from Parent 1 to Child 1 and from Parent 2 to Child 2:
    • Key positions for Child 1: H H HHH (3rd), E E EEE (5th), B B BBB (7th) from Parent 1.
    • Key positions for Child 2: H H HHH (3rd), I I III (5th), E E EEE (7th) from Parent 2.
  2. Fill the remaining cities from the other parent:
    • For Child 1, take cities from Parent 2 [ D , C , H , J , I , G , E , F , B ] [ D , C , H , J , I , G , E , F , B ] [D,C,H,J,I,G,E,F,B][D, C, H, J, I, G, E, F, B][D,C,H,J,I,G,E,F,B], skipping those already in Child 1 [ H , E , B ] [ H , E , B ] [H,E,B][H, E, B][H,E,B].
    • For Child 2, take cities from Parent 1 [ G , J , H , F , E , D , B , I , C ] [ G , J , H , F , E , D , B , I , C ] [G,J,H,F,E,D,B,I,C][G, J, H, F, E, D, B, I, C][G,J,H,F,E,D,B,I,C], skipping those already in Child 2 [ H , I , E ] [ H , I , E ] [H,I,E][H, I, E][H,I,E].

Construction of Child 1:

  • Initial positions (from Parent 1): [ H , E , B ] [ H , E , B ] [H,E,B][H, E, B][H,E,B] at positions 3, 5, and 7.
  • Now, take cities from Parent 2 [ D , C , H , J , I , G , E , F , B ] [ D , C , H , J , I , G , E , F , B ] [D,C,H,J,I,G,E,F,B][D, C, H, J, I, G, E, F, B][D,C,H,J,I,G,E,F,B], skipping the ones already in Child 1:
    • First available city: D D DDD
    • Then: C , J , I , G , F C , J , I , G , F C,J,I,G,FC, J, I, G, FC,J,I,G,F
So, Child 1 becomes:
D , C , H , J , E , I , B , G , F D , C , H , J , E , I , B , G , F D,C,H,J,E,I,B,G,FD, C, H, J, E, I, B, G, FD,C,H,J,E,I,B,G,F

Construction of Child 2:

  • Initial positions (from Parent 2): [ H , I , E ] [ H , I , E ] [H,I,E][H, I, E][H,I,E] at positions 3, 5, and 7.
  • Now, take cities from Parent 1 [ G , J , H , F , E , D , B , I , C ] [ G , J , H , F , E , D , B , I , C ] [G,J,H,F,E,D,B,I,C][G, J, H, F, E, D, B, I, C][G,J,H,F,E,D,B,I,C], skipping the ones already in Child 2:
    • First available city: G G GGG
    • Then: J , F , D , B , C J , F , D , B , C J,F,D,B,CJ, F, D, B, CJ,F,D,B,C
So, Child 2 becomes:
G , J , H , F , I , D , E , B , C G , J , H , F , I , D , E , B , C G,J,H,F,I,D,E,B,CG, J, H, F, I, D, E, B, CG,J,H,F,I,D,E,B,C

Final Answer:

  • Child 1 (OX1):
C , H , J , F , E , D , B , I , G C , H , J , F , E , D , B , I , G C,H,J,F,E,D,B,I,GC, H, J, F, E, D, B, I, GC,H,J,F,E,D,B,I,G
  • Child 2 (OX1):
H , F , D , J , I , G , E , B , C H , F , D , J , I , G , E , B , C H,F,D,J,I,G,E,B,CH, F, D, J, I, G, E, B, CH,F,D,J,I,G,E,B,C
  • Child 1 (OX2):
D , C , H , J , E , I , B , G , F D , C , H , J , E , I , B , G , F D,C,H,J,E,I,B,G,FD, C, H, J, E, I, B, G, FD,C,H,J,E,I,B,G,F
  • Child 2 (OX2):
G , J , H , F , I , D , E , B , C G , J , H , F , I , D , E , B , C G,J,H,F,I,D,E,B,CG, J, H, F, I, D, E, B, CG,J,H,F,I,D,E,B,C
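The constructions above can be reproduced programmatically. The following sketch is not part of the original solution; it implements the two crossover variants exactly as described here, i.e. the copied cities stay in place and the remaining positions are filled left to right from the other parent.

```python
# Sketch of the crossover variants as described in this answer.
def fill_remaining(child, other):
    """Fill None slots in `child` left to right with the cities of `other`,
    kept in their original order and skipping cities already present."""
    remaining = [c for c in other if c not in child]
    it = iter(remaining)
    return [c if c is not None else next(it) for c in child]

def ox1(p_keep, p_other, lo, hi):
    """Order crossover #1: keep sites lo..hi (1-indexed) of p_keep in place."""
    child = [None] * len(p_keep)
    child[lo - 1:hi] = p_keep[lo - 1:hi]
    return fill_remaining(child, p_other)

def ox2(p_keep, p_other, keys):
    """Order crossover #2: keep the cities at the 1-indexed key positions."""
    child = [None] * len(p_keep)
    for k in keys:
        child[k - 1] = p_keep[k - 1]
    return fill_remaining(child, p_other)

P1 = list("GJHFEDBIC")
P2 = list("DCHJIGEFB")
print("".join(ox1(P1, P2, 4, 7)))       # CHJFEDBIG  (Child 1, OX1)
print("".join(ox1(P2, P1, 4, 7)))       # HFDJIGEBC  (Child 2, OX1)
print("".join(ox2(P1, P2, (3, 5, 7))))  # DCHJEIBGF  (Child 1, OX2)
print("".join(ox2(P2, P1, (3, 5, 7))))  # GJHFIDEBC  (Child 2, OX2)
```

The printed tours agree with the four children derived by hand above.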




Question:-9(b)

Consider the following single-layer perceptron, as shown in the following figure.

[Figure: a single-layer perceptron with inputs x1, x2, x3, weights w1 = 2, w2 = -4, w3 = 1, and a single output unit y]
and the activation function of each unit is defined as:
Φ ( v ) = { 1 , if v 0 0 , otherwise. Φ ( v ) = 1 , if v 0 0 , otherwise. Phi(v)={[1″,”” if “v >= 0],[0″,”” otherwise. “]:}\Phi(\mathrm{v})=\left\{\begin{array}{l}1, \text { if } \mathrm{v} \geq 0 \\ 0, \text { otherwise. }\end{array}\right.Φ(v)={1, if v00, otherwise.
Calculate the output y y yyy of the unit for each of the following input patterns:
| Patterns | P1 | P2 | P3 | P4 |
| :---: | :---: | :---: | :---: | :---: |
| x1 | 1 | 0 | 1 | 1 |
| x2 | 0 | 1 | 0 | 1 |
| x3 | 0 | 1 | 1 | 1 |
Also, find the modified weights after one iteration.

Answer:

Step 1: Understanding the Structure of the Perceptron

The given diagram represents a single-layer perceptron. The inputs x 1 x 1 x_(1)x_1x1, x 2 x 2 x_(2)x_2x2, and x 3 x 3 x_(3)x_3x3 are fed into a neuron with weights w 1 = 2 w 1 = 2 w_(1)=2w_1 = 2w1=2, w 2 = 4 w 2 = 4 w_(2)=-4w_2 = -4w2=4, and w 3 = 1 w 3 = 1 w_(3)=1w_3 = 1w3=1, respectively. The output of the perceptron is computed by summing the weighted inputs and applying a threshold activation function, defined as:
Φ ( v ) = { 1 if v 0 0 if v < 0 Φ ( v ) = 1 if v 0 0 if v < 0 Phi(v)={[1,”if “v >= 0],[0,”if “v < 0]:}\Phi(v) = \begin{cases} 1 & \text{if } v \geq 0 \\ 0 & \text{if } v < 0 \end{cases}Φ(v)={1if v00if v<0
where v v vvv is the weighted sum of inputs.
The weighted sum v v vvv is given by:
v = w 1 x 1 + w 2 x 2 + w 3 x 3 v = w 1 x 1 + w 2 x 2 + w 3 x 3 v=w_(1)x_(1)+w_(2)x_(2)+w_(3)x_(3)v = w_1 x_1 + w_2 x_2 + w_3 x_3v=w1x1+w2x2+w3x3
Then, we apply the activation function Φ ( v ) Φ ( v ) Phi(v)\Phi(v)Φ(v) to get the output y y yyy.

Step 2: Calculate the Output y y yyy for Each Pattern

We are provided with 4 input patterns:
| Pattern | x1 | x2 | x3 |
| :---: | :---: | :---: | :---: |
| P1 | 1 | 0 | 0 |
| P2 | 0 | 1 | 1 |
| P3 | 1 | 0 | 1 |
| P4 | 1 | 1 | 1 |
For each pattern, we will compute the weighted sum v v vvv and the output y = Φ ( v ) y = Φ ( v ) y=Phi(v)y = \Phi(v)y=Φ(v).

Pattern P 1 P 1 P_(1)P_1P1 ( x 1 = 1 , x 2 = 0 , x 3 = 0 x 1 = 1 , x 2 = 0 , x 3 = 0 x_(1)=1,x_(2)=0,x_(3)=0x_1 = 1, x_2 = 0, x_3 = 0x1=1,x2=0,x3=0):

v = ( 2 ) ( 1 ) + ( 4 ) ( 0 ) + ( 1 ) ( 0 ) = 2 v = ( 2 ) ( 1 ) + ( 4 ) ( 0 ) + ( 1 ) ( 0 ) = 2 v=(2)(1)+(-4)(0)+(1)(0)=2v = (2)(1) + (-4)(0) + (1)(0) = 2v=(2)(1)+(4)(0)+(1)(0)=2
Since v = 2 0 v = 2 0 v=2 >= 0v = 2 \geq 0v=20, the activation function gives:
y = Φ ( 2 ) = 1 y = Φ ( 2 ) = 1 y=Phi(2)=1y = \Phi(2) = 1y=Φ(2)=1

Pattern P 2 P 2 P_(2)P_2P2 ( x 1 = 0 , x 2 = 1 , x 3 = 1 x 1 = 0 , x 2 = 1 , x 3 = 1 x_(1)=0,x_(2)=1,x_(3)=1x_1 = 0, x_2 = 1, x_3 = 1x1=0,x2=1,x3=1):

v = ( 2 ) ( 0 ) + ( 4 ) ( 1 ) + ( 1 ) ( 1 ) = 0 4 + 1 = 3 v = ( 2 ) ( 0 ) + ( 4 ) ( 1 ) + ( 1 ) ( 1 ) = 0 4 + 1 = 3 v=(2)(0)+(-4)(1)+(1)(1)=0-4+1=-3v = (2)(0) + (-4)(1) + (1)(1) = 0 – 4 + 1 = -3v=(2)(0)+(4)(1)+(1)(1)=04+1=3
Since v = 3 < 0 v = 3 < 0 v=-3 < 0v = -3 < 0v=3<0, the activation function gives:
y = Φ ( 3 ) = 0 y = Φ ( 3 ) = 0 y=Phi(-3)=0y = \Phi(-3) = 0y=Φ(3)=0

Pattern P 3 P 3 P_(3)P_3P3 ( x 1 = 1 , x 2 = 0 , x 3 = 1 x 1 = 1 , x 2 = 0 , x 3 = 1 x_(1)=1,x_(2)=0,x_(3)=1x_1 = 1, x_2 = 0, x_3 = 1x1=1,x2=0,x3=1):

v = ( 2 ) ( 1 ) + ( 4 ) ( 0 ) + ( 1 ) ( 1 ) = 2 + 0 + 1 = 3 v = ( 2 ) ( 1 ) + ( 4 ) ( 0 ) + ( 1 ) ( 1 ) = 2 + 0 + 1 = 3 v=(2)(1)+(-4)(0)+(1)(1)=2+0+1=3v = (2)(1) + (-4)(0) + (1)(1) = 2 + 0 + 1 = 3v=(2)(1)+(4)(0)+(1)(1)=2+0+1=3
Since v = 3 0 v = 3 0 v=3 >= 0v = 3 \geq 0v=30, the activation function gives:
y = Φ ( 3 ) = 1 y = Φ ( 3 ) = 1 y=Phi(3)=1y = \Phi(3) = 1y=Φ(3)=1

Pattern P 4 P 4 P_(4)P_4P4 ( x 1 = 1 , x 2 = 1 , x 3 = 1 x 1 = 1 , x 2 = 1 , x 3 = 1 x_(1)=1,x_(2)=1,x_(3)=1x_1 = 1, x_2 = 1, x_3 = 1x1=1,x2=1,x3=1):

v = ( 2 ) ( 1 ) + ( 4 ) ( 1 ) + ( 1 ) ( 1 ) = 2 4 + 1 = 1 v = ( 2 ) ( 1 ) + ( 4 ) ( 1 ) + ( 1 ) ( 1 ) = 2 4 + 1 = 1 v=(2)(1)+(-4)(1)+(1)(1)=2-4+1=-1v = (2)(1) + (-4)(1) + (1)(1) = 2 – 4 + 1 = -1v=(2)(1)+(4)(1)+(1)(1)=24+1=1
Since v = 1 < 0 v = 1 < 0 v=-1 < 0v = -1 < 0v=1<0, the activation function gives:
y = Φ ( 1 ) = 0 y = Φ ( 1 ) = 0 y=Phi(-1)=0y = \Phi(-1) = 0y=Φ(1)=0

Step 3: Summarize the Output y y yyy for Each Pattern

| Pattern | x1 | x2 | x3 | v | Output y |
| :---: | :---: | :---: | :---: | :---: | :---: |
| P1 | 1 | 0 | 0 | 2 | 1 |
| P2 | 0 | 1 | 1 | -3 | 0 |
| P3 | 1 | 0 | 1 | 3 | 1 |
| P4 | 1 | 1 | 1 | -1 | 0 |

Step 4: Modified Weights After One Iteration

To modify the weights after one iteration, we typically use the Perceptron Learning Rule, which updates weights when the perceptron makes an error. The update rule is:
w i w i + Δ w i w i w i + Δ w i w_(i)larrw_(i)+Deltaw_(i)w_i \leftarrow w_i + \Delta w_iwiwi+Δwi
where:
Δ w i = η ( t y ) x i Δ w i = η ( t y ) x i Deltaw_(i)=eta*(t-y)*x_(i)\Delta w_i = \eta \cdot (t – y) \cdot x_iΔwi=η(ty)xi
  • t t ttt is the target output (assuming the correct output),
  • y y yyy is the predicted output,
  • x i x i x_(i)x_ixi is the input,
  • η η eta\etaη is the learning rate (commonly set to 1 for simplicity).
Since the target outputs t for the four patterns are not specified in the question, the numerical update cannot be completed without assuming them. If the computed outputs above are taken as the correct targets, then t = y for every pattern, each Δw_i = 0, and the weights after one iteration remain w1 = 2, w2 = -4, w3 = 1. A hedged sketch of one such iteration is given below.
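The sketch below is not part of the original solution. It computes the outputs for the four patterns and applies one pass of the update rule; the target vector is a purely hypothetical assumption (chosen equal to the computed outputs, so the weights do not change).

```python
# Sketch: forward pass for the four patterns and one pass of the perceptron rule.
# The targets are NOT given in the question; the values below are hypothetical.
import numpy as np

w = np.array([2.0, -4.0, 1.0])               # w1, w2, w3 from the figure
patterns = np.array([[1, 0, 0],
                     [0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 1]], dtype=float)
targets = np.array([1, 0, 1, 0])             # hypothetical: equal to the computed outputs
eta = 1.0                                    # learning rate

for x, t in zip(patterns, targets):
    v = w @ x                                # weighted sum
    y = 1 if v >= 0 else 0                   # threshold activation
    w = w + eta * (t - y) * x                # no change when t == y
    print(f"x={x}, v={v:+.0f}, y={y}, w={w}")

# With these targets, the weights stay (2, -4, 1) after one iteration.
```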




Question:-10

Which of the following statements are true or false? Give a short proof or a counter example in support of your answers.

a) There is a chance of premature convergence occurring in the Roulette-wheel selection scheme used in GA.
b) Gradient based optimization methods are used when the objective function is not smooth and one needs efficient local optimization.
c) The α-cut of a fuzzy set A in U is defined as A_{α₀} = {x ∈ U ∣ μ_A(x) ≤ α₀}.
d) A single perceptron with preprocessing is neither an auto associative network nor a multiple layer neural network.
e) If W(k₀) = W(k₀ + 1) = W(k₀ + 2), then the perceptron is non-linearly separable.

Answer:

a) There is a chance of premature convergence occurring in the Roulette-wheel selection scheme used in GA.

True.
Reason:
  • In Roulette-wheel selection (used in Genetic Algorithms), individuals are selected with a probability proportional to their fitness. However, if one or a few individuals have significantly higher fitness values than others, they will dominate the selection process, leading to premature convergence.
  • Premature convergence happens when the population loses genetic diversity too early, causing the algorithm to get stuck in local optima rather than exploring the global search space effectively.
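A small sketch (not part of the original answer) makes the effect concrete: with one dominant fitness value, fitness-proportional selection picks that individual most of the time, and diversity collapses. The fitness values below are illustrative.

```python
# Sketch: fitness-proportional (roulette-wheel) selection with one dominant individual.
import numpy as np

fitness = np.array([9.0, 1.0, 0.5, 0.5])      # one individual far fitter than the rest
probs = fitness / fitness.sum()                # selection probabilities

rng = np.random.default_rng(0)
selected = rng.choice(len(fitness), size=20, p=probs)
print("selection probabilities:", probs)       # roughly [0.818, 0.091, 0.045, 0.045]
print("selected individuals:   ", selected)    # dominated by index 0
```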

b) Gradient based optimization methods are used when the objective function is not smooth and one needs efficient local optimization.

False.
Reason:
  • Gradient-based optimization methods, such as gradient descent, rely on computing the gradient (or derivative) of the objective function to find the direction of steepest descent. These methods require the objective function to be smooth and differentiable, at least locally, to compute the gradient.
  • If the objective function is not smooth (discontinuous, non-differentiable), gradient-based methods cannot be applied effectively because the gradient does not exist or is unreliable.

c) The α-cut of a fuzzy set A in U is defined as A_{α₀} = {x ∈ U ∣ μ_A(x) ≤ α₀}.

False.
Reason:
  • The correct definition of an α α alpha\alphaα-cut of a fuzzy set A A AAA is: A α 0 = { x U μ A ( x ) α 0 } A α 0 = { x U μ A ( x ) α 0 } A_(alpha_(0))={x in U∣mu_(A)(x) >= alpha_(0)}A_{\alpha_0} = \{x \in U \mid \mu_{\mathrm{A}}(x) \geq \alpha_0 \}Aα0={xUμA(x)α0}It includes all elements of the universe U U UUU whose membership value in A A AAA is greater than or equal to α 0 α 0 alpha_(0)\alpha_0α0, not less than or equal. The given definition is incorrect because it uses <=\leq instead of >=\geq.
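A minimal sketch (not part of the original answer) of an α-cut computed from an enumerated membership function, using ≥ as in the correct definition:

```python
# Sketch: alpha-cut of a fuzzy set given as {element: membership value}.
def alpha_cut(fuzzy_set, alpha):
    return {x for x, mu in fuzzy_set.items() if mu >= alpha}

A = {1: 1.0, 2: 0.8, 3: 0.5, 4: 0.2}   # illustrative membership values
print(alpha_cut(A, 0.5))                # {1, 2, 3}
```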

d) A single perceptron with preprocessing is neither an auto associative network nor a multiple layer neural network.

True.
Reason:
  • A single perceptron is a simple feedforward neural network with one layer of neurons. It can only solve linearly separable problems.
  • An autoassociative network is a network where the input and output dimensions are the same, and the network is trained to output the same values as the input (used for tasks like identity mapping or denoising).
  • A multiple layer neural network consists of multiple layers of neurons (i.e., a deep neural network), which a single perceptron is not.
  • A single perceptron with preprocessing may improve input representation, but it still does not fit the definition of an autoassociative network or a multilayer neural network.

e) If W ( k 0 ) = W ( k 0 + 1 ) = W ( k 0 + 2 ) W ( k 0 ) = W ( k 0 + 1 ) = W ( k 0 + 2 ) W(k_(0))=W(k_(0)+1)=W(k_(0)+2)\mathrm{W}(k_0) = \mathrm{W}(k_0 + 1) = \mathrm{W}(k_0 + 2)W(k0)=W(k0+1)=W(k0+2), then the perceptron is non-linearly separable.

False.
Reason:
  • The condition that W ( k 0 ) = W ( k 0 + 1 ) = W ( k 0 + 2 ) W ( k 0 ) = W ( k 0 + 1 ) = W ( k 0 + 2 ) W(k_(0))=W(k_(0)+1)=W(k_(0)+2)\mathrm{W}(k_0) = \mathrm{W}(k_0 + 1) = \mathrm{W}(k_0 + 2)W(k0)=W(k0+1)=W(k0+2) implies that the weight vector does not change over iterations k 0 k 0 k_(0)k_0k0, k 0 + 1 k 0 + 1 k_(0)+1k_0 + 1k0+1, and k 0 + 2 k 0 + 2 k_(0)+2k_0 + 2k0+2.
  • This does not necessarily mean that the problem is non-linearly separable. It could indicate that the perceptron has converged and found a solution to the problem (if the problem is linearly separable), or that the learning process has stalled (e.g., due to reaching a steady state without achieving separability).
  • Non-linear separability means that no linear boundary can separate the classes, but the given weight condition alone does not prove non-linear separability.

Final Answers:

a) True
b) False
c) False
d) True
e) False



