Free BPCC-104 Solved Assignment | July 2023-January 2024 | STATISTICAL METHODS FOR PSYCHOLOGICAL RESEARCH-I | IGNOU

BPCC-104 Solved Assignment

  1. Define statistics. Explain the basic concepts in statistics.
  2. Compute Pearson’s Product Moment Correlation for the following data:
| Individuals | A | B | C | D | E | F | G | H | I | J |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| Data A | 34 | 22 | 10 | 11 | 23 | 20 | 16 | 18 | 21 | 15 |
| Data B | 12 | 23 | 21 | 24 | 10 | 30 | 28 | 29 | 32 | 25 |
Assignment Two
Answer the following questions in about 100 words each (wherever applicable). Each question carries 5 marks.
  1. Explain the type-I and type-II errors in the process of hypothesis testing.
  2. Compute mean, median and mode for the following data:
     50 68 90 45 75 48 72 65 58 50
  3. Explain the merits, limitations and uses of standard deviation.
  4. Compute average deviation for the following data:
     18 15 16 14 11 8 7 12 10 13
  5. Describe the properties of the Normal Distribution Curve (NPC).
  6. Discuss the concept of Kurtosis and Skewness.

Expert Answer:

Question:-1

Define statistics. Explain the basic concepts in statistics.

Answer:

Definition of Statistics:

Statistics is the branch of mathematics that deals with the collection, organization, analysis, interpretation, and presentation of numerical data. It enables us to make informed decisions and draw conclusions based on data. Statistics is used across various fields, including economics, healthcare, marketing, education, and more, to understand trends, test hypotheses, and make predictions.

Basic Concepts in Statistics:

  1. Population and Sample:
    • Population: The entire group of individuals or items that we want to study. It includes every possible outcome or observation.
      • Example: All the students in a university.
    • Sample: A subset of the population selected for analysis. A sample is used to draw conclusions about the population without needing to study the entire population.
      • Example: A group of 200 students randomly selected from a university.
  2. Descriptive and Inferential Statistics:
    • Descriptive Statistics: Involves summarizing and organizing data to describe its basic features. It includes measures such as mean, median, mode, variance, and standard deviation.
      • Example: Finding the average height of a group of people.
    • Inferential Statistics: Involves making predictions or inferences about a population based on sample data. It includes hypothesis testing, confidence intervals, and regression analysis.
      • Example: Predicting the average height of all students at a university based on a sample.
  3. Variables:
    • Qualitative (Categorical) Variables: These describe characteristics that cannot be measured numerically but can be classified into categories. Examples include gender, nationality, or color.
    • Quantitative (Numerical) Variables: These describe measurable quantities and can be divided into two types:
      • Discrete Variables: Countable values, like the number of students in a class.
      • Continuous Variables: Any value within a given range, like height or temperature.
  4. Measures of Central Tendency:
    These are metrics that summarize the central point of a data set:
    • Mean: The arithmetic average of a data set, found by summing all the values and dividing by the total number of values.
    • Median: The middle value in a sorted data set. If the data set has an even number of values, the median is the average of the two middle numbers.
    • Mode: The value(s) that occur most frequently in a data set.
  5. Measures of Dispersion:
    These metrics describe how spread out the data is:
    • Range: The difference between the largest and smallest values in a data set.
    • Variance: The average of the squared differences between each value and the mean.
    • Standard Deviation: The square root of the variance, which provides a measure of the average distance from the mean.
  6. Probability:
    Probability is a measure of the likelihood that a given event will occur. It ranges from 0 (impossible event) to 1 (certain event). It forms the foundation of inferential statistics, allowing statisticians to make predictions about populations based on sample data.
  7. Normal Distribution:
    A normal distribution is a bell-shaped curve where most of the data points cluster around the mean, with fewer observations appearing as we move away from the mean. It is important in statistics because many natural phenomena follow this distribution, and it forms the basis for various statistical tests.
  8. Hypothesis Testing:
    A statistical method used to decide whether there is enough evidence to support a particular belief (hypothesis) about a population. It involves:
    • Null Hypothesis (H₀): A default assumption that there is no effect or difference.
    • Alternative Hypothesis (H₁): The hypothesis that suggests there is an effect or difference.
    • Hypothesis testing aims to reject or fail to reject the null hypothesis based on sample data.
  9. Correlation and Regression:
    • Correlation: Measures the strength and direction of a linear relationship between two variables. The correlation coefficient (r) ranges from -1 (perfect negative correlation) to +1 (perfect positive correlation).
    • Regression: A statistical technique for modeling the relationship between a dependent variable and one or more independent variables. It is used for predicting values and understanding relationships.
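As an illustrative sketch, the measures of central tendency and dispersion described above can be computed with Python's standard-library `statistics` module (the data set here is hypothetical, chosen only for the example):

```python
import statistics

# Hypothetical data set for illustration
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)        # arithmetic average
median = statistics.median(data)    # middle value of the sorted data
mode = statistics.mode(data)        # most frequent value
data_range = max(data) - min(data)  # largest minus smallest value
variance = statistics.pvariance(data)  # average squared deviation from the mean
std_dev = statistics.pstdev(data)      # square root of the variance

print(mean, median, mode, data_range, variance, std_dev)
```

Note that `pvariance`/`pstdev` treat the data as a whole population; `statistics.variance` and `statistics.stdev` give the sample versions.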

Summary:

Statistics provides the tools for analyzing and interpreting data in a meaningful way. It encompasses a wide range of techniques, from descriptive methods that summarize data to inferential methods that allow for predictions and hypothesis testing. Understanding these basic concepts is essential for applying statistical methods in real-world scenarios.



Question:-2

Compute Pearson’s Product Moment Correlation for the following data:

| Individuals | A | B | C | D | E | F | G | H | I | J |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| Data A | 34 | 22 | 10 | 11 | 23 | 20 | 16 | 18 | 21 | 15 |
| Data B | 12 | 23 | 21 | 24 | 10 | 30 | 28 | 29 | 32 | 25 |

Answer:

To calculate Pearson’s Product Moment Correlation coefficient (r), we follow these steps:

Step-by-Step Solution:

Given two sets of data:
$$\text{Data A} = [34, 22, 10, 11, 23, 20, 16, 18, 21, 15]$$
$$\text{Data B} = [12, 23, 21, 24, 10, 30, 28, 29, 32, 25]$$

Formula for Pearson’s Correlation Coefficient:

The formula to compute Pearson’s correlation is:
$$r = \frac{n\sum AB - (\sum A)(\sum B)}{\sqrt{\left[n\sum A^2 - (\sum A)^2\right]\left[n\sum B^2 - (\sum B)^2\right]}}$$
Where:
  • $A$ and $B$ are the data sets.
  • $n$ is the number of paired observations.

Step 1: Compute the required summations:

  • $\sum A$: Sum of all values in Data A.
  • $\sum B$: Sum of all values in Data B.
  • $\sum A^2$: Sum of the squares of values in Data A.
  • $\sum B^2$: Sum of the squares of values in Data B.
  • $\sum AB$: Sum of the products of corresponding values of A and B.
Let’s compute these step by step:

Data Table:

| Individuals | Data A | Data B | $A^2$ | $B^2$ | $A \cdot B$ |
| :-- | --: | --: | --: | --: | --: |
| A | 34 | 12 | 1156 | 144 | 408 |
| B | 22 | 23 | 484 | 529 | 506 |
| C | 10 | 21 | 100 | 441 | 210 |
| D | 11 | 24 | 121 | 576 | 264 |
| E | 23 | 10 | 529 | 100 | 230 |
| F | 20 | 30 | 400 | 900 | 600 |
| G | 16 | 28 | 256 | 784 | 448 |
| H | 18 | 29 | 324 | 841 | 522 |
| I | 21 | 32 | 441 | 1024 | 672 |
| J | 15 | 25 | 225 | 625 | 375 |

Summations:

$$\sum A = 34 + 22 + 10 + 11 + 23 + 20 + 16 + 18 + 21 + 15 = 190$$
$$\sum B = 12 + 23 + 21 + 24 + 10 + 30 + 28 + 29 + 32 + 25 = 234$$
$$\sum A^2 = 1156 + 484 + 100 + 121 + 529 + 400 + 256 + 324 + 441 + 225 = 4036$$
$$\sum B^2 = 144 + 529 + 441 + 576 + 100 + 900 + 784 + 841 + 1024 + 625 = 5964$$
$$\sum AB = 408 + 506 + 210 + 264 + 230 + 600 + 448 + 522 + 672 + 375 = 4235$$

Step 2: Apply the formula:

Now that we have all the summations, we can plug the values into the Pearson correlation formula:
$$r = \frac{n\sum AB - (\sum A)(\sum B)}{\sqrt{\left[n\sum A^2 - (\sum A)^2\right]\left[n\sum B^2 - (\sum B)^2\right]}}$$
Where:
  • $n = 10$
  • $\sum A = 190$
  • $\sum B = 234$
  • $\sum A^2 = 4036$
  • $\sum B^2 = 5964$
  • $\sum AB = 4235$
Substitute the values into the formula:
$$r = \frac{10(4235) - (190)(234)}{\sqrt{\left[10(4036) - (190)^2\right]\left[10(5964) - (234)^2\right]}}$$

Step 3: Simplify the numerator and denominator:

Numerator:

$$10(4235) - (190)(234) = 42350 - 44460 = -2110$$

Denominator:

$$\sqrt{\left[10(4036) - 190^2\right]\left[10(5964) - 234^2\right]} = \sqrt{[40360 - 36100][59640 - 54756]}$$
$$= \sqrt{4260 \cdot 4884} = \sqrt{20805840} \approx 4561.34$$

Step 4: Compute Pearson’s correlation:

$$r = \frac{-2110}{4561.34} \approx -0.463$$

Conclusion:

The Pearson’s Product Moment Correlation coefficient is approximately -0.463, indicating a moderate negative correlation between Data A and Data B. This means that as the values in Data A increase, the values in Data B tend to decrease, and vice versa.
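The hand computation above can be checked with a short Python script that implements the same raw-score formula:

```python
import math

A = [34, 22, 10, 11, 23, 20, 16, 18, 21, 15]
B = [12, 23, 21, 24, 10, 30, 28, 29, 32, 25]
n = len(A)

# The summations used in the raw-score formula
sum_A, sum_B = sum(A), sum(B)
sum_A2 = sum(a * a for a in A)
sum_B2 = sum(b * b for b in B)
sum_AB = sum(a * b for a, b in zip(A, B))

numerator = n * sum_AB - sum_A * sum_B
denominator = math.sqrt((n * sum_A2 - sum_A ** 2) * (n * sum_B2 - sum_B ** 2))
r = numerator / denominator

print(round(r, 3))  # -0.463
```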



Question:-3

Explain the type-I and type-II errors in the process of hypothesis testing.

Answer:

Type I and Type II Errors in Hypothesis Testing

In hypothesis testing, we make decisions based on sample data to either reject or fail to reject a null hypothesis (H₀). However, since decisions are made on sample data and not the entire population, there is always a risk of making errors. These errors are known as Type I and Type II errors.

1. Type I Error (False Positive):

  • A Type I error occurs when we reject the null hypothesis (H₀) when it is actually true.
  • It is essentially a "false positive" result because it indicates that an effect or difference exists when, in reality, it does not.

Example:

Suppose a medical test is used to detect a disease, and the null hypothesis (H₀) states that the patient does not have the disease.
  • If the test concludes that the patient has the disease (rejecting H₀) when the patient is actually healthy (H₀ is true), a Type I error has occurred.

Probability of a Type I Error:

  • The probability of committing a Type I error is denoted by α (alpha), which is called the significance level of the test.
  • Commonly, α is set at 0.05 (5%), meaning there is a 5% risk of rejecting the null hypothesis when it is true.

Consequences:

  • In practice, Type I errors can lead to false conclusions, such as claiming that a new drug is effective when it is not, or approving a faulty product for release.

2. Type II Error (False Negative):

  • A Type II error occurs when we fail to reject the null hypothesis (H₀) when it is actually false.
  • It is essentially a "false negative" result because it suggests that no effect or difference exists when, in reality, it does.

Example:

Using the same medical test example, the null hypothesis (H₀) states that the patient does not have the disease.
  • If the test concludes that the patient is healthy (fails to reject H₀) when the patient actually has the disease (H₀ is false), a Type II error has occurred.

Probability of a Type II Error:

  • The probability of committing a Type II error is denoted by β (beta).
  • The power of the test (1 − β) represents the ability of the test to detect an effect if one exists. A higher power reduces the likelihood of a Type II error.

Consequences:

  • Type II errors can result in missed opportunities, such as failing to detect a beneficial effect of a new drug, or not identifying a problem in a manufacturing process.

Summary of Type I and Type II Errors:

| Decision | Reality: H₀ is True | Reality: H₀ is False |
| :-- | :-- | :-- |
| Reject H₀ | Type I Error (False Positive) | Correct Decision (True Positive) |
| Fail to Reject H₀ | Correct Decision (True Negative) | Type II Error (False Negative) |

Balancing Type I and Type II Errors:

  • Reducing Type I Errors: Lowering α (the significance level) reduces the likelihood of Type I errors, but it increases the probability of Type II errors.
  • Reducing Type II Errors: Increasing the sample size or conducting more sensitive tests can reduce β, thereby lowering the chance of Type II errors, but it may increase the risk of a Type I error.
In practice, researchers must carefully choose the acceptable level of risk for Type I and Type II errors based on the context and consequences of the decision-making process.
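To make the Type I error rate concrete, here is a small Python simulation (the sample size, trial count, and seed are arbitrary choices for the sketch): it repeatedly applies a two-tailed z-test to samples drawn from a population where the null hypothesis is true, so every rejection is a Type I error, and the observed rejection rate should hover near α.

```python
import random
import statistics

random.seed(42)  # reproducible run
n, trials, alpha = 30, 2000, 0.05
z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cutoff, ~1.96

rejections = 0
for _ in range(trials):
    # Sample from N(0, 1): the null hypothesis (mu = 0) is TRUE here
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = statistics.fmean(sample) * n ** 0.5  # z = x̄ / (σ/√n) with known σ = 1
    if abs(z) > z_crit:
        rejections += 1  # rejecting a true H0: a Type I error

print(rejections / trials)  # close to alpha = 0.05
```

Lowering `alpha` in this sketch shrinks the rejection rate, illustrating the first bullet above.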



Question:-4

Compute mean, median and mode for the following data:

50 68 90 45 75 48 72 65 58 50

Answer:

Data Set:
50, 68, 90, 45, 75, 48, 72, 65, 58, 50

1. Mean:

The mean is the average of the numbers. To calculate the mean:
  • Add all the numbers together.
  • Divide the sum by the total count of numbers.
Steps:
  1. Sum of the data:
    $$50 + 68 + 90 + 45 + 75 + 48 + 72 + 65 + 58 + 50 = 621$$
  2. Count of numbers: There are 10 numbers.
  3. Mean:
    $$\text{Mean} = \frac{621}{10} = 62.1$$
So, the mean is 62.1.

2. Median:

The median is the middle number when the numbers are arranged in order. If there are an even number of data points, the median is the average of the two middle numbers.
Steps:
  1. Arrange the numbers in ascending order:
    45, 48, 50, 50, 58, 65, 68, 72, 75, 90
  2. Since there are 10 numbers (even count), the median is the average of the 5th and 6th numbers:
    • The 5th number is 58.
    • The 6th number is 65.
  3. Median:
    $$\text{Median} = \frac{58 + 65}{2} = \frac{123}{2} = 61.5$$
So, the median is 61.5.

3. Mode:

The mode is the number that appears most frequently in the data set.
Steps:
  1. In the given data set:
    50, 68, 90, 45, 75, 48, 72, 65, 58, 50
  2. The number 50 appears twice, which is more frequent than any other number.
So, the mode is 50.

Summary:

  • Mean: 62.1
  • Median: 61.5
  • Mode: 50
Together, these measures give a comprehensive picture of the central tendency of the given data.
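The three results can be double-checked in Python with the standard-library `statistics` module:

```python
import statistics

data = [50, 68, 90, 45, 75, 48, 72, 65, 58, 50]

print(statistics.mean(data))    # 62.1
print(statistics.median(data))  # 61.5
print(statistics.mode(data))    # 50
```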



Question:-5

Explain the merits, limitations and uses of standard deviation.

Answer:

Merits of Standard Deviation:

  1. Precision in Measuring Spread:
    Standard deviation gives a precise measure of how much individual data points deviate from the mean. It considers all data points, unlike simpler measures such as range, which only uses the extremes.
  2. Comparison Across Data Sets:
    It enables comparison between different data sets to determine which set has greater variability or dispersion. A lower standard deviation indicates that the data points are closer to the mean, while a higher value shows more spread.
  3. Useful in Many Fields:
    Standard deviation is widely used in various fields such as finance, economics, psychology, and the natural sciences to measure the risk, variability, or uncertainty of data sets.
  4. Foundation for Other Statistical Tools:
    Many other statistical measures, such as variance, z-scores, and hypothesis testing, are based on the standard deviation. It also plays a key role in inferential statistics.
  5. Applicable to Normal Distribution:
    For normally distributed data, standard deviation helps in defining the probability of data points falling within certain intervals, making it essential for predictions and probability calculations.

Limitations of Standard Deviation:

  1. Sensitivity to Outliers:
    Standard deviation can be heavily affected by outliers, as extreme values can significantly increase the calculated spread of the data.
  2. Assumes Normal Distribution:
    Standard deviation is most meaningful when data follows a normal (bell-curve) distribution. In cases of skewed or irregularly distributed data, its interpretation becomes less effective.
  3. Not Easy to Interpret for All Data Sets:
    For those not familiar with statistical analysis, the concept of standard deviation may be difficult to grasp. Moreover, in some real-world contexts, it may not provide intuitive or actionable insights.
  4. Relies on Mean:
    Since standard deviation is based on the mean, it inherits the limitations of the mean, such as not being a good measure of central tendency in cases of highly skewed distributions.
  5. Difficult to Calculate Manually:
    For large datasets, calculating the standard deviation manually can be tedious and error-prone. While tools exist to compute it, the formula itself can be cumbersome without computational assistance.

Uses of Standard Deviation:

  1. Risk Assessment in Finance:
    In financial markets, standard deviation is commonly used to measure the volatility of stocks or investment portfolios. A higher standard deviation signifies greater risk due to higher variability in returns.
  2. Quality Control in Manufacturing:
    In production processes, standard deviation helps monitor product consistency. If the standard deviation of a product’s features is small, it implies uniformity, whereas a large deviation may indicate defects.
  3. Education and Psychology:
    Standard deviation is used to understand the variability in test scores, intelligence quotients (IQ), or other psychological assessments. It shows how much individual performance deviates from the average.
  4. Weather Forecasting:
    Meteorologists use standard deviation to analyze temperature fluctuations, helping them predict extreme weather conditions and establish climate patterns.
  5. Scientific Research:
    In experiments, standard deviation is used to quantify variability in data and to understand the precision and reliability of measurements or observations.
In summary, while standard deviation is a powerful and widely used tool for measuring data variability, its accuracy is diminished by outliers and non-normal distributions. It is, however, indispensable in fields like finance, manufacturing, and scientific research for analyzing and interpreting data variability.
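The outlier sensitivity noted above is easy to demonstrate. In this sketch (with made-up numbers), a single extreme value inflates the standard deviation dramatically:

```python
import statistics

# Hypothetical data: identical except for one outlier
clean = [10, 11, 12, 13, 14]
with_outlier = [10, 11, 12, 13, 100]

print(statistics.pstdev(clean))         # about 1.41: values cluster near the mean
print(statistics.pstdev(with_outlier))  # about 35.4: one outlier dominates the spread
```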



Question:-6

Compute average deviation for the following data:

18 15 16 14 11 8 7 12 10 13

Answer:

Step 1: List the given data

The data provided is:
18, 15, 16, 14, 11, 8, 7, 12, 10, 13

Step 2: Calculate the mean (average)

The mean is the sum of all data values divided by the number of data points.
$$\text{Mean} = \frac{18 + 15 + 16 + 14 + 11 + 8 + 7 + 12 + 10 + 13}{10} = \frac{124}{10} = 12.4$$

Step 3: Calculate the deviation of each value from the mean

Now, subtract the mean from each data point and take the absolute value of the result (ignore negative signs). This gives the absolute deviation for each data point.
$$|18 - 12.4| = 5.6$$
$$|15 - 12.4| = 2.6$$
$$|16 - 12.4| = 3.6$$
$$|14 - 12.4| = 1.6$$
$$|11 - 12.4| = 1.4$$
$$|8 - 12.4| = 4.4$$
$$|7 - 12.4| = 5.4$$
$$|12 - 12.4| = 0.4$$
$$|10 - 12.4| = 2.4$$
$$|13 - 12.4| = 0.6$$

Step 4: Find the sum of all absolute deviations

Sum up all the absolute deviations:
$$5.6 + 2.6 + 3.6 + 1.6 + 1.4 + 4.4 + 5.4 + 0.4 + 2.4 + 0.6 = 28.0$$

Step 5: Calculate the average deviation

Finally, divide the sum of the absolute deviations by the number of data points (which is 10):
$$\text{Average Deviation} = \frac{28.0}{10} = 2.8$$

Conclusion:

The average deviation for the given data is 2.8.
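A quick Python check of the steps above, using the mean absolute deviation definition:

```python
import statistics

data = [18, 15, 16, 14, 11, 8, 7, 12, 10, 13]

mean = statistics.fmean(data)  # 12.4
# Average of the absolute deviations from the mean
avg_dev = sum(abs(x - mean) for x in data) / len(data)

print(round(avg_dev, 2))  # 2.8
```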



Question:-7

Describe the properties of the Normal Distribution Curve (NPC).

Answer:

The Normal Distribution Curve, commonly abbreviated NPC (Normal Probability Curve) and often referred to as the bell curve or Gaussian distribution, is a fundamental concept in statistics. Its key properties include:
  1. Symmetry: The NPC is symmetric around its mean. This means the left and right sides of the curve are mirror images of each other. The mean, median, and mode of the distribution are all located at the same point, right at the center of the curve.
  2. Bell-shaped: The curve has a characteristic bell shape. It starts low, rises to a peak at the mean, and then gradually tapers off symmetrically as it moves away from the center.
  3. Mean, Median, Mode: In a perfectly normal distribution, the mean, median, and mode are equal and located at the center of the distribution. These measures of central tendency coincide at the peak.
  4. Asymptotic: The tails of the normal distribution curve approach but never touch the horizontal axis. This property implies that, theoretically, the curve extends infinitely in both directions.
  5. Empirical Rule (68-95-99.7 rule):
    • 68% of the data falls within one standard deviation of the mean (i.e., between μ − σ and μ + σ).
    • 95% of the data falls within two standard deviations of the mean (i.e., between μ − 2σ and μ + 2σ).
    • 99.7% of the data falls within three standard deviations of the mean (i.e., between μ − 3σ and μ + 3σ).
  6. Unimodal: The normal distribution curve has only one peak (mode), making it unimodal.
  7. Standard Normal Distribution: A special case of the normal distribution where the mean (μ) is 0 and the standard deviation (σ) is 1. This allows for comparison of different normal distributions by converting data points into z-scores (standardized scores).
  8. Probability Density: The area under the NPC represents probability, and the total area under the curve sums to 1 (or 100%). The height of the curve at any point is the probability density there; for a continuous variable, probabilities correspond to areas under the curve rather than to the height at a single value.
These properties make the normal distribution a vital tool in probability theory, statistics, and various real-world applications like quality control, finance, and natural sciences.
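The empirical rule in property 5 can be verified numerically with `statistics.NormalDist` from the Python standard library:

```python
import statistics

nd = statistics.NormalDist(mu=0, sigma=1)  # standard normal distribution

# Area under the curve within k standard deviations of the mean
for k in (1, 2, 3):
    p = nd.cdf(k) - nd.cdf(-k)
    print(f"within {k} SD: {p:.4f}")  # 0.6827, 0.9545, 0.9973
```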



Question:-8

Discuss the concept of Kurtosis and Skewness.

Answer:

Kurtosis and Skewness are two important statistical measures that describe the shape of a probability distribution.

1. Skewness

Skewness measures the asymmetry or lack of symmetry in a data distribution.
  • Positive Skewness (Right-Skewed Distribution): When the right tail (the larger values) of the distribution is longer or fatter than the left tail. In this case, most of the data is concentrated on the left, and the mean is typically greater than the median.
  • Negative Skewness (Left-Skewed Distribution): When the left tail (the smaller values) is longer or fatter than the right tail. Here, most of the data is concentrated on the right, and the mean is usually less than the median.
  • Zero Skewness (Symmetrical Distribution): In a perfectly symmetrical distribution, the mean, median, and mode are equal, and there is no skewness. For example, the normal distribution has zero skewness.

Formula for Skewness:

Skewness can be calculated using the third standardized moment:
$$\text{Skewness} = \frac{E[(X - \mu)^3]}{\sigma^3}$$
Where:
  • $X$ is a random variable
  • $\mu$ is the mean of the distribution
  • $\sigma$ is the standard deviation
  • $E$ denotes the expected value

2. Kurtosis

Kurtosis measures the "tailedness" of a probability distribution, or how heavily the tails of the distribution differ from the tails of a normal distribution.
  • High Kurtosis (Leptokurtic): Distributions with high kurtosis have heavy tails and a sharp peak, indicating more frequent extreme outliers. The data is highly concentrated around the mean, with more extreme values in the tails.
  • Low Kurtosis (Platykurtic): Distributions with low kurtosis have lighter tails and a flatter peak. This indicates that the data points are more spread out and there are fewer extreme values.
  • Normal Kurtosis (Mesokurtic): A distribution with tails similar to the normal distribution has a kurtosis value of 3 (3 is often subtracted from this value to give a "zero kurtosis" interpretation). In this case, the distribution has moderate tails and no significant outliers.

Formula for Kurtosis:

Kurtosis is calculated using the fourth standardized moment:
$$\text{Kurtosis} = \frac{E[(X - \mu)^4]}{\sigma^4}$$
Where:
  • $X$, $\mu$, and $\sigma$ have the same meanings as in the skewness formula.

Interpretation of Kurtosis:

  • Excess Kurtosis: To make kurtosis interpretation easier, we often subtract 3 from the kurtosis value (since a normal distribution has a kurtosis of 3). This is called excess kurtosis. Therefore:
    • Positive excess kurtosis: Indicates a leptokurtic distribution (heavy tails).
    • Negative excess kurtosis: Indicates a platykurtic distribution (light tails).

Summary:

  • Skewness tells us about the asymmetry of the distribution (left or right-skewed).
  • Kurtosis tells us about the distribution’s tails (whether they are heavy or light compared to a normal distribution).
Both metrics help in understanding the nature of the data’s distribution beyond the basic measures like mean and standard deviation.
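As a sketch, the moment formulas above translate directly into Python (population versions, using sample moments in place of expectations; the data set is arbitrary):

```python
import statistics

def skewness(data):
    # Third standardized moment: E[(X - mu)^3] / sigma^3
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return sum((x - mu) ** 3 for x in data) / (len(data) * sigma ** 3)

def kurtosis(data):
    # Fourth standardized moment: E[(X - mu)^4] / sigma^4
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return sum((x - mu) ** 4 for x in data) / (len(data) * sigma ** 4)

symmetric = [1, 2, 3, 4, 5]
print(skewness(symmetric))       # 0.0: a symmetric data set has zero skewness
print(kurtosis(symmetric) - 3)   # negative excess kurtosis (platykurtic)
```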


