IGNOU MST-017 Solved Assignment 2024 MSCAST


Solved By – Narendra Kr. Sharma – M.Sc (Mathematics Honors) – Delhi University

365.00


Details For MST-017 Solved Assignment

IGNOU MST-017 Assignment Question Paper 2024


1(a) State whether the following statements are true or false and also give the reason in support of your answer.
(i) We define three indicator variables for an explanatory variable with three categories.
(ii) If the coefficient of determination is 0.833, the number of observations and explanatory variables are 12 and 3, respectively, then the Adjusted $R^2$ will be 0.84.
(iii) For a simple regression model fitted on 15 observations, if we have $h_{ii}=0.37$, then it is an indication to trace the leverage point in the regression model.
(iv) In a regression model $Y=\beta_0+\beta_1 X_1+\beta_2 X_2+\varepsilon$, if $H_0: \beta_1=0$ is not rejected, then the variable $X_1$ will remain in the model.
(v) The logit link function is $\log[-\log(1-\pi)]$.
(b) Write a short note on the problem of multicollinearity and autocorrelation.
2(a) Explain the assumptions underlying multiple linear regression model.
(b) Suppose a researcher wants to evaluate the effect of cholesterol on blood pressure. The following data on serum cholesterol (in mg/dL) and systolic blood pressure (in mmHg) were obtained for 15 patients to explore the relationship between cholesterol and blood pressure:

| S. No. | Cholesterol (mg/dL) | SBP (mmHg) |
| :---: | :---: | :---: |
| 1 | 300 | 150 |
| 2 | 410 | 270 |
| 3 | 380 | 210 |
| 4 | 530 | 310 |
| 5 | 570 | 350 |
| 6 | 490 | 310 |
| 7 | 340 | 210 |
| 8 | 320 | 150 |
| 9 | 280 | 110 |
| 10 | 550 | 320 |
| 11 | 340 | 220 |
| 12 | 350 | 170 |
| 13 | 410 | 260 |
| 14 | 390 | 230 |
| 15 | 450 | 270 |

(i) Fit a linear regression model using the method of least squares.
(ii) Construct the normal probability plot for the regression model fitted on serum cholesterol and systolic blood pressure.
(iii) Test the significance of the fitted regression model.
3. For the data given in Question 2(b), obtain the following:
(i) Diagonal of the hat matrix, and check for leverage points, if any.
(ii) Cook’s distances, DFFITS and DFBETAS. Also check for influential points, if any.
4. A company conducted a study on its employees to see the relationship of several variables with an employee’s IQ. For this purpose, fifteen employees were selected and an IQ test as well as five different personality tests were given to them. Each employee’s IQ was recorded along with the scores on the five tests. The data are shown in the following table:

| Employee | Test 1 | Test 2 | Test 3 | Test 4 | Test 5 | IQ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 1 | 83 | 80 | 78 | 77 | 67 | 99 |
| 2 | 73 | 85 | 67 | 80 | 63 | 92 |
| 3 | 81 | 80 | 71 | 81 | 68 | 94 |
| 4 | 96 | 86 | 82 | 83 | 56 | 99 |
| 5 | 84 | 73 | 75 | 75 | 68 | 94 |
| 6 | 72 | 74 | 71 | 67 | 59 | 79 |
| 7 | 84 | 79 | 84 | 84 | 69 | 97 |
| 8 | 54 | 86 | 61 | 69 | 53 | 92 |
| 9 | 86 | 85 | 79 | 78 | 76 | 94 |
| 10 | 42 | 71 | 60 | 80 | 56 | 86 |
| 11 | 83 | 72 | 72 | 78 | 74 | 98 |
| 12 | 63 | 86 | 65 | 85 | 56 | 83 |
| 13 | 69 | 76 | 64 | 85 | 61 | 98 |
| 14 | 81 | 84 | 65 | 79 | 64 | 96 |
| 15 | 50 | 85 | 71 | 65 | 75 | 76 |

Determine the most appropriate regression model for the employee’s IQ using the stepwise approach at the 5% level of significance and interpret the results. Does the final regression model satisfy the linearity and normality assumptions?
5. The following data on diagnosis of coronary heart disease (where 0 indicates absence and 1 indicates presence), serum cholesterol (in mg/dl), resting blood pressure (in mmHg) and weight (in kg) were obtained for 80 patients to explore the relationship of coronary heart disease with cholesterol and weight:

| S. No. | Serum Cholesterol (mg/dl) | Weight (kg) | Number of Patients having CHD | Total Number of Patients |
| :---: | :---: | :---: | :---: | :---: |
| 1 | 420 | 60 | 10 | 20 |
| 2 | 450 | 68 | 15 | 30 |
| 3 | 400 | 54 | 4 | 15 |
| 4 | 510 | 74 | 2 | 10 |
| 5 | 480 | 62 | 1 | 5 |

(i) Fit a multiple logistic model for the dependence of coronary heart disease on the average serum cholesterol and weight, considering $\hat{\beta}_0^0=4.279$, $\hat{\beta}_1^0=-0.035$ and $\hat{\beta}_2^0=0.172$ as the initial values of the parameters (solve only for one iteration).
(ii) Test the significance of the fitted model using the Hosmer-Lemeshow test at the 5% level of significance.

MST-017 Sample Solution 2024


1(a) State whether the following statements are true or false and also give the reason in support of your answer.
(i) We define three indicator variables for an explanatory variable with three categories.
Answer:
When dealing with categorical explanatory variables in statistical modeling, particularly in regression models, it’s common to use indicator variables (also known as dummy variables) to encode these categorical variables into a format that can be provided to the model. The statement involves defining three indicator variables for an explanatory variable with three categories. Let’s analyze the correctness of this approach.

Statement Analysis

For an explanatory variable with three categories, we need only two indicator variables to encode the categories in a regression model that includes an intercept. In general, a variable with $n$ categories needs $n-1$ indicator variables to represent all the information without introducing multicollinearity, which occurs when one variable can be linearly predicted from the others. Multicollinearity causes problems in the estimation of the model parameters.

Why Only $n-1$ Indicator Variables?

The reason for using $n-1$ indicator variables instead of $n$ is to avoid the "dummy variable trap," a scenario in which the indicator variables are perfectly collinear. Including an indicator for every category, together with an intercept, means that any one indicator can be computed exactly from the others, so it carries only redundant information.
For example, if we have three categories (A, B, C) and we use two indicator variables, say $D_1$ and $D_2$, we could define them as follows:
  • $D_1 = 1$ if the category is A, and $D_1 = 0$ otherwise.
  • $D_2 = 1$ if the category is B, and $D_2 = 0$ otherwise.
Then, if both $D_1$ and $D_2$ are 0, the category must be C. This way, all three categories can be represented using only two indicator variables, and there’s no risk of multicollinearity due to the indicator variables.
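
The encoding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six-observation sample and the helper name `indicator` are hypothetical and not part of the assignment. It shows that two indicators are enough to recover every category, and that a third indicator is an exact linear function of the other two.

```python
# Hypothetical sample: six observations of a three-level category.
categories = ["A", "B", "C", "A", "C", "B"]


def indicator(values, level):
    """Return a 0/1 list marking which observations equal `level`."""
    return [1 if v == level else 0 for v in values]


d1 = indicator(categories, "A")
d2 = indicator(categories, "B")
d3 = indicator(categories, "C")  # the redundant third indicator

# With C as the reference level, (d1, d2) alone identifies every category.
decoded = ["A" if a else "B" if b else "C" for a, b in zip(d1, d2)]

# The three indicators always sum to 1, which is the intercept column.
# This exact linear dependence is the dummy variable trap.
row_sums = [a + b + c for a, b, c in zip(d1, d2, d3)]

print(decoded == categories)
print(row_sums)
```

Because `d3` equals `1 - d1 - d2` in every row, adding it to a model that already has an intercept contributes no new information.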

Conclusion

The statement "We define three indicator variables for an explanatory variable with three categories" suggests an approach that is not typically recommended in statistical modeling due to the risk of multicollinearity. The correct approach would be to define two indicator variables for an explanatory variable with three categories. This allows the model to encode the categorical information without redundancy and avoids the dummy variable trap, ensuring that the model’s parameters can be estimated accurately.
Therefore, the statement as presented is false in the context of best practices in statistical modeling.
(ii) If the coefficient of determination is 0.833, the number of observations and explanatory variables are 12 and 3, respectively, then the Adjusted $R^2$ will be 0.84.
Answer:
To evaluate the statement, let’s first understand what the coefficient of determination ($R^2$) and the adjusted $R^2$ are, and how they are calculated.

Coefficient of Determination ($R^2$)

The coefficient of determination, $R^2$, is a statistical measure that represents the proportion of the variance of the dependent variable that is explained by the independent variable or variables in a regression model. It is a value between 0 and 1.

Adjusted $R^2$

The adjusted $R^2$ is a modified version of $R^2$ that has been adjusted for the number of predictors in the model. It accounts for the fact that $R^2$ can increase just by adding more predictors, regardless of whether those variables are significant or not. The formula for adjusted $R^2$ is:
$$\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$
where:
  • $n$ is the number of observations,
  • $k$ is the number of explanatory variables,
  • $R^2$ is the coefficient of determination.
Given:
  • $R^2 = 0.833$,
  • $n = 12$ (number of observations),
  • $k = 3$ (number of explanatory variables).

Calculation

Let’s plug these values into the formula for adjusted $R^2$:
$$\text{Adjusted } R^2 = 1 - \frac{(1 - 0.833)(12 - 1)}{12 - 3 - 1}$$
$$\text{Adjusted } R^2 = 1 - \frac{(0.167)(11)}{8}$$
$$\text{Adjusted } R^2 = 1 - \frac{1.837}{8}$$
$$\text{Adjusted } R^2 = 1 - 0.229625$$
$$\text{Adjusted } R^2 = 0.770375$$

Conclusion

Based on the calculation, the adjusted $R^2$ is approximately 0.770, not 0.84 as stated. Therefore, the statement is false. In fact, no calculation is needed to see this: with $k \geq 1$ and $R^2 < 1$, the factor $(n-1)/(n-k-1)$ exceeds 1, so the adjusted $R^2$ is always smaller than $R^2$, and a value of 0.84 above $R^2 = 0.833$ is impossible.
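
The arithmetic for part (ii) can be checked with a short Python sketch. The function name `adjusted_r2` is an illustrative choice, not something defined in the assignment; the formula is the one given in the answer.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for n observations and k explanatory variables."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 for the adjustment to be defined")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Values from the question: R^2 = 0.833, n = 12, k = 3.
value = adjusted_r2(0.833, n=12, k=3)
print(round(value, 3))  # 0.77
```

The result is below the unadjusted 0.833, as expected when at least one explanatory variable is present.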
(iii) For a simple regression model fitted on 15 observations, if we have $h_{ii}=0.37$, then it is an indication to trace the leverage point in the regression model.
Answer:
The statement pertains to the concept of leverage in regression analysis. Leverage points are observations that have a greater potential than others to influence the regression line’s slope. The leverage of an observation is quantified by its leverage value, $h_{ii}$, which is the $i$-th diagonal element of the hat matrix $H$. The hat matrix projects the vector of observed dependent variables $y$ onto the vector of fitted values $\hat{y}$.

Leverage Values

The leverage value $h_{ii}$ for the $i$-th observation measures the distance of the $i$-th value of the independent variable from the mean of the independent variable values. It indicates how much influence the $i$-th observation has on its own predicted value from the regression. For a model with an intercept, leverage values range from $1/n$ to 1, where $n$ is the number of observations. A common rule of thumb is that an observation has high leverage if its leverage value exceeds $2(p+1)/n$, where $p$ is the number of predictors (excluding the intercept) in the model.
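
For simple linear regression with an intercept, the leverage has the closed form $h_{ii} = 1/n + (x_i-\bar{x})^2 / \sum_j (x_j-\bar{x})^2$, which makes the "distance from the mean" interpretation explicit. The sketch below evaluates it in Python for the 15 cholesterol values listed in Question 2(b). The function name `hat_diagonal` is an illustrative assumption, and the code is a minimal check rather than a worked solution to Question 3.

```python
def hat_diagonal(x):
    """Leverages h_ii for simple linear regression with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)  # corrected sum of squares
    return [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]


# Serum cholesterol values (mg/dL) from the table in Question 2(b).
cholesterol = [300, 410, 380, 530, 570, 490, 340, 320,
               280, 550, 340, 350, 410, 390, 450]

h = hat_diagonal(cholesterol)

# The leverages sum to the number of fitted parameters (intercept + slope),
# and each one lies between 1/n and 1.
total = sum(h)
for i, h_ii in enumerate(h, start=1):
    print(i, round(h_ii, 4))
print("sum of leverages:", round(total, 6))
```

The sum of the leverages equals the trace of the hat matrix, which is 2 for a simple regression, so it serves as a quick sanity check on the computation.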

Analysis of the Statement

Given:
  • A simple regression model (which means $p = 1$ predictor plus the intercept).
  • Fitted on 15 observations ($n = 15$).
  • A specific observation has $h_{ii} = 0.37$.
To evaluate whether $h_{ii} = 0.37$ indicates a leverage point, let’s apply the rule of thumb for high leverage:
$$\text{High leverage threshold} = \frac{2(p+1)}{n} = \frac{2(1+1)}{15} = \frac{4}{15} \approx 0.267$$
Since $h_{ii} = 0.37$ is greater than the threshold of approximately 0.267, the observation is indeed considered to have high leverage according to the rule of thumb.

Conclusion

The statement "For a simple regression model fitted on 15 observations, if we have $h_{ii}=0.37$, then it is an indication to trace the leverage point in the regression model" is true. The given leverage value $h_{ii} = 0.37$ exceeds the common threshold for identifying high leverage points, indicating that this observation has a significant potential to influence the regression model’s fit.
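
The rule-of-thumb comparison used in part (iii) can be written as a small Python check. The function names `leverage_threshold` and `is_high_leverage` are hypothetical helpers introduced for illustration; the cut-off $2(p+1)/n$ is the one stated in the answer.

```python
def leverage_threshold(n: int, p: int) -> float:
    """Rule-of-thumb cut-off 2(p + 1)/n for n observations, p predictors."""
    return 2 * (p + 1) / n


def is_high_leverage(h_ii: float, n: int, p: int) -> bool:
    """True when the leverage value exceeds the rule-of-thumb cut-off."""
    return h_ii > leverage_threshold(n, p)


# Values from the question: simple regression (p = 1), n = 15, h_ii = 0.37.
threshold = leverage_threshold(n=15, p=1)
flag = is_high_leverage(0.37, n=15, p=1)
print(round(threshold, 3), flag)  # 0.267 True
```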

Frequently Asked Questions (FAQs)

You can access the Complete Solution through our app, which can be downloaded using this link:

App Link 

Simply click “Install” to download and install the app, and then follow the instructions to purchase the required assignment solution. Currently, the app is only available for Android devices. We are working on making the app available for iOS in the future, but it is not currently available for iOS devices.

Yes, it is the Complete Solution, a comprehensive solution to the assignments for IGNOU.

Yes, the Complete Solution is aligned with the requirements and has been solved accordingly.

Yes, the Complete Solution is guaranteed to be error-free. The solutions are thoroughly researched and verified by subject matter experts to ensure their accuracy.

As of now, you have access to the Complete Solution for a period of 1 Year after the date of purchase, which is sufficient to complete the assignment. However, we can extend the access period upon request. You can access the solution anytime through our app.

The app provides complete solutions for all assignment questions. If you still need help, you can contact the support team on WhatsApp at +91-9958288900.

No, access to the educational materials is limited to one device only, where you have first logged in. Logging in on multiple devices is not allowed and may result in the revocation of access to the educational materials.

Payments can be made through various secure online payment methods available in the app. Your payment information is protected with industry-standard security measures to ensure its confidentiality and safety. You will receive a receipt for your payment through email or within the app, depending on your preference.

The instructions for formatting your assignments are detailed in the Assignment Booklet, which includes details on paper size, margins, precision, and submission requirements. It is important to strictly follow these instructions to facilitate evaluation and avoid delays.


Terms and Conditions

  • The educational materials provided in the app are the sole property of the app owner and are protected by copyright laws.
  • Reproduction, distribution, or sale of the educational materials without prior written consent from the app owner is strictly prohibited and may result in legal consequences.
  • Any attempt to modify, alter, or use the educational materials for commercial purposes is strictly prohibited.
  • The app owner reserves the right to revoke access to the educational materials at any time without notice for any violation of these terms and conditions.
  • The app owner is not responsible for any damages or losses resulting from the use of the educational materials.
  • The app owner reserves the right to modify these terms and conditions at any time without notice.
  • By accessing and using the app, you agree to abide by these terms and conditions.
  • Access to the educational materials is limited to one device only. Logging in to the app on multiple devices is not allowed and may result in the revocation of access to the educational materials.

Our educational materials are available only on our website and application. Users and students can report any third party dealing in or selling copied versions of our educational materials at our email address (abstract4math@gmail.com) or mobile no. (+91-9958288900).

In return, such users/students can expect free educational materials/assignments and other benefits as a bona fide gesture, entirely at our discretion.
