IGNOU PGDAST Assignment Question Papers 2023 | Applied Statistics


Course Code: MSTL-002

Assignment Code: MSTL-002/TMA/2023

Maximum Marks: 100

Note:

All questions are compulsory.
Solve the following questions in MS Excel.
Take the screenshots of the final output/spreadsheet.
Paste all screenshots in the assignment booklets with all necessary interpretation and steps.

Q1 A manager of an amusement park wanted to study the waiting times of visitors for entry tickets during a peak hour. A subgroup of 15 visitors was selected (one at each ten-minute interval during an hour) and the time (in minutes) was measured from the point each visitor joined the line to when he or she began to be attended. The results for a 40-day period are recorded in the following table:

The manager of this amusement park needs to construct suitable control charts for variability as well as average to infer whether the waiting times of visitors for entry tickets are under statistical control. If the process is out of control, the revised control limits should also be computed, if necessary.

Q2 A company designs decorative glass wall panels. Each panel is supposed to meet company standards for such things as glass thickness, ability to reflect, size of panel, quality of glass, colour, and so on. To control these features, the company's quality staff randomly sampled panels from every shift and determined how many of the panels were out of compliance on at least one feature. The data collected from 25 such samples are shown below:

Sample No.	Sampled Panels	Out of Compliance Panels
1	69	2
2	71	3
3	66	3
4	65	9
5	69	3
6	67	2
7	70	4
8	73	5
9	71	3
10	69	2
11	74	5
12	79	2
13	74	4
14	74	3
15	71	2
16	67	3
17	69	2
18	75	4
19	71	2
20	72	4
21	69	3
22	74	3
23	69	2
24	69	4

Construct a suitable control chart for the fraction of out-of-compliance panels to check whether the process is in a state of control, using both approaches. Also construct the revised control charts, if necessary.
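The calculation behind such a chart can be sketched in Python as a cross-check on the Excel work. This is only an illustrative sketch: it uses the 24 sample rows that appear in the table above, assumes the usual 3-sigma limits for a p-chart, and assumes that "both approaches" means variable control limits (one pair per sample size) and constant limits based on the average sample size. The function names are hypothetical.

```python
from math import sqrt

# (sampled panels, out-of-compliance panels) for the samples listed in the table
samples = [
    (69, 2), (71, 3), (66, 3), (65, 9), (69, 3), (67, 2), (70, 4), (73, 5),
    (71, 3), (69, 2), (74, 5), (79, 2), (74, 4), (74, 3), (71, 2), (67, 3),
    (69, 2), (75, 4), (71, 2), (72, 4), (69, 3), (74, 3), (69, 2), (69, 4),
]

def limits(p_bar, n):
    """3-sigma p-chart limits for sample size n; the lower limit is floored at 0."""
    sigma = sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - 3 * sigma), p_bar + 3 * sigma

def flagged(samples, constant_n=None):
    """Return the 1-based sample numbers whose fraction falls outside the limits.

    If constant_n is given, the same limits are used for every sample
    (average-sample-size approach); otherwise limits vary with each n.
    """
    p_bar = sum(d for _, d in samples) / sum(n for n, _ in samples)
    out = []
    for i, (n, d) in enumerate(samples, start=1):
        lcl, ucl = limits(p_bar, constant_n if constant_n else n)
        if not lcl <= d / n <= ucl:
            out.append(i)
    return p_bar, out

n_bar = sum(n for n, _ in samples) / len(samples)
p_bar, out_variable = flagged(samples)
_, out_average = flagged(samples, constant_n=n_bar)
print(round(p_bar, 4), out_variable, out_average)
```

For revised limits, the flagged samples would be dropped and `flagged` called again on the remaining rows.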

Q3 A researcher is interested in studying the impact of the weekly working hours and type of machine used (0 for Machine A and 1 for Machine B) on the number of produced items of a particular type. The data were collected for 40 weeks and are shown in the following table:

Week	Produced Item	Working Hours	Machine Type
1	9	48	1
2	15	67	1
3	12	61	1
4	17	86	0
5	19	93	1
6	17	80	1
7	12	55	0
8	9	51	1
9	7	44	0
10	18	89	0
11	13	55	1
12	10	56	0
13	15	67	1
14	13	63	1
15	15	73	0
16	15	73	0
17	14	70	0
18	15	67	1
19	12	57	1
20	14	68	0
21	13	57	1
22	11	57	0
23	11	64	0
24	13	67	0
25	10	56	0
26	7	47	0
27	8	47	0
28	12	64	0
29	7	42	0
30	11	60	0
31	15	67	1
32	13	60	1
33	16	69	1
34	10	44	1
35	18	83	1
36	20	94	1
37	17	82	0
38	19	93	1
39	10	57	0
40	7	35	0

i) Prepare a scatter plot to get an idea about the relationship among the variables.

ii) Fit a linear regression model and carry out its related analysis at the 1% level of significance.

iii) Does the fitted regression model satisfy the linearity and normality assumptions?

iv) Also, draw both fitted regression lines on the scatter plot.

Q4 A popular café chain wishes to improve customer service and its employee scheduling based on the daily customer footfall during the past 10 weeks. The numbers of customers served in the restaurants during that period are given as follows:

Week	Monday	Tuesday	Wednesday	Thursday	Friday	Saturday	Sunday
1	443	608	371	341	544	460	332
2	279	358	312	377	438	277	402
3	219	288	349	223	375	208	199
4	264	343	190	362	423	202	387
5	204	273	334	208	373	216	392
6	379	292	417	234	303	364	238
7	332	241	348	377	252	432	441
8	321	478	499	478	327	604	429
9	588	649	523	699	499	569	772
10	658	848	843	793	751	975	941

i) Determine the seasonal indices for these data using 7-day moving averages.

ii) Obtain the deseasonalised values.

iii) Fit the appropriate trend for the deseasonalised data using the least-squares method by matrix approach that best describes these data.

iv) Project the number of customers on Wednesday of the 22nd week.

v) Plot the original data, the deseasonalised data, and the trend.
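Parts (i) and (ii) can be sketched in Python as a cross-check on the spreadsheet. The sketch assumes a multiplicative model and the ratio-to-moving-average method, which the question does not name explicitly. Since the period (7) is odd, the 7-day moving average is already centred on the middle day. The variable names are illustrative only.

```python
# Daily customers, one row per week (Monday..Sunday), copied from the table above
weeks = [
    [443, 608, 371, 341, 544, 460, 332],
    [279, 358, 312, 377, 438, 277, 402],
    [219, 288, 349, 223, 375, 208, 199],
    [264, 343, 190, 362, 423, 202, 387],
    [204, 273, 334, 208, 373, 216, 392],
    [379, 292, 417, 234, 303, 364, 238],
    [332, 241, 348, 377, 252, 432, 441],
    [321, 478, 499, 478, 327, 604, 429],
    [588, 649, 523, 699, 499, 569, 772],
    [658, 848, 843, 793, 751, 975, 941],
]
series = [x for week in weeks for x in week]  # 70 daily values in time order
PERIOD = 7
HALF = PERIOD // 2

# Ratio of each observation to its centred 7-day moving average (in percent).
# The first and last three days have no moving average and are skipped.
ratios = [[] for _ in range(PERIOD)]
for t in range(HALF, len(series) - HALF):
    moving_avg = sum(series[t - HALF:t + HALF + 1]) / PERIOD
    ratios[t % PERIOD].append(100 * series[t] / moving_avg)

# Average the ratios per weekday, then rescale so the indices average to 100
raw_index = [sum(r) / len(r) for r in ratios]
seasonal = [100 * PERIOD * r / sum(raw_index) for r in raw_index]

# Deseasonalised value = observed value / seasonal index * 100
deseasonalised = [100 * y / seasonal[t % PERIOD] for t, y in enumerate(series)]

for day, idx in zip(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"], seasonal):
    print(day, round(idx, 2))
```

The `deseasonalised` list is what a trend would then be fitted to in part (iii).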

1. TUTOR MARKED ASSIGNMENT
MSTL-003: Biostatistics Lab

Course Code: MSTL-003

Assignment Code: MSTL-003/TMA/2023

Maximum Marks: 100

Note:

All questions are compulsory.
Solve the following questions in MS Excel.
Take the screenshots of the final output/spreadsheet.
Paste all screenshots in the assignment booklets with all necessary interpretation and steps.

Q1(a) A random sample of 440 patients of the cardiology department of a hospital was taken, and their workout timing and severity of heart disease were recorded. The following table shows the workout timing and severity of heart disease:

Workout (in minutes)	Severity of Heart Disease
	Low	Mild	Moderate	High	Very High
No workout	5	13	26	21	23
0 to 15	6	15	19	19	21
15 to 30	16	17	14	16	12
30 to 45	18	17	13	11	9
45 to 60	20	19	15	13	7
≥ 60	16	22	6	5	6

Test at the 5% level of significance whether workout habit and heart disease are associated with each other.
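The arithmetic of a chi-square test of independence on this table can be sketched in Python as a cross-check on the Excel output. The sketch assumes the Pearson chi-square statistic with (rows - 1) × (columns - 1) degrees of freedom; the critical value 31.410 is the standard tabulated 5% point for 20 degrees of freedom and is hard-coded rather than computed.

```python
# Observed counts: rows are workout categories, columns are severity levels
observed = [
    [5, 13, 26, 21, 23],   # No workout
    [6, 15, 19, 19, 21],   # 0 to 15
    [16, 17, 14, 16, 12],  # 15 to 30
    [18, 17, 13, 11, 9],   # 30 to 45
    [20, 19, 15, 13, 7],   # 45 to 60
    [16, 22, 6, 5, 6],     # >= 60
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum over cells of (O - E)^2 / E, with E = row * col / n
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
CRITICAL_5PCT_DF20 = 31.410  # tabulated chi-square value, alpha = 0.05, df = 20
reject_h0 = chi2 > CRITICAL_5PCT_DF20

print(round(chi2, 3), df, reject_h0)
```

In Excel, the same expected counts would be built cell by cell before the statistic is compared with the critical value.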

(b) To study the association between diabetes in subjects and their family history of diabetes, the following data were obtained on 70 subjects.

Diabetes in Family	Diabetes in Subject		Total
	Yes	No	
Yes	14	3	17
No	3	50	53
Total	17	53	70

Which test is appropriate in this situation? Check whether diabetes runs in families across generations at the 5% level of significance using the appropriate test.
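One way to reason about the choice of test is to look at the expected cell counts first. The sketch below computes them and then, as an assumption rather than the prescribed answer, carries out a two-sided Fisher's exact test, which a common rule of thumb prefers for a 2 × 2 table when any expected count is below 5. The two-sided p-value here sums the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb

# 2 x 2 table: rows = diabetes in family (Yes, No), columns = diabetes in subject (Yes, No)
a, b = 14, 3
c, d = 3, 50

row1, row2 = a + b, c + d
col1, col2 = a + c, b + d
n = row1 + row2

# Expected counts under independence
expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
min_expected = min(expected)

def table_prob(x):
    """Hypergeometric probability of x in the top-left cell with margins fixed."""
    return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

p_observed = table_prob(a)
lowest = max(0, col1 - row2)
highest = min(row1, col1)

# Two-sided p-value: tables at least as extreme (as unlikely) as the observed one
p_value = sum(
    table_prob(x)
    for x in range(lowest, highest + 1)
    if table_prob(x) <= p_observed * (1 + 1e-9)
)

print(round(min_expected, 3), p_value, p_value < 0.05)
```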

(10 + 15)

Q2 A researcher is interested in checking the relationship of serum creatinine (in mg/dL) with weight (in kg) and gender (0 if female and 1 if male). The data were collected from the hospital records to examine the contribution of these variables to serum creatinine. A total of 40 patients were sampled and the data are shown in the following table:

S. No.	Serum Creatinine	Weight	Gender	S. No.	Serum Creatinine	Weight	Gender
1	0.7	46	1	21	1.1	55	1
2	1.3	65	1	22	0.9	55	0
3	1	59	1	23	0.9	62	0
4	1.5	84	0	24	1.1	65	0
5	1.7	91	1	25	0.8	54	0
6	1.5	78	1	26	0.5	45	0
7	1	53	0	27	0.6	45	0
8	0.7	49	1	28	1	62	0
9	0.5	42	0	29	0.5	40	0
10	1.6	87	0	30	0.9	58	0
11	1.1	53	1	31	1.3	65	1
12	0.8	54	0	32	1.1	58	1
13	1.3	65	1	33	1.4	67	1
14	1.1	61	1	34	0.8	42	1
15	1.3	71	0	35	1.6	81	1
16	1.3	71	0	36	1.8	92	1
17	1.2	68	0	37	1.5	80	0
18	1.3	65	1	38	1.7	91	1
19	1	55	1	39	0.8	55	0
20	1.2	66	0	40	0.5	33	0

(i) Prepare a scatter plot to get an idea about the relationship among the variables.

(ii) Fit a linear regression model and carry out its related analysis at the 1% level of significance.

(iii) Does the fitted regression model satisfy the linearity and normality assumptions?

(iv) Also, draw both fitted regression lines on the scatter plot.
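The fit in part (ii) can be sketched in Python as a cross-check on the Excel regression output. The sketch assumes an additive model, creatinine = b0 + b1 × weight + b2 × gender, with no interaction term, so the two fitted lines in part (iv) are parallel: intercept b0 for females and b0 + b2 for males. The `solve` helper is a hypothetical name for a small Gauss-Jordan routine applied to the normal equations.

```python
# (serum creatinine, weight, gender) for S. No. 1 to 40, copied from the table above
data = [
    (0.7, 46, 1), (1.3, 65, 1), (1.0, 59, 1), (1.5, 84, 0), (1.7, 91, 1),
    (1.5, 78, 1), (1.0, 53, 0), (0.7, 49, 1), (0.5, 42, 0), (1.6, 87, 0),
    (1.1, 53, 1), (0.8, 54, 0), (1.3, 65, 1), (1.1, 61, 1), (1.3, 71, 0),
    (1.3, 71, 0), (1.2, 68, 0), (1.3, 65, 1), (1.0, 55, 1), (1.2, 66, 0),
    (1.1, 55, 1), (0.9, 55, 0), (0.9, 62, 0), (1.1, 65, 0), (0.8, 54, 0),
    (0.5, 45, 0), (0.6, 45, 0), (1.0, 62, 0), (0.5, 40, 0), (0.9, 58, 0),
    (1.3, 65, 1), (1.1, 58, 1), (1.4, 67, 1), (0.8, 42, 1), (1.6, 81, 1),
    (1.8, 92, 1), (1.5, 80, 0), (1.7, 91, 1), (0.8, 55, 0), (0.5, 33, 0),
]

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    size = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(size):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][size] / m[i][i] for i in range(size)]

# Design matrix with an intercept column, and the response vector
X = [[1.0, float(w), float(g)] for _, w, g in data]
y = [c for c, _, _ in data]

# Normal equations: (X'X) b = X'y
xtx = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
b0, b1, b2 = solve(xtx, xty)

residuals = [yi - (b0 + b1 * r[1] + b2 * r[2]) for r, yi in zip(X, y)]

print("female line: y =", round(b0, 4), "+", round(b1, 4), "* weight")
print("male line:   y =", round(b0 + b2, 4), "+", round(b1, 4), "* weight")
```

The `residuals` list is the starting point for the linearity and normality checks in part (iii), for example a residual plot and a normal probability plot.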

Q3 Hypothetical data for 40 patients on age (in years), weight (in kg) and systolic blood pressure (SBP, in mmHg), coded 1 for high SBP and 0 for normal SBP, are given in the following table:

S. No.	Age	Weight	SBP	S. No.	Age	Weight	SBP
1	52	60	0	21	47	48	0
2	56	68	1	22	42	45	0
3	51	54	0	23	45	57	0
4	63	74	1	24	56	83	1
5	54	62	0	25	49	63	0
6	51	67	0	26	56	94	1
7	51	66	0	27	55	87	1
8	54	65	0	28	53	67	0
9	59	71	1	29	65	70	1
10	51	89	1	30	44	70	0
11	56	72	1	31	48	54	0
12	55	72	1	32	61	79	1
13	46	57	0	33	45	85	0
14	42	54	0	34	63	98	1
15	52	63	0	35	49	78	0
16	65	67	1	36	65	80	1
17	50	67	0	37	60	70	1
18	42	53	0	38	53	98	1
19	50	68	1	39	41	53	0
20	39	55	0	40	50	70	1

For this data:

(i) Fit a multiple logistic regression model.

(ii) Test the significance of the individual model coefficients $\beta_{1}$ and $\beta_{2}$ at the 5% level of significance.

(iii) Obtain the 95% confidence intervals for $\beta_{1}$ and $\beta_{2}$.

(iv) Determine the Nagelkerke pseudo R-squared.

Q4 A clinical study was conducted on individuals with advanced-stage Hepatocellular Carcinoma to test three lines of treatment: T1, T2 and T3. Thirty-six patients with stage III Hepatocellular Carcinoma who agreed to take part in the experiment were randomly allocated to one of the three lines of treatment T1, T2 and T3. The primary outcome was mortality, and patients were monitored for up to 60 months (5 years) after recruitment. The data (in months) so obtained are given as follows:

Patient ID	Survival time	Outcome	Treatment	Patient ID	Survival time	Outcome	Treatment
ID001	14	Died	T3	ID019	50	Died	T1
ID002	27	Unknown	T1	ID020	54	Unknown	T3
ID003	37	Unknown	T3	ID021	57	Died	T2
ID004	44	Died	T1	ID022	60	Survived	T3
ID005	27	Died	T2	ID023	20	Died	T1
ID006	29	Died	T3	ID024	22	Unknown	T2
ID007	50	Died	T2	ID025	11	Unknown	T2
ID008	31	Died	T1	ID026	12	Unknown	T1
ID009	54	Died	T2	ID027	57	Unknown	T3
ID010	32	Died	T2	ID028	60	Survived	T3
ID011	32	Unknown	T2	ID029	44	Died	T1
ID012	60	Unknown	T1	ID030	47	Unknown	T2
ID013	2	Unknown	T2	ID031	32	Died	T2
ID014	42	Died	T3	ID032	34	Died	T1
ID015	42	Unknown	T2	ID033	17	Died	T2
ID016	60	Died	T3	ID034	6	Died	T1
ID017	60	Survived	T3	ID035	50	Unknown	T3
ID018	47	Died	T3	ID036	14	Unknown	T2

For this data,

(i) Construct Kaplan-Meier survival curves.

(ii) Test whether there is a significant difference between the survival distributions of the patients under all treatments at the 5% level of significance.
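The product-limit calculation behind part (i) can be sketched in Python as a cross-check on the spreadsheet. The sketch assumes that "Died" is the event and that both "Unknown" and "Survived" are treated as censored observations; this coding is an assumption, not something the question states. It covers only the survival curves, not the comparison test in part (ii), and `kaplan_meier` is a hypothetical helper name.

```python
# (survival time in months, outcome, treatment) for ID001 to ID036
records = [
    (14, "Died", "T3"), (27, "Unknown", "T1"), (37, "Unknown", "T3"),
    (44, "Died", "T1"), (27, "Died", "T2"), (29, "Died", "T3"),
    (50, "Died", "T2"), (31, "Died", "T1"), (54, "Died", "T2"),
    (32, "Died", "T2"), (32, "Unknown", "T2"), (60, "Unknown", "T1"),
    (2, "Unknown", "T2"), (42, "Died", "T3"), (42, "Unknown", "T2"),
    (60, "Died", "T3"), (60, "Survived", "T3"), (47, "Died", "T3"),
    (50, "Died", "T1"), (54, "Unknown", "T3"), (57, "Died", "T2"),
    (60, "Survived", "T3"), (20, "Died", "T1"), (22, "Unknown", "T2"),
    (11, "Unknown", "T2"), (12, "Unknown", "T1"), (57, "Unknown", "T3"),
    (60, "Survived", "T3"), (44, "Died", "T1"), (47, "Unknown", "T2"),
    (32, "Died", "T2"), (34, "Died", "T1"), (17, "Died", "T2"),
    (6, "Died", "T1"), (50, "Unknown", "T3"), (14, "Unknown", "T2"),
]

def kaplan_meier(group):
    """Return [(time, S(time))] at each distinct death time.

    group is a list of (time, event) pairs, where event is True for a death
    and False for a censored observation.
    """
    curve, survival = [], 1.0
    for t in sorted({time for time, event in group if event}):
        at_risk = sum(1 for time, _ in group if time >= t)
        deaths = sum(1 for time, event in group if time == t and event)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

curves = {}
for arm in ("T1", "T2", "T3"):
    group = [(time, outcome == "Died") for time, outcome, trt in records if trt == arm]
    curves[arm] = kaplan_meier(group)

for arm, curve in curves.items():
    print(arm, [(t, round(s, 4)) for t, s in curve])
```

Each curve is a step function: it starts at 1 and drops to the listed value at each death time.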
