Take the screenshots of the final output/spreadsheet.
Paste all screenshots in the assignment booklets with all necessary interpretation and steps.
Q1 A manager of an amusement park wanted to study the waiting times of visitors for entry tickets during a peak hour. A subgroup of 15 visitors was selected (one at each ten-minute interval during an hour), and the time (in minutes) was measured from the point each visitor entered the line to the point he or she began to be attended. The results for a 40-day period are recorded in the following table:
The manager of this amusement park needs to construct suitable control charts for variability as well as for the average, to infer whether the waiting times of visitors for entry tickets are under statistical control or not. If the process is out of control, also compute the revised control limits.
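With subgroups of size 15, an X-bar chart paired with an S chart is one common choice for this kind of problem. The following is a minimal sketch under stated assumptions, not a prescribed solution: the function names `s_chart_constants` and `xbar_s_limits` are hypothetical, the demo subgroups are made-up numbers (the waiting-time table is not reproduced here), and the constants A3, B3 and B4 are computed from the usual c4 formula rather than read from a table.

```python
import math
from statistics import mean, stdev

def s_chart_constants(n):
    """Constants A3, B3, B4 for X-bar and S charts with subgroup size n."""
    c4 = math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)
    spread = 3.0 * math.sqrt(1.0 - c4 ** 2) / c4
    return {
        "A3": 3.0 / (c4 * math.sqrt(n)),
        "B3": max(0.0, 1.0 - spread),  # lower limit is floored at zero
        "B4": 1.0 + spread,
    }

def xbar_s_limits(subgroups):
    """Return (LCL, CL, UCL) for the X-bar chart and the S chart."""
    n = len(subgroups[0])
    k = s_chart_constants(n)
    xbarbar = mean(mean(g) for g in subgroups)   # grand mean
    sbar = mean(stdev(g) for g in subgroups)     # average subgroup std. dev.
    return {
        "xbar": (xbarbar - k["A3"] * sbar, xbarbar, xbarbar + k["A3"] * sbar),
        "s": (k["B3"] * sbar, sbar, k["B4"] * sbar),
    }

# Hypothetical demo data (three subgroups of size 3), for illustration only.
demo = [[4.0, 5.0, 6.0], [5.0, 5.5, 6.5], [3.5, 5.0, 5.5]]
limits = xbar_s_limits(demo)
print("X-bar chart (LCL, CL, UCL):", limits["xbar"])
print("S chart     (LCL, CL, UCL):", limits["s"])
print("Constants for n = 15:", s_chart_constants(15))
```

Subgroups whose mean or standard deviation falls outside these limits would be dropped before recomputing the revised limits, provided an assignable cause is assumed for each.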
Q2 A company designs decorative glass wall panels. Each panel is supposed to meet company standards for such things as glass thickness, ability to reflect, size of panel, quality of glass, colour, and so on. To control these features, the company's quality staff randomly sampled panels from every shift and determined how many of the panels were out of compliance on at least one feature. The data collected from 25 such samples are shown below:
Construct a suitable control chart for the fraction of out-of-compliance panels to check whether the process is in a state of control or not, using both approaches. Also construct the revised control charts, if necessary.
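A p-chart is the usual chart for a fraction nonconforming. The sketch below assumes that "both approaches" refers to the two standard ways of handling unequal sample sizes (variable control limits per sample, and a single pair of limits based on the average sample size); that reading is an assumption, as are the function name `p_chart_limits` and the demo counts, which are hypothetical.

```python
import math

def p_chart_limits(defectives, sizes):
    """Return p-bar, per-sample limits, and limits based on the average sample size."""
    pbar = sum(defectives) / sum(sizes)

    def limits(n):
        half_width = 3.0 * math.sqrt(pbar * (1.0 - pbar) / n)
        # A fraction cannot be negative or exceed 1, so clip the limits.
        return max(0.0, pbar - half_width), min(1.0, pbar + half_width)

    variable = [limits(n) for n in sizes]          # approach 1: one pair per sample
    average = limits(sum(sizes) / len(sizes))      # approach 2: average sample size
    return pbar, variable, average

# Hypothetical demo counts, for illustration only.
bad = [5, 8, 4]
inspected = [100, 120, 80]
pbar, variable, average = p_chart_limits(bad, inspected)
for d, n, (lcl, ucl) in zip(bad, inspected, variable):
    status = "in control" if lcl <= d / n <= ucl else "out of control"
    print(f"p = {d / n:.4f}, limits = ({lcl:.4f}, {ucl:.4f}) -> {status}")
print("average-sample-size limits:", average)
```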
Q3 A researcher is interested in studying the impact of the weekly working hours and the type of machine used (0 for Machine A and 1 for Machine B) on the number of produced items of a particular type. The data were collected for 40 weeks and are shown in the following table:
i) Prepare a scatter plot to get an idea about the relationship among the variables.
ii) Fit a linear regression model and carry out its related analysis at the 1% level of significance.
iii) Does the fitted regression model satisfy the linearity and normality assumptions?
iv) Also, draw both fitted regression lines on the scatter plot.
Q4 A popular café chain wishes to improve customer service and its employee scheduling based on the daily customer footfall during the past 10 weeks. The numbers of customers served in the restaurants during that period are given as follows:
Take the screenshots of the final output/spreadsheet.
Paste all screenshots in the assignment booklets with all necessary interpretation and steps.
Q1(a) A random sample of 440 patients of the cardiology department of a hospital was taken, and their workout timing and severity of heart disease were recorded. The following table shows the workout timing and severity of heart disease:
Workout (in minutes)
Test at the 5% level of significance whether workout habit and heart disease are associated with each other or not.
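A chi-square test of independence is the usual tool for a contingency table of this kind. As a minimal sketch, assuming observed counts are available as a list of rows, the statistic and its degrees of freedom can be computed as follows. The function name `chi_square_independence` and the demo table are hypothetical (the workout table is not reproduced here); the statistic would then be compared with the tabulated chi-square critical value at the 5% level.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

# Hypothetical 2 x 2 counts, for illustration only.
stat, df, expected = chi_square_independence([[10, 20], [20, 10]])
print(f"chi-square = {stat:.4f} on {df} degree(s) of freedom")
```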
(b) To study the association between the diabetic patients and their family history of diabetes, the following data were obtained on 70 subjects.
| Diabetes in Family | Diabetes in Subject: Yes | Diabetes in Subject: No | Total |
| :---: | :---: | :---: | :---: |
| Yes | 14 | 3 | 17 |
| No | 3 | 50 | 53 |
| Total | 17 | 53 | 70 |
Which test is appropriate in this situation? Using the appropriate test, check at the 5% level of significance whether diabetes runs in families across generations or not.
(10+15)
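For part (b), one way to reason about the choice of test is to look at the expected cell counts: when any expected count in a 2×2 table falls below 5, Fisher's exact test is commonly preferred over the chi-square test. The sketch below checks this for the counts in the table above and computes a two-sided Fisher exact p-value from the hypergeometric distribution. The function names are hypothetical, and this is an illustration of the technique rather than the required write-up.

```python
from math import comb

def min_expected(a, b, c, d):
    """Smallest expected count under independence for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return min(r * col / n for r in rows for col in cols)

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

# Counts from the table: family history (rows) by diabetes in subject (columns).
smallest = min_expected(14, 3, 3, 50)
p_value = fisher_exact_2x2(14, 3, 3, 50)
print(f"smallest expected count = {smallest:.2f}")
print(f"two-sided Fisher exact p-value = {p_value:.3g}")
```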
Q2 A researcher is interested in checking the relationship of serum creatinine (in mg/dL) with weight (in kg) and gender (0 if female and 1 if male). The data were collected from hospital records to examine the contribution of these variables to serum creatinine. A total of 40 patients were sampled, and the data are shown in the following table:
(i) Prepare a scatter plot to get an idea about the relationship among the variables.
(ii) Fit a linear regression model and carry out its related analysis at the 1% level of significance.
(iii) Does the fitted regression model satisfy the linearity and normality assumptions?
(iv) Also, draw both fitted regression lines on the scatter plot.
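Parts (ii) and (iv) rest on the fact that a model with one continuous predictor and one 0/1 dummy variable yields two parallel fitted lines, one per group. The following is a minimal sketch of that idea using ordinary least squares via the normal equations. The function name `fit_dummy_regression` and the demo numbers are hypothetical, and the sketch covers only the coefficient estimates, not the significance tests or the assumption checks.

```python
def fit_dummy_regression(x, d, y):
    """Least-squares fit of y = b0 + b1*x + b2*d, where d is a 0/1 dummy."""
    rows = [[1.0, xi, di] for xi, di in zip(x, d)]
    # Augmented normal equations [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        tail = sum(a[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (a[i][3] - tail) / a[i][i]
    return beta

# Hypothetical demo data generated exactly from y = 1 + 2*x + 3*d.
x = [1, 2, 3, 4, 1, 2, 3, 4]
d = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1 + 2 * xi + 3 * di for xi, di in zip(x, d)]
b0, b1, b2 = fit_dummy_regression(x, d, y)
print(f"line for d = 0: y = {b0:.3f} + {b1:.3f} x")
print(f"line for d = 1: y = {b0 + b2:.3f} + {b1:.3f} x")
```

The two printed equations are the lines to overlay on the scatter plot: same slope, intercepts differing by the dummy coefficient.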
Q3 Hypothetical data for 40 patients on age (in years), weight (in kg) and systolic blood pressure (SBP, in mmHg), coded 1 for high SBP and 0 for normal SBP, are given in the following table:
(ii) Test the significance of the individual model coefficients $\beta_1$ and $\beta_2$ at the 5% level of significance.
(iii) Obtain the 95% confidence intervals for $\beta_1$ and $\beta_2$.
(iv) Determine the Nagelkerke pseudo R-squared.
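Once a logistic regression has been fitted by any software, parts (ii) to (iv) reduce to a few formulas. The sketch below assumes the coefficient estimates, their standard errors and the log-likelihoods of the null and fitted models are already available. It uses the Wald z-test and Wald interval, which is one common choice, and the standard Nagelkerke rescaling of the Cox and Snell statistic. The function names and the demo inputs are hypothetical.

```python
import math
from statistics import NormalDist

def wald_summary(beta, se, level=0.95):
    """Wald z statistic, two-sided p-value and confidence interval for one coefficient."""
    z = beta / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z, p, (beta - z_crit * se, beta + z_crit * se)

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo R-squared from the null and fitted log-likelihoods."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)  # upper bound of Cox-Snell
    return cox_snell / max_cox_snell

# Hypothetical inputs, for illustration only.
z, p, (lower, upper) = wald_summary(beta=1.0, se=0.5)
print(f"z = {z:.3f}, p = {p:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print(f"Nagelkerke R^2 = {nagelkerke_r2(-27.0, -20.0, 40):.4f}")
```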
Q4 A clinical study was conducted on individuals with advanced-stage Hepatocellular Carcinoma to test three lines of treatment: T1, T2 and T3. Thirty-six patients with stage III Hepatocellular Carcinoma who agreed to take part in the experiment were randomly allocated to one of the three lines of treatment T1, T2 and T3. The primary outcome was mortality, and patients were monitored for up to 60 months (5 years) after recruitment. The data (in months) so obtained are given as follows:
| Patient ID | Survival time | Outcome | Treatment | Patient ID | Survival time | Outcome | Treatment |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ID001 | 14 | Died | T3 | ID019 | 50 | Died | T1 |
| ID002 | 27 | Unknown | T1 | ID020 | 54 | Unknown | T3 |
| ID003 | 37 | Unknown | T3 | ID021 | 57 | Died | T2 |
| ID004 | 44 | Died | T1 | ID022 | 60 | Survived | T3 |
| ID005 | 27 | Died | T2 | ID023 | 20 | Died | T1 |
| ID006 | 29 | Died | T3 | ID024 | 22 | Unknown | T2 |
| ID007 | 50 | Died | T2 | ID025 | 11 | Unknown | T2 |
| ID008 | 31 | Died | T1 | ID026 | 12 | Unknown | T1 |
| ID009 | 54 | Died | T2 | ID027 | 57 | Unknown | T3 |
| ID010 | 32 | Died | T2 | ID028 | 60 | Survived | T3 |
| ID011 | 32 | Unknown | T2 | ID029 | 44 | Died | T1 |
(ii) Test whether there is a significant difference between the survival distributions of the patients under all treatments at the 5% level of significance.
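A log-rank test is the usual way to compare survival distributions across groups. The sketch below is a minimal three-group version written from the standard observed-minus-expected formulation. It makes two assumptions that should be checked against the intended solution: outcomes recorded as "Unknown" or "Survived" are treated as censored, and only the 22 rows visible in the table above are used, so the result does not cover all 36 patients. The function name `logrank_three_groups` is hypothetical.

```python
import math

# (survival time in months, outcome, treatment) for the 22 rows shown in the table.
rows = [
    (14, "Died", "T3"), (27, "Unknown", "T1"), (37, "Unknown", "T3"),
    (44, "Died", "T1"), (27, "Died", "T2"), (29, "Died", "T3"),
    (50, "Died", "T2"), (31, "Died", "T1"), (54, "Died", "T2"),
    (32, "Died", "T2"), (32, "Unknown", "T2"), (50, "Died", "T1"),
    (54, "Unknown", "T3"), (57, "Died", "T2"), (60, "Survived", "T3"),
    (20, "Died", "T1"), (22, "Unknown", "T2"), (11, "Unknown", "T2"),
    (12, "Unknown", "T1"), (57, "Unknown", "T3"), (60, "Survived", "T3"),
    (44, "Died", "T1"),
]

def logrank_three_groups(times, events, groups):
    """Log-rank chi-square (2 df) and p-value for exactly three groups."""
    labels = sorted(set(groups))
    if len(labels) != 3:
        raise ValueError("this sketch handles exactly three groups")
    z = [0.0, 0.0]              # observed minus expected deaths, first two groups
    v00 = v01 = v11 = 0.0       # covariance matrix of z
    for t in sorted({t for t, e in zip(times, events) if e}):
        n_all = sum(1 for x in times if x >= t)
        d_all = sum(1 for x, e in zip(times, events) if x == t and e)
        if n_all < 2:
            continue            # a lone subject at risk adds no information
        scale = d_all * (n_all - d_all) / (n_all - 1)
        share = []
        for i in range(2):
            n_i = sum(1 for x, g in zip(times, groups) if x >= t and g == labels[i])
            d_i = sum(1 for x, e, g in zip(times, events, groups)
                      if x == t and e and g == labels[i])
            z[i] += d_i - d_all * n_i / n_all
            share.append(n_i / n_all)
        v00 += scale * share[0] * (1 - share[0])
        v11 += scale * share[1] * (1 - share[1])
        v01 -= scale * share[0] * share[1]
    det = v00 * v11 - v01 ** 2
    chi2 = (z[0] ** 2 * v11 - 2 * z[0] * z[1] * v01 + z[1] ** 2 * v00) / det
    # For 2 degrees of freedom the chi-square upper tail is exactly exp(-x / 2).
    return chi2, math.exp(-chi2 / 2)

times = [t for t, _, _ in rows]
events = [outcome == "Died" for _, outcome, _ in rows]   # assumption: others censored
groups = [g for _, _, g in rows]
chi2, p_value = logrank_three_groups(times, events, groups)
print(f"log-rank chi-square = {chi2:.3f}, p-value = {p_value:.4f}")
```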