hw1 110590049
tags data
2023 Educational Data Mining and Applications HW1.pdf
2.2(e)
Five number summary: min, Q1, median, Q3, max
| min | Q1 | median | Q3 | max |
|---|---|---|---|---|
| 13 | 20 | 25 | 35 | 70 |
2.2(f)
2.8(a)
The following 2D data set is given, with query point x = (1.4, 1.6):
| | A1 | A2 |
|---|---|---|
| x1 | 1.5 | 1.7 |
| x2 | 2.0 | 1.9 |
| x3 | 1.6 | 1.8 |
| x4 | 1.2 | 1.5 |
| x5 | 1.5 | 1.0 |
Manhattan distance
| | A1 | A2 | distance |
|---|---|---|---|
| x | 1.4 | 1.6 | 0 |
| x1 | 1.5 | 1.7 | 0.2 |
| x4 | 1.2 | 1.5 | 0.3 |
| x3 | 1.6 | 1.8 | 0.4 |
| x5 | 1.5 | 1.0 | 0.7 |
| x2 | 2.0 | 1.9 | 0.9 |
Euclidean distance
| | A1 | A2 | distance | rank |
|---|---|---|---|---|
| x | 1.4 | 1.6 | 0 | |
| x1 | 1.5 | 1.7 | 0.141 | 1 |
| x2 | 2.0 | 1.9 | 0.671 | 5 |
| x3 | 1.6 | 1.8 | 0.283 | 3 |
| x4 | 1.2 | 1.5 | 0.224 | 2 |
| x5 | 1.5 | 1.0 | 0.608 | 4 |
Supremum distance
| | A1 | A2 | distance | rank |
|---|---|---|---|---|
| x | 1.4 | 1.6 | 0 | |
| x1 | 1.5 | 1.7 | 0.1 | 1 |
| x2 | 2.0 | 1.9 | 0.6 | 4 (tie) |
| x3 | 1.6 | 1.8 | 0.2 | 2 (tie) |
| x4 | 1.2 | 1.5 | 0.2 | 2 (tie) |
| x5 | 1.5 | 1.0 | 0.6 | 4 (tie) |
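The three distance tables above can be checked with a short script. This is a minimal sketch in plain Python; the function names `manhattan`, `euclidean`, `supremum`, and `distances` are chosen here for illustration and do not come from the assignment.

```python
# Query point and data set, copied from the 2.8(a) tables.
x = (1.4, 1.6)
points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}


def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def euclidean(a, b):
    """L2 distance: square root of the summed squared differences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


def supremum(a, b):
    """L-infinity distance: largest absolute coordinate difference."""
    return max(abs(p - q) for p, q in zip(a, b))


def distances(metric):
    """Distance from x to every point, rounded to 3 decimals."""
    return {name: round(metric(x, p), 3) for name, p in points.items()}


for metric in (manhattan, euclidean, supremum):
    table = distances(metric)
    # Sort ascending: the smallest distance is the most similar point.
    ranking = sorted(table, key=table.get)
    print(metric.__name__, table, ranking)
```

Points with equal rounded distances keep their dictionary order in `ranking`, so tied positions should be read as ties rather than as a strict order.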
Cosine similarity
| | A1 | A2 | similarity | rank |
|---|---|---|---|---|
| x | 1.4 | 1.6 | 1 | |
| x1 | 1.5 | 1.7 | 0.99999 | 1 |
| x2 | 2.0 | 1.9 | 0.99575 | 4 |
| x3 | 1.6 | 1.8 | 0.99997 | 2 |
| x4 | 1.2 | 1.5 | 0.99903 | 3 |
| x5 | 1.5 | 1.0 | 0.96536 | 5 |
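Cosine similarity ranks in the opposite direction from a distance: the largest value is the most similar point. The sketch below recomputes the column from the data set in 2.8(a); `cosine_similarity` is a helper name introduced here for illustration, not something from the assignment.

```python
import math

# Query point and data set, copied from the 2.8(a) tables.
x = (1.4, 1.6)
points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}


def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


similarity = {name: cosine_similarity(x, p) for name, p in points.items()}

# Sort descending: a higher cosine means a smaller angle to x.
ranking = sorted(similarity, key=similarity.get, reverse=True)
for name in ranking:
    print(name, round(similarity[name], 5))
```

Five decimals are printed because x1 and x3 only differ from the fifth decimal place onward.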
2.8(b)
Euclidean distance after normalizing each point to unit length:
| | A1 | A2 | distance | rank |
|---|---|---|---|---|
| x | 0.659 | 0.753 | 0 | |
| x1 | 0.662 | 0.750 | 0.0041 | 1 |
| x2 | 0.725 | 0.689 | 0.0922 | 4 |
| x3 | 0.664 | 0.747 | 0.0078 | 2 |
| x4 | 0.625 | 0.781 | 0.0441 | 3 |
| x5 | 0.832 | 0.555 | 0.2632 | 5 |
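A sketch of the 2.8(b) computation, assuming the normalization is division by the Euclidean length, which is what the unit-length row for x in the table suggests. The helper names `normalize` and `euclidean` are introduced here for illustration.

```python
import math

# Query point and data set, copied from the 2.8(a) tables.
x = (1.4, 1.6)
points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}


def normalize(v):
    """Scale a vector so that its Euclidean length is 1."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def euclidean(a, b):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


x_unit = normalize(x)
distance = {name: euclidean(x_unit, normalize(p)) for name, p in points.items()}

# Sort ascending: the smallest distance is the most similar point.
ranking = sorted(distance, key=distance.get)
for name in ranking:
    unit = normalize(points[name])
    print(name, [round(c, 3) for c in unit], round(distance[name], 4))
```

For unit-length vectors the squared Euclidean distance equals 2 - 2·cos, so this ranking necessarily matches the cosine-similarity ranking.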
3.3(a)
Sorted data partitioned into bins of depth 3:
| Bin | Data |
|---|---|
| Bin 1 | 13, 15, 16 |
| Bin 2 | 16, 19, 20 |
| Bin 3 | 20, 21, 22 |
| Bin 4 | 22, 25, 25 |
| Bin 5 | 25, 25, 30 |
| Bin 6 | 33, 33, 35 |
| Bin 7 | 35, 35, 35 |
| Bin 8 | 35, 35, 36 |
| Bin 9 | 36, 40, 45 |
| Bin 10 | 46, 52, 70 |
Data smoothed by bin means:
| Bin | Smoothed Data |
|---|---|
| Bin 1 | 14.67, 14.67, 14.67 |
| Bin 2 | 18.33, 18.33, 18.33 |
| Bin 3 | 21.00, 21.00, 21.00 |
| Bin 4 | 24.00, 24.00, 24.00 |
| Bin 5 | 26.67, 26.67, 26.67 |
| Bin 6 | 33.67, 33.67, 33.67 |
| Bin 7 | 35.00, 35.00, 35.00 |
| Bin 8 | 35.33, 35.33, 35.33 |
| Bin 9 | 40.33, 40.33, 40.33 |
| Bin 10 | 56.00, 56.00, 56.00 |
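The smoothing step can be sketched as follows: split the sorted values into consecutive bins of three and replace every value with its bin mean. The values are read off the bin table above; `smooth_by_bin_means` is an illustrative name, not part of the assignment.

```python
# Sorted values, read row by row from the bin table above.
values = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22,
          25, 25, 25, 25, 30, 33, 33, 35, 35, 35,
          35, 35, 35, 36, 36, 40, 45, 46, 52, 70]


def smooth_by_bin_means(sorted_values, depth):
    """Equal-frequency binning followed by smoothing by bin means."""
    smoothed = []
    for start in range(0, len(sorted_values), depth):
        bin_values = sorted_values[start:start + depth]
        mean = round(sum(bin_values) / len(bin_values), 2)
        # Every value in the bin is replaced by the bin mean.
        smoothed.append([mean] * len(bin_values))
    return smoothed


smoothed = smooth_by_bin_means(values, depth=3)
for number, bin_values in enumerate(smoothed, start=1):
    print(f"Bin {number}: {bin_values}")
```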
3.3(b)
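Find the outliers using the IQR method: with Q1 = 20 and Q3 = 35 from 2.2(e), IQR = 15, so the 1.5 × IQR fences are 20 - 22.5 = -2.5 and 35 + 22.5 = 57.5. Only 70 lies outside them, so 70 is the outlier.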
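A minimal sketch of the IQR check. It assumes the conventional 1.5 × IQR fences, which these notes do not state explicitly. It takes Q1 and Q3 from the five-number summary in 2.2(e) and the values from the 3.3(a) bin table.

```python
# Quartiles from the five-number summary in 2.2(e).
q1, q3 = 20, 35

# Values read row by row from the 3.3(a) bin table.
values = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22,
          25, 25, 25, 25, 30, 33, 33, 35, 35, 35,
          35, 35, 35, 36, 36, 40, 45, 46, 52, 70]

iqr = q3 - q1

# Assumed convention: anything beyond 1.5 * IQR from a quartile is an outlier.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [v for v in values if v < lower_fence or v > upper_fence]
print(f"IQR = {iqr}, fences = [{lower_fence}, {upper_fence}], outliers = {outliers}")
```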
3.7(a)
3.7(b)
3.8(b)
| age | fat |
|---|---|
| 23 | 9.5 |
| 23 | 26.5 |
| 27 | 7.8 |
| 27 | 17.8 |
| 39 | 31.4 |
| 41 | 25.9 |
| 47 | 27.4 |
| 49 | 27.2 |
| 50 | 31.2 |
| 52 | 34.6 |
| 54 | 28.8 |
| 56 | 33.4 |
| 57 | 30.2 |
| 58 | 34.1 |
| 58 | 32.9 |
| 60 | 41.2 |
| 61 | 35.7 |