hw1 110590049

tags data

2023 Educational Data Mining and Applications HW1.pdf

2.2(e)

Five number summary: min, Q1, median, Q3, max

| min | Q1 | median | Q3 | max |
|-----|----|--------|----|-----|
| 13  | 20 | 25     | 35 | 70  |

2.2(f)

2.8(a)

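Given the following 2D data set and the query point x = (1.4, 1.6):

|    | A1  | A2  |
|----|-----|-----|
| x1 | 1.5 | 1.7 |
| x2 | 2.0 | 1.9 |
| x3 | 1.6 | 1.8 |
| x4 | 1.2 | 1.5 |
| x5 | 1.5 | 1.0 |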

Manhattan distance

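Rows are sorted by increasing distance from x.

|    | A1  | A2  | distance |
|----|-----|-----|----------|
| x  | 1.4 | 1.6 | 0        |
| x1 | 1.5 | 1.7 | 0.2      |
| x4 | 1.2 | 1.5 | 0.3      |
| x3 | 1.6 | 1.8 | 0.4      |
| x5 | 1.5 | 1.0 | 0.7      |
| x2 | 2.0 | 1.9 | 0.9      |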

Euclidean distance

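|    | A1  | A2  | distance | rank |
|----|-----|-----|----------|------|
| x  | 1.4 | 1.6 | 0        | -    |
| x1 | 1.5 | 1.7 | 0.141    | 1    |
| x2 | 2.0 | 1.9 | 0.671    | 5    |
| x3 | 1.6 | 1.8 | 0.283    | 3    |
| x4 | 1.2 | 1.5 | 0.224    | 2    |
| x5 | 1.5 | 1.0 | 0.608    | 4    |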

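Supremum distance

|    | A1  | A2  | distance | rank    |
|----|-----|-----|----------|---------|
| x  | 1.4 | 1.6 | 0        | -       |
| x1 | 1.5 | 1.7 | 0.1      | 1       |
| x2 | 2.0 | 1.9 | 0.6      | 4 (tie) |
| x3 | 1.6 | 1.8 | 0.2      | 2 (tie) |
| x4 | 1.2 | 1.5 | 0.2      | 2 (tie) |
| x5 | 1.5 | 1.0 | 0.6      | 4 (tie) |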

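Cosine similarity

|    | A1  | A2  | similarity | rank |
|----|-----|-----|------------|------|
| x  | 1.4 | 1.6 | 1          | -    |
| x1 | 1.5 | 1.7 | 0.99999    | 1    |
| x2 | 2.0 | 1.9 | 0.99575    | 4    |
| x3 | 1.6 | 1.8 | 0.99997    | 2    |
| x4 | 1.2 | 1.5 | 0.99903    | 3    |
| x5 | 1.5 | 1.0 | 0.96536    | 5    |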
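The four rankings in 2.8(a) can be cross-checked with a short script. This is only a sketch: the helper names (`manhattan`, `euclidean`, `supremum`, `cosine`, `rank`) are chosen here for illustration and are not part of the assignment.

```python
import math

# Query point and data set as given in 2.8(a).
x = (1.4, 1.6)
points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}

def manhattan(p, q):
    # Sum of absolute coordinate differences (L1 norm).
    return sum(abs(a - b) for a, b in zip(p, q))

def euclidean(p, q):
    # Square root of the sum of squared differences (L2 norm).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def supremum(p, q):
    # Largest absolute coordinate difference (L-infinity norm).
    return max(abs(a - b) for a, b in zip(p, q))

def cosine(p, q):
    # Dot product divided by the product of the vector lengths.
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.hypot(*p) * math.hypot(*q))

def rank(measure, reverse=False):
    """Order point names from most to least similar to x.

    Use reverse=True for similarity measures (larger means closer).
    """
    scores = {name: measure(x, p) for name, p in points.items()}
    return sorted(scores, key=scores.get, reverse=reverse)

for name, p in points.items():
    print(name,
          round(manhattan(x, p), 3),
          round(euclidean(x, p), 3),
          round(supremum(x, p), 3),
          round(cosine(x, p), 5))

print("Manhattan:", rank(manhattan))
print("Euclidean:", rank(euclidean))
print("Supremum: ", rank(supremum))
print("Cosine:   ", rank(cosine, reverse=True))
```

Under the supremum distance some points are equidistant from x, so the order within a tie depends on floating-point rounding and on the sort's tie-breaking rather than on the data.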

2.8(b)

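Each point is normalized to unit length, then ranked by its Euclidean distance to the normalized x.

|    | A1    | A2    | distance | rank |
|----|-------|-------|----------|------|
| x  | 0.659 | 0.753 | 0        | -    |
| x1 | 0.662 | 0.750 | 0.0041   | 1    |
| x2 | 0.725 | 0.689 | 0.0922   | 4    |
| x3 | 0.664 | 0.747 | 0.0078   | 2    |
| x4 | 0.625 | 0.781 | 0.0441   | 3    |
| x5 | 0.832 | 0.555 | 0.2632   | 5    |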
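A minimal sketch for checking 2.8(b), assuming the normalization meant here is scaling each point to unit Euclidean length (the tabulated coordinates are consistent with that). The helper name `unit` is illustrative.

```python
import math

# Query point and data set as given in 2.8(a).
x = (1.4, 1.6)
points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}

def unit(p):
    """Scale p so that its Euclidean length is 1."""
    norm = math.hypot(*p)
    return tuple(a / norm for a in p)

xn = unit(x)
normalized = {name: unit(p) for name, p in points.items()}

# Euclidean distance between the normalized query and each normalized point.
distances = {name: math.dist(xn, p) for name, p in normalized.items()}
ranking = sorted(distances, key=distances.get)

for name in ranking:
    coords = [round(a, 3) for a in normalized[name]]
    print(name, coords, round(distances[name], 4))
```

For unit vectors the squared Euclidean distance equals 2(1 - cos θ), so this ranking must agree with the cosine-similarity ranking on the original data.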

3.3(a)

| Bin    | Data       |
|--------|------------|
| Bin 1  | 13, 15, 16 |
| Bin 2  | 16, 19, 20 |
| Bin 3  | 20, 21, 22 |
| Bin 4  | 22, 25, 25 |
| Bin 5  | 25, 25, 30 |
| Bin 6  | 33, 33, 35 |
| Bin 7  | 35, 35, 35 |
| Bin 8  | 35, 35, 36 |
| Bin 9  | 36, 40, 45 |
| Bin 10 | 46, 52, 70 |

| Bin    | Smoothed Data       |
|--------|---------------------|
| Bin 1  | 14.67, 14.67, 14.67 |
| Bin 2  | 18.33, 18.33, 18.33 |
| Bin 3  | 21.00, 21.00, 21.00 |
| Bin 4  | 24.00, 24.00, 24.00 |
| Bin 5  | 26.67, 26.67, 26.67 |
| Bin 6  | 33.67, 33.67, 33.67 |
| Bin 7  | 35.00, 35.00, 35.00 |
| Bin 8  | 35.33, 35.33, 35.33 |
| Bin 9  | 40.33, 40.33, 40.33 |
| Bin 10 | 56.00, 56.00, 56.00 |
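The smoothing step can be sketched as follows: sort the values, split them into equal-depth bins of 3, and replace every value by the mean of its bin. The `data` list is simply the values as they appear in the bins above; the function name `smooth_by_bin_means` is illustrative.

```python
# Values exactly as listed in the bins above, one bin per row.
data = [
    13, 15, 16,
    16, 19, 20,
    20, 21, 22,
    22, 25, 25,
    25, 25, 30,
    33, 33, 35,
    35, 35, 35,
    35, 35, 36,
    36, 40, 45,
    46, 52, 70,
]

def smooth_by_bin_means(values, depth):
    """Equal-depth binning, then replace each value by its bin mean."""
    values = sorted(values)
    smoothed = []
    for start in range(0, len(values), depth):
        bin_values = values[start:start + depth]
        mean = sum(bin_values) / len(bin_values)
        # Every member of the bin takes the (rounded) bin mean.
        smoothed.extend([round(mean, 2)] * len(bin_values))
    return smoothed

smoothed = smooth_by_bin_means(data, depth=3)
for i in range(0, len(smoothed), 3):
    print(f"Bin {i // 3 + 1}:", smoothed[i:i + 3])
```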

3.3(b)

Find the outlier values using the IQR method:
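A minimal sketch of the IQR rule, assuming Q1 = 20 and Q3 = 35 from the five-number summary in 2.2(e), the common 1.5 × IQR fences, and the values as listed in the 3.3(a) bins. All three are assumptions made for this sketch rather than something stated in this section.

```python
# Values as listed in the 3.3(a) bins (assumed to be the data in question).
data = [
    13, 15, 16, 16, 19, 20, 20, 21, 22, 22,
    25, 25, 25, 25, 30, 33, 33, 35, 35, 35,
    35, 35, 35, 36, 36, 40, 45, 46, 52, 70,
]

# Quartiles taken from the five-number summary in 2.2(e), not recomputed.
q1, q3 = 20, 35

iqr = q3 - q1                 # interquartile range
lower = q1 - 1.5 * iqr        # lower fence
upper = q3 + 1.5 * iqr        # upper fence

# Any value outside [lower, upper] is flagged as an outlier.
outliers = [v for v in data if v < lower or v > upper]

print(iqr, lower, upper)      # 15 -2.5 57.5
print(outliers)               # [70]
```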

3.7(a)

3.7(b)

3.8(b)

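| age | fat  |
|-----|------|
| 23  | 9.5  |
| 23  | 26.5 |
| 27  | 7.8  |
| 27  | 17.8 |
| 39  | 31.4 |
| 41  | 25.9 |
| 47  | 27.4 |
| 49  | 27.2 |
| 50  | 31.2 |
| 52  | 34.6 |
| 54  | 28.8 |
| 56  | 33.4 |
| 57  | 30.2 |
| 58  | 34.1 |
| 58  | 32.9 |
| 60  | 41.2 |
| 61  | 35.7 |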

3.11(a)