MINISTRY FOR THE DEVELOPMENT OF INFORMATION TECHNOLOGIES AND COMMUNICATIONS OF THE REPUBLIC OF UZBEKISTAN
Tashkent University of Information Technologies named after Muhammad al-Khwarizmi
PRACTICAL ASSIGNMENT 3
Machine Learning course, group student
Completed by: Dehqonov Dilshod
Checked by: Ochilov Mannon
Tashkent, 2021
Topic: Unsupervised learning in machine learning and its implementation in code
To complete this practical assignment, I first prepared a training sample, which took the following form:
import numpy as np
dataset_2 = np.array([
[10, 4, 6, 6],
[15, 7, 8, 10],
[12, 5, 7, 9],
[9, 4, 5, 5],
[11, 5, 6, 7],
[20, 8, 12, 15],
[6, 1, 5, 4],
[19, 9, 10, 14],
[9, 4, 5, 6],
[15, 5, 10, 12],
[15, 7, 8, 15],
[14, 6, 8, 10],
[21, 10, 11, 12],
[24, 14, 10, 20],
[9, 4, 5, 6],
[11, 5, 6, 6],
[20, 9, 11, 16],
[12, 5, 7, 9],
[22, 10, 12, 10],
[25, 12, 13, 15],
#
[15, 10, 5, 10],
[22, 15, 7, 15],
[20, 12, 8, 10],
[8, 4, 4, 4],
[10, 5, 5, 7],
[8, 4, 4, 5],
[16, 10, 6, 8],
[10, 5, 5, 6],
[18, 10, 8, 12],
[14, 8, 6, 15],
[14, 7, 7, 10],
[6, 4, 2, 4],
[17, 9, 8, 14],
[13, 7, 6, 13],
[17, 9, 8, 15],
[20, 10, 10, 8],
[11, 7, 4, 6],
[22, 12, 10, 18],
[24, 16, 8, 16],
[14, 8, 6, 12]
])
Next, we assign this dataset to X and the class labels to Y. To work with the data, we use the pandas library to place it in a DataFrame and give each column a name:
import pandas as pd
df = pd.DataFrame({
'Daromad': X[:, 0],
'Xarajat': X[:, 1],
'Sof foyda': X[:, 2],
'Shartnoma': X[:, 3],
'Class': Y[:]
})
The result looks like this:
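The lines that define X and Y do not appear in the text. As a minimal sketch, assuming the rows before the # marker in dataset_2 belong to class 0 and the rows after it to class 1 (an assumption, the report does not state this), the DataFrame could be built as shown here. Only a six-row excerpt of the data is used to keep the example short.

```python
import numpy as np
import pandas as pd

# Excerpt of dataset_2: three rows from before the "#" marker, three from after it.
sample = np.array([
    [10, 4, 6, 6],
    [15, 7, 8, 10],
    [12, 5, 7, 9],
    [15, 10, 5, 10],
    [22, 15, 7, 15],
    [20, 12, 8, 10],
])

X = sample
# Assumption: rows before the "#" marker are class 0, rows after it are class 1.
Y = np.array([0, 0, 0, 1, 1, 1])

# Same column layout as in the report.
df = pd.DataFrame({
    'Daromad': X[:, 0],
    'Xarajat': X[:, 1],
    'Sof foyda': X[:, 2],
    'Shartnoma': X[:, 3],
    'Class': Y[:]
})

print(df.head())
```

With the full dataset_2, the same pattern would give 40 rows, 20 per assumed class.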
Now we can begin the main part of the problem. To see how the labels we assigned ourselves differ from the KMeans labels, we plot both side by side. For this we use the pyplot module of the Matplotlib library:
import matplotlib.pyplot as plt
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.title('Original')
plt.xlabel('Xarajat')
plt.ylabel('Shartnoma')
plt.scatter(*X_train[Y == 0].T, s=50, alpha=0.8, label='sinf-0')
plt.scatter(*X_train[Y == 1].T, s=50, alpha=0.8, label='sinf-1')
plt.legend()
plt.subplot(1, 2, 2)
plt.title('KMeans')
plt.xlabel('Xarajat')
plt.ylabel('Shartnoma')
plt.scatter(*X_train[Y_kmean == 0].T, s=50, alpha=0.8, label='Cluster-0')
plt.scatter(*X_train[Y_kmean == 1].T, s=50, alpha=0.8, label='Cluster-1')
plt.legend()
# Mark each cluster centre with a black dot
for i in center:
    plt.scatter(i[0], i[1], s=50, c='k', marker='o')
plt.show()
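The plotting code uses X_train, Y_kmean and center, but the lines that create them are not shown. A minimal sketch of how they might be obtained, assuming scikit-learn's KMeans with two clusters fitted on the Xarajat and Shartnoma columns. The cluster count, n_init, random_state and the eight-row excerpt are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Excerpt of dataset_2 (columns: Daromad, Xarajat, Sof foyda, Shartnoma).
sample = np.array([
    [10, 4, 6, 6],
    [15, 7, 8, 10],
    [20, 8, 12, 15],
    [6, 1, 5, 4],
    [15, 10, 5, 10],
    [22, 15, 7, 15],
    [8, 4, 4, 4],
    [24, 16, 8, 16],
])

# Keep only the two features shown on the plot axes: Xarajat and Shartnoma.
X_train = sample[:, [1, 3]]

# Assumed settings: two clusters, fixed seed for repeatable results.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
Y_kmean = kmeans.fit_predict(X_train)   # cluster index (0 or 1) for each row
center = kmeans.cluster_centers_        # one (Xarajat, Shartnoma) pair per cluster

print(Y_kmean)
print(center)
```

Each row of center is a coordinate pair, which is why the loop in the plotting code can read i[0] and i[1].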
Plot:
Here the left plot shows the labels we assigned, and the right one shows the KMeans result. The difference can be seen. Based on these two plots, we compute the confusion matrix and the accuracy:
from sklearn.metrics import confusion_matrix, accuracy_score
result = confusion_matrix(Y_train, Y_kmean)
print("Confusion Matrix:")
print(result)
result2 = accuracy_score(Y_train, Y_kmean)
print("Accuracy:", result2)
Result:
With 2 features we reached 57.5% accuracy. Now we repeat the calculation using all 4 features.
Here we take the data from the DataFrame (this also explains the small number of such points).
Confusion matrix and accuracy percentage:
Here the accuracy came out at 47.5%. Normally more features would be expected to help, but I could not find data that showed this. Still, I hope the algorithm itself is correct.
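One point worth checking when reading these figures: KMeans numbers its clusters arbitrarily, so cluster 0 need not correspond to class 0. A minimal sketch, assuming exactly two classes and two clusters, that scores both possible label mappings and keeps the better one. The helper name aligned_accuracy is hypothetical and the labels are toy values, not taken from the report.

```python
import numpy as np

def aligned_accuracy(y_true, y_pred):
    """Accuracy for two clusters, taking the better of the two label mappings."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    direct = np.mean(y_true == y_pred)        # cluster 0 -> class 0, cluster 1 -> class 1
    flipped = np.mean(y_true == 1 - y_pred)   # cluster 0 -> class 1, cluster 1 -> class 0
    return float(max(direct, flipped))

# Toy labels: the clusters are perfect but numbered the other way round.
y_true = [0, 0, 1, 1]
y_pred = [1, 1, 0, 0]
print(aligned_accuracy(y_true, y_pred))  # prints 1.0
```

Under this reading, a raw accuracy of 47.5% with two clusters corresponds to 52.5% once the cluster numbers are swapped.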