



    Date: 07.12.2023

    6 Build frequency and likelihood tables for the data.
    import numpy as np
    import pandas as pd
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import MultinomialNB

    # X (texts) and y (labels: 0 = negative, 1 = positive) are assumed to be defined in the earlier steps
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Convert the texts to bag-of-words count vectors
    vectorizer = CountVectorizer()
    X_train_bow = vectorizer.fit_transform(X_train)
    X_test_bow = vectorizer.transform(X_test)

    classifier = MultinomialNB()
    classifier.fit(X_train_bow, y_train)
    features = vectorizer.get_feature_names_out()

    # feature_log_prob_ stores natural logarithms, so np.exp (not 2 ** x) recovers P(word | class)
    likelihood_table = pd.DataFrame(data={"So'zlar": features, 'Ijobiy (Positive) Likelihood': np.exp(classifier.feature_log_prob_[1]), 'Salbiy (Negative) Likelihood': np.exp(classifier.feature_log_prob_[0])})

    # Per-class word counts; a plain NumPy mask avoids index-alignment issues with the sparse matrix
    y_train_arr = np.asarray(y_train)
    frequency_table = pd.DataFrame(data={"So'zlar": features, 'Ijobiy (Positive) Frequency': np.asarray(X_train_bow[y_train_arr == 1].sum(axis=0)).ravel(), 'Salbiy (Negative) Frequency': np.asarray(X_train_bow[y_train_arr == 0].sum(axis=0)).ravel()})

    # Add-one (Laplace) smoothing, so that words unseen in a class do not get a zero count
    frequency_table['Ijobiy (Positive) Frequency'] = frequency_table['Ijobiy (Positive) Frequency'] + 1
    frequency_table['Salbiy (Negative) Frequency'] = frequency_table['Salbiy (Negative) Frequency'] + 1
    likelihood_table.to_csv('likelihood_table.csv', index=False)
    frequency_table.to_csv('frequency_table.csv', index=False)
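    The link between the two tables can be checked on a small example. The sketch below uses a hypothetical four-sentence corpus (the texts, labels, and variable names are illustrative assumptions, not the assignment's data) and confirms that add-one smoothed counts, once normalized, agree with the exponentiated feature_log_prob_ of MultinomialNB at its default alpha=1.0.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus; labels: 1 = positive, 0 = negative
texts = ["good great film", "great acting", "bad boring film", "boring plot"]
labels = np.array([1, 1, 0, 0])

vectorizer = CountVectorizer()
bow = vectorizer.fit_transform(texts)
model = MultinomialNB(alpha=1.0).fit(bow, labels)

# Raw per-class word counts (the frequency table before smoothing)
pos_counts = np.asarray(bow[labels == 1].sum(axis=0)).ravel()
neg_counts = np.asarray(bow[labels == 0].sum(axis=0)).ravel()

# Add-one smoothing: (count + 1) / (total count in class + vocabulary size)
vocab_size = bow.shape[1]
pos_likelihood = (pos_counts + 1) / (pos_counts.sum() + vocab_size)
neg_likelihood = (neg_counts + 1) / (neg_counts.sum() + vocab_size)

# Compare with the likelihoods learned by the model (stored as natural logs)
likelihoods_match = (
    np.allclose(pos_likelihood, np.exp(model.feature_log_prob_[1]))
    and np.allclose(neg_likelihood, np.exp(model.feature_log_prob_[0]))
)
print("Manual likelihoods match the model:", likelihoods_match)
```

    Each likelihood column sums to 1 over the vocabulary, which is a quick sanity check for a table built this way.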
    7 Using Bayes' theorem, compute the probabilities that words belong to the positive or negative class.
    # The counts in frequency_table already include the add-one (Laplace) smoothing from step 6,
    # so normalizing them gives P(word | class) without adding 1 a second time
    jobiy_likelihoods = frequency_table['Ijobiy (Positive) Frequency'] / frequency_table['Ijobiy (Positive) Frequency'].sum()
    salbiy_likelihoods = frequency_table['Salbiy (Negative) Frequency'] / frequency_table['Salbiy (Negative) Frequency'].sum()
    # Class priors; y_train.mean() equals P(positive) only when the labels are 0/1
    jobiy_prior = y_train.mean()
    salbiy_prior = 1 - jobiy_prior
    # Evidence P(word), then Bayes' theorem: P(class | word) = P(word | class) * P(class) / P(word)
    total_word_likelihood = jobiy_likelihoods * jobiy_prior + salbiy_likelihoods * salbiy_prior
    jobiy_posterior = (jobiy_likelihoods * jobiy_prior) / total_word_likelihood
    salbiy_posterior = (salbiy_likelihoods * salbiy_prior) / total_word_likelihood
    result_table = pd.DataFrame(data={"So'zlar": features, 'Ijobiy (Positive) Posterior': jobiy_posterior, 'Salbiy (Negative) Posterior': salbiy_posterior})
    result_table.to_csv('result_table.csv', index=False)
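    For reference, the quantities computed in this step can be written out as formulas. The notation is an assumption introduced here for illustration: N_{w,c} is the count of word w in class c, V is the vocabulary, and + and - denote the positive and negative classes.

```latex
% Smoothed likelihood of word w in class c (add-one smoothing)
P(w \mid c) = \frac{N_{w,c} + 1}{\sum_{w' \in V} N_{w',c} + |V|}

% Posterior of class c given word w (Bayes' theorem)
P(c \mid w) = \frac{P(w \mid c)\, P(c)}{P(w \mid +)\, P(+) + P(w \mid -)\, P(-)}
```

    As a hypothetical numeric check: with P(+) = P(-) = 0.5, P(w | +) = 0.3 and P(w | -) = 0.1, the positive posterior is 0.15 / (0.15 + 0.05) = 0.75, and the negative posterior is 0.25.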
    8 Evaluate the model's classification accuracy on the test set.
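    from sklearn.metrics import accuracy_score
    # Same transform as in step 6, repeated so that this step can be run on its own
    X_test_bow = vectorizer.transform(X_test)
    # Predict a class for every test document and compare with the true labels
    y_pred = classifier.predict(X_test_bow)
    accuracy = accuracy_score(y_test, y_pred)
    print("Accuracy:", accuracy)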
    The code converts the test data to bag-of-words (BoW) format, runs the model on it, and compares the predicted classes with the true ones. It then computes the accuracy with the accuracy_score metric and prints it. In other words, it reports the fraction of test documents that the model classifies correctly.
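    To make the metric concrete, the following minimal sketch computes accuracy by hand and compares it with accuracy_score. The two label arrays are hypothetical values chosen for illustration, not results from the model above.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical true and predicted labels (1 = positive, 0 = negative)
y_true = np.array([1, 0, 1, 1, 0])
y_hat = np.array([1, 0, 0, 1, 0])

# Accuracy is the fraction of predictions that equal the true label
manual_accuracy = float(np.mean(y_true == y_hat))
library_accuracy = accuracy_score(y_true, y_hat)

print("Manual accuracy:", manual_accuracy)
print("accuracy_score:", library_accuracy)
```

    Four of the five predictions agree with the true labels, so both values are 0.8.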