Image Recognition
Page 131/182 | Date: 19.05.2024 | Size: 5.69 MB | #244351
Related: Python sun'iy intellekt texnologiyasi, Darslik 2024
3. PySpark MLlib:
Using machine learning algorithms through PySpark MLlib:
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

# Create a Spark session
spark = SparkSession.builder.appName("example").getOrCreate()

# Load the data (the CSV is assumed to contain feature1, feature2 and label columns)
data = spark.read.csv("path/to/data.csv", header=True, inferSchema=True)

# Assemble the feature columns (X) into one vector column and keep the label (y)
feature_cols = ["feature1", "feature2"]
assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
data = assembler.transform(data).select("features", "label")

# Configure and train the model
lr = LinearRegression(featuresCol="features", labelCol="label")
model = lr.fit(data)

# Inspect the fitted parameters
print("Coefficients: ", model.coefficients)
print("Intercept: ", model.intercept)
These examples can help you get started with Big Data projects in Python. They may not suit every situation, however: a very large data volume, or a need for distributed computing technologies or particular ML algorithms, can call for a different setup. Study the examples that best match your own goals and system.
To integrate machine learning into Big Data projects with the PySpark library, proceed through the following steps:
1. Installing PySpark:
Before PySpark can be used, the PySpark library must be installed.
pip install pyspark
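After installation it can be useful to confirm that the package is actually visible to the interpreter in use. The following is a small sketch using only the standard library; the helper name installed_version is hypothetical. Note that pip provides only the Python package, and Spark additionally needs a Java runtime on the machine.

```python
import importlib.util
from importlib import metadata


def installed_version(package="pyspark"):
    """Return the installed version of `package`, or None if it cannot be imported."""
    if importlib.util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but not installed as a pip distribution
        # (for example, a copy shipped inside a Spark download)
        return "unknown"


print("pyspark:", installed_version() or "not installed")
```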
2. Creating a Spark session:
from pyspark.sql import SparkSession
# Create a Spark session
spark = SparkSession.builder.appName("example").getOrCreate()
3. Loading the data:
Data is loaded into a PySpark DataFrame. It can be read from CSV, Parquet, Avro, JSON, or other formats (Avro support comes from the separate spark-avro module).
# Load data from CSV
data = spark.read.csv("path/to/data.csv", header=True, inferSchema=True)