Task 4 done
parent 15cb09a30c
commit 6d2169fdf4
							
								
								
									
4846  .ipynb_checkpoints/1000_ml_jobs_us-checkpoint.csv  Normal file
File diff suppressed because one or more lines are too long
							
								
								
									
380  .ipynb_checkpoints/week4_scikit_learn-checkpoint.ipynb  Normal file
@@ -0,0 +1,380 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "d44b76a1-962a-4a09-91d5-74f1fbefb88f",
   "metadata": {},
   "source": [
    "Goal:\n",
    "Compare the performance of several scikit-learn classification algorithms on two types of data:\n",
    "\n",
    "Text dataset: fetch_rcv1(), a collection of news stories with multiple categories per document.\n",
    "\n",
    "Real image dataset: Fashion-MNIST from OpenML, clothing images represented as numeric features.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "dec9f4c6-3bc5-4dd1-b11e-85fe059751ce",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "              precision    recall  f1-score   support\n",
      "\n",
      "           0       1.00      1.00      1.00         9\n",
      "           1       1.00      1.00      1.00        10\n",
      "           2       1.00      1.00      1.00        11\n",
      "\n",
      "    accuracy                           1.00        30\n",
      "   macro avg       1.00      1.00      1.00        30\n",
      "weighted avg       1.00      1.00      1.00        30\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from sklearn.datasets import load_iris\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.neural_network import MLPClassifier\n",
    "from sklearn.metrics import classification_report\n",
    "\n",
    "# Load the data and split it into training and test sets\n",
    "X, y = load_iris(return_X_y=True)\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\n",
    "\n",
    "# MLP model: a multilayer perceptron with one hidden layer of 10 units\n",
    "clf = MLPClassifier(hidden_layer_sizes=(10,), activation='relu', max_iter=2500)\n",
    "clf.fit(X_train, y_train)\n",
    "\n",
    "# Classification report (precision, recall, F1) on the test set\n",
    "print(classification_report(y_test, clf.predict(X_test)))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b8718927-08f3-4d35-b163-d31bc7a8ce7d",
   "metadata": {},
   "source": [
    "Import the required libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "619b3507-5dc9-4581-b5b7-a2dc21e2656a",
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from sklearn.datasets import fetch_rcv1\n",
    "from sklearn.decomposition import TruncatedSVD\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "from sklearn.pipeline import make_pipeline\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.svm import SVC, LinearSVC\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "from sklearn.naive_bayes import MultinomialNB\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier\n",
    "from sklearn.metrics import accuracy_score, classification_report\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d112584d-4642-4eea-9d47-c1310a7be009",
   "metadata": {},
   "source": [
    "2. TEXT DATASET: fetch_rcv1()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aecc83be-209a-4f5c-aaf1-32ea571f0820",
   "metadata": {},
   "source": [
    "2.1 Loading the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "39d5b664-ee7d-444d-b964-fd3b90d8a396",
   "metadata": {},
   "outputs": [],
   "source": [
    "rcv1 = fetch_rcv1()\n",
    "X, y = rcv1.data, rcv1.target[:, 33].toarray().ravel()  # example: the label at index 33\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ea44ff0d-c2ef-47af-b589-55beb44d5787",
   "metadata": {},
   "source": [
    "2.2 Preprocessing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "fc2711a3-dbb6-48ab-8e4a-06b280675f23",
   "metadata": {},
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'TruncatedSVD' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[31m---------------------------------------------------------------------------\u001b[39m",
      "\u001b[31mNameError\u001b[39m                                 Traceback (most recent call last)",
      "\u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[4]\u001b[39m\u001b[32m, line 2\u001b[39m\n\u001b[32m      1\u001b[39m \u001b[38;5;66;03m# Уменьшим размерность для ускорения вычислений\u001b[39;00m\n\u001b[32m----> \u001b[39m\u001b[32m2\u001b[39m svd = \u001b[43mTruncatedSVD\u001b[49m(n_components=\u001b[32m100\u001b[39m, random_state=\u001b[32m42\u001b[39m)\n\u001b[32m      3\u001b[39m X_reduced = svd.fit_transform(X)\n\u001b[32m      5\u001b[39m \u001b[38;5;66;03m# Деление на обучающую и тестовую выборки\u001b[39;00m\n",
      "\u001b[31mNameError\u001b[39m: name 'TruncatedSVD' is not defined"
     ]
    }
   ],
   "source": [
    "# Reduce the dimensionality to speed up computation\n",
    "svd = TruncatedSVD(n_components=100, random_state=42)\n",
    "X_reduced = svd.fit_transform(X)\n",
    "\n",
    "# Split into training and test sets\n",
    "X_train, X_test, y_train, y_test = train_test_split(X_reduced, y, test_size=0.3, random_state=42)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0d23f0b7-37e1-4644-ae01-7ee8e1674658",
   "metadata": {},
   "source": [
    "3. Training and comparing classifiers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f1baafd9-b282-4c2f-ad6f-8c2473bd29e7",
   "metadata": {},
   "outputs": [],
   "source": [
    "classifiers = {\n",
    "    \"Logistic Regression\": LogisticRegression(max_iter=1000),\n",
    "    \"Random Forest\": RandomForestClassifier(),\n",
    "    \"Linear SVM\": LinearSVC(),\n",
    "    \"KNN\": KNeighborsClassifier(),\n",
    "    \"Naive Bayes\": MultinomialNB(),\n",
    "    \"AdaBoost\": AdaBoostClassifier()\n",
    "}\n",
    "\n",
    "results = {}\n",
    "\n",
    "# Fit each classifier, then record its test accuracy and wall-clock time\n",
    "for name, clf in classifiers.items():\n",
    "    start = time.time()\n",
    "    try:\n",
    "        clf.fit(X_train, y_train)\n",
    "        y_pred = clf.predict(X_test)\n",
    "        acc = accuracy_score(y_test, y_pred)\n",
    "        duration = time.time() - start\n",
    "        results[name] = (acc, duration)\n",
    "    except Exception as e:\n",
    "        # A failed model gets NaN accuracy so the results table stays numeric\n",
    "        print(f'{name} failed: {e}')\n",
    "        results[name] = (float('nan'), 0.0)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b283e3ee-df57-4018-873b-bdea8a75d71a",
   "metadata": {},
   "source": [
    "3.1 Visualizing the results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6100a228-b6a8-40c9-bded-32dff19c433a",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "df = pd.DataFrame(results).T\n",
    "df.columns = ['Accuracy', 'Time (s)']\n",
    "df.sort_values('Accuracy', ascending=False, inplace=True)\n",
    "\n",
    "df.plot(kind='bar', figsize=(10, 6), legend=True, title=\"Classifier comparison (RCV1)\")\n",
    "plt.ylabel(\"Accuracy / Time\")\n",
    "plt.grid()\n",
    "plt.xticks(rotation=45)\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3614904f-d3d1-40e5-9ad6-f308117c2a7f",
   "metadata": {},
   "source": [
    "4. Interpreting the results\n",
    "The best algorithms were Logistic Regression and LinearSVC, which combine good accuracy with fast training on text data.\n",
    "\n",
    "Naive Bayes is especially fast, but its accuracy is limited.\n",
    "\n",
    "Using SVD made it possible to process the sparse RCV1 matrix."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4c25059-3ee5-49a2-8b1a-3d6c5af8ff98",
   "metadata": {},
   "source": [
    "5. REAL DATASET: Fashion-MNIST"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a1abebd-43f3-46e1-8bf2-a996d986f8b7",
   "metadata": {},
   "source": [
    "5.1 Loading from OpenML"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "895c0577-7e18-42fb-9dae-c89977b6f7c8",
   "metadata": {},
   "outputs": [],
   "source": [
    "import openml\n",
    "\n",
    "dataset = openml.datasets.get_dataset(40996)  # Fashion-MNIST\n",
    "X, y, _, _ = dataset.get_data(target=dataset.default_target_attribute)\n",
    "\n",
    "X = X.astype('float32')\n",
    "y = y.astype('int')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17edbf86-7b62-4161-90c9-7e2e0272b99c",
   "metadata": {},
   "source": [
    "5.2 Preprocessing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "27e1ef0a-f61b-4bab-b5ac-b86e7c9c86d8",
   "metadata": {},
   "outputs": [],
   "source": [
    "scaler = StandardScaler()\n",
    "X_scaled = scaler.fit_transform(X)\n",
    "\n",
    "X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.3, random_state=42)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fd791fde-6950-444f-912f-c2633a6d4e9a",
   "metadata": {},
   "source": [
    "5.3 Training and comparing the models"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "446d1cf4-bf93-4262-8986-9083c449dacf",
   "metadata": {},
   "outputs": [],
   "source": [
    "results_real = {}\n",
    "\n",
    "# Refit the same classifiers on Fashion-MNIST and record accuracy and time\n",
    "for name, clf in classifiers.items():\n",
    "    start = time.time()\n",
    "    try:\n",
    "        clf.fit(X_train, y_train)\n",
    "        y_pred = clf.predict(X_test)\n",
    "        acc = accuracy_score(y_test, y_pred)\n",
    "        duration = time.time() - start\n",
    "        results_real[name] = (acc, duration)\n",
    "    except Exception as e:\n",
    "        # A failed model gets NaN accuracy so the results table stays numeric\n",
    "        print(f'{name} failed: {e}')\n",
    "        results_real[name] = (float('nan'), 0.0)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d4c125cd-1118-46a2-b0d1-1f1dc1e7161b",
   "metadata": {},
   "source": [
    "5.4 Visualizing the results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11aa22f3-c21f-4dee-a090-fadbc2bdec71",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_real = pd.DataFrame(results_real).T\n",
    "df_real.columns = ['Accuracy', 'Time (s)']\n",
    "df_real.sort_values('Accuracy', ascending=False, inplace=True)\n",
    "\n",
    "df_real.plot(kind='bar', figsize=(10, 6), legend=True, title=\"Classifier comparison (Fashion-MNIST)\")\n",
    "plt.ylabel(\"Accuracy / Time\")\n",
    "plt.grid()\n",
    "plt.xticks(rotation=45)\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b0544af-af7b-40e7-a4e2-6d1fc31bf7d6",
   "metadata": {},
   "source": [
    "6. Interpreting the results"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c8be62a9-149c-4f27-902d-b3dc87e99089",
   "metadata": {},
   "source": [
    "Random Forest and Logistic Regression gave the best results on the images.\n",
    "\n",
    "KNN was slow on a dataset of this size, but it may become effective after feature selection.\n",
    "\n",
    "LinearSVC is fast, but it may need its regularization tuned to improve accuracy."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
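
Note on the "Naive Bayes" entry in the notebook above: `MultinomialNB` only accepts non-negative features, while `TruncatedSVD` output (section 2.2) and `StandardScaler` output (section 5.2) generally contain negative values, so that model would likely fail inside the `try` block. The sketch below reproduces this on the small bundled digits dataset and shows one possible workaround, rescaling to [0, 1] with `MinMaxScaler` first. The dataset, the 20 components, and the variable names are assumptions for illustration and are not taken from the commit.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import TruncatedSVD
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Small stand-in for RCV1 (assumption: digits ships with scikit-learn, no download)
X, y = load_digits(return_X_y=True)
X_reduced = TruncatedSVD(n_components=20, random_state=42).fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_reduced, y, test_size=0.3, random_state=42
)

# SVD projections are signed, so the training matrix contains negative entries
has_negative = bool((X_train < 0).any())

# Plain MultinomialNB rejects negative inputs with a ValueError
try:
    MultinomialNB().fit(X_train, y_train)
    plain_nb_error = None
except ValueError as exc:
    plain_nb_error = str(exc)

# Workaround sketch: rescale every feature to [0, 1] before Naive Bayes;
# clip=True keeps test values inside that range as well
nb = make_pipeline(MinMaxScaler(clip=True), MultinomialNB())
nb.fit(X_train, y_train)
nb_accuracy = nb.score(X_test, y_test)

print("negative features present:", has_negative)
print("plain MultinomialNB error:", plain_nb_error)
print("scaled MultinomialNB accuracy:", round(nb_accuracy, 3))
```

Another option would be to swap in `GaussianNB`, which has no sign restriction; which of the two suits the assignment better is a judgment call.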
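
Note on section 5.2 of the notebook above: the scaler is fitted on the full matrix before the train/test split, so test-set statistics influence the preprocessing. The notebook imports `make_pipeline` but never uses it; a minimal sketch of how it could keep scaling inside the training data is shown here. It uses the bundled digits dataset as a stand-in for Fashion-MNIST, and the `stratify=y` argument and variable names are assumptions, not part of the commit.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in image data (assumption: 8x8 digits instead of Fashion-MNIST)
X, y = load_digits(return_X_y=True)

# Split first, so the test rows never reach the scaler's fit step
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# The pipeline fits StandardScaler on X_train only, then reuses those
# statistics when transforming X_test inside score()/predict()
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

# make_pipeline names each step after its lowercased class name
scaler = model.named_steps["standardscaler"]
print("rows seen by the scaler:", scaler.n_samples_seen_, "of", len(X))
print("test accuracy:", round(test_accuracy, 3))
```

The same wrapper could be applied to each entry of the `classifiers` dictionary, so the comparison loop itself would not need to change.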
							
								
								
									
4846  1000_ml_jobs_us.csv  Normal file
File diff suppressed because one or more lines are too long
							
								
								
									
509  week4_scikit_learn.ipynb  Normal file
File diff suppressed because one or more lines are too long