ML model for detecting cancer
Submitted by bery • October 13, 2023 • Assignment
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"mioti.png\" style=\"height: 100px\">\n",
"<center style=\"color:#888\">Data Science in IoT Module<br/>Machine Learning Course</center>\n",
"# Challenge S3: Cancer detection"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Objectives\n",
"\n",
"The objective of this challenge is to build a model capable of detecting cancer."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Environment setup"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"\n",
"import warnings\n",
"warnings.filterwarnings(\"ignore\")\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import sklearn\n",
"from sklearn.model_selection import cross_val_score\n",
"from sklearn.dummy import DummyClassifier\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.metrics import confusion_matrix\n",
"from sklearn.metrics import precision_score, recall_score\n",
"import scikitplot as skplt"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Dataset\n",
"\n",
"In this case we will use a real breast cancer analysis dataset from the Breast Cancer Center in Wisconsin. The dataset ships with `sklearn`, so loading it is as easy as:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from sklearn import datasets\n",
"\n",
"# Load the breast cancer dataset bundled with scikit-learn\n",
"dataset = datasets.load_breast_cancer()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As always, once the dataset is loaded we should inspect and understand it:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names'])\n"
]
}
],
"source": [
"print(dataset.keys())"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
...