Import TensorFlow

First we set up TensorFlow. In Colab you can install it with pip, but the magic command is faster. It is also accessible at this link.

%tensorflow_version 2.x
import tensorflow as tf
print("You are using TensorFlow version", tf.__version__)
if len(tf.config.list_physical_devices('GPU')) > 0:
  print("You have a GPU enabled.")
else:
  print("Enable a GPU before running this notebook.")

Colab has several types of GPU available (one is assigned at random, depending on availability). To see which GPU you were given, run !nvidia-smi in a cell.

# In this notebook, we'll use Keras: TensorFlow's user-friendly API to 
# define neural networks. Let's import Keras now.
from tensorflow import keras
import matplotlib.pyplot as plt

Download the MNIST dataset

MNIST contains 70,000 grayscale images in 10 categories. The resolution is low (28 x 28 pixels). It is always important to explore a dataset before using it.

dataset = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = dataset.load_data()

There are 60,000 images in the training set:

print(train_images.shape)

And 10,000 images in the test set:

print(test_images.shape)

Each label is an integer from 0 to 9:

print(train_labels)
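
As a further exploration step, it can help to check how many examples each digit has. The sketch below is illustrative only: count_labels is a hypothetical helper, and the label array is made up rather than taken from MNIST. In the notebook you would pass train_labels instead.

```python
import numpy as np

def count_labels(labels, num_classes=10):
    """Return how many times each class index appears in `labels`."""
    # np.bincount counts occurrences of each non-negative integer;
    # minlength guarantees one slot per class even if a class is absent.
    return np.bincount(labels, minlength=num_classes)

# Toy labels for illustration only (not real MNIST data).
toy_labels = np.array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4])
counts = count_labels(toy_labels)

for digit, n in enumerate(counts):
    print(f"digit {digit}: {n} example(s)")
```

A strongly unbalanced count would be a reason to look beyond plain accuracy when evaluating the model later.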

Preprocess the data

We scale the pixel values to the range 0 to 1. It is important to apply the same preprocessing to both the training set and the test set:

train_images = train_images / 255.0
test_images = test_images / 255.0
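
To see what the division does, here is a minimal NumPy sketch on a made-up array standing in for the raw images. It assumes the raw pixels are unsigned 8-bit integers in the range 0 to 255, which is how keras.datasets.mnist returns them.

```python
import numpy as np

# Stand-in for raw image data: uint8 values covering the full 0..255 range.
raw = np.array([[0, 128, 255]], dtype=np.uint8)

# Dividing by the float 255.0 promotes the array to floating point
# and maps 0 -> 0.0 and 255 -> 1.0.
scaled = raw / 255.0

print(raw.dtype, "->", scaled.dtype)
print(scaled.min(), scaled.max())
```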

Let's look at the first 25 training images with their labels:

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(train_labels[i])
plt.show()

Build the layers

Neural networks are made up of layers. Here, you'll define the layers, and assemble them into a model. We will start with a single Dense layer.

What does a layer do?

The basic building block of a neural network is the layer. Layers extract representations from the data fed into them. For example:

The first layer in a network might receive the pixel values as input. From these, it learns to detect edges (combinations of pixels).
The next layer in the network receives edges as input, and may learn to detect lines (combinations of edges).
If you added another layer, it might learn to detect shapes (combinations of lines).

The "Deep" in "Deep Learning" refers to the depth of the network. Deeper networks can learn increasingly abstract patterns. Roughly, the width of a layer (in terms of the number of neurons) refers to the number of patterns it can learn of each type.

Most of deep learning consists of chaining together simple layers. Most layers, such as tf.keras.layers.Dense, have parameters that are initialized randomly, then tuned (or learned) during training by gradient descent.

# A linear model
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(10, activation='softmax')
])

The first layer, tf.keras.layers.Flatten, transforms the format of the images from a 2D array (of 28 x 28 pixels) to a one-dimensional array (of 28 * 28 = 784 pixels). Think of it as unstacking the rows of the image and lining them up. This layer has no parameters to learn. It is needed because Dense layers expect a one-dimensional vector per example as input.
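
The same reshaping can be sketched in plain NumPy. This is only an illustration of the resulting shapes and pixel order on made-up arrays, not the Keras implementation.

```python
import numpy as np

# A fake batch of 3 "images", each 28 x 28 (zeros stand in for pixel data).
batch = np.zeros((3, 28, 28))

# Keep the batch dimension and merge the two spatial dimensions into one.
flat = batch.reshape(batch.shape[0], -1)
print(batch.shape, "->", flat.shape)

# Rows are laid end to end: the pixel at (row, col) lands at index row * 28 + col.
image = np.arange(28 * 28).reshape(28, 28)
row, col = 2, 5
print(image.reshape(-1)[row * 28 + col] == image[row, col])
```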

After the image is flattened, the model has a single Dense (fully connected) layer. It has 10 units with a softmax activation, which returns an array of 10 probability scores that sum to 1.

After an image is classified, each unit holds a score giving the probability that the image belongs to one of the 10 classes.
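
A rough NumPy sketch of what a 10-unit softmax layer computes is shown below. The weights W, biases b, and inputs x are random stand-ins chosen for illustration, not values from the trained model, and softmax is written by hand here rather than taken from Keras.

```python
import numpy as np

def softmax(logits):
    """Convert rows of raw scores into rows of probabilities."""
    # Subtracting the row maximum does not change the result but avoids overflow.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.random((2, 784))                    # two flattened "images" (random stand-ins)
W = rng.normal(scale=0.01, size=(784, 10))  # hypothetical weights, one column per class
b = np.zeros(10)                            # hypothetical biases, one per class

probs = softmax(x @ W + b)
print(probs.shape)        # (2, 10)
print(probs.sum(axis=1))  # each row sums to 1, up to rounding
print(W.size + b.size)    # trainable parameters: 784 * 10 + 10 = 7850
```

The parameter count printed at the end should match the total that model.summary() reports for the linear model above.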

Compile the model

Before the model is ready for training, it needs a few more settings. These are added during the model's compile step:

Loss function: This measures how far the model's predictions are from the true labels during training. You want to minimize this function to "steer" the model in the right direction.

Optimizer: This is how the model's parameters are updated based on the data it sees and its loss function.

Metrics: These are used to monitor the training and testing steps. The following example uses accuracy, the fraction of the images that are correctly classified.

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
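
To make the loss and the metric concrete, here is a small NumPy sketch of both. The two functions are hypothetical helpers written for illustration and the probabilities are made up. Keras computes these quantities internally, and its exact numerics may differ slightly.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    """Mean negative log-probability assigned to the correct class."""
    # For each row, pick the probability the model gave to the true label.
    picked = probs[np.arange(len(labels)), labels]
    # Clip to avoid taking log(0).
    return float(np.mean(-np.log(np.clip(picked, eps, 1.0))))

def accuracy(labels, probs):
    """Fraction of rows whose most probable class equals the label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))

# Toy predictions over 3 classes (illustrative values, not model output).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])
labels = np.array([0, 1, 2])  # "sparse" labels: integer class indices, not one-hot

loss = sparse_categorical_crossentropy(labels, probs)
acc = accuracy(labels, probs)
print(f"loss={loss:.4f} accuracy={acc:.4f}")
```

The "sparse" variant is the natural choice here because the MNIST labels are plain integers from 0 to 9 rather than one-hot vectors.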

Train the model

Training the neural network model requires the following steps:

Feed the training data to the model. In this example, the training data is in the train_images and train_labels arrays.
The model learns to associate images and labels.
You ask the model to make predictions about a test set—in this example, the test_images array.
Verify that the predictions match the labels from the test_labels array.

To begin training, call the model.fit method — so called because it "fits" the model to the training data:

EPOCHS=10
model.fit(train_images, train_labels, epochs=EPOCHS)

As the model trains, the loss and accuracy metrics are displayed. This model reaches an accuracy of about 0.90 (or 90%) on the training data. Accuracy may be slightly different each time you run this code, since the parameters inside the Dense layer are randomly initialized.
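
For intuition about what fitting involves, the sketch below trains a softmax linear model with plain full-batch gradient descent in NumPy. This is a deliberately simplified stand-in: model.fit as configured above uses the Adam optimizer on mini-batches, and the data here is synthetic rather than MNIST. All names and hyperparameters are assumptions made for the example.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_loss(W, b, x, y):
    """Mean cross-entropy of a linear softmax model on (x, y)."""
    p = softmax(x @ W + b)
    return float(-np.log(p[np.arange(len(y)), y] + 1e-12).mean())

rng = np.random.default_rng(0)
num_features, num_classes, n = 20, 3, 300

# Synthetic data: labels follow a random linear rule, so a linear model can fit them.
x = rng.normal(size=(n, num_features))
true_W = rng.normal(size=(num_features, num_classes))
y = np.argmax(x @ true_W, axis=1)

# Random initialization, as in a freshly created Dense layer.
W = rng.normal(scale=0.01, size=(num_features, num_classes))
b = np.zeros(num_classes)
learning_rate = 0.5  # illustrative value

initial_loss = mean_loss(W, b, x, y)
for epoch in range(50):
    p = softmax(x @ W + b)
    # Gradient of the mean cross-entropy w.r.t. the logits: (p - one_hot(y)) / n
    grad_logits = p.copy()
    grad_logits[np.arange(n), y] -= 1.0
    grad_logits /= n
    # Step against the gradient to reduce the loss.
    W -= learning_rate * (x.T @ grad_logits)
    b -= learning_rate * grad_logits.sum(axis=0)
final_loss = mean_loss(W, b, x, y)

print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

Because W starts from random values, the exact numbers depend on the seed, which mirrors why accuracy varies slightly between runs of the notebook.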

Accuracy

Next, compare how the model performs on the test dataset:

test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy:', test_acc)

It turns out that the accuracy on the test dataset is a little less than the accuracy on the training dataset. This gap between training accuracy and test accuracy represents overfitting. Overfitting is when a machine learning model performs worse on new, previously unseen inputs than on the training data. An overfitted model "memorizes" the training data—with less accuracy on testing data.

Make a prediction

With the model trained, we can use it to make predictions on new images:

predictions = model.predict(test_images)

Here, the model has predicted the label for each image in the testing set. Let's take a look at the first prediction:

print(predictions[0])

A prediction is an array of 10 numbers. They represent the model's "confidence" that the image corresponds to each of the 10 digits. You can see which label has the highest confidence value:

print(tf.argmax(predictions[0]))
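
The same idea extends to the whole prediction array. The sketch below uses NumPy on made-up values: toy_predictions and toy_labels are illustrative stand-ins, and in the notebook you would use predictions and test_labels instead.

```python
import numpy as np

# Toy stand-in for `predictions`: 3 rows of 10 class probabilities each.
toy_predictions = np.full((3, 10), 0.05)
toy_predictions[0, 7] = 0.55
toy_predictions[1, 2] = 0.55
toy_predictions[2, 1] = 0.55
toy_labels = np.array([7, 2, 0])  # illustrative ground truth

# axis=1 takes the argmax within each row, giving one predicted digit per image.
predicted = np.argmax(toy_predictions, axis=1)

# Indices of the examples the "model" got wrong, useful for inspecting errors.
wrong = np.flatnonzero(predicted != toy_labels)

print(predicted)
print(wrong)
```

Plotting the images at the indices in wrong is a quick way to see which digits the model confuses.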