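Experiments with the ResNet50 sample source code from Apple's website, https://developer.apple.com/metal/tensorflow-plugin/.
As the name suggests, the network is deep (50 layers), so the difference between CPU and GPU may show up clearly.
Time per epoch, plus screenshots of CPU and GPU utilization.
TensorFlow version: tensorflow 2.10.1 on Windows native; 2.12 or 2.13 everywhere else.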
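Since the results depend on which TensorFlow build is installed, it can help to record the versions next to each log. The following is a minimal sketch using only the standard library; the package names `tensorflow-macos` and `tensorflow-metal` are assumptions about how TensorFlow was installed on the Macs, and `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Assumed package names; adjust to whatever was actually installed.
for name in ("tensorflow", "tensorflow-macos", "tensorflow-metal"):
    print(f"{name}: {installed_version(name)}")
```

Once TensorFlow itself imports, `tf.config.list_physical_devices("GPU")` shows whether a GPU is visible to it.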
import tensorflow as tf

# CIFAR-100: 32x32 RGB images, 100 classes
cifar = tf.keras.datasets.cifar100
(x_train, y_train), (x_test, y_test) = cifar.load_data()

# ResNet50 trained from scratch (no pretrained weights)
model = tf.keras.applications.ResNet50(
    include_top=True,
    weights=None,
    input_shape=(32, 32, 3),
    classes=100,
)

# The model's top layer applies softmax by default, hence from_logits=False
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=64)
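As a quick sanity check on the numbers, the `782/782` step count in the training logs and the relation between ms/step and seconds per epoch can be reproduced with a little arithmetic. This is a rough sketch: the 50,000-image size of the CIFAR-100 training set is a property of the dataset rather than something stated in this post, and `epoch_seconds` is a hypothetical helper.

```python
import math

num_train = 50_000   # CIFAR-100 training images (dataset fact, not from the post)
batch_size = 64      # as passed to model.fit

# Keras runs a final partial batch, so the step count is rounded up.
steps_per_epoch = math.ceil(num_train / batch_size)
last_batch = num_train - (steps_per_epoch - 1) * batch_size


def epoch_seconds(ms_per_step, steps=steps_per_epoch):
    """Estimate the wall time of one epoch from the reported ms/step."""
    return steps * ms_per_step / 1000.0


print(steps_per_epoch)                 # number of steps per epoch
print(last_batch)                      # samples in the final, partial batch
print(round(epoch_seconds(69), 1))     # estimate for a 69 ms/step run
```

A steady-state run at 69 ms/step therefore works out to roughly 54 s per epoch, which is consistent with the later epochs in the logs; the first epoch is longer because it includes start-up work.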
Created device /job:localhost/replica:0/task:0/device:GPU:0 with 6666 MB memory: -> device: 0, name: NVIDIA GeForce GTX 980, pci bus id: 0000:01:00.0, compute capability: 5.2
Epoch 1/5
Loaded cuDNN version 8801
StreamExecutor device (0): NVIDIA GeForce GTX 980, Compute Capability 5.2
disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
782/782 [==============================] - 83s 69ms/step - loss: 4.8694 - accuracy: 0.0718
Epoch 2/5
782/782 [==============================] - 53s 67ms/step - loss: 4.0766 - accuracy: 0.1277
Epoch 3/5
782/782 [==============================] - 53s 68ms/step - loss: 3.7435 - accuracy: 0.1677
Epoch 4/5
782/782 [==============================] - 54s 69ms/step - loss: 3.8044 - accuracy: 0.1817
Epoch 5/5
782/782 [==============================] - 54s 69ms/step - loss: 3.4434 - accuracy: 0.2208

uya/ResNet50.py
TF-TRT Warning: Could not find TensorRT
GPU:0 with 10600 MB memory: NVIDIA TITAN X (Pascal), compute capability: 6.1
Epoch 1/5
Loaded cuDNN version 8801
StreamExecutor device (0): NVIDIA TITAN X (Pascal), Compute Capability 6.1
782/782 [==============================] - 92s 72ms/step - loss: 4.7130 - accuracy: 0.0721
Epoch 2/5
782/782 [==============================] - 54s 70ms/step - loss: 4.0493 - accuracy: 0.1363
Epoch 3/5
782/782 [==============================] - 54s 69ms/step - loss: 4.1746 - accuracy: 0.1366
Epoch 4/5
782/782 [==============================] - 54s 70ms/step - loss: 3.9994 - accuracy: 0.1588
Epoch 5/5
782/782 [==============================] - 54s 69ms/step - loss: 3.9037 - accuracy: 0.1532

/device:GPU:0 with 9826 MB memory: device: 0, name: NVIDIA TITAN V, compute capability: 7.0
Epoch 1/5
Loaded cuDNN version 8800
782/782 [==============================] - 65s 66ms/step - loss: 4.8289 - accuracy: 0.0671
Epoch 2/5
782/782 [==============================] - 53s 67ms/step - loss: 4.3591 - accuracy: 0.0976
Epoch 3/5
782/782 [==============================] - 54s 69ms/step - loss: 4.1728 - accuracy: 0.1067
Epoch 4/5
782/782 [==============================] - 52s 67ms/step - loss: 3.7613 - accuracy: 0.1540
Epoch 5/5
782/782 [==============================] - 55s 70ms/step - loss: 3.8035 - accuracy: 0.1512

/device:GPU:0 with 9764 MB memory: device: 0, name: NVIDIA TITAN V, compute capability: 7.0
Epoch 1/5
Loaded cuDNN version 8801
StreamExecutor device (0): NVIDIA TITAN V, Compute Capability 7.0
782/782 [==============================] - 91s 67ms/step - loss: 4.6153 - accuracy: 0.0900
Epoch 2/5
782/782 [==============================] - 51s 66ms/step - loss: 4.5570 - accuracy: 0.0846
Epoch 3/5
782/782 [==============================] - 50s 64ms/step - loss: 4.0112 - accuracy: 0.1204
Epoch 4/5
782/782 [==============================] - 50s 64ms/step - loss: 3.6474 - accuracy: 0.1689
Epoch 5/5
782/782 [==============================] - 50s 64ms/step - loss: 4.0726 - accuracy: 0.1295
Perhaps slightly faster than Windows native. The TensorFlow version is also slightly newer.
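To put a rough number on that impression, the sketch below averages epochs 2 to 5 of the two TITAN V logs above. It assumes that the first TITAN V log is the Windows native run and the second is the run with the newer TensorFlow, which the logs themselves do not state. The first epoch is left out because it includes start-up overhead.

```python
from statistics import mean

# Epoch times in seconds for epochs 2-5, copied from the two TITAN V logs.
# Assumption: first_run is Windows native, second_run uses the newer TensorFlow.
first_run = [53, 54, 52, 55]
second_run = [51, 50, 50, 50]

ratio = mean(first_run) / mean(second_run)

print(f"first run:  {mean(first_run):.2f} s/epoch")
print(f"second run: {mean(second_run):.2f} s/epoch")
print(f"ratio:      {ratio:.3f}")
```

With only four epochs per run and whole-second resolution, a difference of this size is suggestive at best.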
Even a laptop GPU manages this much. Impressive.
GPU:0 with 1611 MB memory: device: 0, name: NVIDIA GeForce RTX 3050 Laptop GPU, compute capability: 8.6
Epoch 1/5
Loaded cuDNN version 8801
TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
StreamExecutor device (0): NVIDIA GeForce RTX 3050 Laptop GPU, Compute Capability 8.6
782/782 [==============================] - 75s 58ms/step - loss: 4.9869 - accuracy: 0.0536
Epoch 2/5
782/782 [==============================] - 42s 54ms/step - loss: 4.5081 - accuracy: 0.0774
Epoch 3/5
782/782 [==============================] - 42s 54ms/step - loss: 4.5099 - accuracy: 0.0677
Epoch 4/5
782/782 [==============================] - 42s 54ms/step - loss: 4.3222 - accuracy: 0.0631
Epoch 5/5
782/782 [==============================] - 42s 54ms/step - loss: 4.5077 - accuracy: 0.0550

Metal device set to: Apple M1 Max
systemMemory: 64.00 GB
maxCacheSize: 24.00 GB
Epoch 1/5
Plugin optimizer for device_type GPU is enabled.
782/782 [==============================] - 66s 71ms/step - loss: 4.9320 - accuracy: 0.0548
Epoch 2/5
782/782 [==============================] - 57s 72ms/step - loss: 4.2894 - accuracy: 0.0907
Epoch 3/5
782/782 [==============================] - 56s 72ms/step - loss: 3.8890 - accuracy: 0.1406
Epoch 4/5
782/782 [==============================] - 57s 72ms/step - loss: 3.6129 - accuracy: 0.1742
Epoch 5/5
782/782 [==============================] - 55s 71ms/step - loss: 3.4545 - accuracy: 0.2035

Apple M1 16.00 GB 5.33 GB
782/782 [==============================] - 116s 143ms/step - loss: 1.9598 - accuracy: 0.3815
Epoch 2/5
782/782 [==============================] - 111s 142ms/step - loss: 1.7507 - accuracy: 0.4290
Epoch 3/5
782/782 [==============================] - 111s 142ms/step - loss: 1.8120 - accuracy: 0.4301
Epoch 4/5
782/782 [==============================] - 111s 142ms/step - loss: 1.5029 - accuracy: 0.5215
Epoch 5/5
782/782 [==============================] - 111s 142ms/step - loss: 1.7785 - accuracy: 0.4335

Metal device set to: AMD Radeon Pro 560
systemMemory: 16.00 GB
Epoch 1/5
Plugin optimizer for device_type GPU is enabled.
782/782 [==============================] - 205s 214ms/step - loss: 4.5386 - accuracy: 0.0804
Epoch 2/5
782/782 [==============================] - 167s 213ms/step - loss: 4.7394 - accuracy: 0.0634
Epoch 3/5
782/782 [==============================] - 154s 197ms/step - loss: 4.1199 - accuracy: 0.0959
Epoch 4/5
782/782 [==============================] - 151s 193ms/step - loss: 4.0015 - accuracy: 0.1285
Epoch 5/5
782/782 [==============================] - 150s 192ms/step - loss: 3.6907 - accuracy: 0.1623
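Comparing these logs by eye gets tedious, so here is a minimal sketch of pulling the per-epoch times out of a pasted Keras log with a regular expression. The helper names `epoch_times` and `steady_state_seconds` are hypothetical, and the pattern assumes the progress-bar format seen in the logs above; other Keras versions may print it differently. The sample text is the Apple M1 Max run.

```python
import re
from statistics import mean

# Matches the summary line Keras prints at the end of each epoch, e.g.
# "782/782 [=====] - 54s 69ms/step - loss: ... - accuracy: ..."
EPOCH_LINE = re.compile(r"(\d+)/\1 \[=+\] - (\d+)s (\d+)ms/step")


def epoch_times(log_text):
    """Return one (seconds, ms_per_step) tuple per completed epoch."""
    return [(int(s), int(ms)) for _, s, ms in EPOCH_LINE.findall(log_text)]


def steady_state_seconds(log_text):
    """Mean epoch time, skipping the first epoch (it includes start-up work)."""
    seconds = [s for s, _ in epoch_times(log_text)]
    return mean(seconds[1:]) if len(seconds) > 1 else None


sample = """
782/782 [==============================] - 66s 71ms/step - loss: 4.9320 - accuracy: 0.0548
782/782 [==============================] - 57s 72ms/step - loss: 4.2894 - accuracy: 0.0907
782/782 [==============================] - 56s 72ms/step - loss: 3.8890 - accuracy: 0.1406
782/782 [==============================] - 57s 72ms/step - loss: 3.6129 - accuracy: 0.1742
782/782 [==============================] - 55s 71ms/step - loss: 3.4545 - accuracy: 0.2035
"""

print(epoch_times(sample))
print(steady_state_seconds(sample))
```

Because the pattern does not depend on line breaks, it also works on a log that has been pasted as a single long line.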