{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "view-in-github"
},
"source": [
""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "z9Fa0V1T7AW9"
},
"outputs": [],
"source": [
"# Copyright (c) Facebook, Inc. and its affiliates. All rights reserved."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6XzxTZfKwFNo"
},
"source": [
"# Benchmark Full-Finetuning on ImageNet-1K\n",
"\n",
"In this tutorial, we look at a simple example of how to use VISSL to run full finetuning benchmark for a [ResNet-50 Torchvision pre-trained model](https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py#L16). This benchmark initializes the model trunk, attaches a linear classification head on top of the trunk features and trains the full model.\n",
"\n",
"You can make a copy of this tutorial by `File -> Open in playground mode` and make changes there. Please do *NOT* request access to this tutorial.\n",
"\n",
"**NOTE:** Please ensure your Collab Notebook has a GPU available. To ensure this, simply follow: `Edit -> Notebook Settings -> select GPU.`"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VohdWhBSw69e"
},
"source": [
"## Install VISSL\n",
"\n",
"Installing VISSL is pretty straightfoward. We will use pip binaries of VISSL and follow instructions from [here](https://github.com/facebookresearch/vissl/blob/master/INSTALL.md#install-vissl-pip-package)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "R5ISg59KTOqU"
},
"outputs": [],
"source": [
"# Install pytorch version 1.8\n",
"!pip install torch==1.8.0+cu101 torchvision==0.9.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html\n",
"\n",
"# install Apex by checking system settings: cuda version, pytorch version, and python version\n",
"import sys\n",
"import torch\n",
"version_str=\"\".join([\n",
" f\"py3{sys.version_info.minor}_cu\",\n",
" torch.version.cuda.replace(\".\",\"\"),\n",
" f\"_pyt{torch.__version__[0:5:2]}\"\n",
"])\n",
"print(version_str)\n",
"\n",
"# install apex (pre-compiled with optimizer C++ extensions and CUDA kernels)\n",
"!pip install apex -f https://dl.fbaipublicfiles.com/vissl/packaging/apexwheels/{version_str}/download.html\n",
"\n",
"# # clone vissl repository and checkout latest version.\n",
"!git clone --recursive https://github.com/facebookresearch/vissl.git\n",
"\n",
"%cd vissl/\n",
"\n",
"!git checkout v0.1.6\n",
"!git checkout -b v0.1.6\n",
"\n",
"# install vissl dependencies\n",
"!pip install --progress-bar off -r requirements.txt\n",
"!pip install opencv-python\n",
"\n",
"# update classy vision install to commit compatible with v0.1.6\n",
"!pip uninstall -y classy_vision\n",
"!pip install classy-vision@https://github.com/facebookresearch/ClassyVision/tarball/4785d5ee19d3bcedd5b28c1eb51ea1f59188b54d\n",
"\n",
"# Update fairscale to commit compatible with v0.1.6\n",
"!pip uninstall -y fairscale\n",
"!pip install fairscale@https://github.com/facebookresearch/fairscale/tarball/df7db85cef7f9c30a5b821007754b96eb1f977b6\n",
"\n",
"# install vissl dev mode (e stands for editable)\n",
"!pip install -e .[dev]"
]
},
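{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `version_str` built above selects the folder of pre-compiled Apex wheels that matches the environment. Note that the slice `torch.__version__[0:5:2]` assumes single-digit version components (for example, `1.8.0+cu101` gives `180`). As a minimal sketch of the same construction, the hypothetical helper `apex_wheel_tag` below (not part of VISSL or Apex) makes each component explicit. Whether a wheel folder actually exists for a given tag is not guaranteed.\n",
"\n",
"```python\n",
"def apex_wheel_tag(python_minor, cuda_version, torch_version):\n",
"    '''Build a tag such as py37_cu101_pyt180 from version components.'''\n",
"    cuda = cuda_version.replace('.', '')\n",
"    # Drop any local suffix such as '+cu101', then keep major, minor, patch.\n",
"    torch_parts = torch_version.split('+')[0].split('.')[:3]\n",
"    return f'py3{python_minor}_cu{cuda}_pyt' + ''.join(torch_parts)\n",
"\n",
"\n",
"# Values matching the install cell above: Python 3.7, CUDA 10.1, torch 1.8.0+cu101.\n",
"print(apex_wheel_tag(7, '10.1', '1.8.0+cu101'))  # prints py37_cu101_pyt180\n",
"```\n",
"\n",
"Splitting on `.` instead of slicing by character position keeps the tag well-formed when a version component has more than one digit."
]
},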
{
"cell_type": "markdown",
"metadata": {
"id": "u6Fxe3MWxqsI"
},
"source": [
"VISSL should be successfuly installed by now and all the dependencies should be available."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Np6atgoOTPrA"
},
"outputs": [],
"source": [
"import vissl\n",
"import tensorboard\n",
"import apex\n",
"import torch"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IxMXLYLpsJXj"
},
"source": [
"## Download the ResNet-50 weights from Torchvision\n",
"\n",
"We download the weights from the [torchvision ResNet50 model](https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py#L16):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "mv0quZwFsWxs"
},
"outputs": [],
"source": [
"!wget https://download.pytorch.org/models/resnet50-19c8e357.pth -P /content/"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "J0hng2EPY7pr"
},
"source": [
"## Creating a custom data\n",
"\n",
"For the purpose of this tutorial, since we don't have ImageNet on the disk, we will create a dummy dataset by copying an image from COCO dataset in ImageNet dataset folder style as below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5-sy6nD-RfwB"
},
"outputs": [],
"source": [
"!mkdir -p /content/dummy_data/train/class1\n",
"!mkdir -p /content/dummy_data/train/class2\n",
"!mkdir -p /content/dummy_data/val/class1\n",
"!mkdir -p /content/dummy_data/val/class2\n",
"\n",
"# create 2 classes in train and add 5 images per class\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/train/class1/img1.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/train/class1/img2.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/train/class1/img3.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/train/class1/img4.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/train/class1/img5.jpg\n",
"\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/train/class2/img1.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/train/class2/img2.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/train/class2/img3.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/train/class2/img4.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/train/class2/img5.jpg\n",
"\n",
"# create 2 classes in val and add 5 images per class\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/val/class1/img1.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/val/class1/img2.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/val/class1/img3.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/val/class1/img4.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/val/class1/img5.jpg\n",
"\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/val/class2/img1.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/val/class2/img2.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/val/class2/img3.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/val/class2/img4.jpg\n",
"!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O /content/dummy_data/val/class2/img5.jpg\n"
]
},
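{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above downloads the same image once per target file. As a minimal sketch of an alternative (not part of the original tutorial), the same folder layout can be built in Python by copying one file that is already on disk. The helper name `make_dummy_dataset` and its parameters are hypothetical; the split and class folder names follow the cell above.\n",
"\n",
"```python\n",
"import shutil\n",
"from pathlib import Path\n",
"\n",
"\n",
"def make_dummy_dataset(src_image, root, splits=('train', 'val'),\n",
"                       classes=('class1', 'class2'), images_per_class=5):\n",
"    '''Copy src_image to root/<split>/<class>/img<i>.jpg and return the new paths.'''\n",
"    created = []\n",
"    for split in splits:\n",
"        for cls in classes:\n",
"            class_dir = Path(root) / split / cls\n",
"            class_dir.mkdir(parents=True, exist_ok=True)\n",
"            for i in range(1, images_per_class + 1):\n",
"                dst = class_dir / f'img{i}.jpg'\n",
"                shutil.copyfile(src_image, dst)\n",
"                created.append(dst)\n",
"    return created\n",
"```\n",
"\n",
"For example, `make_dummy_dataset('/content/img.jpg', '/content/dummy_data')` would recreate the 20 files above from one downloaded copy; the source path here is only a placeholder."
]
},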
{
"cell_type": "markdown",
"metadata": {
"id": "KPGCiTsXZeW3"
},
"source": [
"## Using the custom data in VISSL\n",
"\n",
"Next step for us is to register the dummy data we created above with VISSL. Registering the dataset involves telling VISSL about the dataset name and the paths for the dataset. For this, we create a simple json file with the metadata and save it to `configs/config/dataset_catalog.py` file.\n",
"\n",
"**NOTE**: VISSL uses the specific `dataset_catalog.json` under the path `configs/config/dataset_catalog.json`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "M8Q6LCqaWjl1"
},
"outputs": [],
"source": [
"json_data = {\n",
" \"dummy_data_folder\": {\n",
" \"train\": [\n",
" \"/content/dummy_data/train\", \"/content/dummy_data/train\"\n",
" ],\n",
" \"val\": [\n",
" \"/content/dummy_data/val\", \"/content/dummy_data/val\"\n",
" ]\n",
" }\n",
"}\n",
"\n",
"# use VISSL's api to save or you can use your custom code.\n",
"from vissl.utils.io import save_file\n",
"save_file(json_data, \"/content/vissl/configs/config/dataset_catalog.json\", append_to_json=False)"
]
},
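{
"cell_type": "markdown",
"metadata": {},
"source": [
"As the comment in the cell above notes, custom code can be used instead of `save_file`. The following is a minimal sketch using only Python's standard `json` module, not VISSL's own method; the helper name `write_dataset_catalog` is hypothetical, and the entry simply mirrors `json_data` above.\n",
"\n",
"```python\n",
"import json\n",
"from pathlib import Path\n",
"\n",
"\n",
"def write_dataset_catalog(catalog, path):\n",
"    '''Write the catalog dict to path as JSON, creating parent folders if needed.'''\n",
"    path = Path(path)\n",
"    path.parent.mkdir(parents=True, exist_ok=True)\n",
"    with open(path, 'w') as f:\n",
"        json.dump(catalog, f, indent=2)\n",
"    return path\n",
"\n",
"\n",
"# Same structure as json_data in the cell above.\n",
"catalog = {\n",
"    'dummy_data_folder': {\n",
"        'train': ['/content/dummy_data/train', '/content/dummy_data/train'],\n",
"        'val': ['/content/dummy_data/val', '/content/dummy_data/val'],\n",
"    }\n",
"}\n",
"```\n",
"\n",
"Calling `write_dataset_catalog(catalog, '/content/vissl/configs/config/dataset_catalog.json')` would then overwrite the catalog file at the path used in this tutorial."
]
},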
{
"cell_type": "markdown",
"metadata": {
"id": "otN1pB32cBHK"
},
"source": [
"Next, we verify that the dataset is registered with VISSL. For that we query VISSL's dataset catalog as below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "wZBhH-s5bcHd",
"outputId": "a7ff5917-803f-441e-c388-1b07af2a0a2f"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:fvcore.common.file_io:** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"['dummy_data_folder']\n",
"{'train': ['/content/dummy_data/train', '/content/dummy_data/train'], 'val': ['/content/dummy_data/val', '/content/dummy_data/val']}\n"
]
}
],
"source": [
"from vissl.data.dataset_catalog import VisslDatasetCatalog\n",
"\n",
"# list all the datasets that exist in catalog\n",
"print(VisslDatasetCatalog.list())\n",
"\n",
"# get the metadata of dummy_data_folder dataset\n",
"print(VisslDatasetCatalog.get(\"dummy_data_folder\"))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YaUMDwMdzYHN"
},
"source": [
"## Run Full-Finetuning\n",
"\n",
"VISSL provides yaml configuration files that reproduces training of all self-supervised approaches [here](https://github.com/facebookresearch/vissl/tree/master/configs/config/pretrain). For the purpose of this tutorial, we will use [this config file](https://github.com/facebookresearch/vissl/blob/master/configs/config/benchmark/imagenet1k_fulltune/eval_resnet_8gpu_transfer_in1k_fulltune.yaml) for full-finetuning a ResNet-50 supervised model on 1-gpu. Let's go ahead and download the [example config file](https://github.com/facebookresearch/vissl/blob/master/configs/config/benchmark/imagenet1k_fulltune/eval_resnet_8gpu_transfer_in1k_fulltune.yaml).\n",
"\n",
"\n",
"VISSL provides a [helper python tool](https://github.com/facebookresearch/vissl/blob/main/tools/run_distributed_engines.py) that allows you to train models based on our configuration system. This tool allows:\n",
"- training and feature extraction.\n",
"- training on 1-gpu, multi-gpu, or even multi-machine using Pytorch DDP or Fairscale FSDP.\n",
"- allows training and feature extraction both using VISSL. \n",
"- also allows training on 1-gpu or multi-gpu. \n",
"- can be used to launch multi-machine distributed training.\n",
"\n",
"We are ready to run the full-finetuning. For the purpose of this tutorial, we will use synthetic dataset and train on dummy images. VISSL supports training on wide range of datasets and allows adding custom datasets. Please see VISSL documentation on how to use the datasets. To train on ImageNet instead: assuming your ImageNet dataset folder path is `/path/to/my/imagenet/folder/`, you can add the following command line \n",
"input to your training command: \n",
"```\n",
"config.DATA.TRAIN.DATASET_NAMES=[imagenet1k_folder] \\\n",
"config.DATA.TRAIN.DATA_SOURCES=[disk_folder] \\\n",
"config.DATA.TRAIN.DATA_PATHS=[\"/path/to/my/imagenet/folder/train\"] \\\n",
"config.DATA.TRAIN.LABEL_SOURCES=[disk_folder]\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fM7IigSpONW0"
},
"source": [
"The training command looks like:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "6v0HvauIj9S2",
"outputId": "2f9d3a67-14d9-44a0-c27c-aca49d31ef9d"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"####### overrides: ['hydra.verbose=true', 'config=benchmark/fulltune/imagenet1k/eval_resnet_8gpu_transfer_in1k_fulltune.yaml', 'config.DATA.TRAIN.DATA_SOURCES=[disk_folder]', 'config.DATA.TRAIN.LABEL_SOURCES=[disk_folder]', 'config.DATA.TRAIN.DATASET_NAMES=[dummy_data_folder]', 'config.DATA.TRAIN.BATCHSIZE_PER_REPLICA=2', 'config.DATA.TEST.DATA_SOURCES=[disk_folder]', 'config.DATA.TEST.LABEL_SOURCES=[disk_folder]', 'config.DATA.TEST.DATASET_NAMES=[dummy_data_folder]', 'config.DATA.TEST.BATCHSIZE_PER_REPLICA=2', 'config.OPTIMIZER.num_epochs=2', 'config.OPTIMIZER.param_schedulers.lr.values=[0.01,0.001]', 'config.OPTIMIZER.param_schedulers.lr.milestones=[1]', 'config.DISTRIBUTED.NUM_NODES=1', 'config.DISTRIBUTED.NUM_PROC_PER_NODE=1', 'config.CHECKPOINT.DIR=/content/checkpoints', 'config.MODEL.WEIGHTS_INIT.PARAMS_FILE=/content/resnet50-19c8e357.pth', 'config.MODEL.WEIGHTS_INIT.APPEND_PREFIX=trunk._feature_blocks.', 'config.MODEL.WEIGHTS_INIT.STATE_DICT_KEY_NAME=', 'hydra.verbose=true']\n",
"INFO 2021-10-18 00:57:13,053 distributed_launcher.py: 184: Spawning process for node_id: 0, local_rank: 0, dist_rank: 0, dist_run_id: localhost:57775\n",
"INFO 2021-10-18 00:57:13,053 train.py: 94: Env set for rank: 0, dist_rank: 0\n",
"INFO 2021-10-18 00:57:13,053 env.py: 50: CLICOLOR:\t1\n",
"INFO 2021-10-18 00:57:13,054 env.py: 50: CLOUDSDK_CONFIG:\t/content/.config\n",
"INFO 2021-10-18 00:57:13,054 env.py: 50: CLOUDSDK_PYTHON:\tpython3\n",
"INFO 2021-10-18 00:57:13,054 env.py: 50: COLAB_GPU:\t1\n",
"INFO 2021-10-18 00:57:13,054 env.py: 50: CUDA_VERSION:\t11.1.1\n",
"INFO 2021-10-18 00:57:13,054 env.py: 50: CUDNN_VERSION:\t8.0.5.39\n",
"INFO 2021-10-18 00:57:13,054 env.py: 50: DATALAB_SETTINGS_OVERRIDES:\t{\"kernelManagerProxyPort\":6000,\"kernelManagerProxyHost\":\"172.28.0.3\",\"jupyterArgs\":[\"--ip=\\\"172.28.0.2\\\"\"],\"debugAdapterMultiplexerPath\":\"/usr/local/bin/dap_multiplexer\",\"enableLsp\":true}\n",
"INFO 2021-10-18 00:57:13,054 env.py: 50: DEBIAN_FRONTEND:\tnoninteractive\n",
"INFO 2021-10-18 00:57:13,054 env.py: 50: ENV:\t/root/.bashrc\n",
"INFO 2021-10-18 00:57:13,055 env.py: 50: GCE_METADATA_TIMEOUT:\t0\n",
"INFO 2021-10-18 00:57:13,055 env.py: 50: GCS_READ_CACHE_BLOCK_SIZE_MB:\t16\n",
"INFO 2021-10-18 00:57:13,055 env.py: 50: GIT_PAGER:\tcat\n",
"INFO 2021-10-18 00:57:13,055 env.py: 50: GLIBCPP_FORCE_NEW:\t1\n",
"INFO 2021-10-18 00:57:13,055 env.py: 50: GLIBCXX_FORCE_NEW:\t1\n",
"INFO 2021-10-18 00:57:13,055 env.py: 50: HOME:\t/root\n",
"INFO 2021-10-18 00:57:13,055 env.py: 50: HOSTNAME:\t0440442413ae\n",
"INFO 2021-10-18 00:57:13,055 env.py: 50: JPY_PARENT_PID:\t67\n",
"INFO 2021-10-18 00:57:13,055 env.py: 50: LANG:\ten_US.UTF-8\n",
"INFO 2021-10-18 00:57:13,056 env.py: 50: LAST_FORCED_REBUILD:\t20211007\n",
"INFO 2021-10-18 00:57:13,056 env.py: 50: LD_LIBRARY_PATH:\t/usr/lib64-nvidia\n",
"INFO 2021-10-18 00:57:13,056 env.py: 50: LD_PRELOAD:\t/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4\n",
"INFO 2021-10-18 00:57:13,056 env.py: 50: LIBRARY_PATH:\t/usr/local/cuda/lib64/stubs\n",
"INFO 2021-10-18 00:57:13,056 env.py: 50: LOCAL_RANK:\t0\n",
"INFO 2021-10-18 00:57:13,056 env.py: 50: MPLBACKEND:\tmodule://ipykernel.pylab.backend_inline\n",
"INFO 2021-10-18 00:57:13,056 env.py: 50: NCCL_VERSION:\t2.7.8\n",
"INFO 2021-10-18 00:57:13,056 env.py: 50: NO_GCE_CHECK:\tTrue\n",
"INFO 2021-10-18 00:57:13,056 env.py: 50: NVIDIA_DRIVER_CAPABILITIES:\tcompute,utility\n",
"INFO 2021-10-18 00:57:13,057 env.py: 50: NVIDIA_REQUIRE_CUDA:\tcuda>=11.1 brand=tesla,driver>=418,driver<419 brand=tesla,driver>=440,driver<441 brand=tesla,driver>=450,driver<451\n",
"INFO 2021-10-18 00:57:13,057 env.py: 50: NVIDIA_VISIBLE_DEVICES:\tall\n",
"INFO 2021-10-18 00:57:13,057 env.py: 50: OLDPWD:\t/\n",
"INFO 2021-10-18 00:57:13,057 env.py: 50: PAGER:\tcat\n",
"INFO 2021-10-18 00:57:13,057 env.py: 50: PATH:\t/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tools/node/bin:/tools/google-cloud-sdk/bin:/opt/bin\n",
"INFO 2021-10-18 00:57:13,057 env.py: 50: PWD:\t/content/vissl\n",
"INFO 2021-10-18 00:57:13,057 env.py: 50: PYDEVD_USE_FRAME_EVAL:\tNO\n",
"INFO 2021-10-18 00:57:13,057 env.py: 50: PYTHONPATH:\t/env/python\n",
"INFO 2021-10-18 00:57:13,057 env.py: 50: PYTHONWARNINGS:\tignore:::pip._internal.cli.base_command\n",
"INFO 2021-10-18 00:57:13,057 env.py: 50: RANK:\t0\n",
"INFO 2021-10-18 00:57:13,058 env.py: 50: SHELL:\t/bin/bash\n",
"INFO 2021-10-18 00:57:13,058 env.py: 50: SHLVL:\t1\n",
"INFO 2021-10-18 00:57:13,058 env.py: 50: TBE_CREDS_ADDR:\t172.28.0.1:8008\n",
"INFO 2021-10-18 00:57:13,058 env.py: 50: TERM:\txterm-color\n",
"INFO 2021-10-18 00:57:13,058 env.py: 50: TF_FORCE_GPU_ALLOW_GROWTH:\ttrue\n",
"INFO 2021-10-18 00:57:13,058 env.py: 50: WORLD_SIZE:\t1\n",
"INFO 2021-10-18 00:57:13,058 env.py: 50: _:\t/usr/bin/python3\n",
"INFO 2021-10-18 00:57:13,058 env.py: 50: __EGL_VENDOR_LIBRARY_DIRS:\t/usr/lib64-nvidia:/usr/share/glvnd/egl_vendor.d/\n",
"INFO 2021-10-18 00:57:13,058 misc.py: 161: Set start method of multiprocessing to forkserver\n",
"INFO 2021-10-18 00:57:13,059 train.py: 105: Setting seed....\n",
"INFO 2021-10-18 00:57:13,059 misc.py: 173: MACHINE SEED: 2\n",
"INFO 2021-10-18 00:57:13,061 hydra_config.py: 131: Training with config:\n",
"INFO 2021-10-18 00:57:13,068 hydra_config.py: 140: {'CHECKPOINT': {'APPEND_DISTR_RUN_ID': False,\n",
" 'AUTO_RESUME': True,\n",
" 'BACKEND': 'disk',\n",
" 'CHECKPOINT_FREQUENCY': 1,\n",
" 'CHECKPOINT_ITER_FREQUENCY': -1,\n",
" 'DIR': '/content/checkpoints',\n",
" 'LATEST_CHECKPOINT_RESUME_FILE_NUM': 1,\n",
" 'OVERWRITE_EXISTING': False,\n",
" 'USE_SYMLINK_CHECKPOINT_FOR_RESUME': False},\n",
" 'CLUSTERFIT': {'CLUSTER_BACKEND': 'faiss',\n",
" 'DATA_LIMIT': -1,\n",
" 'DATA_LIMIT_SAMPLING': {'SEED': 0},\n",
" 'FEATURES': {'DATASET_NAME': '',\n",
" 'DATA_PARTITION': 'TRAIN',\n",
" 'DIMENSIONALITY_REDUCTION': 0,\n",
" 'EXTRACT': False,\n",
" 'LAYER_NAME': '',\n",
" 'PATH': '.',\n",
" 'TEST_PARTITION': 'TEST'},\n",
" 'NUM_CLUSTERS': 16000,\n",
" 'NUM_ITER': 50,\n",
" 'OUTPUT_DIR': '.'},\n",
" 'DATA': {'DDP_BUCKET_CAP_MB': 25,\n",
" 'ENABLE_ASYNC_GPU_COPY': True,\n",
" 'NUM_DATALOADER_WORKERS': 5,\n",
" 'PIN_MEMORY': True,\n",
" 'TEST': {'BASE_DATASET': 'generic_ssl',\n",
" 'BATCHSIZE_PER_REPLICA': 2,\n",
" 'COLLATE_FUNCTION': 'default_collate',\n",
" 'COLLATE_FUNCTION_PARAMS': {},\n",
" 'COPY_DESTINATION_DIR': '/tmp/imagenet1k/',\n",
" 'COPY_TO_LOCAL_DISK': False,\n",
" 'DATASET_NAMES': ['dummy_data_folder'],\n",
" 'DATA_LIMIT': -1,\n",
" 'DATA_LIMIT_SAMPLING': {'IS_BALANCED': False,\n",
" 'SEED': 0,\n",
" 'SKIP_NUM_SAMPLES': 0},\n",
" 'DATA_PATHS': [],\n",
" 'DATA_SOURCES': ['disk_folder'],\n",
" 'DEFAULT_GRAY_IMG_SIZE': 224,\n",
" 'DROP_LAST': False,\n",
" 'ENABLE_QUEUE_DATASET': False,\n",
" 'INPUT_KEY_NAMES': ['data'],\n",
" 'LABEL_PATHS': [],\n",
" 'LABEL_SOURCES': ['disk_folder'],\n",
" 'LABEL_TYPE': 'standard',\n",
" 'MMAP_MODE': True,\n",
" 'NEW_IMG_PATH_PREFIX': '',\n",
" 'RANDOM_SYNTHETIC_IMAGES': False,\n",
" 'REMOVE_IMG_PATH_PREFIX': '',\n",
" 'TARGET_KEY_NAMES': ['label'],\n",
" 'TRANSFORMS': [{'name': 'Resize', 'size': 256},\n",
" {'name': 'CenterCrop', 'size': 224},\n",
" {'name': 'ToTensor'},\n",
" {'mean': [0.485, 0.456, 0.406],\n",
" 'name': 'Normalize',\n",
" 'std': [0.229, 0.224, 0.225]}],\n",
" 'USE_DEBUGGING_SAMPLER': False,\n",
" 'USE_STATEFUL_DISTRIBUTED_SAMPLER': False},\n",
" 'TRAIN': {'BASE_DATASET': 'generic_ssl',\n",
" 'BATCHSIZE_PER_REPLICA': 2,\n",
" 'COLLATE_FUNCTION': 'default_collate',\n",
" 'COLLATE_FUNCTION_PARAMS': {},\n",
" 'COPY_DESTINATION_DIR': '/tmp/imagenet1k/',\n",
" 'COPY_TO_LOCAL_DISK': False,\n",
" 'DATASET_NAMES': ['dummy_data_folder'],\n",
" 'DATA_LIMIT': -1,\n",
" 'DATA_LIMIT_SAMPLING': {'IS_BALANCED': False,\n",
" 'SEED': 0,\n",
" 'SKIP_NUM_SAMPLES': 0},\n",
" 'DATA_PATHS': [],\n",
" 'DATA_SOURCES': ['disk_folder'],\n",
" 'DEFAULT_GRAY_IMG_SIZE': 224,\n",
" 'DROP_LAST': False,\n",
" 'ENABLE_QUEUE_DATASET': False,\n",
" 'INPUT_KEY_NAMES': ['data'],\n",
" 'LABEL_PATHS': [],\n",
" 'LABEL_SOURCES': ['disk_folder'],\n",
" 'LABEL_TYPE': 'standard',\n",
" 'MMAP_MODE': True,\n",
" 'NEW_IMG_PATH_PREFIX': '',\n",
" 'RANDOM_SYNTHETIC_IMAGES': False,\n",
" 'REMOVE_IMG_PATH_PREFIX': '',\n",
" 'TARGET_KEY_NAMES': ['label'],\n",
" 'TRANSFORMS': [{'name': 'RandomResizedCrop', 'size': 224},\n",
" {'name': 'RandomHorizontalFlip'},\n",
" {'name': 'ToTensor'},\n",
" {'mean': [0.485, 0.456, 0.406],\n",
" 'name': 'Normalize',\n",
" 'std': [0.229, 0.224, 0.225]}],\n",
" 'USE_DEBUGGING_SAMPLER': False,\n",
" 'USE_STATEFUL_DISTRIBUTED_SAMPLER': False}},\n",
" 'DISTRIBUTED': {'BACKEND': 'nccl',\n",
" 'BROADCAST_BUFFERS': True,\n",
" 'INIT_METHOD': 'tcp',\n",
" 'MANUAL_GRADIENT_REDUCTION': False,\n",
" 'NCCL_DEBUG': False,\n",
" 'NCCL_SOCKET_NTHREADS': '',\n",
" 'NUM_NODES': 1,\n",
" 'NUM_PROC_PER_NODE': 1,\n",
" 'RUN_ID': 'auto'},\n",
" 'EXTRACT_FEATURES': {'CHUNK_THRESHOLD': 0, 'OUTPUT_DIR': ''},\n",
" 'HOOKS': {'LOG_GPU_STATS': True,\n",
" 'MEMORY_SUMMARY': {'DUMP_MEMORY_ON_EXCEPTION': False,\n",
" 'LOG_ITERATION_NUM': 0,\n",
" 'PRINT_MEMORY_SUMMARY': True},\n",
" 'MODEL_COMPLEXITY': {'COMPUTE_COMPLEXITY': False,\n",
" 'INPUT_SHAPE': [3, 224, 224]},\n",
" 'PERF_STATS': {'MONITOR_PERF_STATS': True,\n",
" 'PERF_STAT_FREQUENCY': -1,\n",
" 'ROLLING_BTIME_FREQ': -1},\n",
" 'TENSORBOARD_SETUP': {'EXPERIMENT_LOG_DIR': 'tensorboard',\n",
" 'FLUSH_EVERY_N_MIN': 5,\n",
" 'LOG_DIR': '.',\n",
" 'LOG_PARAMS': True,\n",
" 'LOG_PARAMS_EVERY_N_ITERS': 310,\n",
" 'LOG_PARAMS_GRADIENTS': True,\n",
" 'USE_TENSORBOARD': False}},\n",
" 'IMG_RETRIEVAL': {'CROP_QUERY_ROI': False,\n",
" 'DATASET_PATH': '',\n",
" 'DEBUG_MODE': False,\n",
" 'EVAL_BINARY_PATH': '',\n",
" 'EVAL_DATASET_NAME': 'Paris',\n",
" 'FEATS_PROCESSING_TYPE': '',\n",
" 'GEM_POOL_POWER': 4.0,\n",
" 'IMG_SCALINGS': [1],\n",
" 'NORMALIZE_FEATURES': True,\n",
" 'NUM_DATABASE_SAMPLES': -1,\n",
" 'NUM_QUERY_SAMPLES': -1,\n",
" 'NUM_TRAINING_SAMPLES': -1,\n",
" 'N_PCA': 512,\n",
" 'RESIZE_IMG': 1024,\n",
" 'SAVE_FEATURES': False,\n",
" 'SAVE_RETRIEVAL_RANKINGS_SCORES': True,\n",
" 'SIMILARITY_MEASURE': 'cosine_similarity',\n",
" 'SPATIAL_LEVELS': 3,\n",
" 'TRAIN_DATASET_NAME': 'Oxford',\n",
" 'TRAIN_PCA_WHITENING': True,\n",
" 'USE_DISTRACTORS': False,\n",
" 'WHITEN_IMG_LIST': ''},\n",
" 'LOG_FREQUENCY': 100,\n",
" 'LOSS': {'CrossEntropyLoss': {'ignore_index': -1},\n",
" 'barlow_twins_loss': {'embedding_dim': 8192,\n",
" 'lambda_': 0.0051,\n",
" 'scale_loss': 0.024},\n",
" 'bce_logits_multiple_output_single_target': {'normalize_output': False,\n",
" 'reduction': 'none',\n",
" 'world_size': 1},\n",
" 'cross_entropy_multiple_output_single_target': {'ignore_index': -1,\n",
" 'normalize_output': False,\n",
" 'reduction': 'mean',\n",
" 'temperature': 1.0,\n",
" 'weight': None},\n",
" 'deepclusterv2_loss': {'BATCHSIZE_PER_REPLICA': 256,\n",
" 'DROP_LAST': True,\n",
" 'kmeans_iters': 10,\n",
" 'memory_params': {'crops_for_mb': [0],\n",
" 'embedding_dim': 128},\n",
" 'num_clusters': [3000, 3000, 3000],\n",
" 'num_crops': 2,\n",
" 'num_train_samples': -1,\n",
" 'temperature': 0.1},\n",
" 'dino_loss': {'crops_for_teacher': [0, 1],\n",
" 'ema_center': 0.9,\n",
" 'momentum': 0.996,\n",
" 'normalize_last_layer': True,\n",
" 'output_dim': 65536,\n",
" 'student_temp': 0.1,\n",
" 'teacher_temp_max': 0.07,\n",
" 'teacher_temp_min': 0.04,\n",
" 'teacher_temp_warmup_iters': 37500},\n",
" 'moco_loss': {'embedding_dim': 128,\n",
" 'momentum': 0.999,\n",
" 'queue_size': 65536,\n",
" 'temperature': 0.2},\n",
" 'multicrop_simclr_info_nce_loss': {'buffer_params': {'effective_batch_size': 4096,\n",
" 'embedding_dim': 128,\n",
" 'world_size': 64},\n",
" 'num_crops': 2,\n",
" 'temperature': 0.1},\n",
" 'name': 'cross_entropy_multiple_output_single_target',\n",
" 'nce_loss_with_memory': {'loss_type': 'nce',\n",
" 'loss_weights': [1.0],\n",
" 'memory_params': {'embedding_dim': 128,\n",
" 'memory_size': -1,\n",
" 'momentum': 0.5,\n",
" 'norm_init': True,\n",
" 'update_mem_on_forward': True},\n",
" 'negative_sampling_params': {'num_negatives': 16000,\n",
" 'type': 'random'},\n",
" 'norm_constant': -1,\n",
" 'norm_embedding': True,\n",
" 'num_train_samples': -1,\n",
" 'temperature': 0.07,\n",
" 'update_mem_with_emb_index': -100},\n",
" 'simclr_info_nce_loss': {'buffer_params': {'effective_batch_size': 4096,\n",
" 'embedding_dim': 128,\n",
" 'world_size': 64},\n",
" 'temperature': 0.1},\n",
" 'swav_loss': {'crops_for_assign': [0, 1],\n",
" 'embedding_dim': 128,\n",
" 'epsilon': 0.05,\n",
" 'normalize_last_layer': True,\n",
" 'num_crops': 2,\n",
" 'num_iters': 3,\n",
" 'num_prototypes': [3000],\n",
" 'output_dir': '.',\n",
" 'queue': {'local_queue_length': 0,\n",
" 'queue_length': 0,\n",
" 'start_iter': 0},\n",
" 'temp_hard_assignment_iters': 0,\n",
" 'temperature': 0.1,\n",
" 'use_double_precision': False},\n",
" 'swav_momentum_loss': {'crops_for_assign': [0, 1],\n",
" 'embedding_dim': 128,\n",
" 'epsilon': 0.05,\n",
" 'momentum': 0.99,\n",
" 'momentum_eval_mode_iter_start': 0,\n",
" 'normalize_last_layer': True,\n",
" 'num_crops': 2,\n",
" 'num_iters': 3,\n",
" 'num_prototypes': [3000],\n",
" 'queue': {'local_queue_length': 0,\n",
" 'queue_length': 0,\n",
" 'start_iter': 0},\n",
" 'temperature': 0.1,\n",
" 'use_double_precision': False}},\n",
" 'MACHINE': {'DEVICE': 'gpu'},\n",
" 'METERS': {'accuracy_list_meter': {'meter_names': [],\n",
" 'num_meters': 1,\n",
" 'topk_values': [1, 5]},\n",
" 'enable_training_meter': True,\n",
" 'mean_ap_list_meter': {'max_cpu_capacity': -1,\n",
" 'meter_names': [],\n",
" 'num_classes': 9605,\n",
" 'num_meters': 1},\n",
" 'name': 'accuracy_list_meter'},\n",
" 'MODEL': {'ACTIVATION_CHECKPOINTING': {'NUM_ACTIVATION_CHECKPOINTING_SPLITS': 2,\n",
" 'USE_ACTIVATION_CHECKPOINTING': False},\n",
" 'AMP_PARAMS': {'AMP_ARGS': {'opt_level': 'O1'},\n",
" 'AMP_TYPE': 'apex',\n",
" 'USE_AMP': False},\n",
" 'CUDA_CACHE': {'CLEAR_CUDA_CACHE': False, 'CLEAR_FREQ': 100},\n",
" 'FEATURE_EVAL_SETTINGS': {'EVAL_MODE_ON': True,\n",
" 'EVAL_TRUNK_AND_HEAD': False,\n",
" 'EXTRACT_TRUNK_FEATURES_ONLY': False,\n",
" 'FREEZE_TRUNK_AND_HEAD': False,\n",
" 'FREEZE_TRUNK_ONLY': False,\n",
" 'LINEAR_EVAL_FEAT_POOL_OPS_MAP': [],\n",
" 'SHOULD_FLATTEN_FEATS': True},\n",
" 'FSDP_CONFIG': {'AUTO_WRAP_THRESHOLD': 0,\n",
" 'bucket_cap_mb': 0,\n",
" 'clear_autocast_cache': True,\n",
" 'compute_dtype': torch.float32,\n",
" 'flatten_parameters': True,\n",
" 'fp32_reduce_scatter': False,\n",
" 'mixed_precision': True,\n",
" 'verbose': True},\n",
" 'GRAD_CLIP': {'MAX_NORM': 1, 'NORM_TYPE': 2, 'USE_GRAD_CLIP': False},\n",
" 'HEAD': {'BATCHNORM_EPS': 1e-05,\n",
" 'BATCHNORM_MOMENTUM': 0.1,\n",
" 'PARAMS': [['mlp', {'dims': [2048, 1000]}]],\n",
" 'PARAMS_MULTIPLIER': 1.0},\n",
" 'INPUT_TYPE': 'rgb',\n",
" 'MULTI_INPUT_HEAD_MAPPING': [],\n",
" 'NON_TRAINABLE_PARAMS': [],\n",
" 'SHARDED_DDP_SETUP': {'USE_SDP': False, 'reduce_buffer_size': -1},\n",
" 'SINGLE_PASS_EVERY_CROP': False,\n",
" 'SYNC_BN_CONFIG': {'CONVERT_BN_TO_SYNC_BN': False,\n",
" 'GROUP_SIZE': -1,\n",
" 'SYNC_BN_TYPE': 'pytorch'},\n",
" 'TEMP_FROZEN_PARAMS_ITER_MAP': [],\n",
" 'TRUNK': {'CONVIT': {'CLASS_TOKEN_IN_LOCAL_LAYERS': False,\n",
" 'LOCALITY_DIM': 10,\n",
" 'LOCALITY_STRENGTH': 1.0,\n",
" 'N_GPSA_LAYERS': 10,\n",
" 'USE_LOCAL_INIT': True},\n",
" 'EFFICIENT_NETS': {},\n",
" 'NAME': 'resnet',\n",
" 'REGNET': {},\n",
" 'RESNETS': {'DEPTH': 50,\n",
" 'GROUPNORM_GROUPS': 32,\n",
" 'GROUPS': 1,\n",
" 'LAYER4_STRIDE': 2,\n",
" 'NORM': 'BatchNorm',\n",
" 'STANDARDIZE_CONVOLUTIONS': False,\n",
" 'WIDTH_MULTIPLIER': 1,\n",
" 'WIDTH_PER_GROUP': 64,\n",
" 'ZERO_INIT_RESIDUAL': False},\n",
" 'VISION_TRANSFORMERS': {'ATTENTION_DROPOUT_RATE': 0,\n",
" 'CLASSIFIER': 'token',\n",
" 'DROPOUT_RATE': 0,\n",
" 'DROP_PATH_RATE': 0,\n",
" 'HIDDEN_DIM': 768,\n",
" 'IMAGE_SIZE': 224,\n",
" 'MLP_DIM': 3072,\n",
" 'NUM_HEADS': 12,\n",
" 'NUM_LAYERS': 12,\n",
" 'PATCH_SIZE': 16,\n",
" 'QKV_BIAS': False,\n",
" 'QK_SCALE': False,\n",
" 'name': None},\n",
" 'XCIT': {'ATTENTION_DROPOUT_RATE': 0,\n",
" 'DROPOUT_RATE': 0,\n",
" 'DROP_PATH_RATE': 0.05,\n",
" 'ETA': 1,\n",
" 'HIDDEN_DIM': 384,\n",
" 'IMAGE_SIZE': 224,\n",
" 'NUM_HEADS': 8,\n",
" 'NUM_LAYERS': 12,\n",
" 'PATCH_SIZE': 16,\n",
" 'QKV_BIAS': True,\n",
" 'QK_SCALE': False,\n",
" 'TOKENS_NORM': True,\n",
" 'name': None}},\n",
" 'WEIGHTS_INIT': {'APPEND_PREFIX': 'trunk._feature_blocks.',\n",
" 'PARAMS_FILE': '/content/resnet50-19c8e357.pth',\n",
" 'REMOVE_PREFIX': '',\n",
" 'SKIP_LAYERS': ['num_batches_tracked'],\n",
" 'STATE_DICT_KEY_NAME': ''},\n",
" '_MODEL_INIT_SEED': 1},\n",
" 'MONITORING': {'MONITOR_ACTIVATION_STATISTICS': 0},\n",
" 'MULTI_PROCESSING_METHOD': 'forkserver',\n",
" 'NEAREST_NEIGHBOR': {'L2_NORM_FEATS': False, 'SIGMA': 0.1, 'TOPK': 200},\n",
" 'OPTIMIZER': {'betas': [0.9, 0.999],\n",
" 'construct_single_param_group_only': False,\n",
" 'head_optimizer_params': {'use_different_lr': False,\n",
" 'use_different_wd': False,\n",
" 'weight_decay': 0.0},\n",
" 'larc_config': {'clip': False,\n",
" 'eps': 1e-08,\n",
" 'trust_coefficient': 0.001},\n",
" 'momentum': 0.9,\n",
" 'name': 'sgd',\n",
" 'nesterov': True,\n",
" 'non_regularized_parameters': [],\n",
" 'num_epochs': 2,\n",
" 'param_schedulers': {'lr': {'auto_lr_scaling': {'auto_scale': True,\n",
" 'base_lr_batch_size': 256,\n",
" 'base_value': 0.1,\n",
" 'scaling_type': 'linear'},\n",
" 'end_value': 0.0,\n",
" 'interval_scaling': [],\n",
" 'lengths': [],\n",
" 'milestones': [1],\n",
" 'name': 'multistep',\n",
" 'schedulers': [],\n",
" 'start_value': 0.1,\n",
" 'update_interval': 'epoch',\n",
" 'value': 0.1,\n",
" 'values': [0.00078125, 7.813e-05]},\n",
" 'lr_head': {'auto_lr_scaling': {'auto_scale': True,\n",
" 'base_lr_batch_size': 256,\n",
" 'base_value': 0.1,\n",
" 'scaling_type': 'linear'},\n",
" 'end_value': 0.0,\n",
" 'interval_scaling': [],\n",
" 'lengths': [],\n",
" 'milestones': [1],\n",
" 'name': 'multistep',\n",
" 'schedulers': [],\n",
" 'start_value': 0.1,\n",
" 'update_interval': 'epoch',\n",
" 'value': 0.1,\n",
" 'values': [0.00078125,\n",
" 7.813e-05]}},\n",
" 'regularize_bias': True,\n",
" 'regularize_bn': False,\n",
" 'use_larc': False,\n",
" 'use_zero': False,\n",
" 'weight_decay': 0.0},\n",
" 'PROFILING': {'MEMORY_PROFILING': {'TRACK_BY_LAYER_MEMORY': False},\n",
" 'NUM_ITERATIONS': 10,\n",
" 'OUTPUT_FOLDER': '.',\n",
" 'PROFILED_RANKS': [0, 1],\n",
" 'RUNTIME_PROFILING': {'LEGACY_PROFILER': False,\n",
" 'PROFILE_CPU': True,\n",
" 'PROFILE_GPU': True,\n",
" 'USE_PROFILER': False},\n",
" 'START_ITERATION': 0,\n",
" 'STOP_TRAINING_AFTER_PROFILING': False,\n",
" 'WARMUP_ITERATIONS': 0},\n",
" 'REPRODUCIBILITY': {'CUDDN_DETERMINISTIC': False},\n",
" 'SEED_VALUE': 1,\n",
" 'SLURM': {'ADDITIONAL_PARAMETERS': {},\n",
" 'COMMENT': 'vissl job',\n",
" 'CONSTRAINT': '',\n",
" 'LOG_FOLDER': '.',\n",
" 'MEM_GB': 250,\n",
" 'NAME': 'vissl',\n",
" 'NUM_CPU_PER_PROC': 8,\n",
" 'PARTITION': '',\n",
" 'PORT_ID': 40050,\n",
" 'TIME_HOURS': 72,\n",
" 'TIME_MINUTES': 0,\n",
" 'USE_SLURM': False},\n",
" 'SVM': {'cls_list': [],\n",
" 'costs': {'base': -1.0,\n",
" 'costs_list': [0.1, 0.01],\n",
" 'power_range': [4, 20]},\n",
" 'cross_val_folds': 3,\n",
" 'dual': True,\n",
" 'force_retrain': False,\n",
" 'loss': 'squared_hinge',\n",
" 'low_shot': {'dataset_name': 'voc',\n",
" 'k_values': [1, 2, 4, 8, 16, 32, 64, 96],\n",
" 'sample_inds': [1, 2, 3, 4, 5]},\n",
" 'max_iter': 2000,\n",
" 'normalize': True,\n",
" 'penalty': 'l2'},\n",
" 'TEST_EVERY_NUM_EPOCH': 1,\n",
" 'TEST_MODEL': True,\n",
" 'TEST_ONLY': False,\n",
" 'TRAINER': {'TASK_NAME': 'self_supervision_task',\n",
" 'TRAIN_STEP_NAME': 'standard_train_step'},\n",
" 'VERBOSE': True}\n",
"INFO 2021-10-18 00:57:14,265 train.py: 117: System config:\n",
"------------------- ---------------------------------------------------------------\n",
"sys.platform linux\n",
"Python 3.7.12 (default, Sep 10 2021, 00:21:48) [GCC 7.5.0]\n",
"numpy 1.19.5\n",
"Pillow 7.1.2\n",
"vissl 0.1.6 @/content/vissl/vissl\n",
"GPU available True\n",
"GPU 0 Tesla K80\n",
"CUDA_HOME /usr/local/cuda\n",
"torchvision 0.9.0+cu101 @/usr/local/lib/python3.7/dist-packages/torchvision\n",
"hydra 1.0.7 @/usr/local/lib/python3.7/dist-packages/hydra\n",
"classy_vision 0.7.0.dev @/usr/local/lib/python3.7/dist-packages/classy_vision\n",
"tensorboard 2.6.0\n",
"apex 0.1 @/usr/local/lib/python3.7/dist-packages/apex\n",
"cv2 4.1.2\n",
"PyTorch 1.8.0+cu101 @/usr/local/lib/python3.7/dist-packages/torch\n",
"PyTorch debug build False\n",
"------------------- ---------------------------------------------------------------\n",
"PyTorch built with:\n",
" - GCC 7.3\n",
" - C++ Version: 201402\n",
" - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications\n",
" - Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)\n",
" - OpenMP 201511 (a.k.a. OpenMP 4.5)\n",
" - NNPACK is enabled\n",
" - CPU capability usage: AVX2\n",
" - CUDA Runtime 10.1\n",
" - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70\n",
" - CuDNN 7.6.3\n",
" - Magma 2.5.2\n",
" - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.1, CUDNN_VERSION=7.6.3, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, \n",
"\n",
"CPU info:\n",
"------------------- ------------------------------\n",
"Architecture x86_64\n",
"CPU op-mode(s) 32-bit, 64-bit\n",
"Byte Order Little Endian\n",
"CPU(s) 2\n",
"On-line CPU(s) list 0,1\n",
"Thread(s) per core 2\n",
"Core(s) per socket 1\n",
"Socket(s) 1\n",
"NUMA node(s) 1\n",
"Vendor ID GenuineIntel\n",
"CPU family 6\n",
"Model 63\n",
"Model name Intel(R) Xeon(R) CPU @ 2.30GHz\n",
"Stepping 0\n",
"CPU MHz 2299.998\n",
"BogoMIPS 4599.99\n",
"Hypervisor vendor KVM\n",
"Virtualization type full\n",
"L1d cache 32K\n",
"L1i cache 32K\n",
"L2 cache 256K\n",
"L3 cache 46080K\n",
"NUMA node0 CPU(s) 0,1\n",
"------------------- ------------------------------\n",
"INFO 2021-10-18 00:57:14,266 trainer_main.py: 113: Using Distributed init method: tcp://localhost:57775, world_size: 1, rank: 0\n",
"INFO 2021-10-18 00:57:14,267 distributed_c10d.py: 187: Added key: store_based_barrier_key:1 to store for rank: 0\n",
"INFO 2021-10-18 00:57:14,267 trainer_main.py: 134: | initialized host 0440442413ae as rank 0 (0)\n",
"INFO 2021-10-18 00:57:16,535 train_task.py: 181: Not using Automatic Mixed Precision\n",
"INFO 2021-10-18 00:57:16,536 train_task.py: 449: Building model....\n",
"INFO 2021-10-18 00:57:16,537 resnext.py: 68: ResNeXT trunk, supports activation checkpointing. Deactivated\n",
"INFO 2021-10-18 00:57:16,537 resnext.py: 88: Building model: ResNeXt50-1x64d-w1-BatchNorm2d\n",
"INFO 2021-10-18 00:57:17,301 train_task.py: 423: Initializing model from: /content/resnet50-19c8e357.pth\n",
"INFO 2021-10-18 00:57:17,301 util.py: 276: Attempting to load checkpoint from /content/resnet50-19c8e357.pth\n",
"INFO 2021-10-18 00:57:17,586 util.py: 281: Loaded checkpoint from /content/resnet50-19c8e357.pth\n",
"INFO 2021-10-18 00:57:17,586 util.py: 240: Broadcasting checkpoint loaded from /content/resnet50-19c8e357.pth\n",
"INFO 2021-10-18 00:57:21,459 train_task.py: 429: Checkpoint loaded: /content/resnet50-19c8e357.pth...\n",
"INFO 2021-10-18 00:57:21,461 checkpoint.py: 886: Loaded: trunk._feature_blocks.conv1.weight of shape: torch.Size([64, 3, 7, 7]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,461 checkpoint.py: 886: Loaded: trunk._feature_blocks.bn1.weight of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,461 checkpoint.py: 886: Loaded: trunk._feature_blocks.bn1.bias of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,461 checkpoint.py: 886: Loaded: trunk._feature_blocks.bn1.running_mean of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,462 checkpoint.py: 886: Loaded: trunk._feature_blocks.bn1.running_var of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,462 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,462 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.conv1.weight of shape: torch.Size([64, 64, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,462 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn1.weight of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,462 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn1.bias of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,462 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn1.running_mean of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,462 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn1.running_var of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,463 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer1.0.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,463 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.conv2.weight of shape: torch.Size([64, 64, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,463 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn2.weight of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,463 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn2.bias of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,463 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn2.running_mean of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,464 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn2.running_var of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,464 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer1.0.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,464 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.conv3.weight of shape: torch.Size([256, 64, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,464 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn3.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,464 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn3.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,464 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn3.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,464 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.bn3.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,464 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer1.0.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,465 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.downsample.0.weight of shape: torch.Size([256, 64, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,465 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.downsample.1.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,465 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.downsample.1.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,465 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.downsample.1.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,465 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.0.downsample.1.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,465 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer1.0.downsample.1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,465 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.conv1.weight of shape: torch.Size([64, 256, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,466 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn1.weight of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,466 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn1.bias of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,466 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn1.running_mean of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,466 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn1.running_var of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,466 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer1.1.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,466 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.conv2.weight of shape: torch.Size([64, 64, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,466 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn2.weight of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,467 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn2.bias of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,467 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn2.running_mean of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,467 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn2.running_var of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,467 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer1.1.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,467 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.conv3.weight of shape: torch.Size([256, 64, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,467 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn3.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,467 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn3.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,468 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn3.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,468 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.1.bn3.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,468 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer1.1.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,468 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.conv1.weight of shape: torch.Size([64, 256, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,468 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn1.weight of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,468 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn1.bias of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,468 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn1.running_mean of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,469 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn1.running_var of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,469 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer1.2.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,469 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.conv2.weight of shape: torch.Size([64, 64, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,469 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn2.weight of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,469 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn2.bias of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,469 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn2.running_mean of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,470 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn2.running_var of shape: torch.Size([64]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,470 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer1.2.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,470 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.conv3.weight of shape: torch.Size([256, 64, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,470 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn3.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,470 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn3.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,470 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn3.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,471 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer1.2.bn3.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,471 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer1.2.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,471 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.conv1.weight of shape: torch.Size([128, 256, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,471 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn1.weight of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,471 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn1.bias of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,471 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn1.running_mean of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,471 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn1.running_var of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,472 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.0.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,472 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.conv2.weight of shape: torch.Size([128, 128, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,472 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn2.weight of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,472 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn2.bias of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,472 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn2.running_mean of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,472 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn2.running_var of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,473 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.0.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,473 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.conv3.weight of shape: torch.Size([512, 128, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,473 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn3.weight of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,473 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn3.bias of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,473 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn3.running_mean of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,473 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.bn3.running_var of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,474 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.0.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,474 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.downsample.0.weight of shape: torch.Size([512, 256, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,474 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.downsample.1.weight of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,474 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.downsample.1.bias of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,474 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.downsample.1.running_mean of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,474 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.0.downsample.1.running_var of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,474 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.0.downsample.1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,475 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.conv1.weight of shape: torch.Size([128, 512, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,475 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn1.weight of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,475 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn1.bias of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,475 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn1.running_mean of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,475 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn1.running_var of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,475 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.1.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,476 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.conv2.weight of shape: torch.Size([128, 128, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,476 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn2.weight of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,476 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn2.bias of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,476 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn2.running_mean of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,476 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn2.running_var of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,476 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.1.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,476 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.conv3.weight of shape: torch.Size([512, 128, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,477 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn3.weight of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,477 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn3.bias of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,477 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn3.running_mean of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,477 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.1.bn3.running_var of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,477 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.1.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,477 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.conv1.weight of shape: torch.Size([128, 512, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,478 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn1.weight of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,478 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn1.bias of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,478 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn1.running_mean of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,478 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn1.running_var of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,478 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.2.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,478 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.conv2.weight of shape: torch.Size([128, 128, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,478 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn2.weight of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,479 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn2.bias of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,479 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn2.running_mean of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,479 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn2.running_var of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,479 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.2.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,508 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.conv3.weight of shape: torch.Size([512, 128, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,508 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn3.weight of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,509 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn3.bias of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,509 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn3.running_mean of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,509 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.2.bn3.running_var of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,509 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.2.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,510 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.conv1.weight of shape: torch.Size([128, 512, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,510 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn1.weight of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,510 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn1.bias of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,510 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn1.running_mean of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,510 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn1.running_var of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,510 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.3.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,511 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.conv2.weight of shape: torch.Size([128, 128, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,511 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn2.weight of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,511 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn2.bias of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,511 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn2.running_mean of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,512 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn2.running_var of shape: torch.Size([128]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,512 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.3.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,512 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.conv3.weight of shape: torch.Size([512, 128, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,512 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn3.weight of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,512 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn3.bias of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,513 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn3.running_mean of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,513 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer2.3.bn3.running_var of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,513 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer2.3.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,513 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.conv1.weight of shape: torch.Size([256, 512, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,514 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn1.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,514 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn1.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,514 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn1.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,514 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn1.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,514 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.0.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,515 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.conv2.weight of shape: torch.Size([256, 256, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,515 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn2.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,516 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn2.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,516 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn2.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,516 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn2.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,516 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.0.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,516 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.conv3.weight of shape: torch.Size([1024, 256, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,517 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn3.weight of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,517 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn3.bias of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,517 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn3.running_mean of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,517 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.bn3.running_var of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,517 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.0.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,518 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.downsample.0.weight of shape: torch.Size([1024, 512, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,518 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.downsample.1.weight of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,518 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.downsample.1.bias of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,518 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.downsample.1.running_mean of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,519 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.0.downsample.1.running_var of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,519 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.0.downsample.1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,519 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.conv1.weight of shape: torch.Size([256, 1024, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,519 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn1.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,519 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn1.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,519 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn1.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,520 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn1.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,520 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.1.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,521 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.conv2.weight of shape: torch.Size([256, 256, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,521 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn2.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,521 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn2.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,521 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn2.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,521 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn2.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,521 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.1.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,522 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.conv3.weight of shape: torch.Size([1024, 256, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,522 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn3.weight of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,522 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn3.bias of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,522 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn3.running_mean of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,522 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.1.bn3.running_var of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,522 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.1.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,523 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.conv1.weight of shape: torch.Size([256, 1024, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,523 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn1.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,523 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn1.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,523 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn1.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,523 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn1.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,523 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.2.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,524 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.conv2.weight of shape: torch.Size([256, 256, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,524 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn2.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,524 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn2.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,524 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn2.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,525 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn2.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,525 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.2.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,525 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.conv3.weight of shape: torch.Size([1024, 256, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,525 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn3.weight of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,525 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn3.bias of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,526 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn3.running_mean of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,526 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.2.bn3.running_var of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,526 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.2.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,526 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.conv1.weight of shape: torch.Size([256, 1024, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,526 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn1.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,526 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn1.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,526 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn1.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,527 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn1.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,527 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.3.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,527 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.conv2.weight of shape: torch.Size([256, 256, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,528 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn2.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,528 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn2.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,528 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn2.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,528 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn2.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,528 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.3.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,528 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.conv3.weight of shape: torch.Size([1024, 256, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,529 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn3.weight of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,529 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn3.bias of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,529 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn3.running_mean of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,529 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.3.bn3.running_var of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,529 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.3.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,530 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.conv1.weight of shape: torch.Size([256, 1024, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,530 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn1.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,530 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn1.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,530 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn1.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,530 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn1.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,530 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.4.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,531 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.conv2.weight of shape: torch.Size([256, 256, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,531 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn2.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,531 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn2.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,531 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn2.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,532 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn2.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,532 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.4.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,532 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.conv3.weight of shape: torch.Size([1024, 256, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,532 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn3.weight of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,532 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn3.bias of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,532 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn3.running_mean of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,533 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.4.bn3.running_var of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,533 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.4.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,533 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.conv1.weight of shape: torch.Size([256, 1024, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,533 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn1.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,533 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn1.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,533 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn1.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,534 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn1.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,534 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.5.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,534 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.conv2.weight of shape: torch.Size([256, 256, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,534 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn2.weight of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,535 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn2.bias of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,535 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn2.running_mean of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,535 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn2.running_var of shape: torch.Size([256]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,535 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.5.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,535 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.conv3.weight of shape: torch.Size([1024, 256, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,536 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn3.weight of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,536 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn3.bias of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,536 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn3.running_mean of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,536 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer3.5.bn3.running_var of shape: torch.Size([1024]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,536 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer3.5.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,537 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.conv1.weight of shape: torch.Size([512, 1024, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,537 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn1.weight of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,537 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn1.bias of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,537 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn1.running_mean of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,538 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn1.running_var of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,538 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer4.0.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,540 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.conv2.weight of shape: torch.Size([512, 512, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,541 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn2.weight of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,541 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn2.bias of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,541 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn2.running_mean of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,541 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn2.running_var of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,541 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer4.0.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,617 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.conv3.weight of shape: torch.Size([2048, 512, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,617 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn3.weight of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,618 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn3.bias of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,618 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn3.running_mean of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,618 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.bn3.running_var of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,618 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer4.0.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,620 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.downsample.0.weight of shape: torch.Size([2048, 1024, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,620 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.downsample.1.weight of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,621 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.downsample.1.bias of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,621 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.downsample.1.running_mean of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,621 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.0.downsample.1.running_var of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,621 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer4.0.downsample.1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,622 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.conv1.weight of shape: torch.Size([512, 2048, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,622 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn1.weight of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,622 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn1.bias of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,622 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn1.running_mean of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,623 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn1.running_var of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,623 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer4.1.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,625 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.conv2.weight of shape: torch.Size([512, 512, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,625 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn2.weight of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,625 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn2.bias of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,625 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn2.running_mean of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,625 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn2.running_var of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,626 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer4.1.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,627 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.conv3.weight of shape: torch.Size([2048, 512, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,627 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn3.weight of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,627 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn3.bias of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,627 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn3.running_mean of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,627 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.1.bn3.running_var of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,627 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer4.1.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,629 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.conv1.weight of shape: torch.Size([512, 2048, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,629 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn1.weight of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,629 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn1.bias of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,629 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn1.running_mean of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,629 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn1.running_var of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,629 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer4.2.bn1.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,631 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.conv2.weight of shape: torch.Size([512, 512, 3, 3]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,632 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn2.weight of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,632 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn2.bias of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,632 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn2.running_mean of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,632 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn2.running_var of shape: torch.Size([512]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,632 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer4.2.bn2.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,633 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.conv3.weight of shape: torch.Size([2048, 512, 1, 1]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,634 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn3.weight of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,634 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn3.bias of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,634 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn3.running_mean of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,634 checkpoint.py: 886: Loaded: trunk._feature_blocks.layer4.2.bn3.running_var of shape: torch.Size([2048]) from checkpoint\n",
"INFO 2021-10-18 00:57:21,634 checkpoint.py: 851: Ignored layer:\ttrunk._feature_blocks.layer4.2.bn3.num_batches_tracked\n",
"INFO 2021-10-18 00:57:21,634 checkpoint.py: 894: Not found:\t\theads.0.clf.0.weight, not initialized\n",
"INFO 2021-10-18 00:57:21,634 checkpoint.py: 894: Not found:\t\theads.0.clf.0.bias, not initialized\n",
"INFO 2021-10-18 00:57:21,635 checkpoint.py: 901: Extra layers not loaded from checkpoint: ['trunk._feature_blocks.fc.weight', 'trunk._feature_blocks.fc.bias', 'trunk._feature_blocks.type']\n",
"INFO 2021-10-18 00:57:21,647 train_task.py: 651: Broadcast model BN buffers from primary on every forward pass\n",
"INFO 2021-10-18 00:57:21,648 classification_task.py: 387: Synchronized Batch Normalization is disabled\n",
"INFO 2021-10-18 00:57:21,690 optimizer_helper.py: 294: \n",
"Trainable params: 161, \n",
"Non-Trainable params: 0, \n",
"Trunk Regularized Parameters: 53, \n",
"Trunk Unregularized Parameters 106, \n",
"Head Regularized Parameters: 2, \n",
"Head Unregularized Parameters: 0 \n",
"Remaining Regularized Parameters: 0 \n",
"Remaining Unregularized Parameters: 0\n",
"INFO 2021-10-18 00:57:21,691 ssl_dataset.py: 157: Rank: 0 split: TEST Data files:\n",
"['/content/dummy_data/val']\n",
"INFO 2021-10-18 00:57:21,691 ssl_dataset.py: 160: Rank: 0 split: TEST Label files:\n",
"['/content/dummy_data/val']\n",
"INFO 2021-10-18 00:57:21,692 disk_dataset.py: 86: Loaded 10 samples from folder /content/dummy_data/val\n",
"INFO 2021-10-18 00:57:21,692 ssl_dataset.py: 157: Rank: 0 split: TRAIN Data files:\n",
"['/content/dummy_data/train']\n",
"INFO 2021-10-18 00:57:21,692 ssl_dataset.py: 160: Rank: 0 split: TRAIN Label files:\n",
"['/content/dummy_data/train']\n",
"INFO 2021-10-18 00:57:21,692 disk_dataset.py: 86: Loaded 10 samples from folder /content/dummy_data/train\n",
"INFO 2021-10-18 00:57:21,693 misc.py: 161: Set start method of multiprocessing to forkserver\n",
"INFO 2021-10-18 00:57:21,693 __init__.py: 126: Created the Distributed Sampler....\n",
"INFO 2021-10-18 00:57:21,693 __init__.py: 101: Distributed Sampler config:\n",
"{'num_replicas': 1, 'rank': 0, 'epoch': 0, 'num_samples': 10, 'total_size': 10, 'shuffle': True, 'seed': 0}\n",
"/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:477: UserWarning: This DataLoader will create 5 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.\n",
" cpuset_checked))\n",
"INFO 2021-10-18 00:57:21,694 __init__.py: 215: Wrapping the dataloader to async device copies\n",
"INFO 2021-10-18 00:57:21,694 misc.py: 161: Set start method of multiprocessing to forkserver\n",
"INFO 2021-10-18 00:57:21,694 __init__.py: 126: Created the Distributed Sampler....\n",
"INFO 2021-10-18 00:57:21,694 __init__.py: 101: Distributed Sampler config:\n",
"{'num_replicas': 1, 'rank': 0, 'epoch': 0, 'num_samples': 10, 'total_size': 10, 'shuffle': True, 'seed': 0}\n",
"INFO 2021-10-18 00:57:21,694 __init__.py: 215: Wrapping the dataloader to async device copies\n",
"INFO 2021-10-18 00:57:21,695 train_task.py: 384: Building loss...\n",
"INFO 2021-10-18 00:57:21,695 trainer_main.py: 268: Training 2 epochs\n",
"INFO 2021-10-18 00:57:21,695 trainer_main.py: 269: One epoch = 5 iterations.\n",
"INFO 2021-10-18 00:57:21,695 trainer_main.py: 270: Total 10 samples in one epoch\n",
"INFO 2021-10-18 00:57:21,695 trainer_main.py: 276: Total 10 iterations for training\n",
"INFO 2021-10-18 00:57:21,820 logger.py: 84: Mon Oct 18 00:57:21 2021 \n",
"+-----------------------------------------------------------------------------+\n",
"| NVIDIA-SMI 470.74 Driver Version: 460.32.03 CUDA Version: 11.2 |\n",
"|-------------------------------+----------------------+----------------------+\n",
"| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n",
"| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n",
"| | | MIG M. |\n",
"|===============================+======================+======================|\n",
"| 0 Tesla K80 Off | 00000000:00:04.0 Off | 0 |\n",
"| N/A 75C P0 77W / 149W | 562MiB / 11441MiB | 0% Default |\n",
"| | | N/A |\n",
"+-------------------------------+----------------------+----------------------+\n",
" \n",
"+-----------------------------------------------------------------------------+\n",
"| Processes: |\n",
"| GPU GI CI PID Type Process name GPU Memory |\n",
"| ID ID Usage |\n",
"|=============================================================================|\n",
"| No running processes found |\n",
"+-----------------------------------------------------------------------------+\n",
"\n",
"INFO 2021-10-18 00:57:21,822 trainer_main.py: 173: Model is:\n",
" Classy :\n",
"BaseSSLMultiInputOutputModel(\n",
" (_heads): ModuleDict()\n",
" (trunk): ResNeXt(\n",
" (_feature_blocks): ModuleDict(\n",
" (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)\n",
" (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv1_relu): ReLU(inplace=True)\n",
" (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)\n",
" (layer1): Sequential(\n",
" (0): Bottleneck(\n",
" (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" (downsample): Sequential(\n",
" (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" )\n",
" )\n",
" (1): Bottleneck(\n",
" (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" (2): Bottleneck(\n",
" (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" )\n",
" (layer2): Sequential(\n",
" (0): Bottleneck(\n",
" (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" (downsample): Sequential(\n",
" (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)\n",
" (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" )\n",
" )\n",
" (1): Bottleneck(\n",
" (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" (2): Bottleneck(\n",
" (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" (3): Bottleneck(\n",
" (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" )\n",
" (layer3): Sequential(\n",
" (0): Bottleneck(\n",
" (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" (downsample): Sequential(\n",
" (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)\n",
" (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" )\n",
" )\n",
" (1): Bottleneck(\n",
" (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" (2): Bottleneck(\n",
" (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" (3): Bottleneck(\n",
" (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" (4): Bottleneck(\n",
" (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" (5): Bottleneck(\n",
" (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" )\n",
" (layer4): Sequential(\n",
" (0): Bottleneck(\n",
" (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(, ), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" (downsample): Sequential(\n",
" (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(, ), bias=False)\n",
" (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" )\n",
" )\n",
" (1): Bottleneck(\n",
" (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" (2): Bottleneck(\n",
" (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace=True)\n",
" )\n",
" )\n",
" (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))\n",
" (flatten): Flatten()\n",
" )\n",
" )\n",
" (heads): ModuleList(\n",
" (0): MLP(\n",
" (clf): Sequential(\n",
" (0): Linear(in_features=2048, out_features=1000, bias=True)\n",
" )\n",
" )\n",
" )\n",
")\n",
"INFO 2021-10-18 00:57:21,822 trainer_main.py: 174: Loss is: CrossEntropyMultipleOutputSingleTargetLoss(\n",
" (criterion): CrossEntropyMultipleOutputSingleTargetCriterion(\n",
" (_losses): ModuleList()\n",
" )\n",
")\n",
"INFO 2021-10-18 00:57:21,829 trainer_main.py: 175: Starting training....\n",
"INFO 2021-10-18 00:57:21,829 __init__.py: 101: Distributed Sampler config:\n",
"{'num_replicas': 1, 'rank': 0, 'epoch': 0, 'num_samples': 10, 'total_size': 10, 'shuffle': True, 'seed': 0}\n",
"/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:477: UserWarning: This DataLoader will create 5 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.\n",
" cpuset_checked))\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"INFO 2021-10-18 00:57:27,492 trainer_main.py: 333: Phase advanced. Rank: 0\n",
"INFO 2021-10-18 00:57:27,494 log_hooks.py: 77: ========= Memory Summary at on_phase_start =======\n",
"|===========================================================================|\n",
"| PyTorch CUDA memory summary, device ID 0 |\n",
"|---------------------------------------------------------------------------|\n",
"| CUDA OOMs: 0 | cudaMalloc retries: 0 |\n",
"|===========================================================================|\n",
"| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |\n",
"|---------------------------------------------------------------------------|\n",
"| Allocated memory | 101251 KB | 101251 KB | 101251 KB | 512 B |\n",
"| from large pool | 83416 KB | 83416 KB | 83416 KB | 0 B |\n",
"| from small pool | 17835 KB | 17835 KB | 17835 KB | 512 B |\n",
"|---------------------------------------------------------------------------|\n",
"| Active memory | 101251 KB | 101251 KB | 101251 KB | 512 B |\n",
"| from large pool | 83416 KB | 83416 KB | 83416 KB | 0 B |\n",
"| from small pool | 17835 KB | 17835 KB | 17835 KB | 512 B |\n",
"|---------------------------------------------------------------------------|\n",
"| GPU reserved memory | 143360 KB | 143360 KB | 143360 KB | 0 B |\n",
"| from large pool | 122880 KB | 122880 KB | 122880 KB | 0 B |\n",
"| from small pool | 20480 KB | 20480 KB | 20480 KB | 0 B |\n",
"|---------------------------------------------------------------------------|\n",
"| Non-releasable memory | 42109 KB | 42110 KB | 109570 KB | 67461 KB |\n",
"| from large pool | 39464 KB | 39464 KB | 93800 KB | 54336 KB |\n",
"| from small pool | 2645 KB | 2646 KB | 15770 KB | 13125 KB |\n",
"|---------------------------------------------------------------------------|\n",
"| Allocations | 324 | 324 | 325 | 1 |\n",
"| from large pool | 19 | 19 | 19 | 0 |\n",
"| from small pool | 305 | 305 | 306 | 1 |\n",
"|---------------------------------------------------------------------------|\n",
"| Active allocs | 324 | 324 | 325 | 1 |\n",
"| from large pool | 19 | 19 | 19 | 0 |\n",
"| from small pool | 305 | 305 | 306 | 1 |\n",
"|---------------------------------------------------------------------------|\n",
"| GPU reserved segments | 16 | 16 | 16 | 0 |\n",
"| from large pool | 6 | 6 | 6 | 0 |\n",
"| from small pool | 10 | 10 | 10 | 0 |\n",
"|---------------------------------------------------------------------------|\n",
"| Non-releasable allocs | 9 | 9 | 17 | 8 |\n",
"| from large pool | 6 | 6 | 6 | 0 |\n",
"| from small pool | 3 | 5 | 11 | 8 |\n",
"|===========================================================================|\n",
"\n",
"\n",
"INFO 2021-10-18 00:57:27,494 state_update_hooks.py: 113: Starting phase 0 [train]\n",
"INFO 2021-10-18 00:57:28,905 log_hooks.py: 77: ========= Memory Summary at on_forward =======\n",
"|===========================================================================|\n",
"| PyTorch CUDA memory summary, device ID 0 |\n",
"|---------------------------------------------------------------------------|\n",
"| CUDA OOMs: 0 | cudaMalloc retries: 0 |\n",
"|===========================================================================|\n",
"| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |\n",
"|---------------------------------------------------------------------------|\n",
"| Allocated memory | 271503 KB | 2578 MB | 14863 MB | 14598 MB |\n",
"| from large pool | 224816 KB | 2537 MB | 14812 MB | 14593 MB |\n",
"| from small pool | 46687 KB | 45 MB | 50 MB | 4 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| Active memory | 271503 KB | 2578 MB | 14863 MB | 14598 MB |\n",
"| from large pool | 224816 KB | 2537 MB | 14812 MB | 14593 MB |\n",
"| from small pool | 46687 KB | 45 MB | 50 MB | 4 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| GPU reserved memory | 3038 MB | 4186 MB | 11632 MB | 8594 MB |\n",
"| from large pool | 2988 MB | 4142 MB | 11580 MB | 8592 MB |\n",
"| from small pool | 50 MB | 50 MB | 52 MB | 2 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| Non-releasable memory | 465776 KB | 1676 MB | 2792 MB | 2337 MB |\n",
"| from large pool | 461264 KB | 1671 MB | 2752 MB | 2302 MB |\n",
"| from small pool | 4512 KB | 6 MB | 40 MB | 35 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| Allocations | 540 | 540 | 657 | 117 |\n",
"| from large pool | 69 | 70 | 105 | 36 |\n",
"| from small pool | 471 | 471 | 552 | 81 |\n",
"|---------------------------------------------------------------------------|\n",
"| Active allocs | 540 | 540 | 657 | 117 |\n",
"| from large pool | 69 | 70 | 105 | 36 |\n",
"| from small pool | 471 | 471 | 552 | 81 |\n",
"|---------------------------------------------------------------------------|\n",
"| GPU reserved segments | 34 | 34 | 45 | 11 |\n",
"| from large pool | 9 | 10 | 19 | 10 |\n",
"| from small pool | 25 | 25 | 26 | 1 |\n",
"|---------------------------------------------------------------------------|\n",
"| Non-releasable allocs | 23 | 23 | 102 | 79 |\n",
"| from large pool | 5 | 7 | 21 | 16 |\n",
"| from small pool | 18 | 18 | 81 | 63 |\n",
"|===========================================================================|\n",
"\n",
"\n",
"INFO 2021-10-18 00:57:30,260 log_hooks.py: 77: ========= Memory Summary at on_backward =======\n",
"|===========================================================================|\n",
"| PyTorch CUDA memory summary, device ID 0 |\n",
"|---------------------------------------------------------------------------|\n",
"| CUDA OOMs: 0 | cudaMalloc retries: 0 |\n",
"|===========================================================================|\n",
"| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |\n",
"|---------------------------------------------------------------------------|\n",
"| Allocated memory | 206433 KB | 2595 MB | 42189 MB | 41987 MB |\n",
"| from large pool | 170992 KB | 2550 MB | 42077 MB | 41910 MB |\n",
"| from small pool | 35441 KB | 47 MB | 111 MB | 77 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| Active memory | 206433 KB | 2595 MB | 42189 MB | 41987 MB |\n",
"| from large pool | 170992 KB | 2550 MB | 42077 MB | 41910 MB |\n",
"| from small pool | 35441 KB | 47 MB | 111 MB | 77 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| GPU reserved memory | 729088 KB | 4186 MB | 17722 MB | 17010 MB |\n",
"| from large pool | 686080 KB | 4142 MB | 17658 MB | 16988 MB |\n",
"| from small pool | 43008 KB | 52 MB | 64 MB | 22 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| Non-releasable memory | 506270 KB | 2176 MB | 11250 MB | 10756 MB |\n",
"| from large pool | 498704 KB | 2171 MB | 11136 MB | 10649 MB |\n",
"| from small pool | 7566 KB | 8 MB | 114 MB | 106 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| Allocations | 492 | 547 | 1074 | 582 |\n",
"| from large pool | 38 | 83 | 257 | 219 |\n",
"| from small pool | 454 | 478 | 817 | 363 |\n",
"|---------------------------------------------------------------------------|\n",
"| Active allocs | 492 | 547 | 1074 | 582 |\n",
"| from large pool | 38 | 83 | 257 | 219 |\n",
"| from small pool | 454 | 478 | 817 | 363 |\n",
"|---------------------------------------------------------------------------|\n",
"| GPU reserved segments | 29 | 35 | 58 | 29 |\n",
"| from large pool | 8 | 10 | 26 | 18 |\n",
"| from small pool | 21 | 26 | 32 | 11 |\n",
"|---------------------------------------------------------------------------|\n",
"| Non-releasable allocs | 34 | 34 | 348 | 314 |\n",
"| from large pool | 12 | 13 | 130 | 118 |\n",
"| from small pool | 22 | 22 | 218 | 196 |\n",
"|===========================================================================|\n",
"\n",
"\n",
"INFO 2021-10-18 00:57:30,272 log_hooks.py: 77: ========= Memory Summary at on_update =======\n",
"|===========================================================================|\n",
"| PyTorch CUDA memory summary, device ID 0 |\n",
"|---------------------------------------------------------------------------|\n",
"| CUDA OOMs: 0 | cudaMalloc retries: 0 |\n",
"|===========================================================================|\n",
"| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |\n",
"|---------------------------------------------------------------------------|\n",
"| Allocated memory | 310013 KB | 2595 MB | 42391 MB | 42088 MB |\n",
"| from large pool | 256976 KB | 2550 MB | 42244 MB | 41994 MB |\n",
"| from small pool | 53037 KB | 52 MB | 146 MB | 94 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| Active memory | 310013 KB | 2595 MB | 42391 MB | 42088 MB |\n",
"| from large pool | 256976 KB | 2550 MB | 42244 MB | 41994 MB |\n",
"| from small pool | 53037 KB | 52 MB | 146 MB | 94 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| GPU reserved memory | 743424 KB | 4186 MB | 17736 MB | 17010 MB |\n",
"| from large pool | 686080 KB | 4142 MB | 17658 MB | 16988 MB |\n",
"| from small pool | 57344 KB | 56 MB | 78 MB | 22 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| Non-releasable memory | 433410 KB | 2176 MB | 11367 MB | 10944 MB |\n",
"| from large pool | 429104 KB | 2171 MB | 11227 MB | 10808 MB |\n",
"| from small pool | 4306 KB | 8 MB | 140 MB | 136 MB |\n",
"|---------------------------------------------------------------------------|\n",
"| Allocations | 653 | 654 | 1396 | 743 |\n",
"| from large pool | 56 | 83 | 293 | 237 |\n",
"| from small pool | 597 | 598 | 1103 | 506 |\n",
"|---------------------------------------------------------------------------|\n",
"| Active allocs | 653 | 654 | 1396 | 743 |\n",
"| from large pool | 56 | 83 | 293 | 237 |\n",
"| from small pool | 597 | 598 | 1103 | 506 |\n",
"|---------------------------------------------------------------------------|\n",
"| GPU reserved segments | 36 | 36 | 65 | 29 |\n",
"| from large pool | 8 | 10 | 26 | 18 |\n",
"| from small pool | 28 | 28 | 39 | 11 |\n",
"|---------------------------------------------------------------------------|\n",
"| Non-releasable allocs | 15 | 35 | 386 | 371 |\n",
"| from large pool | 7 | 13 | 136 | 129 |\n",
"| from small pool | 8 | 24 | 250 | 242 |\n",
"|===========================================================================|\n",
"\n",
"\n",
"INFO 2021-10-18 00:57:30,272 log_hooks.py: 277: Rank: 0; [ep: 0] iter: 0; lr: 0.00078; loss: 6.94697; btime(ms): 0; eta: 0:00:00; peak_mem(M): 2595;\n",
"INFO 2021-10-18 00:57:30,360 log_hooks.py: 277: Rank: 0; [ep: 0] iter: 1; lr: 0.00078; loss: 7.53335; btime(ms): 8577; eta: 0:01:17; peak_mem(M): 2595; max_iterations: 10;\n",
"INFO 2021-10-18 00:57:30,593 trainer_main.py: 214: Meters synced\n",
"INFO 2021-10-18 00:57:30,632 log_hooks.py: 568: Average train batch time (ms) for 5 batches: 627\n",
"INFO 2021-10-18 00:57:30,633 log_hooks.py: 577: Train step time breakdown (rank 0):\n",
" Timer Host CudaEvent\n",
" read_sample: 21.33 ms 14.32 ms\n",
" forward: 281.86 ms 288.07 ms\n",
" loss_compute: 0.78 ms 0.78 ms\n",
" loss_all_reduce: 0.10 ms 0.12 ms\n",
" meters_update: 0.52 ms 0.52 ms\n",
" backward: 282.62 ms 311.82 ms\n",
" optimizer_step: 7.85 ms 10.37 ms\n",
" train_step_total: 619.58 ms 627.53 ms\n",
"INFO 2021-10-18 00:57:30,633 log_hooks.py: 498: Rank: 0, name: train_accuracy_list_meter, value: {'top_1': {0: 20.0}, 'top_5': {0: 40.0}}\n",
"INFO 2021-10-18 00:57:30,633 io.py: 63: Saving data to file: /content/checkpoints/metrics.json\n",
"INFO 2021-10-18 00:57:30,634 io.py: 89: Saved data to file: /content/checkpoints/metrics.json\n",
"INFO 2021-10-18 00:57:30,634 log_hooks.py: 426: [phase: 0] Saving checkpoint to /content/checkpoints\n",
"INFO 2021-10-18 00:57:31,134 checkpoint.py: 131: Saved checkpoint: /content/checkpoints/model_phase0.torch\n",
"INFO 2021-10-18 00:57:31,135 checkpoint.py: 140: Creating symlink...\n",
"INFO 2021-10-18 00:57:31,136 checkpoint.py: 144: Created symlink: /content/checkpoints/checkpoint.torch\n",
"INFO 2021-10-18 00:57:31,136 __init__.py: 101: Distributed Sampler config:\n",
"{'num_replicas': 1, 'rank': 0, 'epoch': 1, 'num_samples': 10, 'total_size': 10, 'shuffle': True, 'seed': 0}\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"INFO 2021-10-18 00:57:36,682 trainer_main.py: 333: Phase advanced. Rank: 0\n",
"INFO 2021-10-18 00:57:36,682 state_update_hooks.py: 113: Starting phase 1 [test]\n",
"INFO 2021-10-18 00:57:36,884 trainer_main.py: 214: Meters synced\n",
"INFO 2021-10-18 00:57:36,885 log_hooks.py: 568: Average test batch time (ms) for 5 batches: 40\n",
"INFO 2021-10-18 00:57:36,885 log_hooks.py: 498: Rank: 0, name: test_accuracy_list_meter, value: {'top_1': {0: 50.0}, 'top_5': {0: 100.0}}\n",
"INFO 2021-10-18 00:57:36,885 io.py: 63: Saving data to file: /content/checkpoints/metrics.json\n",
"INFO 2021-10-18 00:57:36,886 io.py: 89: Saved data to file: /content/checkpoints/metrics.json\n",
"INFO 2021-10-18 00:57:36,886 __init__.py: 101: Distributed Sampler config:\n",
"{'num_replicas': 1, 'rank': 0, 'epoch': 2, 'num_samples': 10, 'total_size': 10, 'shuffle': True, 'seed': 0}\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"INFO 2021-10-18 00:57:42,361 trainer_main.py: 333: Phase advanced. Rank: 0\n",
"INFO 2021-10-18 00:57:42,362 state_update_hooks.py: 113: Starting phase 2 [train]\n",
"INFO 2021-10-18 00:57:42,509 log_hooks.py: 277: Rank: 0; [ep: 1] iter: 5; lr: 8e-05; loss: 4.80023; btime(ms): 1518; eta: 0:00:07; peak_mem(M): 2595;\n",
"INFO 2021-10-18 00:57:42,862 trainer_main.py: 214: Meters synced\n",
"INFO 2021-10-18 00:57:42,905 log_hooks.py: 568: Average train batch time (ms) for 5 batches: 108\n",
"INFO 2021-10-18 00:57:42,905 log_hooks.py: 577: Train step time breakdown (rank 0):\n",
" Timer Host CudaEvent\n",
" read_sample: 11.00 ms 3.88 ms\n",
" forward: 27.48 ms 34.67 ms\n",
" loss_compute: 0.65 ms 0.64 ms\n",
" loss_all_reduce: 0.10 ms 0.10 ms\n",
" meters_update: 0.46 ms 0.47 ms\n",
" backward: 15.65 ms 57.23 ms\n",
" optimizer_step: 8.85 ms 10.63 ms\n",
" train_step_total: 99.71 ms 108.45 ms\n",
"INFO 2021-10-18 00:57:42,905 log_hooks.py: 498: Rank: 0, name: train_accuracy_list_meter, value: {'top_1': {0: 50.0}, 'top_5': {0: 100.0}}\n",
"INFO 2021-10-18 00:57:42,906 io.py: 63: Saving data to file: /content/checkpoints/metrics.json\n",
"INFO 2021-10-18 00:57:42,906 io.py: 89: Saved data to file: /content/checkpoints/metrics.json\n",
"INFO 2021-10-18 00:57:42,906 log_hooks.py: 426: [phase: 1] Saving checkpoint to /content/checkpoints\n",
"INFO 2021-10-18 00:57:43,403 checkpoint.py: 131: Saved checkpoint: /content/checkpoints/model_final_checkpoint_phase1.torch\n",
"INFO 2021-10-18 00:57:43,404 checkpoint.py: 140: Creating symlink...\n",
"INFO 2021-10-18 00:57:43,404 checkpoint.py: 144: Created symlink: /content/checkpoints/checkpoint.torch\n",
"INFO 2021-10-18 00:57:43,405 __init__.py: 101: Distributed Sampler config:\n",
"{'num_replicas': 1, 'rank': 0, 'epoch': 3, 'num_samples': 10, 'total_size': 10, 'shuffle': True, 'seed': 0}\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"** fvcore version of PathManager will be deprecated soon. **\n",
"** Please migrate to the version in iopath repo. **\n",
"https://github.com/facebookresearch/iopath \n",
"\n",
"INFO 2021-10-18 00:57:48,868 trainer_main.py: 333: Phase advanced. Rank: 0\n",
"INFO 2021-10-18 00:57:48,869 state_update_hooks.py: 113: Starting phase 3 [test]\n",
"INFO 2021-10-18 00:57:49,126 trainer_main.py: 214: Meters synced\n",
"INFO 2021-10-18 00:57:49,126 log_hooks.py: 568: Average test batch time (ms) for 5 batches: 51\n",
"INFO 2021-10-18 00:57:49,127 log_hooks.py: 498: Rank: 0, name: test_accuracy_list_meter, value: {'top_1': {0: 50.0}, 'top_5': {0: 100.0}}\n",
"INFO 2021-10-18 00:57:49,127 io.py: 63: Saving data to file: /content/checkpoints/metrics.json\n",
"INFO 2021-10-18 00:57:49,127 io.py: 89: Saved data to file: /content/checkpoints/metrics.json\n",
"INFO 2021-10-18 00:57:49,227 train.py: 131: All Done!\n",
"INFO 2021-10-18 00:57:49,228 logger.py: 73: Shutting down loggers...\n",
"INFO 2021-10-18 00:57:49,228 distributed_launcher.py: 168: All Done!\n",
"INFO 2021-10-18 00:57:49,229 logger.py: 73: Shutting down loggers...\n"
]
}
],
"source": [
"!python3 tools/run_distributed_engines.py \\\n",
" hydra.verbose=true \\\n",
" config=benchmark/fulltune/imagenet1k/eval_resnet_8gpu_transfer_in1k_fulltune.yaml \\\n",
" config.DATA.TRAIN.DATA_SOURCES=[disk_folder] \\\n",
" config.DATA.TRAIN.LABEL_SOURCES=[disk_folder] \\\n",
" config.DATA.TRAIN.DATASET_NAMES=[dummy_data_folder] \\\n",
" config.DATA.TRAIN.BATCHSIZE_PER_REPLICA=2 \\\n",
" config.DATA.TEST.DATA_SOURCES=[disk_folder] \\\n",
" config.DATA.TEST.LABEL_SOURCES=[disk_folder] \\\n",
" config.DATA.TEST.DATASET_NAMES=[dummy_data_folder] \\\n",
" config.DATA.TEST.BATCHSIZE_PER_REPLICA=2 \\\n",
" config.OPTIMIZER.num_epochs=2 \\\n",
" config.OPTIMIZER.param_schedulers.lr.values=[0.01,0.001] \\\n",
" config.OPTIMIZER.param_schedulers.lr.milestones=[1] \\\n",
" config.DISTRIBUTED.NUM_NODES=1 \\\n",
" config.DISTRIBUTED.NUM_PROC_PER_NODE=1 \\\n",
" config.CHECKPOINT.DIR=\"/content/checkpoints\" \\\n",
" config.MODEL.WEIGHTS_INIT.PARAMS_FILE=\"/content/resnet50-19c8e357.pth\" \\\n",
" config.MODEL.WEIGHTS_INIT.APPEND_PREFIX=\"trunk._feature_blocks.\" \\\n",
" config.MODEL.WEIGHTS_INIT.STATE_DICT_KEY_NAME=\"\"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "A8fILq7VzyOu"
},
"source": [
"And we are done!! We have the full-finetuned model and the `metrics.json` containing `top-1` and `top-5` accuracy on validation set is available in `checkpoints/metrics.json`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "otUmgl4ms96M",
"outputId": "75ef95e3-f0b7-4f49-b72b-167e35f66680"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[0m\u001b[01;36mcheckpoint.torch\u001b[0m@ model_final_checkpoint_phase1.torch train_config.yaml\n",
"log.txt model_phase0.torch\n",
"metrics.json stdout.json\n"
]
}
],
"source": [
"ls /content/checkpoints/"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "vDMLjudpya2I",
"outputId": "a3ebafa6-f0da-41cf-80fd-553baf237bfa"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\"iteration\": 5, \"phase_idx\": 0, \"train_accuracy_list_meter\": {\"top_1\": {\"0\": 20.0}, \"top_5\": {\"0\": 40.0}}, \"train_phase_idx\": 0}\n",
"{\"iteration\": 5, \"phase_idx\": 1, \"test_accuracy_list_meter\": {\"top_1\": {\"0\": 50.0}, \"top_5\": {\"0\": 100.0}}, \"train_phase_idx\": 0}\n",
"{\"iteration\": 10, \"phase_idx\": 2, \"train_accuracy_list_meter\": {\"top_1\": {\"0\": 50.0}, \"top_5\": {\"0\": 100.0}}, \"train_phase_idx\": 1}\n",
"{\"iteration\": 10, \"phase_idx\": 3, \"test_accuracy_list_meter\": {\"top_1\": {\"0\": 50.0}, \"top_5\": {\"0\": 100.0}}, \"train_phase_idx\": 1}\n"
]
}
],
"source": [
"cat /content/checkpoints/metrics.json"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9xFUcTj00B_a"
},
"source": [
"# Loading Pre-trained models in VISSL\n",
"\n",
"VISSL supports Torchvision models out of the box. Generally, for loading any non-VISSL model, one needs to correctly set the following configuration options:\n",
"\n",
"```yaml\n",
"WEIGHTS_INIT:\n",
" # path to the .torch weights files\n",
" PARAMS_FILE: \"\"\n",
" # name of the state dict. checkpoint = {\"classy_state_dict\": {layername:value}}. Options:\n",
" # 1. classy_state_dict - if model is trained and checkpointed with VISSL.\n",
" # checkpoint = {\"classy_state_dict\": {layername:value}}\n",
" # 2. \"\" - if the model_file is not a nested dictionary for model weights i.e.\n",
" # checkpoint = {layername:value}\n",
" # 3. key name that your model checkpoint uses for state_dict key name.\n",
" # checkpoint = {\"your_key_name\": {layername:value}}\n",
" STATE_DICT_KEY_NAME: \"classy_state_dict\"\n",
" # specify what layer should not be loaded. Layer names with this key are not copied\n",
" # By default, set to BatchNorm stats \"num_batches_tracked\" to be skipped.\n",
" SKIP_LAYERS: [\"num_batches_tracked\"]\n",
" ####### If loading a non-VISSL trained model, set the following two args carefully #########\n",
" # to make the checkpoint compatible with VISSL, if you need to remove some names\n",
" # from the checkpoint keys, specify the name\n",
" REMOVE_PREFIX: \"\"\n",
" # In order to load the model (if not trained with VISSL) with VISSL, there are 2 scenarios:\n",
" # 1. If you are interested in evaluating the model features and freeze the trunk.\n",
" # Set APPEND_PREFIX=\"trunk.base_model.\" This assumes that your model is compatible\n",
" # with the VISSL trunks. The VISSL trunks start with \"_feature_blocks.\" prefix. If\n",
" # your model doesn't have these prefix you can append them. For example:\n",
" # For TorchVision ResNet trunk, set APPEND_PREFIX=\"trunk.base_model._feature_blocks.\"\n",
" # 2. where you want to load the model simply and finetune the full model.\n",
" # Set APPEND_PREFIX=\"trunk.\"\n",
" # This assumes that your model is compatible with the VISSL trunks. The VISSL\n",
" # trunks start with \"_feature_blocks.\" prefix. If your model doesn't have these\n",
" # prefix you can append them.\n",
" # For TorchVision ResNet trunk, set APPEND_PREFIX=\"trunk._feature_blocks.\"\n",
" # NOTE: the prefix is appended to all the layers in the model\n",
" APPEND_PREFIX: \"trunk._feature_blocks.\"\n",
" ```\n",
"\n",
" **NOTE:** The above configuration will only load the TRUNK of a torchvision model. If you wish to load the HEAD and TRUNK of a torchvision model, you will have to convert the torchvision model to a VISSL supported checkpoint."
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"include_colab_link": true,
"name": "Benchmark Full-Finetuning on ImageNet-1K V0.1.6.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}