Installing the SlowFast environment and testing the model on Ubuntu 18.04 (Python 3.9)

2023-10-24 12:50

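This article describes how to set up the SlowFast network environment on Ubuntu 18.04 with Python 3.9 and how to test the model with the demo.

When installing with pip, a mirror inside China is recommended, for example: pip install xxx -i https://pypi.tuna.tsinghua.edu.cn/simple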
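As a small sketch of the mirror tip, the helper below only builds the pip command line for a given index URL. The function name `pip_install_cmd` is a hypothetical helper introduced for illustration; the mirror URL is the one quoted in the tip.

```python
import sys

# Mirror URL taken from the tip in this article.
TSINGHUA_INDEX = "https://pypi.tuna.tsinghua.edu.cn/simple"


def pip_install_cmd(package, index_url=TSINGHUA_INDEX):
    """Build the argv list for installing `package` through a given index."""
    return [sys.executable, "-m", "pip", "install", package, "-i", index_url]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is installed by accident.
    print(" ".join(pip_install_cmd("simplejson")))
```

Passing the list to `subprocess.run` would run the installation with the interpreter of the active environment.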

Contents

1. Create the conda env

2. install pytorch

3. install fvcore

4. install simplejson

5. Check the gcc version

6. PyAV

7. ffmpeg with PyAV

8. PyYaml, tqdm

9. iopath

10. psutil

11. opencv

12. tensorboard

13. moviepy

14. PyTorchVideo

15. Detectron2

16. FairScale

17. SlowFast

Run the demo to test the model

Errors encountered during installation

error0

error1

error2

error3

error4

error5

error6

error7


1. Create the conda env

conda create -n py39 python=3.9

2. install pytorch 

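First check the CUDA version, then pick the matching PyTorch version.

Check the highest CUDA version supported by the system's NVIDIA driver (for example with nvidia-smi).

Check the CUDA version currently installed (for example with nvcc --version).

Install the pytorch and torchvision builds that match that CUDA version: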

source activate py39
conda install pytorch torchvision cudatoolkit=11.3 -c pytorch
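The checks above can be sketched as a short script, assuming `nvidia-smi` and `nvcc` are on the PATH when the driver and toolkit are installed. The helper names `tool_output` and `torch_summary` are hypothetical; both return None instead of failing when a tool or PyTorch is missing.

```python
import importlib.util
import shutil
import subprocess


def tool_output(cmd):
    """Return the output of a command, or None if the tool is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.SubprocessError):
        return None
    return (result.stdout or result.stderr).strip()


def torch_summary():
    """Report the PyTorch build, or None when torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(tool_output(["nvidia-smi"]))         # driver and highest supported CUDA version
    print(tool_output(["nvcc", "--version"]))  # CUDA toolkit currently installed
    print(torch_summary())                     # what the installed PyTorch was built for
```

If `cuda_available` is False after the installation, the cudatoolkit version chosen for conda probably does not match what the driver supports.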

3. install fvcore

pip install git+https://github.com/facebookresearch/fvcore

4. install simplejson

pip install simplejson 

5. Check the gcc version

gcc -v



The version here is 7.5.0.
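For a scripted check, the sketch below parses the version out of text like the output of `gcc -v` (which gcc writes to stderr). The minimum of 4.9 is an assumption based on the SlowFast install notes, and `parse_gcc_version` and `gcc_is_new_enough` are hypothetical helper names.

```python
import re

# Assumed minimum, based on the SlowFast install notes (GCC >= 4.9).
MIN_GCC = (4, 9)


def parse_gcc_version(text):
    """Extract (major, minor, patch) from text such as 'gcc version 7.5.0 (...)'."""
    match = re.search(r"gcc version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no gcc version found in: %r" % text)
    return tuple(int(part) for part in match.groups())


def gcc_is_new_enough(text, minimum=MIN_GCC):
    """Compare the leading version components against the minimum."""
    return parse_gcc_version(text)[: len(minimum)] >= minimum


if __name__ == "__main__":
    sample = "gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)"
    print(parse_gcc_version(sample), gcc_is_new_enough(sample))
```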

6. PyAV

conda install av -c conda-forge

7.ffmpeg with PyAV

pip install av

8. PyYaml, tqdm

Both packages are installed as dependencies of fvcore. Confirm that they are present:

pip show pyyaml tqdm

9.iopath

pip install -U iopath

10. psutil

pip install psutil

11. opencv

pip install opencv-python

12. tensorboard

Check whether tensorboard is already installed:

conda list tensorboard

It was not installed here, so install it with pip:

pip install tensorboard

13. moviepy

pip install moviepy

14. PyTorchVideo

pip install pytorchvideo

15. Detectron2

git clone https://github.com/facebookresearch/detectron2 detectron2_repo

pip install -e detectron2_repo

16. FairScale

pip install git+https://github.com/facebookresearch/fairscale

17. SlowFast

git clone https://github.com/facebookresearch/SlowFast.git


cd SlowFast
python setup.py build develop
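Once all steps are done, a quick way to spot a missing package is to look up each module without importing it. The sketch below assumes the usual import names for the packages in this article (for example `cv2` for opencv-python and `yaml` for PyYAML); `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Import names for the packages installed in steps 2 to 17. Some differ from
# the pip package name, e.g. cv2 for opencv-python and yaml for PyYAML.
REQUIRED_MODULES = [
    "torch", "torchvision", "fvcore", "simplejson", "av", "yaml", "tqdm",
    "iopath", "psutil", "cv2", "tensorboard", "moviepy", "pytorchvideo",
    "detectron2", "fairscale", "slowfast",
]


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_modules(REQUIRED_MODULES))
```

An empty list only shows that the modules can be found; import-time problems such as error1 and error3 below still surface when the demo runs.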

Run the demo to test the model

python3 tools/run_net.py --cfg demo/AVA/SLOWFAST_32x2_R101_50_50.yaml

Errors encountered during installation

error0 

not find PIL 

Fix: in setup.py, change PIL to Pillow.
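The reason is that PIL is only the import name; the package published on PyPI is Pillow. A minimal sketch of the edit, assuming the requirement appears in setup.py as a quoted string, is shown below. The helper name `replace_pil_requirement` and the sample line are illustrative, not copied from the repository.

```python
import re


def replace_pil_requirement(setup_source):
    """Rewrite a quoted "PIL" requirement to "Pillow", leaving other text alone."""
    return re.sub(r"""(["'])PIL\1""", r"\1Pillow\1", setup_source)


if __name__ == "__main__":
    # Hypothetical excerpt of an install_requires list.
    sample = 'install_requires=["simplejson", "PIL", "psutil"]'
    print(replace_pil_requirement(sample))
```

Only an exact quoted `PIL` is replaced, so lines such as `from PIL import Image` stay untouched.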

error1

from pytorchvideo.layers.distributed import ( # noqa
ImportError: cannot import name 'cat_all_gather' from 'pytorchvideo.layers.distributed' (/home/cxgk/anaconda3/envs/sf/lib/python3.9/site-packages/pytorchvideo/layers/distributed.py)

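Fixes:

Option 1: copy the contents of the pytorchvideo directory in the facebookresearch/pytorchvideo GitHub repository (see Reference) into the corresponding directory of the virtual environment, here: /home/cxgk/anaconda3/envs/sf/lib/python3.9/site-packages/pytorchvideo/

Option 2:
Add the following to layers/distributed.py: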

# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
"""Distributed helpers."""

import torch
import torch.distributed as dist
from torch._C._distributed_c10d import ProcessGroup
from torch.autograd.function import Function

_LOCAL_PROCESS_GROUP = None


def get_world_size() -> int:
    """
    Simple wrapper for correctly getting worldsize in both distributed
    / non-distributed settings
    """
    return (
        torch.distributed.get_world_size()
        if torch.distributed.is_available() and torch.distributed.is_initialized()
        else 1
    )


def cat_all_gather(tensors, local=False):
    """Performs the concatenated all_reduce operation on the provided tensors."""
    if local:
        gather_sz = get_local_size()
    else:
        gather_sz = torch.distributed.get_world_size()
    tensors_gather = [torch.ones_like(tensors) for _ in range(gather_sz)]
    torch.distributed.all_gather(
        tensors_gather,
        tensors,
        async_op=False,
        group=_LOCAL_PROCESS_GROUP if local else None,
    )
    output = torch.cat(tensors_gather, dim=0)
    return output


def init_distributed_training(cfg):
    """
    Initialize variables needed for distributed training.
    """
    if cfg.NUM_GPUS <= 1:
        return
    num_gpus_per_machine = cfg.NUM_GPUS
    num_machines = dist.get_world_size() // num_gpus_per_machine
    for i in range(num_machines):
        ranks_on_i = list(
            range(i * num_gpus_per_machine, (i + 1) * num_gpus_per_machine)
        )
        pg = dist.new_group(ranks_on_i)
        if i == cfg.SHARD_ID:
            global _LOCAL_PROCESS_GROUP
            _LOCAL_PROCESS_GROUP = pg


def get_local_size() -> int:
    """
    Returns:
        The size of the per-machine process group,
        i.e. the number of processes per machine.
    """
    if not dist.is_available():
        return 1
    if not dist.is_initialized():
        return 1
    return dist.get_world_size(group=_LOCAL_PROCESS_GROUP)


def get_local_rank() -> int:
    """
    Returns:
        The rank of the current process within the local (per-machine) process group.
    """
    if not dist.is_available():
        return 0
    if not dist.is_initialized():
        return 0
    assert _LOCAL_PROCESS_GROUP is not None
    return dist.get_rank(group=_LOCAL_PROCESS_GROUP)


def get_local_process_group() -> ProcessGroup:
    assert _LOCAL_PROCESS_GROUP is not None
    return _LOCAL_PROCESS_GROUP


class GroupGather(Function):
    """
    GroupGather performs all gather on each of the local process/ GPU groups.
    """

    @staticmethod
    def forward(ctx, input, num_sync_devices, num_groups):
        """
        Perform forwarding, gathering the stats across different process/ GPU
        group.
        """
        ctx.num_sync_devices = num_sync_devices
        ctx.num_groups = num_groups

        input_list = [torch.zeros_like(input) for k in range(get_local_size())]
        dist.all_gather(
            input_list, input, async_op=False, group=get_local_process_group()
        )

        inputs = torch.stack(input_list, dim=0)
        if num_groups > 1:
            rank = get_local_rank()
            group_idx = rank // num_sync_devices
            inputs = inputs[
                group_idx * num_sync_devices : (group_idx + 1) * num_sync_devices
            ]
        inputs = torch.sum(inputs, dim=0)
        return inputs

    @staticmethod
    def backward(ctx, grad_output):
        """
        Perform backwarding, gathering the gradients across different process/ GPU
        group.
        """
        grad_output_list = [
            torch.zeros_like(grad_output) for k in range(get_local_size())
        ]
        dist.all_gather(
            grad_output_list,
            grad_output,
            async_op=False,
            group=get_local_process_group(),
        )

        grads = torch.stack(grad_output_list, dim=0)
        if ctx.num_groups > 1:
            rank = get_local_rank()
            group_idx = rank // ctx.num_sync_devices
            grads = grads[
                group_idx
                * ctx.num_sync_devices : (group_idx + 1)
                * ctx.num_sync_devices
            ]
        grads = torch.sum(grads, dim=0)
        return grads, None, None
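Before and after patching, it can help to confirm which helpers the installed distributed.py actually defines. The sketch below inspects the source text with `ast`, so it works even when the import itself fails. `defined_names` and `missing_names` are hypothetical helpers, and the list of required names is taken from the functions in the listing for Option 2.

```python
import ast


def defined_names(source):
    """Return the top-level function and class names defined in Python source."""
    tree = ast.parse(source)
    return {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }


def missing_names(source, required):
    """Return the required names that the source does not define, sorted."""
    return sorted(set(required) - defined_names(source))


if __name__ == "__main__":
    required = [
        "cat_all_gather", "get_local_process_group", "get_local_rank",
        "get_local_size", "get_world_size", "init_distributed_training",
    ]
    # Stand-in for the text of an old distributed.py that lacks the helpers.
    old_source = "def get_world_size():\n    return 1\n"
    print(missing_names(old_source, required))
```

In practice the source string would come from reading the distributed.py file inside the environment's site-packages/pytorchvideo/layers directory.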

error2

from scipy.ndimage import gaussian_filter

ModuleNotFoundError: No module named 'scipy'

Fix:

pip install scipy

error3

from av._core import time_base, library_versions

ImportError: /home/cxgk/anaconda3/envs/sf/lib/python3.9/site-packages/av/../../.././libgnutls.so.30: symbol mpn_copyi version HOGWEED_6 not defined in file libhogweed.so.6 with link time reference
 

Fix:

First remove the existing av package (the conda-forge build from step 6), then install it with pip:


pip install av


error4

File "/media/cxgk/Linux/work/SlowFast/slowfast/models/losses.py", line 11, in <module>
from pytorchvideo.losses.soft_target_cross_entropy import (
ModuleNotFoundError: No module named 'pytorchvideo.losses'

Fix:

Open the directory "/home/cxgk/anaconda3/envs/sf/lib/python3.9/site-packages/pytorchvideo/losses" (create it if it is missing), create a new file named soft_target_cross_entropy.py in it, and add the following code:

# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.

import torch
import torch.nn as nn
import torch.nn.functional as F
from pytorchvideo.layers.utils import set_attributes
from pytorchvideo.transforms.functional import convert_to_one_hot


class SoftTargetCrossEntropyLoss(nn.Module):
    """
    Adapted from Classy Vision: ./classy_vision/losses/soft_target_cross_entropy_loss.py.
    This allows the targets for the cross entropy loss to be multi-label.
    """

    def __init__(
        self,
        ignore_index: int = -100,
        reduction: str = "mean",
        normalize_targets: bool = True,
    ) -> None:
        """
        Args:
            ignore_index (int): sample should be ignored for loss if the class is this value.
            reduction (str): specifies reduction to apply to the output.
            normalize_targets (bool): whether the targets should be normalized to a sum of 1
                based on the total count of positive targets for a given sample.
        """
        super().__init__()
        set_attributes(self, locals())
        assert isinstance(self.normalize_targets, bool)
        if self.reduction not in ["mean", "none"]:
            raise NotImplementedError(
                'reduction type "{}" not implemented'.format(self.reduction)
            )
        self.eps = torch.finfo(torch.float32).eps

    def forward(self, input: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        """
        Args:
            input (torch.Tensor): the shape of the tensor is N x C, where N is the number of
                samples and C is the number of classes. The tensor is raw input without
                softmax/sigmoid.
            target (torch.Tensor): the shape of the tensor is N x C or N. If the shape is N, we
                will convert the target to one hot vectors.
        """
        # Check if targets are inputted as class integers
        if target.ndim == 1:
            assert (
                input.shape[0] == target.shape[0]
            ), "SoftTargetCrossEntropyLoss requires input and target to have same batch size!"
            target = convert_to_one_hot(target.view(-1, 1), input.shape[1])

        assert input.shape == target.shape, (
            "SoftTargetCrossEntropyLoss requires input and target to be same "
            f"shape: {input.shape} != {target.shape}"
        )

        # Samples where the targets are ignore_index do not contribute to the loss
        N, C = target.shape
        valid_mask = torch.ones((N, 1), dtype=torch.float).to(input.device)
        if 0 <= self.ignore_index <= C - 1:
            # The attribute set in __init__ is ignore_index (the pasted code read ignore_idx)
            drop_idx = target[:, self.ignore_index] > 0
            valid_mask[drop_idx] = 0

        valid_targets = target.float() * valid_mask
        if self.normalize_targets:
            valid_targets /= self.eps + valid_targets.sum(dim=1, keepdim=True)
        per_sample_per_target_loss = -valid_targets * F.log_softmax(input, -1)

        per_sample_loss = torch.sum(per_sample_per_target_loss, -1)
        # Perform reduction
        if self.reduction == "mean":
            # Normalize based on the number of samples with > 0 non-ignored targets
            loss = per_sample_loss.sum() / torch.sum(
                (torch.sum(valid_mask, -1) > 0)
            ).clamp(min=1)
        elif self.reduction == "none":
            loss = per_sample_loss

        return loss
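To make the per-sample computation in this loss concrete, here is a dependency-free sketch of the same formula, loss = -sum_c target[c] * log_softmax(input)[c], for a single sample. It is an illustration only, not a replacement for the module: it leaves out ignore_index handling and batching, and the function names are chosen here.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum for x in logits]


def soft_target_cross_entropy(logits, targets, normalize_targets=True):
    """Loss for one sample: -sum_c targets[c] * log_softmax(logits)[c]."""
    eps = 1.1920929e-07  # float32 machine epsilon, as torch.finfo(torch.float32).eps
    if normalize_targets:
        total = eps + sum(targets)
        targets = [t / total for t in targets]
    return -sum(t * lp for t, lp in zip(targets, log_softmax(logits)))


if __name__ == "__main__":
    # One-hot target: reduces to ordinary cross entropy for class 2.
    print(soft_target_cross_entropy([1.0, 2.0, 3.0], [0, 0, 1]))
    # Two positive labels: each gets weight 0.5 after normalization.
    print(soft_target_cross_entropy([0.0, 0.0], [1, 1]))
```

With a one-hot target the result matches standard cross entropy, which is why the module can accept either class indices or multi-label targets.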

error5

from sklearn.metrics import confusion_matrix

ModuleNotFoundError: No module named 'sklearn'

Fix:

pip install scikit-learn

error6

raise KeyError("Non-existent config key: {}".format(full_key))

KeyError: 'Non-existent config key: TENSORBOARD.MODEL_VIS.TOPK'

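Fix:

In the YAML config passed with --cfg, comment out the following three lines (the nested keys TENSORBOARD, MODEL_VIS, and TOPK):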

TENSORBOARD

MODEL_VIS

TOPK
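Commenting out those lines has the same effect as removing the nested key from the loaded configuration. The sketch below shows that on a plain dict, such as the result of parsing the YAML file with PyYAML's `yaml.safe_load`. `drop_key` is a hypothetical helper, and the sample values are made up for illustration.

```python
def drop_key(cfg, dotted_key):
    """Remove a nested key such as 'A.B.C' from a dict, pruning parents left empty."""
    head, _, rest = dotted_key.partition(".")
    if head not in cfg:
        return cfg
    if rest and isinstance(cfg[head], dict):
        drop_key(cfg[head], rest)
        if not cfg[head]:
            del cfg[head]
    elif not rest:
        del cfg[head]
    return cfg


if __name__ == "__main__":
    # Hypothetical excerpt of a parsed config; the values are placeholders.
    cfg = {"TENSORBOARD": {"MODEL_VIS": {"TOPK": 2}}, "DEMO": {"ENABLE": True}}
    print(drop_key(cfg, "TENSORBOARD.MODEL_VIS.TOPK"))
```

Parent sections are pruned only when they end up empty, so other TENSORBOARD settings would be kept.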

error7

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 3.94 GiB total capacity; 2.83 GiB already allocated; 25.44 MiB free; 2.84 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Fix:

Reduce the number of frames in the YAML config:

DATA:
  NUM_FRAMES: 16
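A rough way to see why this helps: the size of one input clip grows linearly with the number of frames. The sketch below computes the size of a float32 clip tensor of shape C x T x H x W. The 256 x 256 crop and the 32-frame baseline are assumptions for illustration, and intermediate activations, which usually dominate GPU memory and also grow with the temporal length, are not counted.

```python
def clip_bytes(num_frames, height, width, channels=3, bytes_per_value=4):
    """Size in bytes of one float32 clip tensor of shape C x T x H x W."""
    return channels * num_frames * height * width * bytes_per_value


if __name__ == "__main__":
    for frames in (32, 16):
        size_mib = clip_bytes(frames, 256, 256) / 2**20
        print(f"{frames} frames: {size_mib:.1f} MiB per clip")
```

Halving NUM_FRAMES halves this figure, which is the lever the fix relies on for a GPU with about 4 GiB of memory.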

Reference:

https://github.com/facebookresearch/pytorchvideo/blob/main/pytorchvideo
