A full record of setting up the environment for GCNv2_SLAM and running the project on an AutoDL cloud GPU.

This article was first published at ❄慕雪的寒舍.

1. Introduction

A few days ago I wrote a post about running the GCNv2_SLAM project on the CPU of a local virtual machine (link). See that article for an introduction to GCNv2_SLAM itself; it is not repeated here.
GCNv2: Efficient Correspondence Prediction for Real-Time SLAM (github.com/jiexiong2016/GCNv2_SLAM)
In that earlier test, CPU inference in the local VM was painfully slow, managing only about 0.5 Hz. Since I have no GPU machine on hand, I decided to rent a containerized GPU environment online.

AutoDL (https://www.autodl.com/) is one such GPU rental platform. GPUs there can be billed by the hour, which works out cheaper than monthly plans and is far easier than buying a card and wrestling with a local Ubuntu setup, so I went ahead with it.

Register an AutoDL account, top it up with a little money, and you can rent a GPU container to run GCNv2_SLAM.

2. Choosing an AutoDL environment
Images with older PyTorch versions cannot be selected on a 4090, because the 4090 does not support such old CUDA versions. If you need an older PyTorch image, rent a 2080 Ti or 1080 Ti instance instead.

For a 2080 Ti, the following environment is confirmed to work:

PyTorch 1.5.1
Python 3.8 (ubuntu18.04)
Cuda 10.1

After creating the instance, I recommend copying the SSH login command shown on the left of the console and running it in a local terminal to log in to the cloud machine. If you have no local SSH terminal, you can also use the terminal inside JupyterLab.

Later steps download a lot of files. If GitHub is slow from the instance, download locally and upload through JupyterLab; note that you must select the target directory in the file list before uploading. You can also try AutoDL's built-in proxy (www.autodl.com/docs/network_turbo/), though when I tried it the proxy kept returning 503.
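The note above about the 4090 rejecting older images comes down to compute capability: each CUDA toolkit release can only generate code for GPU architectures it knows about. A minimal sketch of that check in Python (the version table is an assumption drawn from NVIDIA's release notes, not from this article):

```python
# Minimum CUDA toolkit version able to target each GPU architecture.
# The values are assumptions taken from NVIDIA release notes.
MIN_CUDA = {
    "sm_61": (8, 0),    # Pascal, e.g. 1080 Ti
    "sm_75": (10, 0),   # Turing, e.g. 2080 Ti
    "sm_89": (11, 8),   # Ada, e.g. 4090
}

def cuda_supports(arch, cuda_version):
    """Return True if the given CUDA toolkit version can target arch."""
    return cuda_version >= MIN_CUDA[arch]

# A CUDA 10.1 image works for a 2080 Ti but has no code path for a 4090:
print(cuda_supports("sm_75", (10, 1)))  # True
print(cuda_supports("sm_89", (10, 1)))  # False
```

This is why AutoDL hides the old CUDA 10.x PyTorch images on 4090 instances.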
3. Installing dependencies

3.1. Required apt packages

Before anything else, update the system; this part is the same as setting up a local virtual machine.
sudo apt-get update -y
sudo apt-get upgrade -y

During the upgrade you will be prompted about a new sshd configuration; just choose 1 to use the package maintainer's version:
A new version (/tmp/file1bBLK4) of configuration file /etc/ssh/sshd_config is available, but the version installed currently has been locally modified.

  1. install the package maintainer's version
  2. keep the local version currently installed
  3. show the differences between the versions
  4. show a side-by-side difference between the versions
  5. show a 3-way difference between available versions
  6. do a 3-way merge between available versions
  7. start a new shell to examine the situation

What do you want to do about modified configuration file sshd_config? 1

Because we picked a PyTorch image, the Python toolchain is already present and does not need to be installed.
Install the general-purpose tools:

# tools
sudo apt-get install -y \
    apt-utils \
    curl wget unzip zip \
    cmake make automake \
    openssh-server \
    net-tools \
    vim git gcc g++

Then install the X11-related dependencies:
# x11 for gui
sudo apt-get install -y \
    libx11-xcb1 \
    libfreetype6 \
    libdbus-1-3 \
    libfontconfig1 \
    libxkbcommon0 \
    libxkbcommon-x11-0

Note: two of the X11 packages installed here are too new, and will later cause dependency-conflict errors when installing the dependencies of Pangolin and friends, so downgrade them:

apt-get install -y \
    libx11-xcb1=2:1.6.4-3ubuntu0.4 \
    libx11-6=2:1.6.4-3ubuntu0.4

3.2. Pangolin-0.6
Before building Pangolin, install the following dependencies:

# pangolin
sudo apt-get install -y \
    libgl1-mesa-dev \
    libglew-dev \
    libboost-dev \
    libboost-thread-dev \
    libboost-filesystem-dev \
    libpython2.7-dev \
    libglu1-mesa-dev freeglut3-dev

If you skip the downgrade above, installing these packages fails with terminal output like this:
root@autodl-container-e39d46b8d3-01da7b14:~# apt-get install -y libgl1-mesa-dev libglew-dev libboost-dev libboost-thread-dev libboost-filesystem-dev libpython2.7-dev libglu1-mesa-dev freeglut3-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 freeglut3-dev : Depends: libxext-dev but it is not going to be installed
                 Depends: libxt-dev but it is not going to be installed
 libgl1-mesa-dev : Depends: mesa-common-dev (= 20.0.8-0ubuntu1~18.04.1) but it is not going to be installed
                   Depends: libx11-dev but it is not going to be installed
                   Depends: libx11-xcb-dev but it is not going to be installed
                   Depends: libxdamage-dev but it is not going to be installed
                   Depends: libxext-dev but it is not going to be installed
                   Depends: libxfixes-dev but it is not going to be installed
                   Depends: libxxf86vm-dev but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

Then build and install Pangolin with the commands below (GitHub release: Pangolin-0.6).
I suggest downloading and building all of these dependencies inside ~/autodl-tmp (the data disk), so they survive even if you later switch the instance image.
# download
wget -O Pangolin-0.6.tar.gz https://github.com/stevenlovegrove/Pangolin/archive/refs/tags/v0.6.tar.gz
# unpack
tar -zxvf Pangolin-0.6.tar.gz
pushd Pangolin-0.6
rm -rf build
mkdir build && cd build
# build and install
cmake -DCPP11_NO_BOOST=1 ..
make -j$(nproc)
make install
# refresh the dynamic linker cache
ldconfig
popd

The build and install should complete successfully.

3.3. OpenCV 3.4.5
First install the dependencies:

sudo apt-get install -y \
    build-essential libgtk2.0-dev \
    libavcodec-dev libavformat-dev \
    libjpeg-dev libtiff5-dev libswscale-dev \
    libcanberra-gtk-module

Since the AutoDL environment is amd64, the commands below work as-is, with no extra handling needed:

# amd64
# add the extra source, then install
sudo apt-get install -y software-properties-common
# note: this next command does not work on arm64
sudo add-apt-repository "deb http://security.ubuntu.com/ubuntu xenial-security main"
sudo apt-get -y update
sudo apt-get install -y libjasper1 libjasper-dev

With the dependencies in place, build OpenCV 3.4.5 from the GitHub release with the commands below.
# download and unpack
wget -O opencv-3.4.5.tar.gz https://github.com/opencv/opencv/archive/refs/tags/3.4.5.tar.gz
tar -zxvf opencv-3.4.5.tar.gz
# build and install
pushd opencv-3.4.5
rm -rf build
mkdir build && cd build
# configure and build; -j$(nproc) runs one job per CPU core
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local ..
make -j$(nproc)
make install
# refresh the dynamic linker cache
ldconfig
popd

The build and install complete without problems.

3.4. Eigen 3.3.7
The Eigen package is downloaded from GitLab: gitlab.com/libeigen/eigen/-/releases/3.3.7

# download
wget -O eigen-3.3.7.tar.gz https://gitlab.com/libeigen/eigen/-/archive/3.3.7/eigen-3.3.7.tar.gz
tar -zxvf eigen-3.3.7.tar.gz
# build and install
cd eigen-3.3.7
mkdir build && cd build
cmake ..
make && make install
# copy the headers so that includes resolve without the eigen3/ prefix
sudo cp -r /usr/local/include/eigen3/Eigen /usr/local/include

As before, use the same C++ demo to verify the installation; compiling it directly with g++ is enough:
#include <iostream>
// requires the headers copied from /usr/local/include/eigen3/ to /usr/local/include
#include <Eigen/Dense>
//using Eigen::MatrixXd;
using namespace Eigen;
using namespace Eigen::internal;
using namespace Eigen::Architecture;
using namespace std;

int main()
{
    cout << "*******************1D-object****************" << endl;
    Vector4d v1;
    v1 << 1, 2, 3, 4;
    cout << "v1\n" << v1 << endl;
    VectorXd v2(3);
    v2 << 1, 2, 3;
    cout << "v2\n" << v2 << endl;
    Array4i v3;
    v3 << 1, 2, 3, 4;
    cout << "v3\n" << v3 << endl;
    ArrayXf v4(3);
    v4 << 1, 2, 3;
    cout << "v4\n" << v4 << endl;
}

It compiles and runs normally:
root@autodl-container-e39d46b8d3-01da7b14:~/pkg/eigen-3.3.7/build# g++ test.cpp -o t
root@autodl-container-e39d46b8d3-01da7b14:~/pkg/eigen-3.3.7/build# ./t
*******************1D-object****************
v1
1
2
3
4
v2
1
2
3
v3
1
2
3
4
v4
1
2
3

3.5. Libtorch 1.5.0
3.5.1. Why we skip building from source

The AutoDL image we chose already ships with PyTorch, so there is no need to build it from source ourselves.

I did try building PyTorch 1.1.0 from source, but the build got killed partway through; I am not sure why, my guess being that the build used too much memory or CPU. It died at around 74%, with no error printed before or after.

3.5.2. The bundled Torch cannot be used
The AutoDL image actually bundles a usable Torch directory at:

/root/miniconda3/lib/python3.8/site-packages/torch/share/cmake/Torch

However, the prebuilt libtorch it references was built without the C++11 ABI, which ultimately makes Pangolin fail to link, with errors like those below.

This link failure has nothing to do with the Pangolin version; both Pangolin 0.5 and 0.6 fail the same way.
[100%] Linking CXX executable ../GCN2/rgbd_gcn
../lib/libORB_SLAM2.so: undefined reference to pangolin::Split(std::string const&, char)
../lib/libORB_SLAM2.so: undefined reference to pangolin::CreatePanel(std::string const&)
../lib/libORB_SLAM2.so: undefined reference to DBoW2::FORB::fromString(cv::Mat&, std::string const&)
../lib/libORB_SLAM2.so: undefined reference to pangolin::BindToContext(std::string)
../lib/libORB_SLAM2.so: undefined reference to DBoW2::FORB::toString(cv::Mat const&)
../lib/libORB_SLAM2.so: undefined reference to pangolin::CreateWindowAndBind(std::string, int, int, pangolin::Params const&)
collect2: error: ld returned 1 exit status
CMakeFiles/rgbd_gcn.dir/build.make:152: recipe for target '../GCN2/rgbd_gcn' failed
make[2]: *** [../GCN2/rgbd_gcn] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/rgbd_gcn.dir/all' failed
make[1]: *** [CMakeFiles/rgbd_gcn.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

The GCNv2 GitHub README mentions this problem; in short, do not use a prebuilt libtorch, because the CXX11 ABI causes linking errors. Since PyTorch 1.3.0, however, the project also publishes prebuilt packages built with the CXX11 ABI, so a downloaded prebuilt package now works, while the libtorch bundled inside the container still has the linking problem.
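One way to tell which ABI a given libtorch build uses is to look at its mangled symbols: with GCC's C++11 ABI, std::string goes through the std::__cxx11 inline namespace, so mangled names contain the substring __cxx11. A small heuristic sketch (the symbol strings are illustrative; in practice you would feed it output from `nm -D libtorch_cpu.so`):

```python
def uses_cxx11_abi(symbols):
    """Heuristic ABI check: mangled names mentioning std::__cxx11
    indicate a library built with _GLIBCXX_USE_CXX11_ABI=1."""
    return any("__cxx11" in sym for sym in symbols)

# Old-ABI std::string mangles as 'Ss'; the new ABI goes through __cxx11.
old_abi = ["_ZN8pangolin5SplitERKSsc"]
new_abi = ["_ZN8pangolin5SplitERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEc"]
print(uses_cxx11_abi(old_abi))  # False
print(uses_cxx11_abi(new_abi))  # True
```

PyTorch itself also exposes torch.compiled_with_cxx11_abi() to report the same thing for the installed build.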
3.5.3. Downloading a prebuilt libtorch

I originally picked the PyTorch 1.1.0 image, but since building from source was not possible I switched to the PyTorch 1.5.1 image. Only from PyTorch 1.3.0 onward are CXX11-ABI-compatible prebuilt packages available; earlier versions must be built by hand, or you get the linking errors above.

What we need is a prebuilt libtorch with the CXX11 ABI from the official site; only download URLs containing cxx11-abi have it. The 1.5.0 libtorch package is at the URL below, where cu101 means CUDA 10.1 and the trailing version is 1.5.0 (a 1.5.1 libtorch package is not available):

https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.5.0.zip

Unzip it to get a libtorch folder; the TORCH_PATH needed later is its libtorch/share/cmake/Torch directory:

root@autodl-container-e39d46b8d3-01da7b14:~/autodl-tmp# ls libtorch/share/cmake/Torch
TorchConfig.cmake  TorchConfigVersion.cmake

Prebuilt libtorch packages are large, so download them locally in advance and upload to AutoDL; downloading inside AutoDL takes too long, and time there is money!

4. Building GCNv2_SLAM
Now for the main event. Clone the code:

git clone https://github.com/jiexiong2016/GCNv2_SLAM.git

Because this run is on AutoDL with a GPU, the PyTorch version is completely different from the one in my previous post, so the code changes needed are different too. The post GCNv2_SLAM-CPU详细安装教程(ubuntu18.04)-CSDN博客 may also be a useful reference.
4.1. Modifying build.sh

For the prebuilt package, TORCH_PATH is the libtorch/share/cmake/Torch directory inside the unzipped libtorch folder. Point the path in the build.sh script at it:

-DTORCH_PATH=/root/autodl-tmp/libtorch/share/cmake/Torch

With that change you can start building, and fix the remaining problems as the errors appear.
4.2. Code changes for newer libtorch

All of these changes can be found in my GitHub fork: github.com/musnows/GCNv2_SLAM/tree/pytorch1.5.0

4.2.1. Building as C++14

The first build attempt fails as follows; newer torch requires C++14, since it uses C++14 features:

/root/autodl-tmp/libtorch/include/c10/util/C++17.h:27:2: error: #error You need C++14 to compile PyTorch
   27 | #error You need C++14 to compile PyTorch
      | ^~~~~

So we edit GCNv2_SLAM/CMakeLists.txt and add the following:
# insert at the top
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
# at the bottom, change the 11 to 14
# set_property(TARGET rgbd_gcn PROPERTY CXX_STANDARD 11)
set_property(TARGET rgbd_gcn PROPERTY CXX_STANDARD 14)

We also need to comment out the C++11 detection block in the same CMakeLists.txt, i.e. the following:
#Check C++11 or C++0x support
#include(CheckCXXCompilerFlag)
#CHECK_CXX_COMPILER_FLAG("-std=c++11" COMPILER_SUPPORTS_CXX11)
#CHECK_CXX_COMPILER_FLAG("-std=c++0x" COMPILER_SUPPORTS_CXX0X)
#if(COMPILER_SUPPORTS_CXX11)
#   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11")
add_definitions(-DCOMPILEDWITHC11)
#   message(STATUS "Using flag -std=c++11.")
#elseif(COMPILER_SUPPORTS_CXX0X)
#   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++0x")
#   add_definitions(-DCOMPILEDWITHC0X)
#   message(STATUS "Using flag -std=c++0x.")
#else()
#   message(FATAL_ERROR "The compiler ${CMAKE_CXX_COMPILER} has no C++11 support. Please use a different C++ compiler.")
#endif()

Do not comment out add_definitions(-DCOMPILEDWITHC11); it is still needed.
After changing the CMake files, delete the GCNv2_SLAM/build directory and re-run the build.sh script, otherwise the changes may not take effect.
4.2.2. No matching operator=

The error:

/root/autodl-tmp/GCNv2_SLAM/src/GCNextractor.cc: In constructor ‘ORB_SLAM2::GCNextractor::GCNextractor(int, float, int, int, int)’:
/root/autodl-tmp/GCNv2_SLAM/src/GCNextractor.cc:218:37: error: no match for ‘operator=’ (operand types are ‘std::shared_ptr<torch::jit::Module>’ and ‘torch::jit::Module’)
     module = torch::jit::load(net_fn);
                                     ^
In file included from /usr/include/c++/7/memory:81:0,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/c10/core/Allocator.h:4,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/ATen/ATen.h:3,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
                 from /root/miniconda3/lib/python3.8/site-packages/torch/include/torch/script.h:3,
                 from /root/autodl-tmp/GCNv2_SLAM/include/GCNextractor.h:24,
                 from /root/autodl-tmp/GCNv2_SLAM/src/GCNextractor.cc:63:

The issue is that torch::jit::load no longer returns a pointer, so the shared_ptr member must become a plain object.
修改GCNv2_SLAM/include/GCNextractor.h文件的99行
//原代码
std::shared_ptrtorch::jit::script::Module module;
//更改为
torch::jit::script::Module module;还需要对应修改GCNv2_SLAM/src/GCNextractor.cc的270行
//原代码
auto output module-forward(inputs).toTuple();
//更改为
auto output module.forward(inputs).toTuple();4.2.3. 标准库chrono编译问题
If your CMake change is wrong, you may also hit chrono-related build errors:

/root/autodl-tmp/GCNv2_SLAM/GCN2/rgbd_gcn.cc: In function ‘int main(int, char**)’:
/root/autodl-tmp/GCNv2_SLAM/GCN2/rgbd_gcn.cc:97:22: error: ‘std::chrono::monotonic_clock’ has not been declared
     std::chrono::monotonic_clock::time_point t1 = std::chrono::monotonic_clock::now();
                      ^~~~~~~~~~~~~~~
/root/autodl-tmp/GCNv2_SLAM/GCN2/rgbd_gcn.cc:106:22: error: ‘std::chrono::monotonic_clock’ has not been declared
     std::chrono::monotonic_clock::time_point t2 = std::chrono::monotonic_clock::now();
                      ^~~~~~~~~~~~~~~
/root/autodl-tmp/GCNv2_SLAM/GCN2/rgbd_gcn.cc:109:84: error: ‘t2’ was not declared in this scope
     double ttrack = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count();
                                                                               ^~
/root/autodl-tmp/GCNv2_SLAM/GCN2/rgbd_gcn.cc:109:84: note: suggested alternative: ‘tm’
/root/autodl-tmp/GCNv2_SLAM/GCN2/rgbd_gcn.cc:109:89: error: ‘t1’ was not declared in this scope
     double ttrack = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count();
                                                                                    ^~
/root/autodl-tmp/GCNv2_SLAM/GCN2/rgbd_gcn.cc:109:89: note: suggested alternative: ‘tm’
CMakeFiles/rgbd_gcn.dir/build.make:62: recipe for target 'CMakeFiles/rgbd_gcn.dir/GCN2/rgbd_gcn.cc.o' failed
make[2]: *** [CMakeFiles/rgbd_gcn.dir/GCN2/rgbd_gcn.cc.o] Interrupt
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/rgbd_gcn.dir/all' failed
make[1]: *** [CMakeFiles/rgbd_gcn.dir/all] Interrupt
Makefile:83: recipe for target 'all' failed
make: *** [all] Interrupt

The gist is that std::chrono::monotonic_clock does not exist; it was a pre-standard class that C++11 removed. Looking at the GCN2/rgbd_gcn.cc code, a macro selects between the two clocks:
// GCNv2_SLAM/GCN2/rgbd_gcn.cc
#ifdef COMPILEDWITHC11
        std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
#else
        std::chrono::monotonic_clock::time_point t1 = std::chrono::monotonic_clock::now();
#endif

This is why add_definitions(-DCOMPILEDWITHC11) must stay in GCNv2_SLAM/CMakeLists.txt: with the macro defined, this code compiles the std::chrono::steady_clock branch and there is no error.
4.2.4. Patching the PT files

The 3 pt files still need to be modified; note that the change here differs from the CPU version.

Modify the contents of gcn2_320x240.pt, gcn2_640x480.pt and gcn2_tiny_320x240.pt under GCNv2_SLAM/GCN2. First unpack the file:

unzip gcn2_320x240.pt

Unpacking yields GCNv2_SLAM/GCN2/gcn/code/gcn.py. Before PyTorch 1.3.0, the grid_sampler function here defaulted align_corners to True; 1.3.0 changed the default to False, so True must now be passed explicitly:

# original
_32 = torch.squeeze(torch.grid_sampler(input, grid, 0, 0))
# change to
_32 = torch.squeeze(torch.grid_sampler(input, grid, 0, 0, True))

After the edit, delete the original pt file and re-zip the folder:

rm -rf gcn2_320x240.pt
zip -r gcn2_320x240.pt gcn
rm -rf gcn  # remove the unpacked gcn folder

That was one example; the other gcn2 archives must be modified the same way.
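Alternatively, the unzip / edit / re-zip cycle can be scripted for all three archives at once. A sketch using Python's zipfile module (the archive names and the exact source string are taken from this repo; treat the script as an untested convenience):

```python
import os
import zipfile

OLD = b"torch.grid_sampler(input, grid, 0, 0)"
NEW = b"torch.grid_sampler(input, grid, 0, 0, True)"

def patch_pt(pt_path):
    """Rewrite every .py inside a TorchScript .pt archive, passing
    align_corners=True to torch.grid_sampler explicitly."""
    tmp_path = pt_path + ".patched"
    with zipfile.ZipFile(pt_path) as zin, \
         zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename.endswith(".py"):
                data = data.replace(OLD, NEW)
            zout.writestr(item, data)   # copy everything else unchanged
    os.replace(tmp_path, pt_path)       # swap the patched archive in place

# usage, from GCNv2_SLAM/GCN2:
#   for pt in ("gcn2_320x240.pt", "gcn2_640x480.pt", "gcn2_tiny_320x240.pt"):
#       patch_pt(pt)
```

The manual equivalent for the other two archives is: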
unzip gcn2_640x480.pt
rm -rf gcn2_640x480.pt
# edit this file:
# GCNv2_SLAM/GCN2/gcn2_480x640/code/gcn2_480x640.py
# re-zip
zip -r gcn2_640x480.pt gcn2_480x640
rm -rf gcn2_480x640

unzip gcn2_tiny_320x240.pt
rm -rf gcn2_tiny_320x240.pt
# edit this file:
# GCNv2_SLAM/GCN2/gcn2_tiny/code/gcn2_tiny.py
# re-zip
zip -r gcn2_tiny_320x240.pt gcn2_tiny
rm -rf gcn2_tiny

4.3. Building the project
With all of the fixes above in place, the build completes successfully.

5. Setting up a VNC environment

5.1. Installing the VNC server

By default AutoDL has no GUI environment, so the project cannot run and fails with X11 errors.

We therefore follow the official docs to set up a GUI: www.autodl.com/docs/gui/
# basic dependencies
apt update && apt install -y libglu1-mesa-dev mesa-utils xterm xauth x11-xkb-utils xfonts-base xkb-data libxtst6 libxv1

# install libjpeg-turbo and turbovnc
export TURBOVNC_VERSION=2.2.5
export LIBJPEG_VERSION=2.0.90
wget http://aivc.ks3-cn-beijing.ksyun.com/packages/libjpeg-turbo/libjpeg-turbo-official_${LIBJPEG_VERSION}_amd64.deb
wget http://aivc.ks3-cn-beijing.ksyun.com/packages/turbovnc/turbovnc_${TURBOVNC_VERSION}_amd64.deb
dpkg -i libjpeg-turbo-official_${LIBJPEG_VERSION}_amd64.deb
dpkg -i turbovnc_${TURBOVNC_VERSION}_amd64.deb
rm -rf *.deb

# start the VNC server; this may ask you to set a VNC password (not the instance password).
# if xauth is reported missing, install it again with: apt install xauth
rm -rf /tmp/.X1*  # when restarting, remove the previous temp files or startup fails
USER=root /opt/TurboVNC/bin/vncserver :1 -desktop X -auth /root/.Xauthority -geometry 1920x1080 -depth 24 -rfbwait 120000 -rfbauth /root/.vnc/passwd -fp /usr/share/fonts/X11/misc/,/usr/share/fonts -rfbport 6006

# check that it started; a vncserver process means it is running
ps -ef | grep vnc | grep -v grep

Starting the VNC server prompts you for a password; for convenience I simply reused the AutoDL instance password. Answer n to the view-only password question.
[root@autodl-container-e39d46b8d3-01da7b14:~/vnc]$ USER=root /opt/TurboVNC/bin/vncserver :1 -desktop X -auth /root/.Xauthority -geometry 1920x1080 -depth 24 -rfbwait 120000 -rfbauth /root/.vnc/passwd -fp /usr/share/fonts/X11/misc/,/usr/share/fonts -rfbport 6006

You will require a password to access your desktops.

Password:
Warning: password truncated to the length of 8.
Verify:
Would you like to enter a view-only password (y/n)? n
xauth:  file /root/.Xauthority does not exist

Desktop 'TurboVNC: autodl-container-e39d46b8d3-01da7b14:1 (root)' started on display autodl-container-e39d46b8d3-01da7b14:1

Creating default startup script /root/.vnc/xstartup.turbovnc
Starting applications specified in /root/.vnc/xstartup.turbovnc
Log file is /root/.vnc/autodl-container-e39d46b8d3-01da7b14:1.log

Once the server is up, the process is visible:
root@autodl-container-e39d46b8d3-01da7b14:~/vnc# ps -ef | grep vnc | grep -v grep
root     28861     1  0 11:22 pts/0    00:00:00 /opt/TurboVNC/bin/Xvnc :1 -desktop TurboVNC: autodl-container-64eb44b6f5-c569ba8d:1 (root) -httpd /opt/TurboVNC/bin//../java -auth /root/.Xauthority -geometr

If the instance is shut down, restarting VNC afterwards only takes these two commands:

rm -rf /tmp/.X1*  # remove the previous temp files or startup fails
USER=root /opt/TurboVNC/bin/vncserver :1 -desktop X -auth /root/.Xauthority -geometry 1920x1080 -depth 24 -rfbwait 120000 -rfbauth /root/.vnc/passwd -fp /usr/share/fonts/X11/misc/,/usr/share/fonts -rfbport 6006

5.2. Local port forwarding
Next, bind the remote port locally over SSH. Copy the SSH login command from the AutoDL console's instance list; it looks like:

ssh -p <port> root@<host>

Then run the following in a local terminal to forward the remote port to local port 6006:

ssh -CNgv -L 6006:127.0.0.1:6006 root@<host> -p <port>

If entered correctly, the command asks for the AutoDL instance password; copy it from the console and paste it with Ctrl+Shift+V (Command+V on macOS).

Keep this terminal open the whole time, otherwise the forwarding stops.
5.3. Connecting with VNC

I used the time-honored VNC Viewer to connect to the cloud; it has clients for every platform, just download and install one.

Then enter 127.0.0.1:6006 in the address bar to connect. If you get "connection closed", most likely the VNC server is not installed correctly or the port forwarding failed; retry the steps above. If all goes well, a password prompt pops up.

This is the password you set when starting the VNC server; enter whatever you chose. A successful connection shows a black screen, which is normal.

5.4. Testing the VNC setup
We can use Pangolin's example program to check that everything is configured:

cd Pangolin-0.6/examples/HelloPangolin
mkdir build && cd build
cmake ..
make

After building, first run export DISPLAY=:1 to enable the GUI, then start the GUI program:

export DISPLAY=:1
./HelloPangolin

Without the export, launching it still fails:
root@autodl-container-e39d46b8d3-01da7b14:~/autodl-tmp/Pangolin-0.6/examples/HelloPangolin/build# ./HelloPangolin
terminate called after throwing an instance of 'std::runtime_error'
  what():  Pangolin X11: Failed to open X display
Aborted (core dumped)

With the environment variable exported it starts normally, and the picture shows up in VNC:

root@autodl-container-e39d46b8d3-01da7b14:~/autodl-tmp/Pangolin-0.6/examples/HelloPangolin/build# export DISPLAY=:1
root@autodl-container-e39d46b8d3-01da7b14:~/autodl-tmp/Pangolin-0.6/examples/HelloPangolin/build# ./HelloPangolin

If the spinning cube appears, VNC is working. You can also build the OpenCV demo to test VNC:
cd opencv-3.4.5/samples/cpp/example_cmake
mkdir build && cd build
cmake ..
make
# export the display variable, then run
export DISPLAY=:1
./opencv_example

If everything works, a "Hello OpenCV" window appears in VNC (black, since there is no camera).

6. Running GCNv2_SLAM on a TUM dataset
Now we can run the project. As before, download a TUM dataset; the commands below are copied over from my earlier post.

6.1. Downloading the dataset

Download page: cvg.cit.tum.de/data/datasets/rgbd-dataset/download

We use the fr1/desk sequence, an RGB-D recording of a desk. Create datasets/TUM under the GCNv2_SLAM tree and download into it:
# create the dataset folder
mkdir -p datasets/TUM
cd datasets/TUM
# download the dataset into datasets/TUM
wget -O rgbd_dataset_freiburg1_desk.tgz https://cvg.cit.tum.de/rgbd/dataset/freiburg1/rgbd_dataset_freiburg1_desk.tgz
# unpack
tar -xvf rgbd_dataset_freiburg1_desk.tgz

We also need the associate.py script to preprocess the dataset before it can be used.

Source: svncvpr.in.tum.de (also archived in my GitHub repo).

wget -O associate.py https://svncvpr.in.tum.de/cvpr-ros-pkg/trunk/rgbd_benchmark/rgbd_benchmark_tools/src/rgbd_benchmark_tools/associate.py

This script only runs under Python 2 and needs numpy. Note that in the AutoDL environment, python is bound to Python 3 and the bundled python2 is blocked, so a standalone Python 2 must be installed.
In the PyTorch 1.5.1 AutoDL image, python2 and pip2 can be installed directly:

apt-get install -y python-dev python-pip

Then install numpy and we are done:

root@autodl-container-e39d46b8d3-01da7b14:~/autodl-tmp/GCNv2_SLAM/datasets/TUM# pip2 install numpy
DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: http://mirrors.aliyun.com/pypi/simple
Collecting numpy
  Downloading http://mirrors.aliyun.com/pypi/packages/3a/5f/47e578b3ae79e2624e205445ab77a1848acdaa2929a00eeef6b16eaaeb20/numpy-1.16.6-cp27-cp27mu-manylinux1_x86_64.whl (17.0 MB)
     |████████████████████████████████| 17.0 MB 21.1 MB/s
Installing collected packages: numpy
Successfully installed numpy-1.16.6
Run the script over the two index files, from inside the dataset folder:

python2 associate.py rgbd_dataset_freiburg1_desk/rgb.txt rgbd_dataset_freiburg1_desk/depth.txt > rgbd_dataset_freiburg1_desk/associate.txt

Afterwards, check that the merge worked; output like the following means everything is fine:

1305031472.895713 rgb/1305031472.895713.png 1305031472.892944 depth/1305031472.892944.png
1305031472.927685 rgb/1305031472.927685.png 1305031472.924814 depth/1305031472.924814.png
1305031472.963756 rgb/1305031472.963756.png 1305031472.961213 depth/1305031472.961213.png

Any other TUM dataset downloaded from the same site needs the same treatment.
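What associate.py does is pair each RGB timestamp with the closest depth timestamp within a tolerance (0.02 s by default), since the two cameras do not fire at exactly the same instants. A simplified Python 3 sketch of the idea (the real script reads the .txt files and ranks all candidate pairs globally; this greedy version is just for illustration):

```python
def associate(rgb_stamps, depth_stamps, max_difference=0.02):
    """Pair each rgb timestamp with the nearest unused depth timestamp,
    keeping only pairs closer than max_difference seconds."""
    remaining = sorted(depth_stamps)
    matches = []
    for t in sorted(rgb_stamps):
        if not remaining:
            break
        best = min(remaining, key=lambda d: abs(d - t))
        if abs(best - t) <= max_difference:
            matches.append((t, best))
            remaining.remove(best)  # each depth frame is used at most once
    return matches

rgb = [1305031472.895713, 1305031472.927685]
depth = [1305031472.892944, 1305031472.924814]
print(associate(rgb, depth))
```

Each matched pair becomes one line of associate.txt: rgb stamp, rgb path, depth stamp, depth path.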
6.2. Running the project

Enter the project's GCN2 directory and run the command below (I changed all paths in it to relative ones):

# remember to export the VNC display variable
export DISPLAY=:1
# run the project
cd GCN2
GCN_PATH=gcn2_320x240.pt ./rgbd_gcn ../Vocabulary/GCNvoc.bin TUM3_small.yaml ../datasets/TUM/rgbd_dataset_freiburg1_desk ../datasets/TUM/rgbd_dataset_freiburg1_desk/associate.txt

The project runs, with image output in VNC. The output after the run finishes:
[root@autodl-container-e39d46b8d3-01da7b14:~/autodl-tmp/GCNv2_SLAM/GCN2]$ GCN_PATH=gcn2_320x240.pt ./rgbd_gcn ../Vocabulary/GCNvoc.bin TUM3_small.yaml ../datasets/TUM/rgbd_dataset_freiburg1_desk ../datasets/TUM/rgbd_dataset_freiburg1_desk/associate.txt

ORB-SLAM2 Copyright (C) 2014-2016 Raul Mur-Artal, University of Zaragoza.
This program comes with ABSOLUTELY NO WARRANTY;
This is free software, and you are welcome to redistribute it
under certain conditions. See LICENSE.txt.

Input sensor was set to: RGB-D

Loading ORB Vocabulary. This could take a while...
Vocabulary loaded!

Camera Parameters:
- fx: 267.7
- fy: 269.6
- cx: 160.05
- cy: 123.8
- k1: 0
- k2: 0
- p1: 0
- p2: 0
- fps: 30
- color order: RGB (ignored if grayscale)ORB Extractor Parameters:
- Number of Features: 1000
- Scale Levels: 8
- Scale Factor: 1.2
- Initial Fast Threshold: 20
- Minimum Fast Threshold: 7

Depth Threshold (Close/Far Points): 5.97684
-------
Start processing sequence ...
Images in the sequence: 573

Framebuffer with requested attributes not available. Using available framebuffer. You may see visual artifacts.

New map created with 251 points
Finished!
-------
median tracking time: 0.0187857
mean tracking time: 0.0193772

Saving camera trajectory to CameraTrajectory.txt ... trajectory saved!
Saving keyframe trajectory to KeyFrameTrajectory.txt ... trajectory saved!

A median of 0.0187857 s per frame is roughly 53 Hz, still quite far from the 80 Hz the paper reports on a GTX 1070 laptop GPU.
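The Hz figure is just the reciprocal of the per-frame tracking time:

```python
median_tracking_time = 0.0187857             # seconds per frame, from the run above
print(round(1.0 / median_tracking_time, 1))  # → 53.2 frames per second
```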
Later runs came out a bit slower, but still many times faster than the CPU run:

median tracking time: 0.0225817
mean tracking time: 0.0236844

7. Trying a 4090 (failed)
7.1. Environment

I also tried a 4090 instance, with the environment below; the 4090 cannot select any older PyTorch image.

PyTorch 1.11.0
Python 3.8 (ubuntu20.04)
Cuda 11.3

All dependencies are installed with the same commands as before. The matching libtorch download for PyTorch 1.11.0 is:

https://download.pytorch.org/libtorch/cu113/libtorch-cxx11-abi-shared-with-deps-1.11.0%2Bcu113.zip

The package is large, about 1.6 GB, so expect a long wait; better to download it locally in advance and upload it, since every minute on AutoDL costs money. In the end the project builds fine, with the same code changes described above.

7.2. Dataset preprocessing
In the PyTorch 1.11.0 image, Python 2 has to be installed differently to preprocess the dataset, mainly because the python-pip package is reported unavailable and cannot be installed directly:

apt-get install -y python-dev-is-python2
wget https://bootstrap.pypa.io/pip/2.7/get-pip.py
python2 get-pip.py

This yields the following python2; then install numpy as usual and run the script:

root@autodl-container-64eb44b6f5-c569ba8d:~# python2 -V
Python 2.7.18
root@autodl-container-64eb44b6f5-c569ba8d:~# pip2 -V
pip 20.3.4 from /usr/local/lib/python2.7/dist-packages/pip (python 2.7)

7.3. Running GCN2 core-dumps
Start the program with the same command:

cd GCN2
GCN_PATH=gcn2_320x240.pt ./rgbd_gcn ../Vocabulary/GCNvoc.bin TUM3_small.yaml ../datasets/TUM/rgbd_dataset_freiburg1_desk ../datasets/TUM/rgbd_dataset_freiburg1_desk/associate.txt

And it core-dumps:
Camera Parameters:
- fx: 267.7
- fy: 269.6
- cx: 160.05
- cy: 123.8
- k1: 0
- k2: 0
- p1: 0
- p2: 0
- fps: 30
- color order: RGB (ignored if grayscale)
terminate called after throwing an instance of 'c10::Error'
  what():  Legacy model format is not supported on mobile.
Exception raised from deserialize at ../torch/csrc/jit/serialization/import.cpp:267 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7fefb6de20eb in /root/autodl-tmp/libtorch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0xd1 (0x7fefb6dddc41 in /root/autodl-tmp/libtorch/lib/libc10.so)
frame #2: <unknown function> + 0x35dd53d (0x7feff3ef353d in /root/autodl-tmp/libtorch/lib/libtorch_cpu.so)
frame #3: torch::jit::load(std::shared_ptr<caffe2::serialize::ReadAdapterInterface>, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) + 0x1cd (0x7feff3ef48ad in /root/autodl-tmp/libtorch/lib/libtorch_cpu.so)
frame #4: torch::jit::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) + 0xc1 (0x7feff3ef64c1 in /root/autodl-tmp/libtorch/lib/libtorch_cpu.so)
frame #5: torch::jit::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>) + 0x6f (0x7feff3ef65cf in /root/autodl-tmp/libtorch/lib/libtorch_cpu.so)
frame #6: ORB_SLAM2::GCNextractor::GCNextractor(int, float, int, int, int) + 0x670 (0x7ff071e213c0 in /root/autodl-tmp/GCNv2_SLAM/lib/libORB_SLAM2.so)
frame #7: ORB_SLAM2::Tracking::Tracking(ORB_SLAM2::System*, DBoW2::TemplatedVocabulary<cv::Mat, DBoW2::FORB>*, ORB_SLAM2::FrameDrawer*, ORB_SLAM2::MapDrawer*, ORB_SLAM2::Map*, ORB_SLAM2::KeyFrameDatabase*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int) + 0x1e7e (0x7ff071dfcf0e in /root/autodl-tmp/GCNv2_SLAM/lib/libORB_SLAM2.so)
frame #8: ORB_SLAM2::System::System(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ORB_SLAM2::System::eSensor, bool) + 0x5ae (0x7ff071de459e in /root/autodl-tmp/GCNv2_SLAM/lib/libORB_SLAM2.so)
frame #9: main + 0x22f (0x5609d811ae2f in ./rgbd_gcn)
frame #10: __libc_start_main + 0xf3 (0x7fefb704a083 in /lib/x86_64-linux-gnu/libc.so.6)
frame #11: _start + 0x2e (0x5609d811c7ce in ./rgbd_gcn)
Aborted (core dumped)

I could not find a solution to this problem, so I gave up.
GCNv2 is an old project by now, so trouble on 40-series cards is not surprising. There is a blog post out there about running GCNv2 on a 4060 laptop, but it never mentions this coredump, and GPT offered no workable fix either, so I stopped wasting time on it.

8. The end

This article successfully ran GCNv2_SLAM on a 2080 Ti instance. The speed still falls short of the 80 Hz the paper achieved on a 1070 laptop GPU, but it beats the crawl of local CPU execution many times over.