Booting on a 4A power supply, swap memory, changing clock speeds, and running the CUDA samples - Jetson Nano Developer Kit, Part 3

Trying out the Jetson Nano Developer Kit, Part 3

Other extensions

www.jetsonhacks.com

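The JetsonHacks site turns up all kinds of useful information. I wondered whether it was an official site; as far as I can tell it is an independent community site, not one run by NVIDIA.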

5V4A

The following article explains how to run the board from a 5V 4A power supply.

Jetson Nano - Use More Power! - JetsonHacks

Creating swap memory

Jetson Nano - Use More Memory! - JetsonHacks

$ git clone https://github.com/JetsonHacksNano/installSwapfile
$ cd installSwapfile
$ ./installSwapfile.sh
Creating Swapfile at:  /mnt
Swapfile Size:  6G
Automount:  Y
-rw-r--r-- 1 root root 6.0G  423 22:28 swapfile
-rw------- 1 root root 6.0G  423 22:28 swapfile
Setting up swapspace version 1, size = 6 GiB (6442446848 bytes)
no label, UUID=db94d31a-7bec-439a-ac17-8cccd0d5ebba
Filename                                Type            Size    Used    Priority
/mnt/swapfile                           file            6291452 0       -1
Modifying /etc/fstab to enable on boot
/mnt/swapfile
Swap file has been created
Reboot to make sure changes are in effect

$ free
              total        used        free      shared  buff/cache   available
Mem:        4059712      549092     3131308       18896      379312     3340012
Swap:       6291452           0     6291452
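
As a quick sanity check on the numbers above, here is a minimal sketch that converts the KiB figures printed by `free` into GiB. The function names `kib_to_gib` and `swap_total_gib` are made up for this note; they are not part of installSwapfile.

```shell
#!/bin/sh
# Hypothetical helper: convert a size in KiB (the default unit of `free`)
# into GiB with two decimal places.
kib_to_gib() {
    awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib / 1024 / 1024 }'
}

# Hypothetical helper: read `free` output on stdin and print the Swap total in GiB.
swap_total_gib() {
    awk '$1 == "Swap:" { printf "%.2f\n", $2 / 1024 / 1024 }'
}

# Example with the Swap total from the output above.
kib_to_gib 6291452   # prints 6.00
```

On the board itself this could be used as `free | swap_total_gib`.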

Raising the clock speeds

$ sudo jetson_clocks --show
SOC family:tegra210  Machine:jetson-nano
Online CPUs: 0-3
CPU Cluster Switching: Disabled
cpu0: Online=1 Governor=schedutil MinFreq=102000 MaxFreq=1428000 CurrentFreq=1428000 IdleStates: WFI=1 c7=1 
cpu1: Online=1 Governor=schedutil MinFreq=102000 MaxFreq=1428000 CurrentFreq=1326000 IdleStates: WFI=1 c7=1 
cpu2: Online=1 Governor=schedutil MinFreq=102000 MaxFreq=1428000 CurrentFreq=1428000 IdleStates: WFI=1 c7=1 
cpu3: Online=1 Governor=schedutil MinFreq=102000 MaxFreq=1428000 CurrentFreq=1326000 IdleStates: WFI=1 c7=1 
GPU MinFreq=76800000 MaxFreq=921600000 CurrentFreq=76800000
EMC MinFreq=204000000 MaxFreq=1600000000 CurrentFreq=1600000000 FreqOverride=0
Fan: speed=0
NV Power Mode: MAXN
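
The frequencies in this output are in Hz, which is hard to read at a glance. Below is a minimal sketch that pulls the GPU line out of `jetson_clocks --show` output and prints it in MHz. The function name `gpu_freq_mhz` is hypothetical, and the parsing assumes the line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `jetson_clocks --show` output on stdin and print
# the current and maximum GPU frequency in MHz.
# Assumes a line of the form "GPU MinFreq=... MaxFreq=... CurrentFreq=...".
gpu_freq_mhz() {
    awk '$1 == "GPU" {
        for (i = 2; i <= NF; i++) {
            split($i, kv, "=")
            if (kv[1] == "MaxFreq")     max = kv[2] / 1000000
            if (kv[1] == "CurrentFreq") cur = kv[2] / 1000000
        }
        printf "GPU current=%.1f MHz max=%.1f MHz\n", cur, max
    }'
}

# Example using the GPU line from the output above.
echo 'GPU MinFreq=76800000 MaxFreq=921600000 CurrentFreq=76800000' | gpu_freq_mhz
# prints: GPU current=76.8 MHz max=921.6 MHz
```

On the board this would be `sudo jetson_clocks --show | gpu_freq_mhz`. Note that `--show` only reports the state. To my understanding, running `sudo jetson_clocks` with no arguments is the step that actually pins the clocks at their maximum, after which CurrentFreq should match MaxFreq.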

Adding a cooling fan

The heatsink has screw holes, so a fan can be mounted directly on it.

Use a 5V fan that supports PWM control. The size is 40mm.

Controlling the fan

$ sudo sh -c 'echo 255 > /sys/devices/pwm-fan/target_pwm'
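
Writing 255 runs the fan at full speed. For anything in between, a small helper can map a percentage to the PWM value. This is a minimal sketch that assumes target_pwm accepts 0 to 255, with 255 as full speed as in the command above. The function name `percent_to_pwm` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: map a duty cycle in percent (0-100) to the 0-255
# value written to target_pwm, clamping out-of-range input.
percent_to_pwm() {
    pct=$1
    if [ "$pct" -lt 0 ]; then pct=0; fi
    if [ "$pct" -gt 100 ]; then pct=100; fi
    echo $(( pct * 255 / 100 ))
}

percent_to_pwm 50    # prints 127
```

On the board the result could be fed into the same sysfs file, for example `sudo sh -c "echo $(percent_to_pwm 50) > /sys/devices/pwm-fan/target_pwm"`.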

Running through the CUDA samples

Reference: Ubuntu Weekly Recipe No. 563, "Let's install Ubuntu on the NVIDIA Jetson Nano Developer Kit!" | gihyo.jp … Gijutsu-Hyohron

$ cp -a /usr/local/cuda-10.0/samples/ ~/
$ cd ~/samples/1_Utilities/deviceQuery
$ make
$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA Tegra X1"
  CUDA Driver Version / Runtime Version          10.0 / 10.0
  CUDA Capability Major/Minor version number:    5.3
  Total amount of global memory:                 3965 MBytes (4157145088 bytes)
  ( 1) Multiprocessors, (128) CUDA Cores/MP:     128 CUDA Cores
  GPU Max Clock rate:                            922 MHz (0.92 GHz)
  Memory Clock rate:                             13 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 262144 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            No
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS
$ cd ~/samples/5_Simulations/oceanFFT/
$ make
$ ./oceanFFT

[Screenshot: oceanFFT sample]

$ cd ~/samples/5_Simulations/smokeParticles/
$ make
make: Nothing to be done for 'all'

$ ./smokeParticles

[Screenshot: smokeParticles sample]

$ cd ~/samples/5_Simulations/nbody/
$ make
$ ./nbody

[Screenshot: nbody sample]

$ mkdir ~/visionworks/
$ cd /usr/share/visionworks/sources/
$ ./install-samples.sh ~/visionworks/
Creating the /home/kiyo/visionworks//VisionWorks-1.6-Samples directory...
Copying VisionWorks samples to /home/kiyo/visionworks//VisionWorks-1.6-Samples...
Finished copying VisionWorks samples

$ cd ~/visionworks/VisionWorks-1.6-Samples/demos/hough_transform/
$ make

Darknet install

Notes only.

Error

I got a "cudnn not found" error, so I added two lines to ~/.bashrc.

# CUDA and cuDNN paths
export PATH=/usr/local/cuda-10.0/bin/:${PATH}
export LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu/:${LD_LIBRARY_PATH}
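
Exports like these in a shell startup file can leave duplicate or empty entries in the variable when the file is sourced more than once or the variable starts out unset. As far as I know, an empty entry in a library search path is treated as the current directory. The following is a minimal sketch of a guard against both cases. The function name `prepend_path` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prepend directory $2 to the colon-separated list $1
# unless it is already present, without leaving an empty entry behind.
prepend_path() {
    case ":$1:" in
        *":$2:"*) echo "$1" ;;
        *) if [ -n "$1" ]; then echo "$2:$1"; else echo "$2"; fi ;;
    esac
}

prepend_path "/usr/bin:/bin" "/usr/local/cuda-10.0/bin"
# prints: /usr/local/cuda-10.0/bin:/usr/bin:/bin
```

In a startup file it could be used as `export PATH=$(prepend_path "$PATH" /usr/local/cuda-10.0/bin)`.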

$ git clone https://github.com/pjreddie/darknet.git
$ cd darknet
$ make

$ ./darknet 
usage: ./darknet <function>

OOM

Out of memory: Kill process 13202 (darknet) score 52 or sacrifice child

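Increase subdivisions in the cfg file. Darknet splits each batch into this many chunks, so a larger value means fewer images are held on the GPU at once.

Before: subdivisions=16
After: subdivisions=32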
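
To make the effect concrete: Darknet processes batch / subdivisions images per GPU pass. Below is a minimal sketch of that arithmetic plus an in-place edit of the cfg line. The batch value of 64 is an assumption for illustration (the cfg in use may differ), and both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: images processed per GPU pass for a given
# batch ($1) and subdivisions ($2) value.
images_per_pass() {
    echo $(( $1 / $2 ))
}

# Hypothetical helper: rewrite the subdivisions= line of cfg file $1 to value $2.
# Uses a temporary file instead of `sed -i` to stay portable.
set_subdivisions() {
    cfg=$1
    value=$2
    sed "s/^subdivisions=.*/subdivisions=$value/" "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
}

# Assuming batch=64:
images_per_pass 64 16   # prints 4
images_per_pass 64 32   # prints 2
```

Halving the images per pass roughly halves the activation memory needed at once, which is why this helps with the OOM kill. If 32 is still too much, the same helper can be pointed at the cfg with a larger value, up to the batch size.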