LibND4J

Native operations for ND4J, the C++ engine that powers n-dimensional arrays for Java. Built using CMake.

Prerequisites

  • GCC 4+ or Clang
  • CUDA (if building the GPU backend)
  • CMake
  • A BLAS implementation such as OpenBLAS
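
To verify the prerequisites are available on your PATH, a quick check could look like this (nvcc is only needed if you build the CUDA backend):

gcc --version
cmake --version
nvcc --version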

Additional build arguments

There are a few additional arguments for the buildnativeoperations.sh script you can use:

 -a // shortcut for -march/-mtune, e.g. -a native
 -b release OR -b debug // enables/disables debug builds; release is the default
 -cc // CUDA-only argument; builds binaries only for the target GPU architecture. Use this for faster builds
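
For example, a release build of the CPU backend tuned for the local machine, followed by a CUDA debug build targeting a single device architecture, could look like this (the value 61 is only illustrative; substitute your device's compute capability, as with YOUR_DEVICE_ARCH below):

./buildnativeoperations.sh -a native -b release
./buildnativeoperations.sh -c cuda -cc 61 -b debug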

OS Specific Requirements

Android

Download the NDK, extract it somewhere, and execute the following commands, replacing android-xxx with either android-arm or android-x86:

git clone https://github.com/bytedeco/javacpp-presets
git clone https://github.com/deeplearning4j/libnd4j
git clone https://github.com/deeplearning4j/nd4j
export ANDROID_NDK=/path/to/android-ndk/
export LIBND4J_HOME=$PWD/libnd4j/
export OpenBLAS_HOME=$PWD/javacpp-presets/openblas/cppbuild/android-xxx/
cd javacpp-presets/openblas
bash cppbuild.sh install -platform android-xxx
cd ../../libnd4j
bash buildnativeoperations.sh -platform android-xxx
cd ../nd4j
mvn clean install -Djavacpp.platform=android-xxx -DskipTests -pl '!nd4j-backends/nd4j-backend-impls/nd4j-cuda,!nd4j-backends/nd4j-backend-impls/nd4j-cuda-platform'

OSX

Run ./setuposx.sh (please ensure you have Homebrew installed).

See macOSx10 (CPU only).md

Linux

Requirements depend on the distribution; ask in the earlyadopters channel for distro-specific instructions.

Ubuntu Linux 15.10

wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-ubuntu1504-7-5-local_7.5-18_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1504-7-5-local_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install cuda
sudo apt-get install libopenblas-dev
sudo apt-get install cmake
sudo apt-get install gcc-4.9
sudo apt-get install g++-4.9
sudo apt-get install git
git clone https://github.com/deeplearning4j/libnd4j
cd libnd4j/
export LIBND4J_HOME=~/libnd4j/
sudo rm /usr/bin/gcc
sudo rm /usr/bin/g++
sudo ln -s /usr/bin/gcc-4.9 /usr/bin/gcc
sudo ln -s /usr/bin/g++-4.9 /usr/bin/g++
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH

Ubuntu Linux 16.04

sudo apt install libopenblas-dev
sudo apt install cmake
sudo apt install nvidia-cuda-dev nvidia-cuda-toolkit nvidia-361
export TRICK_NVCC=YES
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH

The standard development headers are needed.
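
On Ubuntu these typically come from the build-essential package (gcc, g++, make and the libc development headers):

sudo apt install build-essential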

CentOS 6

yum install centos-release-scl-rh epel-release
yum install devtoolset-3-toolchain maven30 cmake3 git openblas-devel
scl enable devtoolset-3 maven30 bash
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH

Windows

See windows.md

Setup for All OS

  1. Set LIBND4J_HOME as an environment variable pointing to the libnd4j folder you've cloned from Git (see the example after this list).

    • Note: this is required for building nd4j as well.
  2. To build the CPU backend followed by the GPU backend, run the following on the command line:

    • For standard builds:

      ./buildnativeoperations.sh
      ./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
    • For Debug builds:

      ./buildnativeoperations.sh blas -b debug
      ./buildnativeoperations.sh blas -c cuda -cc YOUR_DEVICE_ARCH -b debug
    • For release builds (default):

      ./buildnativeoperations.sh
      ./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
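
As noted in step 1, a simple way to set LIBND4J_HOME persistently is to add it to your shell profile (the path below is an example; point it at wherever you cloned libnd4j):

echo 'export LIBND4J_HOME=$HOME/libnd4j' >> ~/.bashrc
source ~/.bashrc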

Linking with MKL

We can link with MKL either at build time, or at runtime with binaries initially linked against another BLAS implementation such as OpenBLAS. In either case, simply add the path containing libmkl_rt.so (or mkl_rt.dll on Windows), say /path/to/intel64/lib/, to the LD_LIBRARY_PATH environment variable on Linux (or PATH on Windows), then build or run your Java application as usual. If you get an error message like undefined symbol: omp_get_num_procs, it probably means that libiomp5.so, libiomp5.dylib, or libiomp5md.dll is not present on your system. In that case it is still possible to use the GNU version of OpenMP by setting these environment variables on Linux, for example:

export MKL_THREADING_LAYER=GNU
export LD_PRELOAD=/usr/lib64/libgomp.so.1
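
Putting it together, a Linux session that links against MKL at runtime and falls back to the GNU version of OpenMP might look like this (the MKL path is illustrative, and your-application.jar is a placeholder for your own Java application):

export LD_LIBRARY_PATH=/path/to/intel64/lib:$LD_LIBRARY_PATH
export MKL_THREADING_LAYER=GNU
export LD_PRELOAD=/usr/lib64/libgomp.so.1
java -jar your-application.jar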