So the other day, one of my friends came to my room asking for help with LAMMPS, a molecular dynamics library. He got the basics running by installing the pre-built Ubuntu Linux executables:
sudo add-apt-repository ppa:gladky-anton/lammps
sudo apt-get update
sudo apt-get install lammps-daily
But then he realized that the apt-get version only uses one CPU thread instead of multiple, which would be too slow for his simulation. He also has an Nvidia Quadro GPU that he hoped could be put to work on his simulations. So I dug around, trying to find a way to build LAMMPS with both the multi-core and GPU packages. The annoying part was that the LAMMPS community threads are a little outdated, and the newest one I could find was still discussing Kepler GPUs… So, after ~15 hours of digging and GDB-level debugging (absolutely painful for someone who hasn’t taken a single course in C/C++), the library was working. Here is everything you need to know to get LAMMPS running on Linux with an Nvidia GPU or a multi-core CPU.
I have one in my GitHub repo that is compiled for CUDA compute capability 5.0, so you are also welcome to simply download a pre-compiled version of LAMMPS with GPU support.
Official documentation can be found at http://lammps.sandia.gov/doc/Manual.html .
First, download the official LAMMPS repository from GitHub and switch to the stable branch:
~$ git clone https://github.com/lammps/lammps
Cloning into 'lammps'...
remote: Counting objects: 147198, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 147198 (delta 2), reused 0 (delta 0), pack-reused 147194
Receiving objects: 100% (147198/147198), 357.63 MiB | 3.26 MiB/s, done.
Resolving deltas: 100% (125710/125710), done.
Checking out files: 100% (9710/9710), done.
~$ cd lammps
~/lammps$ git checkout stable
Switched to branch 'stable'
Before going further, we need to check the versions of the dependencies to make sure everything is at the right version. I was using an old gcc compiler at first, and the build kept failing.
~/lammps$ gcc -v
<omitted>
gcc version 6.4.0 20180424 (Ubuntu 6.4.0-17ubuntu1~16.04)
g++ and gcc should both be 6.x.x here; if not, add the PPA, update, then install the correct version:
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-6 g++-6
Another problem is that multiple gcc/g++ versions can exist on the same computer, each with a different priority. Thus, we need to set the 6.x.x version as the default:
~/lammps$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-6 100 --slave /usr/bin/g++ g++ /usr/bin/g++-6
~/lammps$ sudo update-alternatives --config gcc
There are 3 choices for the alternative gcc (providing /usr/bin/gcc).

  Selection    Path               Priority   Status
------------------------------------------------------------
* 0            /usr/bin/gcc-6      100       auto mode
  1            /usr/bin/gcc-4.8     50       manual mode
  2            /usr/bin/gcc-4.9     60       manual mode
  3            /usr/bin/gcc-6      100       manual mode

Press <enter> to keep the current choice[*], or type selection number: 0
gcc should now be set as the default compiler. If not, type in the number associated with gcc-6 to make it the default.
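After switching, it is worth confirming that gcc and g++ report the same major version, since the LAMMPS build links C and C++ objects together. A minimal sanity-check sketch (assuming a standard Ubuntu toolchain):

```shell
# Sanity check: gcc and g++ should report the same major version,
# otherwise mixed C/C++ objects in the LAMMPS build may be incompatible.
gcc_major=$(gcc -dumpversion | cut -d. -f1)
gxx_major=$(g++ -dumpversion | cut -d. -f1)
if [ "$gcc_major" = "$gxx_major" ]; then
    echo "compilers match (major version $gcc_major)"
else
    echo "mismatch: gcc $gcc_major vs g++ $gxx_major" >&2
fi
```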
The next library to check is OpenMPI, a high-performance message-passing library. The version that worked for me is 3.1.0; to install the newest version, go to https://www.open-mpi.org/software/ompi/v3.1/ and download the tarball. A correct install reports:
mpirun --version
mpirun (Open MPI) 3.1.0

Report bugs to http://www.open-mpi.org/community/help/
Then follow the steps from the official documentation to build the library:
~/Downloads$ gunzip -c openmpi-3.1.0.tar.gz | tar xf -
~/Downloads$ cd openmpi-3.1.0
~/Downloads/openmpi-3.1.0$ ./configure --prefix=/usr/local
<...lots of output...>
~/Downloads/openmpi-3.1.0$ make all install
~/Downloads/openmpi-3.1.0$ cd ..
~/Downloads$ sudo rm -r openmpi-3.1.0
Last, CUDA and the CUDA toolkit should both be version 9.0. Run the following commands to check them:
~/lammps$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
~/lammps$ nvidia-smi
Wed Jun  6 21:53:17 2018
<omitted>
| NVIDIA-SMI 384.130                Driver Version: 384.130     <omitted>
| GPU  Name        Persistence-M | Bus-Id        Disp.A | <omitted>
|   0  Quadro M2000M        Off  | 00000000:01:00.0 Off | <omitted>
If not, search online for instructions to install CUDA 9.0 on your computer.
Once all dependencies have been verified, we can move on and explore the LAMMPS build options.
Go to the /src directory, and type make in the terminal to list the available build targets:
~/lammps$ cd src
~/lammps/src$ make

make clean-all               delete all object files
make clean-machine           delete object files for one machine
make mpi-stubs               build dummy MPI library in STUBS
make install-python          install LAMMPS wrapper in Python
make tar                     create lmp_src.tar.gz for src dir and packages

make package                 list available packages and their dependencies
make package-status (ps)     status of all packages
make package-installed (pi)  list of installed packages

make yes-package             install a single pgk in src dir
make no-package              remove a single pkg from src dir
make yes-all                 install all pgks in src dir
make no-all                  remove all pkgs from src dir
make yes-standard (yes-std)  install all standard pkgs
make no-standard (no-std)    remove all standard pkgs
make yes-user                install all user pkgs
make no-user                 remove all user pkgs
make yes-lib                 install all pkgs with libs (included or ext)
make no-lib                  remove all pkgs with libs (included or ext)
make yes-ext                 install all pkgs with external libs
make no-ext                  remove all pkgs with external libs

make package-update (pu)     replace src files with updated package files
make package-overwrite       replace package files with src files
make package-diff (pd)       diff src files against package files

make lib-package             help for download/build/install a package library
make lib-package args="..."  download/build/install a package library
make purge                   purge obsolete copies of source files

make machine                 build LAMMPS for machine
make mode=lib machine        build LAMMPS as static lib for machine
make mode=shlib machine      build LAMMPS as shared lib for machine
make mode=shexe machine      build LAMMPS as shared exe for machine
make makelist                create Makefile.list used by old makes
make -f Makefile.list machine    build LAMMPS for machine (old)

machine is one of these from src/MAKE:
# mpi = MPI with its default compiler
# serial = GNU g++ compiler, no MPI
...
or one of these from src/MAKE/OPTIONS:
# big = MPI with its default compiler, BIGBIG switch
# fftw = MPI with its default compiler, FFTW3 support
# g++_mpich = MPICH with compiler set to GNU g++
# g++_mpich_link = GNU g++ compiler, link to MPICH
# g++_openmpi = OpenMPI with compiler set to GNU g++
# g++_openmpi_link = GNU g++ compiler, link to OpenMPI
# gpu = GPU package, MPI with its default compiler
# g++_serial = GNU g++ compiler, no MPI
# icc_mpich = MPICH with compiler set to Intel icc
# icc_mpich_link = Intel icc compiler, link to MPICH
# icc_openmpi = OpenMPI with compiler set to Intel icc
# icc_openmpi_link = Intel icc compiler, link to OpenMPI
# icc_serial = Intel icc compiler, no MPI
# intel_phi = USER-INTEL package with Phi offload support, Intel MPI, MKL FFT
# intel_cpu_intelmpi = USER-INTEL package, Intel MPI, MKL FFT
# intel_cpu_mpich = USER-INTEL package, MPICH with compiler set to Intel icc
# intel_cpu_openmpi = USER-INTEL package, OpenMPI with compiler set to Intel icc
# jpeg = default MPI compiler, default MPI, JPEG support
# knl = Flags for Knights Landing Xeon Phi Processor, Intel Compiler/MPI, MKL FFT
# kokkos_cuda_mpi = KOKKOS/CUDA package, MPICH or OpenMPI with nvcc compiler, Kepler GPU
# kokkos_mpi_only = KOKKOS package, no threading, MPI with its default compiler
# kokkos_omp = KOKKOS/OMP package, MPI with its default compiler
# kokkos_phi = KOKKOS package with PHI support, Intel compiler, default MPI
# mgptfast = MPI with its default compiler, optimizations for USER-MGPT
# omp = USER-OMP package, MPI with its default compiler
# opt = OPT package, MPI with its default compiler
# pgi_mpich_link = Portland group compiler, link to MPICH
# png = default MPI compiler, default MPI, PNG support
...
or one of these from src/MAKE/MACHINES:
# linux = RedHat Linux box, Intel icc, MPICH2, FFTW
# bgl = LLNL Blue Gene Light machine, xlC, native MPI, FFTW
# bgq = IBM Blue Gene/Q, multiple compiler options, native MPI, ALCF FFTW2
# chama = Intel SandyBridge, mpic++, openmpi, no FFTW
# cori2 = NERSC Cori II KNL, static build, FFTW (single precision)
# cygwin = Windows Cygwin, mpicxx, MPICH, FFTW
# glory = Linux cluster with 4-way quad cores, Intel mpicxx, native MPI, FFTW
# mpi = MPI with its default compiler
# jaguar = ORNL Jaguar Cray XT5, CC, native MPICH, FFTW
# mac = Apple PowerBook G4 laptop, c++, no MPI, FFTW 2.1.5
# mac_mpi = Apple laptop, MacPorts Open MPI 1.4.3, gcc 4.8, fftw, jpeg
# mingw32-cross = Win 32-bit, gcc-4.7.1, MinGW, internal FFT, no MPI, OpenMP
# mingw32-cross-mpi = Win 32-bit, gcc-4.7.1, MinGW, internal FFT, MPICH2, OpenMP
# mingw64-cross = Win 64-bit, gcc-4.7.1, MinGW, internal FFT, no MPI, OpenMP
# mingw64-cross-mpi = Win 64-bit, gcc-4.7.1, MinGW, internal FFT, MPICH2, OpenMP
# myrinet = cluster, g++, myrinet MPI, no FFTs
# power = IBM Power5+, mpCC_r, native MPI, FFTW
# redsky = SUN X6275 nodes, Nehalem procs, mpic++, openmpi, OpenMP, no FFTW
# serial = RedHat Linux box, g++4, no MPI, no FFTs
# stampede = Intel Compiler, MKL FFT, Offload to Xeon Phi
# storm = Cray Red Storm XT3, Cray CC, native MPI, FFTW
# tacc = UT Lonestar TACC machine, mpiCC, MPI, FFTW
# ubuntu = Ubuntu Linux box, g++, openmpi, FFTW3
# ubuntu_simple = Ubuntu Linux box, g++, openmpi, KISS FFT
# kokkos_cuda = KOKKOS/CUDA package, OpenMPI with nvcc compiler, Kepler GPU
# xe6 = Cray XE6, Cray CC, native MPI, FFTW
# xt3 = PSC BigBen Cray XT3, CC, native MPI, FFTW
# xt5 = Cray XT5, Cray CC, native MPI, FFTW
...
or one of these from src/MAKE/MINE:
We are building for mpi with its default compilers. Before proceeding, check which packages your simulation script requires and install them. It is always good to run make clean-all and make no-all first to ensure a clean compilation.
Running across multiple CPU cores in LAMMPS is handled by MPI, and the OPT package adds optimized versions of several pair styles on top of that. The installation is simple: type make yes-opt along with the rest of the required packages, then make mpi:
~/lammps/src$ make yes-opt
Installing package opt
~/lammps/src$ make mpi
make: Entering directory '/home/a***o/lammps/src/Obj_mpi'
cc -O -o fastdep.exe ../DEPEND/fastdep.c
<...lots of output...>
size ../lmp_mpi
   text    data     bss     dec     hex filename
3804320    8056   14184 3826560  3a6380 ../lmp_mpi
make: Leaving directory '/home/a***o/lammps/src/Obj_mpi'
There might be warnings during the process of creating the executable, but it is okay to ignore them.
~/lammps/src$ find . -name lmp_mpi
./lmp_mpi
The resulting binary, lmp_mpi, is located in /src, and one may use the cp command to copy it elsewhere for later use.
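For example, you might keep a copy outside the build tree so that a later make clean-all cannot touch it. A small sketch (the ~/bin destination and variable names here are just illustrative; any directory on your PATH works):

```shell
# Copy the freshly built binary out of the build tree.
# BUILD_BIN and DEST are illustrative paths, not required by LAMMPS.
BUILD_BIN=~/lammps/src/lmp_mpi
DEST=~/bin
mkdir -p "$DEST"
if [ -f "$BUILD_BIN" ]; then
    cp "$BUILD_BIN" "$DEST/"
    echo "installed lmp_mpi to $DEST"
fi
```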
To start a simulation on multiple CPU cores, type the following command:
mpirun -np 8 lmp_mpi -sf opt -in your_simulation.lmp
The number after -np sets the number of MPI processes (8 here, matching the threads on my Xeon CPU), and the opt after -sf indicates that the OPT package is being used. If you would like to start LAMMPS interactively, without an input file, remove the -in flag and its argument:
~/lammps/src$ mpirun -np 8 lmp_mpi -sf opt
LAMMPS (11 May 2018)
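Rather than hard-coding 8, you can derive the rank count from the machine. This sketch uses nproc (GNU coreutils) and only prints the launch line, since the actual run requires a built lmp_mpi and an input script:

```shell
# Pick one MPI rank per available core; oversubscribing (np > cores)
# usually just slows the run down.
NP=$(nproc)
echo "mpirun -np $NP lmp_mpi -sf opt -in your_simulation.lmp"
```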
The GPU side is more complicated. Since the GPU package is built for a specific sm architecture (corresponding to the card's compute capability), we need to rebuild the library.
~/lammps/src$ cd ..
~/lammps$ cd lib
~/lammps/lib$ cd gpu
~/lammps/lib/gpu$ ls -a
~/lammps/lib/gpu$ vim Makefile.linux.double
Go to the lib/gpu directory, and open the file named Makefile.linux.double:
# /* ----------------------------------------------------------------------
#    Generic Linux Makefile for CUDA
#       - Change CUDA_ARCH for your GPU
# ------------------------------------------------------------------------- */

# which file will be copied to Makefile.lammps

EXTRAMAKE = Makefile.lammps.standard

CUDA_HOME = /usr/local/cuda
NVCC = nvcc

# Kepler CUDA
#CUDA_ARCH = -arch=sm_35
# Tesla CUDA
CUDA_ARCH = -arch=sm_21
# newer CUDA
#CUDA_ARCH = -arch=sm_13
# older CUDA
#CUDA_ARCH = -arch=sm_10 -DCUDA_PRE_THREE

# this setting should match LAMMPS Makefile
# one of LAMMPS_SMALLBIG (default), LAMMPS_BIGBIG and LAMMPS_SMALLSMALL

LMP_INC = -DLAMMPS_SMALLBIG
Since the default install path of CUDA 9.0 is /usr/local/cuda-9.0, the CUDA_HOME variable should be changed accordingly. Also, the number after sm_ should match the compute capability of your Nvidia GPU. For more information, refer to the official Nvidia website.
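For the Quadro M2000M here (compute capability 5.0), the change is a one-liner. The demo below applies the sed to a scratch file so you can see the effect; run the same substitution against lib/gpu/Makefile.linux for the real build:

```shell
# Demonstrate the arch switch on a scratch file; apply the same sed
# to lib/gpu/Makefile.linux for the real build.
printf 'CUDA_ARCH = -arch=sm_21\n' > /tmp/arch_demo.mk
sed -i 's/-arch=sm_21/-arch=sm_50/' /tmp/arch_demo.mk
cat /tmp/arch_demo.mk    # CUDA_ARCH = -arch=sm_50
```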
If a former build exists, use make -f Makefile.linux clean to remove it. Then type make -f Makefile.linux into the terminal:
~/lammps/lib/gpu$ make -f Makefile.linux
<...lots of output...>
mpicxx -DMPI_GERYON -DUCL_NO_EXIT -DMPICH_IGNORE_CXX_SEEK -DOMPI_SKIP_MPICXX=1 -fPIC -O2 -DLAMMPS_SMALLBIG -D_SINGLE_DOUBLE -I/usr/local/cuda-9.0/include -DUSE_CUDPP -Icudpp_mini -o nvc_get_devices ./geryon/ucl_get_devices.cpp -DUCL_CUDADR -L/usr/local/cuda-9.0/lib64 -lcuda
A new executable, nvc_get_devices, should appear in the directory. Run it to see whether LAMMPS can get the correct information about your GPU.
~/lammps/lib/gpu$ ./nvc_get_devices
Found 1 platform(s).
Using platform: NVIDIA Corporation NVIDIA CUDA Driver
CUDA Driver Version: 9.0

Device 0: "Quadro M2000M"
  Type of device:                                GPU
  Compute capability:                            5
  Double precision support:                      Yes
  Total amount of global memory:                 3.94745 GB
  Number of compute units/multiprocessors:       5
  Number of cores:                               960
  Total amount of constant memory:               65536 bytes
  Total amount of local/shared memory per block: 49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum group size (# of threads per block)    1024 x 1024 x 64
  Maximum item sizes (# threads for each dim)    2147483647 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Clock rate:                                    1.137 GHz
  Run time limit on kernels:                     Yes
  Integrated:                                    No
  Support host page-locked memory mapping:       Yes
  Compute mode:                                  Default
  Concurrent kernel execution:                   Yes
  Device has ECC support enabled:                No
Finally, go back to /src, install the gpu package, and compile with MPI again.
~/lammps/src$ make yes-gpu
Installing package gpu
~/lammps/src$ make mpi
make: Entering directory '/home/a***o/lammps/src/Obj_mpi'
cc -O -o fastdep.exe ../DEPEND/fastdep.c
<...lots of output...>
size ../lmp_mpi
   text    data     bss     dec     hex filename
8563129   12048  163568 8738745  8557b9 ../lmp_mpi
make: Leaving directory '/home/a***o/lammps/src/Obj_mpi'
Again, there will be lots of warnings, but that is perfectly normal.
To run with the GPU package, type:
mpirun lmp_mpi -sf gpu -pk gpu 1 -in your_simulation.lmp
The gpu after -sf indicates that the GPU package is being used, and -pk gpu 1 tells LAMMPS to use one GPU (raise the count if multiple cards are available). If your build is successful, you should see something like the following when querying the GPU usage through watch -n 1 nvidia-smi:
~/lammps/src$ mpirun lmp_mpi -sf gpu -pk gpu 1
LAMMPS (16 Mar 2018)
|    0     30068      C   lmp_mpi                                     26MiB |
|    0     30069      C   lmp_mpi                                     26MiB |
|    0     30070      C   lmp_mpi                                     26MiB |
|    0     30071      C   lmp_mpi                                     26MiB |
+-----------------------------------------------------------------------------+