Saturday, February 23, 2013

Memory Debugging: A Crash Course
Getting rid of stack errors, buffer errors, exceptions, segmentation faults, memory leaks, pointer bounds errors, and similar bugs is not straightforward when working with pointers, files, and string operations. These bugs are inevitable while programming and can cause erratic behaviour that differs from one machine to another. Most of the time they can be removed cleanly using simple debugging tools. I have been using "valgrind", a simple, intuitive, and powerful tool for debugging memory / data / file operations generated...
"A User's Guide to MPI" Peter S. Pacheco http://www.academia.edu/2602564/_A_Users_Guide_to_MPI_by_Peter_S._Pacheco ...

Monday, February 18, 2013

Basic Metrics of a CUDA application
After developing a CUDA application, the costly routines (in terms of runtime) need to be tuned or optimised for better performance. The primary metrics for identifying whether a kernel is memory intensive or computationally intensive are the L1 / L2 cache hit rates, the data throughput, and the computational throughput. This article discusses the data and computational throughput. Let's understand this using a simple example: the kernel SUM adds two vectors a and b and stores the result in c. __global__...

Tuesday, February 12, 2013

Compilers, Debuggers & Profilers
Part A : Compilers
1. GNU compilers [ gcc / g++ / gfortran ] - free
2. Intel compilers [ icc / icpc / ifort ] - free for Linux
Part B : Debuggers
1. GNU debugger [ gdb ] - free
2. Intel debugger [ idb ] - free for Linux
Part C : Profilers
1. Valgrind (command line), Valkyrie (GUI)
   Purpose : memory debugging, memory leak detection, and profiling
   Link : http://valgrind.org
2. nvprof (command line), computeprof / nvvp (GUI)
   Purpose : profiling
   Languages : CUDA
Some available lists; A...
CUDA Enabled Devices - Metrics Reference (Using computeprof / nvvp)
This section contains detailed descriptions of the metrics that can be collected by the Visual Profiler. These metrics can be collected only from within the Visual Profiler. The command-line profiler and nvprof can collect low-level events but are not capable of collecting metrics.
Capability 2.x Metrics
Metric Name | Description | Formula
sm_efficiency | The ratio of the time at least one warp is active on a multiprocessor to the total time | 100 * (active_cycles...
Fortran binding in C
Many libraries and useful routines are written in FORTRAN. If you intend to use them in your C code, here are two ways to do so.
Warning: when working with multi-dimensional arrays, account for the row-major layout of C and the column-major layout of FORTRAN. Also take care of data type compatibility across the two languages.
Part A - Using .f / .f90 directly
1. Compile the .f / .f90 files to a .o object file using: FC -c -O filename.f
   This generates the filename.o object file...

Sunday, February 10, 2013

Profiling a CUDA application
Tools required : NVIDIA's Visual Profiler ( nvprof / computeprof )
Readme : http://docs.nvidia.com/cuda/profiler-users-guide/index.html
First, identify an algorithm's 'heavy' areas, i.e. the most time-consuming routines or kernels. Then prepare the code for profiling:
1. Include the header cuda_profiler_api.h (or cudaProfiler.h for the driver API).
2. Add functions to start and stop profile data collection. cudaProfilerStart() is used to start profiling, cudaProfilerStop()...

Friday, February 8, 2013

Sparse (& dense) matrix libraries
A general survey
Useful for BLAS / LAPACK type implementations (but not limited to them!) and possibly iterative methods, with support for compressed sparse matrix formats. FORTRAN flavours can be found analogously. A few libraries also include parallel features.
Intel MKL (highly recommended!) http://software.intel.com/en-us/intel-mkl
C++ http://seldon.sourceforge.net/
C++ / POSIX http://plasimo.phys.tue.nl/TBCI/
C++ / http://math.nist.gov/sparselib++/
Linear solvers (not sure if sparse functionality...
A comprehensive list of numerical libraries ... for daily use
A library used is a hundred bugs eliminated
Multi-language
ALGLIB is an open source numerical analysis library which may be used from C++, C#, FreePascal, Delphi, VBA.
IMSL Numerical Libraries are libraries of numerical analysis functionality implemented in standard programming languages like C, Java, C# .NET, Fortran, and Python.
The NAG Library is a collection of mathematical and statistical routines for multiple programming languages (C, C++,...