C++ Tutorial Multi-Threaded Programming Debugging - 2020

Multithread Debugging Tools

  1. TotalView

    1. Platforms: Linux, AIX, Solaris, Tru64, Cray (Linux-CNL, Catamount), Mac OS
    2. From Wikipedia: TotalView allows process control down to the single thread, the ability to look at data for a single thread or all threads at the same time, and the ability to synchronize threads through breakpoints. It integrates memory leak detection and other heap memory debugging features. Data analysis features help find anomalies and problems in the target program's data, and the combination of visualization and evaluation points lets the user watch data change as the program executes. TotalView includes the ability to test fixes while debugging. It supports parallel programming including Message Passing Interface (MPI), Unified Parallel C (UPC) and OpenMP. It can be extended to support debugging CUDA. It also has an optional add-on called ReplayEngine that can be used to perform reverse debugging (stepping backwards to look at older values of variables).


  2. Intel® Parallel Studio 2011

    It plugs into the Microsoft Visual Studio Integrated Development Environment by adopting a common runtime called the Microsoft Concurrency Runtime, which is part of Visual Studio 2010.
    1. Parallel Composer consists of the Intel C++ compiler, a number of performance libraries (Integrated Performance Primitives), Intel Threading Building Blocks and a parallel debugger extension.
    2. Parallel Inspector improves reliability by identifying memory errors and threading errors.
    3. Parallel Amplifier is a performance profiler that analyzes hotspots, concurrency and locks-and-waits.


  3. Oracle Solaris Studio 12.2

    Platforms: Solaris and Linux


  4. Visual Studio 2012


  5. Valgrind
    Valgrind is a programming tool for memory debugging, memory leak detection, and profiling. It is free software, released under the terms of the GNU General Public License.
    Here is a simple way to check for a memory leak. The code (test.c) looks like this:
    #include <stdlib.h>
    #include <stdio.h>
    int main()
    {
        printf("mem leak testing....\n");
        int *ptr =(int *)malloc(1000*sizeof(int));   /* never freed: this is the leak */
        return 0;
    }
    
            
    Compile and run with Valgrind:
    $ gcc -g -o test test.c
    $ valgrind --tool=memcheck --leak-check=full ./test
    ==2948== Memcheck, a memory error detector
    ==2948== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
    ==2948== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
    ==2948== Command: ./test
    ==2948== 
    mem leak testing....
    ==2948== 
    ==2948== HEAP SUMMARY:
    ==2948==     in use at exit: 4,000 bytes in 1 blocks
    ==2948==   total heap usage: 1 allocs, 0 frees, 4,000 bytes allocated
    ==2948== 
    ==2948== 4,000 bytes in 1 blocks are definitely lost in loss record 1 of 1
    ==2948==    at 0x402BB7A: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==2948==    by 0x804845C: main (test.c:6)
    ==2948== 
    ==2948== LEAK SUMMARY:
    ==2948==    definitely lost: 4,000 bytes in 1 blocks
    ==2948==    indirectly lost: 0 bytes in 0 blocks
    ==2948==      possibly lost: 0 bytes in 0 blocks
    ==2948==    still reachable: 0 bytes in 0 blocks
    ==2948==         suppressed: 0 bytes in 0 blocks
    ==2948== 
    ==2948== For counts of detected and suppressed errors, rerun with: -v
    ==2948== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
            
    Helgrind is a Valgrind tool that detects race conditions in multithreaded code.

    The following sample shows a race condition detected by Helgrind.

    The code with the race condition (test2.c) looks like this:

    #include <stdio.h>
    #include <pthread.h>
    
    static volatile int balance = 0;
    
    void *deposit(void *param)
    {
        char *who = param;
    
        int i;
        printf("%s: begin\n", who);
        for (i = 0; i < 1000000; i++) {
            balance = balance + 1;
        }
        printf("%s: done\n", who);
        return NULL;
    }
    
    int main()
    {
        pthread_t p1, p2;
        printf("main() starts depositing, balance = %d\n", balance);
        pthread_create(&p1, NULL, deposit, "A");
        pthread_create(&p2, NULL, deposit, "B");
    
        // join waits for the threads to finish
        pthread_join(p1, NULL);
        pthread_join(p2, NULL);
        printf("main() A and B finished, balance = %d\n", balance);
        return 0;
    }
            
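    Here is the Makefile. Note that test2.o is compiled without -g, which is why the Helgrind report below shows only addresses, not source lines, for deposit: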
    test2: test2.o
            gcc -g -o test2 test2.o -Wall -lpthread
    test2.o: test2.c
            gcc -c test2.c
    clean:
            rm -f *.o test2
            
    Run with valgrind:
    $ make
    $ valgrind --tool=helgrind ./test2
    ==3041== Helgrind, a thread error detector
    ==3041== Copyright (C) 2007-2011, and GNU GPL'd, by OpenWorks LLP et al.
    ==3041== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
    ==3041== Command: ./test2
    ==3041== 
    main() starts depositing, balance = 0
    A: begin
    B: begin
    ==3041== ---Thread-Announcement------------------------------------------
    ==3041== 
    ==3041== Thread #3 was created
    ==3041==    at 0x4153D28: clone (clone.S:111)
    ==3041== 
    ==3041== ---Thread-Announcement------------------------------------------
    ==3041== 
    ==3041== Thread #2 was created
    ==3041==    at 0x4153D28: clone (clone.S:111)
    ==3041== 
    ==3041== ----------------------------------------------------------------
    ==3041== 
    ==3041== Possible data race during read of size 4 at 0x804A02C by thread #3
    ==3041== Locks held: none
    ==3041==    at 0x8048514: deposit (in /home/khong/TEST/Valgrind/test2)
    ==3041==    by 0x402D95F: ?? (in /usr/lib/valgrind/vgpreload_helgrind-x86-linux.so)
    ==3041==    by 0x4050D4B: start_thread (pthread_create.c:308)
    ==3041==    by 0x4153D3D: clone (clone.S:130)
    ==3041== 
    ==3041== This conflicts with a previous write of size 4 by thread #2
    ==3041== Locks held: none
    ==3041==    at 0x804851C: deposit (in /home/khong/TEST/Valgrind/test2)
    ==3041==    by 0x402D95F: ?? (in /usr/lib/valgrind/vgpreload_helgrind-x86-linux.so)
    ==3041==    by 0x4050D4B: start_thread (pthread_create.c:308)
    ==3041==    by 0x4153D3D: clone (clone.S:130)
    ==3041== 
    ==3041== ----------------------------------------------------------------
    ==3041== 
    ==3041== Possible data race during write of size 4 at 0x804A02C by thread #3
    ==3041== Locks held: none
    ==3041==    at 0x804851C: deposit (in /home/khong/TEST/Valgrind/test2)
    ==3041==    by 0x402D95F: ?? (in /usr/lib/valgrind/vgpreload_helgrind-x86-linux.so)
    ==3041==    by 0x4050D4B: start_thread (pthread_create.c:308)
    ==3041==    by 0x4153D3D: clone (clone.S:130)
    ==3041== 
    ==3041== This conflicts with a previous write of size 4 by thread #2
    ==3041== Locks held: none
    ==3041==    at 0x804851C: deposit (in /home/khong/TEST/Valgrind/test2)
    ==3041==    by 0x402D95F: ?? (in /usr/lib/valgrind/vgpreload_helgrind-x86-linux.so)
    ==3041==    by 0x4050D4B: start_thread (pthread_create.c:308)
    ==3041==    by 0x4153D3D: clone (clone.S:130)
    ==3041== 
    A: done
    B: done
    main() A and B finished, balance = 2000000
    ==3041== 
    ==3041== For counts of detected and suppressed errors, rerun with: -v
    ==3041== Use --history-level=approx or =none to gain increased speed, at
    ==3041== the cost of reduced accuracy of conflicting-access information
    ==3041== ERROR SUMMARY: 22 errors from 2 contexts (suppressed: 68 from 19)
            


Profiler

There is a rule of thumb known as the Pareto principle, also referred to as the 80-20 rule: roughly 80% of the effects come from only 20% of the possible causes. Applied to performance, optimizing the right 20% of our code can realize about 80% of the possible gains in speed.

How can we know which 20% of our code to optimize? We need a profiler.

List of Profilers
  1. CodeAnalyst is a free performance analyzer from Advanced Micro Devices for programs on AMD hardware. It also does basic timer-based profiling on Intel processors.

  2. DTrace is a dynamic tracing tool for Solaris, FreeBSD, Mac OS X and other operating systems.

  3. Insure++ is Parasoft's runtime memory analysis and error detection tool. Its Inuse component provides a graphical view of memory allocations over time, with specific visibility into overall heap usage, block allocations, possible outstanding leaks, etc.

  4. Parallel Studio from Intel contains Parallel Amplifier, which tunes both serial and parallel programs. It also includes Parallel Inspector, which detects races, deadlocks and memory errors. Parallel Composer includes codecov, a command line coverage tool.

  5. Visual Studio Team System Profiler is Microsoft's commercial profiler offering.

  6. Developer Edition by Software Diagnostics is a commercial integrated recorder, profiler and debugger for dynamic analysis, integrating dynamic tracing functionalities enabling reverse debugging and full comprehension of system behavior as well as Performance Analysis functionalities over the full software life cycle.

  7. VTune from Intel for optimizing performance across Intel architectures.




Multithread Debugging

Coming...



Sponsor Open Source development activities and free contents for everyone.

Thank you.

- K Hong






Copyright © 2024, bogotobogo