Java Concurrency and Performance Workshop

Training Content 28th, 29th & 30th June 2013

MOVAA Technologies


(The most important reason most applications (enterprise and real-time) fail or underperform is improper design for concurrency. This is especially true now that multi-core and multi-processor machines are ubiquitous. This training course takes a holistic view of concurrency, including designing for multi-core and multi-processor systems (including NUMA).)

Pre-requisites: Basic knowledge of Java (introductory course or equivalent practical experience).

Target Audience: Programmers who want to learn the foundations of concurrent programming and the existing concurrent programming environments in order to develop, now or in the future, multithreaded applications for multi-core processors and shared-memory multiprocessors.

Objective: Understand concurrency control issues in general. Know the instruments available in Java. Avoid common errors and pitfalls. Understand concurrency control idioms.

What you will learn:
Designing applications for multi-core/multi-processor environments through detailed concurrency patterns and design principles.
Using highly concurrent data structures and understanding their design well enough to apply it in software design.
Dealing with threads and collections on multi-core systems.
The JDK 5, 6, and 7 features and classes that harness the power of the underlying technologies.
Detecting deadlocks and livelocks in existing applications, and following concurrency patterns to avoid them.
Using hardware-based locking.
Quickly identifying the root causes of poor performance in your applications.
Eliminating conditions that prevent you from finding performance bottlenecks.

Brief Table of Contents

Producer Consumer (Basic Hand-Off) (Day 1)
Common Issues with Threads
Java Memory Model (JMM)
Applied Threading techniques
Building Blocks for Highly Concurrent Design
Highly Concurrent Data Structures-Part1 (Day 2)
Designing For Concurrency
Canned Synchronizers
Highly Concurrent Data Structures-Part2 (Day 3)
Crash course in Modern hardware
Concurrent Reasoning
Concurrency Patterns
Designing for multi-core/processor environment
Introduction to Big Data and Hadoop

Detailed Table of Contents

Producer Consumer(Basic Hand-Off) (Part-1)

Why wait-notify requires synchronization

Lock handling done by the OS
Hidden queue
Structural modification to the hidden queue by wait-notify
Use cases for notify-notifyAll
notifyAll used as a workaround for design issues with synchronization
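As an illustration of the basic hand-off topic, the following is a minimal sketch of a single-slot producer-consumer exchange built on wait and notifyAll. The class names HandOffSlot and HandOffDemo are hypothetical and not part of the course material. The sketch shows why the calls sit inside synchronized methods: the condition check and the wait must happen atomically under the same monitor, and wait is called in a loop so that a woken thread re-checks its condition.

```java
// Hypothetical single-slot hand-off; names are illustrative only.
final class HandOffSlot<T> {
    private T item; // guarded by "this"; null means the slot is empty

    // Producer side: blocks while the previous item has not been taken.
    public synchronized void put(T value) throws InterruptedException {
        if (value == null) {
            throw new NullPointerException("null marks an empty slot");
        }
        while (item != null) {
            wait(); // releases the monitor while waiting, reacquires before returning
        }
        item = value;
        notifyAll(); // producers and consumers share one wait set, so wake them all
    }

    // Consumer side: blocks while the slot is empty.
    public synchronized T take() throws InterruptedException {
        while (item == null) {
            wait();
        }
        T value = item;
        item = null;
        notifyAll();
        return value;
    }
}

final class HandOffDemo {
    // Passes 1..n through the slot and returns the sum seen by the consumer.
    static int transfer(final int n) throws InterruptedException {
        final HandOffSlot<Integer> slot = new HandOffSlot<Integer>();
        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 1; i <= n; i++) {
                        slot.put(i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt status
                }
            }
        });
        producer.start();
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += slot.take();
        }
        producer.join();
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(transfer(100));
    }
}
```

notifyAll is used here because producers and consumers wait on the same monitor; with a single notify, a wake-up could reach a thread whose condition is still false.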

Common Issues with Threads

Problem with stop
Dealing with InterruptedStatus
Uncaught Exception Handler

Java Memory Model(JMM)

Sequential Consistency would disallow common optimizations
Instruction Reordering

Heavily pipelined processors
Super-scalar processors
NUMA (Non-Uniform Memory Access)

Cache Coherency

Real meaning and effect of synchronization
Volatile
Final
The changes in JMM

Applied Threading techniques

Thread Local Storage
Safe Construction techniques

Unsafe Construction techniques
Thread safety levels

Building Blocks for Highly Concurrent Design

CAS
Hardware-based locking

Optimistic Design
ABA problem

Markable reference
Stamped reference
weakCompareAndSet
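To make the ABA problem and the stamped-reference remedy concrete, here is a small single-threaded sketch using the JDK classes AtomicReference and AtomicStampedReference. The class name AbaDemo and its method names are illustrative assumptions, and the interleaving of a second thread is simulated sequentially to keep the example deterministic.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

// Hypothetical demo class; the "other thread" is simulated inline.
final class AbaDemo {
    // A plain CAS compares references only, so an A -> B -> A change goes unnoticed.
    static boolean plainCasMissesAba() {
        String a = "A";
        String b = "B";
        AtomicReference<String> ref = new AtomicReference<String>(a);
        String observed = ref.get();   // "thread 1" reads A
        ref.compareAndSet(a, b);       // "thread 2" changes A -> B
        ref.compareAndSet(b, a);       // "thread 2" changes B -> A
        // Succeeds even though the value changed twice in between.
        return ref.compareAndSet(observed, "C");
    }

    // A stamp (version number) incremented on every update exposes the change.
    static boolean stampedCasDetectsAba() {
        String a = "A";
        String b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<String>(a, 0);
        int[] stampHolder = new int[1];
        String observed = ref.get(stampHolder); // reads A together with stamp 0
        int observedStamp = stampHolder[0];
        ref.compareAndSet(a, b, 0, 1);          // A -> B, stamp 0 -> 1
        ref.compareAndSet(b, a, 1, 2);          // B -> A, stamp 1 -> 2
        // Same reference, but the stamp is now 2, so this CAS fails.
        boolean swapped = ref.compareAndSet(observed, "C", observedStamp, observedStamp + 1);
        return !swapped;
    }

    public static void main(String[] args) {
        System.out.println(plainCasMissesAba());
        System.out.println(stampedCasDetectsAba());
    }
}
```

AtomicMarkableReference works the same way but carries a single boolean mark instead of an integer stamp.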

Wait-free Stack implementation
Wait-free Queue implementation
Design issues with synchronization
Multiple user conditions and wait queues
Lock Polling techniques
Reentrant Lock

Lock Implementation

ReentrantReadWriteLock
ReentrantLock

Based on CAS
Lock Striping on table
Lock Striping on LinkNodes
Segregating them based on thread safety levels

Lock Striping

Identifying scalability bottlenecks in java.util.Collection

Highly Concurrent Data Structures-Part1


Structure
Almost immutability
Using volatile to detect interference
Read does not block in common code path
remove/put/resize lock

Weakly Consistent Iterators vs Fail Fast Iterators
LockFreeHashMap

For systems with more than 100 CPUs/cores
Constant-time key-value mapping
No locks even during resize
All CAS spin loops bounded
Faster than ConcurrentHashMap
State-based reasoning

Designing For Concurrency

Confinement
Immutability
Almost Immutability
Atomicity
Visibility
Restructuring and refactoring


Canned Synchronizers

Synchronous Queue Framework

Future
Semaphore
Mutex
Barrier
Latches
SynchronousQueue
Exchanger
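As one example of a canned synchronizer from the list above, the following sketch uses java.util.concurrent.CountDownLatch in the common "start gate / completion gate" arrangement. The class name LatchDemo and the method runWorkers are hypothetical, chosen only for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class showing two latches coordinating worker threads.
final class LatchDemo {
    // Starts the given number of workers, releases them together,
    // waits for all of them, and returns how many completed their work.
    static int runWorkers(final int workers) throws InterruptedException {
        final CountDownLatch startGate = new CountDownLatch(1);
        final CountDownLatch done = new CountDownLatch(workers);
        final AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < workers; i++) {
            new Thread(new Runnable() {
                public void run() {
                    try {
                        startGate.await();           // all workers block here
                        completed.incrementAndGet(); // stand-in for real work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();            // always signal completion
                    }
                }
            }).start();
        }

        startGate.countDown(); // release every worker at once
        done.await();          // block until each worker has counted down
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runWorkers(8));
    }
}
```

A latch is single-use: once its count reaches zero it stays open, which is the main difference from a reusable barrier such as CyclicBarrier.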

Highly Concurrent Data Structures-Part2

CopyOnWriteArray(List/Set)
Queue interfaces

Queue
BlockingQueue
Deque
BlockingDeque

Queue Implementations

ConcurrentLinkedQueue
LinkedBlockingQueue and LinkedBlockingDeque
ArrayBlockingQueue
ArrayDeque and ArrayBlockingDeque
Work stealing using Deques
LinkedTransferQueue



Sequential Skiplist
Lock-based concurrent Skiplist
Lock-free concurrent Skiplist
Concurrent Skiplist

Executor Framework

Configuration
Hardware shapes programming idiom
Exposing fine-grained parallelism
Divide and conquer
Fork and Join
Anatomy of Fork and Join
Work Stealing
Fork-join decomposition
ParallelArray
Limitations

Fork and Join Framework
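The divide-and-conquer decomposition covered under Fork and Join can be sketched with the JDK 7 classes ForkJoinPool and RecursiveTask. The class name SumTask, the helper parallelSum, and the threshold value are illustrative assumptions and are not tuned for any particular hardware.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical task that sums a slice of an array by recursive splitting.
final class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1000; // illustrative sequential cut-off
    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0; // small enough: solve sequentially
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                       // make the left half available for stealing
        long rightSum = right.compute();   // work on the right half in this thread
        return left.join() + rightSum;     // combine both halves
    }

    static long parallelSum(long[] data) {
        ForkJoinPool pool = new ForkJoinPool(); // defaults to one worker per available processor
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }
}
```

Forking one half and computing the other in the current thread keeps the worker busy and leaves the forked half on its deque, where idle workers can steal it.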

Crash course in Modern hardware

Amdahl's Law
Cache

Direct mapped
Address mapping in cache
Read
Write
Cache controller

Memory Architectures


Concurrent Reasoning

Sequential Consistency
Linearizability
Quiescent Consistency
Compositionality

Concurrency Patterns
Fine-grained Synchronization
Optimistic Synchronization
Lazy Synchronization
Lock-free Synchronization

Designing for multi-core/processor environment

Harsh Realities of Parallelism
Parallel Programming
Concurrent Objects

Concurrency and Correctness
Quiescent Consistency
Sequential Consistency
Linearizability
Progress Conditions
Locks suitable for NUMA systems
Coarse-Grained Synchronization
Fine-Grained Synchronization
Optimistic Synchronization
Lazy Synchronization
Non-Blocking Synchronization
Bounded Partial Queue
Unbounded Total Queue
Unbounded Lock-free Queue



Concurrent Queues

Concurrent Stack
Concurrent Hashing
Closed Address Hashing
Open Address Hashing
Lock-Free Hashing
Sequential Skiplist
Lock-based Concurrent Skiplist
Lock-free Skiplist
Array-based Bounded Priority Queue
Tree-based Bounded Priority Queue
Heap-based Unbounded Priority Queue
Skiplist-based Unbounded Priority Queue


Priority Queues


