Instead of diving straight into the topic of Multi-threading, I am going to spend some time on its periphery, discussing the terms that constitute Multi-threading.
A Program is a set of instructions, saved on a computer’s hard disk or external storage.
A program becomes a Process when it is loaded into a computer’s memory. A process has resources allocated to it, such as registers and memory. If you open multiple instances of an application, each instance will be a separate process. A process has its own Virtual Memory and is self-contained: if one process becomes unresponsive, other processes are not impacted.
Time Slicing and Context Switching
Time Slicing is a technique by which CPU time is divided between running processes, giving users the illusion of a Multitasking or Multiprogramming system. The current Process is switched with the next process in line, picked in a Round Robin fashion. This switching requires saving the State of the Process in memory so that execution can be resumed where it left off; this is called Context Switching. Context Switching can also happen when a Process is busy with I/O-intensive tasks and does not need CPU cycles: instead of waiting until the end of the time slice, the process can be swapped out earlier.
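Round Robin time slicing can be pictured with a toy simulation (purely illustrative; the process names and slice counts are made up, and a real scheduler is far more sophisticated):

```csharp
using System;
using System.Collections.Generic;

var finished = new List<string>();

// Each "process" just needs some number of time slices to finish.
var queue = new Queue<(string Name, int SlicesLeft)>(new[]
{
    ("A", 3), ("B", 1), ("C", 2)
});

while (queue.Count > 0)
{
    var (name, left) = queue.Dequeue();   // context switch: pick the next process in line
    left--;                               // let it run for one time slice
    Console.WriteLine($"Ran {name}, slices left: {left}");
    if (left > 0)
        queue.Enqueue((name, left));      // not done yet: back to the end of the line
    else
    {
        Console.WriteLine($"{name} finished");
        finished.Add(name);
    }
}
```

Each pass through the loop stands in for one time slice: the shortest job (B) finishes first, even though A was started first.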
Context Switching would be very slow if the state of the Process were saved on the computer’s hard disk; reloading and saving would take so long that the user would not get a fluent Multitasking experience. So, to make Context Switching faster, the Process State is saved in the computer’s Main Memory, in a structure called the PCB.
PCB (Process Control Block)
All the information about a Process, such as its Process ID, Registers, Code, Data, Program Counter, and Stack Pointer, is packed into a data structure called the Process Control Block. The PCB is used in Context Switching and is saved in the computer’s Main Memory. Although this is relatively fast, it is still a lot of data to load and save, and that is how the concept of a Thread was introduced.
There are numerous definitions of a Thread from different sources; some of them are listed below.
- A thread is the smallest Unit of Execution.
- A thread is a sequence of instructions within a program that can be executed independently of other code.
- A thread is an execution context, which is all the information a CPU needs to execute a stream of instructions.
- A thread is a basic unit of CPU utilization; it comprises a thread ID, a program counter, a register set, and a stack.
- A thread is a kernel abstraction for scheduling work on the processor; a thread is what the kernel gives you to manage processor time and share work with others.
Here is my favorite definition …
A Thread is a light-weight Process.
So, a thread basically points to a Process and uses the Code and Data saved in the Process. A thread has its own Program Counter and Stack Pointer, which tell you exactly where it is in the execution path. During Context Switching, instead of switching the whole Process, light-weight threads can be switched. Processes can have multiple threads, each with its own copy of the Program Counter and Stack Pointer, while still sharing the Code and Data of the parent Process. Similar to the PCB, the data structure that saves thread information is called the Thread Control Block (TCB). The TCB is also saved in Main Memory.
Thread Safety/Synchronize resources
Threads belonging to the same process share resources such as process-level Data and Code. This can create problems, as two or more threads may try to change the Shared Data at the same time. The result can be a Race Condition, where it is not certain which thread gets precedence in accessing and modifying the Shared Data. The code that can cause this is called a Critical Section. Statements in a Critical Section are not atomic: even a single line of code, when compiled, is translated into multiple machine instructions. Depending on the sequence in which two or more threads execute the Critical Section, the result can vary. To avoid this, threads need to be Synchronized, ensuring that only one thread can be inside the Critical Section and no other thread can enter until the first one has left. Thread Synchronization can be achieved using Semaphores, Monitors, and Locks.
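To make this concrete, here is a minimal C# sketch (the names `counter` and `gate` are illustrative) in which two threads each increment a shared counter. The `counter++` statement compiles into a separate read, add, and write, so without the `lock` some updates could be lost; with the `lock`, only one thread at a time is inside the Critical Section:

```csharp
using System;
using System.Threading;

int counter = 0;
var gate = new object(); // lock object guarding the critical section

void Increment()
{
    for (int i = 0; i < 100_000; i++)
    {
        // counter++ is NOT atomic: it compiles to a read, an add, and a write.
        // The lock ensures only one thread executes this at a time.
        lock (gate) { counter++; }
    }
}

var t1 = new Thread(Increment);
var t2 = new Thread(Increment);
t1.Start(); t2.Start();
t1.Join(); t2.Join();
Console.WriteLine(counter); // always 200000 with the lock in place
```

If you remove the `lock` and rerun, the final count will usually come out below 200000, and it will differ from run to run; that nondeterminism is the race condition.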
What is Multi-threading, Concurrency and Parallelism?
Multi-threading is the ability of a central processing unit (CPU) or a single core in a multi-core processor to execute multiple threads.
A single processor can perform multi-threading by Time Slicing and Context Switching. The processor can switch execution resources between threads so that each thread gets some CPU time and it appears that they are executed simultaneously. This is known as Concurrency.
In a multi-processor environment, each thread in the process can run on a separate processor or core, resulting in parallel execution, known as Parallelism.
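One way to see the distinction on your own machine (a small sketch; the observed interleaving depends on your core count and varies from run to run):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");

// Parallel.For may spread iterations across cores (Parallelism).
// On a single-core machine the same code still runs correctly,
// just interleaved via time slicing (Concurrency).
int[] squares = new int[8];
Parallel.For(0, squares.Length, i =>
{
    squares[i] = i * i;
    Console.WriteLine($"i={i} ran on thread {Thread.CurrentThread.ManagedThreadId}");
});
Console.WriteLine(string.Join(", ", squares));
```

The printed thread IDs show which iterations ran on which worker threads; the final array is the same either way, since each iteration writes only its own slot.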
Multi-tasking vs Multi-processing vs Multi-threading
Multi-tasking is the ability to execute more than one task or process. Only ONE CPU is involved when we refer to Multi-tasking. This is achieved by Time Slicing.
Multi-processing is the same as Multi-tasking, but with more than one CPU or a multi-core processor involved.
Multi-threading is Multi-tasking but at the thread level. So if all the processes have only One Thread each then Multi-threading will act the same as Multi-tasking.
Benefits of Multi-threading
- Responsiveness
One thread can handle rapid responses (the UI thread) while other threads do CPU-intensive tasks.
- Maximize CPU Utilization
When a thread is working on I/O intensive tasks where CPU cycles are not needed, it can be switched with another thread to minimize CPU idle time.
- Resource Sharing
Threads within a process share common code, data, and other resources which allow for simultaneous processing in a single address space.
- Low cost
Creating, managing, and Context Switching threads is cheaper, in time and resources, than doing the same with Processes.
To take advantage of a multi-processor or multi-core system, a multi-threaded application can be split among the available processors to increase the application’s throughput.
Multi-threading in .Net using System.Threading
By default, a .NET program is started with a single thread, often called the primary thread. However, it can create additional threads to execute code in parallel or concurrently with the primary thread. These threads are often called worker threads.
In .NET, the Thread class is found in the System.Threading namespace and can be used to create and manage threads. Creating a new thread costs some time and resources, and the thread dies off when it finishes its work. Since users have the power to create threads, this can get out of hand quickly, with so many threads being created that the overhead outweighs any gain in throughput. Using the Thread class directly is not the recommended approach unless there is a special need for it. Here is a quick look at how a new thread can be created.
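A minimal sketch (the lambda body and the sleep duration are just placeholders for real work):

```csharp
using System;
using System.Threading;

// Create a worker thread; the lambda is the code it will run.
var worker = new Thread(() =>
{
    Console.WriteLine($"Worker thread {Thread.CurrentThread.ManagedThreadId} started");
    Thread.Sleep(500); // simulate some work
    Console.WriteLine("Worker finished");
});

worker.Start();   // the worker now runs concurrently with the primary thread
Console.WriteLine($"Primary thread {Thread.CurrentThread.ManagedThreadId} keeps going");
worker.Join();    // block here until the worker dies off
```

Note that the order in which the worker’s and primary thread’s messages appear can vary between runs; only `Join` guarantees the worker has finished.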
Starting with the .NET Framework 4, the recommended way to utilize multithreading is through the Task Parallel Library (TPL) and Parallel LINQ (PLINQ). Both TPL and PLINQ rely on ThreadPool threads.
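As a small preview (the sum-of-a-range workload is just an illustrative stand-in), here is the same idea expressed with a TPL Task and with PLINQ; in both cases the work runs on ThreadPool threads rather than threads you create and manage yourself:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// TPL: offload a computation to a ThreadPool thread via a Task.
Task<long> sumTask = Task.Run(() =>
{
    long s = 0;
    for (int i = 1; i <= 1_000_000; i++) s += i;
    return s;
});
long tplSum = sumTask.Result; // blocks until the Task completes
Console.WriteLine($"TPL sum:   {tplSum}");

// PLINQ: AsParallel() partitions the range across ThreadPool threads.
long plinqSum = Enumerable.Range(1, 1_000_000).AsParallel().Sum(i => (long)i);
Console.WriteLine($"PLINQ sum: {plinqSum}");
```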
In my next article, I will talk about Thread Pool, TPL and PLINQ.
Get Help with KTL Solutions
KTL Solutions can help with all of your software needs. From consulting to custom software development to managed IT services, KTL Solutions is the gold-certified Microsoft Partner that can help you with any of your business needs. We’ll help you apply the right technologies to improve your financial, customer service, and operational processes so that you can focus on your most important challenge: growth.
Ready to improve your software? Contact KTL today.