Using Parallel Threads: Why? When? Where?
George Serbanut
University of Turin / INFN Turin / ISS Bucharest
(serbanut@to.infn.it)
CSC – August 28th, 2007
Aim
Stage 1 (before)
Stage 2 (after)
Outline
● What?
● How?
● Why?
● When?
● Where?
Outline
● What? - Definitions
● Where? - Modern (single core) CPUs
● How? - Simple examples in C/C++/JAVA
● Why? - Benchmark for Intel dual core2 (CygWin)
● Where and when? - Limitations in working with parallel threads
What? - Definitions
● Processing element (see GRID technologies)
– A CPU, or a set of CPUs connected by the same motherboard (PC, laptop/notebook)
● Pipelines
– Execution lines in the processing element
● Dataflow (see Network QoS & performance)
– A stream of instructions
● Threaded dataflow (see TProof in ROOT)
– A stream of instructions “flowing” in a pipeline
Where? - Modern Processing Elements
[Figure: modern processing element, annotated "Pipelines here!" and "Level of serialization defined here"]
How? - Working with Threads
~ simple examples ~
C programming language (see the GNU POSIX threading (pthread) documentation)
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *print_message_function(void *ptr);

int main(void) {
    pthread_t thread1, thread2;
    char *message1 = "Hello from thread 1";
    char *message2 = "Hello from thread 2";
    int iret1, iret2;

    /* Start two threads; each runs print_message_function on its own message */
    iret1 = pthread_create(&thread1, NULL, print_message_function, (void *) message1);
    iret2 = pthread_create(&thread2, NULL, print_message_function, (void *) message2);

    /* Wait for both threads to finish before main continues */
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    /* iret1 and iret2 hold the pthread_create return codes (0 on success) */
    printf("Thread 1 returns: %d\n", iret1);
    printf("Thread 2 returns: %d\n", iret2);
    exit(0);
}

void *print_message_function(void *ptr) {
    char *message;
    message = (char *) ptr;
    printf("%s \n", message);
    return NULL;  /* a function declared void * must return a value */
}
Key steps in the C example:
● #include <pthread.h> (declares the POSIX thread API)
● pthread_t thread1, thread2; (one handle per thread)
● iret1 = pthread_create(...) (starts a thread; returns 0 on success)
● pthread_join( thread1, NULL); (waits for the thread to finish)
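The hello-world example only prints from each thread. As a minimal sketch of threads doing useful work, the block below splits an array into slices, sums each slice in its own thread, and adds the partial sums after joining. The names slice_t, sum_slice and parallel_sum are hypothetical helpers introduced for this illustration; only the pthread calls already shown in the example are used.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Work description handed to each thread (hypothetical helper type). */
typedef struct {
    const int *data;   /* first element of this thread's slice */
    size_t count;      /* number of elements in the slice */
    long sum;          /* partial result, written by the thread */
} slice_t;

/* Thread entry point: sums one slice and stores the result in it. */
static void *sum_slice(void *ptr) {
    slice_t *s = (slice_t *) ptr;
    s->sum = 0;
    for (size_t i = 0; i < s->count; i++)
        s->sum += s->data[i];
    return NULL;
}

/* Sums data[0..n-1] with nthreads threads.
   Returns 0 and stores the total in *result, or returns -1 on failure. */
int parallel_sum(const int *data, size_t n, int nthreads, long *result) {
    assert(nthreads > 0);
    pthread_t *threads = malloc((size_t) nthreads * sizeof *threads);
    slice_t *slices = malloc((size_t) nthreads * sizeof *slices);
    size_t chunk = n / (size_t) nthreads;
    size_t start = 0;
    int started = 0;
    long total = 0;

    if (threads == NULL || slices == NULL) {
        free(threads);
        free(slices);
        return -1;
    }
    for (int t = 0; t < nthreads; t++) {
        slices[t].data = data + start;
        /* the last thread also takes the leftover elements */
        slices[t].count = (t == nthreads - 1) ? n - start : chunk;
        start += slices[t].count;
        if (pthread_create(&threads[t], NULL, sum_slice, &slices[t]) != 0)
            break;
        started++;
    }
    /* Join every thread that was started, then combine the partial sums */
    for (int t = 0; t < started; t++) {
        pthread_join(threads[t], NULL);
        total += slices[t].sum;
    }
    free(threads);
    free(slices);
    if (started != nthreads)
        return -1;
    *result = total;
    return 0;
}
```

Each thread writes only to its own slice_t, so no locking is needed; the only shared step is the final addition, which runs after the joins.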
How? - Working with Threads
~ simple examples ~
C++ programming language
class definition:
class Thread {
public:
    Thread();
    int Start(void * arg);
protected:
    int Run(void * arg);
    static void * EntryPoint(void*);
    virtual void Setup();
    virtual void Execute(void*);
    void * Arg() const {return Arg_;}
    void Arg(void* a){Arg_ = a;}
private:
    THREADID ThreadId_;  // not a standard type: stands for the platform's thread handle
    void * Arg_;
};
class implementation:
Thread::Thread() : Arg_(0) {}
int Thread::Start(void * arg) {
    Arg(arg); // store user data
    // thread_create is not a standard call: it stands for the platform's thread-creation function
    int code = thread_create(Thread::EntryPoint, this, & ThreadId_);
    return code;
}
int Thread::Run(void * arg) {
    Setup();
    Execute( arg );
    return 0;  // Run is declared int, so it must return a value
}
/*static */
void * Thread::EntryPoint(void * pthis) {
    Thread * pt = (Thread*)pthis;
    pt->Run( pt->Arg() );  // call through the typed pointer, not the void* argument
    return 0;
}
// "virtual" appears only in the class definition, not on out-of-class definitions
void Thread::Setup() { /*Do any setup here*/ }
void Thread::Execute(void* arg) { /*Your code goes here*/ }
(See Ryan Teixeira's tutorial on the internet)
Key step in the C++ example:
● int code = thread_create(...) (Start() launches the thread through the static EntryPoint, which then calls Run())
How? - Working with Threads
~ simple examples ~
JAVA:
import java.lang.Thread;  // optional: java.lang is imported automatically
public class HelloThread extends Thread {
    // Constructor (no return type), so that new HelloThread(1) compiles
    public HelloThread(int i) {
        counter = i;
    }
    // Body executed in the new thread once start() is called
    public void run() {
        System.out.println("Hello from thread " + counter);
    }
    int counter;
    public static void main(String args[]) {
        (new HelloThread(1)).start();
        (new HelloThread(2)).start();
    }
}
see http://www.javaworld.com/javaworld/jw-04-1996/jw-04-threads.html or SUN website
Key steps in the JAVA example:
● import java.lang.Thread;
● ... extends Thread
● public void run()
● ... .start()
Note: newer JAVA code usually implements Runnable (run by a Thread) instead of extending Thread
Why? - Benchmark for Intel Dual Core2 (CygWin)
# threads   time (s)      # threads   time (s)
     1       19.796           21       10.593
     6       10.937           22       10.578
    11       10.406           23       10.203
    16       10.281           24       10.265
    21       10.203           25       10.265
    26       10.234           26       10.203
    31       10.265           27       10.203
    36       10.218           28       10.234
    41       10.234           29       10.250
    46       10.234           30       10.218
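In the table the run time drops from 19.796 s with one thread to roughly 10.2 to 10.9 s with six or more, about the factor of two that two cores can give. The slides do not show the benchmark program itself, so the following is only a sketch of how such a measurement could be set up: the workload (a plain arithmetic loop) and the names work_t, spin and time_threads are assumptions made for this illustration, not the code behind the numbers above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Per-thread share of the work (hypothetical workload, not the one behind the table). */
typedef struct {
    long iterations;   /* loop iterations assigned to this thread */
    double result;     /* stored so the loop has an observable effect */
} work_t;

/* Thread entry point: a plain CPU-bound arithmetic loop. */
static void *spin(void *ptr) {
    work_t *w = (work_t *) ptr;
    double x = 0.0;
    for (long i = 0; i < w->iterations; i++)
        x += (double) (i % 7) * 0.5;
    w->result = x;
    return NULL;
}

/* Splits total_iterations over nthreads threads.
   Returns the elapsed wall-clock time in seconds, or -1.0 on failure. */
double time_threads(int nthreads, long total_iterations) {
    assert(nthreads > 0);
    pthread_t *threads = malloc((size_t) nthreads * sizeof *threads);
    work_t *work = malloc((size_t) nthreads * sizeof *work);
    struct timespec t0, t1;
    int started = 0;

    if (threads == NULL || work == NULL) {
        free(threads);
        free(work);
        return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int t = 0; t < nthreads; t++) {
        /* the last thread also takes the remainder of the division */
        work[t].iterations = total_iterations / nthreads
            + (t == nthreads - 1 ? total_iterations % nthreads : 0);
        if (pthread_create(&threads[t], NULL, spin, &work[t]) != 0)
            break;
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(threads[t], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    free(threads);
    free(work);
    if (started != nthreads)
        return -1.0;
    return (double) (t1.tv_sec - t0.tv_sec)
         + (double) (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling time_threads(n, N) for increasing n with a fixed N gives one timing per thread count. The timed interval includes thread creation and joining, so for small N the start-up cost dominates; absolute values depend on the machine and compiler flags.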
When and where? - Working with Threads (limitations)
● ANSI C and C++ (as of 2007) define no standard threading API; threads come from libraries such as POSIX threads
● Each core of a processing element based on the Intel 32-bit architecture (Intel, AMD) supports up to 16 pipelines
● Pipelines in processing units are FIFO
● Parallel threads do not work with members defined as static/constant
● Initialization of threads takes CPU cycles
– Therefore threads are not worth using in simple applications
● Under MS Windows OS:
– if used with a GUI and no priority is defined, the active window takes the highest priority
– the parallelization of the threads is done by pseudo-serialization (so the performance gain is weak)
● On LINUX OS:
– starting with kernel 2.6, the threading system was rewritten from scratch, although POSIX threads kept the same function names
● Parallel threads are defined on a single processing element
– (for more advanced parallel computing, see TProof, cluster and GRID technologies)
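The limitation on static members is stated without detail. One possible reading, assumed here, is that static data is shared by every thread of the process, so unsynchronized updates from parallel threads can be lost. Under that assumption, a minimal sketch of the usual remedy is to guard the shared variable with a POSIX mutex. The names counter, counter_lock, add_many and count_with_threads are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Static data is shared by every thread in the process. */
static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread entry point: adds 1 to the shared counter n times,
   holding the mutex around each update. */
static void *add_many(void *ptr) {
    long n = *(long *) ptr;
    for (long i = 0; i < n; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs nthreads threads, each incrementing the shared counter
   `increments` times. Returns the final counter, or -1 on failure. */
long count_with_threads(int nthreads, long increments) {
    assert(nthreads > 0);
    pthread_t *threads = malloc((size_t) nthreads * sizeof *threads);
    int started = 0;
    long final_value;

    if (threads == NULL)
        return -1;
    counter = 0;
    for (int t = 0; t < nthreads; t++) {
        /* increments is only read by the threads and outlives them */
        if (pthread_create(&threads[t], NULL, add_many, &increments) != 0)
            break;
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(threads[t], NULL);
    final_value = counter;
    free(threads);
    return started == nthreads ? final_value : -1;
}
```

With the lock in place the result is always nthreads * increments. Locking on every increment is slow and is done here only to keep the sketch short; accumulating in a local variable and adding once per thread would cut the contention.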
THANK YOU!
MIT Processing Element
~ static dataflow processors ~
Manchester Processing Element
~ dynamic dataflow processors ~
Monsoon Processing Element
~ explicit data store or controlled dataflow processors ~
Processing Element
~ memory addressing and operations in controlled dataflow processors ~
Processing Element
~ hybrids ~
Processing Element
~ threads in hybrids ~
See http://csd.ijs.si/courses/processor/index.html for more information