CSC/ECE 506: Architecture of Parallel Computers

Spring 2023

course schedule: lectures

Lecture Schedule

Week 1[ 01/08 - 01/14 ]

[Mon 01/09] Lecture 1: Overview of parallel computation
In-class exercises

Where are you sitting today?
Number of transistors on a chip
Multicore/manycore processor info
Top 500 observation

[Wed 01/11] Lecture 2: Three parallel-programming models
In-class exercises

Advantages and disadvantages of SMP organization
Overheads of message-passing
Shared-memory vs. message-passing programming
Reflection

Week 2[ 01/15 - 01/21 ]

[Wed 01/18] Lecture 3: GPU architecture
In-class exercises

Best definition of speedup
Amdahl's law example
Upload your answers to practice questions

Week 3[ 01/22 - 01/28 ]

[Mon 01/23] Lecture 4: Caches
In-class exercises

Direct-mapped cache: field sizes
Fully associative cache: field sizes
Set-associative cache: field sizes
Write policy in two-level caches
Reflection

[Wed 01/25] Lecture 5: Physical and logical cache organization
In-class exercises

Steps in cache access
Parallelism in cache access
Alternatives for cache indexing and tagging
Multilevel cache design
Characteristics of inclusion properties
Reflection

Week 4[ 01/29 - 02/04 ]

[Mon 01/30] Lecture 6: The cache-coherence problem
In-class exercises

Shared vs. distributed memory
Cache-coherence questions
Software lock using a flag
What's gone wrong in this race situation?
Reflection

[Wed 02/01] Lecture 7: Coherence and consistency
In-class exercises

How does write-through guarantee coherence?
How many processors on a write-through bus?
What happens when a block is ejected?
Invalidation vs. update protocols
Ordering of operations in two threads
Why might A not print as 1?
Reflection

Week 5[ 02/05 - 02/11 ]

[Mon 02/06] Lecture 8: Shared-memory parallel programming
In-class exercises

Please make a photocopy of textbook pages for me
The three levels of parallelism
Dependences example
Dependences in truncated 4-point iteration example
LDG for Loop Nest 2
Second dependences example
Reflection

[Wed 02/08] Lecture 9: Dependences, DOACROSS, DOPIPE
In-class exercises

Dependences in function-parallelism example
Dependences in DOPIPE-parallelism example
Variable scopes - Example 1
Exercise 2: for i tasks
Reflection

Week 6[ 02/12 - 02/18 ]

[Mon 02/13] Test 1 - 7:00-9:00 PM, EB II 1231

[Wed 02/15] Lecture 10: Variable scope
In-class exercises

Why is each variable privatizable?
Example 1: Which variables should be declared as shared/private?
Example 2: Which variables should be declared as shared/private?
Scopes in matrix multiplication - for k ||ization
Scopes in matrix multiplication - for i ||ization
Reflection

Week 7[ 02/19 - 02/25 ]

[Mon 02/20] Lecture 11: Parallelizing the Ocean application
In-class exercises

Questions about the serial solver
Order of updating points
Concurrency along antidiagonals
Bad ways of exploiting parallelism in Ocean application
Red/black ordering
Does it matter that execution is no longer deterministic?

[Wed 02/22] Lecture 12: Parallelization in three models
In-class exercises

Advantages and disadvantages of assignment options
Block assignment and communication
Block partitioning
Synchronization in the shared-memory program
Barrier synchronization in shared-memory version
Questions about the message-passing program
Typos in message-passing if statements
Reflection

Week 8[ 02/26 - 03/04 ]

[Mon 02/27] Lecture 13: Data-parallel algorithms
Online videos

13a. Control parallelism vs. data parallelism [4:53]
13b. Building blocks for data parallelism [13:16]
13c. Pointer doubling [10:14]
13d. Multiplying matrices [5:00]
13e. Labeling regions in an image [8:01]

[Wed 03/01] Lecture 14: Parallelizing linked data structures
In-class exercises

Parallelizing operations on linked data structures
Conflict between an insertion and a deletion
Fine-grain locking approach
Questions about insertion with fine-grain locks
Reflection

Week 9[ 03/05 - 03/11 ]

[Mon 03/06] Lecture 15: Invalidation and update protocols
Online videos

15a. The MSI protocol [14:20]
15b. The MESI protocol [10:35]
15c. The Dragon protocol [10:37]
15d. The Firefly protocol [6:52]

[Wed 03/08] Lecture 16: Multicore caches: organization & performance
In-class exercises

Hits and misses in set-associative cache
Hits and misses in direct-mapped cache
Coherence misses
Cache changes to reduce miss rate
Effects of increasing line size
Context-switch misses
Logical cache organization
Partitioned shared cache organization

Week 10[ 03/19 - 03/25 ]

[Mon 03/20] Lecture 17: Hardware support for locking
In-class exercises

Performance of test-and-set
TSL vs. TTSL
LL/SC vs. TTSL
Ticket locks vs. array-based queueing locks
Reflection  

[Wed 03/22] Lecture 18: Barrier implementations
In-class exercises

Ticket lock with MSI
Scalability at the barrier
Performance of combining-tree barrier
Reflection

Week 11[ 03/26 - 04/01 ]

[Mon 03/27] Test 2 - 7:00-9:00 PM, EB II 1231

[Wed 03/29] Lecture 19: Memory consistency
In-class exercises

Permission form for study on dual-submission homework
Interest in independent study/thesis topics
Example: Why is a memory consistency model needed?
Sequentially consistent vs. non-seq. consistent outcomes
Which outcomes are possible under SC?
Prefetching early and late
Reflection

Week 12[ 04/02 - 04/08 ]

[Mon 04/03] Lecture 20: Relaxed memory-consistency models
In-class exercises

Need for relaxed consistency models
Causal-consistency example
Strongest consistency model
How can both processes be killed?
Weak ordering

[Wed 04/05] Lecture 21: Caching in DSM machines
In-class exercises

Why doesn't a bus-based design scale?
Why aren't invalidations too slow?
Page placement without interleaving
Directory messages for read and write misses
Merging the directory with the LLC tag array
Reflection

Week 13[ 04/09 - 04/15 ]

[Mon 04/10] Lecture 22: Coherence in DSM machines
In-class exercises

Pseudocode for full bit-vector approach
Block states in main memory
Optimizing a full bit-vector scheme
Reflection

[Wed 04/12] Lecture 23: The Silicon Graphics S2MP architecture
Online videos

23a. Today's MP architectures [7:20]
23b. Directory-based coherence [8:41]
23c. Scaling the SMP model [7:05]
23d. SGI's Origin [5:55]
23e. Design issues [9:36]
23f. Directory organization [5:42]
23g. Coherence protocol and summary [10:33]

Week 14[ 04/16 - 04/22 ]

[Mon 04/17] Lecture 24: DSM implementation correctness & performance
In-class exercises

An invalidation to a node that no longer has a block
Transition from state U on a read request
Transition from state S on a readX request
Home-centric vs. requester-assisted approach
Reflection

[Wed 04/19] Lecture 25: Caching in multicore architectures
In-class exercises

ReadX in state S or U with non-atomic message
ReadX to EM block with non-atomic message
What's wrong with imprecise directory info?
Increased power consumption and latency
Other problems with stale directory info
Accelerating thread migration
Reflection

Week 15[ 04/23 - 04/29 ]

[Mon 04/24] Lecture 26: Review
In-class exercises

Three orchestrations of Ocean
Coherence and consistency
Physical and logical cache organization
Four "C"s of cache misses
Summing a vector with copy-scan
Miscellaneous questions
Kahoot questions

Week 16[ 04/30 - 05/06 ]

[Wed 05/03] Final Exam - 3:30-6:00 PM, EB II 1230 and EB III 2236
©2007-2022 NC State University