Administrative information
Administrative course information is available here
We use the inf-2202-f16@list.uit.no mailing list to send important information.
We have the following rooms and hours:
- Tuesdays 14:15-16:00, A016: lab exercises.
- Thursdays 14:15-16:00, B203 (Small auditorium): lectures.
- Thursdays 14:15-16:00, p-lab: lab exercises.
- Fridays 10:15-11:00, NYLYSTH AUD: lectures.
Staff
- Lars Ailo Bongo larsab@cs.uit.no, Office: A259
- Tim Alexander Teige tte008@post.uit.no, Office: ODS-Lab
GitHub
Lecture plan
Lecture | Date | Subject | Lecturer |
---|---|---|---|
L1 | Fri 19.08 | Introduction | Lars Ailo |
L2 | Fri 26.08 | Threads and synchronization primitives | Lars Ailo |
L3 | Thu 01.09 | Guest lecture: Go Language | Giacomo Tartari |
L4 | Fri 02.09 | Parallel architectures | Lars Ailo |
L5 | Fri 09.09 | Parallel programs | Lars Ailo |
L6 | Fri 16.09 | Programming for performance | Lars Ailo |
L7 | Fri 23.09 | Parallel program performance evaluation | Lars Ailo |
L8 | Fri 30.09 | Performance evaluation | Lars Ailo |
L9 | Fri 07.10 | Cloud computing (no slides) | Lars Ailo |
L10 | Thu 13.10 | Guest lecture: Scala and Spark | Inge Alexander Raknes |
- | Fri 21.10 | Postponed | - |
L11 | Fri 28.10 | Data-intensive computing | Lars Ailo |
L12 | Thu 03.11 | Spark libraries | Lars Ailo |
L13 | Fri 04.11 | Guest lecture: The new Stallo Supercomputer | Steinar Trædal-Henden |
L14 | Thu 10.11 | Summary lecture | Lars Ailo |
- | Thu 10.11 | Course evaluation | Jan Fuglesteg and Kai-Even Nilssen |
- | Thu 24.11 | Exam | - |
Readings
All lecture notes are mandatory.
In addition, unless otherwise noted, the following are also mandatory readings:
- Introduction
- None
- Threads and synchronization primitives (operating systems course recap):
- Modern Operating Systems, 3rd ed. Andrew S. Tanenbaum. Prentice Hall. 2007. Chapters: 2.2, 2.3, 2.5, 10.3, 11.4
- Alternative to MOS: another operating systems textbook: the chapters about threading, IPC mechanisms, and classical IPC problems.
- Go
- Rob Pike. SPLASH keynote talk
- A tour of Go
- How to write Go code
- Effective Go
- Go concurrency patterns (video, slides)
- Advanced Go concurrency patterns (video, slides)
- Parallel architectures
- Computer Organization and Design: The Hardware/Software Interface, 5th ed. David A. Patterson, John L. Hennessy. Morgan Kaufmann. 2011. Chapter 6: “Parallel Processors from Client to Cloud”.
- Parallel programs
- None
- Programming for performance
- None
- Parallel program performance evaluation
- Performance evaluation
- None
- Cloud computing
- Scala and Spark
- Data-intensive computing
- “Jim Gray on eScience”, and chapters 1 and 2 from The Fourth Paradigm: Data-Intensive Scientific Discovery. Edited by Tony Hey, Stewart Tansley, and Kristin Tolle. 2010.
- Optional: Google File System paper
- Optional: MapReduce paper
- Optional: BigTable paper
- Optional: Exascale Computing and Big Data
- Spark libraries
- Optional: lecture notes, papers and videos in the slide comments
- Stallo supercomputer
- None
- Summary
- None
The following are suggested additional readings:
- The Go Programming Language. Alan Donovan and Brian Kernighan. 2015.
- Learning Spark. Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia. O’Reilly. 2015.
- Parallel Computer Architecture: A Hardware/Software Approach. David Culler, J.P. Singh, Anoop Gupta. Morgan Kaufmann. 1998.
- This book has a great introduction to parallel programming.
- There is one copy in the library. Please be considerate of your fellow students and do not borrow that copy for an extended period.
- The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling. R. K. Jain. Wiley. 1991.
- A very good book about performance analysis.
- There is one copy in the library. Please be considerate of your fellow students and do not borrow that copy for an extended period.
- Computer Architecture: A Quantitative Approach, 5th ed. John L. Hennessy, David A. Patterson. Morgan Kaufmann. 2011.
- This book has a thorough description of different parallel architectures.
- You can purchase this book from your favourite bookstore.
- The Fourth Paradigm: Data-Intensive Scientific Discovery. Edited by Tony Hey, Stewart Tansley, and Kristin Tolle. 2010.
- This collection of essays describes many of the opportunities and challenges for data-intensive computing in different scientific fields.
- The book is freely available as an ebook.
- Advanced Analytics with Spark. Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills. O’Reilly. 2015.
Mandatory assignments
Project | Start | Due | Subject | Lecturer |
---|---|---|---|---|
P1 | 23.08 | 19.09 | Concurrent B+tree | Tim |
P2 | 20.09 | 10.10 | Deduplication system | Tim |
P3 | 11.10 | 07.11 | PageRank on AWS | Tim |
Exercises
- Introduction
- None
- Threads and synchronization primitives
- Compare the overhead of forking a process vs. creating a Pthread
- Compare the overhead of forking a process vs. creating a Python thread
- Implement a solution to each of the following classical IPC problems using Pthreads/Python threads and semaphores/condition variables. Note that you also need to create a use case, test data, and useful output:
- Producer/consumer
- Reader/writer
- Sleeping barber
- Dining philosophers
- Modify the code from exercise 2.3 to use message passing.
- Go
- Take a tour of Go
- Implement the classical IPC problems from exercise 2.3 in Go.
- Parallel architectures
- None
- Parallel programs
- Implement a simplified BLAST search program in Go that does similarity search on two lists of random DNA sequences.
- Implement a heat distribution (SOR) program using Pthreads and/or Go.
- Programming for performance
- Implement a tuple space in Python with semantics similar to Linda. Use your tuple space to implement a parallel version of Mandelbrot that uses dynamic assignment (pool of tasks).
- Parallel program performance evaluation
- Performance evaluation
- None
- Cloud computing
- Create an account at AWS and calculate the approximate cost for analyzing 1TB and 1PB of data.
- Scala and Spark
- Run the provided WordCount in assignment 3 on AWS
- Implement grep in Scala and run it on AWS
- Data-intensive computing
- Implement word count in MapReduce and run it on AWS.
- Implement grep in MapReduce and run it on AWS.
- Spark libraries
- Refactor your assignment 3 code to use GraphX
- Stallo
- None
- Summary lecture
- Exam from 2015
- Exam from 2013
- Sample exam (from 2013)