Articles & Papers
- A closer look at GPUs. Fatahalian, K., & Houston, M. (2008) CACM - http://graphics.stanford.edu/~kayvonf/papers/fatahalianCACM.pdf
- AMD’s Cayman GPU Architecture - http://www.realworldtech.com/cayman/
- Benchmarking the cost of thread divergence in CUDA - https://arxiv.org/abs/1504.01650
- Broadcom VideoCore IV GPU
  - Life of a Triangle - https://latchup.blogspot.com/2016/02/life-of-triangle.html
  - VideoCore QPU Pipeline - https://latchup.blogspot.com/2016/03/videocore-qpu-pipeline.html
- Demystifying GPU Microarchitecture through Microbenchmarking - http://www.eecg.toronto.edu/~myrto/gpuarch-ispass2010.pdf - microbenchmark suite: http://www.stuffedcow.net/research/cudabmk
- GPU Concurrency: Weak Behaviours and Programming Assumptions
  Alglave, J.; Batty, M.; Donaldson, A. F.; Gopalakrishnan, G.; Ketema, J.; Poetzl, D.; Sorensen, T.; and Wickerson, J. In 20th ACM Int. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS '15), 2015. Invited for fast-track submission to ACM Transactions on Computer Systems (TOCS).
  - http://johnwickerson.github.io/papers/gpuconcurrency.pdf
  - http://multicore.doc.ic.ac.uk/gpu-litmus/
- GPU Performance Modeling and Optimization - Ang Li
  - https://pure.tue.nl/ws/files/39759895/20161018_Li.pdf
- GPUs and the Future of Parallel Computing
  Keckler et al., IEEE Micro 2011.
  - http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.232.1574&rep=rep1&type=pdf
  - https://www.computer.org/cms/Computer.org/ComputingNow/homepage/2011/1111/W_MI_GPUsandtheFutureofParallelComputing.pdf
- HAXWell - Joshua Barczak
  - Code which loads custom ISA on Intel Haswell GPUs - https://github.com/jbarczak/HAXWell
  - You Compiled This, Driver. Trust Me… - http://www.joshbarczak.com/blog/?p=1028
  - SPMD Is Not Intel’s Cup Of Tea - http://www.joshbarczak.com/blog/?p=1120
  - GPU Ray Tracing The Wrong Way - http://www.joshbarczak.com/blog/?p=1197
- Inside Fermi: Nvidia’s HPC Push - http://www.realworldtech.com/fermi/
- Intel Processor Graphics: Microarchitecture and ISA, Tutorial, MICRO 2016
  - Microarchitecture: https://software.intel.com/sites/default/files/managed/89/92/Intel-Graphics-Architecture-ISA-and-microarchitecture.pdf
  - ISA: https://software.intel.com/sites/default/files/managed/89/92/micro-2016-ISA-tutorial.pdf
- Low-Level GPU Documentation - http://renderingpipeline.com/graphics-literature/low-level-gpu-documentation/
- NVIDIA Tesla: A Unified Graphics and Computing Architecture
  Lindholm, E., Nickolls, J., Oberman, S., & Montrym, J. (2008). Micro, IEEE.
  - http://people.cs.umass.edu/~emery/classes/cmpsci691st/readings/Arch/gpu.pdf
- NVIDIA’s GT200: Inside a Parallel Processor - http://www.realworldtech.com/gt200/
- Patterson, Hennessy (2016): Computer Organization and Design: The Hardware/Software Interface ARM Edition - Appendix B Graphics and Computing GPUs - http://booksite.elsevier.com/9780128017333/content/Appendix%20B.pdf
- Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU)
  H. Kim, R. Vuduc, S. Baghsorkhi, J. Choi, W.-m. Hwu, 2012.
  - http://impact.crhc.illinois.edu/shared/papers/sara2012.pdf
  - http://impact.crhc.illinois.edu/paper_details.aspx?paper_id=203
- Predicting AMD and Nvidia GPU Performance - http://www.realworldtech.com/amd-nvidia-gpu-performance/
- Understanding Latency Hiding on GPUs
  Vasily Volkov; EECS Department; University of California, Berkeley; Technical Report No. UCB/EECS-2016-143; August 12, 2016
  - https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-143.html
- Understanding the GPU Microarchitecture to Achieve Bare-Metal Performance Tuning (PPoPP 2017)
  - http://dl.acm.org/citation.cfm?id=3018755
  - https://github.com/PAA-NCIC/PPoPP2017_artifact
CUDA
CUDA Books
CUDA Courses
- Intro to Parallel Programming
CUDA Documentation
Open Source Hardware GPU Projects
Software
- PerfTest: GPU texture/buffer performance tester
  A simple tool for testing the performance of GPU shader memory operations. The current implementation is based on DirectX 11.0.
  - https://github.com/sebbbi/perftest
- Pyramid Shader Analyzer
  Pyramid is a free, open GUI tool for offline shader validation and analysis. The UI takes HLSL or GLSL as input and runs it through various shader compilers and static analyzers.
  - https://github.com/jbarczak/Pyramid
Simulators
Talks
- GPU Architectures and New Programming Model Features
- Introduction to GPU Architecture and Programming Models
- Portable GPU Programming: Hands-on
Tags: assembly, native, reading, hardware
Last modified 17 September 2024