Articles
"Apple M1 Assembly Hello World":
"Both MacOS and Linux are based on Unix and are more similar than different. However there are a few differences of note:
- MacOS uses LLVM by default whereas Linux uses GNU GCC. This really just affects the command line arguments in the makefile for the purposes of this article. You can use LLVM on Linux and GCC should be available for Apple M1 shortly.
- The MacOS linker/loader doesn’t like doing relocations, so you need to use the ADR rather than LDR instruction to load addresses. You could use ADR in Linux and if you do this it will work in both.
- The Unix API calls are nearly the same, the difference is that Linux redid the function numbers when they went to 64-bit, but MacOS kept the function numbers the same. In the 32-bit world they were the same, but now they are all different.
- When calling a MacOS service the function number goes in X16 rather than X8.
- Linux installs the various libraries and include files under /usr/lib and /usr/include, so they are easy to find and use. When you install XCode, it installs SDKs for MacOS, iOS, iPadOS, watchOS, etc., with the option of installing lots of versions. The paths to the libs and includes are rather complicated and you need a tool to find them.
- In MacOS the program must start on a 4-byte (32-bit) boundary, hence the listing has an “.align 2” directive near the top (the argument is a power of two, so 2 means 4 bytes).
- In MacOS you need to link in the System library even if you don’t make a system call from it or you get a linker error. This sample Hello World program uses software interrupts to make the system calls rather than the API in the System library and so shouldn’t need to link to it.
- In MacOS the default entry point is _main whereas in Linux it is _start. This is changed via a command line argument to the linker.
Below is the simple Assembly Language program to print out “Hello World” in a terminal window.
//
// Assembler program to print "Hello World!"
// to stdout.
//
// X0-X2 - parameters to MacOS function services
// X16 - MacOS function number
//
.global _start // Provide program starting address to linker
.align 2
// Setup the parameters to print hello world
// and then call MacOS to do it.
_start: mov X0, #1 // 1 = StdOut
        adr X1, helloworld // string to print
        mov X2, #13 // length of our string
        mov X16, #4 // MacOS write system call
        svc 0 // Call MacOS to output the string
// Setup the parameters to exit the program
// and then call MacOS to do it.
        mov X0, #0 // Use 0 return code
        mov X16, #1 // Service command code 1 terminates this program
        svc 0 // Call MacOS to terminate the program
helloworld: .ascii "Hello World!\n"
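The differences listed earlier come down to a handful of values in the source. As a rough sketch only, the Python below generates this listing for either system from one template. The table layout and the `hello_source` name are illustrative and not from the article. The MacOS values (X16, write = 4, exit = 1) are taken from the listing, while the Linux values (X8, write = 64, exit = 93) are assumed to be the usual AArch64 Linux numbers.

```python
# Sketch: emit AArch64 "Hello World" source for MacOS or Linux.
# Assumption: Linux AArch64 uses X8 with write=64 and exit=93; the MacOS
# values are copied from the listing above. Names here are hypothetical.

MESSAGE = "Hello World!\n"

CONVENTIONS = {
    "macos": {"number_reg": "X16", "write": 4, "exit": 1},
    "linux": {"number_reg": "X8", "write": 64, "exit": 93},
}


def hello_source(target: str) -> str:
    """Return assembly text that prints MESSAGE and exits with status 0."""
    conv = CONVENTIONS[target]
    reg = conv["number_reg"]
    length = len(MESSAGE.encode("ascii"))   # byte count passed in X2
    escaped = MESSAGE.replace("\n", "\\n")  # keep the newline as an escape
    lines = [
        ".global _start",
        ".align 2",
        "_start: mov X0, #1",               # 1 = stdout
        "        adr X1, helloworld",       # ADR works on both systems
        f"        mov X2, #{length}",
        f"        mov {reg}, #{conv['write']}",
        "        svc 0",
        "        mov X0, #0",               # return code 0
        f"        mov {reg}, #{conv['exit']}",
        "        svc 0",
        f'helloworld: .ascii "{escaped}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(hello_source("macos"))
```

Computing the length from the string rather than hard-coding 13 avoids the usual mistake of editing the message and forgetting to update X2.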
Makefile:
HelloWorld: HelloWorld.o
	ld -macosx_version_min 11.0.0 -o HelloWorld HelloWorld.o -lSystem -syslibroot `xcrun -sdk macosx --show-sdk-path` -e _start -arch arm64
HelloWorld.o: HelloWorld.s
	as -o HelloWorld.o HelloWorld.s
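The same two build steps can be driven from a script, which makes it easier to see where the SDK path from xcrun ends up in the linker arguments. This is a minimal sketch, not part of the article: the function names are hypothetical, the flags are copied from the Makefile above, and `sdk_path` only works on a Mac with the XCode command line tools installed.

```python
# Sketch: build the assemble and link commands from the Makefile above.
# Function names are illustrative; the flags are copied from the Makefile.
import shlex
import subprocess


def sdk_path() -> str:
    """Ask xcrun for the MacOS SDK path (MacOS with XCode tools only)."""
    result = subprocess.run(
        ["xcrun", "-sdk", "macosx", "--show-sdk-path"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_commands(name: str, sdk: str) -> list:
    """Return the assemble and link argument lists, in build order."""
    assemble = ["as", "-o", f"{name}.o", f"{name}.s"]
    link = [
        "ld", "-macosx_version_min", "11.0.0",
        "-o", name, f"{name}.o",
        "-lSystem", "-syslibroot", sdk,
        "-e", "_start", "-arch", "arm64",
    ]
    return [assemble, link]


if __name__ == "__main__":
    # Placeholder SDK path so the sketch also prints on machines without
    # xcrun; on a Mac, pass sdk_path() instead.
    for cmd in build_commands("HelloWorld", "/path/to/MacOSX.sdk"):
        print(shlex.join(cmd))
```

On a Mac, each list returned by `build_commands("HelloWorld", sdk_path())` could be passed to `subprocess.run(cmd, check=True)` to perform the build.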
An introduction to assembly on Apple Silicon Macs.: "In this repository, I will code along with the book Programming with 64-Bit ARM Assembly Language, adjusting all sample code for Apple's ARM64 line of computers"
Code in Assembly for Apple Silicon with the AsmAttic.app
Tutorials, Courses
AArch64
Readings
Readings: Binary Analysis
See also: Software: Binary Analysis
- A Retargetable Static Binary Translator for the ARM Architecture
- Balancing Performance and Productivity for the Development of Dynamic Binary Instrumentation Tools: A Case Study on Arm Systems
- Exploiting SIMD Asymmetry in ARM-to-x86 Dynamic Binary Translation
- Exploiting Vector Processing in Dynamic Binary Translation
- Optimising Dynamic Binary Modification across 64-bit Arm Microarchitectures
- RevARM: A Platform-Agnostic ARM Binary Rewriter for Security Applications
- Translating AArch64 Floating-Point Instruction Set to the x86-64 Platform
Concurrency
- Formalising the ARMv8 Memory Consistency Model
- Mixed-size Concurrency: ARM, POWER, C/C++11, and SC
- Modelling the ARM v8 Architecture, Operationally: Concurrency and ISA
- No Barrier in the Road: A Comprehensive Study and Optimization of ARM Barriers
- Relaxed-Memory Concurrency - Power and ARM
- RMEM
- Simplifying ARM Concurrency: Multicopy-atomic Axiomatic and Operational Models for ARMv8
- The ARMv8 Application Level Memory Model
- The Semantics of Power and ARM Multiprocessor Machine Code
- The Semantics of Power and ARM Multiprocessor Programs
Formalization, Specification, Verification
- A Trustworthy Monadic Formalization of the ARMv7 Instruction Set Architecture
- Alastair Reid's
- ARMv8-A system semantics: instruction fetch in relaxed architectures
- ESOP 2020: European Symposium on Programming
- Ben Simner, Shaked Flur, Christopher Pulte, Alasdair Armstrong, Jean Pichon-Pharabod, Luc Maranget, Peter Sewell
- https://www.cl.cam.ac.uk/~pes20/iflat/
- ASL Interpreter
- End-to-End Verification of ARM Processors with ISA-Formal
- Formal Semantics Extraction from Natural Language Specifications for ARM
- FM 2019: 23rd International Symposium on Formal Methods
- Anh V. Vu and Mizuhito Ogawa
- https://anhvvcs.github.io/pubs/corana.pdf
- Corana: Dynamic Symbolic Execution Engine for ARM Cortex-M
- Formal Semantics Extraction from Natural Language Specifications for ARM
- hs-arm: (Dis)assembler and analyzer generated from the machine-readable ARMv8.3-A specification
- https://github.com/nspin/hs-arm
- library for (dis)assembling and analyzing ARMv8.3-A code, part of which is generated from the MRAS.
- implementation of ARM ASL (architecture specification language)
- ISA Semantics for ARMv8-A, RISC-V, and CHERI-MIPS
- POPL 2019
- Alasdair Armstrong, Thomas Bauereiss, Brian Campbell, Alastair Reid, Kathryn E. Gray, Robert M. Norton, Prashanth Mundkur, Mark Wassell, Jon French, Christopher Pulte, Shaked Flur, Ian Stark, Neel Krishnaswami, Peter Sewell
- https://alastairreid.github.io/papers/POPL_19/
- L3: A Specification Language for Instruction Set Architectures
- Low-level program verification under cached address translation
- 2019 PhD Dissertation; Hira Taqdees Syeda
- "In this thesis, we present a formal model of the memory management unit (MMU) in the interactive proof assistant Isabelle/HOL for the ARMv7-A architecture which includes the TLB, its maintenance operations, and its derived properties. We integrate this specification into the Cambridge ARM model. We derive sufficient conditions for TLB consistency, and we abstract away the functional details of the MMU using data refinement for simpler reasoning about executions in the presence of cached address translation, including complete and partial walks."
- https://www.unsworks.unsw.edu.au/permalink/f/a5fmj0/unsworks_60079
- http://unsworks.unsw.edu.au/fapi/datastream/unsworks:60079/SOURCE02?view=true
- sail-arm: Sail version of the ARMv8.5-A ISA definition
- Scapula: Compare ARM CPUs Against ARM's Machine Parsable Architecture Reference Manual
- tools for performing testing and verification of ARM CPUs against a machine parsable version of the ARMv8-A Architecture Reference Manual
- https://github.com/ainfosec/scapula
- Scapula: An Open-Source Toolkit For Model-Based Fuzzing and Verification of ARM CPUs
- Trustworthy Specifications of ARM v8-A and v8-M System Level Architecture
- Weak Persistency Semantics from the Ground Up: Formalising the Persistency Semantics of ARMv8 and Transactional Models
- Who Guards the Guards? Formal Validation of the Arm v8-M Architecture Specification
Instruction Set Architecture
Shellcode
- Alphanumeric ARM Shellcode
- Alphanumeric RISC ARM Shellcode
- Alphanumeric Shellcode Generator for ARM Architecture
- ARM Assembly and Shellcode Basics
- ARM Shellcode - Azeria Labs
- ARM shellcode and exploit development
- ARMv8 Shellcodes from 'A' to 'Z'
- Exploring New Depths of Threat Hunting ...or How to Write ARM Shellcode in Six Minutes
- Filter-resistant Code Injection on ARM
- Make ARM Shellcode Great Again
- Shellcode: Encryption Algorithms in ARM Assembly
A-profile
- BFloat16 extensions for Armv8-A
M-profile
- Code-Generation for the Arm M-profile Vector Extension
- Helium
- Making Helium
Performance
- An Instruction Level Energy Characterization of ARM Processors
- CoreSight, Perf and the OpenCSD Library
- Linaro Wiki - perf
- On-Target Trace Using the CoreSight Access Library
- OpenCSD HOWTO - using the library with perf
- Statistical Profiling Extension for ARMv8-A
Performance: Numerics
- ARM Floating Point 2019: Latency, Area, Power
- LLVM and the Automatic Vectorization of Loops Invoking Math Routines:
-fsimdmath
Security
- ARM Lab Environment - https://www.vulnhub.com/series/arm-lab,145/
- ARM Return Oriented Programming (ROP) - Billy Ellis
- Cache Speculation Side-channels
- Damn Vulnerable ARM Router (DVAR)
- Exploitation on ARM-based Systems
- Micro-Architectural Power Simulator for Leakage Assessment of Cryptographic Software on ARM Cortex-M3 Processors
- NORAX: Enabling Execute-Only Memory for COTS Binaries on AArch64
- RevARM: A Platform-Agnostic ARM Binary Rewriter for Security Applications
- Safe and Efficient Implementation of a Security System on ARM using Intra-level Privilege Separation
- Smashing the ARM Stack: ARM Exploitation Part 1
- TCP Bind Shell in Assembly (ARM 32-bit)
- Understanding the Security of ARM Debugging Features
Memory Tagging Extension (MTE)
- ARM Memory Tagging Extension and How It Improves C/C++ Memory Safety
- Hardware-assisted AddressSanitizer (HWASAN)
- Memory Tagging and how it improves C/C++ memory safety
- Memory Tagging, how it improves C++ memory safety, and what does it mean for compiler optimizations
- scudo: Add initial memory tagging support
- Security analysis of memory tagging
Pointer Authentication
- arm64e: An ABI for Pointer Authentication
- Examining Pointer Authentication on the iPhone XS
- LLVM-based Implementations
- [llvm-dev] [RFC] Pointer authentication for arm64e
- [LLVM and Clang] Upstream arm64e and Pointer Authentication support #14
- Add arm64e and pointer authentication support for Swift #30112
- PAC it up: Towards Pointer Integrity using ARM Pointer Authentication
- Pointer Authentication on ARMv8.3: Design and Analysis of the New Software Security Instructions
- Raising the Bar: New Hardware Primitives for Exploit Mitigations
TrustZone
- A Deep Dive Into Samsung's TrustZone
- Attacking the ARM's TrustZone
- Azeria Labs
- TrustZone Research
- Trusted Execution Environments and Arm TrustZone
- Trustonic’s Kinibi TEE Implementation
- Breaking Samsung's ARM TrustZone
- Cachegrab: a tool designed to help perform and visualize trace-driven cache attacks against software in the secure world of TrustZone-enabled ARMv8 cores
- Demystifying Arm TrustZone: A Comprehensive Survey
- Hardware-assisted Transparent Tracing and Debugging on ARM
- Introduction to Trusted Execution Environment: ARM's TrustZone
- Introduction to Trusted Execution Environment and ARM's TrustZone
- Ninja: Towards Transparent Tracing and Debugging on ARM
- PARTEMU: Enabling Dynamic Analysis of Real-World TrustZone Software Using Emulation
- SoK: Understanding the Prevailing Security Vulnerabilities in TrustZone-assisted TEE Systems
- Verification of a Practical Hardware Security Architecture Through Static Information Flow Analysis (ARM TrustZone)
- vTZ: Virtualizing ARM TrustZone
Simulation
- Simulation of ARM and x86 microprocessors using in-order and out-of-order CPU models with Gem5 simulator
- Simulation of 64-bit ARM Systems: Implementation, Validation and Design Space Exploration
Virtualization
- ARM Virtualization: Performance and Architectural Implications
- Hiding in the Shadows: Empowering ARM for Stealthy Virtual Machine Introspection
- Hypervisor Necromancy; Reanimating Kernel Protectors, or On emulating hypervisors; a Samsung RKP case study
- The Design, Implementation, and Evaluation of Software and Architectural Support for ARM Virtualization
References
Intrinsics & SIMD
NEON
Scalable Vector Extension (SVE)
- Arm SVE Tools Training
- Asvie: A Timing-Agnostic SVE Optimization Methodology
- Methodology for ArmIE SVE
- Asvie: A Timing-Agnostic SVE Optimization Methodology
- Porting and Optimizing HPC Applications for Arm SVE Documentation
- Mastering the Arm HPC ecosystem
- Scalable Vector Extension (SVE)
- Scalable Vector Extension support for AArch64 Linux
- The ARM Scalable Vector Extension
- IEEE Micro, March 2017
- Nigel Stephens, Stuart Biles, Matthias Boettcher, Jacob Eapen, Mbou Eyole, Giacomo Gabrielli, Matt Horsnell, Grigorios Magklis, Alejandro Martinez, Nathanael Premillieu, Alastair Reid, Alejandro Rico, Paul Walker
- Preprint: https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf
- http://dx.doi.org/10.1109/MM.2017.35
- Using Arm’s scalable vector extension on stencil codes
- The Journal of Supercomputing (2019)
- Armejach, Adrià, Helena Caminal, Juan M. Cebrian, Rubén Langarita, Rekai González-Alberquilla, Chris Adeniyi-Jones, Mateo Valero, Marc Casas, Miquel Moretó
- https://doi.org/10.1007/s11227-019-02842-5
SVE: LLVM Implementation
- Road to SVE enablement in LLDB
- Scalable Vectorization for LLVM
- SVE/SVE2 Patches on Phabricator
Toolchains
Software
- Arm HPC Users Group - resources for end-users and developers deploying on Arm hardware.
- AMaCC (Another Mini ARM C Compiler) - Small C Compiler generating ELF executable for Arm architecture
- AZM - Live ARM Assembler and Syntax Checker
- mra_tools: Tools to process ARM's Machine Readable Architecture Specification
- VIXL: AArch64 Runtime Code Generation Library
- https://github.com/ARM-software/asl-interpreter: Example implementation of Arm's Architecture Specification Language (ASL)
Software: Binary Analysis
See also: Readings: Binary Analysis
- MAMBO: A Low-Overhead Dynamic Binary Modification Tool for ARM
- https://github.com/beehive-lab/mambo
- Low Overhead Dynamic Binary Translation on ARM
- Optimising Dynamic Binary Modification across ARM microarchitectures
- Optimising Dynamic Binary Modification Across ARM Microarchitectures
- Dynamic Binary Instrumentation and Modification with MAMBO
- mbed-os-linker-report: d3.js based ELF Linker Statistics
Software: Debugging, Tracing
- Coresight Access Library
- OpenCSD - Open CoreSight Decoder library
- ptm2human: ARM PTM (and ETMv4) trace to human-readable format
- ARM PTM decoder, and ARM ETM v4 decoder. ptm2human is a decoder for trace data outputted by Program Trace Macrocell (PTM) and Embedded Trace Macrocell (ETMv4).
- https://github.com/hwangcc23/ptm2human
- Statically compiled ARM binaries for debugging and runtime analysis
- Troll
Software: Emulation, Simulation
Software: Lifting
Disassemblers, Decompilers, Recompilers
- Dynarmic: A dynamic recompiler for the ARMv6K architecture
- IDA script for highlighting and decoding ARM system instructions
- REIL: A C++ translation/emulation library for the AArch64 instruction set to REIL
- retools: a reverse engineering toolkit for normies
- Collection of tools (disassembler, emulator, binary parser) aimed at reverse engineering tasks, more specifically, bug finding related. Currently we target ARMv7 and Mach-O though in the future more architectures and formats are planned.
- retools is somewhat unique in that most of the semantics for relevant instructions are parsed out of the specification PDFs as opposed to being generated by hand. Currently the disassembler, emulator, and binary parsers are partially done, with a symbolic execution engine and instrumentation/hooking framework to come as I get more time.
- https://github.com/agustingianni/retools
- Spedi: a speculative disassembler for the variable-size Thumb ISA
Software: Performance
See also: Performance Tools
- Arm HPC tools and libraries
- Arm Optimized Routines
- Compute Library
- LIKWID: Performance monitoring and benchmarking suite
- Microbenchmarks for Cortex A53
- Ne10 Open Source Library
- Ne10 is a library of common, useful functions that have been heavily optimised for ARM-based CPUs equipped with NEON SIMD capabilities. It provides consistent, well-tested behaviour, allowing for painless integration into a wide variety of applications. The library currently focuses primarily around math, signal processing, image processing, and physics functions.
- http://projectne10.github.io/Ne10/
- https://github.com/projectNe10/Ne10
- Streamline Performance Analyzer
- User-mode access to ARMv7 PMU cycle counters
- User-mode access to ARMv8 PMU cycle counters
- Using Perf and its friend eBPF on Arm platform
Software: Virtualization
Talks
2019
- Arm/AArch64 BoF
- The Definitive Guide to Make Software Fail on ARM64
2018
- Arm/AArch64 BoF
- Arm Architecture Enhancements in 2018
- ARMaHYDAN - Misadventures of ARM instruction encodings
- Introduction To Return Oriented Exploitation On ARM64
- The Path to Fast Data on Arm - Brian Brooks
- FD.io Mini-Summit: KubeCon + CloudNativeCon EU 2018
- FD.io Mini Summit: Open Networking Summit North America 2018
- Using perf On Arm platforms
2017
2016
- ARM Research
- ARMv8-A Next Generation Vector Architecture for HPC
- Hardware Assisted Rootkits and Instrumentation ARM Edition
- Hardware Assisted Tracing on ARM with CoreSight and OpenCSD
- Embedded Linux Conference 2016
- Linaro Connect Las Vegas 2016 (LAS16)
- Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA
2015
- perf status on ARM and ARM64
2014
- Cycle Accurate Profiling With Perf
- Square Pegs in Round holes, or System Level Performance Data and perf
2012
- Advanced Software Exploitation on ARM Microprocessors
- The AArch64 backend: status and plans
2011
2010
- Exploitation on ARM - Technique and Bypassing Defense Mechanisms
History
- ARM inventor: Sophie Wilson
- ARM microarchitect: Steve Furber
- ARM Processor Evolution
- Clever solutions find inconvenient truths: A history of the ARM Architecture, and the lessons learned while building it
- The Future of Microprocessors, Sophie Wilson
Last modified 07 October 2024