June 2011
Scalable Graphics Everywhere
Graphics is going into “anything that has a screen”
Handsets, navigation, set-top box and DTV, tablets, automotive
Video telephony, digital photo frames, cameras, printers, ...
Imposing a wide range of requirements on the GPUs
Deliver performance on an embedded systems budget
Enable easy application portability
Open standards for software API
Maintain developer ecosystem
Robust & efficient drivers for OS
2
From SoC to Final Products
Mali™-200 and Mali-55 Silicon
Publicly shipping silicon devices, end of 2010
3
Mali-200 Overview
World’s most licensed OpenGL® ES 2.0 core
4
Mali-300 Overview
The optimum combination of price and performance
Ideal configuration for a wide range of use cases
High-quality drivers
Single optimized driver for Mali-300 and Mali-400 MP
Transparent transition to Mali-400 MP
5
Mali-400 MP Overview
Scalable graphics performance
Up to 4 fragment processors
World’s 1st embedded multicore GPU
High-quality drivers
Single optimized driver for all configurations
Multicore scaling transparent to software developers
6
Mali-400 in 1080p at 60fps with 4xFSAA
Bump map shaders
Scenes with up to 1000 objects
Fillrate-intensive particle and explosion effects
Glow and transparency
Text with shadows and animated glow
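To put the headline figure in perspective, the sketch below works out the raw pixel and sample rates implied by 1080p at 60 fps with 4x FSAA. It assumes a 1920x1080 frame and exactly one shaded fragment per pixel (no overdraw), which real scenes will exceed; the function names are illustrative only.

```python
# Back-of-the-envelope throughput for a display mode.
# Assumption: every pixel is covered exactly once per frame (no overdraw).

def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Pixels that must be produced each second for the given mode."""
    return width * height * fps


def samples_per_second(width: int, height: int, fps: int, samples: int) -> int:
    """Coverage samples resolved each second with `samples` per pixel."""
    return pixels_per_second(width, height, fps) * samples


if __name__ == "__main__":
    print(pixels_per_second(1920, 1080, 60))      # 124416000
    print(samples_per_second(1920, 1080, 60, 4))  # 497664000
```

Roughly 124 million pixels and 498 million samples per second is therefore a lower bound; fillrate-intensive effects with transparency multiply it by the overdraw factor.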
7
Mali Leads in Performance
fonehome.co.uk
8
The Trend is Towards Visual Computing
9
Mali-T604 for Visual Computing
Innovative GPU architecture
Tri-pipe for performance and flexibility
GPGPU computing
Performance and scalability
Up to 5x Mali-400 MP performance
Scalable up to 4 cores
Power efficiency
Dynamic power management
Broad API support
OpenGL ES 2.0 and 1.1, OpenVG 1.1
OpenCL 1.1 Full Profile for GPGPU
DirectX®
OS support
Android™, Linux, Windows® Phone, ...
10
Mali-T604 High-level Architecture
Up to four shader cores
Reduces memory bandwidth
Distributes tasks between the cores
Memory Management Unit
Maintains cache coherency between cores
Configurable cache shared between cores
11
Anti-aliasing in Mali-T604
The Mali architecture supports anti-aliasing in hardware
4x full scene anti-aliasing (multi-sampled)
Performance is maintained at 90%+
Power-efficient anti-aliasing effectively comes 'for free' in the hardware
16x full scene anti-aliasing (multi-sampled & super-sampled)
There is an up to 4x increase in per-pixel rendering cost
This is a cost-effective path to very high image quality
Comparison images: No FSAA, 4x FSAA, 16x FSAA
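The two cost figures on this slide can be turned into frame rates with a small calculation. The sketch below assumes a hypothetical 60 fps baseline without FSAA (not stated in the slides), reads "90%+" as at least 90% of baseline performance, and treats "up to 4x" as a worst-case cost multiplier.

```python
# Frame-rate impact of the anti-aliasing modes quoted above.
# Assumption: BASE_FPS is a hypothetical no-FSAA baseline, chosen for illustration.

BASE_FPS = 60.0


def fps_with_retained_performance(base_fps: float, retained: float) -> float:
    """Frame rate when a fraction `retained` of performance is kept (0 < retained <= 1)."""
    return base_fps * retained


def fps_with_cost_multiplier(base_fps: float, cost: float) -> float:
    """Frame rate when per-pixel rendering cost is multiplied by `cost` (cost >= 1)."""
    return base_fps / cost


if __name__ == "__main__":
    # 4x FSAA: at least 90% of baseline performance is maintained.
    print(round(fps_with_retained_performance(BASE_FPS, 0.90), 1))
    # 16x FSAA: up to a 4x increase in per-pixel rendering cost (worst case).
    print(round(fps_with_cost_multiplier(BASE_FPS, 4.0), 1))
```

Under these assumptions 4x FSAA costs only a few frames per second, while 16x FSAA trades up to three quarters of the frame rate for image quality.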
12
Mali-T604 System Integration
ARM Cortex™ Processor
13
Towards Even Lower Memory Bandwidth
Mali-400 MP
Improved Texture Compression
Transaction Elimination
Hierarchical Tiling
14
Mali-T604 Power Management
Mali-T604 has multiple power management features
Clock gating
Multi-level gating, both architectural and inferred
Fully automatic – clocks are stopped when not needed
Power gating
Multiple power domains controlled through a unified interface
Power domains are controlled by the Job Manager
A flexible mechanism to implement system-wide power policies
Compatible with external DVFS control
Power domains: ALWAYS ON, DOMAIN SC0, DOMAIN SC1, DOMAIN SC2, DOMAIN SC3, DOMAIN CG0
15
Taking GPU Power Beyond Graphics
OpenCL enables portable heterogeneous multiprocessing
Addresses important industry trends and unlocks new use cases
Provides a standard framework for parallel computing
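As a rough illustration of the parallel computing model OpenCL standardizes, the sketch below emulates it in plain Python: a kernel function is invoked once per work-item, identified by a global ID. This is not the OpenCL API; `run_kernel` and `vector_add` are hypothetical names, and a real runtime would dispatch the work-items in parallel across CPU or GPU cores rather than in a loop.

```python
# Pure-Python emulation of a data-parallel kernel launch.
# Assumption: sequential loop stands in for parallel work-item execution.
from typing import Callable, List


def run_kernel(kernel: Callable[..., None], global_size: int, *args) -> None:
    """Invoke `kernel` once for each work-item ID in [0, global_size)."""
    for gid in range(global_size):
        kernel(gid, *args)


def vector_add(gid: int, a: List[float], b: List[float], out: List[float]) -> None:
    """Kernel body: each work-item computes exactly one output element."""
    out[gid] = a[gid] + b[gid]


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(a)
run_kernel(vector_add, len(a), a, b, out)
print(out)  # [11.0, 22.0, 33.0, 44.0]
```

Because each work-item touches only its own output element, the same kernel can be scheduled on whichever device is available, which is what makes the approach portable across heterogeneous processors.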
16
GPGPU Use Cases
Imaging
Image stabilization, face and smile recognition
Image editing, filters, landmark & context recognition
Superimposition of information, Augmented Reality
Multimedia
Post-processing, motion vectors, transcoding
Stream Data Processing
Mobile deep packet inspection
Antivirus, encryption
Compression, data analytics
UIs, Gaming and 3D Modelling
Voice recognition, gesture recognition, physics
AI, photorealistic ray tracing, modelling
17
GPGPU - Architecture
Compiler
Runtime
Objects
Plug-in architecture
Mali-T604
ARMv7-A, NEON
Custom Device
Video Decoder
18
Multicore Heterogeneous Computing
Cortex/NEON CPUs used together with Mali GPUs
Strength in diversity - each type of processor brings unique benefits
Program the system as a whole with a framework such as OpenCL
The heterogeneous platform must cooperate for best efficiency
e.g. cache coherency
Multicore CPU, Multicore GPU
Snoop Control Unit
Maintains L2 coherency across CPU cores and across GPU cores.
Power efficient.
CoreLink CCI-400
System on Chip
19
Comprehensive API and OS Support
21
Mali Ecosystem Benefits
22
Mali Developer Center
Mali Tools
Boards
www.malideveloper.com
23
A Summary of the Mali GPU Portfolio
Mali-200
Entry level, OpenVG and OpenGL ES 2.0
Leading anti-aliasing for superior image quality
World’s most licensed OpenGL ES 2.0 core
Mali-300
Ideal configuration for mid range use cases
Graphics acceleration with OpenVG and OpenGL ES
Efficient energy and bandwidth usage
Mali-400 MP
High performance OpenVG and OpenGL ES 2.0
World’s first multicore embedded GPU
Scalable pixel processing up to 1080p displays
Leading power efficiency and reduced bandwidth
Mali-T604
Tri-pipe for performance and flexibility
GPGPU computing with OpenCL 1.1
State of the art bandwidth reduction
DirectX® 11 and next generation Khronos graphics standards
*Mali GPUs noted as based on a published Khronos Specification are conformant, or expected to pass the Khronos Conformance Testing Process.
24
ARM GPUs – Now and in the Future
THANK YOU
25