

ARM Mali™ GPUs


Now and in the Future

June 2011
Scalable Graphics Everywhere
 Graphics is going into “anything that has a screen”
 Handsets, navigation, set-top box and DTV, tablets, automotive
 Video telephony, digital photo frames, cameras, printers, ...
 Imposing a wide range of requirements on the GPUs
 Deliver performance on an embedded systems budget
 Enable easy application portability
 Open standards for software API
 Maintain developer ecosystem
 Robust & efficient drivers for OS

2
From SoC to Final Products
Mali™-200 and Mali-55 Silicon
Publicly shipping silicon devices, end 2010

11+ SoCs across 8 partners

Mali-400 MP SoC – end 2010

3
Mali-200 Overview
 World’s most licensed OpenGL® ES 2.0 core

 OpenVG™ 1.1, OpenGL ES 2.0 / 1.1

 1st OpenGL ES 2.0 conformant GPU at 1080p

 Market-leading anti-aliasing

 Robust and performance-optimized drivers

4
Mali-300 Overview
 The optimum combination of price and performance
 Ideal configuration for a wide range of use cases

 Khronos compliant OpenVG 1.1, OpenGL ES 2.0 / 1.1

 Efficient energy and bandwidth usage
 Integrated 8KB L2 cache

 High-quality drivers
 Single optimized driver for Mali-300 and Mali-400 MP
 Transparent transition to Mali-400 MP
5
Mali-400 MP Overview
 Scalable graphics performance
 Up to 4 fragment processors
 World’s 1st embedded multicore GPU

 OpenVG 1.1, OpenGL ES 2.0 / 1.1

 Efficient energy and bandwidth usage


 Configurable L2 cache tuned for maximum throughput
 Leading bandwidth efficiency and latency tolerance

 High-quality drivers
 Single optimized driver for all configurations
 Multicore scaling transparent to software developers

6
Mali-400 in 1080p at 60fps with 4xFSAA
 Bump map shaders
 Scenes with up to 1000 objects
 Fillrate-intensive particle and explosion effects
 Glow and transparency
 Text with shadows and animated glow
 Stereoscopic 3D; everything is rendered twice per frame in 1080p
 Full 7-axis camera freedom; no assumptions can be made
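The headline figures on this slide imply a displayed-pixel throughput that is simple to work out. The sketch below is a rough calculation only: it assumes 1080p means 1920x1080 and that stereoscopic rendering doubles the work per frame, as the slide states. The function name is illustrative, and the result ignores overdraw and the extra samples needed for 4x FSAA.

```python
def displayed_pixels_per_second(width, height, fps, views=1):
    """Pixels that must be produced each second for a given display mode."""
    return width * height * fps * views

# Assumed resolution for 1080p: 1920 x 1080.
mono = displayed_pixels_per_second(1920, 1080, 60)
stereo = displayed_pixels_per_second(1920, 1080, 60, views=2)

print(mono)    # 124416000
print(stereo)  # 248832000
```

Stereoscopic output at 60fps therefore needs roughly 249 million displayed pixels per second before any anti-aliasing or overdraw is counted.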

7
Mali Leads in Performance

“Samsung Galaxy S2 smashes speed tests ahead of launch”
Techradar.com

“... astonishingly fast ... 3D games never looked so good”
fonehome.co.uk

“Put simply, this is the most powerful mobile handset we've yet tested”
engadget.com

[Chart label: Quadrant]

Samsung Galaxy S2: Dual Cortex-A9 + Quad Mali-400 MP

8
The Trend is Towards Visual Computing

 The need for Visual Computing is driven by


 Complex use-cases (stereo, augmented reality, image processing)
 Increased shader complexity – greater image realism, interaction
 Generation of procedural textures
 GPGPU and OpenCL answer these challenges by leveraging the combined compute resource of the platform

9
Mali-T604 for Visual Computing
 Innovative GPU architecture
 Tri-pipe for performance and flexibility
 GPGPU computing
 Performance and scalability
 Up to 5x Mali-400 MP performance
 Scalable up to 4 cores
 Power efficiency
 Dynamic power management
 Broad API support
 OpenGL ES 2.0 and 1.1, OpenVG 1.1
 OpenCL 1.1 Full Profile for GPGPU
 DirectX® 11
 OS support
 Android™, Linux, Windows® Phone, ...
10
Mali-T604 High-level Architecture

[Block diagram callouts]
 Up to four shader cores
 Reduces memory bandwidth
 Distributes tasks between the cores
 Memory Management Unit
 Maintains cache coherency between cores
 Configurable cache shared between cores

11
Anti-aliasing in Mali-T604
 The Mali architecture supports anti-aliasing in hardware
 4x full scene anti-aliasing (multi-sampled)
 Performance is maintained at 90%+
 Power-efficient anti-aliasing effectively comes 'for free' in the hardware
 16x full scene anti-aliasing (multi-sampled & super-sampled)
 Per-pixel rendering cost increases by up to 4x
 This is a cost-effective path to very high image quality
[Figure: comparison of No FSAA, 4x FSAA, and 16x FSAA]
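The two cost figures on this slide can be compared per sample. The sketch below is a back-of-the-envelope model, not a measurement: it assumes "performance maintained at 90%+" means throughput relative to rendering without FSAA, so the worst-case per-pixel cost for 4x FSAA is 1/0.9. The table layout and function name are illustrative.

```python
# Worst-case figures taken from the slide; the model itself is an assumption.
MODES = {
    "No FSAA": {"samples": 1, "max_cost": 1.0},
    "4x FSAA": {"samples": 4, "max_cost": 1 / 0.9},  # performance stays at 90%+
    "16x FSAA": {"samples": 16, "max_cost": 4.0},    # up to 4x per-pixel cost
}

def worst_case_cost_per_sample(mode):
    """Upper bound on rendering cost per sample, relative to no FSAA."""
    entry = MODES[mode]
    return entry["max_cost"] / entry["samples"]

for name, entry in MODES.items():
    print(f"{name}: <= {entry['max_cost']:.2f}x per pixel, "
          f"<= {worst_case_cost_per_sample(name):.3f}x per sample")
```

Under these assumptions, both anti-aliased modes cost far less per sample than rendering each sample as a full pixel, which matches the slide's claim that 16x is a cost-effective path to high image quality.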

12
Mali-T604 System Integration

[Figure: system integration diagram; label: ARM Cortex™ Processor]

13
Towards Even Lower Memory Bandwidth
[Chart: memory bandwidth relative to Mali-400 MP, comparing Mali-400 MP and Mali-T604]
 Improved Texture Compression
 Transaction Elimination
 Hierarchical Tiling
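The slide names transaction elimination without describing how it works. One way to picture the idea is to keep a signature for each tile and skip the memory write when the signature matches the tile's previous contents. The sketch below models that in software. The CRC, the tile representation, and every name in it are assumptions for illustration, not details taken from the slides.

```python
import zlib

class TileWriter:
    """Counts framebuffer tile writes, skipping tiles that did not change."""

    def __init__(self):
        self.signatures = {}  # tile index -> signature of last written content
        self.writes = 0
        self.skipped = 0

    def submit(self, tile_index, tile_bytes):
        """Return True if the tile was written, False if the write was skipped."""
        signature = zlib.crc32(tile_bytes)
        if self.signatures.get(tile_index) == signature:
            self.skipped += 1  # unchanged tile: no memory transaction
            return False
        self.signatures[tile_index] = signature
        self.writes += 1       # new or changed tile: write it out
        return True

writer = TileWriter()
frame1 = [b"sky", b"sky", b"ui"]
frame2 = [b"sky", b"bird", b"ui"]  # only the middle tile changes
for frame in (frame1, frame2):
    for index, tile in enumerate(frame):
        writer.submit(index, tile)

print(writer.writes, writer.skipped)  # 4 2
```

In this toy run the second frame writes one tile instead of three, which is the kind of saving that would matter for mostly static content such as user interfaces.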

14
Mali-T604 Power Management
 Mali-T604 has multiple power management features
 Clock gating
 Multi-level gating, both architectural and inferred
 Fully automatic – clocks are stopped when not needed
 Power gating
 Multiple power domains controlled through a unified interface
 Power domains are controlled by the Job Manager
 A flexible mechanism to implement system-wide power policies
 Compatible with external DVFS control
[Figure: power domains: ALWAYS ON, DOMAIN SC0, DOMAIN SC1, DOMAIN SC2, DOMAIN SC3, DOMAIN CG0]
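The slide says power domains are controlled by the Job Manager but does not give the policy. The sketch below shows one hypothetical policy: power one shader-core domain per pending job, up to the four domains named in the figure. The function, its parameters, and the policy are assumptions for illustration and do not describe the actual hardware behaviour.

```python
SHADER_DOMAINS = ["SC0", "SC1", "SC2", "SC3"]  # domain names from the slide

def domains_to_power(pending_jobs, max_cores=len(SHADER_DOMAINS)):
    """Hypothetical policy: one powered shader-core domain per pending job."""
    active = min(max(pending_jobs, 0), max_cores)
    # True means the domain is powered, False means it is power gated.
    return {name: index < active for index, name in enumerate(SHADER_DOMAINS)}

print(domains_to_power(0))  # all shader-core domains gated
print(domains_to_power(2))  # SC0 and SC1 powered
print(domains_to_power(9))  # demand exceeds cores, so all four are powered
```

A real system-wide policy would also weigh latency, DVFS state, and the cost of powering a domain up and down, none of which this sketch models.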

15
Taking GPU Power Beyond Graphics
 OpenCL enables portable heterogeneous multiprocessing
 Addresses important industry trends and unlocks new use cases
 Provides a standard framework for parallel computing

 Mali support for OpenCL takes GPU power beyond graphics


 Mali-T604: GPUs designed for general purpose computing
 Implementation of Full Profile ensures no limitations for developers
 Extendable framework supporting CPU (NEON™), GPGPU and more

 ARM CPU and Mali GPU ready for heterogeneous computing


 Integrated and interoperable software stack proposition
 A cohesive vision with ARM compute subsystem and CoreLink™ CCI
 IP designed to support all industry compute language paradigms

16
GPGPU Use Cases
 Imaging
 Image stabilization, face and smile recognition
 Image editing, filters, landmark & context recognition
 Superimposition of information, Augmented Reality
 Multimedia
 Post-processing, motion vectors, transcoding
 Stream Data Processing
 Mobile deep packet inspection
 Antivirus, encryption
 Compression, data analytics
 UIs, Gaming and 3D Modelling
 Voice recognition, gesture recognition, physics
 AI, photorealistic ray tracing, modelling

17
GPGPU - Architecture

 Compiler
 Runtime
 Objects
 Plug-in architecture
 Mali-T604
 ARMv7-A, NEON
 Custom Device
 Video Decoder

18
Multicore Heterogeneous Computing
 Cortex/NEON CPUs used together with Mali GPUs
 Strength in diversity - each type of processor brings unique benefits
 Program the system as a whole with a framework such as OpenCL
 The heterogeneous platform must cooperate for best efficiency, e.g. cache coherency
[Diagram labels: Multicore CPU, Multicore GPU, System on Chip, CoreLink CCI-400 Coherent AMBA Bus, Off-chip Memory]
 Snoop Control Unit: maintains L2 coherency across CPU cores and across GPU cores. Power efficient.
 CoreLink CCI-400: maintains coherency between CPU L2 and GPU L2 across the bus. Data remains within the SoC. Power efficient.
 Without cache coherency: data between CPU/GPU must be flushed to external memory. Consumes significant power.

19
Comprehensive API and OS Support

 ARM delivers a complete set of graphics APIs on Mali


 Open standards are at the heart of embedded graphics
 Well-established proprietary standards are equally important
 Acceleration for a wide range of middleware and third-party applications
 The Mali driver is common across a range of Mali products
 A unified driver stack for easy migration to future Mali GPUs
 Future-proof your investment in system and software integration
 All popular operating systems are supported
20
Mali Ecosystem in Action

21
Mali Ecosystem Benefits

22
Mali Developer Center
 Mali Tools
 Content & Demos
 Developer Support
 Boards

www.malideveloper.com

Supporting Mali developers with a full range of resources through one easy-access portal

23
A Summary of the Mali GPU Portfolio
 Mali-200
 Entry level, OpenVG and OpenGL ES 2.0
 Leading anti-aliasing for superior image quality
 World’s most licensed OpenGL ES 2.0 core
 Mali-300
 Ideal configuration for mid range use cases
 Graphics acceleration with OpenVG and OpenGL ES
 Efficient energy and bandwidth usage
 Mali-400 MP
 High performance OpenVG and OpenGL ES 2.0
 World’s first multicore embedded GPU
 Scalable pixel processing up to 1080p displays
 Leading power efficiency and reduced bandwidth
 Mali-T604
 Tri-pipe for performance and flexibility
 GPGPU computing with OpenCL 1.1
 State of the art bandwidth reduction
 DirectX® 11 and next generation Khronos graphics standards
*Mali GPUs noted as based on a published Khronos Specification are conformant, or expected to pass the Khronos Conformance Testing Process.

24
ARM GPUs – Now and in the Future

THANK YOU

25
