Can I use Neural Engine
to run my neural networks
on A11 devices?
Koan-Sin Tan

freedom@computer.org

Hsinchu Coding Serfs Meeting, Nov 2018
https://www.anandtech.com/show/13392/the-iphone-xs-xs-max-review-unveiling-the-silicon-secrets/5
• AnandTech is one of my favorite tech sites. Usually, they provide
good technical analysis

• E.g., Apple’s CPUs

• cache sizes

• execution units

• various instruction latencies

• Not good enough for NN accelerators on mobile phones

• floating-point VGG16, Inception V3, and ResNet34?

• come on, are you still in the Neolithic era?
ANE on A12, how about A11?
Why I said VGG16 is
Neolithic Era
• Lightweight models are there

• MobileNet V1 could have roughly
the same top-1 accuracy even
with quantized uint8

• MobileNet V2 could have better
top-1 accuracy

• MnasNet could be better than
MobileNet V1

• Classification, object detection,
segmentation, etc.

• 8-bit quantization is good enough
for many cases
https://github.com/tensorflow/models/raw/master/research/slim/nets/mobilenet/madds_top1_accuracy.png
How to use Neural Engine
• According to Apple:

• A11: 600 G ops per second, A12: 5 T ops per second

• Yes, by default, it's enabled on A12 devices. If you have pre-iOS 12 apps built on top of Core ML, they
should be able to use it automatically. But not on A11 devices.

• How to verify it?

• MLModelConfiguration [1]: property

@property(readwrite) MLComputeUnits computeUnits;

• There is usesCPUOnly for VNRequest in iOS 11, but nothing like MLComputeUnits

• See my example [2]

[1] https://developer.apple.com/documentation/coreml/mlmodelconfiguration?language=objc

[2] https://github.com/freedomtan/coremlbenchmark/
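To put Apple's two quoted figures side by side, here is a quick back-of-the-envelope calculation. It is plain arithmetic on the numbers from this slide; nothing here is measured, and the helper name `speedup` is made up for illustration.

```python
# Peak throughput figures quoted by Apple, as listed on the slide above.
A11_OPS_PER_SEC = 600e9   # 600 G ops per second
A12_OPS_PER_SEC = 5e12    # 5 T ops per second


def speedup(new_ops: float, old_ops: float) -> float:
    """Return how many times larger new_ops is than old_ops."""
    return new_ops / old_ops


ratio = speedup(A12_OPS_PER_SEC, A11_OPS_PER_SEC)
print(f"A12 quoted peak is about {ratio:.1f}x the A11 quoted peak")
```

These are vendor peak numbers, so the ratio says little about what a given model will actually achieve on either chip.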
Why not VNRequest?
• Since I mentioned VNRequest in Vision.framework, why not VNCoreMLRequest?

• Yes, I wrote simple VNCoreMLRequest-based apps before, in both Swift and Objective-C
[1][2].

• Simplified interface; it crops and scales the image for you.

• Yes, image operations take time.

• This actually reminds us of an important system software issue.

• Modern cellphone SoCs use DVFS and all kinds of energy-saving techniques
extensively. How can we get good performance?

• Inference with camera on is usually faster than with camera off!!!
[1] https://github.com/freedomtan/SimpleInceptionV3/

[2] https://github.com/freedomtan/SimpleInceptionV3-ObjC
Neural Engine in Action
• H11ANEServicesThread

• A12 is for iPhone11,x

• No H10ANEServicesThread

• So, who started H11ANEServicesThread? There is nothing named H11 in
/System/Library/Frameworks/CoreML.framework/CoreML

• It seems it’s in
/System/Library/PrivateFrameworks/ANEServices.framework/ANEServices
• A12 devices only
iPhone Xs Max
default 17:17:14.002705 +0800 kernel IOReturn H11ANEIn::ANE_ProcessDestroy_gated(H11ANEProcessDestroyArgs *, bool, uint32_t *) :
H11ANEIn::ANE_ProgramDestroy_gated WARN: Freeing intermediate buffer inside ProcessDestroy
default 17:17:14.004821 +0800 kernel IOReturn H11ANEIn::ANE_ProcessDestroy_gated(H11ANEProcessDestroyArgs *, bool, uint32_t *) :
H11ANE:ANE_ProcessDestroy_gated Removed client aned from programHandle=0x8a03aa2e112. Num clients for program=0
default 17:17:14.004938 +0800 kernel IOReturn H11ANEIn::ANE_ProcessDestroy_gated(H11ANEProcessDestroyArgs *, bool, uint32_t *) :
H11ANEIn::ANE_ProgramDestroy_gated WARN: Freeing intermediate buffer inside ProcessDestroy
default 17:17:14.011142 +0800 kernel IOReturn H11ANEIn::ANE_ProcessDestroy_gated(H11ANEProcessDestroyArgs *, bool, uint32_t *) :
H11ANE:ANE_ProcessDestroy_gated Removed client aned from programHandle=0x8a02e50c71e. Num clients for program=0
default 17:17:14.011358 +0800 kernel IOReturn H11ANEIn::ANE_ProcessDestroy_gated(H11ANEProcessDestroyArgs *, bool, uint32_t *) :
H11ANEIn::ANE_ProgramDestroy_gated WARN: Freeing intermediate buffer inside ProcessDestroy
default 17:17:14.024969 +0800 kernel IOReturn H11ANEInUserClient::ANE_PowerOff() - client aned requesting Power Off
default 17:17:14.025291 +0800 kernel IOReturn H11ANEIn::setPowerStateGated(unsigned long, IOService *) : H11ANEIn::setPowerStateGated: 0
default 17:17:14.026850 +0800 kernel IOReturn H11ANEIn::ANE_deInit() : H11ANEIn::ANE_deInit - CSNE_CMD_POWER_DOWN command completed:
res=0x00000000
default 17:17:14.026880 +0800 kernel IOReturn H11ANEIn::ANE_deInit() : H11ANEIn::ANE_deInit - ANECPU in WFI after CSNE_CMD_SUSPEND/
CISP_CMD_POWER_DOWN. retries=0 ASCWRAP_IDLE_STATUS = 0x2d
default 17:17:14.039520 +0800 kernel IOReturn H11ANEIn::ANE_HandlePowerStateChecksForClient() : INFO: H11ANEIn: ANE power status:
isPowered: 0, fDeInitInProgress: 0, fFirmwareTimeout: 0
default 17:17:14.039563 +0800 kernel IOReturn H11ANEIn::ANE_UserClientCleanup_gated(void *) : Info: H11ANEIn: Skipping user client
cleanup for client (<private>) as power is already off
default 17:17:14.039723 +0800 kernel virtual IOReturn H11ANEInUserClient::clientClose() - aned
default 17:17:14.039749 +0800 kernel virtual void H11ANEInUserClient::free() - Freeing UserClient for process: aned (pid 191)
iPhone 8 Plus
default 17:08:51.256253 +0800 kernel ISPCPU: CmdTurnOffDevicePower: TS: 2.901495 Disable CAM0_SHUTDOWN=0
default 17:08:51.256277 +0800 kernel ISPCPU: Addr: 0x00000002122a8000
default 17:08:51.258444 +0800 kernel ISPCPU: TurnOffPower:DONE TS: 2.903766 rail: 0x5, ch: 0, cameraPowerBitEnable:
0x7e
default 17:08:51.258684 +0800 kernel AppleH10CamIn::ISP_PPMAdmissionCheck_gated: subClientID=1; budgetReq=0;
budgetAlloc=0; result=0x00000000
default 17:08:51.258726 +0800 kernel AppleH10CamIn::ISP_StopCamera_gated: subClientID=1; channel=0; budgetReq=0;
budgetAlloc=0; result=0x00000000, numPreviewFrames=72, numStillCaptureFrames:0
default 17:08:51.258813 +0800 kernel ISPCPU: [ISP: 2.904275] CH = 0 CMD = 0x0104 [CISP_CMD_CH_BUFFER_RETURN]
default 17:08:51.266156 +0800 kernel AppleH10CamIn::ISP_FlushInactiveDARTMappings: 0x00000000
default 17:08:51.266234 +0800 mediaserverd H10ISPServicesRemote: SetProperty 2 (sent)
default 17:08:51.267404 +0800 mediaserverd H10ISPServicesRemote: SetProperty 2 (reply=0x00000000)
default 17:08:51.272115 +0800 kernel ISPCPU: [ISP: 2.917542] CH = 0 CMD = 0x820b
[CISP_CMD_APPLE_CH_AE_TILES_MATRIX_METADATA_ENABLE]
default 17:08:51.273311 +0800 kernel ISPCPU: [ISP: 2.918641] CH = 0 CMD = 0x0130 [CISP_CMD_CH_GENERAL_PROCESS_STOP]
default 17:08:51.273747 +0800 kernel AppleH10CamIn::ISP_FlushInactiveDARTMappings: 0x00000000
default 17:08:51.276237 +0800 kernel AppleH10CamIn::ISP_ReleaseChannel_gated - channel: 0 (process: mediaserverd)
iPhone 6s
default 17:18:52.814006 +0800 kernel AppleH6CamIn::setPowerStateGated: 1
default 17:18:52.814054 +0800 kernel AppleH6CamIn::power_on_hardware
default 17:18:52.910762 +0800 kernel AppleH6CamIn::MotionDataEnable: Enabling for Endpoint 0
default 17:18:52.924652 +0800 mediaserverd FigSignalError: -12785, invalidated
default 17:18:52.954154 +0800 mediaserverd FigSignalError: -12785, invalidated
default 17:18:52.954361 +0800 kernel AppleH6CamIn::ISP_SelectBestMIPIFrequencyIndex_gated - channel: 0, currentRawBitDepth: 1, index: 2
default 17:18:53.118463 +0800 kernel AppleH6CamIn::ISP_CopySetfile_gated (camChan=0)
default 17:19:07.301879 +0800 kernel AppleH6CamIn::ISP_FlushInactiveDARTMappings: 0x00000000
default 17:19:12.307839 +0800 kernel AppleH6CamInUserClient::free - Freeing UserClient for process: mediaserverd (pid 2465)
default 17:19:12.308025 +0800 kernel AppleH6CamIn::setPowerStateGated: 0
default 17:19:12.308185 +0800 kernel AppleH6CamIn::power_off_hardware
default 17:19:12.321409 +0800 kernel AppleH6CamIn::ISP_FlushInactiveDARTMappings: 0x00000000
default 17:19:12.321478 +0800 kernel AppleH6CamIn::MotionDataDisable: Enabling for Endpoint 0
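The three log excerpts differ mainly in which driver classes show up: `H11ANE…` on the iPhone Xs Max, only `AppleH10Cam…` on the iPhone 8 Plus, and `AppleH6Cam…` on the iPhone 6s. A rough way to check a captured log for that is sketched below. The regular expression is a heuristic written for these samples and the function names are hypothetical; wrapped log lines would need to be rejoined before feeding them in.

```python
import re
from collections import Counter

# Heuristic: driver class names as they appear in the logs above, e.g.
# "H11ANEIn::", "H11ANEInUserClient::", "AppleH10CamIn::", "AppleH6CamIn::".
DRIVER_RE = re.compile(r"\b((?:Apple)?H\d+(?:ANE|Cam)\w*)::")


def count_drivers(log_lines):
    """Count how often each H<n> ANE/camera driver class is mentioned."""
    counts = Counter()
    for line in log_lines:
        counts.update(DRIVER_RE.findall(line))
    return counts


def mentions_ane(log_lines):
    """Return True if any line references an H<n>ANE driver class."""
    return any("ANE" in name for name in count_drivers(log_lines))


# Two lines copied from the iPhone Xs Max excerpt, one from the iPhone 8 Plus.
sample_a12 = [
    "default 17:17:14.024969 +0800 kernel IOReturn H11ANEInUserClient::ANE_PowerOff() - client aned requesting Power Off",
    "default 17:17:14.025291 +0800 kernel IOReturn H11ANEIn::setPowerStateGated(unsigned long, IOService *) : H11ANEIn::setPowerStateGated: 0",
]
sample_a11 = [
    "default 17:08:51.273747 +0800 kernel AppleH10CamIn::ISP_FlushInactiveDARTMappings: 0x00000000",
]

print(dict(count_drivers(sample_a12)), mentions_ane(sample_a12))
print(dict(count_drivers(sample_a11)), mentions_ane(sample_a11))
```

A missing match only shows that the ANE driver did not log during the capture window, so it supports the observation above but does not prove absence on its own.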
iPhone Xs Max iPhone 8 Plus
https://github.com/freedomtan/TestANE/
/* Generated by RuntimeBrowser
Image: /System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine
*/
@interface _ANEDeviceInfo : NSObject
+ (id)bootArgs;
+ (id)buildVersion;
+ (bool)hasANE;
+ (bool)isInternalBuild;
+ (bool)precompiledModelChecksDisabled;
@end
https://github.com/nst/iOS-Runtime-Headers/blob/master/PrivateFrameworks/AppleNeuralEngine.framework/_ANEDeviceInfo.h
size -l -x -m /tmp/arm64e/System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine
Segment __TEXT: 0x11000 (vmaddr 0x1abe22000 fileoff 0)
Section __text: 0xb728 (addr 0x1abe23d18 offset 7448)
Section __auth_stubs: 0x3d0 (addr 0x1abe2f440 offset 54336)
Section __cstring: 0xb87 (addr 0x1abe2f810 offset 55312)
Section __objc_methname: 0x10a5 (addr 0x1abe30397 offset 58263)
Section __objc_classname: 0x140 (addr 0x1abe3143c offset 62524)
Section __objc_methtype: 0x498 (addr 0x1abe3157c offset 62844)
Section __gcc_except_tab: 0x8cc (addr 0x1abe31a14 offset 64020)
Section __const: 0xd0 (addr 0x1abe322e0 offset 66272)
Section __oslogstring: 0x8d0 (addr 0x1abe323b0 offset 66480)
Section __unwind_info: 0x330 (addr 0x1abe32c80 offset 68736)
Section __eh_frame: 0x50 (addr 0x1abe32fb0 offset 69552)
total 0xf2e8
Segment __DATA: 0xe00 (vmaddr 0x1ba4ef3b8 fileoff 69632)
Section __objc_selrefs: 0x3e0 (addr 0x1ba4ef3b8 offset 69632)
Section __objc_protorefs: 0x10 (addr 0x1ba4ef798 offset 70624)
Section __objc_classrefs: 0x1b8 (addr 0x1ba4ef7a8 offset 70640)
Section __objc_superrefs: 0x38 (addr 0x1ba4ef960 offset 71080)
Section __objc_ivar: 0x60 (addr 0x1ba4ef998 offset 71136)
Section __objc_data: 0x4b0 (addr 0x1ba4ef9f8 offset 71232)
Section __data: 0x228 (addr 0x1ba4efea8 offset 72432)
Section __auth_ptr: 0x8 (addr 0x1ba4f00d0 offset 72984)
Section __bss: 0xe0 (addr 0x1ba4f00d8 offset 0)
total 0xe00
…
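As a sanity check on the listing above, the per-section sizes can be summed and compared with each `total` line. The sketch below assumes the textual layout shown in this `size -l -x -m` output; the function name is made up, and the embedded sample is the `__DATA` segment copied from the slide.

```python
import re

SECTION_RE = re.compile(r"Section\s+(\S+):\s+(0x[0-9a-fA-F]+)")
TOTAL_RE = re.compile(r"total\s+(0x[0-9a-fA-F]+)")


def check_segment_totals(size_output: str):
    """Sum section sizes per segment and pair each sum with the reported total.

    Returns a list of (summed, reported) tuples, one per 'total' line.
    """
    results = []
    running = 0
    for line in size_output.splitlines():
        section = SECTION_RE.search(line)
        if section:
            running += int(section.group(2), 16)
            continue
        total = TOTAL_RE.search(line)
        if total:
            results.append((running, int(total.group(1), 16)))
            running = 0
    return results


sample = """\
Segment __DATA: 0xe00 (vmaddr 0x1ba4ef3b8 fileoff 69632)
Section __objc_selrefs: 0x3e0 (addr 0x1ba4ef3b8 offset 69632)
Section __objc_protorefs: 0x10 (addr 0x1ba4ef798 offset 70624)
Section __objc_classrefs: 0x1b8 (addr 0x1ba4ef7a8 offset 70640)
Section __objc_superrefs: 0x38 (addr 0x1ba4ef960 offset 71080)
Section __objc_ivar: 0x60 (addr 0x1ba4ef998 offset 71136)
Section __objc_data: 0x4b0 (addr 0x1ba4ef9f8 offset 71232)
Section __data: 0x228 (addr 0x1ba4efea8 offset 72432)
Section __auth_ptr: 0x8 (addr 0x1ba4f00d0 offset 72984)
Section __bss: 0xe0 (addr 0x1ba4f00d8 offset 0)
total 0xe00
"""

for summed, reported in check_segment_totals(sample):
    print(hex(summed), hex(reported), summed == reported)
```

For this sample the section sizes add up to the reported 0xe00; the `__TEXT` sections in the listing likewise add up to 0xf2e8.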
otool -o /tmp/arm64e/System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine
/tmp/arm64e/System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine:
Contents of (__DATA_CONST,__objc_classlist) section
00000001b7a76a78 0x80001ba4efa20
00000001b7a76a80 0x80001ba4efa70
00000001b7a76a88 0x80001ba4efa98
00000001b7a76a90 0x80001ba4efae8
…
~/work/ios-hacking/tools/jtool -d objc /tmp/arm64/System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine
Fat binary, big-endian, 1 architectures: will auto-process this architecture
arm64_ANEDeviceInfo
_ANEDataReporter
_ANEProgramForEvaluation
_ANEModel
_ANEHashEncodin
_ANERequest
_ANELog
_ANEQoSMapper
_ANEStrings
_ANEDaemonConnection
_ANEIOSurfaceObject
_ANEDeviceController
_ANEClient
_ANEErrors
_ANECloneHelper
http://www.newosxbook.com/tools/jtool.html
Mach-O Headers
• Mac OS X ABI Mach-O File Format Reference: no longer
available on Apple's website; google it.

• headers: /usr/include/mach-o/loader.h

• objc runtime

• https://opensource.apple.com/source/objc4/objc4-723/,
https://opensource.apple.com/tarballs/objc4/objc4-723.tar.gz
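To make the loader.h pointer concrete, here is a minimal sketch that decodes the fixed-size `mach_header_64` at the start of a thin, little-endian Mach-O image. The constants are the ones defined in mach-o/loader.h and mach/machine.h; the function name and the returned dictionary are illustrative only, and fat binaries are merely detected, not parsed.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF     # magic of a thin 64-bit Mach-O image
FAT_MAGIC = 0xCAFEBABE       # magic of a universal (fat) file, stored big-endian
CPU_TYPE_ARM64 = 0x0100000C  # CPU_TYPE_ARM (12) | CPU_ARCH_ABI64 (0x01000000)

FIELDS = ("magic", "cputype", "cpusubtype", "filetype",
          "ncmds", "sizeofcmds", "flags", "reserved")


def read_mach_header_64(data: bytes) -> dict:
    """Decode mach_header_64 (eight 32-bit fields) from the start of data."""
    if len(data) < 32:
        raise ValueError("need at least 32 bytes for mach_header_64")
    # A fat file starts with a big-endian header; this sketch does not handle it.
    if struct.unpack_from(">I", data, 0)[0] == FAT_MAGIC:
        raise ValueError("fat binary: select an architecture slice first")
    header = dict(zip(FIELDS, struct.unpack_from("<8I", data, 0)))
    if header["magic"] != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O image")
    return header


# Usage sketch (the path is whatever binary you extracted):
# with open("dyld_shared_cache_arm64e.AppleNeuralEngine", "rb") as f:
#     print(read_mach_header_64(f.read(32)))
```

The load commands that follow the header (`ncmds` of them, `sizeofcmds` bytes in total) are where segment and section information such as the `size` listing comes from.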
Dive a bit deeper into Core ML
• Frameworks and some binaries used to be shipped unstripped as part of the iPhoneOS
SDK in Xcode. Not anymore; most framework binaries now live in dyld_shared_cache.

• Fortunately, it’s quite easy to inspect the iOS file system nowadays. Apple has not encrypted
.ipsw files since the iOS 10 beta (more than 2 years ago). So, get a .ipsw, unzip it (remember, it's
a .zip file), then mount the largest .dmg (this needs extra steps on Windows and Linux,
though). E.g.,

1. get iOS 12.0 ipsw for iPhone Xs Max [1]. See [2] for other firmwares.

2. unzip it.

3. mount 048-10782-224.dmg, that's it. You can see the whole filesystem used by
iPhone Xs Max.

• Thus, we can get the /System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm* we want

[1] http://updates-http.cdn-apple.com/2018FallFCS/fullrestores/091-65188/11BE19F6-AC8E-11E8-A312-F5CEDE149863/iPhone11,4,iPhone11,6_12.0_16A366_Restore.ipsw
[2] https://www.theiphonewiki.com/wiki/Firmware/iPhone/12.x
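Since an .ipsw is an ordinary .zip archive, picking out "the largest .dmg" can be scripted. The sketch below uses only Python's standard `zipfile` module; the function name is made up, and mounting the extracted image is left to the usual platform tools.

```python
import zipfile


def largest_dmg(ipsw):
    """Return (name, uncompressed_size) of the largest .dmg inside an .ipsw.

    `ipsw` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(ipsw) as archive:
        dmgs = [info for info in archive.infolist()
                if info.filename.lower().endswith(".dmg")]
        if not dmgs:
            raise ValueError("no .dmg entries found in archive")
        biggest = max(dmgs, key=lambda info: info.file_size)
        return biggest.filename, biggest.file_size


# Usage sketch (file name taken from the download URL in [1]):
# name, size = largest_dmg("iPhone11,4,iPhone11,6_12.0_16A366_Restore.ipsw")
# print(name, size)
```

For the iOS 12.0 firmware used in these slides, the root filesystem image is the 048-10782-224.dmg named in step 3; the smaller .dmg files in the archive are not the main filesystem.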
Dive a bit deeper into Core ML
• If you are on macOS and have Xcode installed, there are some binaries
with symbols in ~/Library/Developer/Xcode/iOS DeviceSupport/12.1 (16B92) arm64e/

• What do I mean by “some”? E.g.,
/System/Library/PrivateFrameworks/AppleNeuralEngine.framework/XPCServices/ANECompilerService.xpc/ANECompilerService
exists on A12 devices, but not in Xcode’s device support directory

• Yes, we can find /System/Library/Frameworks/CoreML.framework/CoreML
• Even /System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm* is there
Extract binaries from dyld_shared_cache
• jtool can do it for you. E.g.,

• list

~/work/ios-hacking/tools/jtool -l /Volumes/Peace16A366.D331OS/System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm64e
• extract

~/work/ios-hacking/tools/jtool -e /System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine /Volumes/Peace16A366.D331OS/System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm64e
Extracting /System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine at 0x2be22000 into dyld_shared_cache_arm64e.AppleNeuralEngine
• dyld source code

• https://opensource.apple.com/source/dyld/dyld-551.4/,
https://opensource.apple.com/tarballs/dyld/dyld-551.4.tar.gz

• Read dyld source and [1] for more about dyld_shared_cache

[1] https://iphonedevwiki.net/index.php/Dyld_shared_cache
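For a first look at the cache file itself, the sketch below decodes the leading fields of its header. The layout (a 16-byte magic such as "dyld_v1  arm64e", followed by 32-bit mappingOffset, mappingCount, imagesOffset and imagesCount) is my reading of dyld_cache_format.h in the dyld sources and should be checked against the version you are studying; the function name and dictionary keys are illustrative.

```python
import struct


def read_dyld_cache_header(data: bytes) -> dict:
    """Decode the magic and the first four 32-bit fields of a dyld shared cache."""
    if len(data) < 32:
        raise ValueError("need at least 32 bytes of header")
    # The magic is a NUL-padded ASCII string: "dyld_v1" + padding spaces + arch.
    magic = data[:16].rstrip(b"\0").decode("ascii", errors="replace")
    if not magic.startswith("dyld_v1"):
        raise ValueError("not a dyld_v1 shared cache")
    arch = magic[len("dyld_v1"):].strip()
    mapping_offset, mapping_count, images_offset, images_count = \
        struct.unpack_from("<4I", data, 16)
    return {
        "arch": arch,
        "mapping_offset": mapping_offset,
        "mapping_count": mapping_count,
        "images_offset": images_offset,
        "images_count": images_count,
    }


# Usage sketch (path as mounted in the jtool examples above):
# path = "/Volumes/Peace16A366.D331OS/System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm64e"
# with open(path, "rb") as f:
#     print(read_dyld_cache_header(f.read(32)))
```

This only reads the header; walking the image table to extract a single framework is what jtool's `-e` option does in the example above.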
What to read beyond Apple’s docs
• https://www.theiphonewiki.com, e.g.,
https://www.theiphonewiki.com/wiki/Firmware/iPhone/12.x

• http://iphonedevwiki.net/index.php/Main_Page, e.g.,
http://iphonedevwiki.net/index.php/Reverse_Engineering_Tools

• http://newosxbook.com/index.php, e.g.,
http://newosxbook.com/index.php?page=notes

• https://papers.put.as
kernel side
• So, how about extracting the ANE-related pieces and just putting them onto A11
devices?

• Well, look into the kernelcache of A11 and A12 devices:

• As expected, we can see lots of H11ANE information in the
A12 kernelcache

• The A11 kernelcache does mention H11ANE several
times, but it seems important modules are not there.

• So, I guess if we don’t jailbreak and root, we are out of luck!
That’s it
Isn’t XNU (Darwin) source code open?
• Well, there are more than 200 kernel modules; only some of them
are open

$ ~/work/ios-hacking/tools/jtool2 -k ../../iphonex/ipsw/kernelcache.release.iphone10b
0xfffffff00583c000:com.apple.kpi.mach
0xfffffff00583c080:com.apple.kpi.private
0xfffffff00583c100:com.apple.kpi.unsupported
0xfffffff00583c180:com.apple.kpi.iokit
0xfffffff00583c200:com.apple.kpi.libkern
0xfffffff00583c280:com.apple.kpi.bsd
0xfffffff00583c300:com.apple.iokit.IONetworkingFamily
0xfffffff00583de00:com.apple.iokit.IOTimeSyncFamily
0xfffffff0058416c0:com.apple.iokit.IOSlowAdaptiveClockingFamily
0xfffffff005841c40:com.apple.iokit.IOStorageFamily
0xfffffff005842e80:com.apple.iokit.IOReportFamily
0xfffffff005843680:com.apple.driver.AppleARMPlatform
0xfffffff00584cd80:com.apple.driver.AppleSamsungSPI
0xfffffff00584dd00:com.apple.kpi.dsep
0xfffffff00584dd80:com.apple.kec.corecrypto
…
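With more than 200 entries, it helps to filter the listing rather than read it by eye. The sketch below parses the `address:bundle-id` lines in the format shown above and searches the bundle identifiers for a substring. The function names are hypothetical, and the embedded sample is only the first few lines of the truncated listing, so an empty result for "ANE" here says nothing about the full kernelcache.

```python
def parse_kext_list(listing: str):
    """Parse 'address:bundle.id' lines into (address, bundle_id) tuples."""
    kexts = []
    for line in listing.splitlines():
        addr, sep, bundle = line.strip().partition(":")
        if sep and addr.startswith("0x"):
            kexts.append((int(addr, 16), bundle))
    return kexts


def find_kexts(kexts, needle: str):
    """Return bundle ids containing needle (case-sensitive substring match)."""
    return [bundle for _, bundle in kexts if needle in bundle]


sample = """\
0xfffffff00583c000:com.apple.kpi.mach
0xfffffff00583c180:com.apple.kpi.iokit
0xfffffff00583c300:com.apple.iokit.IONetworkingFamily
0xfffffff00584cd80:com.apple.driver.AppleSamsungSPI
"""

kexts = parse_kext_list(sample)
print(find_kexts(kexts, "iokit"))
print(find_kexts(kexts, "ANE"))
```

Running the same filter over the complete jtool2 output for an A11 and an A12 kernelcache is one way to compare which ANE-related modules each one ships.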

A peek into Python's Metaclass and Bytecode from a Smalltalk User
 
Android Wear and the Future of Smartwatch
Android Wear and the Future of SmartwatchAndroid Wear and the Future of Smartwatch
Android Wear and the Future of Smartwatch
 
Understanding Android Benchmarks
Understanding Android BenchmarksUnderstanding Android Benchmarks
Understanding Android Benchmarks
 
Dark Silicon, Mobile Devices, and Possible Open-Source Solutions
Dark Silicon, Mobile Devices, and Possible Open-Source SolutionsDark Silicon, Mobile Devices, and Possible Open-Source Solutions
Dark Silicon, Mobile Devices, and Possible Open-Source Solutions
 
Smalltalk and ruby - 2012-12-08
Smalltalk and ruby  - 2012-12-08Smalltalk and ruby  - 2012-12-08
Smalltalk and ruby - 2012-12-08
 


Why You Cannot Use Neural Engine to Run Your NN Models on A11 Devices?

  • 1. Can I use Neural Engine to run my neural networks on A11 devices? Koan-Sin Tan freedom@computer.org Hsinchu Coding Serfs Meeting, Nov, 2018
  • 2. https://www.anandtech.com/show/13392/the-iphone-xs-xs-max-review-unveiling-the-silicon-secrets/5 • AnandTech is one of my favorite tech sites. Usually, they provide good technical analysis • E.g., Apple’s CPUs • cache sizes • execution units • latencies of various instructions • Not good enough for NN accelerators on mobile phones • floating-point VGG16, Inception V3, and ResNet34? • come on, are you still in the Neolithic era? ANE on A12, how about A11?
  • 3. Why I said VGG16 is Neolithic Era • Lightweight models are there • MobileNet V1 could have roughly the same top-1 accuracy even when quantized to uint8 • MobileNet V2 could have better top-1 accuracy • MnasNet could be better than MobileNet V1 • Classification, object detection, segmentation, etc. • 8-bit quantization is good enough for many cases https://github.com/tensorflow/models/raw/master/research/slim/nets/mobilenet/madds_top1_accuracy.png
  • 4. How to use Neural Engine • According to Apple: • A11: 600 G ops per second, A12: 5 T ops per second • Yes, by default, it's enabled on A12 devices. If you have pre-iOS 12 apps built on top of Core ML, they should be able to use it automatically. But not on A11 devices. • How to verify it? • MLModelConfiguration [1]: instance variable @property(readwrite) MLComputeUnits computeUnits; • there is usesCPUOnly for VNRequest in iOS 11, but nothing like MLComputeUnits • See my example [2] [1] https://developer.apple.com/documentation/coreml/mlmodelconfiguration?language=objc [2] https://github.com/freedomtan/coremlbenchmark/
  • 5. Why not VNRequest? • Since I mentioned VNRequest in Vision.framework, why not VNCoreMLRequest? • Yes, I wrote simple VNCoreMLRequest-based apps before, in both Swift and Objective-C [1][2]. • It gives you a simplified interface and crops and scales images for you. • Yes, but those image operations take time. • This actually reminds us of an important system software issue. • Modern cellphone SoCs use DVFS and all kinds of energy-saving techniques extensively. How can we get good performance? • Inference with the camera on is usually faster than with the camera off!!! [1] https://github.com/freedomtan/SimpleInceptionV3/ [2] https://github.com/freedomtan/SimpleInceptionV3-ObjC
  • 6. Neural Engine in Action • H11ANEServicesThread • A12 is for iPhone11,x • No H10ANEServicesThread • So, who started H11ANEServicesThread? There is nothing named H11 in /System/Library/Frameworks/CoreML.framework/CoreML • It seems it’s in /System/Library/PrivateFrameworks/ANEServices.framework/ANEServices • A12 devices only
  • 7. iPhone Xs Max
    default 17:17:14.002705 +0800 kernel IOReturn H11ANEIn::ANE_ProcessDestroy_gated(H11ANEProcessDestroyArgs *, bool, uint32_t *) : H11ANEIn::ANE_ProgramDestroy_gated WARN: Freeing intermediate buffer inside ProcessDestroy
    default 17:17:14.004821 +0800 kernel IOReturn H11ANEIn::ANE_ProcessDestroy_gated(H11ANEProcessDestroyArgs *, bool, uint32_t *) : H11ANE:ANE_ProcessDestroy_gated Removed client aned from programHandle=0x8a03aa2e112. Num clients for program=0
    default 17:17:14.004938 +0800 kernel IOReturn H11ANEIn::ANE_ProcessDestroy_gated(H11ANEProcessDestroyArgs *, bool, uint32_t *) : H11ANEIn::ANE_ProgramDestroy_gated WARN: Freeing intermediate buffer inside ProcessDestroy
    default 17:17:14.011142 +0800 kernel IOReturn H11ANEIn::ANE_ProcessDestroy_gated(H11ANEProcessDestroyArgs *, bool, uint32_t *) : H11ANE:ANE_ProcessDestroy_gated Removed client aned from programHandle=0x8a02e50c71e. Num clients for program=0
    default 17:17:14.011358 +0800 kernel IOReturn H11ANEIn::ANE_ProcessDestroy_gated(H11ANEProcessDestroyArgs *, bool, uint32_t *) : H11ANEIn::ANE_ProgramDestroy_gated WARN: Freeing intermediate buffer inside ProcessDestroy
    default 17:17:14.024969 +0800 kernel IOReturn H11ANEInUserClient::ANE_PowerOff() - client aned requesting Power Off
    default 17:17:14.025291 +0800 kernel IOReturn H11ANEIn::setPowerStateGated(unsigned long, IOService *) : H11ANEIn::setPowerStateGated: 0
    default 17:17:14.026850 +0800 kernel IOReturn H11ANEIn::ANE_deInit() : H11ANEIn::ANE_deInit - CSNE_CMD_POWER_DOWN command completed: res=0x00000000
    default 17:17:14.026880 +0800 kernel IOReturn H11ANEIn::ANE_deInit() : H11ANEIn::ANE_deInit - ANECPU in WFI after CSNE_CMD_SUSPEND/CISP_CMD_POWER_DOWN. retries=0 ASCWRAP_IDLE_STATUS = 0x2d
    default 17:17:14.039520 +0800 kernel IOReturn H11ANEIn::ANE_HandlePowerStateChecksForClient() : INFO: H11ANEIn: ANE power status: isPowered: 0, fDeInitInProgress: 0, fFirmwareTimeout: 0
    default 17:17:14.039563 +0800 kernel IOReturn H11ANEIn::ANE_UserClientCleanup_gated(void *) : Info: H11ANEIn: Skipping user client cleanup for client (<private>) as power is already off
    default 17:17:14.039723 +0800 kernel virtual IOReturn H11ANEInUserClient::clientClose() - aned
    default 17:17:14.039749 +0800 kernel virtual void H11ANEInUserClient::free() - Freeing UserClient for process: aned (pid 191)
  • 8. iPhone 8 Plus
    default 17:08:51.256253 +0800 kernel ISPCPU: CmdTurnOffDevicePower: TS: 2.901495 Disable CAM0_SHUTDOWN=0
    default 17:08:51.256277 +0800 kernel ISPCPU: Addr: 0x00000002122a8000
    default 17:08:51.258444 +0800 kernel ISPCPU: TurnOffPower:DONE TS: 2.903766 rail: 0x5, ch: 0, cameraPowerBitEnable: 0x7e
    default 17:08:51.258684 +0800 kernel AppleH10CamIn::ISP_PPMAdmissionCheck_gated: subClientID=1; budgetReq=0; budgetAlloc=0; result=0x00000000
    default 17:08:51.258726 +0800 kernel AppleH10CamIn::ISP_StopCamera_gated: subClientID=1; channel=0; budgetReq=0; budgetAlloc=0; result=0x00000000, numPreviewFrames=72, numStillCaptureFrames:0
    default 17:08:51.258813 +0800 kernel ISPCPU: [ISP: 2.904275] CH = 0 CMD = 0x0104 [CISP_CMD_CH_BUFFER_RETURN]
    default 17:08:51.266156 +0800 kernel AppleH10CamIn::ISP_FlushInactiveDARTMappings: 0x00000000
    default 17:08:51.266234 +0800 mediaserverd H10ISPServicesRemote: SetProperty 2 (sent)
    default 17:08:51.267404 +0800 mediaserverd H10ISPServicesRemote: SetProperty 2 (reply=0x00000000)
    default 17:08:51.272115 +0800 kernel ISPCPU: [ISP: 2.917542] CH = 0 CMD = 0x820b [CISP_CMD_APPLE_CH_AE_TILES_MATRIX_METADATA_ENABLE]
    default 17:08:51.273311 +0800 kernel ISPCPU: [ISP: 2.918641] CH = 0 CMD = 0x0130 [CISP_CMD_CH_GENERAL_PROCESS_STOP]
    default 17:08:51.273747 +0800 kernel AppleH10CamIn::ISP_FlushInactiveDARTMappings: 0x00000000
    default 17:08:51.276237 +0800 kernel AppleH10CamIn::ISP_ReleaseChannel_gated - channel: 0 (process: mediaserverd)
  • 9. iPhone 6s
    default 17:18:52.814006 +0800 kernel AppleH6CamIn::setPowerStateGated: 1
    default 17:18:52.814054 +0800 kernel AppleH6CamIn::power_on_hardware
    default 17:18:52.910762 +0800 kernel AppleH6CamIn::MotionDataEnable: Enabling for Endpoint 0
    default 17:18:52.924652 +0800 mediaserverd FigSignalError: -12785, invalidated
    default 17:18:52.954154 +0800 mediaserverd FigSignalError: -12785, invalidated
    default 17:18:52.954361 +0800 kernel AppleH6CamIn::ISP_SelectBestMIPIFrequencyIndex_gated - channel: 0, currentRawBitDepth: 1, index: 2
    default 17:18:53.118463 +0800 kernel AppleH6CamIn::ISP_CopySetfile_gated (camChan=0)
    default 17:19:07.301879 +0800 kernel AppleH6CamIn::ISP_FlushInactiveDARTMappings: 0x00000000
    default 17:19:12.307839 +0800 kernel AppleH6CamInUserClient::free - Freeing UserClient for process: mediaserverd (pid 2465)
    default 17:19:12.308025 +0800 kernel AppleH6CamIn::setPowerStateGated: 0
    default 17:19:12.308185 +0800 kernel AppleH6CamIn::power_off_hardware
    default 17:19:12.321409 +0800 kernel AppleH6CamIn::ISP_FlushInactiveDARTMappings: 0x00000000
    default 17:19:12.321478 +0800 kernel AppleH6CamIn::MotionDataDisable: Enabling for Endpoint 0
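The log excerpts on slides 7 to 9 differ mainly in which driver classes show up: an ANE driver on the iPhone Xs Max, only camera/ISP drivers on the iPhone 8 Plus and iPhone 6s. A minimal Python sketch of that comparison follows. It assumes class names follow the `H<n>ANEIn` / `AppleH<n>CamIn` pattern seen in the excerpts (an inference from these logs, not a documented naming scheme), and the helper name `driver_classes` is hypothetical.

```python
import re

# Pattern inferred from the slide excerpts only: optional "Apple" prefix,
# "H" plus a generation number, "ANE" or "Cam", then "In".
DRIVER_RE = re.compile(r"(?:Apple)?(H\d+)(ANE|Cam)In")

# One line per device, copied from slides 7, 8 and 9.
SAMPLE_LOG = [
    "default 17:17:14.024969 +0800 kernel IOReturn H11ANEInUserClient::ANE_PowerOff() - client aned requesting Power Off",
    "default 17:08:51.273747 +0800 kernel AppleH10CamIn::ISP_FlushInactiveDARTMappings: 0x00000000",
    "default 17:18:52.814054 +0800 kernel AppleH6CamIn::power_on_hardware",
]


def driver_classes(lines):
    """Return the set of (generation, kind) pairs mentioned in the log lines."""
    found = set()
    for line in lines:
        for generation, kind in DRIVER_RE.findall(line):
            found.add((generation, kind))
    return found


if __name__ == "__main__":
    for generation, kind in sorted(driver_classes(SAMPLE_LOG)):
        print(generation, kind)
```

Running the same helper over a full capture from each device gives a quick way to see whether any `ANE` entry appears at all.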
  • 10. iPhone Xs Max iPhone 8 Plus https://github.com/freedomtan/TestANE/
  • 11. /* Generated by RuntimeBrowser
       Image: /System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine
     */
    @interface _ANEDeviceInfo : NSObject
    + (id)bootArgs;
    + (id)buildVersion;
    + (bool)hasANE;
    + (bool)isInternalBuild;
    + (bool)precompiledModelChecksDisabled;
    @end
    https://github.com/nst/iOS-Runtime-Headers/blob/master/PrivateFrameworks/AppleNeuralEngine.framework/_ANEDeviceInfo.h
  • 12. size -l -x -m /tmp/arm64e/System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine
    Segment __TEXT: 0x11000 (vmaddr 0x1abe22000 fileoff 0)
        Section __text: 0xb728 (addr 0x1abe23d18 offset 7448)
        Section __auth_stubs: 0x3d0 (addr 0x1abe2f440 offset 54336)
        Section __cstring: 0xb87 (addr 0x1abe2f810 offset 55312)
        Section __objc_methname: 0x10a5 (addr 0x1abe30397 offset 58263)
        Section __objc_classname: 0x140 (addr 0x1abe3143c offset 62524)
        Section __objc_methtype: 0x498 (addr 0x1abe3157c offset 62844)
        Section __gcc_except_tab: 0x8cc (addr 0x1abe31a14 offset 64020)
        Section __const: 0xd0 (addr 0x1abe322e0 offset 66272)
        Section __oslogstring: 0x8d0 (addr 0x1abe323b0 offset 66480)
        Section __unwind_info: 0x330 (addr 0x1abe32c80 offset 68736)
        Section __eh_frame: 0x50 (addr 0x1abe32fb0 offset 69552)
        total 0xf2e8
    Segment __DATA: 0xe00 (vmaddr 0x1ba4ef3b8 fileoff 69632)
        Section __objc_selrefs: 0x3e0 (addr 0x1ba4ef3b8 offset 69632)
        Section __objc_protorefs: 0x10 (addr 0x1ba4ef798 offset 70624)
        Section __objc_classrefs: 0x1b8 (addr 0x1ba4ef7a8 offset 70640)
        Section __objc_superrefs: 0x38 (addr 0x1ba4ef960 offset 71080)
        Section __objc_ivar: 0x60 (addr 0x1ba4ef998 offset 71136)
        Section __objc_data: 0x4b0 (addr 0x1ba4ef9f8 offset 71232)
        Section __data: 0x228 (addr 0x1ba4efea8 offset 72432)
        Section __auth_ptr: 0x8 (addr 0x1ba4f00d0 offset 72984)
        Section __bss: 0xe0 (addr 0x1ba4f00d8 offset 0)
        total 0xe00
    …
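As a sanity check on the `size` listing from slide 12, the per-section sizes can be summed and compared with the `total` line of each segment. The sketch below is a small Python parser written for this transcript, not part of any Apple tool. The sample text keeps only the `Segment`, `Section` and `total` lines from the slide, and the function name `section_totals` is hypothetical.

```python
import re

# Segment/Section/total lines copied from the slide's `size -l -x -m` output.
SIZE_OUTPUT = """\
Segment __TEXT: 0x11000 (vmaddr 0x1abe22000 fileoff 0)
Section __text: 0xb728 (addr 0x1abe23d18 offset 7448)
Section __auth_stubs: 0x3d0 (addr 0x1abe2f440 offset 54336)
Section __cstring: 0xb87 (addr 0x1abe2f810 offset 55312)
Section __objc_methname: 0x10a5 (addr 0x1abe30397 offset 58263)
Section __objc_classname: 0x140 (addr 0x1abe3143c offset 62524)
Section __objc_methtype: 0x498 (addr 0x1abe3157c offset 62844)
Section __gcc_except_tab: 0x8cc (addr 0x1abe31a14 offset 64020)
Section __const: 0xd0 (addr 0x1abe322e0 offset 66272)
Section __oslogstring: 0x8d0 (addr 0x1abe323b0 offset 66480)
Section __unwind_info: 0x330 (addr 0x1abe32c80 offset 68736)
Section __eh_frame: 0x50 (addr 0x1abe32fb0 offset 69552)
total 0xf2e8
Segment __DATA: 0xe00 (vmaddr 0x1ba4ef3b8 fileoff 69632)
Section __objc_selrefs: 0x3e0 (addr 0x1ba4ef3b8 offset 69632)
Section __objc_protorefs: 0x10 (addr 0x1ba4ef798 offset 70624)
Section __objc_classrefs: 0x1b8 (addr 0x1ba4ef7a8 offset 70640)
Section __objc_superrefs: 0x38 (addr 0x1ba4ef960 offset 71080)
Section __objc_ivar: 0x60 (addr 0x1ba4ef998 offset 71136)
Section __objc_data: 0x4b0 (addr 0x1ba4ef9f8 offset 71232)
Section __data: 0x228 (addr 0x1ba4efea8 offset 72432)
Section __auth_ptr: 0x8 (addr 0x1ba4f00d0 offset 72984)
Section __bss: 0xe0 (addr 0x1ba4f00d8 offset 0)
total 0xe00
"""

SEGMENT_RE = re.compile(r"Segment (\S+): (0x[0-9a-f]+)")
SECTION_RE = re.compile(r"Section (\S+): (0x[0-9a-f]+)")
TOTAL_RE = re.compile(r"total (0x[0-9a-f]+)")


def section_totals(text):
    """Return {segment: (sum of section sizes, reported total)}."""
    results = {}
    segment, running = None, 0
    for raw in text.splitlines():
        line = raw.strip()
        if m := SEGMENT_RE.match(line):
            segment, running = m.group(1), 0
        elif m := SECTION_RE.match(line):
            running += int(m.group(2), 16)
        elif (m := TOTAL_RE.match(line)) and segment is not None:
            results[segment] = (running, int(m.group(1), 16))
    return results


if __name__ == "__main__":
    for name, (summed, reported) in section_totals(SIZE_OUTPUT).items():
        print(f"{name}: sections sum to {summed:#x}, size reports {reported:#x}")
```

In this listing the section sums match the reported totals for both segments; the `__TEXT` segment itself (0x11000) is larger than its sections because the segment size includes header space and padding.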
  • 13. otool -o /tmp/arm64e/System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine
    /tmp/arm64e/System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine:
    Contents of (__DATA_CONST,__objc_classlist) section
    00000001b7a76a78 0x80001ba4efa20
    00000001b7a76a80 0x80001ba4efa70
    00000001b7a76a88 0x80001ba4efa98
    00000001b7a76a90 0x80001ba4efae8
    …
    ~/work/ios-hacking/tools/jtool -d objc /tmp/arm64/System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine
    Fat binary, big-endian, 1 architectures: will auto-process this architecture arm64
    _ANEDeviceInfo
    _ANEDataReporter
    _ANEProgramForEvaluation
    _ANEModel
    _ANEHashEncodin
    _ANERequest
    _ANELog
    _ANEQoSMapper
    _ANEStrings
    _ANEDaemonConnection
    _ANEIOSurfaceObject
    _ANEDeviceController
    _ANEClient
    _ANEErrors
    _ANECloneHelper
    http://www.newosxbook.com/tools/jtool.html
  • 14. Mach-O Headers • Mac OS X ABI Mach-O File Format Reference is no longer available on Apple's web site; google it. • headers: /usr/include/mach-o/loader.h • objc runtime • https://opensource.apple.com/source/objc4/objc4-723/, https://opensource.apple.com/tarballs/objc4/objc4-723.tar.gz
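To give a feel for what `mach-o/loader.h` describes, here is a minimal Python sketch that decodes the fixed-size `mach_header_64` at the start of a 64-bit Mach-O image and recognises a fat (multi-architecture) file by its big-endian magic. The constants are the values defined in the Mach-O headers. The sample bytes are synthetic, with made-up `ncmds` and `sizeofcmds`, and the function names are hypothetical.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF     # 64-bit Mach-O magic (mach-o/loader.h)
FAT_MAGIC = 0xCAFEBABE       # fat / multi-architecture magic (mach-o/fat.h)
CPU_TYPE_ARM64 = 0x0100000C  # CPU_TYPE_ARM | CPU_ARCH_ABI64 (mach/machine.h)
MH_DYLIB = 0x6               # filetype of a dynamic library

FIELDS = ("magic", "cputype", "cpusubtype", "filetype",
          "ncmds", "sizeofcmds", "flags", "reserved")


def parse_mach_header_64(data):
    """Decode the eight uint32 fields of a little-endian mach_header_64."""
    if len(data) < 32:
        raise ValueError("need at least 32 bytes for mach_header_64")
    header = dict(zip(FIELDS, struct.unpack_from("<8I", data)))
    if header["magic"] != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O header")
    return header


def is_fat(data):
    """Fat headers are stored big-endian, so read the magic that way."""
    return len(data) >= 4 and struct.unpack_from(">I", data)[0] == FAT_MAGIC


# Synthetic header for illustration only; ncmds and sizeofcmds are made up.
SAMPLE = struct.pack("<8I", MH_MAGIC_64, CPU_TYPE_ARM64, 0, MH_DYLIB, 20, 2000, 0, 0)

if __name__ == "__main__":
    header = parse_mach_header_64(SAMPLE)
    print({name: hex(value) for name, value in header.items()})
```

For a real file, read the first 32 bytes and check `is_fat` first; a fat file needs its `fat_arch` table walked before any per-architecture header can be parsed.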
  • 15. Dive a bit deeper into Core ML • Frameworks and some binaries used to be shipped unstripped as part of the iPhoneOS SDK in Xcode. Not anymore; most framework binaries are now in dyld_shared_cache. • Fortunately, it’s quite easy to inspect the iOS file system nowadays. Apple has not encrypted .ipsw files since iOS 10 beta (more than 2 years ago). So, get an .ipsw, unzip it (remember it's a .zip file), then mount the largest .dmg (this needs extra steps on Windows and Linux though). E.g., 1. get the iOS 12.0 ipsw for iPhone Xs Max [1]. See [2] for other firmwares. 2. unzip it. 3. mount 048-10782-224.dmg, that's it. You can see the whole filesystem used by iPhone Xs Max. • Thus, we can get the /System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm* we want [1] http://updates-http.cdn-apple.com/2018FallFCS/fullrestores/091-65188/11BE19F6-AC8E-11E8-A312-F5CEDE149863/iPhone11,4,iPhone11,6_12.0_16A366_Restore.ipsw [2] https://www.theiphonewiki.com/wiki/Firmware/iPhone/12.x
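The "unzip it, then mount the largest .dmg" step from slide 15 can be scripted, because an .ipsw is an ordinary zip archive. The following Python sketch uses only the standard `zipfile` module to find the largest `.dmg` member. The function name `largest_dmg` is hypothetical, and mounting the extracted image is left to the platform's own tools.

```python
import zipfile


def largest_dmg(ipsw):
    """Return (member name, uncompressed size) of the biggest .dmg in an .ipsw.

    `ipsw` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(ipsw) as archive:
        dmgs = [info for info in archive.infolist()
                if info.filename.lower().endswith(".dmg")]
        if not dmgs:
            raise ValueError("no .dmg members found in archive")
        biggest = max(dmgs, key=lambda info: info.file_size)
        return biggest.filename, biggest.file_size


def extract_largest_dmg(ipsw, destination="."):
    """Extract only the largest .dmg and return the path it was written to."""
    name, _size = largest_dmg(ipsw)
    with zipfile.ZipFile(ipsw) as archive:
        return archive.extract(name, path=destination)
```

With the firmware from [1] saved locally, `largest_dmg("iPhone11,4,iPhone11,6_12.0_16A366_Restore.ipsw")` should point at the root filesystem image that the slide mounts by hand.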
  • 16. Dive a bit deeper into Core ML • If you are on macOS and have Xcode installed, there are some binaries with symbols in ~/Library/Developer/Xcode/iOS DeviceSupport/12.1 (16B92) arm64e/ • What do I mean by “some”? E.g., there is /System/Library/PrivateFrameworks/AppleNeuralEngine.framework/XPCServices/ANECompilerService.xpc/ANECompilerService on A12 devices, but not in Xcode’s support library • Yes, we can find /System/Library/Frameworks/CoreML.framework/CoreML • Even /System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm* is there
  • 17. extract binaries from dyld_shared_cache • jtool can do it for you. E.g., • list: ~/work/ios-hacking/tools/jtool -l /Volumes/Peace16A366.D331OS/System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm64e • extract: ~/work/ios-hacking/tools/jtool -e /System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine /Volumes/Peace16A366.D331OS/System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm64e Extracting /System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine at 0x2be22000 into dyld_shared_cache_arm64e.AppleNeuralEngine • dyld source code • https://opensource.apple.com/source/dyld/dyld-551.4/, https://opensource.apple.com/tarballs/dyld/dyld-551.4.tar.gz • Read dyld source and [1] for more about dyld_shared_cache [1] https://iphonedevwiki.net/index.php/Dyld_shared_cache
  • 18. What to read beyond Apple’s docs • https://www.theiphonewiki.com, e.g., https://www.theiphonewiki.com/wiki/Firmware/iPhone/12.x • http://iphonedevwiki.net/index.php/Main_Page, e.g., http://iphonedevwiki.net/index.php/Reverse_Engineering_Tools • http://newosxbook.com/index.php, e.g., http://newosxbook.com/index.php?page=notes • https://papers.put.as
  • 19. kernel side • So, how about extracting the ANE-related stuff and just putting it on A11 devices? • Well, look into the kernel_cache of A11 and A12 devices: • As expected, we can see lots of H11ANE information in the A12 kernel_cache • The A11 kernel_cache does mention H11ANE several times, but it seems important modules are not there. • So, I guess if we don’t jailbreak and root, we are out of luck!
  • 21. Isn’t XNU (Darwin) source code open? • Well, there are more than 200 kernel modules, and only some of them are open
    $ ~/work/ios-hacking/tools/jtool2 -k ../../iphonex/ipsw/kernelcache.release.iphone10b
    0xfffffff00583c000:com.apple.kpi.mach
    0xfffffff00583c080:com.apple.kpi.private
    0xfffffff00583c100:com.apple.kpi.unsupported
    0xfffffff00583c180:com.apple.kpi.iokit
    0xfffffff00583c200:com.apple.kpi.libkern
    0xfffffff00583c280:com.apple.kpi.bsd
    0xfffffff00583c300:com.apple.iokit.IONetworkingFamily
    0xfffffff00583de00:com.apple.iokit.IOTimeSyncFamily
    0xfffffff0058416c0:com.apple.iokit.IOSlowAdaptiveClockingFamily
    0xfffffff005841c40:com.apple.iokit.IOStorageFamily
    0xfffffff005842e80:com.apple.iokit.IOReportFamily
    0xfffffff005843680:com.apple.driver.AppleARMPlatform
    0xfffffff00584cd80:com.apple.driver.AppleSamsungSPI
    0xfffffff00584dd00:com.apple.kpi.dsep
    0xfffffff00584dd80:com.apple.kec.corecrypto
    …
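To get an overview of a long `jtool2 -k` listing like the one on slide 21, the `address:bundle-id` entries can be grouped by the third component of the bundle identifier (`kpi`, `iokit`, `driver`, ...). The Python sketch below does that for the fifteen entries visible on the slide. The grouping rule and the name `kext_families` are illustrative choices, not something jtool2 provides.

```python
import re
from collections import Counter

# The entries visible on the slide; the real listing continues after these.
KEXT_LIST = """\
0xfffffff00583c000:com.apple.kpi.mach
0xfffffff00583c080:com.apple.kpi.private
0xfffffff00583c100:com.apple.kpi.unsupported
0xfffffff00583c180:com.apple.kpi.iokit
0xfffffff00583c200:com.apple.kpi.libkern
0xfffffff00583c280:com.apple.kpi.bsd
0xfffffff00583c300:com.apple.iokit.IONetworkingFamily
0xfffffff00583de00:com.apple.iokit.IOTimeSyncFamily
0xfffffff0058416c0:com.apple.iokit.IOSlowAdaptiveClockingFamily
0xfffffff005841c40:com.apple.iokit.IOStorageFamily
0xfffffff005842e80:com.apple.iokit.IOReportFamily
0xfffffff005843680:com.apple.driver.AppleARMPlatform
0xfffffff00584cd80:com.apple.driver.AppleSamsungSPI
0xfffffff00584dd00:com.apple.kpi.dsep
0xfffffff00584dd80:com.apple.kec.corecrypto
"""

ENTRY_RE = re.compile(r"^(0x[0-9a-f]+):(\S+)$")


def kext_families(text):
    """Count kext bundle IDs by their third dotted component."""
    counts = Counter()
    for raw in text.splitlines():
        match = ENTRY_RE.match(raw.strip())
        if not match:
            continue  # skip the command line, blank lines, and the trailing ellipsis
        parts = match.group(2).split(".")
        family = parts[2] if len(parts) > 2 else parts[-1]
        counts[family] += 1
    return counts


if __name__ == "__main__":
    for family, count in kext_families(KEXT_LIST).most_common():
        print(f"{family}: {count}")
```

Filtering the same parsed list for identifiers containing `ANE` is one way to compare what an A11 and an A12 kernelcache actually ship.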