State of the art. In [87], the authors first recall the security and data privacy risks raised by the multi-party learning models likely to take place in 5G network management (e.g., operators may not be willing to share their network operating metadata), as well as the merits of Intel SGX in mitigating these risks. Because of the performance losses expected with SGX, the authors introduce optimizations for the customized binary integration of learning algorithms (K-means, CNN, SVM, matrix factorization) and stress the requirement of data obliviousness, which preserves the privacy of the training and sample data collected and generated outside SGX. In doing so, the authors map the security and privacy issues holistically, all the way through the complete AI data pipeline. The overhead incurred when running the model inside SGX varies from a more than satisfactory 1% to a more significant 91% depending on the algorithm type (CNN and K-means, respectively).
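To illustrate the data-obliviousness requirement stressed above, the following minimal Python sketch reads an array element with a memory access pattern that is independent of the secret index. It is a simplified illustration only (the function is our own, and interpreted Python is not actually constant-time); production enclave code relies on constant-time machine instructions.

```python
def oblivious_select(arr, secret_index):
    """Return arr[secret_index] while touching every element, so the
    sequence of memory accesses does not depend on the secret index.
    Works for non-negative integers; -1 is an all-ones mask in Python."""
    result = 0
    for i, value in enumerate(arr):
        mask = -(i == secret_index)  # -1 (all ones) on match, else 0
        result |= value & mask       # blend in only the selected element
    return result

# Example: the access pattern is identical whichever index is secret.
assert oblivious_select([10, 20, 30, 40], 2) == 30
```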
In [88], the authors deliver efficient deep learning on multi-source private data, leveraging differential privacy (DP) on commercial TEEs. Their technology, dubbed MYELIN, shows similar performance (a negligible slowdown) when applying DP-protected ML. To do so, their implementation compiles a static library embedding the minimal core routines; the static library is then run entirely in the TEE, which removes any costly context switch between the TEE mode and the normal execution mode. Specialized hardware accelerators (Tensor Processing Units, TPUs) are also viewed as the necessary step to take for highly demanding (fast) decision making. That remains a gray area, with no existing TEE embodiment for specialized hardware to the best of our knowledge. In addition, leveraging the TEE data sealing capability looks like another path to consider for further improvements.
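MYELIN's exact mechanism is specified in [88]; as a generic, hedged illustration of DP-protected learning, the sketch below shows the core step of differentially private SGD, i.e., per-example gradient clipping followed by calibrated Gaussian noise. The function name, parameter names, and default values are our own assumptions, not taken from the paper.

```python
import numpy as np

def dp_noisy_mean_gradient(per_example_grads, clip_norm=1.0,
                           noise_multiplier=1.1):
    """Clip each per-example gradient to L2 norm clip_norm, average them,
    and add Gaussian noise scaled to the clipping bound -- the standard
    DP-SGD step that bounds any single example's influence."""
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    mean_grad = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(per_example_grads)
    return mean_grad + np.random.normal(0.0, sigma, size=mean_grad.shape)

# Example with dummy gradients for a 4-parameter model.
grads = [np.random.randn(4) for _ in range(32)]
private_grad = dp_noisy_mean_gradient(grads)
```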
In [89], the authors deliver fast, verifiable, and private execution of neural networks in trusted hardware, leveraging a commercial TEE. SLALOM splits the execution between a Graphics Processing Unit (GPU) and the TEE, while delivering security assurance on the correctness of the GPU operations using Freivalds' algorithm. Outsourcing the linear operations from the TEE to the GPU is aimed at boosting performance, in a scheme that can be applied to any faster co-processor. Fully TEE-embedded inference was the baseline of this research, deemed unsatisfactory on the performance side.
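Freivalds' algorithm is what allows the TEE to check a matrix product returned by the untrusted GPU at O(n²) cost rather than the O(n³) cost of recomputing it. SLALOM performs the check over a finite field for exactness; the floating-point variant below is a simplified sketch of the idea, not the paper's implementation.

```python
import numpy as np

def freivalds_check(A, B, C, rounds=20, tol=1e-6):
    """Probabilistically verify C == A @ B. Each round multiplies by a
    random 0/1 vector, costing only matrix-vector products; an incorrect
    C passes every round with probability at most 2**-rounds."""
    n = C.shape[1]
    for _ in range(rounds):
        r = np.random.randint(0, 2, size=(n, 1)).astype(A.dtype)
        if not np.allclose(A @ (B @ r), C @ r, atol=tol):
            return False  # the co-processor's result is wrong
    return True

# Example: accept an honest GPU result, reject a tampered one.
A, B = np.random.randn(64, 64), np.random.randn(64, 64)
C = A @ B
assert freivalds_check(A, B, C)
C[0, 0] += 1.0
assert not freivalds_check(A, B, C)
```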
In [90], the authors recall the need for ever-growing, security- and privacy-sensitive training data sets, which calls for cloud operation; this, however, comes with its own off-premise security issues. The authors describe the cloud operation security threat as the theft of training data and models by a cloud operator, and advocate remediating these risks by leveraging the SGX enclave TEE. To that end, they employ the SCONE framework, which drastically limits the effort required to integrate an application inside SGX. The TensorSCONE design comprises two main components placed within SGX: the TensorSCONE controller, which interfaces with the system for system calls (network, filesystem, user-space threads), and the TensorFlow library, which enables unmodified TensorFlow applications to be deployed. The authors describe the integration of the different TensorFlow components, namely training from TensorFlow and classification from TensorFlow Lite, the feature-reduced variant designed for mobile and embedded devices. The objective of this selection is to meet SGX's EPC memory restriction (below 128 MB on older devices, while from Intel's 10th generation the limit was raised to 1 TB) while maintaining TensorFlow features, when possible, to keep the ML application framework easy and efficient. The authors discuss the performance losses compared to native TensorFlow implementations at both the training and classification stages, with a benchmark on image recognition (Inception-v4). Classification throughput is degraded by a factor of 3, while training latency is inflated by a factor of 4 to 8. The authors do not foresee or discuss major performance gains, despite their deep analysis of the several causes of SGX-induced losses (inevitable page swapping for large sample and training data), and instead consider approximate computing as a possible way to recover some of the loss.
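For illustration, the TensorFlow Lite classification path mentioned above follows the standard interpreter API, as in the minimal sketch below. The model file name and the dummy input are hypothetical, and this is plain TensorFlow Lite rather than the SCONE-wrapped build described in [90].

```python
import numpy as np
import tensorflow as tf

# Load a (hypothetical) converted .tflite model and allocate its tensors;
# the small, static footprint of this path is what helps fit the EPC limit.
interpreter = tf.lite.Interpreter(model_path="inception_v4.tflite")
interpreter.allocate_tensors()

input_info = interpreter.get_input_details()[0]
output_info = interpreter.get_output_details()[0]

# Feed one dummy image and run a single classification pass.
image = np.zeros(input_info["shape"], dtype=input_info["dtype"])
interpreter.set_tensor(input_info["index"], image)
interpreter.invoke()
logits = interpreter.get_tensor(output_info["index"])
```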
In [91], the authors propose a design that extends GPU hardware with a standard hardware extension to offload the kernels of security-sensitive applications onto the GPU. Graviton is a design intended to break the classical TEE performance bottleneck. Besides the detailed design explanations, the authors describe a typical implementation on a CIFAR-10 convolutional network. As the solution is not materialized in a chip, simulation is used to evaluate the performance overhead, which ranges from 53% to 73% depending on the training phase. As AI/ML is currently becoming the hottest trend in ICT, and GPUs are well suited to the parallelized processing of model generation, we may see GPUs emerge that address AI/ML application security, probably with fewer workflow constraints and less performance overhead than classical TEEs. Graviton is probably the initial (emulation) step in this direction.

Securing AI by means of TEEs is a nascent discipline, at an early stage, with encouraging results recently reported. From this first survey, however, it appears that turnkey frameworks (TensorSCONE) are far more costly in terms of performance overhead than what can be achieved through more demanding customized integration (Microsoft Research). We expect the performance of AI/ML-oriented SGX tools to improve and be optimized over time.