This document summarizes and compares several techniques for improving RNN performance for speech recognition:
1) FastGRNN combines a minimal gated cell with low-rank matrix factorization and quantization to make recurrent networks faster and smaller.
2) LightGRU removes the reset gate from GRUs and replaces tanh with ReLU for improved speech recognition performance.
3) AWD-LSTM incorporates techniques such as weight dropping (DropConnect), averaged SGD, and activation regularization to prevent overfitting in LSTMs.
Overall, the document evaluates different approaches for making RNNs more efficient and effective for speech tasks.
2. Old Good RNNs
Can't train RNNs!!
Gradients go crazy!!
Fish are better at remembering!!!
I watched Schmidhuber and liked him!!
I don't care about the baseline, I use what the cool kids use!!
Why so big? Occam will cry!!
My GPU has 4GB!!
I can't wait months to train!!
X et al. said GRUs are better!!
3. What else?
I need an RNN-sized model with LSTM performance!!
I need a smaller model or a better smartphone!!
FastGRNN
http://manikvarma.org/pubs/kusupati18.pdf
This reset gate makes no sense!!
May the ReLU be with you!!
I do speech recognition!!
I watched Bengio and liked him!!
LightGRU
https://arxiv.org/abs/1803.10225
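As a reference for what LightGRU (Li-GRU) changes relative to a standard GRU, here is a minimal sketch of a Li-GRU cell in PyTorch. This is not the authors' code; the class and variable names, and the batch-norm placement on the feed-forward connections, follow my reading of the paper.

```python
import torch
import torch.nn as nn

class LiGRUCell(nn.Module):
    """Li-GRU: a GRU with no reset gate, ReLU instead of tanh, and
    batch norm on the feed-forward (input-to-hidden) connections."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.Wz = nn.Linear(input_size, hidden_size, bias=False)
        self.Uz = nn.Linear(hidden_size, hidden_size, bias=False)
        self.Wh = nn.Linear(input_size, hidden_size, bias=False)
        self.Uh = nn.Linear(hidden_size, hidden_size, bias=False)
        self.bn_z = nn.BatchNorm1d(hidden_size)
        self.bn_h = nn.BatchNorm1d(hidden_size)

    def forward(self, x, h):
        # Update gate only; the reset gate is gone.
        z = torch.sigmoid(self.bn_z(self.Wz(x)) + self.Uz(h))
        # Candidate state uses ReLU instead of tanh.
        h_tilde = torch.relu(self.bn_h(self.Wh(x)) + self.Uh(h))
        return z * h + (1 - z) * h_tilde
```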
I need Regularization!!!
Naive dropout doesn't work well for RNNs!!!
AWD-LSTM
https://arxiv.org/abs/1708.02182
4. FastGRNN
● 2 trainable matrices (a single W and U shared by the gate and the candidate state) vs. 6 trainable matrices in a GRU layer.
● Low-rank approximation of matrices: W = W1(W2)^T.
● Integer quantization of parameters.
● Piecewise-linear approximation of the non-linearities.
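A minimal sketch of a FastGRNN cell, assuming the formulation in the paper: one W and one U shared by the gate and the candidate state, each stored as a low-rank product, plus trainable scalars zeta and nu. Quantization and the piecewise-linear non-linearities are omitted, and the initialization constants are illustrative.

```python
import torch
import torch.nn as nn

class FastGRNNCell(nn.Module):
    def __init__(self, input_size, hidden_size, rank):
        super().__init__()
        # Low-rank factors: W = W1 @ W2.T, U = U1 @ U2.T
        self.W1 = nn.Parameter(torch.randn(hidden_size, rank) * 0.1)
        self.W2 = nn.Parameter(torch.randn(input_size, rank) * 0.1)
        self.U1 = nn.Parameter(torch.randn(hidden_size, rank) * 0.1)
        self.U2 = nn.Parameter(torch.randn(hidden_size, rank) * 0.1)
        self.b_z = nn.Parameter(torch.zeros(hidden_size))
        self.b_h = nn.Parameter(torch.zeros(hidden_size))
        # Trainable scalar gates (zeta, nu in the paper), kept in (0, 1)
        # via sigmoid; initial values here are illustrative.
        self.zeta = nn.Parameter(torch.tensor(1.0))
        self.nu = nn.Parameter(torch.tensor(-4.0))

    def forward(self, x, h):
        W = self.W1 @ self.W2.t()          # (hidden, input)
        U = self.U1 @ self.U2.t()          # (hidden, hidden)
        pre = x @ W.t() + h @ U.t()        # shared pre-activation
        z = torch.sigmoid(pre + self.b_z)  # gate
        h_tilde = torch.tanh(pre + self.b_h)
        gain = torch.sigmoid(self.zeta) * (1 - z) + torch.sigmoid(self.nu)
        return gain * h_tilde + z * h
```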
12. Weight Dropping
● Apply DropConnect to the hidden-to-hidden connections (all U matrices).
● Prevents overfitting on the recurrent connections.
● Requires no modification of the optimized RNN implementations in DL frameworks.
● The same dropout mask is applied across the whole sequence.
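A minimal sketch of the idea. The actual AWD-LSTM trick patches the `weight_hh_l0` parameter of a cuDNN-backed `nn.LSTM` before each forward pass, which is why no framework internals need modifying; the plain tanh RNN below just makes the mechanics visible: one DropConnect mask is sampled for U per call and reused at every timestep.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightDropRNN(nn.Module):
    """Plain tanh RNN with DropConnect on the recurrent matrix U."""
    def __init__(self, input_size, hidden_size, weight_p=0.5):
        super().__init__()
        self.W = nn.Linear(input_size, hidden_size)
        self.U = nn.Parameter(torch.randn(hidden_size, hidden_size) * 0.1)
        self.weight_p = weight_p

    def forward(self, x):  # x: (batch, time, input_size)
        # One DropConnect mask on U per call -> same mask for the whole sequence.
        U = F.dropout(self.U, p=self.weight_p, training=self.training)
        h = x.new_zeros(x.size(0), self.U.size(0))
        outputs = []
        for t in range(x.size(1)):
            h = torch.tanh(self.W(x[:, t]) + h @ U.t())
            outputs.append(h)
        return torch.stack(outputs, dim=1), h
```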
13. Averaged SGD (ASGD) and NT-ASGD
[Figure: the ASGD update, annotated with the number of steps before averaging starts, the weights optimized per iteration, and the averaged weights used as the final model.]
PyTorch implementation:
https://github.com/pytorch/pytorch/blob/cd9b27231b51633e76e28b6a34002ab83b0660fc/torch/optim/asgd.py
NT-ASGD: only switch to ASGD once the validation metric fails to improve for several consecutive checks
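A minimal sketch of the NT-ASGD trigger, assuming hypothetical `train_one_epoch` and `evaluate` helpers standing in for the usual training and validation loops: train with plain SGD and switch to `torch.optim.ASGD` once the validation loss has gone `n` consecutive checks without improving.

```python
import torch

def fit(model, lr=30.0, n=5, max_epochs=100):
    """Train with SGD, switching to ASGD via the non-monotonic trigger."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    best_val = float('inf')
    bad_checks = 0
    for epoch in range(max_epochs):
        train_one_epoch(model, optimizer)  # hypothetical training loop
        val_loss = evaluate(model)         # hypothetical validation pass
        if val_loss < best_val:
            best_val, bad_checks = val_loss, 0
        else:
            bad_checks += 1
        # Non-monotonic trigger: after n checks without improvement,
        # switch to ASGD (t0=0 starts the weight averaging immediately).
        if bad_checks >= n and isinstance(optimizer, torch.optim.SGD):
            optimizer = torch.optim.ASGD(model.parameters(), lr=lr, t0=0)
```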
14. Embedding Dropout
● Apply dropout at the word level: entire randomly selected word vectors are zeroed out, so every occurrence of a dropped word vanishes for that forward pass.
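A minimal sketch, modeled on the AWD-LSTM idea (the function name is mine): one Bernoulli keep/drop decision per vocabulary row rather than per token, with the usual 1/(1-p) inverted-dropout rescaling of the kept rows.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def embedding_dropout(embed: nn.Embedding, words, p=0.1, training=True):
    """Zero out whole rows of the embedding matrix (whole words)."""
    if not training or p == 0:
        return embed(words)
    # One keep/drop decision per vocabulary entry, not per token.
    mask = embed.weight.new_empty((embed.weight.size(0), 1)).bernoulli_(1 - p) / (1 - p)
    return F.embedding(words, mask * embed.weight, padding_idx=embed.padding_idx)

# Usage: emb = nn.Embedding(10000, 300)
#        vectors = embedding_dropout(emb, token_ids, p=0.1, training=True)
```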
15. Activation Regularization
● Penalize the network for producing large hidden activations (AR) and large changes between consecutive hidden states (TAR), both of which lead to overfitting.
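A minimal sketch of the two penalties as I read them from the paper; as a simplification it applies both terms to the raw hidden states `h` of shape (batch, time, hidden), whereas the paper applies AR to the dropped-out activations.

```python
import torch

def activation_regularization(h, alpha=2.0, beta=1.0):
    ar = alpha * h.pow(2).mean()                       # AR: keep activations small
    tar = beta * (h[:, 1:] - h[:, :-1]).pow(2).mean()  # TAR: keep them smooth in time
    return ar + tar

# Usage: loss = criterion(logits, targets) + activation_regularization(h)
```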