H2O World 2015 - Arno Candel
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ... (Edureka!)
(Natural Language Processing Using Python: https://www.edureka.co/python-natural...)
This PPT will provide you with detailed and comprehensive knowledge of two important aspects of Natural Language Processing, i.e., Stemming and Lemmatization. It will also cover the differences between the two, with a demo of each. Following are the topics covered in this PPT:
Introduction to Big Data
What is Text Mining?
What is NLP?
Introduction to Stemming
Introduction to Lemmatization
Applications of Stemming & Lemmatization
Difference between Stemming & Lemmatization
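The contrast between the two techniques listed above can be illustrated with a minimal sketch in plain Python (this is not the deck's demo, which presumably uses NLTK; the suffix rules and the tiny lemma dictionary here are illustrative assumptions):

```python
# Minimal illustration of stemming vs. lemmatization.
# Stemming: crude suffix stripping; may produce non-words ("runn").
# Lemmatization: dictionary lookup that returns a real base form.

def stem(word):
    """Naive suffix-stripping stemmer (illustrative, not the Porter algorithm)."""
    for suffix in ("ies", "ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# A tiny hand-made lemma dictionary; a real lemmatizer uses a full lexicon
# (e.g., WordNet) plus part-of-speech information.
LEMMAS = {"studies": "study", "running": "run", "better": "good", "mice": "mouse"}

def lemmatize(word):
    return LEMMAS.get(word, word)

for w in ["studies", "running", "better"]:
    print(w, "-> stem:", stem(w), "| lemma:", lemmatize(w))
```

Note how the stemmer maps "studies" to the non-word "stud" while the lemmatizer returns the dictionary form "study"; that difference in output quality versus speed is the core trade-off between the two techniques.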
Follow us to never miss an update in the future.
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
I will try to explain what QA (question answering) is, how we can get answers to questions posed in natural language, and how successful we have been in this domain.
I have gained all of my knowledge from three selected papers and the reading I did around them.
Learn, through this sequence of slides structured like chapters of a book, how HTML works - the sensational language that is currently the most widely used language for structuring web pages worldwide. Discover, in an interesting and not at all boring way, how fascinating this language is and how it has opened up job-market horizons for all of us.
The Requirements Life Cycle Management knowledge area describes the tasks that business analysts perform to manage and maintain requirements and design information from inception to retirement.
H2O World - H2O Deep Learning with Arno Candel (Sri Ambati)
H2O World 2015
Tutorial scripts for R and Python are here:
https://github.com/h2oai/h2o-world-2015-training/tree/master/tutorials/deeplearning
H2O World - Sparkling Water - Michal Malohlava (Sri Ambati)
H2O World 2015 - Michal Malohlava
Intro to Machine Learning with H2O and AWS (Sri Ambati)
Navdeep Gill @ Galvanize Seattle - May 2016
Suggestions:
1) For best quality, download the PDF before viewing.
2) Open at least two windows: one for the YouTube video, one for the screencast (link below), and optionally one for the slides themselves.
3) The YouTube video is shown on the first page of the slide deck; for the slides, just skip to page 2.
Screencast: http://youtu.be/VoL7JKJmr2I
Video recording: http://youtu.be/CJRvb8zxRdE (Thanks to Al Friedrich!)
In this talk, we take Deep Learning to task with real-world data puzzles to solve.
Data:
- Higgs binary classification dataset (10M rows, 29 cols)
- MNIST 10-class dataset
- Weather categorical dataset
- eBay text classification dataset (8500 cols, 500k rows, 467 classes)
- ECG heartbeat anomaly detection
H2O World - Welcome to H2O World with Arno Candel (Sri Ambati)
H2O World 2015
H2O Machine Learning and Kalman Filters for Machine Prognostics (Sri Ambati)
Machine Learning and Kalman Filters for Machine Prognostics by Hank Roark, at Ford Research & Innovation Center, 1.21.16
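The Kalman-filter side of this talk can be sketched in a few lines of plain Python. This is a minimal one-dimensional filter estimating a constant value from noisy sensor readings; the noise parameters and measurements are illustrative assumptions, not material from the presentation:

```python
# Minimal 1-D Kalman filter: estimate a constant true value from noisy readings.
def kalman_1d(measurements, q=1e-4, r=0.25, x0=0.0, p0=1.0):
    """q: process-noise variance, r: measurement-noise variance,
    x0/p0: initial state estimate and its variance."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q              # predict: variance grows by process noise
        k = p / (p + r)        # Kalman gain: trust in the new measurement
        x = x + k * (z - x)    # update: pull the estimate toward the reading
        p = (1 - k) * p        # the updated estimate is less uncertain
        estimates.append(x)
    return estimates

# Noisy sensor readings scattered around a true value of 1.0
est = kalman_1d([1.2, 0.8, 1.1, 0.9, 1.05, 0.95])
print(round(est[-1], 2))  # the estimate settles near 1.0
```

In a prognostics setting, the same predict/update loop runs over degradation sensor streams, and the smoothed state estimate feeds downstream machine learning models.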
Intro to H2O in Python - Data Science LA (Sri Ambati)
Erin LeDell's presentation on Intro to H2O Machine Learning in Python at Data Science LA meetup on 1.19.16
Scalable Data Science and Deep Learning with H2O (odsc)
The era of Big Data has passed, and the era of sensory overload – that is, the proliferation of sensor data – is upon us. The challenge today is how to create the next generation of business and consumer applications that transform how we interact with sensors themselves. Applications need to learn from every user interaction and data point and predict what can happen next. The future depends on Machine Learning, as much as it depends on the data itself, to change the way we interact with these systems.
In this talk, we explain H2O's scalable distributed in-memory math architecture and its design principles. The platform was built alongside (and on top of) both Hadoop and Spark clusters and includes interfaces for R, Python, Scala, Java, JavaScript and JSON, along with an interactive graphical Flow interface that makes it easier for non-engineers to stitch together complete analytic workflows. We outline the implementation of distributed machine learning algorithms such as Elastic Net, Random Forest, Gradient Boosting and Deep Learning. We will present a broad range of use cases and live demos that include world-record deep learning models, anomaly detection tools and approaches for Kaggle data science competitions. We also demonstrate the applicability of H2O in enterprise environments for real-world customer production use cases. By the end of this presentation, you will know how to create your own machine learning workflows on your data using R, Python (IPython notebooks) or the Flow GUI.
How to win data science competitions with Deep Learning (Sri Ambati)
Note: Please download the slides first, otherwise some links won't work!
How to win Kaggle-style data science competitions and influence decisions with R, Deep Learning and H2O's fast algorithms.
We take a few public and Kaggle datasets and build models that win competitions on accuracy and scoring speed.
Deep Learning Interview Questions And Answers | AI & Deep Learning Interview ... (Simplilearn)
This Deep Learning interview questions and answers presentation will help you prepare for Deep Learning interviews. It is ideal for beginners as well as professionals who are appearing for Deep Learning, Machine Learning or Data Science interviews. Learn the most important Deep Learning interview questions and answers, and know what will set you apart in the interview process.
Some of the important Deep Learning interview questions are listed below:
1. What is Deep Learning?
2. What is a Neural Network?
3. What is a Multilayer Perceptron (MLP)?
4. What is Data Normalization and why do we need it?
5. What is a Boltzmann Machine?
6. What is the role of Activation Functions in a neural network?
7. What is a cost function?
8. What is Gradient Descent?
9. What do you understand by Backpropagation?
10. What is the difference between Feedforward Neural Network and Recurrent Neural Network?
11. What are some applications of Recurrent Neural Network?
12. What are Softmax and ReLU functions?
13. What are hyperparameters?
14. What will happen if the learning rate is set too low or too high?
15. What is Dropout and Batch Normalization?
16. What is the difference between Batch Gradient Descent and Stochastic Gradient Descent?
17. Explain Overfitting and Underfitting and how to combat them.
18. How are weights initialized in a network?
19. What are the different layers in CNN?
20. What is Pooling in CNN and how does it work?
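Questions 8 and 14 above can be answered concretely with a tiny sketch: gradient descent on a simple quadratic loss, run with a good, a too-low, and a too-high learning rate (the function and the rate values are illustrative choices, not from the slides):

```python
# Gradient descent on f(x) = (x - 3)^2, whose minimum is at x = 3.
# The gradient is f'(x) = 2 * (x - 3).

def gradient_descent(lr, steps=50, x=0.0):
    for _ in range(steps):
        x = x - lr * 2 * (x - 3)  # step against the gradient
    return x

print(gradient_descent(lr=0.1))    # well chosen: converges very close to 3
print(gradient_descent(lr=0.001))  # too low: barely moved after 50 steps
print(gradient_descent(lr=1.1))    # too high: each step overshoots; diverges
```

Too low a learning rate wastes compute (training crawls); too high a rate makes updates overshoot the minimum, so the loss oscillates or explodes, which is exactly the failure mode question 14 asks about.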
Simplilearn's Deep Learning course will transform you into an expert in deep learning techniques using TensorFlow, the open-source software library designed for machine learning and deep neural network research. With our deep learning course, you'll master deep learning and TensorFlow concepts, learn to implement algorithms, build artificial neural networks and traverse layers of data abstraction to understand the power of data, preparing you for your new role as a deep learning scientist.
Why Deep Learning?
TensorFlow is one of the most popular software platforms used for deep learning and contains powerful tools to help you build and implement artificial neural networks.
Advancements in deep learning are being seen in smartphone applications, creating efficiencies in the power grid, driving advancements in healthcare, improving agricultural yields, and helping us find solutions to climate change.
There is booming demand for skilled deep learning engineers across a wide range of industries, making this deep learning course with TensorFlow training well-suited for professionals at the intermediate to advanced level of experience. We recommend this deep learning online course particularly for the following professionals:
1. Software engineers
2. Data scientists
3. Data analysts
4. Statisticians with an interest in deep learning
Learn more at: https://www.simplilearn.com
Presentation in Vietnam Japan AI Community in 2019-05-26.
The presentation summarizes what I've learned about Regularization in Deep Learning.
Disclaimer: The presentation is given in a community event, so it wasn't thoroughly reviewed or revised.
Separating Hype from Reality in Deep Learning with Sameer Farooqui (Databricks)
Deep Learning is all the rage these days, but where does the reality of what Deep Learning can do end and the media hype begin? In this talk, I will dispel common myths about Deep Learning that are not necessarily true and help you decide whether you should practically use Deep Learning in your software stack.
I’ll begin with a technical overview of common neural network architectures like CNNs, RNNs, GANs and their common use cases like computer vision, language understanding or unsupervised machine learning. Then I’ll separate the hype from reality around questions like:
• When should you prefer traditional ML systems like scikit-learn or Spark.ML instead of Deep Learning?
• Do you no longer need to do careful feature extraction and standardization if using Deep Learning?
• Do you really need terabytes of data when training neural networks or can you ‘steal’ pre-trained lower layers from public models by using transfer learning?
• How do you decide which activation function (like ReLU, leaky ReLU, ELU, etc) or optimizer (like Momentum, AdaGrad, RMSProp, Adam, etc) to use in your neural network?
• Should you randomly initialize the weights in your network or use more advanced strategies like Xavier or He initialization?
• How easy is it to overfit/overtrain a neural network, and what are the common techniques to avoid overfitting (like L1/L2 regularization, dropout and early stopping)?
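The activation-function bullet above can be made concrete with a small sketch in plain Python (the alpha values are the commonly used defaults, assumed here for illustration):

```python
import math

def relu(x):
    """Zero for negative inputs; identity for positive ones."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Like ReLU, but lets a small gradient through for negative inputs."""
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    """Smooth negative branch that saturates at -alpha instead of clipping to 0."""
    return x if x > 0 else alpha * (math.exp(x) - 1)

for f in (relu, leaky_relu, elu):
    print(f.__name__, f(2.0), round(f(-2.0), 4))
```

The practical difference shows up only for negative inputs: ReLU outputs exactly 0 (which can "kill" neurons), leaky ReLU keeps a small slope, and ELU curves smoothly toward -alpha.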
Similar to H2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
Building LLM Solutions using Open Source and Closed Source Solutions in Coher... (Sri Ambati)
Sandeep Singh, Head of Applied AI Computer Vision, Beans.ai
H2O Open Source GenAI World SF 2023
In the modern era of machine learning, leveraging both open-source and closed-source solutions has become paramount for achieving cutting-edge results. This talk delves into the intricacies of seamlessly integrating open-source Large Language Model (LLM) solutions like Vicuna, Falcon, and Llama with industry giants such as ChatGPT and Google's Palm. As the demand for fine-tuned and specialized datasets grows, it is imperative to understand the synergy between these tools. Attendees will gain insights into best practices for building and enriching datasets tailored for fine-tuning tasks, ensuring that their LLM projects are both robust and efficient. Through real-world examples and hands-on demonstrations, this talk will equip attendees with the knowledge to harness the power of both open and closed-source tools in a coherent and effective manner.
Patrick Hall, Professor, AI Risk Management, The George Washington University
H2O Open Source GenAI World SF 2023
Language models are incredible engineering breakthroughs but require auditing and risk management before productization. These systems raise concerns about toxicity, transparency and reproducibility, intellectual property licensing and ownership, disinformation and misinformation, supply chains, and more. How can your organization leverage these new tools without taking on undue or unknown risks? While language models and associated risk management are in their infancy, a small number of best practices in governance and risk are starting to emerge. If you have a language model use case in mind, want to understand your risks, and do something about them, this presentation is for you!
Dr. Alexy Khrabrov, Open Source Science Community Director, IBM
H2O Open Source GenAI World SF 2023
In this talk, Dr. Alexy Khrabrov, recently elected Chair of the new Generative AI Commons at Linux Foundation for AI & Data, outlines the OSS AI landscape, challenges, and opportunities. With new models and frameworks being unveiled weekly, one thing remains constant: community building and validation of all aspects of AI is key to reliable and responsible AI we can use for business and society needs. Industrial AI is one key area where such community validation can prove invaluable.
Michelle Tanco, Head of Product, H2O.ai
H2O Open Source GenAI World SF 2023
Learn how the makers at H2O.ai are building internal tools to solve real use cases using H2O Wave and h2oGPT. We will walk through an end-to-end use case and discuss how to incorporate business rules and generated content to rapidly develop custom AI apps using only Python APIs.
Applied Gen AI for the Finance Vertical (Sri Ambati)
Megan Kurka, Vice President, Customer Data Scientist, H2O.ai
H2O Open Source GenAI World SF 2023
Discover the transformative power of Applied Gen AI. Learn how the H2O team builds customized applications and workflows that integrate capabilities of Gen AI and AutoML specifically designed to address and enhance financial use cases. Explore real world examples, learn best practices, and witness firsthand how our innovative solutions are reshaping the landscape of finance technology.
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren... (Sri Ambati)
Pascal Pfeiffer, Principal Data Scientist, H2O.ai
H2O Open Source GenAI World SF 2023
This talk dives into the expansive ecosystem of Large Language Models (LLMs), offering practitioners an insightful guide to various relevant applications, from natural language understanding to creative content generation. While exploring use cases across different industries, it also honestly addresses the current limitations of LLMs and anticipates future advancements.
Introduction to Machine Learning with H2O-3 (1) (Sri Ambati)
In this virtual meetup, we give an introduction to the number-one open-source machine learning platform, H2O-3, and show you how you can use it to develop models for different use cases.
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use... (Sri Ambati)
Numerai is an open, crowd-sourced hedge fund powered by predictions from data scientists around the world. In return, participants are rewarded with weekly payouts in crypto.
In this talk, Joe will give an overview of the Numerai tournament based on his own experience. He will then explain how he automates the time-consuming tasks such as testing different modelling strategies, scoring new datasets, submitting predictions to Numerai as well as monitoring model performance with H2O Driverless AI and R.
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo... (Sri Ambati)
In this session, you will learn what you should do after you've taken an AI transformation baseline. Over the span of this session, we will discuss the next steps in moving toward AI readiness through alignment of talent and tools to drive successful adoption and continuous use within an organization.
To find additional videos on AI courses, earn badges, join the courses at H2O.ai Learning Center: https://training.h2o.ai/products/ai-foundations-course
To find the YouTube video about this presentation: https://youtu.be/K1Cl3x3rd8g
Speaker:
Chemere Davis (H2O.ai - Senior Data Scientist Training Specialist)
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart... (Globus)
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet's largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data and applying computations on a different system. As part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined, on-demand data workflows capable of applying many data reduction and data analysis operations to the large ESGF data archives, transferring only the resulting analysis products (e.g., visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv... (Shahin Sheidaei)
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
May Marketo Masterclass, London MUG May 22 2024.pdf (Adele Miller)
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
Introducing Crescat - Event Management Software for Venues, Festivals and Eve... (Crescat)
Crescat is industry-trusted event management software, built by event professionals for event professionals. Founded in 2017, we have three key products tailored for the live event industry.
Crescat Event for concert promoters and event agencies. Crescat Venue for music venues, conference centers, wedding venues, concert halls and more. And Crescat Festival for festivals, conferences and complex events.
With a wide range of popular features such as event scheduling, shift management, volunteer and crew coordination, artist booking and much more, Crescat is designed for customisation and ease-of-use.
Over 125,000 events have been planned in Crescat and with hundreds of customers of all shapes and sizes, from boutique event agencies through to international concert promoters, Crescat is rigged for success. What's more, we highly value feedback from our users and we are constantly improving our software with updates, new features and improvements.
If you plan events, run a venue or produce festivals and you're looking for ways to make your life easier, then we have a solution for you. Try our software for free or schedule a no-obligation demo with one of our product specialists today at crescat.io
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll know how to organize and improve your code review process.
Workshop - Innovating with Generative AI and Knowledge Graphs (Neo4j)
Workshop - Innovating with Generative AI and Knowledge Graphs
Go beyond the AI hype and discover practical techniques for using AI responsibly across your organization's data. Explore how knowledge graphs can be used to increase accuracy, transparency and explainability in generative AI systems. You will leave with hands-on experience combining data relationships and LLMs to bring domain-specific context and improve reasoning.
Bring your laptop and we will guide you through setting up your own generative AI stack, with practical, coded examples to get you started within minutes.
Navigating the Metaverse: A Journey into Virtual Evolution (Donna Lenk)
Join us for an exploration of the Metaverse's evolution, where innovation meets imagination. Discover new dimensions of virtual events, engage with thought-provoking discussions, and witness the transformative power of digital realms.
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam (takuyayamamoto1800)
In these slides, we show a simulation example and how to compile the solver.
The Helmholtz equation can be solved with helmholtzFoam, and the Helmholtz equation with uniformly dispersed bubbles can be simulated with helmholtzBubbleFoam.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ... (Juraj Vysvader)
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc. I didn't get rich from it, but my work did reach 63K downloads (possibly powering tens of thousands of websites).
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Prosigns: Transforming Business with Tailored Technology Solutions (Prosigns)
H2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
1. Top 13 Deep Learning Tips & Tricks
Arno Candel, PhD
Chief Architect, H2O.ai
@ArnoCandel
2. Tip #1: Understand Model Complexity
[Figure: example network with inputs Age, Income, FICO score, outputs Fraud/Legit, and [4,3] hidden neurons in Layer 1 and Layer 2; CPU intense]
• Neurons are connected layer-by-layer in a feed-forward, directed acyclic graph
• Each connection is a floating-point number (a weight)
• A neuron's activation/output is a non-linear transform of the weighted sum of its inputs
• Model complexity (training time) grows with the total number of weights
• Training is iterative: it can be stopped at any point and continued until convergence
• Multi-threading leads to race conditions, so results are not reproducible
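The bullets above can be made concrete with a tiny pure-Python sketch (an illustration only, not H2O's implementation; the function name and values are made up). It shows a single neuron: a weighted sum of inputs plus a bias, passed through a non-linear activation, here the Rectifier, max(0, x):

```python
def neuron_output(inputs, weights, bias):
    """One neuron: non-linear transform (Rectifier, max(0, x)) of the
    weighted sum of its inputs plus a bias. Each weight is one
    floating-point number, i.e., one connection in the graph."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return max(0.0, weighted_sum)

# 3 inputs (e.g., Age, Income, FICO score), one hidden neuron:
print(neuron_output([1.0, 2.0, 0.5], [0.1, 0.2, 0.4], 0.05))
```

Stacking such neurons layer by layer yields the feed-forward directed acyclic graph described above; the model's size is the total count of these weights.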
3. Tip #1: Understand Model Complexity
Given data: 10 billion rows, 1111 columns; 1110 predictor columns (1100 numerical, 10 categorical with 400 levels each), 1 binary response column.
Question: Which model is more complex (i.e., is bigger, has more weights)?
Model 1: [400,400,400,400] hidden neurons
Model 2: [500] hidden neurons
4. Tip #1: Understand Model Complexity
Given data: 10 billion rows, 1111 columns; 1110 predictor columns (1100 numerical, 10 categorical with 400 levels each), 1 binary response column.
Model 1: [5100,400,400,400,400,2] neurons, and 5100x400 + 400x400 + 400x400 + 400x400 + 400x2 = 2.52M weights
Model 2: [5100,500,2] neurons, and 5100x500 + 500x2 = 2.55M weights
Answer: Model 2. Why? Categoricals are one-hot expanded into 4000 "dummy" variables (0/1), and hence there are a total of 1100 + 4000 = 5100 input neurons.
Model size depends on features.
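The weight counts above can be checked with a few lines of arithmetic (a sketch, not H2O code; the helper name is made up). For a fully connected feed-forward net, the number of weights is the sum of products of adjacent layer sizes:

```python
def n_weights(layer_sizes):
    """Total number of connections (weights, ignoring biases) in a
    fully connected feed-forward net with the given layer sizes."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# 1100 numerical columns + 10 categoricals one-hot expanded to 400
# dummies each = 5100 input neurons; 2 output neurons (binary response).
model_1 = n_weights([5100, 400, 400, 400, 400, 2])
model_2 = n_weights([5100, 500, 2])
print(model_1, model_2)  # 2520800 2551000: Model 2 is slightly bigger
```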
5. Tip #1: Understand Model Complexity
Given data: 10 billion rows, 1111 columns; 1110 predictor columns (1100 numerical, 10 categorical with 400 levels each), 1 binary response column.
Question: How much memory does H2O Deep Learning need to train such a model (2.5M weights) on 10 billion rows for 100 epochs?
Answer: Model memory usage is small and constant over time: 2.5M x 4 bytes (float) x 3 (adaptive learning rate) = 30MB. More training hopefully yields better weights, but not more weights.
Model size is independent of the number of rows or the training time.
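The memory estimate on this slide is plain arithmetic (the variable names below are made up; the factor 3 is the per-weight state kept for the adaptive learning rate, as stated on the slide):

```python
weights = 2.5e6      # ~2.5M weights (see the previous slide)
bytes_per_float = 4  # single precision
state_factor = 3     # extra per-weight state for the adaptive learning rate
mem_mb = weights * bytes_per_float * state_factor / 1e6
print(mem_mb)  # 30.0 MB, regardless of row count or number of epochs
```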
6. Tip #2: Establish a Baseline on Holdout Data
For a given dataset (with proper holdout sets), do the following first:
• Train default GLM, DRF, and GBM models, and inspect convergence plots, train/test metrics, and variable importances
• Train default DL models, but set the hidden neurons (and maybe epochs):
  - hidden=[200,200] "I Feel Lucky"
  - hidden=[512] "Eagle Eye"
  - hidden=[64,64,64] "Puppy Brain"
  - hidden=[32,32,32,32,32] "Junior Chess Master"
• Develop a feel for the problem and the holdout performance of the different models
Develop a feel.
#3:
Inspect
Models
in
Flow
• Inspect
the
model
in
Flow
during
training:
getModel
‘model_id’
• If
the
model
is
wrong
(wrong
architecture,
response,
parameters,
etc.),
cancel
it
• If
the
model
is
taking
too
long,
cancel
and
decrease
model
complexity
• If
the
model
is
performing
badly,
cancel
and
increase
model
complexity
Cancel
early,
cancel
often
8. Tip #4: Use Early Stopping! (On by default)
Before: the model trains too long, but at least overwrite_with_best_model=true prevents overfitting (it returns the model with the lowest validation error).
Now: specify an additional convergence criterion, e.g. stopping_rounds=5, stopping_metric="MSE", stopping_tolerance=1e-3, to stop as soon as the moving average (length 5) of the validation MSE does not improve by at least 0.1% for 5 consecutive scoring events.
[Figure: training vs. validation error over training time/epochs on the Higgs dataset; overwrite_with_best_model=true returns the best model]
Use Flow to inspect the model. Early stopping saves tons of time.
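The convergence criterion described above can be sketched in pure Python. This is an illustration of the moving-average logic only, not H2O's exact implementation; the function and the sample MSE series are hypothetical:

```python
def should_stop(scores, stopping_rounds=5, stopping_tolerance=1e-3):
    """Sketch of a moving-average stopping rule for a lower-is-better
    metric: stop once the latest length-k moving average is no longer
    at least stopping_tolerance (relative) better than the best
    earlier moving average."""
    k = stopping_rounds
    if len(scores) < 2 * k:
        return False  # not enough scoring events yet

    def mavg(i):  # moving average of the k scores ending at index i
        return sum(scores[i - k + 1 : i + 1]) / k

    best_earlier = min(mavg(i) for i in range(k - 1, len(scores) - 1))
    latest = mavg(len(scores) - 1)
    return latest > best_earlier * (1 - stopping_tolerance)

# A validation MSE series that plateaus: training would be stopped.
mse = [0.30, 0.25, 0.22, 0.20, 0.19, 0.185, 0.185, 0.185, 0.185, 0.185, 0.185]
print(should_stop(mse))
```

A series that keeps improving by more than the tolerance would return False and training would continue.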
9. Tip #5: Control Scoring Overhead
By default, scoring is limited to 10% of the runtime, but if you provide a large validation dataset, then even scoring for the first time can take a long time. Here are options to control this overhead:
• Provide a smaller validation dataset for scoring during training
• Sample the validation dataset with score_validation_samples=N
• If multi-class or imbalanced, set score_validation_sampling="Stratified"
• Score less often with score_duty_cycle=0.05, etc.
The quality of the validation dataset and the frequency of scoring can affect early stopping. Especially if N-fold cross-validation is disabled, try to provide a validation dataset for accurate early stopping.
Validation data determines scoring speed and the early stopping decision.
10. Tip #6: Use N-fold Cross-Validation
Estimate your model performance well.
In H2O, N-fold cross-validation trains N+1 models:
• N models, each with a different 1/N-th of the training data held out as validation data
• 1 model on the full training data
Early stopping and N-fold CV are especially useful together, as the optimal number of epochs is estimated from the N cross-validation models (even if a validation dataset is provided).
• Early stopping automatically tunes the number of epochs
• Model performance is estimated with training holdout data
• The model is built on all of the training data
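The N+1-model scheme can be illustrated with plain index arithmetic (a sketch; H2O assigns fold columns internally, and the function name here is made up):

```python
def nfold_plan(n_rows, n_folds):
    """Return (train_rows, holdout_rows) pairs for N-fold CV, plus a
    final model trained on all rows: N+1 models in total."""
    rows = list(range(n_rows))
    plans = []
    for i in range(n_folds):
        holdout = [r for r in rows if r % n_folds == i]  # 1/N-th held out
        train = [r for r in rows if r % n_folds != i]
        plans.append((train, holdout))  # N cross-validation models
    plans.append((rows, []))            # 1 model on the full training data
    return plans

plans = nfold_plan(10, 5)
print(len(plans))  # 6 models for 5-fold CV
```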
11. Tip #7: Use Regularization
Overfitting is easy; generalization is art.
[Figure: training vs. validation error over training time on the Higgs dataset for two configurations:]
Rectifier, hidden=[50,50]
RectifierWithDropout, hidden=[50,50], l1=1e-4, l2=1e-4, hidden_dropout_ratios=[0.2,0.3]
12. Tip #8: Perform Hyperparameter Search
Just need to find one of the many good models.
There are a lot of hyperparameters for H2O Deep Learning. Many are for expert-level fine control and do not significantly affect model performance.
Rectifier / RectifierWithDropout is recommended for most problems (the only difference is the default values for hidden_dropout_ratios: 0 / 0.5).
Main parameters to tune (I prefer random over grid search) are:
• hidden (try 2 to 5 layers deep, 10 to 2000 neurons per layer)
• hidden_dropout_ratios and input_dropout_ratio
• l1/l2
• adaptive_rate
  - true: rho, epsilon
  - false: rate, rate_annealing, rate_decay, momentum_start, momentum_stable, momentum_ramp
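Random search over the main parameters above can be sketched as follows (pure Python; the layer-count and neuron-count ranges come from the slide, while the dropout and l1/l2 sampling ranges are my own assumptions for illustration):

```python
import random

def sample_params(rng):
    """Draw one random hyperparameter combination: 2-5 layers deep,
    10-2000 neurons per layer (per the slide); dropout and l1/l2
    ranges below are assumed, not from the talk."""
    depth = rng.randint(2, 5)
    return {
        "hidden": [rng.randint(10, 2000) for _ in range(depth)],
        "input_dropout_ratio": rng.uniform(0.0, 0.2),
        "hidden_dropout_ratios": [rng.uniform(0.0, 0.5) for _ in range(depth)],
        "l1": 10 ** rng.uniform(-6, -2),  # log-uniform sampling
        "l2": 10 ** rng.uniform(-6, -2),
    }

rng = random.Random(42)
candidates = [sample_params(rng) for _ in range(20)]
# Train each candidate and keep the best holdout score; finding one of
# the many good models is usually enough.
print(len(candidates))
```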
13. Tip #9: Use Checkpointing
Any H2O Deep Learning model can be stored to disk and then later restarted, with either the same or new training data.
• It's OK to cancel an H2O Deep Learning model at any time during training; it can be restarted
• Store your favorite models to disk for later use
• Continue training from previous model checkpoints (architecture, data types, etc. must match; many parameters are allowed to change)
• Start multiple new models from the same checkpoint, with different regularization and learning rate parameters, to speed up hyperparameter search
Checkpointing enables fast exploration.
14. Tip #10: Tune Communication on Multi-Node
[Figure: nodes/JVMs alternate between computation and communication; each node holds local weights and biases w1..w4, which are averaged into the updated model w* = (w1+w2+w3+w4)/4; the H2O K-V store is a distributed, in-memory, non-blocking hash map; the model can be queried and displayed via JSON/WWW]
Map: training. Each node trains independently, which parallelizes well.
Reduce: model averaging. It can improve accuracy, but takes time (limited to ~5% of time by default).
Know your target_ratio_comm_to_comp value.
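The reduce step in the diagram, w* = (w1+w2+w3+w4)/4, is elementwise model averaging. A minimal sketch (illustrative only; the real system averages distributed weight arrays in the K-V store):

```python
def average_models(weight_vectors):
    """Elementwise average of per-node weight vectors (the reduce
    step): w* = (w1 + w2 + ... + wn) / n."""
    n = len(weight_vectors)
    return [sum(ws) / n for ws in zip(*weight_vectors)]

# Four nodes, each with its independently trained local weights:
w1, w2 = [0.2, -0.4, 1.0], [0.4, 0.0, 0.6]
w3, w4 = [0.0, -0.2, 0.8], [0.2, -0.2, 1.0]
print(average_models([w1, w2, w3, w4]))
```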
15. Tip #11: Know your Data (Sparsity/Categoricals)
If the number of input neurons is >> 1000, then training can be inefficient and slow, especially if there are many sparse features (e.g., categorical predictors) that rarely activate neurons. It might be more efficient to train faster with fewer (dense, rich) predictors. Try the following remedies:
• Build a tiny DL model and use h2o.deepfeatures() to extract lower-dimensional features
• Ignore categorical columns with high factor counts
• Use a random projection of all categorical features into N-dimensional space by setting max_categorical_features=N
• Use GLRM to reduce the dimensionality of the dataset (and make sure to apply the same GLRM model to the test set)
• If the dataset is simply wide (few or no categorical features), then the chance of it being sparse is high. If so, set sparse=true, which can result in faster training.
Know your data.
16. Tip #12: Advanced Math
For regression problems, it might make sense to apply different loss functions and distributions:
• Gaussian distribution, squared error loss: sensitive to outliers
• Laplace distribution, absolute error loss: more robust to outliers
• Huber loss: a combination of squared error and absolute error
• Poisson distribution (e.g., number of claims in a time period)
• Gamma distribution (e.g., size of insurance claims)
• Tweedie distribution (compound Poisson-Gamma)
Also, H2O Deep Learning supports:
• Offsets (per-row)
• Observation weights (per-row)
Know your math.
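The Huber loss mentioned above blends the two behaviors: quadratic (Gaussian-like) for small residuals and linear (Laplace-like) for large ones. A sketch, with the usual transition point delta as an assumed parameter:

```python
def huber_loss(residual, delta=1.0):
    """Squared error for |residual| <= delta; beyond that, absolute
    error shifted so the loss stays continuous and smooth. This makes
    it far less sensitive to outliers than pure squared error."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)

print(huber_loss(0.5))   # 0.125: quadratic regime
print(huber_loss(10.0))  # 9.5: linear regime (vs. 50.0 for 0.5*r^2)
```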
17. Tip #13: Do Ensembling
To obtain the highest accuracy, it is often more effective to average a few fast and diverse models than to build one large model. Ideally, use different network architectures and regularization schemes to keep the variance high before ensembling.
Given a set of (diverse, and ideally good or great) models, use:
• Blending (just average the models)
• Gut-feel blending (assign your own weights that add up to 1)
• SuperLearner (optimized stacking with a meta-learner, see Erin's talk)
• Add other models into the mix (GBM, DRF, GLM, etc.)
Ensembles rule the leaderboards.
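Blending and gut-feel blending are just (weighted) averages of the models' predictions; a minimal sketch with made-up prediction vectors:

```python
def blend(predictions, weights=None):
    """Weighted average of per-model prediction vectors. With no
    weights this is plain blending; with custom weights that add up
    to 1 it is 'gut-feel' blending."""
    n = len(predictions)
    if weights is None:
        weights = [1.0 / n] * n
    assert abs(sum(weights) - 1.0) < 1e-9  # weights must add up to 1
    return [sum(w * p[i] for w, p in zip(weights, predictions))
            for i in range(len(predictions[0]))]

# Hypothetical per-row scores from three diverse models:
dl, gbm, drf = [0.9, 0.1], [0.8, 0.3], [0.7, 0.2]
print(blend([dl, gbm, drf]))                    # plain average
print(blend([dl, gbm, drf], [0.5, 0.3, 0.2]))  # gut-feel weights
```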
18. Summary: Checklist for Success with H2O Deep Learning
• Understand Model Complexity
• Establish a Baseline on Holdout Data
• Inspect Models in Flow
• Use Early Stopping
• Control Scoring Overhead
• Use N-fold Cross-Validation
• Use Regularization
• Perform Hyperparameter Search
• Use Checkpointing
• Tune Communication on Multi-Node
• Know your Data (Sparsity/Categoricals)
• Know your Math
• Do Ensembling