Crab - A Python Framework for Building Recommendation Systems by Marcel Caraciolo
Keynote introducing the Crab framework: a Python toolkit for building recommendation engines. It is an open source project offered as an alternative to Mahout Taste for Python developers.
Presented at XII Python User Group Pernambuco, 07-05-2011 at CIN/UFPE.
Benchy: Lightweight framework for Performance Benchmarks by Marcel Caraciolo
Benchy: Lightweight framework for Performance Benchmarks on Python Scripts.
Presented at XXVI Pernambuco Python User Group Meeting at Recife, Pernambuco, Brazil on 06.04.2013
Microsoft Research's Leslie Lamport at Build2014 - Thinking for Programmers by Ishit Makwana
Leslie Lamport, inventor of Paxos and developer of LaTeX, introduces techniques and tools that help programmers think above the code level to determine what applications and services should do and ensure that they do it. Depending on the task, the appropriate tools can range from simple prose to formal, tool-checked models written in TLA+ or PlusCal.
Talk on Scientific Computing with Python, SciPy and NumPy, given at the XVI Meeting of the Pernambuco Python User Group, Recife, Pernambuco, 03/09/2011, by Marcel Pinheiro Caraciolo
High Performance Predictive Analytics in R and Hadoop by DataWorks Summit
Hadoop is rapidly being adopted as a major platform for storing and managing massive amounts of data, and for computing descriptive and query types of analytics on that data. However, it has a reputation for not being a suitable environment for high performance complex iterative algorithms such as logistic regression, generalized linear models, and decision trees. At Revolution Analytics we think that reputation is unjustified, and in this talk I discuss the approach we have taken to porting our suite of High Performance Analytics algorithms to run natively and efficiently in Hadoop. Our algorithms are written in C++ and R, and are based on a platform that automatically and efficiently parallelizes a broad class of algorithms called Parallel External Memory Algorithms (PEMA’s). This platform abstracts both the inter-process communication layer and the data source layer, so that the algorithms can work in almost any environment in which messages can be passed among processes and with almost any data source. MPI and RPC are two traditional ways to send messages, but messages can also be passed using files, as in Hadoop. I describe how we use the file-based communication choreographed by MapReduce and how we efficiently access data stored in HDFS.
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ... by Jimmy Lai
Big data analysis relies on exploiting various handy tools to gain insight from data easily. In this talk, the speaker demonstrates a data mining flow for text classification using many Python tools. The flow consists of feature extraction/selection, model training/tuning and evaluation. Various tools are used in the flow, including Pandas for feature processing, scikit-learn for classification, IPython Notebook for fast sketching, and matplotlib for visualization.
Emotion detection from text using data mining and text mining by Sakthi Dasans
Based on a research paper published by the Faculty of Engineering, The University of Tokushima, at IEEE 2007, we built an intelligent system, titled Emotelligence on Text, to recognize human emotion from textual content.
That is, given an input string, the system attempts to identify the emotion behind that text.
What is Python? An overview of Python for science. By Nicholas Pringle
A brief introduction to the use of Python for scientists. Python is fast becoming a popular programming language for scientists. It is free, open source and constantly improving. Being an easy language to learn, it has a large community of users. Its many favourable qualities make it well suited to scientific collaboration.
While academic research is more and more focusing on integration of deep learning approaches for machine translation, also called Neural Machine Translation, and shows promising and exciting results – the resulting systems still have important pragmatic limitations compared to the current generation of translation engine. We will be discussing how SYSTRAN is integrating these new techniques into production systems, the results and benefits for the end users, and our vision for the next versions.
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT... by MLconf
Convolutional Neural Networks at scale in Spark MLlib:
Jeremy Nixon will focus on the engineering and applications of a new algorithm built on top of MLlib. The presentation will focus on the methods the algorithm uses to automatically generate features to capture nonlinear structure in data, as well as the process by which it’s trained. Major aspects of that include compositional transformations over the data, convolution, and distributed backpropagation via SGD with adaptive gradients and an adaptive learning rate. Applications will look into how to use convolutional neural networks to model data in computer vision, natural language and signal processing. Details around optimal preprocessing, the type of structure that can be learned, and managing its ability to generalize will inform developers looking to apply nonlinear modeling tools to problems that they face.
Presentation on Functional Programming in a for-profit company using OCaml, presented on December 14th 2011 at Ghent University (UGent).
Due to the Haskell background of the attendees, OCaml was introduced with this in mind.
Outlines the vision and philosophy for Wakari.io with a basic overview of popular python data analysis packages. Most of the talk is conducted in Wakari and is not visible on these slides. 90 minutes for PyData NYC, November 8th 2013.
The next Raspberry Pi programming day at CERN is on the 5 October 2013. There will be live demonstrations of projects by members of CERN and guest speakers from Google and Ibisense. There will also be a walk-in tutorial session throughout the day, to encourage programming and for more questions. The day will be in French and English and cater for all levels of ability, from beginners to experts. The agenda includes introductory material, as well as more advanced topics. More project ideas and extension boards will be available during the tutorial session. There will be programming examples using Scratch, Python and C, and demonstrations of interfacing using breadboards, general purpose expansion boards (Gertboard, PiFace), FPGAs, and the PIC microcontroller. The day will also feature the Raspberry Pi camera module and an USB controlled robotic arm. If you have your own Raspberry Pi or would like to find out more, please sign up. We would like to hear about your project ideas and educational uses of the Raspberry Pi too.
http://home.cern/students-educators/updates/2013/09/sign-now-raspberry-pi-day-cern
For fun and profit. I gave this talk on Mar 16, 2016 for Python Project Night at Northwestern University. Huge credit goes to Brian Lange - I reused a bunch of his slides.
Deep Learning Frameworks 2019 | Which Deep Learning Framework To Use | Deep L... by Simplilearn
Deep Learning covers all the essential Deep Learning frameworks that are necessary to build AI models. In this presentation, you will learn about the development of essential frameworks such as TensorFlow, Keras, PyTorch, Theano, etc. You will also understand the programming languages used to build the frameworks, the different companies that use these frameworks, the characteristics of these Deep Learning frameworks, and type of models that were built using these frameworks. Now, let us get started with understanding the different popular Deep Learning frameworks being used in industries.
Below are the different Deep Learning frameworks we'll be discussing in this presentation:
1. TensorFlow
2. Keras
3. PyTorch
4. Theano
5. Deep Learning 4 Java
6. Caffe
7. Chainer
8. Microsoft CNTK
Why Deep Learning?
It is one of the most popular software platforms used for deep learning and contains powerful tools to help you build and implement artificial neural networks.
Advancements in deep learning are being seen in smartphone applications, creating efficiencies in the power grid, driving advancements in healthcare, improving agricultural yields, and helping us find solutions to climate change. With this Tensorflow course, you’ll build expertise in deep learning models, learn to operate TensorFlow to manage neural networks and interpret the results.
And according to payscale.com, the median salary for engineers with deep learning skills tops $120,000 per year.
You can gain in-depth knowledge of Deep Learning by taking our Deep Learning certification training course. With Simplilearn’s Deep Learning course, you will prepare for a career as a Deep Learning engineer as you master concepts and techniques including supervised and unsupervised learning, mathematical and heuristic aspects, and hands-on modeling to develop algorithms. Those who complete the course will be able to:
1. Understand the concepts of TensorFlow, its main functions, operations, and the execution pipeline
2. Implement deep learning algorithms, understand neural networks and traverse the layers of data abstraction which will empower you to understand data like never before
3. Master and comprehend advanced topics such as convolutional neural networks, recurrent neural networks, training deep networks and high-level interfaces
4. Build deep learning models in TensorFlow and interpret the results
5. Understand the language and fundamental concepts of artificial neural networks
6. Troubleshoot and improve deep learning models
7. Build your own deep learning project
8. Differentiate between machine learning, deep learning, and artificial intelligence
Learn more at https://www.simplilearn.com/deep-learning-course-with-tensorflow-training
In this tutorial I used basic Python to present the concepts behind building a recommender system based on collaborative filtering.
Mutirão PyCursos:
Video at: https://plus.google.com/u/0/events/c3hqbk20omt3r5uoq13gpk82i9g
New Trends in Distance Education: How to Reinvent Education? By Marcel Caraciolo
Presentation given at the Talk a Bit conference in June 2012 and at PET 2012 by Marcel Caraciolo.
Universidade Federal de Pernambuco, 2012
Class on building web crawlers using regular expressions and Python
Instructor: Marcel Caraciolo
More information about the rest of the course at:
http://www.pycursos.com/regex
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... by SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf by Paige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring and observability to ops, infra and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
UiPath Test Automation using UiPath Test Suite series, part 6 by DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Communications Mining Series - Zero to Hero - Session 1 by DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... by James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs by Alex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Climate Impact of Software Testing at Nordic Testing Days by Kari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at a smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
GraphRAG is All You need? LLM & Knowledge Graph by Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 by Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0! by SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Essentials of Automations: The Art of Triggers and Actions in FME by Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
2. Who am I? Marcel Pinheiro Caraciolo
Brazilian, lover of crabs
Director of R&D at the Brazilian startup Orygens
M.Sc. candidate in Data Mining and Recommender Systems
Current moderator of the local Python User Group in Pernambuco
Interested in machine learning, recommender systems and mobile computing
Blogging about machine learning with Python since 2008
http://aimotion.blogspot.com
Young apprentice with Python programming since 2008.
12. Playing with the text...
The most frequent words at the conference
nltk, re
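The slide names nltk and re but does not show the code. A minimal sketch of the word-frequency idea, assuming only the standard library (the stopword list and the talk titles below are hypothetical placeholders, not the conference data):

```python
import re
from collections import Counter

# Hypothetical stopword list; nltk.corpus.stopwords would be a fuller alternative.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def most_frequent_words(texts, n=5):
    """Return the n most common non-stopword tokens across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep runs of ASCII letters as tokens.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical talk titles standing in for the conference abstracts.
titles = [
    "Parallel computing with Python",
    "Data analysis with NumPy and Python",
    "Visualization of data in Python",
]
top_words = most_frequent_words(titles, n=2)
print(top_words)
```

With nltk installed, nltk.FreqDist could replace Counter here; the counting logic stays the same.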
13. But let’s take a deeper look.
I used the clustering algorithm K-Means
Tool used for visualization: Ubigraph
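The slide does not show how K-Means was applied. As a rough sketch of the algorithm itself (Lloyd's iteration), assuming each lecture has already been turned into a numeric feature vector; the vectors, the seeding rule and the function name are illustrative assumptions, and the Ubigraph visualization step is not covered:

```python
def kmeans(points, k, iterations=20):
    """Cluster points (tuples of floats) with Lloyd's k-means algorithm."""
    # Assumption: the first k points seed the centroids, which keeps the run deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical 2-D feature vectors, e.g. counts of two keywords per lecture.
vectors = [(1.0, 1.0), (5.0, 5.0), (1.0, 2.0), (5.0, 6.0)]
labels, centroids = kmeans(vectors, k=2)
print(labels, centroids)
```

In practice a library implementation with better seeding (for example k-means++) would usually be preferred; the plain version above is only meant to show the assignment and update steps.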
14. Distribution of the Lectures
Basic Frameworks
matplotlib, ipython
Building frameworks
performance, models, web services
Parallelism
performance, gpu, statistical
Visualization
Numpy data analysis, statistical
toolkits using Numpy
15. To sum up...
Mining English text is so much easier!
Submit your work too!
Spread scientific Python throughout the community
I expect to be back at SciPy next year!