Slides from my talk at Open Data Science Conference 2016.
Algorithms and models are an important (and cool) part of data science. This talk is about all the other steps that it takes to deploy a data science project that makes a product slightly smarter. Stuff that you hear from practitioners, but is not covered well enough in books.
Intro to Data Science for Non-Data Scientists - Sri Ambati
Erin LeDell and Chen Huang's presentations from the Intro to Data Science for Non-Data Scientists Meetup at H2O HQ on 08.20.15
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Slides from my presentation at the Data Intelligence conference in Washington DC (6/23/2017). See this link for the abstract: http://www.data-intelligence.ai/presentations/36
Agile Data Science is a lean methodology adopted from Agile Software Development. At its core it centers on people, interactions, and building minimum viable products that ship fast and often in order to solicit customer feedback. In this presentation, I describe how this has worked in practice, with examples. Get started today with our help by visiting http://www.alpinenow.com
Curious about Data Science? Self-taught on some aspects, but missing the big picture? Well, you’ve got to start somewhere and this session is the place to do it.
This session will cover, at a layman’s level, some of the basic concepts of Data Science. In a conversational format, we will discuss: What are the differences between Big Data and Data Science – and why aren’t they the same thing? What distinguishes descriptive, predictive, and prescriptive analytics? What purpose do predictive models serve in a practical context? What kinds of models are there and what do they tell us? What is the difference between supervised and unsupervised learning? What are some common pitfalls that turn good ideas into bad science?
During this session, attendees will learn the difference between k-nearest neighbor and k-means clustering, understand why we normalize data and guard against overfitting, and grasp the meaning of No Free Lunch.
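The k-nearest-neighbor / k-means distinction the session highlights fits in a few lines of plain Python. The sketch below is illustrative only, not the session's own material: k-NN is supervised (it needs labeled points), while k-means is unsupervised (it only needs the points themselves).

```python
from collections import Counter

def knn_predict(points, labels, query, k=3):
    """Supervised: classify `query` by majority vote of its k nearest labeled points."""
    nearest = sorted(range(len(points)), key=lambda i: abs(points[i] - query))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

def kmeans_1d(points, centers, steps=10):
    """Unsupervised: repeatedly move each center to the mean of its assigned points."""
    for _ in range(steps):
        clusters = {c: [] for c in centers}
        for p in points:
            closest = min(centers, key=lambda c: abs(p - c))
            clusters[closest].append(p)
        centers = [sum(ps) / len(ps) if ps else c for c, ps in clusters.items()]
    return sorted(centers)

points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["low", "low", "low", "high", "high", "high"]
print(knn_predict(points, labels, 1.1))   # -> "low" (uses the labels)
print(kmeans_1d(points, [0.0, 10.0]))     # two centers, near 1.0 and 5.07 (ignores the labels)
```

Note the asymmetry: removing the `labels` list breaks `knn_predict` but leaves `kmeans_1d` untouched, which is exactly the supervised/unsupervised divide.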
A Practical-ish Introduction to Data Science - Mark West
In this talk I will share insights and knowledge that I have gained from building up a Data Science department from scratch. This talk will be split into three sections:
1. I'll begin by defining what Data Science is, how it is related to Machine Learning and share some tips for introducing Data Science to your organisation.
2. Next up we'll run through some Machine Learning algorithms commonly used by Data Scientists, along with example use cases where these algorithms can be applied.
3. The final third of the talk will be a demonstration of how you can quickly get started with Data Science and Machine Learning using Python and the Open Source scikit-learn Library.
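For readers who want a feel for the scikit-learn workflow mentioned in part 3: scikit-learn models share a uniform `fit(X, y)` / `predict(X)` interface. The stand-in below is plain Python, not actual scikit-learn code; it mimics that pattern with a simple nearest-centroid classifier invented for illustration.

```python
class NearestCentroid:
    """Minimal classifier mimicking scikit-learn's fit/predict interface."""

    def fit(self, X, y):
        # Store one centroid (feature-wise mean) per class label.
        self.centroids_ = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            self.centroids_[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self  # scikit-learn convention: fit returns self

    def predict(self, X):
        def dist2(a, b):
            return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return [min(self.centroids_, key=lambda lab: dist2(x, self.centroids_[lab]))
                for x in X]

X = [[0, 0], [0, 1], [5, 5], [6, 5]]
y = ["a", "a", "b", "b"]
model = NearestCentroid().fit(X, y)
print(model.predict([[0.5, 0.5], [5.5, 5.0]]))  # -> ['a', 'b']
```

Because real scikit-learn estimators follow the same two-method contract, swapping this toy class for, say, a decision tree changes one line of calling code, which is much of what makes the library beginner-friendly.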
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne... - Edureka!
This Edureka Data Science course slides will take you through the basics of Data Science - why Data Science, what is Data Science, use cases, BI vs Data Science, Data Science tools and Data Science lifecycle process. This is ideal for beginners to get started with learning data science.
You can read the blog here: https://goo.gl/OoDCxz
You can also take a complete structured training, check out the details here: https://goo.gl/AfxwBc
A Hybrid Approach to Data Science Project Management - Elaine K. Lee
A talk about how Civis Analytics, a data science consultancy and software company, does project management using a blend of approaches from academia, consulting, and software engineering.
A presentation delivered by Mohammed Barakat on the 2nd Jordanian Continuous Improvement Open Day in Amman. The presentation is about Data Science and was delivered on 3rd October 2015.
Data Scientist: The Sexiest Job of the 21st Century - Frank Kienle
Invited talk describing the exciting work at Blue Yonder (www.blue-yonder.com), given at the congress 'smart services - new business models' in Aachen, Germany, 2015.
Two-hour lecture I gave at the Jyväskylä Summer School. The purpose of the talk is to give a quick non-technical overview of concepts and methodologies in data science. Topics include a wide overview of both pattern mining and machine learning.
See also Part 2 of the lecture: Industrial Data Science. You can find it in my profile (click the face)
In this presentation, we take a look at what Data Science is and its applications, covering the most common Data Science use cases.
I presented this at the LSPE-IN meetup held on 10th March 2018 at Walmart Global Technology Services.
Introduction to Data Science and Analytics - Srinath Perera
This webinar serves as an introduction to WSO2 Summer School. It will discuss how to build an analytics pipeline for your organization and for each use case, and the technology and tooling choices that need to be made along the way.
This session will explore analytics under four themes:
Hindsight (what happened)
Oversight (what is happening)
Insight (why is it happening)
Foresight (what will happen)
Recording http://t.co/WcMFEAJHok
Is Agile Data Science just two buzzwords put together? I argue that agile is a very practical and applicable methodology that works well in the real world for all sorts of Analytics and Data Science workflows.
http://theinnovationenterprise.com/summits/digital-web-analytics-summit-london-2015/schedule
SAP FORUM 2016 - CAPGEMINI COLOMBIA - DIGITAL TRANSFORMATION - José Antonio Lorenzo
This session was held at SAP Forum 2016 in Bogotá, where Capgemini Colombia's CEO and an SAP Solution Architect explained the challenges companies face in becoming Digital Leaders and how the SAP portfolio can help them transform their digital capabilities.
Frederik's passion project at the Metis Data Science Bootcamp in NYC, Jan-April 2015. Frederik built an end-to-end loan funding predictor/classifier for the micro-finance platform Kiva.org.
CRISP-DM: Data Mining e Modelos Preditivos - Leandro Guerra
CRISP-DM: Data Mining and Predictive Models. We follow the CRISP-DM methodology to develop a decision tree applied to a problem from the Kaggle website.
How to cover the whole Translation Project Workflow with one open-source syst... - Qabiria
1st ProZ.com Europe International conference - Rome 2011 - Presentation by Marco Cevoli (Qabiria).
Pros and cons of an open-source project management system specific to the language services industry: ]project-open[
Five Awesome Django Tutorials - Open Data Science - opendatascience
Here is a PPT of five awesome Django tutorials by Jason O'Rawe, an ODSC data science team contributor.
Dan Mallinger, Data Science Practice Manager, Think Big Analytics at MLconf NYC - MLconf
Despite a wide array of advanced techniques available today, too many practitioners are forced to return to their old toolkit of approaches deemed “more interpretable.” Whether because of non-legal policy or difficulty in executive presentation, these constraints result from poor analytics communication and an inability to explain model risks and outcomes, not a failing of the techniques.
From sampling to feature reduction to supervised modeling, the toolbox and communications of data scientists are limited by these constraints. But, instead of simplifying models, data scientists can re-introduce often ignored statistical practices to describe the models, their risk, and the impact of changes in the customer environment.
Even in situations without restrictions, these approaches will improve how practitioners select models and communicate results. Through measurement and simulation, reviewed approaches can be used to articulate the promises, risks, and assumptions of developed models, without requiring deep statistical explanations.
Webinar - The Science of Segmentation: What Questions You Should be Asking Yo... - VMware Tanzu
Enterprise companies starting the transformation into a data-driven organization often wonder where to start. Companies have traditionally collected large amounts of data from sources such as operational systems. With the rise of big data, big data technologies and the Internet of Things (IoT), additional sources – such as sensor readings and social media posts – are rapidly becoming available. In order to effectively utilize both traditional sources and new ones, companies first need to join and view the data in a holistic context. After establishing a data lake to bring all data sources together in a single analytics environment, one of the first data science projects worth exploring is segmentation, which automatically identifies patterns.
In this DSC webinar, two Pivotal data scientists will discuss:
· What segmentation is
· Traditional approaches to segmentation
· How big data technologies are enabling advances in this field
They will also share some stories from past data science engagements, outline best practices and discuss the kinds of insights that can be derived from a big data approach to segmentation using both internal and external data sources.
Panelists:
Grace Gee, Data Scientist -- Pivotal
Jarrod Vawdrey, Data Scientist -- Pivotal
Hosted by:
Tim Matteson, Co-Founder -- Data Science Central
To learn more about data at Pivotal, visit http://www.pivotal.io/big-data
To view video, visit https://www.youtube.com/watch?v=svKLdMWusGA
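One practical detail behind segmentation projects like the webinar above: distance-based methods are dominated by whichever feature has the largest numeric range, so inputs are typically normalized first. A minimal min-max scaling sketch, with made-up customer numbers rather than anything from the webinar:

```python
def min_max_scale(rows):
    """Scale each feature column to [0, 1] so no single column dominates
    distance-based segmentation (e.g. k-means)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h != l else 0.0
             for v, l, h in zip(row, lo, hi)]
            for row in rows]

# Hypothetical customer features: [annual spend in dollars, store visits per month].
# Without scaling, spend (thousands) would swamp visits (single digits).
customers = [[12000, 2], [800, 9], [20000, 1], [1500, 8]]
scaled = min_max_scale(customers)
print(scaled[0])  # both features now lie in [0, 1] and contribute comparably
```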
What Data Science is, applications of machine learning, and how to start a career in this exciting field. As presented at a session of the Delaware Tech Meetup.
The Sky’s the Limit – The Rise of Machine Learning - Inside Analysis
The Briefing Room with Analyst Dr. Robin Bloor and SkyTree
Live Webcast on June 24, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=1da2b498fc39b8b331a5bbb8dea2660f
With data growing more complex these days, many organizations are looking for ways to make sense of new information sources. The goal? Sprint ahead of the competition by exploiting fast-moving opportunities. The challenge? The data volumes, variety and velocity call for significantly greater horsepower than ever before. That’s where machine learning comes into play, and it’s already fundamentally changing the Big Data Analytics landscape.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor as he explains how advanced analytics technology can transform the enterprise. He’ll be briefed by Martin Hack, CEO of Skytree, who will tout his company’s machine learning solution for big data. Hack will discuss the critical challenges facing today’s data professionals, and present use cases to show how machine learning can help organizations leverage big data as a capital asset. He’ll specifically address the power of predictive analytics, which can help companies seize opportunities and prevent serious problems.
Visit InsideAnalysis.com for more information.
The beginning of this talk is applicable to a wide audience. Only the last step is JavaScript specific.
Presentation given to UtahJS Meetup (https://utahjs.com/) on July 7, 2016. Introduction of LEAP framework and how to use AWS to create a predictive model and consume prediction with the AWS JavaScript SDK.
How to Apply Machine Learning with R, H2O, Apache Spark MLlib or PMML to Real... - Kai Wähner
"Big Data" is currently a big hype. Large amounts of historical data are stored in Hadoop or other platforms. Business Intelligence tools and statistical computing are used to draw new knowledge and to find patterns from this data, for example for promotions, cross-selling or fraud detection. The key challenge is how these findings can be integrated from historical data into new transactions in real time to make customers happy, increase revenue or prevent fraud.
"Fast Data" via stream processing is the solution to embed patterns - which were obtained from analyzing historical data - into future transactions in real-time. This session uses several real world success stories to explain the concepts behind stream processing and its relation to Hadoop and other big data platforms. The session discusses how patterns and statistical models of R, Spark MLlib and other technologies can be integrated into real-time processing using open source frameworks (such as Apache Storm, Spark or Flink) or products (such as IBM InfoSphere Streams or TIBCO StreamBase). A live demo shows the complete development lifecycle combining analytics, machine learning and stream processing.
Shared at "Data-Driven Design for User Experience" with Le Wagon Tokyo, 25 Aug
https://www.meetup.com/ja-JP/Le-Wagon-Tokyo-Coding-Station/events/280067831/
In UX design, data means the voice of users (customers) and actionable insights that go beyond just numbers. Hearing these voices through user research and usage analytics is a critical part of building a human-centric design. Based on data-driven design, UX designers, product managers, and even senior management can listen to the inner voice of users and extrapolate from it to discover a user journey with clear calls-to-action and unwavering customer loyalty.
At this webinar, our guest speaker Emi Kwon, UX Design Director at Metlife, will walk you through the basics of data-driven design as well as share some tips and tricks for making data-driven design your value proposition as a product manager/ UX specialist.
Agenda:
✔️ Data ecosystem — Data lake, data warehouse…what does it mean for UX?
✔️ Small data and big data — the opportunities and pitfalls
✔️ Research method basics — qualitative, quantitative or triangulated
✔️ Usage analytics and A/B testing
✔️ What about COVID-19 and remote usability testing?
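The A/B-testing bullet above ultimately comes down to comparing two conversion rates. A minimal two-proportion z-test in plain Python, with made-up numbers rather than anything from the webinar:

```python
from math import sqrt

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical experiment: variant A converts 120 of 2400 visitors (5.0%),
# variant B converts 165 of 2400 (6.875%).
z = two_proportion_z(120, 2400, 165, 2400)
print(round(z, 2))  # |z| > 1.96 would suggest significance at the 5% level
```

This is the quantitative half of the story; the qualitative research methods in the agenda explain *why* variant B converts better, which a z statistic alone cannot.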
TechWise with Eric Kavanagh, Dr. Robin Bloor and Dr. Kirk Borne
Live Webcast on July 23, 2014
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=59d50a520542ee7ed00a0c38e8319b54
Analytical applications are everywhere these days, and for good reason. Organizations large and small are using analytics to better understand any aspect of their business: customers, processes, behaviors, even competitors. There are several critical success factors for using analytics effectively: 1) know which kind of apps make sense for your company; 2) figure out which data sets you can use, both internal and external; 3) determine optimal roles and responsibilities for your team; 4) identify where you need help, either by hiring new employees or using consultants; 5) manage your program effectively over time.
Register for this episode of TechWise to learn from two of the most experienced analysts in the business: Dr. Robin Bloor, Chief Analyst of The Bloor Group, and Dr. Kirk Borne, Data Scientist, George Mason University. Each will provide their perspective on how companies can address each of the key success factors in building, refining and using analytics to improve their business. There will then be an extensive Q&A session in which attendees can ask detailed questions of our experts and get answers in real time. Registrants will also receive a consolidated deck of slides, not just from the main presenters, but also from a variety of software vendors who provide targeted solutions.
Visit InsideAnalysis.com for more information.
Forces and Threats in a Data Warehouse (and why metadata and architecture is ... - Stefan Urbanek
This keynote looks at some very common forces and threats that cause suffering in a data warehouse, shows examples of why the concepts are still relevant despite all the high-end technology available, and provides suggestions for starting with architecture and metadata.
The What, Why and How of Analytics Testing - Anand Bagmar
Here are slides from my talk on "What, Why and How of Analytics Testing" at Selenium Conference, Berlin 2017.
This talk focuses on analytics related to browser and mobile native apps, and how they interact with IoT (Internet of Things) and Big Data.
See my blog for more details - https://essenceoftesting.blogspot.com/2017/10/analytics-forgotten-child.html
Practical Strategies for Targeting the Fortune 1000 - BAO Inc.
In this on-demand webinar, Jim Higgins discusses what's happening with tech adoption and purchasing in the Fortune 1000 - and how you can apply these insights to intelligently tackle the target accounts that matter to your business.
Predictive Asset Optimization - Advanced Analytics - Leonard Lee
IBM's solution offering for predictive analytics can help companies improve the management and maintenance of their assets as well as their customer installed base.
In this webinar hosted by DeepCrawl, we take a look at how clickstream data - from the SERPs through to checkout - can be analyzed to form predictions around the optimal customer journey. We dig into how the predictions can be utilized to optimize sites and apps to more fluidly guide customers from point of entry to conversion. We also review how understanding your crawl budget and the factors that impact which of your site's pages are indexed are all critical to creating a valid model to optimize your content for increased conversions.
Webinar: Everyone cares about sample quality but not everyone values it! - Matt Dusig
On December 7, 2016, Mark Menig, Chief Executive Officer of TrueSample and Lisa Wilding-Brown, Chief Research Officer of Innovate MR explored various strategies to help research professionals navigate the challenging landscape of online sample quality. The webinar addressed:
• A brief overview of quality through the years. Where have we been and where are we going?
• What are current examples of online sample fraud (i.e., bots, hijackers, foreign click shops etc.)?
• What are the challenges and costs associated with today’s online fraud? How does online fraud impact data quality, specifically B2B research?
• What technical and behavioral strategies help to protect online research?
Humans are sentient: we perceive, we feel, we listen. The problem is that the more people you put together, the more an organization loses these capabilities, and the slower it gets. The idea is to create a company that acts like a single organism, one that identifies opportunities and can work in a faster, exponential world where development happens in months rather than years. Don't let digital transformation become a war of competitive attrition; you may need to invest in your future to change the game.
Designing for Privacy in Amazon Web Services - KrzysztofKkol1
Data privacy is one of the most critical issues that businesses face. This presentation shares insights on the principles and best practices for ensuring the resilience and security of your workload.
Drawing on a real-life project from the HR industry, the presentation demonstrates the various challenges: data protection, self-healing, business continuity, security, and transparency of data processing. This systematized approach allowed us to create a secure AWS cloud infrastructure that not only met strict compliance rules but also exceeded the client's expectations.
Unleash Unlimited Potential with a One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and a lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll know how to organize and improve your code review process.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv... - Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
Quarkus Hidden and Forbidden Extensions - Max Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce? - XfilesPro
Worried about document security when sharing documents in Salesforce? Fret no more! Here are the top-notch security standards XfilesPro upholds to keep your Salesforce documents secure when sharing with internal or external people.
To learn more, read the blog: https://www.xfilespro.com/how-does-xfilespro-make-document-sharing-secure-and-seamless-in-salesforce/
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam - takuyayamamoto1800
In these slides, we show a simulation example and how to compile the solver.
The plain Helmholtz equation can be solved with helmholtzFoam, while the Helmholtz equation with uniformly dispersed bubbles can be simulated with helmholtzBubbleFoam.
Field Employee Tracking System | MiTrack App | Best Employee Tracking Solution | ... - informapgpstrackings
Keep tabs on your field staff effortlessly with Informap Technology Centre LLC. Real-time tracking, task assignment, and smart features for efficient management. Request a live demo today!
For more details, visit us : https://informapuae.com/field-staff-tracking/
Enhancing Research Orchestration Capabilities at ORNL - Globus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart... - Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data, and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on-demand, capable of applying many data reduction and data analysis to the large ESGF data archives, transferring only the resultant analysis (ex. visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Understanding Globus Data Transfers with NetSage - Globus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks world wide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for Flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
Accelerate Enterprise Software Engineering with Platformless - WSO2
Key takeaways:
Challenges of building platforms and the benefits of platformless.
Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
How Choreo enables the platformless experience.
How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
Demo of an end-to-end app built and deployed on Choreo.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
Modern design is crucial in today's digital environment, and this is especially true for SharePoint intranets. The design of these digital hubs is critical to user engagement and productivity enhancement. They are the cornerstone of internal collaboration and interaction within enterprises.
Software Engineering, Software Consulting, Tech Lead.
Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security,
Spring Transaction, Spring MVC,
Log4j, REST/SOAP WEB-SERVICES.
Large Language Models and the End of ProgrammingMatt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
Strategies for Successful Data Migration Tools.pptxvarshanayak241
Data migration is a complex but essential task for organizations aiming to modernize their IT infrastructure and leverage new technologies. By understanding common challenges and implementing these strategies, businesses can achieve a successful migration with minimal disruption. Data Migration Tool like Ask On Data play a pivotal role in this journey, offering features that streamline the process, ensure data integrity, and maintain security. With the right approach and tools, organizations can turn the challenge of data migration into an opportunity for growth and innovation.
LABELING - HOW TO PHRASE THE QUESTION?
∎ “Would you be able to resolve this ticket successfully?”
∎ “Would an expert user be able to resolve this ticket successfully?”
∎ “Would an expert user be able to resolve this ticket successfully without getting a negative rating?”
There are 2 big areas of data science - A for “analyze” and B for “build”.
A is product development informed by data. It has become pretty widely adopted by now. Having analytics, running A/B tests, and doing cohort and funnel analysis have become part of the product management culture.
The “build” kind of data science is about building smarts into the product itself and this is the kind I want to talk about.
Implementing some of this requires machine learning and it is important for product managers to understand the level of complexity of some techniques that apply to their products. However, when machine learning is discussed, too much emphasis is put on the algorithms.
More needs to be said about how a smart product gains humans’ trust and makes them feel good about using it.
An app that allows you to pay for parking. You fire it up, it shows 3 choices - start a new parking session, see your old sessions.
Choose “Start a new session”, go to next screen, there are several options here - select a parking zone. Done.
I would not give this a second thought on a desktop.
But when I use this app, I’m late, holding the phone in one hand, trying to pay for parking while running to the ferry.
I am running and fumbling with the phone and thinking - DON’T YOU KNOW ME?!
It’s a weekday morning, I am at the parking lot next to the ferry terminal, you have seen me here before. More than once.
Just give me one button - PAY NOW. And a small link to all the other features.
Every time a user has this “DON’T YOU KNOW ME?!” moment, it is an opportunity to make a product just a little bit smarter.
Smart products convert DONT YOU KNOW ME?! into YOU GET ME!
Even when they don’t know my next step exactly, they reduce the search space.
Smarter products - new problems.
Complexity goes way beyond the algorithms.
Take the Nest smart thermostat - great visual design, easy to install, and it is powered by machine learning that learns your preferences. It’s a good product, but even they can’t get it quite right.
Got it just when we had our baby.
We both like it pretty cool, but my wife felt cold after giving birth. This is just when Nest was learning.
Once it did learn, for some reason it was very tough for it to adjust.
Another thing - when it turns the heater on, there is no indicator of whether it was a human in the house or the software. I am OK with correcting Nest. But not my wife.
Making products smarter introduces probabilistic behavior.
Because probabilistic behavior feels kind of like life, you start having different expectations.
Northern California has some very hot days with cold mornings. On a day like that I would not turn the heater on in the morning. But Nest would. It just knows: get to 68 degrees. It has no context - something that is easy and intuitive for a human is not easy for software.
Getting the relationship of the user with a smart product right is tricky.
Product managers are the best people in a company to get the tradeoffs right.
Just like a PM does not have to be a developer to manage a software product, she does not have to be a mathematician or a data scientist to manage a data product.
But it is necessary to understand some core concepts.
I'll use 2 data products to demonstrate some of these necessary concepts.
Here is the second data product. This one is B2B and works in the background.
Directly helps companies like Airbnb, LinkedIn, and Pinterest with on-demand customer support. When a user submits a support ticket, some of these tickets are sent to Directly, which distributes them to a network of expert users who are ready to answer them. If the experts resolve a question successfully, they get paid and Directly takes a cut. Otherwise, the experts can reroute the ticket back to the customer’s call center.
When questions are created in the helpdesk, how do we find the ones that the expert users can (and want to) solve?
Initially, we relied on our customers to configure some categories that their users chose when they were filling out the support form.
Users are not great about categorizing their issues.
We tried keywords. Very cumbersome to manage.
We need to pick as many tickets as we can without creating too much noise for the experts.
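This tradeoff is the classic precision vs. recall balance. A minimal sketch with made-up classifier scores and labels, showing how moving the score threshold trades noise for the experts against ticket coverage:

```python
# Hypothetical classifier scores for incoming tickets; label 1 = "expert-resolvable".
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1,    1,    1,    0,    1,    0,    0,    0]

def precision_recall(threshold):
    """Precision: how clean the experts' queue is. Recall: how many good tickets we catch."""
    picked = [label for score, label in zip(scores, labels) if score >= threshold]
    precision = sum(picked) / len(picked)
    recall = sum(picked) / sum(labels)
    return precision, recall

# A high threshold sends experts mostly good tickets but misses some;
# a low threshold catches every good ticket at the cost of noise.
print(precision_recall(0.75))  # (1.0, 0.75) - clean but incomplete
print(precision_recall(0.25))  # (~0.57, 1.0) - complete but noisy
```

Where you set the threshold is exactly the kind of tradeoff a PM, not an algorithm, should own.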
Solution: let us look at ALL your tickets as they come in, and a machine learning model will choose which ones get sent to the expert users.
Here is how it works: ….. Explain the image
The model is a classifier, and it needs examples to learn what a good ticket looks like. It can do so by watching how the experts respond to tickets they have seen earlier. If the experts took a ticket and resolved it successfully, it becomes a positive example. If they send the question back, or resolve it but the user reviews their answer negatively, the question becomes a negative example.
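The labeling rule just described can be sketched as a small function (the argument names and shapes are my assumption for illustration, not Directly’s actual schema):

```python
def label_ticket(resolved, rating):
    """Turn an expert's handling of a ticket into a training label.

    resolved: did the expert answer it, rather than rerouting it back?
    rating:   the end user's review of the answer ("positive", "negative", or None).
    Returns 1 for a positive example, 0 for a negative one.
    """
    if not resolved:
        return 0   # expert sent the question back
    if rating == "negative":
        return 0   # resolved, but the user rated the answer negatively
    return 1       # resolved successfully

print(label_ticket(True, "positive"))   # 1
print(label_ticket(False, None))        # 0
print(label_ticket(True, "negative"))   # 0
```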
ML startups ask companies “give us all your data”
I was preparing for a tough conversation.
Getting access to more and better data…
“Is it a yes?”
Think of getting data early, before you need it
Legal.
Stripping out anything personal.
Insist on storing.
Customer success (account managers) - interested. One of the main metrics they are responsible for is our ticket share - the percentage of a customer’s tickets that we handle.
The improvements that you can get from cleaning your data are great.
The plot of the movie The Big Short can be summarized as “guys clean a dataset, get rich”.
In the case of Jawbone meal logging, the biggest lift in performance came from realizing that breakfasts are different from other meals. Spinach in the morning was probably part of an omelet. Spinach at lunch was most likely a salad.
Sometimes, cleaning your data requires a good understanding of the domain you are working with.
Which properties of your data you do and don’t use is to a significant degree a product management decision.
For example, different cuisines disagree on which foods are best eaten together. Do you use this knowledge somehow? It depends on what you know about your users.
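The spinach example above boils down to adding meal-time context as a feature. A toy sketch, with a made-up heuristic and made-up item names:

```python
from datetime import time

def guess_dish(food, logged_at):
    """Disambiguate a logged ingredient using the time of day it was logged.

    Made-up rule for illustration: the same ingredient means a different
    dish at breakfast than at lunch.
    """
    is_breakfast = logged_at < time(11, 0)
    if food == "spinach":
        return "omelet" if is_breakfast else "salad"
    return food

print(guess_dish("spinach", time(8, 30)))   # omelet
print(guess_dish("spinach", time(12, 30)))  # salad
```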
Monica Rogati, who used to be VP of Data at Jawbone, has this saying: ...
Yes, you could use a much more advanced algorithm, but this simple one can get you pretty far.
The biggest improvements were achieved by cleaning the data and understanding it deeply.
How do we know if a model is good?
When “normal software” breaks, it breaks with high visibility. An issue with ML is that it will ALWAYS give you an answer.
How do we compare models?
An obvious metric is accuracy - basically, the percentage of predictions that the algorithm gets right. However, in product data science this can be a very bad metric.
Whether it is meaningful depends on how balanced or unbalanced the classes that you are predicting are.
Example: fraud detection, or rare disease testing. If 0.1% of transactions are fraudulent, you can create a “very sophisticated” predictive model. When asked “Is this transaction fraudulent?” it will always say “no”. The accuracy of this model will be about 99.9%.
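The fraud example is easy to reproduce in a few lines (illustrative numbers):

```python
# 100,000 transactions, 0.1% of them fraudulent (label 1 = fraud).
n = 100_000
labels = [1] * (n // 1000) + [0] * (n - n // 1000)

# The "very sophisticated" model: always answer "not fraudulent".
predictions = [0] * n

accuracy = sum(p == y for p, y in zip(predictions, labels)) / n
recall = sum(p == 1 and y == 1 for p, y in zip(predictions, labels)) / sum(labels)

print(accuracy)  # 0.999 - looks great on paper
print(recall)    # 0.0   - catches zero fraud
```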
Thinking through this is exactly the PM’s job. In this case you don’t need to know the math that underlies the predictive model.
How do we QA data products?
Monitoring in production
Unless you are making the ultimate data product - a make-money-while-you-sleep fund runner :) - your system lives in the world and interacts with people.
Once the product is out, other people carry the message and you cannot control it.
Listen to how an account manager talks about this with a client, how a salesperson talks with a prospect.
ML/DS is uniquely susceptible to BS - how to control it?
"Why did you show me ‘french fries’?" Well, because that is the item most frequently logged together with a burger.
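An interpretable answer like that can come from plain co-occurrence counting. A toy sketch with made-up meal logs:

```python
from collections import Counter
from itertools import combinations

# Toy meal logs: each entry is the set of items logged together in one meal.
meals = [
    {"burger", "french fries", "cola"},
    {"burger", "french fries"},
    {"burger", "salad"},
    {"salad", "soup"},
]

# Count how often each pair of items is logged together.
co_counts = Counter()
for meal in meals:
    for a, b in combinations(sorted(meal), 2):
        co_counts[(a, b)] += 1

def most_frequent_with(item):
    """The item most often logged together with `item` - a human-readable explanation."""
    pairs = {pair: c for pair, c in co_counts.items() if item in pair}
    (a, b), _ = max(pairs.items(), key=lambda kv: kv[1])
    return b if a == item else a

print(most_frequent_with("burger"))  # french fries
```

The model's answer and its explanation are the same thing here, which is exactly why simple models are easy to defend in front of a customer.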
"Why did you decide that this transaction is fraudulent? Why did you decide that this customer support ticket is resolvable?"
The simpler the model, the more interpretable it is.
When a model is not easily interpreted but performs well, it’s your task to manage expectations.