One of the most popular buzz words nowadays in the technology world is “Machine Learning (ML).” Most economists and business experts foresee Machine Learning changing every aspect of our lives in the next 10 years through automating and optimizing processes. This is leading many organizations to seek experts who can implement Machine Learning into their businesses.
The paper is written for statistical programmers who want to explore a Machine Learning career, add Machine Learning skills to their experience, or enter the Machine Learning field. It recounts my personal journey from statistical programmer to Machine Learning Engineer: what motivated me to start a Machine Learning career, how I started it, and what I have learned and done to become a Machine Learning Engineer. In addition, the paper discusses the future of Machine Learning in the pharmaceutical industry, especially in the Biometrics department.
This presentation covers an overview of Analytics and Machine Learning, along with Microsoft's contributions in the Machine Learning space. It also introduces Azure ML Studio, a SaaS-based portal for creating, experimenting with, and sharing Machine Learning solutions with the external world.
Building machine learning muscle in your team and transitioning them to doing machine learning at scale. We also discuss Spark and other relevant technologies.
Keynote presentation from the ECBS conference. The talk is about how to use machine learning and AI to improve software engineering, with experiences from our project in Software Center (www.software-center.se).
Machine Learning with Data Science Online Course | Learn and Build
You are just one step away from becoming a Data Scientist Engineer. Gain a foundational understanding of Machine Learning techniques in one place. Get an online Machine Learning certification at Learn and Build.
How to Use Artificial Intelligence by Microsoft Product Manager | Product School
The talk focused on the fundamentals of Product Management, leveraging the speaker's personal experiences in the AI field. It covered core Product Manager topics such as managing customer needs, business goals, and technology feasibility (the holy trinity of the Product Manager discipline), delved into data analysis, rapid experimentation, and execution, and finally explored the challenges of customer privacy, bias, and inclusivity in AI products.
Automated machine learning (automated ML) automates feature engineering, algorithm selection, and hyperparameter selection to find the best model for your data. The mission: enable automated building of machine learning models, with the goal of accelerating, democratizing, and scaling AI. This presentation covers some recent announcements of technologies related to Automated ML, especially for Azure. The demonstrations focus on Python with Azure ML Service and Azure Databricks.
How to use Artificial Intelligence with Python? | Edureka
YouTube Link: https://youtu.be/7O60HOZRLng
* Machine Learning Engineer Masters Program: https://www.edureka.co/masters-program/machine-learning-engineer-training *
This Edureka PPT on "Artificial Intelligence With Python" will provide you with a comprehensive and detailed knowledge of Artificial Intelligence concepts with hands-on examples.
Follow us to never miss an update in the future.
YouTube: https://www.youtube.com/user/edurekaIN
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
Castbox: https://castbox.fm/networks/505?country=in
Walk through of Azure Machine Learning Studio new features | Luca Zavarella
The session is mostly a demo that will guide you through the new Azure Machine Learning Service world, focusing on new features such as the Designer (no-code ML), Automated ML, and ML Interpretability.
You can find the webinar, in Italian, here: https://bit.ly/2w0EsNK
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016 | MLconf
Building a Machine Learning Platform at Quora: Each month, over 100 million people use Quora to share and grow their knowledge. Machine learning has played a critical role in enabling us to grow to this scale, with applications ranging from understanding content quality to identifying users’ interests and expertise. By investing in a reusable, extensible machine learning platform, our small team of ML engineers has been able to productionize dozens of different models and algorithms that power many features across Quora.
In this talk, I’ll discuss the core ideas behind our ML platform, as well as some of the specific systems, tools, and abstractions that have enabled us to scale our approach to machine learning.
Artificial Intelligence with Python | Edureka
YouTube Link: https://youtu.be/7O60HOZRLng
* Machine Learning Engineer Masters Program: https://www.edureka.co/masters-program/machine-learning-engineer-training *
This Edureka PPT on "Artificial Intelligence With Python" will provide you with a comprehensive and detailed knowledge of Artificial Intelligence concepts with hands-on examples.
Simplifying AI and Machine Learning with Watson Studio | DataWorks Summit
Are you seeing benefits from big data, AI and machine learning? Some companies are challenged by the complexity of the tools, access to quality data and the ability to operationalize these technologies. IBM’s Watson Studio addresses the needs of developers, data scientists and business analysts – who need to create, train and deploy machine and deep learning models, analyze and visualize data – all in an easy-to-use platform. Watson Studio supports Apple’s Core ML with Watson Visual Recognition service. It provides a suite of tools for data scientists, application developers and subject matter experts to collaboratively and easily work with data and use that data to build, train and deploy models at scale. When coupled with IBM Watson Knowledge Catalog, it enables companies to create a secure catalog of AI assets including datasets, documents and models. In this session, you will learn how to use these new offerings to solve real world business problems and infuse AI into your business to drive innovation.
Speaker
Sumit Goyal, IBM, Software Engineer
201906 02 Introduction to AutoML with ML.NET 1.0 | Mark Tabladillo
The ML.NET 1.0 release is the first major milestone of a great journey that started in May 2018, when we released ML.NET 0.1 as open source. ML.NET is an open-source, cross-platform machine learning framework for .NET developers. Using ML.NET, developers can leverage their existing tools and skill sets to develop and infuse custom AI into their applications by creating custom machine learning models for common scenarios like sentiment analysis, recommendation, image classification, and more.
“Automated ML” is a collection of new technologies from Microsoft to enhance the data science development process. Still in preview, Automated ML for ML.NET 1.0 will be demonstrated on a Deep Learning Virtual Machine running Windows Server 2016. Code examples are in C# and run in Visual Studio Community 2019.
This presentation is the second of four related to ML.NET and Automated ML. The presentation will be recorded with video posted to this YouTube Channel: http://bit.ly/2ZybKwI
Deep learning goes beyond the traditional machine learning of big data and analytics. In this session, we will review the AWS offering, Amazon Machine Learning, and the AWS GPU-intensive family of servers that run native machine learning and deep-learning algorithms. We will also cover some basic deep-learning algorithms using open source software. Session sponsored by Day1 Solutions.
The Data Science Process - Do we need it and how to apply? | Ivo Andreev
Machine learning is not black magic but a discipline that involves statistics, data science, analysis, and hard work. From searching for patterns and preparing data, through applying and optimizing algorithms, to obtaining usable predictions, one needs background and appropriate tools.
But do we need it, when there are already AI-as-a-service solutions available out there? Do we need to try hard with artificial neural networks? And if we decide to do so, what tools would be a safe bet?
In this session we will go through real-world examples, mention key tools from Microsoft and the open source world for doing data science and machine learning, and, most importantly, provide a workflow and some best practices.
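The workflow the session describes (prepare data, apply an algorithm, obtain usable predictions) can be sketched in a few lines of Python. This is a minimal illustration with invented toy data and scikit-learn, not code from the session itself:

```python
# Minimal data-science workflow sketch: prepare data, fit a model,
# obtain predictions. The dataset is synthetic toy data.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy dataset: label is 1 when the two features are both "large"
X = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9],
     [0.3, 0.3], [0.7, 0.9], [0.1, 0.1], [0.9, 0.9]]
y = [0, 1, 0, 1, 0, 1, 0, 1]

# Data preparation: hold out a test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Apply the algorithm
model = LogisticRegression().fit(X_train, y_train)

# Usable predictions, with a simple accuracy check
preds = model.predict(X_test)
accuracy = model.score(X_test, y_test)
```

Real projects add the pattern-search and optimization steps (feature exploration, cross-validation, hyperparameter tuning) around this same skeleton.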
Data Workflows for Machine Learning - Seattle DAML | Paco Nathan
First public meetup at Twitter Seattle, for Seattle DAML:
http://www.meetup.com/Seattle-DAML/events/159043422/
We compare/contrast several open source frameworks which have emerged for Machine Learning workflows, including KNIME, IPython Notebook and related Py libraries, Cascading, Cascalog, Scalding, Summingbird, Spark/MLbase, MBrace on .NET, etc. The analysis develops several points for "best of breed" and what features would be great to see across the board for many frameworks... leading up to a "scorecard" to help evaluate different alternatives. We also review the PMML standard for migrating predictive models, e.g., from SAS to Hadoop.
When we hear the words "Machine Learning," we think of self-driving cars and advanced medical solutions. This conjures up images of huge and complex data, advanced statistics, algebra, and sophisticated solutions, and we get scared to build Machine Learning solutions.
Machine Learning solutions are not that hard to develop, but at the same time not that easy to perfect. This slide deck provides insight into, and demos of, how a software engineer can start developing Machine Learning solutions easily and eventually master Machine Learning.
Patient’s Journey using Real World Data and its Advanced Analytics | Kevin Lee
Real World Data (RWD) is data collected outside of clinical trial studies, and Real-World Evidence (RWE) can be derived from insights into RWD. RWD sources include EMRs, health insurance claims, genomic data, and IoT data from apps and wearables. Anonymized RWD patient data has revolutionized how companies view patient data, since it captures longitudinal pharmacy prescriptions, medical claims, and diagnoses.
The paper is written for those who want to understand how RWD patient data is collected and how it can be analyzed to support pharmaceutical companies. Mainly, RWD patient data can support patient analytics, commercial analytics, and payer analytics, such as source of business, prescription switching, payment method, market analysis, promotional activities, drug launch, and forecasting. The paper also discusses the technologies that data scientists use for RWD, such as data warehouses, data visualization, open-source programming, cloud computing, GitHub, and Machine Learning.
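One of the payer analyses mentioned above, detecting a prescription switch in longitudinal claims, reduces to a simple group-and-compare operation. A hypothetical sketch in Python with pandas, where the patient IDs, column names, and drug names are all invented for illustration:

```python
# Hypothetical sketch: flag prescription switches in longitudinal
# pharmacy claims. All data here is invented for illustration.
import pandas as pd

claims = pd.DataFrame({
    "patient_id": ["P1", "P1", "P1", "P2", "P2"],
    "fill_date": pd.to_datetime(
        ["2022-01-10", "2022-02-09", "2022-03-12",
         "2022-01-05", "2022-02-04"]),
    "drug": ["DrugA", "DrugA", "DrugB", "DrugA", "DrugA"],
})

# Sort each patient's fills by date, then flag rows where the drug
# differs from that patient's previous fill
claims = claims.sort_values(["patient_id", "fill_date"])
prev_drug = claims.groupby("patient_id")["drug"].shift()
claims["switched"] = (prev_drug != claims["drug"]) & prev_drug.notna()

switchers = claims.loc[claims["switched"], "patient_id"].unique().tolist()
```

Here only patient P1 (DrugA to DrugB) is flagged as a switcher; the same pattern scales to millions of claims rows in a data warehouse.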
Introduction of AWS Cloud Computing and its future for Biometric Department | Kevin Lee
When statistical programmers or statisticians start in open-source programming, we usually begin by installing Python and/or R on our local computer and writing code in a local IDE such as Jupyter Notebook or RStudio. But as biometric teams grow and advanced analytics become more prevalent, collaborative solutions and environments are needed. Traditional solutions have been SAS® servers, but nowadays there is growing need and interest for Cloud Computing. The paper is written for those who want to know about the Cloud Computing environment (e.g., AWS) and its possible implementation for the Biometrics department.
The paper will start with the main components of Cloud computing (databases, servers, applications, data analytics, reports, visualization, dashboards, etc.) and its benefits: elasticity, control, flexibility, integration, reliability, security, low cost, and ease of getting started. The most popular Cloud computing platforms are AWS, Google Cloud, and Microsoft Azure, and this paper will introduce the AWS Cloud Computing environment.
The paper will also introduce the core technologies of AWS Cloud Computing: computing (EC2), storage (EBS, EFS, S3), database (Redshift, RDS, DynamoDB), security (IAM), and networking (VPC), and how they can be integrated to support modern-day data analytics.
Finally, the paper will introduce the department-driven Cloud computing transition project in which the whole SAS programming department moved from a SAS Windows Server into AWS Cloud Computing. It will also discuss the challenges, the lessons learned, and the future of Cloud computing in the Biometrics department.
A fear of missing out and a fear of messing up: A Strategic Roadmap for Chat... | Kevin Lee
Does your organization allow ChatGPT at work? The answer might depend on where you work. Many organizations do not allow ChatGPT at work. The truth is that, for these organizations, ChatGPT represents both a fear of missing out and a fear of messing up. But, just like past new technologies such as Cloud computing and social media, organizations will eventually integrate ChatGPT or other Large Language Models (LLMs). This paper is for those, especially in Biometrics, who want to initiate ChatGPT integration at work.
This paper presents how a Biometrics department can lead the integration of an LLM, focusing on the exemplary model ChatGPT, across an entire enterprise, even in situations where the organization restricts or prohibits ChatGPT usage at work.
The roadmap outlines key stages, starting with an introduction to LLMs and ChatGPT, followed by potential risks and concerns, and then the benefits and diverse use cases. The roadmap emphasizes how the Biometrics function leads the building of a cross-functional team to initiate ChatGPT integration and establish policy and guidelines. It then discusses the crucial aspect of training, emphasizing user education and engagement based on company policies. The roadmap finishes with a Proof of Concept (PoC) to validate and evaluate ChatGPT's applicability to organizational needs and its compliance with company policies.
This paper can serve as a valuable resource for navigating the implementation journey of ChatGPT, providing insights and strategies for successful integration, even within the confines of organizational limitations on ChatGPT usage.
Prompt it, not Google it - Prompt Engineering for Data Scientists | Kevin Lee
Since its release, ChatGPT has rapidly gained popularity, reaching 100 million users within two months. A new concept has even emerged: "Prompt it" is the new "Google it." Research shows ChatGPT users complete projects 25% faster. The paper is written for statistical programmers and biostatisticians who want to improve their productivity and efficiency by using ChatGPT prompts better.
The paper explores the pivotal role of prompts in enhancing the performance and versatility of ChatGPT and other Large Language Models. It shows how statistical programmers and biostatisticians can use ChatGPT's capabilities for tasks such as content development (e.g., emails, images), information search, programming assistance in R, SAS, and Python, result interpretation, and many more.
The paper also elucidates the distinctive advantages of employing prompts over traditional search methods, emphasizing the unique characteristics of prompt engineering in ChatGPT. Various techniques, such as zero-shot learning, few-shot learning, reflection, chain of thought, and tree of thought, are dissected to illustrate the nuanced ways in which prompts can be engineered to optimize outcomes. The exploration also offers insights into how to prompt better: adding constraints, incorporating more context, setting roles, coaching with feedback, probing further, and introducing step-by-step instructions to ChatGPT. The paper discusses ChatGPT's functionality for modifying and resubmitting a prompt, copying the answer, regenerating the answer, and continuing the previous prompt.
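Several of the techniques above (role setting, constraints, few-shot examples) can be combined mechanically into a prompt template. A hypothetical sketch in Python, where the task, examples, and wording are invented for illustration and are not from the paper:

```python
# Hypothetical few-shot prompt template combining role setting,
# constraints, and worked examples. The task is invented.
ROLE = "You are a statistical programmer reviewing CDISC SDTM data."
CONSTRAINTS = "Answer with only the SDTM domain code, nothing else."

FEW_SHOT_EXAMPLES = [
    ("Dataset of adverse events reported by subjects", "AE"),
    ("Dataset of study drug exposure records", "EX"),
]

def build_prompt(question: str) -> str:
    """Assemble the prompt: role, constraints, examples, then the query."""
    lines = [ROLE, CONSTRAINTS]
    for description, domain in FEW_SHOT_EXAMPLES:
        lines.append(f"Q: {description}\nA: {domain}")
    lines.append(f"Q: {question}\nA:")
    return "\n\n".join(lines)

prompt = build_prompt("Dataset of subject demographics")
```

Sending such a prompt to an LLM constrains the answer format and primes the model with examples, which is the essence of few-shot prompting; zero-shot prompting would simply omit the examples list.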
The paper highlights how statistical programmers and biostatisticians can use prompts, and lead their transformative impact, to become more productive and effective.
Leading into the Unknown? Yes, we need Change Management Leadership | Kevin Lee
The paper is written for those who want to lead new changes in the Biometrics department. Currently, the Biometrics department is going through big changes, from traditional SAS® programming to open-source programming, cloud computing, data science, and even Machine Learning, and how to manage and lead those changes becomes critical for leaders so that changes can be achieved under budget and on schedule.
Change Management comprises the activities and processes that support the success of changes in the organization, and it is considered a leadership competency for enabling change within the organization. More importantly, the success rate of changes directly correlates with change management by leaders: leaders with excellent change management are six times more likely to succeed than those with poor change management.
The paper will discuss major obstacles that leaders will face, such as programmer and middle-management resistance and insufficient support. It will also discuss success factors that leaders can apply in change management, such as detailed planning, dedicated resources and funds, experience with change, participation of programmers, frequent transparent communication, and clear goals.
Finally, the paper will show examples of how change management effectively led the successful migration from SAS® to open-source programming for a department of more than 150 SAS programmers.
How to create SDTM DM.xpt using Python v1.1 | Kevin Lee
The paper is written for those who want to use Python to create SDTM SAS transport files from raw SAS datasets. The paper will show the similarities and differences between SAS and Python in terms of SDTM dataset development, along with actual Python code to create an SDTM SAS transport file.
The paper will start with Python packages that can read and write SAS datasets, such as xport, sas7bdat, and pyreadstat. It will introduce how Python reads SAS datasets from the local drive, such as demographic, exposure, randomization, and disposition raw SAS datasets. It will also show how Python creates variables from raw data, such as SEX, USUBJID, RACE, RFSTDTC, and RFENDTC; how Python merges datasets using outer and inner joins; and how programmers use a Python DataFrame for data manipulation, such as renaming, dropping, and replacing variables. Finally, the paper will show how Python can create the SAS SDTM transport file on the local drive.
The paper also includes the actual Python code that reads raw SAS datasets, then merges, manipulates, and writes the SDTM DM SAS xport file.
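The flow the abstract describes (read raw datasets, merge, derive DM variables, write the transport file) can be sketched as below. This is a minimal illustration, not the paper's actual code: the raw data here is invented toy data standing in for sas7bdat files, which in practice would be read with `pyreadstat.read_sas7bdat` and written with `pyreadstat.write_xport`, as the abstract indicates.

```python
# Sketch of the DM derivation flow: toy raw data stands in for
# sas7bdat files read with pyreadstat.read_sas7bdat(...).
import pandas as pd

# Raw demographics and exposure, as if read from SAS datasets
demog = pd.DataFrame({"SUBJID": ["001", "002"],
                      "SEX": ["M", "F"],
                      "RACE": ["WHITE", "ASIAN"]})
expo = pd.DataFrame({"SUBJID": ["001", "002"],
                     "EXSTDTC": ["2021-01-05", "2021-01-07"],
                     "EXENDTC": ["2021-06-05", "2021-05-30"]})

# Merge raw datasets (left join keeps every subject in demog)
dm = demog.merge(expo, on="SUBJID", how="left")

# Derive SDTM variables
dm["STUDYID"] = "STUDY01"
dm["DOMAIN"] = "DM"
dm["USUBJID"] = dm["STUDYID"] + "-" + dm["SUBJID"]
dm = dm.rename(columns={"EXSTDTC": "RFSTDTC", "EXENDTC": "RFENDTC"})

# Put the DM variables in order; pyreadstat.write_xport(dm, "dm.xpt")
# would then produce the SAS transport file on the local drive.
dm = dm[["STUDYID", "DOMAIN", "USUBJID", "SUBJID",
         "RFSTDTC", "RFENDTC", "SEX", "RACE"]]
```

A real DM program would also handle date imputation, controlled terminology, and variable labels/lengths required by the SDTM Implementation Guide.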
Enterprise-level Transition from SAS to Open-source Programming for the whole... | Kevin Lee
The paper is written for those who want to learn about an enterprise-level transition from SAS to open-source programming. The paper introduces the transition project in which a whole department of 150+ SAS programmers completely moved from SAS to open-source programming.
The paper will start with the scope of the project: the analytic platform switch from SAS Studio to an R Pro Server, converting the existing SAS code to R/Python code, moving from a Windows server to an AWS Cloud computing environment, and the transition of SAS programmers to R/Python programmers. It will also discuss the challenges of the project, such as inexperience in open-source programming, a new analytic platform, and change management. The paper will describe how the transition-support team, executive leadership, and SAS programmers overcame the challenges together during the project.
The paper will also discuss the differences between SAS and open-source languages and programming, and it will show some examples of converting SAS code to R/Python code. Finally, it will close with the benefits of the open-source programming transition and the lessons learned from the project.
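To give a flavor of the kind of SAS-to-Python conversion the abstract mentions, here is a hypothetical example (not taken from the project): a simple SAS data step with a WHERE clause and a derived variable, and a pandas equivalent on invented data.

```python
# Hypothetical SAS-to-Python conversion example (not from the project).
# SAS original:
#   data adsl2;
#     set adsl;
#     where age >= 65;
#     agegr1 = "ELDERLY";
#   run;
import pandas as pd

# Toy stand-in for the ADSL dataset
adsl = pd.DataFrame({"USUBJID": ["01", "02", "03"],
                     "AGE": [72, 45, 68]})

# pandas equivalent of the data step
adsl2 = adsl[adsl["AGE"] >= 65].copy()  # WHERE age >= 65
adsl2["AGEGR1"] = "ELDERLY"             # derived variable
```

The SAS data step's implicit row-by-row loop becomes a vectorized filter and column assignment in pandas, which is the typical shape of such conversions.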
Artificial Intelligence in Pharmaceutical Industry – Kevin Lee
This presentation will introduce AI and its possible implementations in the pharmaceutical industry, such as drug discovery, personalized medicine, molecular target prediction, site selection, patient recruitment, process automation, process optimization, and more.
The Jupyter Notebook is an open-source web application that allows programmers and data scientists to create and share documents that contain live code, visualizations, and narrative text. Jupyter Notebook is one of the most popular tools for data visualization and machine learning, and it is a perfect storytelling tool for data scientists.
First, the paper will introduce Jupyter Notebook and explain why it is the most popular tool for data scientists to show, share, and visualize data and analyses. The paper will show how data scientists use the Python programming language in Jupyter Notebook and how they import data using pandas. It will introduce the Python data visualization library matplotlib and show how data scientists use it to easily create scatter plots, line plots, histograms, Kaplan-Meier curves, and many more.
The paper will then present how data scientists use Jupyter Notebook for image recognition with visualization and machine learning. It will show how data scientists can convert images into numeric arrays, and then use this numeric data to visualize and train a machine learning model for image recognition.
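The image-to-array idea can be sketched with NumPy alone. The tiny synthetic "image" below stands in for a real file that a notebook would load (e.g., with matplotlib's `imread`); the values are fabricated so the sketch runs anywhere.

```python
import numpy as np

# A real notebook would load an image, e.g.
#   img = matplotlib.image.imread("cat.png")
# Here we fabricate a tiny 4x4 grayscale "image" instead.
img = np.array([
    [0.0, 0.2, 0.2, 0.0],
    [0.2, 0.9, 0.9, 0.2],
    [0.2, 0.9, 0.9, 0.2],
    [0.0, 0.2, 0.2, 0.0],
])

# Flattening turns the 2-D pixel grid into a 1-D feature vector,
# the form most ML models expect as input.
features = img.flatten()
print(img.shape, features.shape)   # (4, 4) (16,)
```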
Perfect partnership - machine learning and CDISC standard data – Kevin Lee
The most popular buzz word nowadays in the technology world is “Machine Learning (ML).” Most economists and business experts foresee Machine Learning changing every aspect of our lives in the next 10 years through automating and optimizing processes. This is leading many organizations, including drug companies, to implement Machine Learning into their businesses.
The presentation will start with an introduction to the basic concept of Machine Learning – computer science technology that provides systems with the ability to learn without being explicitly programmed – and it will discuss what is meant by “without being explicitly programmed”. The presentation will also introduce basic ML algorithms – SVM, Decision Trees, Regression, Artificial Neural Networks (ANN), and DNN – and discuss the impact and potential of Machine Learning in our daily lives and in the pharmaceutical industry.
The presentation will show how CDISC data can be a perfect match for Machine Learning implementation. In the Machine Learning/AI driven process, data is the most important component: 80 to 90% of the work in Machine Learning is preparing data. Since the FDA mandated CDISC standards for submissions as of Dec 17th, 2016, all clinical trial data are prepared in CDISC SDTM and ADaM format. The presentation will show why CDISC data is a better choice than Real World Evidence (RWE) data for an ML model, and how the pharmaceutical industry can use CDISC data to build ML models and apply them to Real World Evidence. Finally, the presentation will show how the pharma industry can use its own in-house data and Machine Learning to build innovative, data-driven business models.
Machine Learning: why we should know and how it works – Kevin Lee
The most popular buzz word nowadays in the technology world is “Machine Learning (ML).” Most economists and business experts foresee Machine Learning changing every aspect of our lives in the next 10 years through automating and optimizing processes such as: self-driving vehicles; online recommendation on Netflix and Amazon; fraud detection in banks; image and video recognition; natural language processing; question answering machines (e.g., IBM Watson); and many more. This is leading many organizations to seek experts who can implement Machine Learning into their businesses.
Statistical programmers and statisticians in the pharmaceutical industry are in a very interesting position. We have backgrounds very similar to those of Machine Learning experts – programming, statistics, and data expertise – and thus embody the essential technical skill sets. This similarity leads many individuals to ask us about Machine Learning; if you lead a biometric group, you get asked even more often.
The paper is intended for statistical programmers and statisticians who are interested in learning and applying Machine Learning to lead innovation in the pharmaceutical industry. It will start with the basic concepts of Machine Learning – the hypothesis, the cost function, and gradient descent. Then, the paper will introduce supervised ML (e.g., Support Vector Machines, Decision Trees, Logistic Regression), unsupervised ML (e.g., clustering), and the most powerful ML algorithm, the Artificial Neural Network (ANN). The paper will also introduce some popular SAS® ML procedures and SAS Visual Data Mining and Machine Learning. Finally, it will discuss current ML implementations, future implementations, and how programmers and statisticians could lead this exciting and disruptive technology in the pharmaceutical industry.
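The hypothesis/cost-function/gradient-descent concepts mentioned above can be made concrete with a minimal sketch (ours, not the paper's): gradient descent fitting a one-variable linear model by repeatedly stepping down the mean-squared-error cost surface.

```python
import numpy as np

# Hypothesis h(x) = w*x + b; cost J = mean squared error.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0          # true relationship the model should learn

w, b, lr = 0.0, 0.0, 0.05
for _ in range(2000):
    pred = w * x + b
    err = pred - y
    # Partial derivatives of the MSE cost with respect to w and b.
    w -= lr * (2 * np.mean(err * x))
    b -= lr * (2 * np.mean(err))

print(round(w, 3), round(b, 3))   # approaches 2.0 and 1.0
```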
We are living in the world of “Big Data”. “Big Data” is mainly characterized by three Vs – Volume, Velocity, and Variety. The presentation will discuss how Big Data impacts us and how SAS programmers can use their SAS skills in a Big Data environment.
The presentation will introduce Big Data storage solutions – Hadoop and NoSQL. For Hadoop, it will discuss two major capabilities – the Hadoop Distributed File System (HDFS) and Map/Reduce (parallel computing in Hadoop). The presentation will show how SAS can work with Hadoop using the HDFS LIBNAME and FILENAME statements, SAS/ACCESS to Hadoop HIVE, and SAS Grid Manager for Hadoop YARN. It will also introduce the concepts of a NoSQL database as a big data solution.
The presentation will also introduce how SAS can work with a variety of data formats, especially XML and JSON. It will show the use case of converting XML documents to SAS datasets using the LIBNAME XMLV2 XMLMAP statement. The presentation will also introduce REST APIs for extracting data over the internet and will demonstrate how SAS PROC HTTP can move data through a REST API.
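The REST/JSON workflow has a close Python analogue. The sketch below inlines a hypothetical response body instead of making a live request, so the endpoint and field names are assumptions; where SAS would use PROC HTTP, Python would typically use an HTTP client before this parsing step.

```python
import json

# A REST call (PROC HTTP in SAS) would fetch this payload; it is
# inlined here so the sketch runs without network access.
response_body = '''
{
  "studies": [
    {"studyid": "ABC-001", "sites": 12, "status": "ongoing"},
    {"studyid": "ABC-002", "sites": 8,  "status": "completed"}
  ]
}
'''

payload = json.loads(response_body)
# Flatten the nested JSON into row-shaped records - the analogue of
# landing the data in a SAS dataset.
rows = [(s["studyid"], s["sites"], s["status"]) for s in payload["studies"]]
for row in rows:
    print(row)
```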
We are living in the world of “Big Data”. “Big Data” is mainly expressed with three Vs – Volume, Velocity and Variety. The presentation will discuss how Big Data impacts Pharmaceutical Industry and how drug companies can lead this new Big Data environment.
How FDA will reject non-compliant electronic submission – Kevin Lee
Beginning Dec 18, 2016, all clinical trial and nonclinical trial studies must use standards (e.g., CDISC) for submission data and beginning May 5, 2017, NDA, ANDA, and BLA submissions must follow eCTD format for submission documents.
In order to enforce these standards mandates, the FDA also released its "Technical Rejection Criteria for Study Data" on the FDA eCTD website on October 3, 2016, and implemented a rejection process for submissions that do not conform to the required study data standards.
The paper will discuss how these new FDA mandates impact electronic submission and the preparation required for a CDISC and eCTD compliant submission package, such as SDTM, ADaM, Define.xml, SDTM annotated eCRF, SDRG, ADRG, and SAS® programs. The paper will introduce the current FDA submission process, including the two rejection processes – “Technical Rejection” and “Refuse-to-File” – and discuss how the FDA uses them to reject submissions. It will show how FDA rejection of CDISC non-compliant data impacts a sponsor’s submission process, and how sponsors should respond to FDA rejections and questions throughout the submission process. Use cases will demonstrate the key technical rejection criteria that have the greatest impact on a successful submission.
End to end standards driven oncology study (solid tumor, Immunotherapy, Leuke... – Kevin Lee
Each therapeutic area has its own unique data collection and analysis, and oncology in particular has very specific standards for the collection and analysis of data. Oncology studies are separated into three sub-types according to their response criteria guidelines. The first sub-type, solid tumor studies, usually follows RECIST (Response Evaluation Criteria in Solid Tumors). The second, lymphoma studies, usually follows Cheson. Lastly, leukemia studies follow study-specific guidelines (IWCLL for Chronic Lymphocytic Leukemia, IWAML for Acute Myeloid Leukemia, NCCN Guidelines for Acute Lymphoblastic Leukemia, and ESMO clinical practice guidelines for Chronic Myeloid Leukemia).
This paper will demonstrate the notable level of sophistication implemented in CDISC standards, mainly driven by the differentiation across different response criteria. The paper will specifically show what SDTM domains are used to collect the different data points in each type. For example, Solid tumor studies collect tumor results in TR and TU and response in RS. Lymphoma studies collect not only tumor results and response, but also bone marrow assessment in LB and FA, and spleen and liver enlargement in PE. Leukemia studies collect blood counts (i.e., lymphocytes, neutrophils, hemoglobin and platelet count) in LB and genetic mutation as well as what are collected in Lymphoma studies. The paper will also introduce oncology terminologies (e.g., CR, PR, SD, PD, NE) and oncology-specific ADaM data sets - Time to Event (--TTE) data set.
Finally, the paper will show how standards (e.g., response criteria guidelines and CDISC) will streamline clinical trial artefacts development in oncology studies and how end to end clinical trial artefacts development can be accomplished through this standards-driven process.
Are you ready for Dec 17, 2016 - CDISC compliant data? – Kevin Lee
Are you ready for Dec 17th, 2016?
According to FDA Data Standards Catalog v4.4, all clinical trial studies starting after December 17th, 2016 (with the exception of certain INDs) will be required to have CDISC compliant data. The paper will clarify FDA expectations for organizations that are unclear on their compliance status, and will show how programmers can interpret the crucial elements of the FDA Data Standards Catalog, which include the support begin date, support end date, requirement begin date, and requirement end date of specific standards for both eCTD and CDISC.
First, the paper will provide a brief introduction to the regulatory recommendations for electronic submission, including submission methods, the five modules in the CTD (especially m5), technical deficiencies in submissions, and so on. The paper will also discuss what programmers need to prepare for submission according to FDA and CDISC guidelines: CSR, Protocol, SAP, SDTM annotated eCRF, SDTM datasets, ADaM datasets, ADaM dataset SAS® programs, and Define.xml.
Additionally, the paper will discuss formatting logistics that programmers should be aware of when preparing documents, including the length, naming conventions, and file formats of electronic files. For example, SAS datasets should be submitted in SAS transport file format, and SAS programs should be submitted as text files rather than in SAS format.
Finally, based on information from the FDA CSS meeting and the FDA Study Data Technical Conformance Guide v3.0, the paper will discuss the latest FDA concerns and issues with electronic submission, including the size of SAS datasets, missing Trial Design datasets (TS) and Define.xml, the importance of the Reviewer Guide, and so on.
We are living in a world of abundant data, so-called “big data”. The term “big data” is closely associated with unstructured data, which is called “unstructured” or NoSQL data because it does not fit neatly into a traditional row-column relational database. A NoSQL (Not only SQL, or non-relational) database is a type of database that can handle unstructured data. For example, a NoSQL database can store unstructured data such as XML (Extensible Markup Language), JSON (JavaScript Object Notation), or RDF (Resource Description Framework) files.
If an enterprise is able to extract unstructured data from NoSQL databases and transfer it to the SAS environment for analysis, this will produce tremendous value, especially from a big data solutions standpoint. This paper will show how unstructured data is stored in the NoSQL databases and ways to transfer it to the SAS environment for analysis. First, the paper will introduce the NoSQL database. For example, NoSQL databases can store unstructured data such as XML, JSON or RDF files. Secondly, the paper will show how the SAS system connects to NoSQL databases using REST (Representational State Transfer) API (Application Programming Interface). For example, SAS programmers can use the PROC HTTP option to extract XML or JSON files through REST API from the NoSQL database. Finally, the paper will show how SAS programmers can convert XML and JSON files to SAS datasets for analysis. For example, SAS programmers can create XMLMap files using XMLV2 LIBNAME engine and convert the extracted XML files to SAS datasets.
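The XML-to-dataset step has a compact Python analogue using only the standard library; the element names below are hypothetical, and the record-per-element mapping mirrors what an XMLMap produces for a SAS dataset.

```python
import xml.etree.ElementTree as ET

# A small XML document of the kind a NoSQL store might hold.
doc = """
<subjects>
  <subject usubjid="001-001"><sex>M</sex><age>54</age></subject>
  <subject usubjid="001-002"><sex>F</sex><age>61</age></subject>
</subjects>
"""

root = ET.fromstring(doc)
# Map each <subject> element to a flat record - the same row/column
# shape a SAS XMLMap conversion would produce.
records = [
    {
        "USUBJID": s.get("usubjid"),
        "SEX": s.findtext("sex"),
        "AGE": int(s.findtext("age")),
    }
    for s in root.findall("subject")
]
print(records)
```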
Introduction of semantic technology for SAS programmers – Kevin Lee
Semantic technology is a new way to express and search data that captures more meaning and relationships, and it can easily add, change, and implement meaning and relationships on top of current data. Companies such as Facebook and Google already use semantic technology; for example, Facebook Graph Search uses it to give users more meaningful search results.
The paper will introduce the basic concepts of semantic technology and its graph data model, the Resource Description Framework (RDF). RDF links data elements in a self-describing way using three parts: subject, predicate, and object. The paper will introduce applications and examples of RDF elements, as well as three different RDF representations: RDF/XML, Turtle, and N-Triples.
The paper will also introduce the “CDISC Standards RDF Representation, Reference and Review Guide” published by CDISC and PhUSE CSS, and show how CDISC standards are represented and displayed in RDF format.
The paper will then introduce SPARQL (SPARQL Protocol and RDF Query Language), which can retrieve and manipulate data in RDF format, and show how programmers can use SPARQL to re-represent the RDF form of CDISC standards metadata in a structured tabular format.
Finally, the paper will discuss the benefits and future of semantic technology, what it means to SAS programmers, and how programmers can take advantage of this new technology.
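The triple model and the SPARQL-style tabular re-representation can be illustrated in a few lines of plain Python. A real implementation would use rdflib and actual SPARQL; the terms below are hypothetical stand-ins for CDISC metadata.

```python
# Minimal illustration of the subject-predicate-object model.
triples = [
    ("sdtm:DM",      "rdf:type",    "cdisc:Domain"),
    ("sdtm:DM",      "cdisc:label", "Demographics"),
    ("sdtm:USUBJID", "rdf:type",    "cdisc:Variable"),
    ("sdtm:USUBJID", "cdisc:label", "Unique Subject Identifier"),
]

# SPARQL-like pattern  ?s cdisc:label ?label  ->  tabular rows,
# the kind of re-representation the paper describes.
table = [(s, o) for (s, p, o) in triples if p == "cdisc:label"]
for subject, label in table:
    print(subject, "|", label)
```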
Over the past decade, CDISC standards have been widely accepted and implemented in clinical research. The FDA’s final “Guidance for Industry on electronic submission” mandates that submission data conform to CDISC standards such as SDTM, ADaM, and SEND. This presentation will discuss how life sciences organizations can use standards metadata to manage the regulatory compliance process. It will introduce how standards metadata management not only ensures regulatory compliance, but also supports process efficiency in the development of clinical trial artefacts (e.g., protocol, CDASH, SDTM, and ADaM) and standards governance, and enables efficient communication between organizational units.
It will also introduce a metadata management system and discuss how such a system creates, stores, governs, and manages standards. It will also show how the standards metadata management system interacts with an ETL system and drives standards-driven development of clinical artefacts.
Data centric SDLC for automated clinical data development – Kevin Lee
Many life science organizations have been building systems to automate clinical data development (e.g., SDTM and ADaM). Such systems are considered IT products and go through the typical system development life cycle (SDLC): requirements, analysis, design, programming, test, and implementation. However, the SDLC was initially developed for systems that automate business processes, not data development. So the question naturally arises: if life science organizations develop systems to automate data development, should those systems still be developed with a process-centric SDLC, and will the current process-centric SDLC satisfy the business need? The presentation will introduce a data-centric SDLC. First, it will discuss how some steps of the typical process-centric SDLC should be modified and adjusted in a data-centric SDLC. For example, testing the system requires quality assurance of the target data, and due to the unpredictability of source data, maintenance and system updates will be required after implementation. Secondly, the presentation will introduce additional steps and approaches for the data-centric system development process, such as data profiling and compliance.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... – John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT... – Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables ranks to be calculated in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It comes, however, with a precondition: the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads and is expected to be a non-issue when the computation is performed on massive graphs.
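For reference, the monolithic baseline named in the abstract is ordinary power-iteration PageRank. The sketch below is our own minimal illustration (toy graph, standard 0.85 damping, and a simple uniform-spread dead-end fix), not the report's implementation; Levelwise PageRank instead requires dead ends to be absent up front.

```python
# Minimal power-iteration (monolithic) PageRank on an adjacency list.
def pagerank(out_links, damping=0.85, iters=100):
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Teleport term (1 - d)/n for every vertex.
        new = {v: (1.0 - damping) / n for v in nodes}
        for v, targets in out_links.items():
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dead-end handling: spread a sink's rank uniformly.
                for t in nodes:
                    new[t] += damping * rank[v] / n
        rank = new
    return rank

ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []})
print(round(sum(ranks.values()), 6))   # ranks form a distribution summing to 1
```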
1. How I Became a Machine Learning Engineer from a Statistical Programmer
Kevin Lee
2. Disclaimer
The views and opinions presented here represent those of the
speaker and should not be considered to represent any
companies or organizations.
4. Why do people expect me to know about Machine Learning?
• Programming
• Statistics / Modeling
• Data
5. What is Machine Learning?
An application of artificial
intelligence (AI) that
provides systems the
ability to automatically
learn and improve from
experience without being
explicitly programmed.
8. How does a Human learn? – Experience
How does a Machine learn? – Data
9. How does Machine learn with data?
ML
Models
Test Data (cats)
Real Data
cat
10. Why is Machine Learning so popular?
• Can solve a lot of complex business problems – new business
• Cost-effective
• Can work 24/7
• Automates a lot of work
• “Pretty much anything that a normal person can do in <1 sec, we can now automate with AI” – Andrew Ng
• Accurate
• Can be more accurate than normal people
11. So, who is a Machine Learning Engineer?
• Develops Machine Learning models
• Validates ML models
• Deploys ML models into production
• Continuously monitors and updates the ML models
Usually working with:
• Data Scientists
• Data Engineers
• Cloud Computing Architects (AWS)
12. Sample Job Descriptions
• Graduate degree (MS or PhD) in computer science,
engineering, mathematics, or related technical/scientific field
• 5+ years of professional experience in a business environment
• 3+ years of relevant experience in building large scale machine
learning or deep learning models and/or systems
• 1+ year of experience specifically with deep learning (e.g., CNN,
RNN, LSTM)
• Experience in using Python or other programming
languages
13. Typical Skill Sets for an ML Engineer
• Strong programming experience in Python, Java, or C++
• ML modeling experience with SVM, Logistic Regression, Regression, Decision Trees, Random Forest, and K-means Clustering
• Deep Learning experience with CNN, RNN, NLP
• ML package experience with Scikit-Learn, TensorFlow, Keras, PyTorch
• Cloud computing experience with AWS, Azure, Databricks, IBM Watson, and Google Cloud
• Database experience with Hadoop, Data Warehouses, Data Lakes, NoSQL, and relational databases
• MLOps experience across data pipelines, feature engineering, ML model selection/training/validation, and deployment (e.g., Docker, API) into production
• Excellent communication and presentation skills
15. PhD or Master’s Degree Programs in Universities / Colleges
• Two of the hottest majors in college nowadays are Computer Science and Data Science.
• Both on-line and in-person courses.
• The fastest way to learn Machine Learning skill sets.
• 2 to 6 years to get the degree.
16. ML Certificate Programs
• Machine Learning is currently the most popular certificate program.
• Many major universities and platforms provide ML certificate programs – MIT, Cornell, Harvard, University of Washington, Coursera, edX, and more.
• 2 to 6 months long
• Affordable
17. ON-LINE COURSES / MOOC (MASSIVE OPEN ONLINE COURSES)
• On-line degrees and certificates
• Very popular ways to learn ML
• As good as any college course
• Very affordable
19. GitHub – Code Repository
Many ML models and implementations are posted on GitHub (https://github.com/). GitHub also provides self-study materials and code, so beginners can download sample code and data and practice running ML models in their own environments.
• Self-study materials and code
• Many ongoing Machine Learning projects
20. Kaggle
• ML practice and competition environment
• Playground for ML engineers
• Collaborative projects environment
If you become a Kaggle Grandmaster or Master, you will be recognized in the ML community.
23. MACHINE LEARNING CONCEPTS / ALGORITHMS
• Machine Learning at Stanford University
• Neural Network & Deep Learning at DeepLearning.ai
• Improving Deep Neural Network : Hyperparameter tuning,
Regularization & Optimization at DeepLearning.ai
• Convolutional Neural Network at DeepLearning.ai
• Sequence Models at DeepLearning.ai
• Structuring Machine Learning Projects at DeepLearning.ai
• AI for Everyone at DeepLearning.ai
• AI for Medical Diagnosis at DeepLearning.ai
• AI for Medical Prognosis at DeepLearning.ai
24. PYTHON PROGRAMMING
Books
➢ Python Crash Course
➢ Python for Data Analysis
➢ Feature Engineering for Machine Learning
➢ Hands-On ML with Scikit-Learn & TensorFlow
➢ Python Machine Learning
➢ Apache Spark Deep Learning Cookbook
About 30 GitHub repositories
28. Machine Learning Engineer / Professional Market
• The global machine learning market was valued at $1.58B in 2017 and is expected to reach $20.83B in 2024, growing more than 40% annually.
• The current average salary of an ML Engineer is about $150K, with the highest-paying companies offering more than $200K.
• Demand has increased significantly in recent years.
29. ML Implementation in Pharmaceutical Industry
• Drug discovery
• Drug candidate selection
• Supply Chain optimization
• Medical image recognition
• Medical diagnosis
• Optimum site selection or
recruitment
• Data anomaly detection
• Personalized medicine
• Medical coding
• Sales and Marketing Optimization
• Pharmacovigilance
• Drug Development
30. Should I transition to Machine Learning Engineer?
Should we learn or know about Machine Learning?
What kind of impact can we have if we add ML knowledge to the Biometric Department?