In this slide deck I explain what a data engineer does and how the role differs from that of a data analyst and a data scientist.
AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303) - Amazon Web Services
For discovery-phase research, life sciences companies have to support infrastructure that processes millions to billions of transactions. The advent of a data lake to accomplish such a task is showing itself to be a stable and productive data platform pattern to meet the goal. We discuss how to build a data lake on AWS, using services and techniques such as AWS CloudFormation, Amazon EC2, Amazon S3, IAM, and AWS Lambda. We also review a reference architecture from Amgen that uses a data lake to aid in their Life Science Research.
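Services like the ones named above are typically composed declaratively. As a minimal, illustrative sketch (the bucket name and template contents are hypothetical, not the Amgen architecture described above), a data-lake landing bucket might be declared through a CloudFormation template, here generated as JSON from Python:

```python
import json

def data_lake_template(bucket_name: str) -> dict:
    """Build a minimal CloudFormation template declaring one versioned
    S3 bucket to serve as a data-lake landing zone (illustrative only)."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Data lake landing zone (illustrative sketch)",
        "Resources": {
            "LandingBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Hypothetical example name, not a real bucket
                    "BucketName": bucket_name,
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
    }

template = data_lake_template("example-research-data-lake")
print(json.dumps(template, indent=2))
```

A template like this could then be deployed with `aws cloudformation deploy`; a real data lake would add IAM policies, Lambda triggers, and cataloging on top of this skeleton.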
The presentation describes types of data pipeline architectures. It contains information about AWS services needed to create data pipelines based on Amazon Web Services. Also, users can find different diagrams of implemented pipelines on AWS.
Amazon QuickSight is a fast BI service that makes it easy for you to build visualizations, perform ad-hoc analysis, and quickly get business insights from your data. QuickSight is built to harness the power and scalability of the cloud, so you can easily run analysis on large datasets, and support hundreds of thousands of users. In this session, we’ll demonstrate how you can easily get started with Amazon QuickSight, uploading files, connecting to S3 and Redshift and creating analyses from visualizations that are optimized based on the underlying data. Once we’ve built our analysis and dashboard, we’ll show you how easy it is to share it with colleagues and stakeholders in just a few seconds. And with SPICE – QuickSight’s in-memory calculation engine – you can go from data to insights, faster than ever.
AWS re:Invent 2016: Visualizing Big Data Insights with Amazon QuickSight (BDM... - Amazon Web Services
Amazon QuickSight is a fast, cloud-powered business analytics service that makes it easy to build visualizations, perform ad-hoc analysis, and quickly get business insights from your data. Using our cloud-based service you can easily connect to your data, perform advanced analysis, and create stunning visualizations and rich dashboards that can be accessed from any browser or mobile device.
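The S3 connection path mentioned above relies on a manifest file that tells QuickSight which objects to load and how they are formatted. A minimal sketch of that manifest, with a hypothetical bucket and prefix:

```python
import json

def s3_manifest(uri_prefixes, fmt="CSV", contains_header=True):
    """Build a minimal QuickSight S3 manifest: fileLocations points at
    the objects to load, globalUploadSettings describes their format."""
    return {
        "fileLocations": [{"URIPrefixes": list(uri_prefixes)}],
        "globalUploadSettings": {
            "format": fmt,
            "containsHeader": "true" if contains_header else "false",
        },
    }

# Hypothetical bucket/prefix for illustration only
manifest = s3_manifest(["s3://example-bucket/clickstream/2016/"])
print(json.dumps(manifest, indent=2))
```

The resulting JSON would be saved to a file and referenced when creating the S3 data source in QuickSight.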
MSC203_How Citrix Uses AWS Marketplace Solutions To Accelerate Analytic Workl... - Amazon Web Services
Find out how Citrix built a solution using Matillion ETL for Amazon Redshift from AWS Marketplace to load all of their data into an Amazon Redshift cluster, allowing them to run analytics across the entire environment at once. We’ll discuss the transition they made to consolidate multiple disparate databases in order to run analytic workloads, get a holistic view of all their data sources, and prevent inconsistent data from being captured.
Orit Alul (Sr. Solutions Architect) @ AWS:
As data is growing at an exponential rate, we are interested not only in being able to analyze the past or present but also in predicting the future!
In this session, Orit will talk about the power of data combined with machine learning: building a highly scalable and flexible data architecture in the cloud to collect, process, and analyze data, so you can get timely insights and react quickly to new information.
In addition, Orit will present best practices and performance and optimization tips for building a data lake in the cloud.
If you could not be one of the 60,000+ in attendance at AWS re:Invent, the yearly Amazon cloud conference, get the 411 on the major announcements that were made in Las Vegas. This presentation covers new AWS services and products, exciting announcements, and updated features.
Analyzing Billions of Data Rows with Alteryx, Amazon Redshift, and Tableau - DATAVERSITY
Got lots of data? So does Amaysim, a leading Australian telecom provider, with its billions of rows of data. The organization successfully empowers its small team of data analysts with self-service data analytics platforms so they can easily access the data they need, perform advanced analytics, and visualize findings for all stakeholders. Register for this session and learn how Amaysim uses the Alteryx-Redshift-Tableau BI stack to easily and quickly:
Extract data from their data warehouse and blend and enrich it with other sources
Give data analytical context by running statistical, predictive, and deep geo-spatial analytics
Create visualizations from analytics and then update Tableau Workbooks directly from Alteryx, or publish the results in Amazon Redshift, for easy direct access for their stakeholders from Tableau
Hear from Adrian Loong, Alteryx Analytics Certified Expert (ACE), and product marketers from AWS and Alteryx on how organizations can use Alteryx, Amazon Redshift and Tableau to enable data analysts to spin up new self-service analytics instances to enable fast investigation for critical business decisions.
Managing Large Amounts of Data with Salesforce - Sense Corp
Critical "design skew" problems and solutions - Engaging Big Objects, MuleSoft, Snowflake and Tableau at the right time
Salesforce’s ability to handle large workloads and participate in high-consumption, mobile-application-powering technologies continues to evolve. Pub/sub models and investment in adjacent properties like Snowflake, Kafka, and MuleSoft have broadened the development scope of Salesforce. Solutions now range from internal, in-platform applications to fueling world-scale mobile applications and integrations. Unfortunately, guidance on these extended capabilities is neither well understood nor well documented. Knowing when to move your solution to a higher-order architecture is an important architect skill.
In this webinar, Paul McCollum, UXMC and Technical Architect at Sense Corp, will present an overview of data and architecture considerations. You’ll learn to identify reasons and guidelines for updating your solutions to larger-scale, modern reference infrastructures, and when to introduce products like Big Objects, Kafka, MuleSoft, and Snowflake.
Database Freedom is an AWS initiative that accelerates enterprise migrations from commercial database engines to AWS native database services or managed open-source systems. We review the basics of the Amazon purpose-built database strategy and cover our Workload Qualification Framework, which helps you determine a good database migration candidate and predict the level of effort. In the hands-on lab, you use AWS Schema Conversion Tool and AWS Database Migration Service to migrate your databases to Amazon Aurora PostgreSQL. Bring a laptop with Firefox or Chrome and a working AWS account. We provide an AWS CloudFormation template to configure the lab environment.
Microsoft Fabric is the next version of Azure Data Factory, Azure Data Explorer, Azure Synapse Analytics, and Power BI. It brings all of these capabilities together into a single unified analytics platform that goes from the data lake to the business user in a SaaS-like environment. The vision of Fabric is to be a one-stop shop for all the analytical needs of every enterprise and one platform for everyone, from a citizen developer to a data engineer. Fabric will cover the complete spectrum of services including data movement, data lake, data engineering, data integration and data science, observational analytics, and business intelligence. With Fabric, there is no need to stitch together different services from multiple vendors. Instead, the customer enjoys an end-to-end, highly integrated, single offering that is easy to understand, onboard, create with, and operate.
This is a hugely important new product from Microsoft and I will simplify your understanding of it via a presentation and demo.
Agenda:
What is Microsoft Fabric?
Workspaces and capacities
OneLake
Lakehouse
Data Warehouse
ADF
Power BI / DirectLake
Resources
Amazon Web Services provides a broad range of services that help you build and deploy big data analytics applications quickly and easily. AWS offers fast access to flexible, low-cost IT resources, letting you rapidly scale virtually any big data application, including data warehousing, clickstream analytics, fraud detection, recommendation engines, event-driven ETL, serverless computing, and Internet of Things processing. With AWS you do not need to make large upfront investments of time or money to build and maintain infrastructure. Instead, you can provision exactly the right type and size of resources you need to power your big data analytics applications. You can access as many resources as you need, almost instantly, and pay only for what you use.
Think of big data as all data, no matter the volume, velocity, or variety. The simple truth is that a traditional on-prem data warehouse will not handle big data. So what is Microsoft’s strategy for building a big data solution? And why is it best to have this solution in the cloud? That is what this presentation will cover. Be prepared to discover the various Microsoft technologies and products for collecting data, transforming it, storing it, and visualizing it. My goal is to help you not only understand each product but also understand how they all fit together, so you can be the hero who builds your company’s big data solution.
Two years ago, if someone had claimed they could stand up a petabyte-scale data warehouse in under an hour and then have a non-technical business user querying it live 30 minutes later without knowing any SQL or coding language, they would have been laughed out of the room. These days, that’s called taking advantage of disruptive technology. Amazon Web Services and Tableau Software have shifted the entire paradigm by which organizations not only store and access their data, but ultimately how they innovate with it. The fast, scalable, and inexpensive services that AWS provides for housing data, combined with Tableau’s remarkably flexible and user-friendly visual analytics solution, mean that within hours an organization can securely put the power of its massive data assets into the hands of its domain experts without expensive overhead or lengthy ramp-up time. Attend this webinar to learn how Amazon Web Services and Tableau Software are leveraged together every day to:
• Empower visual ad-hoc data discovery against big data
• Revolutionize corporate reporting and dashboards
• Promote data-driven decision making at every level
The presentation will include:
• A live demonstration of AWS and Tableau working together
• A real customer case study focused on fraud detection and online video metrics
• Live Q&A and an opportunity to trial both solutions
UiPath Test Automation using UiPath Test Suite series, part 4 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimizing testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. I have also often seen developers implement front-end features by simply following a framework’s standard conventions, assume that this is enough to launch the project successfully, and then watch the project fail. How can you prevent this, and which approach should you choose? I have launched dozens of complex projects, and during this talk we will analyze which approaches have worked for me and which have not.
Search and Society: Reimagining Information Access for Radical Futures - Bhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build, inspired by diverse, explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies need to be explicitly articulated, and we need to develop theories of change in the context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
JMeter webinar: integration with InfluxDB and Grafana - RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring of JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
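Under the hood, a JMeter Backend Listener ships sampler metrics to InfluxDB using InfluxDB's line protocol. A rough sketch of that wire format follows; the measurement, tag, and field names here are illustrative, not JMeter's exact defaults:

```python
def to_line_protocol(measurement, tags, fields, ts_ns):
    """Format one metric point as an InfluxDB line-protocol string:
    measurement,tag=value field=value,... timestamp(ns)."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

# Hypothetical sampler metrics for a "login" transaction
line = to_line_protocol(
    "jmeter",
    {"transaction": "login"},
    {"avg": 42.0, "count": 10},
    1700000000000000000,
)
print(line)
```

Grafana then queries these points from InfluxDB to render the live dashboards demonstrated in the webinar.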
Connector Corner: Automate dynamic content and events by pushing a button - DianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Accelerate your Kubernetes clusters with Varnish Caching - Thijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
2. Introduction
Who is a Data Engineer?
A data engineer is an IT worker whose primary job is to prepare data for analytical or operational uses. These software engineers are typically responsible for building data pipelines to bring together information from different source systems.
Data engineering is one of the most popular and in-demand jobs in the big data domain across the world.
3. But what do they do?
• Data Engineers build, monitor, and refine complex data models to help organizations improve their business outcomes by harnessing the power of their data.
• In other words, they work in a variety of settings to build systems that collect, manage, and convert raw data into usable information for data scientists and business analysts to interpret.
4. And what are they trying to achieve?
• Their ultimate goal is to make data accessible so that organizations can use it to evaluate and optimize their performance.
But now let's see what a day in the life of a data engineer is like:
5. Ok, but what skills would you need to be a Data Engineer?
• Coding: Proficiency in coding languages is essential to this role, so consider taking courses to learn and practice your skills. Common programming languages include SQL, NoSQL, Python, Java, R, and Scala.
• Relational and non-relational databases: Databases rank among the most
common solutions for data storage. You should be familiar with both
relational and non-relational databases, and how they work.
• ETL (extract, transform, and load) systems: ETL is the process by which
you’ll move data from databases and other sources into a single repository,
like a data warehouse. Common ETL tools include Xplenty, Stitch, Alooma,
and Talend.
• Data security: While some companies might have dedicated data security
teams, many data engineers are still tasked with securely managing and
storing data to protect it from loss or theft.
6. • Data storage: Not all types of data should be stored the same way, especially
when it comes to big data. As you design data solutions for a company, you’ll
want to know when to use a data lake versus a data warehouse, for example.
• Automation and scripting: Automation is a necessary part of working with big
data simply because organizations are able to collect so much information. You
should be able to write scripts to automate repetitive tasks.
• Machine learning: While machine learning is more the concern of data
scientists, it can be helpful to have a grasp of the basic concepts to better
understand the needs of data scientists on your team.
• Big data tools: Data engineers don’t just work with regular data. They’re often
tasked with managing big data. Tools and technologies are evolving and vary by
company, but some popular ones include Hadoop, MongoDB, Kafka and Spark.
• Cloud computing: You’ll need to understand cloud storage and cloud computing
as companies increasingly trade physical servers for cloud services. Beginners
may consider a course in Amazon Web Services (AWS) or Google Cloud.
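To make the ETL and scripting skills above concrete, here is a minimal sketch of an extract-transform-load script in plain Python. The source records, field names, and table are all hypothetical; it stands in for moving data from a source system into a warehouse-style store (SQLite plays the warehouse here for the sake of a runnable example).

```python
import sqlite3

# Hypothetical raw records "extracted" from a source system.
raw_orders = [
    {"id": 1, "amount": "19.99", "country": "us"},
    {"id": 2, "amount": "5.00",  "country": "DE"},
    {"id": 3, "amount": "",      "country": "US"},  # malformed row
]

def transform(rows):
    """Clean and normalize records, dropping rows with missing amounts."""
    out = []
    for r in rows:
        if not r["amount"]:
            continue  # skip rows that cannot be parsed
        out.append((r["id"], float(r["amount"]), r["country"].upper()))
    return out

def load(rows, conn):
    """Load transformed rows into a warehouse-style table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(raw_orders), conn)
print(conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone())
```

Real pipelines swap the in-memory list for a database or API reader and SQLite for a warehouse like Redshift, but the extract → transform → load shape stays the same.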
7. What is AWS?
Amazon Web Services (AWS) is a cloud computing platform offered by Amazon.com that provides a suite of services for building and running applications and websites. These services include computing, storage, database, analytics, machine learning, security, and many other functionalities, all of which can be accessed over the internet.
AWS launched in 2006 and has since become one of the leading cloud computing platforms in the world. It provides a wide range of services to businesses, organizations, and individuals, enabling them to build and run their applications and websites on top of the AWS infrastructure.
AWS services are available on a pay-as-you-go basis, allowing customers to pay only for the resources they use. This makes it a flexible and cost-effective solution for businesses, as they can scale their resources up or down as needed without having to make significant upfront investments in hardware and infrastructure.
8. Why has AWS become so popular nowadays?
There are several reasons why AWS has become popular in recent years:
1. Scalability: AWS allows businesses to scale their resources up or down as needed, which makes it a flexible solution for companies that experience fluctuating workloads.
2. Cost-effectiveness: AWS charges customers on a pay-as-you-go basis, so they only pay for the resources they use. This makes it a cost-effective solution for businesses, as they don't have to make significant upfront investments in hardware and infrastructure.
3. Wide range of services: AWS offers a wide range of services, including computing, storage, database, analytics, machine learning, security, and many others. This allows businesses to build and run a variety of different applications and websites on top of the AWS platform.
4. Reliability: AWS has a strong track record of uptime and reliability, which is important for businesses that rely on the platform to run their applications and websites.
5. Global presence: AWS has a global infrastructure with regions and availability zones located around the world. This allows businesses to run their applications and websites in the region that is closest to their customers, which can improve performance and reduce latency.
9. Is AWS a marketplace?
Yes, AWS operates a marketplace that allows businesses and individuals to buy and sell a wide range of cloud computing services. AWS provides a platform for vendors to offer their services, and customers can browse and purchase these services through the AWS website.
AWS offers a variety of services, including computing, storage, database, analytics, machine learning, security, and many others. Customers can use these services to build and run their applications and websites on top of the AWS infrastructure.
AWS also offers a number of tools and resources for vendors to use in developing and offering their services on the AWS Marketplace. This includes the AWS Partner Network, a global community of consulting and technology partners that can help businesses build and sell their services on AWS.
10. We are going to talk about Spark in a few minutes, but let's talk about AWS Glue first
• What is AWS Glue?
• AWS Glue is a serverless data integration service that makes it easy for analytics users to discover, prepare, move, and integrate data from multiple sources. You can use it for analytics, machine learning, and application development. It also includes additional productivity and data ops tooling for authoring, running jobs, and implementing business workflows.
• With AWS Glue, you can discover and connect to more than 70 diverse data
sources and manage your data in a centralized data catalog. You can
visually create, run and monitor extract, transform, and load (ETL) pipelines
to load data into your data lakes. Also, you can immediately search and
query cataloged data using Amazon Athena, Amazon EMR, and Amazon
Redshift Spectrum.
11. But how does it work?
Here's how AWS Glue works:
1.Data extraction: AWS Glue can extract data from a variety of sources,
including Amazon S3, Amazon RDS, Amazon Redshift, and other data stores.
2.Data transformation: AWS Glue can transform the extracted data using a variety of transformations, such as filtering, sorting, and aggregating data.
3.Data loading: AWS Glue can load the transformed data into a variety of data
stores, including Amazon S3, Amazon RDS, Amazon Redshift, and other data
stores.
AWS Glue also includes a number of features to help users build and maintain
their ETL jobs, including a visual development environment, a library of pre-built
connectors and transformations, and the ability to schedule ETL jobs.
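Real Glue jobs are typically written as PySpark scripts that run inside the service, but the transformation step above (filter, sort, aggregate) can be illustrated locally with a plain-Python sketch. The records and field names here are hypothetical, purely to show the shape of each transformation:

```python
from collections import defaultdict

# Hypothetical records, as an ETL job might extract them from a source table.
events = [
    {"user": "a", "kind": "click", "ms": 120},
    {"user": "b", "kind": "view",  "ms": 300},
    {"user": "a", "kind": "click", "ms": 80},
]

# Transform step: the three kinds of transformations the slide lists.
clicks = [e for e in events if e["kind"] == "click"]  # filter
clicks.sort(key=lambda e: e["ms"])                    # sort
totals = defaultdict(int)
for e in clicks:                                      # aggregate per user
    totals[e["user"]] += e["ms"]

print(dict(totals))  # {'a': 200}
```

In an actual Glue job the same logic would be expressed against Spark DataFrames or DynamicFrames so it can run in parallel across the data, with the load step writing the result to S3, Redshift, or another target.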
12. Data professionals talk about how they define data engineering and how it
differs from data analytics and data science.
13. Let’s talk about the technologies now:
• These are the Top 20 Most Commonly Used Data Engineering Tools in the Year 2022.
• Of course we cannot talk about all of them at the moment, but we will try our best to explain some of them.
14. 1. Amazon Redshift
Amazon Redshift is a fully-managed data warehouse service offered by Amazon
Web Services (AWS). It is designed to handle large amounts of data and allows
users to analyze data using SQL and business intelligence (BI) tools.
Amazon Redshift is based on a columnar data storage model, which allows it to efficiently store and retrieve data for fast querying and analysis. It also includes a variety of features to optimize performance, such as data compression and the ability to parallelize queries across multiple nodes.
Customers can use Amazon Redshift to store and analyze data for a wide range
of applications, including business intelligence, data warehousing, analytics, and
more. It is a cost-effective solution, as customers only pay for the resources they
use and can scale their resources up or down as needed.
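To see why the columnar model mentioned above helps analytics, here is a toy sketch (with made-up records) contrasting a row-oriented layout with a column-oriented one. An aggregate query only needs to scan the one column it touches:

```python
# Row-oriented layout: each record is stored together.
rows = [
    ("2024-01-01", "widget", 3, 29.97),
    ("2024-01-01", "gadget", 1, 9.99),
    ("2024-01-02", "widget", 2, 19.98),
]

# Column-oriented layout (the model Redshift uses): one array per column.
# Values in a column are similar, which also compresses well.
columns = {
    "date":    [r[0] for r in rows],
    "product": [r[1] for r in rows],
    "qty":     [r[2] for r in rows],
    "revenue": [r[3] for r in rows],
}

# A query like SELECT SUM(revenue) reads only the revenue column,
# instead of scanning every field of every row.
total = sum(columns["revenue"])
print(total)
```

This is only an in-memory analogy; Redshift additionally compresses each column and spreads the scan across the nodes of the cluster.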
17. 3. Tableau
• Tableau is an excellent data visualization and business intelligence tool used for reporting and analyzing vast volumes of data. The American company behind it started in 2003; in June 2019, Salesforce acquired Tableau. It helps users create different charts, graphs, maps, dashboards, and stories for visualizing and analyzing data, to help in making business decisions.
• It has features like:
• Tableau supports powerful data discovery and exploration that enables users to answer important questions in seconds
• Users without relevant experience can start creating visualizations with Tableau right away
• It can connect to several data sources that other BI tools do not support, and enables users to create reports by joining and blending different datasets
• Tableau Server provides a centralized location to manage all published data sources within an organization
18. Usage of Tableau in Walmart
And how to connect your own data warehouse to Tableau
• This is how to get data from Walmart Marketplace (and from other sources) into Tableau: land it in a data warehouse that is connected to Tableau.
• Load your Walmart Marketplace data into your central data warehouse to analyze it with Tableau.
24. Two Main Abstractions of Apache Spark
• Resilient Distributed Dataset (RDD): the fundamental data structure and primary data abstraction in Apache Spark and Spark Core. RDDs are fault-tolerant, immutable distributed collections of objects, which means once you create an RDD you cannot change it. Each dataset in an RDD is divided into logical partitions, which can be computed on different nodes of the cluster.
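The partitioning idea can be simulated in plain Python. This is not PySpark, just a toy model of how a dataset is split into logical partitions, each of which could be processed on a different node, with results combined at the end:

```python
# Toy simulation of the RDD model: an immutable dataset split into
# logical partitions that could live on different cluster nodes.
data = list(range(1, 11))
num_partitions = 3

# Partition the data (Spark does this automatically when an RDD is created).
partitions = [data[i::num_partitions] for i in range(num_partitions)]

# A "map" transformation runs independently on each partition and never
# mutates the original data; it produces a new dataset (immutability).
squared = [[x * x for x in part] for part in partitions]

# An action (like reduce) combines the per-partition results.
total = sum(sum(part) for part in squared)
print(total)  # sum of squares of 1..10 = 385
```

In real Spark the equivalent would be `sc.parallelize(data, 3).map(lambda x: x * x).sum()`, with the per-partition work scheduled across executors instead of a local loop.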