This document discusses deploying SharePoint 2013 on Microsoft Azure infrastructure as a service (IaaS). It covers key Azure concepts like virtual networks, availability, disks, and virtual machines. Virtual networks allow grouping of virtual machines and enabling Active Directory. High availability is achieved through location, regions, affinity groups, and availability sets. Disk storage and performance considerations for databases and content are provided. Sample virtual machine configurations show optimal disk layout and sizing for SharePoint and SQL Server.
This document discusses Nomad, a distributed, highly available, datacenter-aware cluster scheduler developed by HashiCorp. Nomad schedules work (tasks) across available resources (hosts) to optimize utilization. It allows defining jobs through a declarative job specification language and handles scheduling work to available resources. Nomad aims to provide flexibility for different workloads through pluggable drivers, schedulers and fingerprinting while also being operationally simple to use with a single binary, no dependencies, and high availability.
Roman Novikov, "Best Practices for MySQL Performance & Troubleshooting with th..." (Fwdays)
Troubleshooting performance issues can be a bit tricky, especially when you’re given a broad statement that the database is slow.
Learn to direct your attention to the correct moving pieces and fix what needs your attention.
Learn how all this is done at Percona, what we monitor and track, and the tools we use.
This document discusses Nomad and Consul, two products from HashiCorp that help with deploying and discovering services at scale. Nomad is a cluster scheduler that allows specifying jobs to deploy applications across datacenters. It provides advantages like higher resource utilization, decoupling work from resources, and better quality of service through features like bin packing and priorities. Consul is a service discovery and configuration tool that supports querying across datacenters and regions. It uses Raft consensus and gossip protocols to maintain high availability and scalability.
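The bin-packing idea mentioned above can be sketched in a few lines. This is an illustrative first-fit-decreasing placement over a single CPU dimension, not Nomad's actual algorithm; the task and node names and capacities are made up.

```python
# Hypothetical sketch of the bin-packing idea behind cluster schedulers
# like Nomad: place each task on the first node with enough free capacity,
# considering larger tasks first to reduce fragmentation.

def first_fit_decreasing(tasks, nodes):
    """tasks: {name: cpu_mhz}, nodes: {name: free_cpu_mhz}.
    Returns {task: node} placements, or raises if a task cannot fit."""
    placements = {}
    free = dict(nodes)
    for task, cpu in sorted(tasks.items(), key=lambda kv: -kv[1]):
        for node, capacity in free.items():
            if capacity >= cpu:
                placements[task] = node
                free[node] = capacity - cpu
                break
        else:
            raise RuntimeError(f"no node can fit task {task!r}")
    return placements

if __name__ == "__main__":
    tasks = {"web": 500, "api": 1200, "worker": 800}
    nodes = {"node-1": 2000, "node-2": 1000}
    print(first_fit_decreasing(tasks, nodes))
```

Real schedulers also weigh memory, disk, and anti-affinity constraints, and rank multiple feasible nodes rather than taking the first fit.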
Azure Data Factory (ADF) CICD experiences. Demo on how to set up CI and CD using ADF provisioned through Terraform and with deployment and git in Azure DevOps. Presented at Microsoft Data Platform User Group Oslo, Norway January 2020
Amazon Web Services Customer Case Study: Fashion for Home (Amazon Web Services)
This document discusses how a designer furniture company uses AWS services like S3, CloudFront, EC2, and RDS to run their online shop. It provides examples of how S3 can be used to store reports, backups, images, and other files. CloudFront is used as a CDN to distribute graphics, CSS, and JS files globally. General recommendations include using long-term pricing plans, monitoring costs with CloudWatch, using multiple regions for high availability, and employing security best practices.
MongoDB and Amazon Web Services: Storage Options for MongoDB Deployments (MongoDB)
When using MongoDB and AWS, you want to design your infrastructure to avoid storage bottlenecks and make the best use of your available storage resources. AWS offers a myriad of storage options, including ephemeral disks, EBS, Provisioned IOPS, and ephemeral SSDs, each offering different performance and persistence characteristics. In this session, we’ll evaluate each of these options in the context of your MongoDB deployment, assessing the benefits and drawbacks of each.
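The gp2 sizing rule of thumb can be made concrete with a small helper. The figures used here (3 IOPS per GiB baseline, a 100 IOPS floor, a 16,000 IOPS ceiling) reflect AWS's published EBS documentation and may change over time, so treat them as assumptions in this back-of-the-envelope sketch.

```python
def gp2_baseline_iops(volume_gib):
    """Baseline IOPS for a gp2 EBS volume: 3 IOPS per GiB, with a
    100 IOPS floor and a 16,000 IOPS ceiling (figures assumed from
    AWS documentation; check current limits before relying on them)."""
    return min(max(volume_gib * 3, 100), 16000)

def needs_provisioned_iops(workload_iops, volume_gib):
    """True if the sustained workload exceeds what gp2 can deliver at
    this volume size, suggesting an io1/io2 (Provisioned IOPS) volume."""
    return workload_iops > gp2_baseline_iops(volume_gib)
```

For example, a 500 GiB gp2 volume baselines at 1,500 IOPS, so a sustained 5,000-IOPS MongoDB workload on it would be a candidate for Provisioned IOPS.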
Data Orchestration Summit
www.alluxio.io/data-orchestration-summit-2019
November 7, 2019
Presto on Alluxio Hands-On Lab
Speakers:
Bin Fan, Alluxio
Zac Blanco, Alluxio
Kamil Bajda-Pawlikowski, Starburst, Presto Company
Martin Traverso, Presto Software Foundation
For more Alluxio events: https://www.alluxio.io/events/
San Francisco HashiCorp User Group at GitHub (Jon Benson)
This document discusses Nomad and Consul, two products from HashiCorp that help with deploying and discovering services at scale. Nomad is a cluster scheduler that allows specifying jobs to deploy applications across datacenters. It provides advantages like higher resource utilization, decoupling work from resources, and better quality of service. Consul is a service discovery and configuration tool that supports service registration, health checking, and queries at scale across datacenters. The presentation covers the architectures and advantages of both Nomad and Consul for operating large clusters in a multi-region environment.
Heap Dump Analysis - AEM: Real World Issues (Kanika Gera)
This document discusses analyzing Java heap dumps to diagnose out of memory errors. It begins with an overview of Java heap concepts like how memory is allocated and garbage collection. Next, it defines what a heap dump is and how to generate one. It then explains how to analyze heap dumps using tools like MAT to identify the largest objects consuming memory, visualize object reference graphs and dominator trees, and investigate threads. The goal is to find memory leaks and reduce memory usage to prevent out of memory errors from occurring.
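The "retained size" metric that tools like MAT report can be illustrated with a toy object graph. This hedged sketch (the graph, sizes, and names are invented) computes how many bytes would become collectable if one object were garbage collected, which is exactly why the largest retained sizes point at leak suspects.

```python
from collections import deque

def reachable(graph, roots, skip=None):
    """Objects reachable from the GC roots, optionally pretending one
    object (and its outgoing references) has been removed."""
    seen = set()
    queue = deque(r for r in roots if r != skip)
    seen.update(queue)
    while queue:
        node = queue.popleft()
        for ref in graph.get(node, ()):
            if ref != skip and ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return seen

def retained_size(graph, sizes, roots, obj):
    """Retained size of obj: bytes that would become unreachable
    (and thus collectable) if obj were garbage collected."""
    kept = reachable(graph, roots, skip=obj)
    freed = reachable(graph, roots) - kept
    return sum(sizes[o] for o in freed)
```

A cache holding two large entries retains its own shallow size plus everything only it keeps alive; MAT's dominator tree computes the same quantity far more efficiently over millions of objects.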
Big Data, Big Projects, Big Mistakes: How to Jumpstart and Deliver with Success (Altoros)
This document contains information about Altoros Systems and their Chief Technology Officer Andrei Yurkevich. It discusses Altoros' services including Hadoop performance engineering and cloud automation. It also includes details about their global employee size, customers, and partners. Later sections evaluate different cloud platform options and database technologies for building a data analytics prototype within budget and functional requirements.
This document provides an overview of HashiCorp Nomad, including its key concepts, architecture, scheduling process, job specification, runtime environment, task drivers, and HTTP API. Nomad is an open source project that supports Docker containers, operates simply with one binary across datacenters, and is built for scale and hybrid cloud deployments. It uses a client-server model with Raft consensus and gossip protocols to manage membership across regions. Scheduling is inspired by Google papers and involves evaluating state changes to generate allocation plans that place tasks based on feasibility and ranking nodes.
Building Fast SQL Analytics on Anything with Presto, Alluxio (Alluxio, Inc.)
Alluxio Bay Area Meetup @ Galvanize | SF
Aug 20, 2019
Interactive Analytics in the Cloud with Presto and Alluxio
Speaker:
Bin Fan, Founding Engineer, Alluxio
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB (ScyllaDB)
In this talk, AWS’ Ken Krupa, Head of Specialized Solutions Architecture, will describe the architecture and capabilities of two new AWS EC2 instance types perfect for data-intensive storage and IO-heavy workloads like ScyllaDB: the Intel-based I4i and the Graviton2-based I4g series.
The Intel Xeon Ice Lake-based I4i series provides unparalleled raw horsepower for your most demanding workloads. Meanwhile, the Graviton2-powered I4g instances provide lower cost per storage on a power-efficient platform to deploy your cloud-native applications.
Ken will also describe the AWS Nitro SSD, a new form of high-speed NVMe storage with a Flash Translation Layer built with Nitro controllers, which powers both of these instance families.
ScyllaDB VP of Product Tzach Livyatan will then share benchmarking results showing how ScyllaDB behaves under load on these two instance types, providing maximum system utility and efficiency.
To watch all of the recordings hosted during Scylla Summit 2022 visit our website here: https://www.scylladb.com/summit.
The document discusses various techniques for optimizing performance and scaling WordPress sites. It covers caching at the disk, memory, page, and object levels. It also discusses scaling strategies like using multiple web and database servers, database sharding, file syncing, and caching technologies like Memcached. Specific caching plugins like Batcache and W3TC are mentioned. Coding best practices like using transients and the WordPress APIs are recommended to optimize performance.
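WordPress transients are a PHP API, but the underlying TTL-cache idea behind them (and behind Memcached-style object caching) is language-agnostic. A minimal Python sketch, not the WordPress implementation, might look like this:

```python
import time

class TransientCache:
    """Minimal TTL cache sketch illustrating the idea behind WordPress
    transients / object caching: store an expensive result with an
    expiry, and behave like a miss once it has expired."""

    def __init__(self):
        self._store = {}

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires = entry
        if time.monotonic() >= expires:
            del self._store[key]   # expired: evict and report a miss
            return default
        return value
```

The caller pattern is the same as with transients: on a miss, recompute the value (e.g. an expensive query) and `set` it again with a fresh TTL.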
Volodymyr Tsap, "Constraint driven infrastructure - scale or tune?" (Fwdays)
Volodymyr Tsap discusses how to save money on infrastructure through constraint driven design. He provides examples of hardware configurations on AWS, bare metal servers, and PaaS platforms to demonstrate how costs can be optimized. Tsap also outlines ways to reduce software costs through choices in operating system, virtualization, databases, and orchestration. Infrastructure support costs depend on the complexity of the environment, with basic setups costing $500-800 per month while more advanced architectures are $4,000-6,000 per month. The overall message is that money saved through optimization can be invested in people.
Building Cloud Native Analytical Pipelines on AWS (Alluxio, Inc.)
Alluxio Bay Area Meetup @ Galvanize | SF
Aug 20, 2019
Interactive Analytics in the Cloud with Presto and Alluxio
Speaker:
Irene Cai, Software Engineer, Google
Primary Storage in CloudStack by Mike Tutkowski (buildacloud)
Primary storage in CloudStack stores running virtual machine disk volumes on hosts and is used for production applications, databases, and dev/test systems. It requires high-performance storage that can handle high change content and bursty I/O workloads. To configure primary storage, administrators first set up storage space on a SAN, create a hypervisor-level storage repository, and then define a primary storage in CloudStack that is associated with compute offerings for user VMs.
This document discusses caching services available on Windows Azure, including content delivery networks (CDNs) and caching. It describes how CDNs deliver content closer to end users, and caching stores frequently accessed data closer to Azure applications. Caching on Azure can be done through dedicated roles, co-location with applications, or shared caching services. The document outlines characteristics of CDNs like dedicated endpoints and worldwide datacenters. It also provides examples of caching configuration and workflows in Visual Studio and code samples for putting and getting items from the cache.
ASP.NET MVC 5 is a framework for building scalable and standards-based web applications using established design patterns and the power of ASP.NET and .NET. It allows applications to run on IIS or self-host on Windows, Linux, and Mac OS X using the .NET runtime and libraries delivered via NuGet. Applications are built with MSBuild/Roslyn and hosted by Kestrel, IIS, or HTTP.SYS, with libraries from NuGet rather than the GAC.
This document discusses server and code architectures that can scale easily as an application grows. It presents different server setup structures (linear, diamond, fan-out, multi-fan) and strategies for scaling web/API servers using Node.js. It also covers data storage options and how to scale storage. The key is to design architectures that can grow horizontally by expanding to other servers rather than only vertically by increasing the resources of a single server.
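The horizontal-scaling idea, many interchangeable servers behind a dispatcher rather than one ever-larger server, can be sketched with a toy round-robin picker (the server names are hypothetical; real deployments would use a load balancer with health checks):

```python
import itertools

class RoundRobinDispatcher:
    """Toy sketch of horizontal scaling: spread requests across a pool
    of interchangeable servers instead of growing a single server.
    Adding capacity means appending a server to the pool."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        """Return the server that should handle the next request."""
        return next(self._cycle)
```

This only illustrates the dispatch step of the fan-out structures the talk describes; stateful concerns (sessions, storage) are what make the data tier harder to scale the same way.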
[Pgday.Seoul 2018] An introduction to a library OS developed for PostgreSQL performance, by Apposha (PgDay.Seoul)
This document introduces AppOS, an operating system specialized for database performance. It discusses how AppOS improves on Linux by being more optimized for database workloads through techniques like specialized caching, I/O scheduling based on database priorities, and atomic writes. It also explains how AppOS is portable, high performing, and extensible to support different databases through its modular design. Future plans include improving cache management, parallel query optimization, and cooperative CPU scheduling.
This document discusses moving MongoDB to the cloud. It provides an overview of MongoDB hosting options including on-premises data centers, cloud providers, and hosted databases. It outlines some key reasons to move to the cloud, such as cost-effectiveness, reduced need for staffing, and improved availability. It also covers important considerations for strategy planning including instance types, high availability strategy, security, and migration/rollback strategies. Finally, it discusses two common strategies for migrating - adding a cloud server to an existing replica set with no downtime, or taking backups and restoring to the cloud which requires downtime.
Rigorous and Multi-tenant HBase Performance Measurement (DataWorks Summit)
The document discusses techniques for rigorously measuring HBase performance in both standalone and multi-tenant environments. It begins with an overview of HBase and the Yahoo! Cloud Serving Benchmark (YCSB) for evaluating databases. It then discusses best practices for cluster setup, data loading, and benchmarking techniques like warming the cache, setting target throughput, and using appropriate workloads. Finally, it covers challenges in measuring HBase performance when used alongside other frameworks like MapReduce and Solr in a multi-tenant setting.
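YCSB's skewed workloads are driven by a Zipfian key distribution, so a few "hot" keys dominate traffic. The sampler below is a simplified stand-in for YCSB's ZipfianGenerator: the theta default of 0.99 mirrors YCSB's constant, but the naive CDF walk here is for illustration only and would be too slow for a real benchmark.

```python
import random

def zipfian_sampler(n_keys, theta=0.99, seed=42):
    """Return a function sampling key indices with Zipf-like skew:
    low indices are 'hot', mimicking the access patterns YCSB uses.
    Simplified stand-in, not YCSB's actual generator."""
    weights = [1.0 / (i + 1) ** theta for i in range(n_keys)]
    total = sum(weights)
    cdf, acc = [], 0.0
    for w in weights:
        acc += w / total
        cdf.append(acc)
    rng = random.Random(seed)

    def sample():
        r = rng.random()
        for i, c in enumerate(cdf):   # linear CDF walk: fine for a demo
            if r <= c:
                return i
        return n_keys - 1

    return sample
```

Sampling many draws shows key 0 receiving far more traffic than the tail, which is why cache warming matters so much when benchmarking HBase with these workloads.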
Rigorous and Multi-tenant HBase Performance (Cloudera, Inc.)
The document discusses techniques for rigorously measuring Apache HBase performance in both standalone and multi-tenant environments. It introduces the Yahoo! Cloud Serving Benchmark (YCSB) and best practices for cluster setup, workload generation, data loading, and measurement. These include pre-splitting tables, warming caches, setting target throughput, and using appropriate workload distributions. The document also covers challenges in achieving good multi-tenant performance across HBase, MapReduce and Apache Solr.
The document summarizes Oracle's Big Data Appliance and solutions. It discusses the Big Data Appliance hardware which includes 18 servers with 48GB memory, 12 Intel cores, and 24TB storage per node. The software includes Oracle Linux, Apache Hadoop, Oracle NoSQL Database, Oracle Data Integrator, and Oracle Loader for Hadoop. Oracle Loader for Hadoop can be used to load data from Hadoop into Oracle Database in online or offline mode. The Big Data Appliance provides an optimized platform for storing and analyzing large amounts of data and is integrated with Oracle Exadata.
Custom Coded Projects - When picking up a project, you have many choices to make. Do you go for a premium theme and prebuilt plugins, or will you write parts yourself? I will discuss the impact custom building a project can have, focusing on time, cost, and speed to help with decision making on future projects.
This document discusses using virtualization and containers to improve database deployments in development environments. It notes that traditional database deployments are slow, taking 85% of project time for creation and refreshes. Virtualization allows for more frequent releases by speeding up refresh times. The document discusses how virtualization engines can track database changes and provision new virtual databases in seconds from a source database. This allows developers and testers to self-service provision databases without involving DBAs. It also discusses how virtualization and containers can optimize database deployments in cloud environments by reducing storage usage and data transfers.
Should I move my database to the cloud? (James Serra)
So you have been running on-prem SQL Server for a while now. Maybe you have taken the step to move it from bare metal to a VM, and have seen some nice benefits. Ready to see a TON more benefits? If you said “YES!”, then this is the session for you as I will go over the many benefits gained by moving your on-prem SQL Server to an Azure VM (IaaS). Then I will really blow your mind by showing you even more benefits by moving to Azure SQL Database (PaaS/DBaaS). And for those of you with a large data warehouse, I also got you covered with Azure SQL Data Warehouse. Along the way I will talk about the many hybrid approaches so you can take a gradual approach to moving to the cloud. If you are interested in cost savings, additional features, ease of use, quick scaling, improved reliability and ending the days of upgrading hardware, this is the session for you!
PHD Virtual: Optimizing Backups for Any Storage (Mark McHenry)
Learn about the differences between virtual full, and traditional full and incremental backup modes, and which mode works best depending on the type of storage.
A powerful feature in Postgres called Foreign Data Wrappers lets end users integrate data from MongoDB, Hadoop and other solutions with their Postgres database and leverage it as a single, seamless database using SQL.
Use of these features has skyrocketed since EDB released to the open source community new FDWs for MongoDB, Hadoop and MySQL that support both read and write capabilities. Now greatly enhanced, FDWs enable integrating data across disparate deployments to support new workloads, expanded development goals and harvesting greater value from data.
Learn more about Foreign Data Wrappers (FDWs) and Postgres with Sameer Kumar, Database Consultant from Ashnik.
Target Audience: This presentation is intended for IT Professionals seeking to do more with Postgres in his every day projects and build new applications.
The Perils and Triumphs of using Cassandra at a .NET/Microsoft Shop (Jeff Smoley)
NativeX recently transitioned a large portion of their backend infrastructure from Microsoft SQL Server to Apache Cassandra. Check out our story about how we were successful at getting our .NET web apps to reliably connect to Cassandra. Learn about FluentCassandra, Snowflake, Hector, and IKVM. It's a story of struggle and perseverance, where everyone lives happily ever after.
SQL 2014 hybrid platform - Azure and on premise (Shy Engelberg)
The document provides an overview of integration features between SQL Server 2014 and Windows Azure. It discusses capabilities like deploying a SQL database to an Azure virtual machine, storing database data files in Azure storage, backing up SQL databases to Azure storage, and using Azure virtual machines for disaster recovery of SQL Server databases through availability group replicas. The document contains disclaimers that it provides overviews rather than technical details and that some demos may fail due to bugs in the preview release. It also includes contact information for the presenter.
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St... (Ceph Community)
This document discusses best practices for implementing Ceph-powered storage as a service. It covers planning a Ceph implementation based on business and technical requirements. Various use cases for Ceph are described, including OpenStack, cloud storage, web-scale applications, high performance block storage, archive/cold storage, databases and Hadoop. Architectural considerations for redundancy, servers, networking are also discussed. The document concludes with a case study of a university implementing a Ceph-based storage cloud to address storage needs for cancer and genomic research data.
This deck presents the best practices of using Apache Hive with good performance. It covers getting data into Hive, using ORC file format, getting good layout into partitions and files based on query patterns, execution using Tez and YARN queues, memory configuration, and debugging common query performance issues. It also describes Hive Bucketing and reading Hive Explain query plans.
Evan Pollan talks about Bazaarvoice's Hadoop infrastructure for clickstream analytics, as well as an approach to large-scale cardinality analysis using Map/Reduce and HBase.
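To make the map/reduce approach to cardinality analysis concrete, here is a toy pure-Python sketch of the idea (exact distinct counts on a made-up clickstream; Bazaarvoice's actual pipeline runs on Hadoop and HBase at a scale where approximate structures like HyperLogLog come into play):

```python
from collections import defaultdict

def map_phase(records):
    # Emit (page, user) pairs, mimicking a Hadoop mapper
    # walking clickstream records.
    for rec in records:
        yield (rec["page"], rec["user"])

def reduce_phase(pairs):
    # Group by key and count distinct values per key, mimicking
    # a reducer computing per-page unique visitors.
    groups = defaultdict(set)
    for key, value in pairs:
        groups[key].add(value)
    return {key: len(values) for key, values in groups.items()}

clicks = [
    {"page": "/home", "user": "u1"},
    {"page": "/home", "user": "u2"},
    {"page": "/home", "user": "u1"},  # repeat visit, counted once
    {"page": "/buy",  "user": "u2"},
]
unique_visitors = reduce_phase(map_phase(clicks))
print(unique_visitors)  # {'/home': 2, '/buy': 1}
```

The set-per-key approach is exact but memory-hungry, which is precisely why large-scale systems trade it for fixed-size probabilistic sketches.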
Hadoop in the cloud – The what, why and how from the experts (DataWorks Summit)
The document discusses Hadoop in the cloud and its benefits. It summarizes that Hadoop in the cloud provides distributed storage, automated failover, hyper-scaling, distributed computing, and extensibility. It also discusses deploying Hadoop clusters in Azure HDInsight and options for customizing clusters and integrating them.
How DataCore software radically decreased call center response times at 9-1-1 Emergency Communications of Southern Oregon, delivering 20-40x improvements. DataCore's 10/26/17 Advantech Solution Day presentation by Sushant Rao.
Event details: http://www.advantech-eautomation.com/eMarketingPrograms/Server_SolutionDay/
Delivering Apache Hadoop for the Modern Data Architecture (Hortonworks)
Join Hortonworks and Cisco as we discuss trends and drivers for a modern data architecture. Our experts will walk you through some key design considerations when deploying a Hadoop cluster in production. We'll also share practical best practices around Cisco-based big data architectures and Hortonworks Data Platform to get you started on building your modern data architecture.
Workshop on Advanced Design Patterns for Amazon DynamoDB - DAT405 - re:Invent... (Amazon Web Services)
Join us for the first-ever Amazon DynamoDB practical hands-on workshop. This session is designed for developers, engineers, and database administrators who are involved in designing and maintaining DynamoDB applications. We begin with a walkthrough of proven NoSQL design patterns for at-scale applications. Next, we use step-by-step instructions to apply lessons learned to design DynamoDB tables and indexes that are optimized for performance and cost. Expect to leave this session with the knowledge to build and monitor DynamoDB applications that can grow to any size and scale. Attendees should have a basic understanding of DynamoDB. To attend this workshop, bring your laptop.
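The single-table patterns the workshop covers all rest on DynamoDB's composite primary key: a partition key plus an ordered sort key. As a rough illustration (a toy in-memory model with made-up keys, not the DynamoDB API or boto3):

```python
from bisect import insort

class ToyTable:
    """In-memory sketch of DynamoDB's composite-key layout:
    items live under a partition key (pk), ordered by sort key (sk)."""

    def __init__(self):
        self._partitions = {}  # pk -> sorted list of (sk, item)

    def put_item(self, pk, sk, item):
        rows = self._partitions.setdefault(pk, [])
        insort(rows, (sk, item))

    def query(self, pk, sk_prefix=""):
        # Like Query with begins_with(sk, prefix): touches one
        # partition, returns items in sort-key order, no table scan.
        return [item for sk, item in self._partitions.get(pk, [])
                if sk.startswith(sk_prefix)]

t = ToyTable()
# Single-table design: a customer and their orders share one partition,
# so one Query fetches the whole aggregate.
t.put_item("CUST#42", "PROFILE", {"name": "Ada"})
t.put_item("CUST#42", "ORDER#2024-01-03", {"total": 30})
t.put_item("CUST#42", "ORDER#2024-02-14", {"total": 12})

print(t.query("CUST#42", "ORDER#"))  # the two orders, in date order
```

Designing the sort key so that related items share a prefix (here `ORDER#<date>`) is what lets a single cheap query replace joins.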
This document provides an introduction to Cloudant, which is a fully managed NoSQL database as a service (DBaaS) that provides a scalable and flexible data layer for web and mobile applications. The presentation discusses NoSQL databases and why they are useful, describes Cloudant's features such as document storage, querying, indexing and its global data presence. It also provides examples of how companies like FitnessKeeper and Fidelity Investments use Cloudant to solve data scaling and management challenges. The document concludes by outlining next steps for signing up and exploring Cloudant.
Similar to Kenshoo - Use Hadoop, One Week, No Coding
How Data-Driven Approaches are Changing Your Data Management Strategies
Introducing data-driven strategies into your business model alters the way your organization manages and provides information to your customers, partners and employees. Gone are the days of “waterfall” implementation strategies from relational data to applications within a data center. Now, data-driven business models require agile implementation of applications based on information from all across an organization–on-premises, cloud, and mobile–and includes information from outside corporate walls from partners, third-party vendors, and customers. Data management strategies need to be ready to meet these challenges or your new and disruptive business models will fail at the most critical time: when your customers want to access it.
ML Workshop 2: Machine Learning Model Comparison & Evaluation (MapR Technologies)
This document discusses machine learning model comparison and evaluation. It describes how the rendezvous architecture in MapR makes evaluation easier by collecting metrics on model performance and allowing direct comparison of models. It also discusses challenges like reject inferencing and the need to balance exploration of new models with exploitation of existing models. The document provides recommendations for change detection and analyzing latency distributions to better evaluate models over time.
Self-Service Data Science for Leveraging ML & AI on All of Your Data (MapR Technologies)
MapR has launched the MapR Data Science Refinery which leverages a scalable data science notebook with native platform access, superior out-of-the-box security, and access to global event streaming and a multi-model NoSQL database.
Enabling Real-Time Business with Change Data Capture (MapR Technologies)
Machine learning (ML) and artificial intelligence (AI) enable intelligent processes that can autonomously make decisions in real-time. The real challenge for effective ML and AI is getting all relevant data to a converged data platform in real-time, where it can be processed using modern technologies and integrated into any downstream systems.
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ... (MapR Technologies)
The document discusses machine learning and autonomous driving applications. It begins with a simple machine learning example of classifying images of chickens posted on Twitter. It then discusses how autonomous vehicles use machine learning by gathering large amounts of sensor data to train models for tasks like object recognition. The document also summarizes challenges for applying machine learning at an enterprise scale and how the MapR data platform can address these challenges by providing a unified environment for storing, accessing, and processing large amounts of diverse data.
ML Workshop 1: A New Architecture for Machine Learning Logistics (MapR Technologies)
Having heard the high-level rationale for the rendezvous architecture in the introduction to this series, we will now dig in deeper to talk about how and why the pieces fit together. In terms of components, we will cover why streams work, why they need to be persistent, performant and pervasive in a microservices design and how they provide isolation between components. From there, we will talk about some of the details of the implementation of a rendezvous architecture including discussion of when the architecture is applicable, key components of message content and how failures and upgrades are handled. We will touch on the monitoring requirements for a rendezvous system but will save the analysis of the recorded data for later. Listen to the webinar on demand: https://mapr.com/resources/webinars/machine-learning-workshop-1/
Machine Learning Success: The Key to Easier Model Management (MapR Technologies)
Join Ellen Friedman, co-author (with Ted Dunning) of a new short O’Reilly book Machine Learning Logistics: Model Management in the Real World, to look at what you can do to have effective model management, including the role of stream-first architecture, containers, a microservices approach and a DataOps style of work. Ellen will provide a basic explanation of a new architecture that not only leverages stream transport but also makes use of canary models and decoy models for accurate model evaluation and for efficient and rapid deployment of new models in production.
Data Warehouse Modernization: Accelerating Time-To-Action (MapR Technologies)
Data warehouses have been the standard tool for analyzing data created by business operations. In recent years, increasing data volumes, new types of data formats, and emerging analytics technologies such as machine learning have given rise to modern data lakes. Connecting application databases, data warehouses, and data lakes using real-time data pipelines can significantly improve the time to action for business decisions. More: http://info.mapr.com/WB_MapR-StreamSets-Data-Warehouse-Modernization_Global_DG_17.08.16_RegistrationPage.html
Live Tutorial – Streaming Real-Time Events Using Apache APIs (MapR Technologies)
For this talk we will explore the power of streaming real time events in the context of the IoT and smart cities.
http://info.mapr.com/WB_Streaming-Real-Time-Events_Global_DG_17.08.02_RegistrationPage.html
Bringing Structure, Scalability, and Services to Cloud-Scale Storage (MapR Technologies)
Deploying storage with a forklift is so 1990s, right? Today’s applications and infrastructure demand systems and services that scale. Customers require performance and capacity to fit the use case and workloads, not the other way around. Architects need multi-temperature, multi-location, highly available, and compliance friendly platforms that grow with the generational shift in data growth and utility.
Churn prediction is big business. It minimizes customer defection by predicting which customers are likely to cancel a service. Though originally used within the telecommunications industry, it has become common practice for banks, ISPs, insurance firms, and other verticals. More: http://info.mapr.com/WB_PredictingChurn_Global_DG_17.06.15_RegistrationPage.html
The prediction process is data-driven and often uses advanced machine learning techniques. In this webinar, we'll look at customer data, do some preliminary analysis, and generate churn prediction models – all with Spark machine learning (ML) and a Zeppelin notebook.
Spark’s ML library goal is to make machine learning scalable and easy. Zeppelin with Spark provides a web-based notebook that enables interactive machine learning and visualization.
In this tutorial, we'll do the following:
Review classification and decision trees
Use Spark DataFrames with Spark ML pipelines
Predict customer churn with Apache Spark ML decision trees
Use Zeppelin to run Spark commands and visualize the results
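The classification step at the heart of the tutorial can be shown without a Spark cluster. Below is a pure-Python sketch of one node of a decision tree (a Gini-impurity split on a single made-up feature — not the Spark ML API, which wraps this same idea in `DecisionTreeClassifier` and pipelines):

```python
def gini(labels):
    # Gini impurity of a binary label list: 1 - p0^2 - p1^2.
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def best_split(xs, ys):
    # Pick the threshold on one feature that minimizes the weighted
    # Gini impurity of the two sides -- what a decision tree does
    # at every node.
    best_t, best_score = None, float("inf")
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Toy data -- feature: customer-service calls; label: churned (1) or not (0).
calls =   [0, 1, 1, 2, 4, 5, 6, 7]
churned = [0, 0, 0, 0, 1, 1, 1, 1]
threshold = best_split(calls, churned)
predict = lambda x: int(x > threshold)
print(threshold, predict(3))  # 2 1
```

A full tree recurses on each side of the split; Spark ML additionally distributes the impurity computation across the cluster and feeds the model from DataFrames.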
An Introduction to the MapR Converged Data Platform (MapR Technologies)
Listen to the webinar on-demand: http://info.mapr.com/WB_Partner_CDP_Intro_EMEA_DG_17.05.31_RegistrationPage.html
In this 90-minute webinar, we discuss:
- The MapR Converged Data Platform and its components
- Use cases for the Converged Data Platform
- MapR Converged Partner Program
- How to get started with MapR
- Becoming a partner
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon... (MapR Technologies)
IT budgets are shrinking, and the move to next-generation technologies is upon us. The cloud is an option for nearly every company, but just because it is an option doesn’t mean it is always the right solution for every problem.
Most cloud providers would prefer that every customer be tightly coupled with their proprietary services and APIs to create lock-in with that cloud provider. The savvy customer will leverage the cloud as infrastructure and stay loosely bound to a cloud provider. This creates an opportunity for the customer to execute a multicloud strategy or even a hybrid on-premises and cloud solution.
Jim Scott explores different use cases that may be best run in the cloud versus on-premises, points out opportunities to optimize cost and operational benefits, and explains how to get the data moved between locations. Along the way, Jim discusses security, backups, event streaming, databases, replication, and snapshots across a variety of use cases that run most businesses today.
Is your organization at the analytics crossroads? Have you made strides collecting and sharing massive amounts of data from electronic health records, insurance claims, and health information exchanges but found these efforts made little impact on efficiency, patient outcomes, or costs?
Changes in how business is done combined with multiple technology drivers make geo-distributed data increasingly important for enterprises. These changes are causing serious disruption across a wide range of industries, including healthcare, manufacturing, automotive, telecommunications, and entertainment. Technical challenges arise with these disruptions, but the good news is there are now innovative solutions to address these problems. http://info.mapr.com/WB_Geo-distributed-Big-Data-and-Analytics_Global_DG_17.05.16_RegistrationPage.html
This document is the agenda for a MapR product update webinar that will take place in Spring 2017. It introduces MapR's new Persistent Application Client Container (PACC) which allows applications to easily persist data in Docker containers. It also discusses MapR Edge for IoT which extends MapR's converged data platform to the edge. The webinar will cover Hive, Spark, and Drill updates in the new MapR Ecosystem Pack 3.0. Speakers from MapR will provide details on these products and there will be a question and answer session.
3 Benefits of Multi-Temperature Data Management for Data Analytics (MapR Technologies)
SAP® HANA and SAP® IQ are popular platforms for various analytical and transactional use cases. If you’re an SAP customer, you’ve experienced the benefits of deploying these solutions. However, as data volumes grow, you’re likely asking yourself: How do I scale storage to support these applications? How can I have one platform for various applications and use cases?
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments (MapR Technologies)
SAP HANA is an increasingly popular platform for various analytical and transactional use cases with its in-memory architecture. If you’re an SAP customer, you’ve experienced the benefits.
However, the underlying storage for SAP HANA is painfully expensive. This slows down your ability to grow your SAP HANA footprint and serve up more applications.
You’re not the only one still loading your data into data warehouses and building marts or cubes out of it. But today’s data requires a much more accessible environment that delivers real-time results. Prepare for this transformation because your data platform and storage choices are about to undergo a re-platforming that happens once in 30 years.
With the MapR Converged Data Platform (CDP) and Cisco Unified Compute System (UCS), you can optimize today’s infrastructure and grow to take advantage of what’s next. Uncover the range of possibilities from re-platforming by intimately understanding your options for density, performance, functionality and more.
Drill can query JSON data stored in various data sources like HDFS, HBase, and Hive. It allows running SQL queries over JSON data without requiring a fixed schema. The document describes how Drill enables ad-hoc querying of JSON-formatted Yelp business review data using SQL, providing insights faster than traditional approaches.
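For intuition, the "SQL over raw JSON, no fixed schema" experience can be approximated with SQLite's JSON functions (a stand-in to show the querying style, not Drill itself; the review documents below are made up, loosely echoing the Yelp example):

```python
import json
import sqlite3

# Each row holds one raw JSON document; no schema is declared
# for the fields inside it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (doc TEXT)")
docs = [
    {"business": "Bert's Diner", "stars": 5, "city": "Madison"},
    {"business": "Moe's Tavern", "stars": 2, "city": "Madison"},
    {"business": "Bert's Diner", "stars": 3, "city": "Verona"},
]
conn.executemany("INSERT INTO reviews VALUES (?)",
                 [(json.dumps(d),) for d in docs])

# Ad-hoc aggregation over JSON fields, Drill-style: fields are
# addressed by path at query time.
rows = conn.execute("""
    SELECT json_extract(doc, '$.business') AS business,
           AVG(json_extract(doc, '$.stars')) AS avg_stars
    FROM reviews
    GROUP BY business
    ORDER BY avg_stars DESC
""").fetchall()
print(rows)  # [("Bert's Diner", 4.0), ("Moe's Tavern", 2.0)]
```

Drill goes further by pushing such queries directly against files in HDFS, HBase tables, or Hive metastore tables without the load step shown here.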
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A... (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on integration of Salesforce with Bonterra Impact Management.
Interested in deploying an integration with Salesforce for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Monitoring and Managing Anomaly Detection on OpenShift.pdf (Tosin Akinosho)
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
Introduction of Cybersecurity with OSS at Code Europe 2024 (Hiroshi SHIBATA)
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Fueling AI with Great Data with Airbyte Webinar (Zilliz)
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
GraphRAG for Life Science to increase LLM accuracy (Tomaz Bratanic)
GraphRAG for the life science domain: retrieve information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers.
5th LF Energy Power Grid Model Meet-up Slides (DanBrown980551)
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Microsoft Teams session or in person at TU/e, located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
- Insightful presentations covering two practical applications of the Power Grid Model.
- An update on the latest advancements in Power Grid Model technology during the first and second quarters of 2024.
- An interactive brainstorming session to discuss and propose new feature requests.
- An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Programming Foundation Models with DSPy - Meetup Slides (Zilliz)
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Dive into the realm of operating systems (OS) with Pravash Chandra Das, a seasoned Digital Forensic Analyst, as your guide. 🚀 This comprehensive presentation illuminates the core concepts, types, and evolution of OS, essential for understanding modern computing landscapes.
Beginning with the foundational definition, Das clarifies the pivotal role of OS as system software orchestrating hardware resources, software applications, and user interactions. Through succinct descriptions, he delineates the diverse types of OS, from single-user, single-task environments like early MS-DOS iterations, to multi-user, multi-tasking systems exemplified by modern Linux distributions.
Crucial components like the kernel and shell are dissected, highlighting their indispensable functions in resource management and user interface interaction. Das elucidates how the kernel acts as the central nervous system, orchestrating process scheduling, memory allocation, and device management. Meanwhile, the shell serves as the gateway for user commands, bridging the gap between human input and machine execution. 💻
The narrative then shifts to a captivating exploration of prominent desktop OSs, Windows, macOS, and Linux. Windows, with its globally ubiquitous presence and user-friendly interface, emerges as a cornerstone in personal computing history. macOS, lauded for its sleek design and seamless integration with Apple's ecosystem, stands as a beacon of stability and creativity. Linux, an open-source marvel, offers unparalleled flexibility and security, revolutionizing the computing landscape. 🖥️
Moving to the realm of mobile devices, Das unravels the dominance of Android and iOS. Android's open-source ethos fosters a vibrant ecosystem of customization and innovation, while iOS boasts a seamless user experience and robust security infrastructure. Meanwhile, discontinued platforms like Symbian and Palm OS evoke nostalgia for their pioneering roles in the smartphone revolution.
The journey concludes with a reflection on the ever-evolving landscape of OS, underscored by the emergence of real-time operating systems (RTOS) and the persistent quest for innovation and efficiency. As technology continues to shape our world, understanding the foundations and evolution of operating systems remains paramount. Join Pravash Chandra Das on this illuminating journey through the heart of computing. 🌟
Digital Marketing Trends in 2024 | Guide for Staying Ahead (Wask)
https://www.wask.co/ebooks/digital-marketing-trends-in-2024
Feeling lost in the digital marketing whirlwind of 2024? Technology is changing, consumer habits are evolving, and staying ahead of the curve feels like a never-ending pursuit. This e-book is your compass. Dive into actionable insights to handle the complexities of modern marketing. From hyper-personalization to the power of user-generated content, learn how to build long-term relationships with your audience and unlock the secrets to success in the ever-shifting digital landscape.
Best 20 SEO Techniques To Improve Website Visibility In SERP (Pixlogix Infotech)
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx (SitimaJohn)
Ocean Lotus cyber threat actors represent a sophisticated, persistent, and politically motivated group that poses a significant risk to organizations and individuals in the Southeast Asian region. Their continuous evolution and adaptability underscore the need for robust cybersecurity measures and international cooperation to identify and mitigate the threats posed by such advanced persistent threat groups.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf (Malak Abu Hammad)
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
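The core mechanic behind vector search — ranking stored documents by embedding similarity to a query vector — fits in a few lines. A pure-Python sketch with toy 3-dimensional vectors (not the MongoDB Atlas `$vectorSearch` API; real embeddings have hundreds or thousands of dimensions and use approximate nearest-neighbour indexes rather than this exhaustive scan):

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity: dot product of the vectors over the
    # product of their lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def vector_search(query_vec, index, k=2):
    # Rank every stored document by similarity to the query and
    # return the top-k ids, as a k-nearest-neighbour search would.
    ranked = sorted(index.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Toy "embeddings"; imagine dimension 0 means pets, dimension 2 means law.
index = {
    "doc_cats":    [0.9, 0.1, 0.0],
    "doc_dogs":    [0.8, 0.2, 0.1],
    "doc_tax_law": [0.0, 0.1, 0.95],
}
print(vector_search([1.0, 0.0, 0.0], index))  # ['doc_cats', 'doc_dogs']
```

Semantic relevance falls out of the geometry: the pet-like query vector ranks both pet documents above the unrelated one, with no keyword overlap required.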
Skybuffer SAM4U tool for SAP license adoption (Tatiana Kojar)
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.