The document describes how to configure and deploy a MongoDB sharded cluster with 6 virtual machines in 30 minutes. It provides step-by-step instructions on installing MongoDB, setting up the config servers, adding shards, and enabling sharding for databases and collections. Key aspects include designating MongoDB instances as config servers, starting mongos processes connected to the config servers, adding shards by hostname and port, and enabling sharding on specific databases and collections with shard keys.
2. The presentation is based on VMs created using VMware ESXi/vSphere and RHEL/Oracle Linux.
This presentation is based on setting up a MongoDB sharded cluster with 6 VMs in the cluster (Red Hat Linux).
Each VM consists of 2 vCPUs and 4 GB of RAM.
Each VM is created with 80 GB of disk space. No special mounts/file systems are used.
Linux version used: RHEL 6.5
5. Installing MongoDB
The following steps guide you through installing the MongoDB software on Linux.
Set up the yum repository and download the packages using the yum package installer.
[root@bigdata1 ~]# cd /etc/yum.repos.d/
[root@bigdata1 yum.repos.d]# cat mongodb.repo
[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1
Before you run "yum install", be sure that internet connectivity is working.
[root@bigdata6 yum.repos.d]# yum install -y mongodb-org-2.6.1 mongodb-org-server-2.6.1 mongodb-org-shell-2.6.1 mongodb-org-mongos-2.6.1 mongodb-org-tools-2.6.1
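Once the install finishes, a quick sanity check confirms the binaries are on the PATH (a minimal verification sketch; the rpm query is just one way to list the installed packages):
[root@bigdata1 ~]# mongod --version
[root@bigdata1 ~]# mongos --version
[root@bigdata1 ~]# rpm -qa | grep mongodb-org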
6. Setting up MongoDB
Make sure to create the /data/configdb and /data/db directories on each server.
These directories should be owned by the mongod user and mongod group.
Change ownership to mongod: as the root user, run "chown mongod:mongod /data/configdb" and "chown mongod:mongod /data/db".
Without these directories, the mongod process will not start.
When you start the mongo daemon process, it will create the "/data/configdb/mongod.lock" file.
You can also start the mongod process with the service option as root, for example "service mongod start".
You can also configure the mongo daemon to start at system boot.
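Putting the above together, a minimal preparation sketch for each node (run as root; assumes the mongod user and group were created by the RPM install):
[root@bigdata1 ~]# mkdir -p /data/configdb /data/db
[root@bigdata1 ~]# chown mongod:mongod /data/configdb /data/db
[root@bigdata1 ~]# service mongod start
[root@bigdata1 ~]# chkconfig mongod on    (enables start at system boot)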
7. MongoDB Sharding Architecture
Sharding is a method for storing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations.
Database systems with large data sets and high throughput applications can challenge the capacity of a single server. High query rates can exhaust the CPU capacity of the server. Larger data sets exceed the storage capacity of a single machine. Finally, working set sizes larger than the system's RAM stress the I/O capacity of disk drives.
Sharding addresses the challenge of scaling to support high throughput and large data sets:
Sharding reduces the number of operations each shard handles. Each shard processes fewer operations as the cluster grows. As a result, a cluster can increase capacity and throughput horizontally. For example, to insert data, the application only needs to access the shard responsible for that record.
Sharding reduces the amount of data that each server needs to store. Each shard stores less data as the cluster grows. For example, if a database has a 1 terabyte data set, and there are 4 shards, then each shard might hold only 256 GB of data. If there are 40 shards, then each shard might hold only 25 GB of data.
9. Deploying a sharded cluster
Start the Config Server Database Instances.
The config server processes are mongod instances that store the cluster's metadata. Designate a mongod as a config server using the --configsvr option.
e.g.: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.131 -v
Each config server stores a complete copy of the cluster's metadata.
Production deployments require three config server instances, each of them running on a different server, to ensure high availability and data safety.
All members of a sharded cluster must be able to connect to all other members of the cluster. Make sure to set up proper network connectivity and firewall rules.
Start the three config server instances. For example:
Node1: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.131 -v
Node2: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.132 -v
Node3: mongod --configsvr --dbpath /data/configdb --port 27019 --bind_ip 192.168.0.133 -v
Start the mongos Instances:
The mongos instances are lightweight and do not require data directories. You can run a mongos instance on a system that runs other cluster components, such as an application server or a server running a mongod process.
By default, a mongos instance runs on port 27017. Specify the hostnames of the three config servers, either in the configuration file or as command line parameters.
For example: mongos --configdb bigdata1:27019,bigdata2:27019,bigdata3:27019
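As an alternative to command line parameters, the same settings can go in a configuration file (a sketch; the /etc/mongos.conf path and the INI-style format accepted by 2.6-era releases are assumptions):
# /etc/mongos.conf
configdb = bigdata1:27019,bigdata2:27019,bigdata3:27019
port = 27017
Then start mongos pointing at the file: mongos -f /etc/mongos.conf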
10. Add Shards to the Cluster
From a mongo shell, connect to the mongos instance and add each shard with sh.addShard(), using the following syntax:
[hdfs@bigdata3 ~]$ mongo --host bigdata2 --port 27017
MongoDB shell version: 2.6.1
connecting to: bigdata2:27017/test
rs0:PRIMARY>
sh.addShard("rs0/bigdata1:27017")
sh.addShard("rs0/bigdata2:27017")
sh.addShard("rs0/bigdata3:27017")
sh.addShard("rs0/bigdata4:27017")
sh.addShard("rs0/bigdata5:27017")
sh.addShard("rs0/bigdata6:27017")
11. Enable Sharding for a Collection
Before you can shard a collection, you must enable sharding for the collection's database. Enabling sharding for a database does not redistribute data, but makes it possible to shard the collections in that database.
Once you enable sharding for a database, MongoDB assigns a primary shard for that database, where MongoDB stores all data before sharding begins.
Issue the sh.enableSharding() method:
rs0:PRIMARY> show dbs
sh.enableSharding("customer") OR
db.runCommand( { enableSharding: "customer" } )
Then enable sharding on a per-collection basis:
sh.shardCollection("records.people", { "zipcode": 1, "name": 1 } )
sh.shardCollection("people.addresses", { "state": 1, "_id": 1 } )
sh.shardCollection("assets.chairs", { "type": 1, "_id": 1 } )
sh.shardCollection("events.alerts", { "_id": "hashed" } )