With this presentation you should be able to create a Kerberos-secured architecture for an interactive data analysis and machine learning framework, using Jupyter/JupyterHub powered by IPython clusters. This enables machine learning processing on local and/or remote nodes, all running as a service under a non-root user.
How to go the extra mile on monitoring (Tiago Simões)
This document provides instructions for monitoring additional metrics from clusters and applications using Grafana, Prometheus, JMX, and PushGateway. It includes steps to export JMX metrics from Kafka and NiFi, setup and use PushGateway to collect and expose custom metrics, and create Grafana dashboards to visualize the metrics.
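The PushGateway step described above boils down to rendering a metric in the Prometheus text exposition format and POSTing it to the gateway's job endpoint. A minimal sketch, assuming a gateway at localhost:9091 and a hypothetical metric name; this is an illustration of the flow, not code from the slides:

```python
from urllib import request

def exposition_payload(metric, value, help_text):
    """Render one gauge metric in the Prometheus text exposition format."""
    return (
        f"# HELP {metric} {help_text}\n"
        f"# TYPE {metric} gauge\n"
        f"{metric} {value}\n"
    )

def push(job, payload, gateway="http://localhost:9091"):
    """POST the payload to the PushGateway under the given job name."""
    req = request.Request(
        f"{gateway}/metrics/job/{job}",
        data=payload.encode(),
        method="POST",
    )
    return request.urlopen(req)

payload = exposition_payload("batch_rows_processed", 1234,
                             "Rows handled by the nightly batch job")
print(payload)
# push("nightly_batch", payload)  # uncomment once a PushGateway is reachable
```

Grafana can then chart the metric by querying Prometheus, which scrapes the PushGateway like any other target.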
How to create a multi tenancy for an interactive data analysis with jupyter h... (Tiago Simões)
This document provides instructions for setting up an interactive data analysis framework using a Cloudera Spark cluster with Kerberos authentication, a JupyterHub machine, and LDAP authentication. The key steps are:
1. Install Anaconda, Jupyter, and dependencies on the JupyterHub machine.
2. Configure JupyterHub to use LDAP for authentication via plugins like ldapcreateusers and sudospawner.
3. Set up a PySpark kernel that uses Kerberos authentication to allow users to run Spark jobs on the cluster via proxy impersonation.
4. Optional: Configure JupyterLab as the default interface and enable R, Hive, and Impala kernels.
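The PySpark kernel in step 3 is typically wired up through a kernel.json spec. The sketch below builds one; the Anaconda path, parcel path, and the idea of a per-user credential cache are assumptions for illustration, not values taken from the slides:

```python
import json

# Hypothetical paths and settings; adjust for your environment.
kernel_spec = {
    "display_name": "PySpark (Kerberos)",
    "language": "python",
    "argv": [
        "/opt/anaconda3/bin/python", "-m", "ipykernel_launcher",
        "-f", "{connection_file}",
    ],
    "env": {
        "SPARK_HOME": "/opt/cloudera/parcels/CDH/lib/spark",
        "PYSPARK_SUBMIT_ARGS": "--master yarn --deploy-mode client pyspark-shell",
        # With proxy impersonation, the hub authenticates once and
        # submits jobs on behalf of the logged-in LDAP user.
        "KRB5CCNAME": "/tmp/krb5cc_{username}",
    },
}

print(json.dumps(kernel_spec, indent=2))
```

Dropping this JSON into a kernels directory (e.g. under Jupyter's kernel search path) makes the kernel selectable from the notebook UI.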
This document provides instructions for installing a single-node Hadoop cluster on Ubuntu. It outlines downloading and configuring Java, installing Hadoop, configuring SSH access to localhost, editing Hadoop configuration files, and formatting the HDFS filesystem via the namenode. Key steps include adding a dedicated Hadoop user, generating SSH keys, setting properties in core-site.xml, hdfs-site.xml and mapred-site.xml, and running 'hadoop namenode -format' to initialize the filesystem.
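The "setting properties in core-site.xml" step can be sketched as follows; the fs.default.name value shown is the common single-node default, not a value quoted from the document:

```python
import xml.etree.ElementTree as ET

def hadoop_conf(props):
    """Render a Hadoop <configuration> XML file from a dict of properties."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")

# Typical single-node value: the NameNode listens on localhost.
core_site = hadoop_conf({"fs.default.name": "hdfs://localhost:9000"})
print(core_site)
```

The same helper works for hdfs-site.xml (e.g. dfs.replication) and mapred-site.xml; after writing the files, 'hadoop namenode -format' initializes the filesystem as the document states.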
How to create a secured cloudera cluster (Tiago Simões)
This presentation is for everyone who is curious about Big Data and wants the know-how to start learning...
With it, you will be able to quickly create a Kerberos-secured Cloudera cluster.
To know more, Register for Online Hadoop Training at WizIQ.
Click here : http://www.wiziq.com/course/21308-hadoop-big-data-training
A complete guide to Hadoop installation that will help you whenever you face problems while installing Hadoop!
How to configure a hive high availability connection with zeppelin (Tiago Simões)
With this presentation, you should be able to configure not just a Hive interpreter on Zeppelin, but one with a high-availability, load-balancing and concurrency architecture.
A JDBC connection with Kerberos authentication will be created that communicates with the ZooKeeper quorum on the cluster.
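A ZooKeeper-backed HiveServer2 setup like this typically relies on a JDBC URL using service discovery; the sketch below assembles one. Hostnames and the Kerberos principal are placeholders:

```python
def hive_ha_jdbc_url(zk_hosts, namespace="hiveserver2", principal=None):
    """Build a HiveServer2 JDBC URL that discovers servers via ZooKeeper,
    giving high availability and load balancing across HiveServer2 instances."""
    quorum = ",".join(zk_hosts)
    url = (f"jdbc:hive2://{quorum}/;serviceDiscoveryMode=zooKeeper;"
           f"zooKeeperNamespace={namespace}")
    if principal:  # Kerberos-secured clusters also pass the service principal
        url += f";principal={principal}"
    return url

url = hive_ha_jdbc_url(
    ["zk1.example.com:2181", "zk2.example.com:2181", "zk3.example.com:2181"],
    principal="hive/_HOST@EXAMPLE.COM",
)
print(url)
```

This URL is what the Zeppelin JDBC interpreter would be pointed at: instead of a fixed HiveServer2 host, the driver asks ZooKeeper which instances are alive.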
Raphaël Pinson's talk on "Configuration surgery with Augeas" at PuppetCamp Geneva '12. Video at http://youtu.be/H0MJaIv4bgk
Learn more: www.puppetlabs.com
The document discusses Hadoop and HDFS. It provides an overview of HDFS architecture and how it is designed to be highly fault tolerant and provide high throughput access to large datasets. It also discusses setting up single node and multi-node Hadoop clusters on Ubuntu Linux, including configuration, formatting, starting and stopping the clusters, and running MapReduce jobs.
1. The document describes how to set up a Hadoop cluster on Amazon EC2, including creating a VPC, launching EC2 instances for a master node and slave nodes, and configuring the instances to install and run Hadoop services.
2. Key steps include creating a VPC, security group and EC2 instances for the master and slaves, installing Java and Hadoop on the master, cloning the master image for the slaves, and configuring files to set the master and slave nodes and start Hadoop services.
3. The setup is tested by verifying Hadoop processes are running on all nodes and accessing the HDFS WebUI.
The document discusses configuring and running a Hadoop cluster on Amazon EC2 instances using the Cloudera distribution. It provides steps for launching EC2 instances, editing configuration files, starting Hadoop services, and verifying the HDFS and MapReduce functionality. It also demonstrates how to start and stop an HBase cluster on the same EC2 nodes.
This document provides instructions for setting up Hadoop in single node mode on Ubuntu. It describes adding a Hadoop user, installing Java and SSH, downloading and extracting Hadoop, configuring environment variables and Hadoop configuration files, and formatting the NameNode.
Install and Configure Ubuntu for Hadoop Installation for beginners (Shilpa Hemaraj)
Covers each and every step to configure Ubuntu, using VMware Workstation 10.
Note: I am a beginner, so I may have used some technical terms incorrectly, but the setup works perfectly fine.
This document summarizes an OSCON 2010 presentation by Joshua Timberman and Aaron Peterson of Opscode about Chef, an open-source automation platform for configuring and managing servers. The presentation covers Chef 101, getting started with Chef, and cooking with Chef. It discusses key concepts like Chef clients, the Chef server, nodes, roles, recipes, resources, attributes, and data bags. The goal is to provide an introduction to Chef and how it can be used to automate infrastructure.
This document provides an introduction to using Ansible in a top-down approach. It discusses using Ansible to provision infrastructure including load balancers, application servers, and databases. It covers using ad-hoc commands and playbooks to configure systems. Playbooks can target groups of hosts, apply roles to automate common tasks, and allow variables to customize configurations. Selective execution allows running only certain parts of a playbook. Overall the document demonstrates how Ansible can be used to deploy and manage infrastructure and applications in a centralized, automated way.
This document proposes using RPM packages to deploy Java applications to Red Hat Linux systems in a more automated and standardized way. Currently, deployment is a manual multi-step process that is slow, error-prone, and requires detailed application knowledge. The proposal suggests using Maven and Jenkins to build Java applications into RPM packages. These packages can then be installed, upgraded, and rolled back easily using common Linux tools like YUM. This approach simplifies deployment, improves speed, enables easy auditing of versions, and allows for faster rollbacks compared to the current process.
Ansible is an IT automation tool that can provision and configure servers. It works by defining playbooks that contain tasks to be run on target servers. Playbooks use YAML format and modules to automate configuration changes. Vagrant and Ansible can be integrated so that Ansible playbooks are run as part of the Vagrant provisioning process to automate server setup. The document provides an introduction and examples of using Ansible playbooks with Vagrant virtual machines to install and configure the Apache HTTP server.
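The Apache playbook described above can be sketched as data; since JSON is a subset of YAML, the stdlib dump below is itself a loadable playbook. The apt and service modules are real Ansible modules, while the host group name is an assumption:

```python
import json

# Minimal playbook installing and starting Apache on an Ubuntu-style host.
# "webservers" is a hypothetical inventory group.
playbook = [{
    "hosts": "webservers",
    "become": True,  # tasks need root, as Vagrant provisioning usually does
    "tasks": [
        {"name": "Install Apache",
         "apt": {"name": "apache2", "state": "present", "update_cache": True}},
        {"name": "Start and enable Apache",
         "service": {"name": "apache2", "state": "started", "enabled": True}},
    ],
}]

# JSON output is valid YAML, so this can be fed straight to ansible-playbook.
print(json.dumps(playbook, indent=2))
```

In a Vagrantfile, pointing the ansible provisioner at a file with this content is what makes the playbook run automatically on `vagrant up`.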
Vagrant, Ansible, and OpenStack on your laptop (Lorin Hochstein)
The document discusses using Ansible and Vagrant together to easily test and deploy OpenStack. Ansible allows writing idempotent infrastructure scripts, while Vagrant allows testing them by booting reproducible virtual machines. The document provides an example of using Ansible plays to install NTP and using Vagrant to define VMs for an OpenStack controller and compute node.
The document provides configuration details for setting up a Capistrano deployment with multistage environments and recipes for common tasks like installing gems, configuring databases, and integrating with Thinking Sphinx. It includes base configuration definitions, recipes for setting up Thinking Sphinx indexes and configuration files, and instructions for packaging the Capistrano configurations as a gem.
More info at http://blog.carlossanchez.eu/tag/devops
Video en español: http://youtu.be/E_OE4l3t5BA
The DevOps movement aims to improve communication between developers and operations teams to solve critical issues such as fear of change and risky deployments. But just as Agile development would likely fail without continuous integration tools, the DevOps principles need tools to make them real and to provide the automation required to actually implement them. Most of the so-called DevOps tools focus on the operations side, but there should be more than that: the automation must cover the full process, Dev to QA to Ops, and be as automated and agile as possible. Tools in each part of the workflow have evolved in their own silos, with the support of their own target teams. But a true DevOps mentality requires a seamless process from the start of development to production deployment and maintenance, and for a process to be successful there must be tools that take the burden off humans.
Apache Maven has arguably been the most successful tool for development, project standardization and automation introduced in the last years. On the operations side we have open source tools like Puppet or Chef that are becoming increasingly popular to automate infrastructure maintenance and server provisioning.
In this presentation we will introduce an end-to-end development-to-production process that will take advantage of Maven and Puppet, each of them at their strong points, and open source tools to automate the handover between them, automating continuous build and deployment, continuous delivery, from source code to any number of application servers managed with Puppet, running either in physical hardware or the cloud, handling new continuous integration builds and releases automatically through several stages and environments such as development, QA, and production.
How to Develop Puppet Modules: From Source to the Forge With Zero Clicks (Carlos Sanchez)
Puppet modules are a great way to reuse code, share your development with other people and take advantage of the hundreds of modules already available in the community. But how can you create, test and publish them as easily as possible? Now that infrastructure is defined as code, we need development best practices to build, test, deploy and use Puppet modules themselves. Three steps for a fully automated process:
* Continuous Integration of Puppet Modules
* Automatic release and upload to the Puppet Forge
* Deploy to Puppet master
Hadoop 2.2.0 Multi-node cluster Installation on Ubuntu (康志強 大人)
This document provides instructions for installing Hadoop 2.2.0 on a 3 node cluster of Ubuntu virtual machines. It describes setting up hostnames and SSH access between nodes, installing Java and Hadoop, and configuring Hadoop for a multi-node setup with one node as the name node and secondary name node, and the other two nodes as data nodes and node managers. Finally it explains starting up the HDFS and YARN services and verifying the cluster setup.
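Part of that multi-node wiring is telling the master which hosts run the DataNodes and NodeManagers; in Hadoop 2.x that is the etc/hadoop/slaves file, one worker hostname per line. A minimal sketch with assumed hostnames:

```python
def render_slaves(datanodes):
    """Hadoop 2.x etc/hadoop/slaves file: one worker hostname per line.
    start-dfs.sh / start-yarn.sh read it to SSH into each worker."""
    return "\n".join(datanodes) + "\n"

# Hypothetical hostnames matching a 1-master / 2-worker layout.
slaves = render_slaves(["hadoop-data1", "hadoop-data2"])
print(slaves, end="")
```

Each hostname must also resolve from every node (via /etc/hosts or DNS), which is why the document sets up hostnames and SSH access before anything else.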
Our company previously used Zabbix for monitoring.
As we moved into container monitoring, a change was needed, which naturally led us to consider a monitoring approach based on Prometheus.
이영주 gave a tech session on this topic, and these are the slides from it.
It is organized into five parts, including the setup procedure.
01. Prometheus?
02. Usage
03. Alertmanager
04. Cluster
05. Performance
Preparation study for Docker Event
Mulodo Open Study Group (MOSG) @Ho chi minh, Vietnam
http://www.meetup.com/Open-Study-Group-Saigon/events/229781420/
This document discusses Docker and provides an introduction and overview. It introduces Docker concepts like Dockerfiles, commands, linking containers, volumes, port mapping and registries. It also discusses tools that can be used with Docker like Fig, Baseimage, Boot2Docker and Flynn. The document provides examples of Dockerfiles, commands and how to build, run, link and manage containers.
The document discusses installing and configuring various Linux applications including Apache, PHP, MySQL, and Postgres. It covers basic Ubuntu installation, system configuration, installing packages, configuring Apache, PHP, and MySQL. Specific instructions are provided for installing Apache, configuring virtual hosts and SSL, installing PHP, and installing and configuring MySQL and phpMyAdmin.
Puppet is a configuration management tool that allows systems to be provisioned in a consistent, automated way. It uses manifests and resources to describe a system's configuration. Resources include packages, services, files and users. Modules contain reusable sets of resources. Templates allow variables to be used when generating configuration files. Puppet can be used with Vagrant for development and provisioning, and in production via a Puppet master to distribute configuration to clients.
This document provides an overview of Puppet concepts including modules, classes, resources, nodes, catalogs, and roles. It explains that Puppet is configuration management software that uses declarative language and resources to define and enforce the desired state of systems. Puppet Masters compile catalogs that Puppet Agents use to configure and maintain nodes according to assigned classes and dependencies between resources. Modules help organize and reuse configuration code.
Virtualization and automation of library software/machines + PuppetOmar Reygaert
The document discusses virtualization, automation, and Puppet. It begins with an introduction to virtualization and hands-on labs. It then covers automation through kickstart files and preseeding to automate operating system installation. Hands-on labs are also provided for automation. Finally, it discusses Puppet for configuration management, including node definitions, modules, and resources to manipulate files, packages, users and more. Hands-on labs are presented for implementing SFX configuration with Puppet.
The document provides instructions for setting up Kubernetes on two VMs (master and worker nodes) using VirtualBox. It describes the minimum requirements for the VMs and outlines the steps to configure networking and install Kubernetes, container runtime (containerd), and CNI (Flannel). The steps covered include setting up NAT and host-only networking in VirtualBox, configuring the hosts file, installing Kubernetes packages (kubeadm, kubelet, kubectl), initializing the master node with kubeadm, joining the worker node to the cluster, and deploying a sample pod.
Puppet is a configuration automation platform that simplifies system administration tasks. It uses a client/server model where agent nodes pull configuration profiles from the Puppet master. Modules on the master describe the desired system configuration. Puppet translates modules into code and configures agent servers as needed. Puppet can manage infrastructure across multiple servers.
The document discusses how immutable infrastructure can be achieved through Puppet by treating systems configuration as code. Puppet allows defining systems in code and enforcing that state through automatic idempotent runs, compensating for inherent system mutability. This brings predictability to infrastructure and allows higher level operations by establishing a foundation of reliable, known states.
Build Your Own CaaS (Container as a Service) (HungWei Chiu)
In these slides, I introduce Kubernetes and show an example of what CaaS is and what it provides.
I also cover how to set up continuous integration and continuous deployment for the CaaS platform.
The document provides instructions for installing and configuring OpenERP on Ubuntu, including downloading and installing Ubuntu, installing required packages like PostgreSQL, configuring the PostgreSQL database, downloading and installing the OpenERP server and client, configuring the OpenERP files, and starting the OpenERP server and client.
Krux operates a large infrastructure serving thousands of user requests per second. They use Puppet and tools like Cloudkick, Foreman, Boto, and Vagrant to manage their infrastructure in an automated and scalable way. Their Puppet configuration is split into modules, environments, and datacenters. They launch AWS nodes programmatically and configure them with Puppet. Cloudkick is used for monitoring and parallel SSH. Boto allows full Python API access to AWS. Vagrant allows consistently provisioning development machines locally. Automation and external configuration enable their small operations team to manage a large, dynamic infrastructure.
Quick-Start Guide: Deploying Your Cloudian HyperStore Hybrid Storage Service (Cloudian)
This document will help a new user deploy a 3-node Cloudian storage cluster in their data center for use with the Cloudian HyperStore Hybrid Cloud Service from AWS Marketplace.
Sally and Leo use infrastructure as code practices like Cucumber, ServerSpec, Vagrant, and Ansible to automate the provisioning and configuration of a web server. They write behavior tests in Cucumber and infrastructure tests in ServerSpec. Vagrant is used to provision a virtual machine, and Ansible configures the server. By tying the tests to the provisioning code, they can now repeatedly build servers that are known to meet requirements.
Beyond Golden Containers: Complementing Docker with Puppet (lutter)
Often, Docker or more generally containers and immutable infrastructure are viewed as a replacement for configuration management. This talk explains why that is not the case, and that they are in fact complementary.
Containers move the challenges that configuration management solves to different places in the application lifecycle. The talk explains where Puppet fits into this changed lifecycle, and what tools Puppet provides there.
Slides for a talk I gave at the Linux Foundation Collaboration Summit 2015
The document provides instructions for setting up a Kubernetes cluster with one master node and one worker node on VirtualBox. It outlines the system requirements for the nodes, describes how to configure the networking and hostnames, install Docker and Kubernetes, initialize the master node with kubeadm init, join the worker node with kubeadm join, and deploy a test pod. It also includes commands to check the cluster status and remove existing Docker installations.
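The kubeadm init/join sequence described above can be sketched as assembled commands; the pod CIDR below is Flannel's default, while the master address, token and CA hash are placeholders you obtain from your own environment and the init output:

```python
# Hypothetical lab values: adjust the CIDR and master address as needed.
POD_CIDR = "10.244.0.0/16"   # Flannel's default pod network
MASTER = "192.168.56.10"     # host-only network address of the master VM

def kubeadm_init_cmd(pod_cidr, master_ip):
    """kubeadm init as run on the master node."""
    return (f"kubeadm init --pod-network-cidr={pod_cidr} "
            f"--apiserver-advertise-address={master_ip}")

def kubeadm_join_cmd(master_ip, token, ca_hash):
    """kubeadm join as run on each worker; token and hash are printed
    by kubeadm init (or by 'kubeadm token create --print-join-command')."""
    return (f"kubeadm join {master_ip}:6443 --token {token} "
            f"--discovery-token-ca-cert-hash {ca_hash}")

print(kubeadm_init_cmd(POD_CIDR, MASTER))
```

Matching the --pod-network-cidr to the CNI plugin's expected range is the step that most often trips up a first cluster; Flannel's default manifest assumes 10.244.0.0/16.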
Puppi is a Puppet module that exposes Puppet's knowledge of your systems through a command-line tool, which you can use to check service availability, gather information on the system and deploy applications with a single command.
1. The document provides instructions for installing ODOO v8.0 on an Ubuntu 14.04 LTS system, including creating a system user, installing PostgreSQL and dependencies, cloning the ODOO code from GitHub, configuring the database and ODOO settings, and setting up a boot script to start ODOO on startup.
2. Steps include creating a PostgreSQL user, editing the PostgreSQL configuration files to allow remote connections, installing dependencies like Python modules, cloning the ODOO code, editing the ODOO configuration file, and creating an init script to start ODOO as a service.
3. The instructions conclude by noting that automatic startup and shutdown can be enabled, and that an installation
AMS Node Meetup December presentation: Phusion Passenger (icemobile)
Phusion Passenger is an app server for Node.js, Ruby and Python. It simplifies deployment and administration, increases your server's efficiency and helps identify and solve problems.
In this talk Hongli Lai demonstrates how Passenger simplifies things by integrating with Nginx and by replacing Forever, PM2, Cluster and all sorts of other tools. Hongli also shares what other benefits Passenger has to offer, and what you can expect from future developments.
Cloud init and cloud provisioning [openstack summit vancouver] (Joshua Harlow)
Evil Superuser's HOWTO: Launching instances to do your bidding.
You click 'run' on the OpenStack dashboard, or launch a new instance via the api. Some provisioning magic happens and soon you've got a server created especially for you. Did you ever wonder what magic happens to a standard image on boot? Have you wanted to launch instances and have them into your infrastructure with no manual interaction? Cloud-init is software that runs in most linux instances. It can take your input and do your bidding. Learn what things cloud-init magically does for you and how you can make it do more. Also, take advantage of the after-talk to pester cloud-init developers on what is missing or throw rotten fruits in their direction.
Similar to How to create a secured multi tenancy for clustered ML with JupyterHub (20)
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty, is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
GraphRAG for Life Science to increase LLM accuracyTomaz Bratanic
GraphRAG for life science domain, where you retriever information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Taking AI to the Next Level in Manufacturing.pdfssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
6. Ideas and approaches to help build your organization's AI strategy.
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
Main news related to the CCS TSI 2023 (2023/1695)Jakub Marek
An English 🇬🇧 translation of a presentation to the speech I gave about the main changes brought by CCS TSI 2023 at the biggest Czech conference on Communications and signalling systems on Railways, which was held in Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). Attended by around 500 participants and 200 on-line followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Tatiana Kojar
Skybuffer AI, built on the robust SAP Business Technology Platform (SAP BTP), is the latest and most advanced version of our AI development, reaffirming our commitment to delivering top-tier AI solutions. Skybuffer AI harnesses all the innovative capabilities of the SAP BTP in the AI domain, from Conversational AI to cutting-edge Generative AI and Retrieval-Augmented Generation (RAG). It also helps SAP customers safeguard their investments into SAP Conversational AI and ensure a seamless, one-click transition to SAP Business AI.
With Skybuffer AI, various AI models can be integrated into a single communication channel such as Microsoft Teams. This integration empowers business users with insights drawn from SAP backend systems, enterprise documents, and the expansive knowledge of Generative AI. And the best part of it is that it is all managed through our intuitive no-code Action Server interface, requiring no extensive coding knowledge and making the advanced AI accessible to more users.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Introduction of Cybersecurity with OSS at Code Europe 2024Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Azure API Management to expose backend services securely
How to create a secured multi tenancy for clustered ML with JupyterHub
1. How-to create a secured multi
tenancy for Clustered ML with
JupyterHub
Non-root + JupyterHub + Kerberos +
IPython Cluster as a service
2. Introduction
With this presentation you should be able to create a Kerberos-secured architecture for an
interactive data analysis and machine learning framework, using Jupyter/JupyterHub
powered by IPython Clusters, which enables machine-learning processing on clustered
local and/or remote nodes.
3. Architecture
This architecture enables the following:
● Transparent data-science development
● User authentication
● Authentication via Kerberos + SSH
● Upgrades to the cluster won’t affect existing development work
● Controlled access to data and resources via Kerberos tickets
● Several coding APIs (Scala, R, Python, PySpark, etc.)
● Parallel processing
● JupyterHub as a service, run by a non-root user
5. Pre-Assumptions
1. Jupyter machine hostname: cm1.localdomain
2. Controller node hostname: cm1.localdomain | Engine node hostname: cm2.localdomain
3. Conda Python version: 3.8.5
4. Jupyter machine authentication pre-installed: Kerberos
a. Kerberos realm: DOMAIN.COM
5. JupyterHub machine authentication not installed: Kerberos
6. A user with root or sudo permissions
7. MIT Kerberos installed on your Windows machine
6. Miniconda
Add Anaconda User/Dir
adduser anaconda;
passwd anaconda;
mkdir /opt/anaconda;
Download and installation
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -P /tmp;
chmod +x /tmp/Miniconda3-latest-Linux-x86_64.sh;
/tmp/Miniconda3-latest-Linux-x86_64.sh -b -u -p /opt/anaconda;
Note 1: Replace the highlighted values with your own.
Note 2: JupyterHub requires Python 3.x, so Anaconda 3 (Miniconda3) will be installed.
Add Permissions MiniConda
chown -R anaconda:anaconda /opt/anaconda;
chmod -R go-w /opt/anaconda && chmod -R go+rX /opt/anaconda;
mkdir -p /apps/anaconda/pkgs;
chown -R anaconda:anaconda /apps/anaconda/pkgs && chmod -R oug+rwx /apps;
7. Anaconda
Set Conda Bash Configurations
nano .bashrc;
export CONDA_PKGS_DIRS="/apps/anaconda/pkgs","/opt/anaconda/pkgs","/home/$USER/.conda/pkgs"
export CONDA_ENVS_DIRS="/apps/anaconda/$USER/envs"
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/opt/anaconda/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
eval "$__conda_setup"
else
if [ -f "/opt/anaconda/etc/profile.d/conda.sh" ]; then
. "/opt/anaconda/etc/profile.d/conda.sh"
else
export PATH="/opt/anaconda/bin:$PATH"
fi
fi
unset __conda_setup
# <<< conda initialize <<<
conda config --set auto_update_conda False && conda config --add channels conda-forge;
conda config --set pip_interop_enabled True;
Note: Replace the highlighted values with your own.
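Because `CONDA_ENVS_DIRS` contains `$USER`, every user who sources this `.bashrc` gets a private environments directory under the shared `/apps/anaconda` tree. A quick sketch of how the variable expands per user (plain Python; the username `tpsimoes` is just an example):

```python
import os

# Each login shell expands $USER, so conda environments are isolated
# per user under the shared /apps/anaconda tree.
os.environ["USER"] = "tpsimoes"  # hypothetical example user
envs_dir = os.path.expandvars("/apps/anaconda/$USER/envs")
print(envs_dir)  # /apps/anaconda/tpsimoes/envs
```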
8. Jupyter or JupyterHub?
JupyterHub is a multi-user notebook server that:
● Manages authentication.
● Spawns single-user notebook servers on-demand.
● Gives each user a complete notebook server.
How to choose?
9. JupyterHub
JupyterHub needs to run with root privileges, or at least some of them (for example, to access the
PAM passwords). Therefore we will configure a special user (with no password) to be used by the
sudospawner!
For this example we will use user: jupyter | group: jupyterhub to run the JupyterHub server as a service. Any new
user that should access Jupyter and spawn notebooks must be added to the jupyterhub group.
Create User/Group to operate as Service
sudo useradd jupyter && sudo groupadd jupyterhub && sudo usermod jupyter -G jupyterhub;
Add jupyter to root group & Give Read Permissions (PAM)
sudo usermod -a -G root jupyter; sudo chmod g+r /etc/shadow;
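`chmod g+r /etc/shadow` grants only the group-read bit, which is what lets PAM checks run by the jupyter user (now in the root group) verify passwords. A sketch of what that single bit change means, using a temp file as a stand-in for `/etc/shadow` (the real command needs root):

```python
import os
import stat
import tempfile

# Temp file stands in for /etc/shadow (normally not group-readable).
fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, 0o600)  # owner read/write only
before = bool(os.stat(path).st_mode & stat.S_IRGRP)

# Equivalent of `chmod g+r`: add only the group-read bit, nothing else.
os.chmod(path, os.stat(path).st_mode | stat.S_IRGRP)
after = bool(os.stat(path).st_mode & stat.S_IRGRP)
os.remove(path)
print(before, after)  # False True
```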
Log as Jupyter user
su - jupyter;
Note 1: only the highlighted values need to be changed.
10. JupyterHub
Set Conda Bash Configurations
Use the configuration from page 7.
Create Environment for JupyterHub
conda create -n jupyterhub_env;
Activate Environment for JupyterHub
conda activate jupyterhub_env;
Install JupyterHub Packages
conda install jupyterhub jupyterlab notebook configurable-http-proxy;
Install sudospawner Package
conda install -c conda-forge sudospawner;
Check sudospawner location
which sudospawner;
Note 1: only the highlighted values need to be changed.
Create JupyterHub Directories
sudo mkdir /etc/jupyterhub;
sudo chown jupyter:jupyterhub /etc/jupyterhub;
Generate JupyterHub Config file
cd /etc/jupyterhub && jupyterhub --generate-config;
11. JupyterHub
Create/Edit sudoers config
sudo nano /etc/sudoers.d/jupytersudoers;
Runas_Alias JUPYTER_USERS = jupyter
Cmnd_Alias JUPYTER_CMD = /apps/anaconda/jupyter/envs/jupyterhub_env/bin/sudospawner
%jupyterhub ALL=(jupyter) /usr/bin/sudo
jupyter ALL=(%jupyterhub) NOPASSWD:JUPYTER_CMD
Start JupyterHub Server With Config File
jupyterhub -f /etc/jupyterhub/jupyterhub_config.py;
Note: only the highlighted values need to be changed, e.g. your IP.
Create/Edit JupyterHub config
nano /etc/jupyterhub/jupyterhub_config.py;
import os
import pwd
import subprocess
def create_dir_hook(spawner):
if not os.path.exists(os.path.join('/home/', spawner.user.name)):
subprocess.call(["sudo", "/sbin/mkhomedir_helper",
spawner.user.name])
c.Spawner.pre_spawn_hook = create_dir_hook
c.JupyterHub.bind_url = 'http://10.111.22.333:8000'
c.JupyterHub.hub_bind_url = 'http://10.111.22.333:8081'
c.JupyterHub.hub_ip = '10.111.22.333'
c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'
c.SudoSpawner.sudospawner_path = '/apps/anaconda/jupyter/envs/jupyterhub_env/bin/sudospawner'
c.Authenticator.admin_users = {'jupyter'}
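The `create_dir_hook` above only shells out to `mkhomedir_helper` when the user's home directory is missing. A self-contained sketch of that conditional logic, with a stub spawner object and `os.makedirs` standing in for the sudo call so it runs anywhere:

```python
import os
import tempfile
from types import SimpleNamespace

HOME_ROOT = tempfile.mkdtemp()  # stand-in for /home

def create_dir_hook(spawner):
    """Create the user's home dir on first spawn (sketch of the hook above)."""
    home = os.path.join(HOME_ROOT, spawner.user.name)
    if not os.path.exists(home):
        # The real config shells out to: sudo /sbin/mkhomedir_helper <user>
        os.makedirs(home)
    return home

# Stub mimicking the spawner JupyterHub passes to the hook.
spawner = SimpleNamespace(user=SimpleNamespace(name="tpsimoes"))
home = create_dir_hook(spawner)
print(os.path.isdir(home))  # True
```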
12. JupyterHub
Create systemd JupyterHub Directory
sudo mkdir -p /home/jupyter/.config/systemd;
Create systemd JupyterHub service Configuration
sudo nano /home/jupyter/.config/systemd/jupyterhub.service;
[Unit]
Description=Jupyterhub Server
After=syslog.target network-online.target
[Service]
Type=simple
User=jupyter
ExecStart=/etc/jupyterhub/runJupyterhub.sh
WorkingDirectory=/etc/jupyterhub
Restart=on-failure
RestartSec=1min
TimeoutSec=5min
[Install]
WantedBy=multi-user.target
Note: only the highlighted values need to be changed.
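systemd units use an INI-like format, so the key points of the unit above are easy to check: the service runs as the unprivileged jupyter user and restarts on failure. A quick sketch using `configparser` as a stand-in for systemd's parser:

```python
import configparser

# The relevant parts of the unit above, as a string.
unit = """\
[Unit]
Description=Jupyterhub Server
After=syslog.target network-online.target

[Service]
Type=simple
User=jupyter
ExecStart=/etc/jupyterhub/runJupyterhub.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
"""
cp = configparser.ConfigParser()
cp.read_string(unit)
user = cp["Service"]["User"]        # the non-root service account
restart = cp["Service"]["Restart"]  # auto-restart policy
print(user, restart)  # jupyter on-failure
```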
Create JupyterHub Script for Systemd
nano /etc/jupyterhub/runJupyterhub.sh;
#!/bin/bash
export CONDA_PKGS_DIRS="/apps/anaconda/pkgs","/opt/anaconda/pkgs","/home/$USER/.conda/pkgs"
export CONDA_ENVS_DIRS="/apps/anaconda/$USER/envs"
__conda_setup="$('/opt/anaconda/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
eval "$__conda_setup"
else
if [ -f "/opt/anaconda/etc/profile.d/conda.sh" ]; then
. "/opt/anaconda/etc/profile.d/conda.sh"
else
export PATH="/opt/anaconda/bin:$PATH"
fi
fi
unset __conda_setup
conda activate /apps/anaconda/jupyter/envs/jupyterhub_env
/apps/anaconda/jupyter/envs/jupyterhub_env/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py 2>&1 | tee /var/log/jupyter/jupyterhub.log
Note: make the script executable (chmod +x /etc/jupyterhub/runJupyterhub.sh;) and create the log directory first (sudo mkdir -p /var/log/jupyter && sudo chown jupyter:jupyterhub /var/log/jupyter;).
13. JupyterHub
Create systemd JupyterHub service symbolic link
sudo ln -s /home/jupyter/.config/systemd/jupyterhub.service /etc/systemd/system/jupyterhub.service;
Enable/Start systemd JupyterHub service
sudo systemctl enable jupyterhub.service;
sudo systemctl start jupyterhub && systemctl status jupyterhub;
Note: only the highlighted values need to be changed.
14. IPython Clusters
This functionality enables the current architecture to distribute your Python processing across
local and/or remote CPUs, harnessing the power of parallel processing.
Install ipyparallel
conda install ipyparallel;
Note: This package must be installed on the controller machine and on all remote engine nodes!
Apply to All Users
jupyter nbextension install --sys-prefix --py ipyparallel;
jupyter nbextension enable --sys-prefix --py ipyparallel;
jupyter serverextension enable --sys-prefix --py ipyparallel;
15. IPython Clusters
Create ssh profile on user
ipython profile create --parallel --profile=ssh;
Note: this is done in the scope of the user that will run/spawn the notebook, e.g. tpsimoes.
Configure ssh profile on user
nano /home/tpsimoes/.ipython/profile_ssh/ipcluster_config.py;
c.IPClusterStart.controller_launcher_class = 'Local'
c.IPClusterEngines.engine_launcher_class = 'SSH'
c.SSHEngineSetLauncher.engines = { 'cm1.localdomain' : 2, 'cm2.localdomain' : 5 }
nano /home/tpsimoes/.ipython/profile_ssh/ipcontroller_config.py;
c.IPControllerApp.location = 'cm1.localdomain'
c.HubFactory.client_ip = '10.111.22.333'
c.HubFactory.engine_ip = '10.111.22.333'
c.HubFactory.ip = '*'
Note: only the highlighted values need to be changed.
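The `engines` dict maps hostname to engine count, so the profile above launches seven engines in total: two on cm1 (local) and five on cm2 (remote). A tiny sketch of that accounting in plain Python:

```python
# Mirror of c.SSHEngineSetLauncher.engines from the profile above.
engines = {"cm1.localdomain": 2, "cm2.localdomain": 5}

total = sum(engines.values())  # engines the cluster will start
remote = sum(n for host, n in engines.items() if host != "cm1.localdomain")
print(total, remote)  # 7 5
```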
16. IPython Clusters
So that the IPython Cluster controller (SSH profile) can communicate with all the engines (local and remote), we need to
configure SSH on the local machine and on the remote nodes.
KeyLess Configuration
ssh-keygen;
Copy the SSH public key (id_rsa.pub) to the user account on your target hosts.
ssh-copy-id -i ~/.ssh/id_rsa.pub -p 22 tpsimoes@cm2.localdomain;
Add the SSH public key to the local authorized_keys file as well (the controller also connects to the local engines over SSH).
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys && chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys;
Accept the host keys by connecting once to each host
ssh tpsimoes@localhost;
ssh tpsimoes@cm1.localdomain;
ssh tpsimoes@cm2.localdomain;
Test the passwordless connection via SSH
ssh -p '22' 'tpsimoes@cm2.localdomain';
Note: only the highlighted values need to be changed.
17. IPython Clusters
When starting a cluster via the JupyterHub UI, you should see in the logs the communication between the machines…
JupyterHub Logs
[I 2021-02-22 14:28:43.979 SingleUserNotebookApp launcher:591] ensuring remote cm1.localdomain:.ipython/profile_ssh/security/ exists
Connection to cm1.localdomain closed.
[I 2021-02-22 14:28:44.776 SingleUserNotebookApp launcher:595] sending /home/tpsimoes/.ipython/profile_ssh/security/ipcontroller-client.json to
cm1.localdomain:.ipython/profile_ssh/security/ipcontroller-client.json
[I 2021-02-22 14:28:45.573 SingleUserNotebookApp launcher:591] ensuring remote cm1.localdomain:.ipython/profile_ssh/security/ exists
Connection to cm1.localdomain closed.
[I 2021-02-22 14:28:46.405 SingleUserNotebookApp launcher:595] sending /home/tpsimoes/.ipython/profile_ssh/security/ipcontroller-engine.json to
cm1.localdomain:.ipython/profile_ssh/security/ipcontroller-engine.json
[I 2021-02-22 14:28:47.308 SingleUserNotebookApp launcher:591] ensuring remote cm2.localdomain:.ipython/profile_ssh/security/ exists
Connection to cm2.localdomain closed.
[I 2021-02-22 14:28:48.087 SingleUserNotebookApp launcher:595] sending /home/tpsimoes/.ipython/profile_ssh/security/ipcontroller-client.json to
cm2.localdomain:.ipython/profile_ssh/security/ipcontroller-client.json
[I 2021-02-22 14:28:48.875 SingleUserNotebookApp launcher:591] ensuring remote cm2.localdomain:.ipython/profile_ssh/security/ exists
Connection to cm2.localdomain closed.
[I 2021-02-22 14:28:49.652 SingleUserNotebookApp launcher:595] sending /home/tpsimoes/.ipython/profile_ssh/security/ipcontroller-engine.json to
cm2.localdomain:.ipython/profile_ssh/security/ipcontroller-engine.json
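Once the engines are up, a notebook cell typically connects a client to the `ssh` profile and maps work across the engines. The real ipyparallel calls are `ipp.Client(profile="ssh")` and `view.map_sync(...)`; the sketch below mimics that pattern with the standard library's thread pool so it runs without a cluster:

```python
from concurrent.futures import ThreadPoolExecutor

# With ipyparallel, a notebook cell would look roughly like:
#   import ipyparallel as ipp
#   rc = ipp.Client(profile="ssh")   # connects via ipcontroller-client.json
#   view = rc[:]                     # a view over all engines
#   squares = view.map_sync(lambda x: x ** 2, range(8))
# Stand-in: a local pool of 7 workers plays the role of the 7 engines.
with ThreadPoolExecutor(max_workers=7) as pool:
    squares = list(pool.map(lambda x: x ** 2, range(8)))
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49]
```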