High Performance Python on Apache Spark - Wes McKinney
This document contains the slides from a presentation given by Wes McKinney on high performance Python on Apache Spark. The presentation discusses why Python is an important and productive language, defines what is meant by "high performance Python", and explores techniques for building fast Python software such as embracing limitations of the Python interpreter and using native data structures and compiled extensions where needed. Specific examples are provided around control flow, reading CSV files, and the importance of efficient in-memory data structures.
Provides not only an H/A cluster based on shared storage,
but also a shared-nothing H/A cluster based on replication.
Built-in application-aware high-availability features.
Depth monitoring that double-checks the database.
Support for 30 major applications.
Surge 2014: From Clouds to Roots: root cause performance analysis at Netflix. Brendan Gregg.
At Netflix, high scale and fast deployment rule. The possibilities for failure are endless, and the environment excels at handling this, regularly tested and exercised by the simian army. But, when this environment automatically works around systemic issues that aren’t root-caused, they can grow over time. This talk describes the challenge of not just handling failures of scale on the Netflix cloud, but also new approaches and tools for quickly diagnosing their root cause in an ever changing environment.
Redis + Structured Streaming—A Perfect Combination to Scale-Out Your Continuo... - Databricks
This presentation discusses how to use Redis and Spark Structured Streaming to process streaming data at scale. The solution breaks down into three functional blocks - data ingest using Redis Streams, data processing using Spark Structured Streaming, and data querying using Spark SQL. Redis Streams are used to ingest streaming click data, Spark Structured Streaming processes the data in micro-batches, and Spark SQL queries the processed data stored as Redis hashes. This combination provides a scalable solution to continuously collect, process, and query data streams in real-time.
Apache Tez - A New Chapter in Hadoop Data Processing - DataWorks Summit
Apache Tez is a framework for accelerating Hadoop query processing. It is based on expressing a computation as a dataflow graph and executing it in a highly customizable way. Tez is built on top of YARN and provides benefits like better performance, predictability, and utilization of cluster resources compared to traditional MapReduce. It allows applications to focus on business logic rather than Hadoop internals.
Using Apache Calcite for Enabling SQL and JDBC Access to Apache Geode and Oth... - Christian Tzolov
When working with Big Data and IoT systems we often feel the need for a common query language. The system-specific languages usually require longer adoption time and are harder to integrate within existing stacks.
To fill this gap some NoSQL vendors are building SQL access to their systems. Building a SQL engine from scratch is a daunting job, and frameworks like Apache Calcite can help with the heavy lifting. Calcite allows you to integrate a SQL parser, a cost-based optimizer, and JDBC with your NoSQL system.
We will walk through the process of building a SQL access layer for Apache Geode (an in-memory data grid). I will share my experience, pitfalls and technical considerations, such as balancing SQL/RDBMS semantics against the design choices and limitations of the data system.
Hopefully this will enable you to add SQL capabilities to your preferred NoSQL data system.
Cost-based Query Optimization in Apache Phoenix using Apache Calcite - Julian Hyde
This document summarizes a presentation on using Apache Calcite for cost-based query optimization in Apache Phoenix. Key points include:
- Phoenix is adding Calcite's query planning capabilities to improve performance and SQL compliance over its existing query optimizer.
- Calcite models queries as relational algebra expressions and uses rules, statistics, and a cost model to choose the most efficient execution plan.
- Examples show how Calcite rules like filter pushdown and exploiting sortedness can generate better plans than Phoenix's existing optimizer.
- Materialized views and interoperability with other Calcite data sources like Apache Drill are areas for future improvement beyond the initial Phoenix+Calcite integration.
CICS is the power of the mainframe, with all the capabilities needed to handle online transactions. The presentation covers key CICS concepts to refresh your CICS knowledge quickly.
Real-Time Processing of Spatial Data Using Kafka Streams, Ian Feeney & Roman ... - HostedbyConfluent
Real-Time Processing of Spatial Data Using Kafka Streams, Ian Feeney & Roman Kolesnev | Current 2022
Kafka Streams applications can process fast-moving, unbounded streams of data. This gives us the capability to process and react to events from many sources in near real time as they converge in Kafka. However, if the events in these data streams have a spatial component and their spatial relationships with each other determine how they should be processed or reacted to, this raises some fundamental challenges. Determining that, for example, a person is within an area or that routes are intersecting requires access to geospatial operations which are not readily available in Kafka Streams.
In this talk, we will first set the scene with a geospatial 101. Then, using a simplified taxi hailing use case, we will look at two approaches for processing spatial data with Kafka Streams. The first approach is a naive approach which uses Kafka Streams DSL, geohashing and the Java Spatial4j library. The second approach is a prototype which replaces the RocksDB statestore with Apache Lucene (an embedded storage engine with powerful indexing, search and geospatial capabilities), and implements a stateful spatial join with the Transformer API.
This talk will give you an appreciation of geospatial use cases and how Kafka Streams could enable them. You will see the role the state store plays in stateful processing and the implications for geospatial processing. It will also show you what is involved in integrating a custom state store with Kafka Streams. Overall, this talk will give you an understanding of how you might go about building custom processing capabilities on top of Kafka Streams for your own use cases.
This document provides an overview and deep dive into Robinhood's RDS Data Lake architecture for ingesting data from their RDS databases into an S3 data lake. It discusses their prior daily snapshotting approach, and how they implemented a faster change data capture pipeline using Debezium to capture database changes and ingest them incrementally into a Hudi data lake. It also covers lessons learned around change data capture setup and configuration, initial table bootstrapping, data serialization formats, and scaling the ingestion process. Future work areas discussed include orchestrating thousands of pipelines and improving downstream query performance.
This document discusses ongoing work to improve HDFS multi-tenancy support and resource management. It describes how HDFS currently supports resource sharing, isolation, and management. Improvements include fair call queueing for NameNode RPCs, throttling techniques to avoid queue overload, and a proposed resource coupon system for reserving NameNode and DataNode resources. The goal is to provide better quality of service and allow prioritization of important jobs over batch workloads.
The document summarizes Apache Phoenix and its past, present, and future as a SQL interface for HBase. It describes Phoenix's architecture and key features like secondary indexes, joins, aggregations, and transactions. Recent releases added functional indexes, the Phoenix Query Server, and initial transaction support. Future plans include improvements to local indexes, integration with Calcite and Hive, and adding JSON and other SQL features. The document aims to provide an overview of Phoenix's capabilities and roadmap for building a full-featured SQL layer over HBase.
Avro Tutorial - Records with Schema for Kafka and Hadoop - Jean-Paul Azar
Covers how to use Avro to save records to disk, which can later be combined with the Kafka Schema Registry. It provides background on Avro as used with Hadoop and Kafka.
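As an aside not covered in the summary, Avro data files written this way can be inspected from the command line with the avro-tools jar (the version number below is only an example):
# print the schema embedded in an Avro data file, then dump its records as JSON
java -jar avro-tools-1.11.1.jar getschema users.avro
java -jar avro-tools-1.11.1.jar tojson users.avro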
Prestogres is a PostgreSQL protocol gateway for Presto that allows Presto to be queried using standard BI tools through ODBC/JDBC. It works by rewriting queries at the pgpool-II middleware layer and executing the rewritten queries on Presto using PL/Python functions. This allows Presto to integrate with the existing BI tool ecosystem while avoiding the complexity of implementing the full PostgreSQL protocol. Key aspects of the Prestogres implementation include faking PostgreSQL system catalogs, handling multi-statement queries and errors, and security definition. Future work items include better supporting SQL syntax like casts and temporary tables.
Presentation given at Kamailio World 2014, Berlin, Germany, on several methods for asynchronous SIP routing via the Kamailio configuration file. It blends modules such as tm, tmx, mqueue, rtimer, async, and evapi to suspend routing of the current SIP request and resume it once the additional processing has finished.
One of the many challenges of a distributed architecture is preserving the consistency of data across different systems. During this one-hour presentation, we are going to explore a number of strategies for maintaining consistency, going from the most basic options up to an automated recovery mechanism using compensations and reservations - what’s commonly referred to as a “saga” pattern. Our journey will be based on a hypothetical food delivery application on which we will analyze various decisions and their tradeoffs. The discussion will stay at an abstract, architectural level for the most part, with only a few code examples.
In the agenda:
- Idempotency and Retries
- 2 Phase Commit
- Eventual Consistency
- Compensations
- Reservations
- The Saga Pattern
Building Resilient and Scalable Data Pipelines by Decoupling Compute and Storage - Databricks
This document discusses Pure Storage's log analytics data pipeline. It began with over 1,000 VMs and 100 file brokers processing 1.5-2 million events per second and 0.5-1 petabytes of data daily. Through decoupling compute and storage, the pipeline now uses 2,500+ VMs and 350+ file brokers to process more data and 120,000+ tests daily while maintaining a 5 second SLA. Decoupling allows each stage to scale independently and improves flexibility, efficiency, reliability and scalability to handle growth.
The document discusses best practices for streaming applications. It covers common streaming use cases like ingestion, transformations, and counting. It also discusses advanced streaming use cases that involve machine learning. The document provides an overview of streaming architectures and compares different streaming engines like Spark Streaming, Flink, Storm, and Kafka Streams. It discusses when to use different storage systems and message brokers like Kafka for ingestion pipelines. The goal is to understand common streaming use cases and their architectures.
Livin' with Docker - dallo sviluppo alla produzione - giacomos
We present a case study of a web project born and grown with Docker at the centre of the stage. We look at the solutions chosen along the whole journey, starting from docker-compose locally, moving through the continuous integration/build phase and deployment, up to CoreOS and systemd in production.
Talk given at DockerOps, 13-02-2016, Ferrara.
An introduction to Docker and Dockerization:
- Docker philosophy and design
- the main commands for managing images, containers, volumes and networks (a brief sketch follows below)
- Dockerfile and docker-compose
Valerio Radice @ Nextre (May 2017)
TAG: docker, Dockerfile, docker-compose, italian, nextre
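Purely as an illustration (not taken from the talk itself), the image, container, volume and network commands such an introduction covers look like this; the image and resource names are placeholders:
# build an image from the Dockerfile in the current directory
docker build -t myapp:latest .
# create a named volume and a user-defined network for the container
docker volume create myapp-data
docker network create myapp-net
# run the container detached, publishing port 8080 and mounting the volume
docker run -d --name myapp --network myapp-net -p 8080:80 -v myapp-data:/var/lib/myapp myapp:latest
# inspect running containers, then stop and remove
docker ps
docker stop myapp && docker rm myapp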
[Laravel Day 2022] Deploy di Laravel su AWS Lambda (from Zero to Hero).pdf - Francesco Liuzzi
Thanks to Serverless and Bref.sh it is possible to (easily) put a Laravel web app online on AWS. We will see how to build a complex architecture (with web server, object storage, queues, database, cache and CDN) that is highly available and scalable, while keeping costs low and using exclusively fully-managed services.
Francesco Liuzzi
Cloud web hosting is talked about a lot, and not without reason: besides being cheaper than traditional servers, this approach offers the assurance of continuity and the power of multiple servers.
To help you make the most of these advantages, System Engineer Danilo Abbasciano dedicates the guide published today in the Resource Centre to a practical application of the topic, using two open-source tools: installing Joomla with OpenShift. While Joomla hardly needs an introduction, OpenShift is a newer product, released by Red Hat in 2011: an open-source platform for managing cloud applications, also available in the free Express edition.
If you are interested in the flexibility and portability of this solution, start here to build your Joomla site on the Red Hat cloud: Danilo Abbasciano describes every step of the process, from installing the OpenShift client to configuring Joomla correctly.
Installazione Qt/Qt Quick per target Android - Paolo Sereno
This short tutorial is a mini guide to getting started with Qt and Qt Quick programming for Android targets. In particular, it is meant as a "memo" to use during the meetups and workshops on the topic organised by the Qt-Italia.org web community.
Which tools can improve a developer's workflow? Today tools such as git, docker, gitlab and kubernetes help us manage our time better, letting us focus more on the code than on customising the environment.
Distribuire una libreria Java per usarla come dipendenza gradle - Paolo Montalto
Using software dependencies has long been part of every good programmer's daily practice. Its benefits are clear, but not everyone knows how dependencies work or how to make their own library publicly available.
In this talk I try to explain why it is important to use software dependencies, how they work, why it can be useful to publish your own libraries and how to do it, showing a real case based on Gradle.
Vagrant e Docker a confronto: scegliere ed iniziare - Daniele Mondello
Pitch presented at Linux Day 2015 in Palermo comparing Vagrant and Docker, to help you choose between them and get started. Starting from the concept of virtualisation, it moves on to an analysis of the two solutions and ends with notes on installation and first use.
Slides and code references from the Vue JS Milano meetup of 28/02/2019, where I talk about creating components, passing parameters, and computed properties.
Similar to Con Aruba, a lezione di cloud #lezione 30 - parte 1: 'GitLab e Cloud Server Smart - installazione manuale'
Create and use a Dockerized Aruba Cloud server - CloudConf 2017 - Aruba S.p.A.
Docker can be used to provision and manage virtual servers hosted on the Aruba Cloud platform. The docker-machine driver for Aruba Cloud allows users to create, start, stop, and remove Docker-enabled virtual servers using docker-machine commands. Virtual servers on Aruba Cloud include Smart and Pro options and can be created from templates like Ubuntu or CentOS in different sizes.
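Purely as an illustration (the driver name and any driver-specific flags here are assumptions, not taken from the slides), the docker-machine lifecycle for such a server would look roughly like this:
# create a Docker-enabled server through the Aruba Cloud docker-machine driver
docker-machine create --driver arubacloud gitlab-host
docker-machine ls                 # list machines and their state
docker-machine stop gitlab-host   # power the server off
docker-machine start gitlab-host  # power it back on
docker-machine rm gitlab-host     # delete the server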
Aruba eCommerce - Corso online 'Come preparare le promozioni nel tuo eCommerce' - Aruba S.p.A.
Aruba eCommerce - online course "How to prepare promotions in your eCommerce store: promotions and sales in the online shop, from first steps to marketing strategies".
2. #VenerdìDelCloud 2
In the last lesson we saw how to set up our own personalised GitHub-like service,
using GitLab and a Cloud Server Smart.
More experienced users, however, may want to get their hands on the individual
configuration commands of the GitLab cloud server,
perhaps using PostgreSQL instead of MySQL or applying other customisations.
Let's learn how to configure and start a GitLab environment on a cloud server,
with a complete description of the deployment steps.
3. #VenerdìDelCloud 3
To that end, we decided to walk through the installation of GitLab
on one of our Cloud Server Smart instances without any "shortcuts",
using a fully manual setup, suited to Linux purists and to all
expert users who want to learn how to install GitLab the do-it-yourself way.
Let's start this adventure by launching a Cloud Server Smart…
4. #VenerdìDelCloud 4
Log in to the control panel by entering your credentials and confirming with a click on Accedi (Log in).
Click on the Gestisci (Manage) panel and then on Crea Nuovo Cloud Server (Create New Cloud Server).
Choose the Cloud Server Smart option and click Prosegui (Continue).
5. #VenerdìDelCloud 5
Type the instance name (for example GitLabManualSetup) and select the operating system
by clicking Seleziona Template and choosing Ubuntu 12.04 LTS 32 bit from the list that appears. Confirm by clicking Scegli.
Choose the access password and the size (Taglia), opting, for example,
for a Large configuration. Click Crea Cloud Server.
When it is ready for use, click on the Gestisci panel and on the side button Gestisci,
choosing Accedi from the drop-down menu.
Make sure Java can run, click Lancia la connessione, type the cloud server access password and confirm with Connetti.
6. #VenerdìDelCloud 6
Once you reach the terminal window, make sure you are logged in as root and
start the GitLab setup procedure by updating the distribution:
apt-get update -y
apt-get upgrade -y
installing VIM as the default editor:
sudo apt-get install -y vim
sudo update-alternatives --set editor /usr/bin/vim.basic
and installing all the required packages:
sudo apt-get install -y build-essential zlib1g-dev libyaml-dev libssl-dev libgdbm-dev libreadline-dev libncurses5-dev libffi-dev curl openssh-server redis-server checkinstall libxml2-dev libxslt-dev libcurl4-openssl-dev libicu-dev logrotate
Make sure the right version of Git is installed, by installing it and checking its version:
sudo apt-get install -y git-core
git --version
7. #VenerdìDelCloud 7
…then, only if the installed version turns out to be older than 1.7.10,
remove it and build a more recent Git release:
sudo apt-get remove git-core
sudo apt-get install -y libcurl4-openssl-dev libexpat1-dev gettext libz-dev libssl-dev build-essential
cd /tmp
curl --progress https://git-core.googlecode.com/files/git-1.8.5.2.tar.gz | tar xz
cd git-1.8.5.2/
make prefix=/usr/local all
sudo make prefix=/usr/local install
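As a small sketch of our own (not in the original slides), the "only if older than 1.7.10" condition can be scripted with dpkg --compare-versions before deciding whether to rebuild:
# rebuild Git from source only when the packaged version is older than 1.7.10
GIT_VERSION=$(git --version | awk '{print $3}')
if dpkg --compare-versions "$GIT_VERSION" lt 1.7.10; then
  echo "Git $GIT_VERSION is too old: building 1.8.5.2 from source"
else
  echo "Git $GIT_VERSION is recent enough: skipping the rebuild"
fi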
We also install Postfix as the mail server for handling email:
sudo apt-get install -y postfix
During the installation a screen appears: here choose Internet Site, confirm by pressing Enter,
enter the fully qualified name of the cloud server, and confirm again with Enter.
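If you would rather skip the interactive screen entirely, a common alternative (our own suggestion, not shown in the slides) is to pre-seed the debconf answers; replace gitlab.example.com with the full name of your cloud server:
# answer the Postfix questions in advance so the install runs unattended
echo "postfix postfix/main_mailer_type select Internet Site" | sudo debconf-set-selections
echo "postfix postfix/mailname string gitlab.example.com" | sudo debconf-set-selections
sudo DEBIAN_FRONTEND=noninteractive apt-get install -y postfix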
8. #VenerdìDelCloud 8
Let's proceed with downloading and compiling Ruby:
mkdir /tmp/ruby && cd /tmp/ruby
curl --progress ftp://ftp.ruby-lang.org/pub/ruby/2.0/ruby-2.0.0-p353.tar.gz | tar xz
cd ruby-2.0.0-p353
./configure --disable-install-rdoc
make
sudo make install
and then the Bundler gem:
sudo gem install bundler --no-ri --no-rdoc
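A quick sanity check (our own addition, not in the slides) confirms that the freshly built Ruby and the Bundler gem are the ones actually on the PATH:
# verify the Ruby toolchain before moving on
ruby -v          # should report ruby 2.0.0p353
which ruby       # should point to /usr/local/bin/ruby
bundle -v        # Bundler version installed above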
9. #VenerdìDelCloud 9
At this point we need to create and define GitLab's git user,
and then install GitLab Shell, an SSH access and repository management
application developed specifically for GitLab:
sudo adduser --disabled-login --gecos 'GitLab' git
cd /home/git
sudo -u git -H git clone https://github.com/gitlabhq/gitlab-shell.git
cd gitlab-shell
sudo -u git -H git checkout v1.7.4
sudo -u git -H cp config.yml.example config.yml
10. #VenerdìDelCloud 10
We now have to edit the config.yml file, replacing the value of the
gitlab_url entry with the IP address of our cloud server:
sudo -u git -H editor config.yml
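If you prefer not to open the editor at all, the same change can be made with a one-liner (our own sketch; 203.0.113.10 is a placeholder for your server's IP address):
# set gitlab_url in config.yml non-interactively, then confirm the change
sudo -u git -H sed -i "s|^gitlab_url:.*|gitlab_url: \"http://203.0.113.10/\"|" config.yml
grep ^gitlab_url config.yml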
When we are done (remember to save and quit the editor
with the key sequence ESC then ZZ), continue with the setup:
sudo -u git -H ./bin/install
12. #VenerdìDelCloud 12
…in the next part of the lesson we continue with the procedures
to install GitLab and to configure it on a cloud server…
Keep following us.
Content by HostingTalk