1. rss2email
Lightning talk
Răzvan Deaconescu
razvan@rosedu.org
RLUG Monthly Meetings, September 2011
8 September 2011
2. Context
I use feeds for notifications (wikis, Redmine)
Liferea has problems with HTTPS (probably combined with SNI)
I e-mailed the Liferea devel list and got no reply
the Evolution feed plugin does not work
I prefer to use one application (the e-mail client) for several "tasks": e-mail, calendaring, feed notification
3. The solution: rss2email
converts RSS/Atom notifications into e-mails
easy to install
quick to configure
you configure it on a server and it sends you messages from there
no need to configure it on one client and then reconfigure it on another
4. Installation and usage
install on a stable server
apt-get install rss2email
r2e (with no arguments, prints a help screen)
r2e new razvan@rosedu.org
r2e add 'http://lif.rosedu.org/wiki/feed.php?linkto=diff&content=diff&mode=recent'
Watch out for the ampersand!
r2e run --no-send (check the feeds)
r2e list
r2e delete 7
crontab entry
0 * * * * /usr/bin/r2e run > /dev/null 2>&1
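The ampersand warning is about the shell: an unquoted `&` ends the command and sends it to the background, so the feed URL has to be quoted. A minimal Python sketch of that point, using the standard `shlex` module; the helper name `r2e_add_command` is hypothetical and only builds the command string, it does not run `r2e`:

```python
import shlex

# Feed URL from the slide; it contains '?' and '&', which a shell treats specially.
FEED_URL = "http://lif.rosedu.org/wiki/feed.php?linkto=diff&content=diff&mode=recent"


def r2e_add_command(url: str) -> str:
    """Return an 'r2e add' command line with the URL quoted for a POSIX shell.

    Hypothetical helper for illustration; it only builds the string.
    """
    return "r2e add " + shlex.quote(url)


command = r2e_add_command(FEED_URL)
print(command)

# Splitting the quoted command with shell-like rules keeps the URL as one argument.
arguments = shlex.split(command)
```

`shlex.quote` wraps the URL in single quotes because it contains characters that are unsafe in a shell, which matches the quoting used on the slide.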
5. Configuration
cp /usr/share/doc/rss2email/examples/config.py ~/.rss2email/config.py
DEFAULT_FROM = "rss2email-noreply@swarm.cs.pub.ro"
SMTP_SERVER = "localhost:25"
6. "Customization"
DATE_HEADER = 1
FORCE_FROM = 0
BONUS_HEADER = '\nX-feed: rss2email-noreply@swarm.cs.pub.ro'
plus a filter in the MDA or in the e-mail client
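The filter mentioned above only needs to match the extra `X-feed` header that rss2email adds to each message. A minimal sketch of that matching logic in Python, using the standard `email` module; the function names and the folder names `feeds` and `INBOX` are assumptions for illustration, not part of rss2email or of any particular MDA:

```python
from email import message_from_string

# Header name and value taken from the BONUS_HEADER setting on the slide.
FEED_HEADER = "X-feed"
FEED_VALUE = "rss2email-noreply@swarm.cs.pub.ro"


def is_feed_message(raw_message: str) -> bool:
    """Return True if the raw message carries the X-feed header set by rss2email."""
    message = message_from_string(raw_message)
    # Header lookup in the email module is case-insensitive.
    return message.get(FEED_HEADER, "").strip() == FEED_VALUE


def choose_folder(raw_message: str) -> str:
    """Pick a destination folder; 'feeds' and 'INBOX' are hypothetical names."""
    return "feeds" if is_feed_message(raw_message) else "INBOX"


sample_feed_mail = (
    "From: rss2email-noreply@swarm.cs.pub.ro\n"
    "Subject: Wiki page changed\n"
    "X-feed: rss2email-noreply@swarm.cs.pub.ro\n"
    "\n"
    "Body of the notification.\n"
)
sample_other_mail = "From: someone@example.org\nSubject: Hello\n\nHi.\n"

print(choose_folder(sample_feed_mail))
print(choose_folder(sample_other_mail))
```

In practice the same rule would usually be written in the filter language of the MDA or e-mail client; the sketch only shows which header is being matched.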