University of Belgrade 
Faculty of Organizational Sciences 
Laboratory for E-business 
HADOOP INFRASTRUCTURE FOR 
EDUCATION 
Darko Marjanović, darko@elab.rs 
Miloš Milovanović, milovanovicm@elab.rs 
Božidar Radenković, boza@elab.rs
Laboratory for E-business 
• Exists within the Faculty of Organizational Sciences, University of 
Belgrade 
• Has organized e-learning courses since 2001, using the Moodle LMS and a 
blended learning concept 
• More than 1000 students take our courses each year 
• Research areas: 
- E-business 
- Internet and mobile technologies 
- Big Data 
- Cloud Computing 
- E-education 
- Adaptive e-services 
- Internet of things 
- Social media
Overview 
• Introduction 
• Hadoop model for education 
• Implementation 
• Cluster organization 
• Conclusion
Introduction 
• Educational institutions need access to relevant information in 
order to offer high-quality education to students. 
• Main problem: information arrives at organizations 
• from a variety of sources 
• at a rapidly increasing speed 
• in a variety of types. 
• Hadoop is a possible solution to this problem.
Hadoop 
• Apache Hadoop is an open-source software framework for storage 
and large-scale processing of data-sets on clusters of commodity 
hardware. 
• All the modules in Hadoop are designed with a fundamental 
assumption that hardware failures are common and thus should be 
automatically handled in software by the framework.
Big Data 
• Big data is a blanket term for any collection of data sets so large 
and complex that it becomes difficult to process using on-hand 
database management tools or traditional data processing 
applications.
Hadoop model for education 
Guidelines used for deploying the Hadoop model: 
• Efficient data import 
• Reliable manipulation 
• Flexible output
Model for managing Big Data in 
educational institutions
Implementation 
• Three node cluster 
• Integration with Moodle LMS 
• Distributed storage 
• A running Hadoop cluster consumes a significant amount of 
resources, so monitoring and controlling them is necessary.
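
The kind of processing a cluster integrated with Moodle might run can be sketched in the map, shuffle, and reduce style that Hadoop's MapReduce model uses. This is a minimal single-process illustration, not the Laboratory's implementation: the log format `user_id,course_id,action`, the sample records, and all function names are assumptions made for the example and do not reflect the real Moodle schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical log lines in the form "user_id,course_id,action".
# This is an assumed format for illustration, not the Moodle log schema.
SAMPLE_LOG = [
    "u1,c101,view",
    "u2,c101,submit",
    "u1,c102,view",
    "u3,c101,view",
]


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a (course_id, 1) pair for every log line."""
    for line in lines:
        _user, course, _action = line.split(",")
        yield course, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values for each key to get the event count per course."""
    return {key: sum(values) for key, values in groups.items()}


def count_events_per_course(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence on one machine."""
    return reduce_phase(shuffle(map_phase(lines)))
```

On a real cluster the map and reduce functions would run in parallel on different nodes over data stored in the distributed file system; here they run sequentially only to show the data flow.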
Implemented Hadoop e-learning 
infrastructure
Cluster organization 
• The Master node plays the central role and is responsible for 
Hadoop's performance. 
• To make better use of hard disk capacity, the implementation 
described here also runs a Data Node on the Master node. 
• Data loss between nodes is prevented by constantly monitoring 
the network infrastructure. 
• Data replication is the mechanism for preserving data within the cluster.
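
Why replication preserves data in a three-node cluster can be illustrated with a small simulation. This is a sketch under stated assumptions: the node names are hypothetical, the replication factor of 2 is chosen only for illustration, and the round-robin placement is a simplification that does not reproduce HDFS's actual block placement policy.

```python
from typing import Dict, List, Set


def place_blocks(blocks: List[str], nodes: List[str],
                 replication: int) -> Dict[str, Set[str]]:
    """Assign each block to `replication` distinct nodes, round-robin.

    Simplified placement for illustration only; not the HDFS policy.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement: Dict[str, Set[str]] = {}
    for i, block in enumerate(blocks):
        placement[block] = {
            nodes[(i + r) % len(nodes)] for r in range(replication)
        }
    return placement


def available_blocks(placement: Dict[str, Set[str]],
                     failed: Set[str]) -> List[str]:
    """Return the blocks that still have a copy on a surviving node."""
    return [b for b, holders in placement.items() if holders - failed]


# Hypothetical node names; the master also stores data, as in the slides.
NODES = ["master", "node2", "node3"]
BLOCKS = ["b0", "b1", "b2", "b3"]
```

With two copies of every block, calling `available_blocks(place_blocks(BLOCKS, NODES, 2), {"node2"})` still returns all four blocks, whereas with a single copy the failure of one node loses every block stored only there.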
Hadoop cluster organization in 
Laboratory for E-business
Conclusion 
• A scalable platform that brings Hadoop-based Big Data processing to the 
e-learning environment is presented. 
• The main contribution of the paper is an environment for 
manipulating data generated from a variety of sources in educational 
activities. 
• The primary objective is to improve the e-learning process. 
• Future research is directed toward: 
- optimizing integration with e-learning services 
- integration with a cloud platform
