What we do
● Run an Indian Languages Search Engine
● Research
  ○ Information Extraction
  ○ Information Retrieval
  ○ Information Access
  ○ Virtualization and Cloud
● Users of
  ○ OpenStack
  ○ Hadoop
  ○ and a lot of other FOSS
Before OpenStack
(image source: http://www.codeproject.com/KB/threads/hxgrid/image4.jpg)
Problems
● Provisioning
  ○ Ad hoc
  ○ Time consuming
  ○ Unmanaged
● User Management
  ○ No resource accounting
  ○ Access Control
  ○ Usage Restriction
● Storage
  ○ Data reliability
  ○ Duplication
More Problems...
● Cluster
  ○ Terrible Resource Utilization
  ○ New deployment => Too much time
  ○ Data Redundancy
  ○ Non-optimal deployments
● Academic
  ○ No cloud platform for experimentation
  ○ Large-scale sandboxed resource provisioning for students
Storage
● Hadoop-compatible distributed storage
● Glance image store
● Desktop backup utility using CloudFuse
● Data reliability
● No more Data Fragmentation
OpenStack in Academia
● Research
  ○ Inter-cloud migration
  ○ Inter-cloud scheduling
  ○ Performance Evaluation
● Resource provisioning for course assignments and projects
  ○ 3 courses
  ○ 350+ students
  ○ 20+ projects
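As an illustration of the sandboxed, accounted provisioning described above, here is a minimal sketch of per-project quota checking. It is not OpenStack code and does not reflect how the deployment was actually configured: `ProjectQuota`, its fields, and the limits used are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ProjectQuota:
    """Hypothetical per-project limits; not an OpenStack API object."""
    name: str
    max_vcpus: int
    max_ram_mb: int
    used_vcpus: int = 0
    used_ram_mb: int = 0

    def request(self, vcpus: int, ram_mb: int) -> bool:
        """Reserve resources only if the request fits in the remaining quota."""
        if vcpus <= 0 or ram_mb <= 0:
            raise ValueError("vcpus and ram_mb must be positive")
        fits = (self.used_vcpus + vcpus <= self.max_vcpus
                and self.used_ram_mb + ram_mb <= self.max_ram_mb)
        if fits:
            # Record the usage so later requests see the reduced headroom.
            self.used_vcpus += vcpus
            self.used_ram_mb += ram_mb
        return fits

    def release(self, vcpus: int, ram_mb: int) -> None:
        """Return resources to the project, never dropping below zero."""
        self.used_vcpus = max(0, self.used_vcpus - vcpus)
        self.used_ram_mb = max(0, self.used_ram_mb - ram_mb)


# Illustrative limits for one student project (assumed values).
course = ProjectQuota("course-project-01", max_vcpus=4, max_ram_mb=8192)
first = course.request(vcpus=2, ram_mb=4096)   # fits within the quota
second = course.request(vcpus=4, ram_mb=4096)  # would exceed the vCPU limit
```

Tracking usage per project gives both accounting and usage restriction in one place: a request that would exceed the limit is refused without changing the recorded usage.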
HadoopStack
● Big Data processing on demand
● Entire ecosystem for Big Data: Hadoop Family, Spark, Mahout, R
● Multi-Cloud: OpenStack and AWS