This document provides a summary of Mopuru Babu's experience and skills. He has over 9 years of experience in software development using Java technologies and 2 years of experience in Hadoop development. He has expert knowledge of technologies like Hadoop, Hive, Pig, Spark, and databases like HBase and SQL. He has worked on projects for clients in various industries involving designing, developing, and deploying distributed applications that process and analyze large datasets.
MOPURU BABU
Mobile: +1-321-210-3823
Email: babu.m@sunshineconsulting.co
EXPERIENCE SUMMARY
Over 9 years of professional experience in software development in Java technologies, with 2 years
in Hadoop development using Hadoop, Hive, HBase, MapReduce, Pig, Sqoop, Oozie, shell scripting,
YARN, Scala and Spark, including designing, developing and deploying n-tier and enterprise-level
distributed applications.
• Working as a Hadoop Developer for 3 years on various Hadoop platforms like Cloudera &
Hortonworks, with over six years of experience in software development on Java & Spring
frameworks (Spring IoC/core, Spring DAO support, Spring ORM, Spring AOP, Spring Security,
Spring MVC, Spring Cache and Spring Integration).
• Expert knowledge of J2EE design patterns like MVC architecture, Front Controller, Session
Facade, Business Delegate and Data Access Object for building J2EE applications.
• Designed & developed several multi-tier web-based, client-server and multithreaded applications
using Object Oriented Analysis and Design concepts and Service Oriented Architecture (SOA),
mostly in cross-platform environments.
• Excellent working knowledge of popular frameworks like Struts, Hibernate, and Spring MVC.
• Developed core modules in large cross-platform applications using Java, J2EE, Spring, Struts,
Hibernate, JAX-WS (SOAP)/JAX-RS (REST) web services and JMS.
• Good understanding of HDFS design, daemons, federation and HDFS high availability (HA).
• Experience in developing Hive and Pig scripts.
• Extensive knowledge of creating different Hive tables with different file compression formats.
• Hands on experience in installing, configuring and administering Hadoop cluster components
like MapReduce, HDFS, HBase, Hive, Sqoop, Spark, Pig, ZooKeeper, Oozie and Flume using
the Apache code base.
• Experience in managing scalable Hadoop clusters, including cluster design, provisioning,
custom configuration, monitoring and maintenance, using Hadoop distributions: Cloudera CDH.
• Good experience using Apache Spark.
• Worked on a prototype Apache Spark Streaming project, converting an existing Java Storm
topology (a minimal sketch appears after this list).
• Experience managing the Cloudera distribution of Hadoop (Cloudera Manager).
• Excellent understanding of NoSQL databases like HBase.
• Extensive working knowledge of setting up and running clusters, monitoring, data analytics,
sentiment analysis, predictive analysis and data presentation in the big data world.
• Hands on experience working on structured and unstructured data in various file formats such
as XML, JSON and sequence files using MapReduce programs (see the MapReduce sketch after
this list).
• Extensive experience writing SQL queries using HiveQL to perform analytics on structured
data (an example query appears in the project section below).
• Expertise in data load management, importing & exporting data using Sqoop & Flume.
• Performed different Pig operations, joins and transformations to clean, aggregate and
analyze data.
• Excellent interpersonal and communication skills; creative, research-minded, technically competent, and results-oriented, with problem-solving and leadership skills.
• Expert in database and RDBMS concepts, using MS SQL Server and Oracle 10g.
• Expertise in working with web development technologies such as HTML, CSS, and JavaScript.
• Expertise in working with different methodologies such as Waterfall and Agile.
• Proficient in using databases such as MySQL, MS SQL Server, and DB2.
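To make the HiveQL analytics above concrete, here is a minimal, hypothetical Java sketch that runs an aggregation through the HiveServer2 JDBC driver. The host, port, credentials, and the sales_events table are assumptions for illustration only, not details from any engagement below.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        // Load the HiveServer2 JDBC driver (host/port below are assumed).
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hive-host:10000/default", "user", "");
             Statement stmt = conn.createStatement()) {
            // Count rows per category in a structured table (assumed schema).
            ResultSet rs = stmt.executeQuery(
                "SELECT category, COUNT(*) AS cnt "
                + "FROM sales_events GROUP BY category ORDER BY cnt DESC");
            while (rs.next()) {
                System.out.println(rs.getString("category") + "\t" + rs.getLong("cnt"));
            }
        }
    }
}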
SOFTWARE SKILLS
Primary Skills: Analysis, Design, Development, Implementation, Testing & Packaging
Languages: Java
Big Data Skills: Hadoop, MapReduce, Hive, Pig, Sqoop, Oozie, Scala, Spark
RDBMS: MS SQL Server 2000/2005, DB2, Oracle 8.x/9i/10g
NoSQL: HBase, MarkLogic
Internet Technologies: JSP, HTML, XML & CSS
Scripting Languages: JavaScript, JSON, AngularJS
Application Servers: WebSphere 6.0, WebLogic 8.1, JBoss 5.0
Web Server: Tomcat 5.0
Frameworks: Struts, Spring, Hibernate, Log4j
CM Tools: IBM ClearCase, IBM Rational Team Concert (thick client & web versions), WinCVS, SVN, Git
Defect Tracking Tools: IBM TSRM, IBM Rational Team Concert, IBM Rational Quality Manager
Build Tools: Apache Ant 1.6.5
Testing Tools: JUnit
IDE & GUI: Eclipse 3.3, IBM RAD, IBM RSA & NetBeans
Operating Systems: Windows 7, Windows 95/98/ME/NT/XP, Unix & Linux
UML Modeling Tools: StarUML
Web Technologies: Servlets, JSP
INDUSTRY EXPERIENCE
• UK Public Sector
• Finance Sector
• Retail Sector
Achievements at Workplace
• Received a Star Performer award at IBM.
• Was appreciated for committed and reliable work.
• Received Value awards and Deep Skill Adder awards at IBM.
• Received much appreciation for teamwork and for adapting to new technologies, along with many client appreciations.
Interpersonal Competencies
• Ability to interact successfully with multiple teams across the global organization, including
services and support for all regions.
• Strong mathematical and analytical background with innovative thinking and creative action.
• Sound technical skills and a quick grasp of business flows; a self-learning professional.
• A value-adding attitude.
• Zeal to learn new technologies.
Project #1 : SPST (Service Pac Product Selector Tool)
Client : IBM, Raleigh, NC
Environment : Hadoop, Hive, Pig, Sqoop, MapReduce, Java (JDK 1.7), Linux, MySQL, NoSQL, Cloudera, Spring
Duration : Apr’15 - Till date
Description : This is a web-based online tool designed to help offering and sales managers and global geographies create, maintain, and distribute Service Pac product offering data worldwide. Along with release enhancements and maintenance activities, business analytics were also implemented.
Delivered multiple big data use cases to support custom sales offerings using Hadoop, including KPIs and metrics such as the following (a representative MapReduce sketch follows the list).
Per country:
- Which offer description occurred the least and the most times
- Which offer number has the least and the most occurrences
- Which offer type has the least and the most occurrences
- Which part pin has the least and the most occurrences
- Which offer pin has the least and the most occurrences
- What the highest and lowest costs are; whether they are repeated, and if so, how many times and for which offer types
- The top 10 offer descriptions
- The top 10 offer types
- Which offer description has the highest price and which the lowest
- The difference between the highest and the lowest price
- How many offers were released in each year
- Which part pins to suggest to the customer (suggest at least 3 categories)
- Which offer pins not to suggest to the customer (do not suggest 3 categories)
- How many records are at a loss
- For the combination of the top offer description and the top offer code, how many part pins are repeated
- How many offer numbers each country code has
- For each year, a tab-separated file of the form:
  year  maxopencost  maxclosecost  offernumber  offercode  country
Zone-wide:
- All of the above, repeated for the entire zone
- How many country codes each zone has
Total:
- All of the above, repeated across all zones
- The total number of country codes and how many country codes each zone has, as a CSV-formatted output file, e.g.:
  zone1,AM,300
  zone1,AT,365
Different zones:
- Which country in each zone has the most machine types
- Which country in each zone has the highest price, with complete details
- The highest price within each zone, with complete details
- The latest announce date per country and zone
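As a concrete illustration of how one of these KPIs could be computed, the sketch below is a minimal, hypothetical MapReduce job that counts offer-type occurrences per country. The CSV input layout (country code in column 0, offer type in column 3) is an assumption for illustration, not the project's actual schema.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class OfferTypeCount {

    public static class OfferMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text countryAndType = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            // Assumed CSV layout: country in column 0, offer type in column 3.
            String[] cols = value.toString().split(",");
            if (cols.length > 3) {
                countryAndType.set(cols[0] + "\t" + cols[3]);
                ctx.write(countryAndType, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            // Sum the occurrence counts for each (country, offer type) pair.
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "offer type count per country");
        job.setJarByClass(OfferTypeCount.class);
        job.setMapperClass(OfferMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Sorting the per-country counts then yields the "least and most occurrences" answers above.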
As a Senior Team Member, was responsible for
• Installation and configuration of a Hadoop cluster along with Hive.
• Developed MapReduce applications using the Hadoop MapReduce programming framework to pre-process large data sets in parallel across the Hadoop cluster.
• Developed the code for importing and exporting data into HDFS and Hive using Sqoop.
• Responsible for writing Hive queries for analyzing data in the Hive warehouse using Hive Query Language (HQL).
• Involved in defining job flows using Oozie to schedule and manage Apache Hadoop jobs as directed acyclic graphs.
• Developed Hive user-defined functions (UDFs) in Java, compiled them into JARs, added them to HDFS, and executed them from Hive queries (a minimal sketch follows this list).
• Experienced in managing and reviewing Hadoop log files.
• Responsible for managing data coming from different sources.
• Assisted in monitoring the Hadoop cluster.
• Dealt with high volumes of data in the cluster.
• Tested and reported defects from an Agile methodology perspective.
• Consolidated all defects, reported them to the PM/leads for prompt fixes by the development teams, and drove them to closure.
• Installed Hadoop ecosystem components (Hive, Pig, Sqoop, HBase, Oozie) on top of the Hadoop cluster.
• Imported data from SQL to HDFS and Hive for analytical purposes.
• Involved in developing the controller, service, and DAO layers of the Spring Framework for the SPST project dashboard.
• Attended business requirement meetings, and handled UAT support, onsite-offshore coordination, and work assignment.
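The UDF work above follows the standard Hive pattern: extend Hive's UDF base class, package the class into a JAR, and register the function from HiveQL. The sketch below is a minimal, hypothetical example; the class name and the normalization logic are illustrative, not from the project.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: normalizes an offer description before analysis.
public final class NormalizeDescription extends UDF {
    public Text evaluate(Text input) {
        if (input == null) return null;
        return new Text(input.toString().trim().toLowerCase());
    }
}

After compiling it into a JAR, such a function would typically be registered with ADD JAR and CREATE TEMPORARY FUNCTION before being called from a query.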
Project #2 : Inventory Management
Client : Wal-Mart, Bentonville, AR
Environment : CDH4, Eclipse, HDFS, Hive, MapReduce, Spark, Spark-SQL, Oozie, Sqoop, Pig
Duration : 22 Months (Jun’13 – Mar’15)
Description : The application's tasks include data extraction from various input sources such as DB2 and XML into a Cassandra database. The incoming data relates to Wal-Mart store details such as addresses, alignments, divisions, and departments. The data is filtered according to the business logic and stored in the respective column families. Services were written to retrieve the data from Cassandra.
As a Senior Team Member, was responsible for
• Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data (a minimal Java sketch of a comparable aggregation follows this list).
• Imported data from different sources such as HDFS and HBase into Spark RDDs. Developed a data pipeline using Kafka and Storm to store data in HDFS, and performed real-time analysis on the incoming data.
• Automated the process of extracting data from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
• Developed Scala scripts and UDFs using both DataFrames/SQL and RDDs/MapReduce in Spark for data aggregation and queries, writing data back to the OLTP system directly or through Sqoop.
• Explored Spark for improving the performance and optimization of the existing algorithms in Hadoop, using Spark Context, Spark-SQL, and Spark on YARN.
• Optimized performance on large datasets using partitioning, Spark's in-memory capabilities, broadcasts, effective and efficient joins, transformations, and other heavy lifting during the ingestion process itself.
• Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
• Performed transformations such as event joins, bot-traffic filtering, and pre-aggregations using Pig.
• Developed MapReduce jobs to convert data files into the Parquet file format.
• Developed business-specific custom UDFs in Hive and Pig.
• Configured Oozie workflows to run multiple Hive and Pig jobs that trigger independently based on time and data availability.
• Optimized MapReduce code and Pig scripts, and performed performance tuning and analysis.
• Involved in the design, development, and testing phases of the software development life cycle.
• Performed Hadoop installations, updates, patches, and version upgrades when required.
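As a rough illustration of the Spark-SQL aggregation and Parquet output described in this list, here is a minimal, hypothetical sketch in Java. The HDFS paths, column names, and the use of the newer SparkSession API are assumptions; the project itself used Scala on an earlier Spark release, so the real code differed.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.sum;

public class InventoryAggregation {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("inventory-aggregation")
                .getOrCreate();

        // Read raw inventory records previously landed on HDFS (assumed path).
        Dataset<Row> inventory = spark.read()
                .option("header", "true")
                .csv("hdfs:///data/inventory/raw");

        // Aggregate on-hand quantity per store and department (assumed columns).
        Dataset<Row> byDept = inventory
                .groupBy(col("store_id"), col("department"))
                .agg(sum(col("on_hand_qty").cast("long")).alias("total_on_hand"));

        // Persist in Parquet for downstream Hive/Impala queries.
        byDept.write().mode("overwrite")
                .parquet("hdfs:///data/inventory/by_department");

        spark.stop();
    }
}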
Project #3 : DWP CIS (Customer Information System)
Client : Public Sector, UK Government, Preston, UK
Environment : Core Java, J2EE, JSP, Struts, Web services, Altova XML Spy, WAS 6.0, RAD,
Apache ANT, Oracle SQL Developer, RTC 3.0
RDBMS : Oracle
Duration : 19 Months (Nov’ 11 – May’ 13)
Description : CIS is the Customer Information System, which holds all the personal details of UK citizens. Along with the personal details, the benefits they are entitled to and registered for are also stored in CIS. CIS is the primary store for personal details and a slave for the benefit and award details stored in the database.
CIS was formed in 2004 by merging two systems, DCI (Department Central Index) and PDCS (Personal Data Computer Store). These were two entirely different systems that never communicated with each other.
Because of that, whenever a change happened in one system, it was not reflected in the other. As part of Release 1, the CIS database was built with information flowing from DCI and PDCS into CIS. PDCS was COBOL-based, and all systems that had been talking to PDCS now talk to CIS through batch and online interfaces. Dialogues were replaced by SEF screens and functions, meaning that the mainframe dialogues were replaced by browser-based screens. The CIS online functions that replace DCI can be decomposed into a number of discrete function types: primary access, secondary access, data link requests, and inter-service access.
As a Senior Team Member, was responsible for
• Implemented the GUI as per requirements.
• Involved in creating the functional design and technical design.
• Involved in coding business logic, persisting data with Struts and JDBC, and unit and integration testing (a minimal sketch follows this list).
• Developed reusable components that can be used in all modules.
• Involved in supporting UAT activities and production issues, fixing bugs, and defect tracking.
• Involved in creating XSDs.
• Involved in debugging, troubleshooting, and defect fixing.
• Involved in updating the XSDs based on the business requirements.
• Involved in testing the web services and integrating with external vendors and internal clients.
• Assisted developers in the technical design, construction, and unit testing phases.
• Analyzed and understood the architectural requirements.
• Proposed new design solutions that exceed the client's expectations.
• Involved in conducting peer code reviews and reviewing functional documents.
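To illustrate the Struts-plus-JDBC pattern used here, the following is a minimal, hypothetical Struts 1 action that looks up a citizen record by NINO and forwards to a view. The class name, connection URL, table, and column names are assumptions, not the project's actual code.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

public class CustomerLookupAction extends Action {
    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request,
                                 HttpServletResponse response) throws Exception {
        // Keyed National Insurance number submitted from the browser screen.
        String nino = request.getParameter("nino");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@db-host:1521:cis", "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                "SELECT surname, date_of_birth FROM citizen WHERE nino = ?")) {
            ps.setString(1, nino);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    // Expose the record to the JSP view.
                    request.setAttribute("surname", rs.getString("surname"));
                    request.setAttribute("dateOfBirth", rs.getDate("date_of_birth"));
                }
            }
        }
        return mapping.findForward("success");
    }
}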
Project #4 : E Referrals
Client : Public Sector, UK Government, Preston, UK
Environment : Java, HTML, JSP, J2EE, JDBC, EDS Tool
RDBMS : MySQL 5.5
Servers : Tomcat
Duration : 12 Months (Nov’ 10 to Oct’ 11)
Description : The eReferrals overpayments application provides a solution that allows the user to complete debt referrals online. Where possible, the service pre-populates debt referral screens with client data held in CIS, retrieved using a keyed NINO for the case.
The application at its most basic will:
Create an eReferral to Debt Manager via user input and a data scrape from CIS.
Route the referral through the approval process.
Collate daily approved referrals into a Debt Referral Batch file.
Dispatch the file to Debt Manager.
A copy of the file is held on eReferrals for a 7-day period only.
Incoming data consists of client data from CIS in XML format. Outgoing data destined for Debt Manager consists of an XML document made up of debt referrals, plus various reports in CSV format.
As a Senior Team Member, was responsible for
• Involved in implementing the GUI as per requirements.
• Involved in LLD, functional design, and technical design.
• Involved in creating the web pages using JavaServer Pages.
• Implemented the controller logic in servlets (see the sketch after this list).
• Implemented client-side validations using JavaScript.
• Implemented JDBC components.
• Prepared unit test cases.
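As a minimal illustration of the servlet controller and JDBC work listed above, here is a hypothetical sketch that validates a posted referral form and inserts it via JDBC. The connection URL, table, and field names are assumptions for illustration only.

import java.io.IOException;
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ReferralServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String nino = req.getParameter("nino");
        String amount = req.getParameter("amount");
        // Server-side validation backing up the client-side JavaScript checks.
        if (nino == null || nino.isEmpty() || amount == null) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "Missing fields");
            return;
        }
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/ereferrals", "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO debt_referral (nino, amount) VALUES (?, ?)")) {
            ps.setString(1, nino);
            ps.setBigDecimal(2, new BigDecimal(amount));
            ps.executeUpdate();
        } catch (Exception e) {
            throw new ServletException("Referral insert failed", e);
        }
        resp.sendRedirect("confirmation.jsp");
    }
}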
Project #5 : E forms
Client : Public Sector, UK Government, Preston, UK
Environment : Core Java, HTML, JavaScript, JSP, Servlets, JDBC, RTC
RDBMS : MySQL 5.5
Servers : Tomcat
Duration : 9 Months (Feb’ 10 to Oct’ 10)
Description : E forms is a service available to all staff via the DWP intranet that allows DWP staff to submit various personnel-related forms online. It was initially developed to allow staff access to their pay information and to submit expense and overtime claims online. However, the majority of this work has now been taken over by the strategic resource management system. E forms continues to provide services for a few pay/expense forms.
As a Senior Team Member, was responsible for
• Involved in implementing the GUI as per requirements.
• Involved in LLD, functional design, and technical design.
• Involved in creating the web pages using JavaServer Pages.
• Implemented the controller logic in servlets.
• Implemented client-side validations using JavaScript.
• Implemented JDBC components.
• Prepared unit test cases.
Project #6 : ACG Italy
Client : ACG VISION4, Italy
Environment : Java, JSON, Struts, Hibernate, WebSphere, RSA, ClearCase, DB2, JUnit, RQM
RDBMS : DB2
Duration : 18 Months (Aug ’08 – Jan’ 10)
Description : ACG Vision4 is an ERP product that covers all of the company's processes. It is designed to be used easily by all the different types of users, is flexible in its ability to integrate and interact with other systems inside and outside the company, and makes its functions and data accessible from the most popular office automation systems. It is data-independent, intuitive, and simple.
It has modules such as Finance, Controlling, Supply Chain Management, and SVM. ACG developed Vision4 taking advantage of IBM's best technology and the most popular open-source technology on the market. The database and platform infrastructure are built on IBM DB2 and IBM WebSphere, and querying, reporting, and analysis are done with Cognos. The architecture adheres to the principles of SOA and was built using proven industry standards such as Hibernate, Struts, and Dojo (Web 2.0). Development was carried out entirely in Java, with everything installed on multiple platforms: Linux, OS/400, and Windows.
As a Team Member, was responsible for
• Responsible for leading the project through various phases, including design, development, and unit testing of the application modules, and for managing a team.
• Worked as a functional group leader implementing the functional use cases.
• Developed reusable components that can be used in all modules.
• Developed proofs of concept for the ACG framework and provided technical solutions.
• Developed the UI using JSON.
• Implemented the action classes using Struts.
• Wrote Hibernate components and conducted peer code reviews (a minimal sketch follows this list).
• Performed requirement analysis with the business users and the onshore counterparts.
• Involved in enhancements, debugging, troubleshooting, and defect fixing.
• Involved in unit testing, white-box testing, and integration testing of the application.
• Customized and explored new features of the ACG framework.
• Involved in assisting junior team members.
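To illustrate the kind of Hibernate component mentioned above, here is a minimal, hypothetical mapped entity. The entity, table, and column names are illustrative, not from ACG Vision4.

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical mapped entity representing an ERP invoice row.
@Entity
@Table(name = "invoice")
public class Invoice {
    @Id
    @GeneratedValue
    private Long id;
    private String customerCode;
    private double amount;

    public Long getId() { return id; }
    public String getCustomerCode() { return customerCode; }
    public void setCustomerCode(String customerCode) { this.customerCode = customerCode; }
    public double getAmount() { return amount; }
    public void setAmount(double amount) { this.amount = amount; }
}

A DAO would then persist it through a Session obtained from the SessionFactory, for example: session.beginTransaction(); session.save(invoice); session.getTransaction().commit();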
Project #7 : eCIS (electronic Customer Information System)
Client : Service Master & MerryMaids, United States
Environment : Java, XML, Spring, Hibernate, JMS, Web Services and Ext JS
RDBMS : MSSQL Server 2005, Oracle
Servers : Tomcat 6 & JBoss 5.4
Duration : 11 Months (Sep '07 – July ’08)
Description : The eCIS (Electronic Customer Information System) provides services to customers and maintains the list of services for different branches and franchises. eCIS is a web-based, distributed application. It maintains customer and employee information for different services. The end user is the main active participant in the system.
The application includes the following modules for the customer:
Customer / Employee / Service / Account Receivables / Utilities / Maintenance
As a Team Member, was responsible for
• Involved in analysis, effort estimation, design, and development of the Employee Audit, Email Reminders, Audit Rules, and Workflow feature enhancements.
• Involved in analysis, design, and development of customization framework enhancements such as allocation verification on save.
• Involved in debugging, troubleshooting, and defect fixing.
• Involved in setting up and configuring the development and deployment of the application.
• Designed and developed customized Ext JS components for the application.
• Implemented Spring Security for the application, the Ext JS delegate/facade data object controller, domain locking, and exception handling using Spring (a minimal sketch follows this list).
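For illustration, here is a minimal, hypothetical sketch of a Spring Security setup of the kind described above. It uses Spring Security's later Java-configuration style for brevity, whereas this project era typically relied on XML configuration; the URL patterns and role names are assumptions.

import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;

@Configuration
@EnableWebSecurity
public class SecurityConfig extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        // Restrict administrative pages to ADMIN users, require login for
        // everything else, and enable form-based login (assumed policy).
        http.authorizeRequests()
                .antMatchers("/admin/**").hasRole("ADMIN")
                .anyRequest().authenticated()
                .and()
                .formLogin();
    }
}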
EDUCATION
• Bachelor of Science in Computer Science (BSc) from S.V. University.
• Master of Computer Applications (MCA) from S.V. University.