Optimal Execution Of MapReduce Jobs In Cloud
Anshul Aggarwal, Software Engineer, Cisco Systems
Session Length: 1 Hour
Tue March 10 21:30 PST
Wed March 11 0:30 EST
Wed March 11 4:30 UTC
Wed March 11 10:00 IST
Wed March 11 15:30 Sydney
Voices 2015 www.globaltechwomen.com
We use the MapReduce programming paradigm because it lends itself well to most data-intensive analytics jobs run in the cloud these days, given its ability to scale out and leverage several machines to process data in parallel. Research has demonstrated that existing approaches to provisioning other applications in the cloud are not immediately relevant to MapReduce-based applications. Provisioning a MapReduce job entails requesting the optimum number of resource sets (RS) and configuring MapReduce parameters such that each resource set is maximally utilized.
Each application has a different bottleneck resource (CPU, disk, or network) and a different level of bottleneck resource utilization, and thus needs to pick a different combination of these parameters, based on the job profile, such that the bottleneck resource is maximally utilized.
The problem at hand is thus to define a resource provisioning framework for MapReduce jobs running in a cloud, keeping in mind performance goals such as optimal resource utilization at minimum incurred cost, lower execution time, energy awareness, automatic handling of node failure, and high scalability.
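The idea of sizing a job around its bottleneck resource can be illustrated with a minimal Python sketch. This is not the framework presented in the talk: the `JobProfile` fields, the `plan` function, and the notion of "waves" of tasks are hypothetical names introduced here, and the per-task demand figures are assumed to come from profiling a sample run.

```python
import math
from dataclasses import dataclass


@dataclass
class JobProfile:
    # Hypothetical per-task demand on each resource, expressed as a
    # fraction of one resource set's capacity (assumed to be profiled).
    cpu: float
    disk: float
    network: float
    total_tasks: int


def bottleneck(profile):
    """Return the name and per-task demand of the most heavily used resource."""
    demands = {"cpu": profile.cpu, "disk": profile.disk, "network": profile.network}
    name = max(demands, key=demands.get)
    return name, demands[name]


def plan(profile, waves=1):
    """Choose task slots per resource set so the bottleneck is close to
    saturated, then the number of resource sets needed to finish all
    tasks in the given number of waves."""
    name, demand = bottleneck(profile)
    slots_per_rs = max(1, math.floor(1.0 / demand))
    resource_sets = math.ceil(profile.total_tasks / (slots_per_rs * waves))
    return {
        "bottleneck": name,
        "slots_per_rs": slots_per_rs,
        "resource_sets": resource_sets,
        "utilization": slots_per_rs * demand,
    }


if __name__ == "__main__":
    disk_bound = JobProfile(cpu=0.10, disk=0.25, network=0.05, total_tasks=100)
    print(plan(disk_bound))
    print(plan(disk_bound, waves=5))
```

Running more waves trades execution time for fewer resource sets, which is one simple way to picture the cost versus execution-time goals listed above.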
An introduction to MR and Hadoop and a view on the opportunities to use MR with databases, such as SQL-MapReduce by Teradata and In-database MR by Oracle.
The presentation was used during a class of Datenbanken Implementierungstechniken in 2013.
As MapReduce clusters have become popular, their scheduling is one of the important factors to consider. To achieve good performance, a MapReduce scheduler must avoid unnecessary data transmission. Hence, different scheduling algorithms for MapReduce are necessary to provide good performance. This slide deck provides an overview of many different scheduling algorithms for MapReduce.
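As an illustration of avoiding unnecessary data transmission, here is a minimal Python sketch of a locality-first assignment. It is not one of the algorithms from the slides and not Hadoop's scheduler; `assign_tasks` and its input shapes are hypothetical. Each task is first offered a free slot on a node that already holds its input, and only falls back to a remote node when no local slot is free.

```python
def assign_tasks(task_blocks, free_slots):
    """Assign tasks to nodes, preferring nodes that hold the task's input.

    task_blocks: task id -> set of nodes holding a replica of its input.
    free_slots:  node -> number of free task slots.
    Returns task id -> (node, is_local).
    """
    slots = dict(free_slots)  # copy so the caller's dict is not modified
    assignment = {}
    pending = []

    # First pass: place every task that can run next to its data.
    for task, nodes in task_blocks.items():
        local = next((n for n in sorted(nodes) if slots.get(n, 0) > 0), None)
        if local is None:
            pending.append(task)
        else:
            slots[local] -= 1
            assignment[task] = (local, True)

    # Second pass: remaining tasks take any free slot and must pull
    # their input over the network.
    for task in pending:
        remote = next((n for n in sorted(slots) if slots[n] > 0), None)
        if remote is None:
            break  # no capacity left; the rest wait for a later round
        slots[remote] -= 1
        assignment[task] = (remote, False)

    return assignment


if __name__ == "__main__":
    blocks = {"t1": {"a"}, "t2": {"a"}, "t3": {"b"}}
    print(assign_tasks(blocks, {"a": 1, "b": 1, "c": 1}))
```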
This was the first session about Hadoop and MapReduce. It introduces what Hadoop is and its main components. It also covers how to program your first MapReduce task and how to run it on a pseudo-distributed Hadoop installation.
This session was given in Arabic, and I may provide a video of the session soon.
A tutorial presentation based on hadoop.apache.org documentation.
I gave this presentation at Amirkabir University of Technology as Teaching Assistant of Cloud Computing course of Dr. Amir H. Payberah in spring semester 2015.
MapReduce examples, starting from the basic WordCount and moving to a more complex K-means algorithm. The code contained in these slides is available at https://github.com/andreaiacono/MapReduce
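For readers new to the pattern, the WordCount example can be sketched in plain Python. This is not the code from the linked repository; it simulates the map, shuffle, and reduce phases in a single process, and the function names are chosen here for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

In a real cluster the map and reduce calls run on different machines and the shuffle moves data between them; the single-process version only shows the data flow.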
Silicon Valley Cloud Computing Meetup
Mountain View, 2010-07-19
Examples of Hadoop Streaming, based on Python scripts running on the AWS Elastic MapReduce service, which show text mining on the "Enron Email Dataset" from Infochimps.com plus data visualization using R and Gephi
Source at: http://github.com/ceteri/ceteri-mapred
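A Hadoop Streaming job in Python boils down to two programs that read lines and write tab-separated key/value lines, with the reducer receiving its input sorted by key. The sketch below is not taken from the linked repository; counting messages per sender from `From:` header lines is an assumed example task, and `mapper` and `reducer` are hypothetical names.

```python
from itertools import groupby


def mapper(lines):
    """Emit 'sender<TAB>1' for every 'From:' header line."""
    for line in lines:
        if line.startswith("From:"):
            sender = line[len("From:"):].strip().lower()
            if sender:
                yield f"{sender}\t1"


def reducer(sorted_lines):
    """Sum the counts per key. Input must already be sorted by key,
    which is how Streaming delivers mapper output to a reducer."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


def run_locally(lines):
    """Simulate map -> sort -> reduce in one process for testing."""
    return list(reducer(sorted(mapper(lines))))
```

To use these with Streaming, each function would be wrapped in a small script that iterates over standard input and prints each result line; the local runner above is only a convenience for checking the logic before submitting a job.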
BFIT for Healthcare is a document management system designed for the healthcare industry. BFIT provides your hospital or clinic with a comprehensive repository for patient data, medical records, X-ray negatives, letters of reference, and other important notes in one system. The system also provides doctors with a notepad and an image editor for recording their observations and solutions, just as they would on a patient's card.
It is a system used to manage, track, and store documents and reduce paper. It enables organizations to manage tasks effectively and streamline the processing of their documents across all departments.
A Practical Guide to Capturing, Organizing, and Securing Your Documents
Scott Abel
Presented by Jeff Potts at Documentation and Training Life Sciences, June 23-26, 2008 in Indianapolis.
Every organization struggles with how to store, tag, and search for their documents. In a hospital corporation, the need is particularly critical. Hospital staff need to be able to quickly find the latest policies and procedures. Auditors need to be able to track who made what changes and when. Lawyers want to know which protocols were in place on a particular date. In this session you’ll learn a practical approach to putting a document management system in place that can help address these needs and reduce your exposure to legal, regulatory, and even human health risks.
Based on lessons learned during a real-world project, the session shows that getting your documents under control doesn’t have to be a multi-year, multi-million dollar effort. The slide deck outlines how a hospital corporation in New England used a “start small and grow” approach to piloting and rolling out a document management solution across the corporation.
Introduction to MapReduce Data Transformations
swooledge
MapReduce is a framework for scalable parallel data processing popularized by Google. Although initially used for simple large-scale text processing, map/reduce has recently been expanded to serve some application tasks normally performed by traditional relational databases.
You Will Learn
* The basics of Map/Reduce programming in Java
* The application domains where the framework is most appropriate
* How to build analytic database systems that handle large datasets and multiple data sources robustly
* Evaluate data warehousing vendors in a realistic and unbiased way
* Emerging trends to combine Map/Reduce with standard SQL for improved power and efficiency
Geared To
* Programmers
* Developers
* Database Administrators
* Data warehouse managers
* CIOs
* CTOs
The Chief Data Officer Agenda: Metrics for Information and Data Management
DATAVERSITY
Welcome to The Chief Data Officer Agenda, a DATAVERSITY monthly webinar focused on the emerging priorities of the Chief Data Officer (CDO). What issues are CDOs facing now, and what should be on their agenda? The webinar series is moderated by DATAVERSITY CEO and Founder, Tony Shaw, who will be joined each month by guest experts to discuss the requirements and demands on the burgeoning CDO role.
This month in the series:
The value proposition of enterprise information management is founded on Information being treated as an Asset. Information management professionals concur, but CxOs will say "So what?" In most organizations, they are both right! The conflict starts with one group thinking metaphorically, and the other literally. CDOs know that “Information asset” needs to be more than a metaphor…it has to be actionable. When you’re in charge of the application and value of data, how do you measure that? How do you measure progress? What types of metrics are there and which ones actually work? There is a lot more to measuring the value of information than common ROI.
This presentation will give you some starting points for real information asset management and information economics. You’ll learn some of the techniques being used successfully today, and considerations for quantifying the value and progress of information management. There is a means of reconciliation between the metaphors and reality, and this talk will outline a vision for the future, but with practical steps to help you get there.
In this session, we'll discuss architectural, design and tuning best practices for building rock solid and scalable Alfresco Solutions. We'll cover the typical use cases for highly scalable Alfresco solutions, like massive injection and high concurrency, also introducing 3.3 and 3.4 Transfer / Replication services for building complex high availability enterprise architectures.
Slide deck from an Alfresco Webinar which can be viewed at http://blogs.alfresco.com/wp/webcasts/2009/05/alfresco-webcast-a-developers-guide-1-capabilities-architecture-optaros/
This presentation discusses what Alfresco is and the options for working with Alfresco from a developer's perspective.
Alfresco 5.2 Introduces New Public REST APIs
For an update, please see: https://www.slideshare.net/jvonka/exciting-new-alfresco-apis
https://www.meetup.com/Alfresco-Meetups/events/236987848/
An overview of the new and enhanced APIs will be discussed and some of the key endpoints demonstrated via Postman so that by the time you leave you should have enough knowledge to create a simple client or integration.
These APIs will also be the foundation for new clients developed for the Alfresco Digital Business Platform.
We'll have a sneak peek at what's coming next and leave plenty of time for questions, feedback and open discussion.
Hadoop is commonly used for processing large swaths of data in batch. While many of the necessary building blocks for data processing exist within the Hadoop ecosystem – HDFS, MapReduce, HBase, Hive, Pig, Oozie, and so on – it can be a challenge to assemble and operationalize them as a production ETL platform. This presentation covers one approach to data ingest, organization, format selection, process orchestration, and external system integration, based on collective experience acquired across many production Hadoop deployments.
This presentation will give you information about:
1. Configuring HDFS
2. Interacting With HDFS
3. HDFS Permissions and Security
4. Additional HDFS Tasks
HDFS Overview and Architecture
5. HDFS Installation
6. Hadoop File System Shell
7. File System Java API
Hadoop Training, Enhance your Big data subject knowledge with Online Training without wasting your time. Register for Free LIVE DEMO Class.
For more info: http://www.hadooponlinetutor.com
Contact Us:
8121660044
732-419-2619
http://www.hadooponlinetutor.com
This talk was for a GDG Fresno meeting. The demo used Google Compute Engine and Google Cloud Storage. The actual talk differed from the slides. There were a lot of good questions from the audience, and the discussion diverted to side topics many times.
Apache Tez: Accelerating Hadoop Query Processing
Bikas Saha
Apache Tez is the new data processing framework in the Hadoop ecosystem. It runs on top of YARN - the new compute platform for Hadoop 2. Learn how Tez is built from the ground up to tackle a broad spectrum of data processing scenarios in Hadoop/BigData - ranging from interactive query processing to complex batch processing. With a high degree of automation built-in, and support for extensive customization, Tez aims to work out of the box for good performance and efficiency. Apache Hive and Pig are already adopting Tez as their platform of choice for query execution.
YARN Ready: Integrating to YARN with Tez
Hortonworks
YARN Ready webinar series helps developers integrate their applications to YARN. Tez is one vehicle to do that. We take a deep dive including code review to help you get started.
Jumpstart your career with the world’s most in-demand technology: Hadoop. Hadooptrainingacademy provides best Hadoop online training with quality videos, comprehensive
online live training and detailed study material. Join today!
For more info, visit: http://www.hadooptrainingacademy.com/
Contact Us:
8121660088
732-419-2619
http://www.hadooptrainingacademy.com/
Everything you wanted to know about Apache Tez:
-- Distributed execution framework targeted towards data-processing applications.
-- Based on expressing a computation as a dataflow graph.
-- Highly customizable to meet a broad spectrum of use cases.
-- Built on top of YARN – the resource management framework for Hadoop.
-- Open source Apache incubator project and Apache licensed.
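Expressing a computation as a dataflow graph can be pictured with a short Python sketch. This does not use the Tez API; the vertex names and the `execution_order` helper are hypothetical. Vertices stand for processing steps, edges for data moving from a producer to a consumer, and a topological sort gives an order in which every step runs after its inputs are ready.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(edges):
    """Return the vertices of a dataflow graph in a valid run order.

    edges: iterable of (producer, consumer) pairs.
    Raises graphlib.CycleError if the graph is not acyclic.
    """
    dependencies = {}
    for producer, consumer in edges:
        dependencies.setdefault(producer, set())
        dependencies.setdefault(consumer, set()).add(producer)
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    # Two scans feed a join, whose output feeds an aggregation.
    dag = [("scan_a", "join"), ("scan_b", "join"), ("join", "aggregate")]
    print(execution_order(dag))
```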
Hadoop is a Java software framework that supports data-intensive distributed applications and is developed under open source license. It enables applications to work with thousands of nodes and petabytes of data.
Similar to Optimal Execution Of MapReduce Jobs In Cloud - Voices 2015 (20)
Breaking the Code of Interview Implicit Bias to Value Different Gender Compet...
Deanna Kosaraju
Breaking the Code of Interview Implicit Bias to Value Different Gender Competencies
Bonita Banducci, Banducci Consulting
Live at Santa Clara University - Room #330C located on the 3rd floor of the Learning Commons
Voices 2015 - www.globaltechwomen.com
Session Length: 1 hour
Implicit Bias Workshops and exercises are being shared widely on the internet. Some of the solutions are:
"Determine precisely what skills and attributes you are hiring for."
"Ask exactly the same questions to each candidate."
But what about the implicit bias in determining what skills you are valuing--beyond traditional management and leadership competencies?
How can interviewers recognize the often invisible, unarticulated, undervalued and often misinterpreted competencies of more "relational and collectivist" people--often women and men and women from different cultures?
Bonita Banducci teaches the Gender and Engineering class in Santa Clara University's School of Engineering Graduate Program. In video and cartoon representation as well as in person, her students apply Gender Competence®--understanding and skills to work with gender (and cultural) differences as competencies--to job interviews both as the interviewer and the interviewee, as men and women. They show how to "mine the gold" of difference for the best candidate AND to get the job as the best candidate while establishing the value of relational competencies in the workplace and marketplace.
Change IT!
S. Revi Sterling, University of Colorado Boulder
Voices 2015 - www.globaltechwomen.com
Session Length: 1 Hour
Dr. Revi Sterling founded and directs the only Information and Communication Technology for Development graduate program in the United States. This talk will demonstrate how IT (ICT as the rest of the world calls it) has given a quantum boost to international development efforts, and will give examples of what works and what doesn’t when technologists turn humanitarians. This talk will open avenues for technologists of all types and levels to truly make an impact with their ideas, while promoting collaboration rather than competition. Sterling will point audiences to helpful resources while catalyzing their creativity.
How Can We Make Interacting With Technology and Science Exciting and Fun Expe...
Deanna Kosaraju
How Can We Make Interacting With Technology and Science Exciting and Fun Experiences?
Marjan BoorBoor, Master of Technological Socio-Economical Planning (Major in Intelligent Renewable Energy System Planning)
Voices 2015 - www.globaltechwomen.com
Thu March 12 7:00 PST
Thu March 12 10:00 EST
Thu March 12 14:00 UTC
Thu March 12 19:30 IST
Fri March 13 1:00 Sydney
Session Length: 30 minutes + 30 minutes networking time
How my curiosity took me to science and engineering. How finding my own way of learning made me fall in love with science and technology, and how sharing this with others has given them a memorable experience, making them curious and innovative in the field. My mission is to reach as many people as possible.
About Marjan:
I have loved math since elementary school, and I did a lot of self-study in math, where I find joy and excitement. In middle school, because of my math grades, I was invited to participate in a preparatory math course for the International Math and Computer Olympiad.
I have a master's degree in Technological Socio-Economical Planning, majoring in renewable energy system planning and development. My educational background is in mathematics, physics, computer science, artificial intelligence and robotics, marketing, leadership, project management, innovation and monitoring, and the innovative psychology of learning.
I built several intelligent robots and participated in several artificial intelligence RoboCup competitions, including international RoboCup competitions in Iran, Germany, the Netherlands, and Atlanta, and won several awards.
I am the developer of energyplanner.dk, an intelligent tool for tracing changes in an energy system due to the gradual energy transition from fossil fuels to renewable energies, which is going to be used by energy planners in Denmark.
I have an innovative way of learning science and technology that helps you not only become good at it but also enjoy it. I have taught math for years, and my experience has been that factors such as gender, age, race, and background do not determine whether someone is good at math and enjoys learning it.
Measure Impact, Not Activity
Karen Catlin, Karen Catlin Consulting
Voices 2015 - www.globaltechwomen.com
LIVE - At Santa Clara University, Room #330C located on the 3rd floor of the Learning Commons
Session Length: 45 minutes
Most women’s affinity groups measure their activity. How many events they’ve run, how many people attended each event, and the like. What’s often missing is a vision of what will be different because of the group’s impact and how to measure it. Frankly, I'm not surprised. That's exactly what I did when I ran the women's group at Adobe Systems. I founded that group in 2008, and while I'm proud of our accomplishments, I didn't measure our impact. I honestly don't think I knew how to. At the time, calculating the return on investment seemed like a holy grail.
I've since realized how important it is to measure the impact of an affinity group and how to go about doing so. In this talk, I will provide practical tips for identifying the vision, gaining support, defining the strategy for achieving it, and measuring the impact.
Women’s INpowerment: The First-ever Global Survey to Hear Voice, Value and Vi...
Deanna Kosaraju
Women’s INpowerment: The First-ever Global Survey to Hear Voice, Value and Vision of Today’s Young Women
Jin In, 4Girls GLocal Leadership (4GGL)
Voices 2015 - www.globaltechwomen.com
Wed March 11 11:00 PST
Wed March 11 14:00 EST
Wed March 11 18:00 UTC
Wed March 11 23:30 IST
Thu March 12 5:00 Sydney
Session Length: 1 Hour
Today, it is almost indisputable that EMPOWERMENT of girls and women is required for sustainable development. Whether poverty, violence and terrorism, or population explosion, as stated by Kofi Annan, 7th Secretary General of the United Nations, “There is no tool for development more effective than the empowerment of women.”
And yet, what do we really know about empowerment of women, particularly at the global level?
One global monitor is the Millennium Development Goals. Specifically, Goal 3 is to promote gender equality and empower women. However, when examined in detail, its one target is to eliminate gender disparity in education, and its four indicators are:
1. Ratio of girls to boys in primary, secondary and tertiary education;
2. Ratio of literate women to men, 15-24 years old;
3. Share of women in wage employment in the non-agricultural sector; and
4. Proportion of seats held by women in national parliament
Another is the World Economic Forum’s Global Gender Gap Report. Here it specifically states that it ranks countries according to gender equality, NOT women’s empowerment.
Therefore, what may be THE GAP in gender gap is empowerment. And to begin our understanding and knowledge, 4GIRLS GLOCAL LEADERSHIP (4GGL) created and launched the first-ever global survey to hear voices, values, and vision of today’s young women around the world.
Our goal for this Conference is to share our initial findings, as well as begin a critical conversation about empowerment – the most powerful tool for change.
Want to read more about Jin In? Check out her website. You can also read her story, a fascinating read!
The Language of Leadership
Caroline Simard, Research Director at Clayman Institute for Gender Research at Stanford University
Voices 2015 www.globaltechwomen.com
Session Length: 1 Hour
Language can influence our perceptions of men and women, and the potential each has to lead. In this session, we discuss the language of leadership and the role bias can play in shaping different leadership outcomes for men and women. The session will offer strategies for individuals and managers to examine and sharpen their own voices and their advocacy of others, with an aim to advance women’s leadership and create effective organizations where all employees thrive.
Mentors and Role Models - Best Practices in Many Cultures - Voices 2015
Deanna Kosaraju
Mentors and Role Models - Best Practices in Many Cultures
Katy Dickinson, Founder, Mentoring Standard
Voices 2015 www.globaltechwomen.com
Wed March 11 8:30 PST
Wed March 11 11:30 EST
Wed March 11 15:30 UTC
Wed March 11 21:00 IST
Thu March 12 2:30 Sydney
Session Length: 1 Hour
Mentoring is a professional methodology with remarkably good payback. This talk will present how mentors, mentees, and their home organizations can make the most of this best practice, including how to start up and measure a mentoring program. Examples will come from successful corporate, governmental, and school-based mentoring programs in Brazil, China, India, the USA, Europe, Africa, and the Middle East. Program success in one Engineering company was measured at over 1,000% return on investment (ROI) with more than twice the normal promotions, 93% satisfaction, 88% mentors working remotely from mentees in 30 global sites, and 70% executive mentors. Many of the stories will come from the U.S. State Department's TechWomen mentoring program for STEM professional women. Since 2011, 250 mentors from 89 Silicon Valley companies have hosted TechWomen Emerging Leaders from the Middle East and Africa who then return to their home countries to be mentors and role models for girls and young women. Illustrations for the talk will come from sources including the "Notable Women in Computing Card Deck" Kickstarter project and the "TechWomen Emerging Leaders from the Middle East and Africa" project.
About Katy: Katy Dickinson designs and manages successful mentoring programs in the Americas, Africa, the Middle East, Europe, and Asia. She has held senior executive roles at Everwise, People to People, MentorCloud, Huawei, and Sun Microsystems. At Sun, she created and managed the global Engineering mentoring programs for ten years, after creating and managing the Sun Labs archiving system, the Software development life cycle process, and other large corporate infrastructure.
Katy Dickinson was the Process Architect for the first class of the U.S. State Department’s TechWomen mentoring program for the Middle East and Africa. She is an Accredited Mentor, University of the South - School of Theology, Education for Ministry program. Member of the Anita Borg Institute Advisory Board. Lecturer for the University of California at Berkeley Engineering class on entrepreneurship. Katy Dickinson was graduated from the University of California at Berkeley with high honors and distinction. She is an author, speaker, and popular blogger.
Featured Session: Voices Live Chicago Conference
Location: Aon
200 East Randolph
Chicago, IL USA
12-2pm CST
Panel: Cracking the Glass Ceiling: Growing Female Technology Professionals
Will be streamed on Spreecast and WebEx from 12-2pm CST on Friday, March 13th
Moderators:
Margaret Resce Milkint, Managing Partner, The Jacobson Group; WING Co-Founder; ITF Board Member
David Mendelsohn, Managing Partner, DLA Piper; WING Co-Founder
Panelists:
Danelle Kent, Consultant, SWC Technology Partners
Danelle is a Certified Project Management Professional (NU) with 4+ years of combined experience in detail oriented technical writing and quality assurance analysis. She currently supports full software lifecycle by facilitating different functional roles including quality assurance analyst, business analyst, and technical writer.
Arti Arora, Aon
Deanne Hettich, Vice President Practice Leadership, Aon Hewitt
Cynthia Clarke, CIO, Mesirow Financial
Jeff Hughes, Vice President Information Technology, CNA
Marisa Cabrera, IT Rotational Program Participant, CNA
Abstract: Despite the strides made recently for women in business, female tech professionals continue to be outpaced by their male counterparts. According to Silicon Valley Bank’s Innovation Economy Outlook survey, less than 50 percent of technology companies have women in the C-suite or serving on the board of directors. Only 19 percent of CIO positions for Fortune 250 companies are held by women.
In fact, the gender disparity among technology professionals seems to be increasing in spite of recent gains throughout the workplace. Fewer women are joining the tech workforce and the number of female students studying technology is in decline: today only 18 percent of computer science majors are women, compared to 37 percent in the mid-1980s. Add in a continued wage imbalance and a high turnover rate for female tech professionals mid-career and it is clear that there is work to be done. How can we encourage more women to join the technology field and insurance technology in particular? What can be done to break down the barriers to success as a female technology professional?
Heart Rate Variability and the Digital Health Revolution - Voices 2015
Deanna Kosaraju
Heart Rate Variability and the Digital Health Revolution
Ronda Collier, SweetWater Health
Voices 2015 www.globaltechwomen.com
Thu March 12 13:00 PST
Thu March 12 16:00 EST
Thu March 12 20:00 UTC
Fri March 13 1:30 IST
Fri March 13 7:00 Sydney
Session Length: 1 Hour
According to leading medical institutions such as Stanford, Harvard and the Mayo Clinic, stress is responsible for more than 90% of all disease. Most people are stressed to some degree and many are not aware of it. This may be due to the fact that the brain is a giant filter and pattern matcher. When a condition becomes familiar the brain filters it out, even if it is not good for you.
HRV is the latest buzz word in the digital health revolution. It provides an objective measure of the stress or “fight or flight” response in your nervous system, and insight into situations, thoughts and behaviors that cause stress. In addition, it is a measure of systemic health and is an early indicator for heart disease and hypertension.
Legacy HRV measurement systems were expensive, bulky and did not provide meaningful feedback for the general population. Thanks to mobile platforms such as smart phones and tablets, HRV is available to the consumer, providing an accurate and easy way to measure and manage the weak link in the chain, also known as stress.
Women and CS, Lessons Learned From Turkey - Voices 2015
Deanna Kosaraju
Women and CS, Lessons Learned From Turkey
Voices 2015 www.globaltechwomen.com
Mon March 9 23:00 PST
Tue March 10 2:00 EST
Tue March 10 6:00 UTC
Tue March 10 11:30 IST
Tue March 10 17:00 Sydney
Dr. Umit Yalcinalp, Architect
Dr. Gokcen Cilingir, Senior Software Engineer
Dr. Gulustan Dogan, Associate Professor
Session Length: 1 Hour
While interest in computing is steadily declining among women in the US, more women in Turkey are attracted to computing and are seeking technical CS degrees. In this talk, members of Turkish Women in Computing will discuss the results of ongoing research into this discrepancy, based on a survey conducted among its members and their connections who are women with computing degrees. We will briefly present the hypotheses previously presented at the Global Voices conference in 2014 and how our findings compare with them. We will also talk about our data collection methodology and some interesting surprises we encountered about demographics that affected the results.
Communications Platform Provides "Your School at your Fingertips" for Busy Pa...
Deanna Kosaraju
Communications Platform Provides "Your School at your Fingertips" for Busy Parents
Nirupama Mallavarupu, MobileArq
Voices 2015 www.globaltechwomen.com
Mon March 9 13:00 PST
Mon March 9 16:00 EST
Mon March 9 20:00 UTC
Tue March 10 1:30 IST
Tue March 10 7:00 Sydney
Session Length: 1 Hour
MobileArq is a mobile school directory for parents created by a mother with expertise in information technology and first-hand knowledge of the needs of busy parents with school-age children. The mobile directory is the cloud version of the standard print directory, which contains the contact information of all parents and families in the school. The MobileArq app has a logical, easy-to-use interface through which parents can access school directory information on their mobile devices as well as desktop computers, whenever and wherever they need it. Unlike printed directories, new data can be added and corrections made in real time so information is always up to date, and families can choose to opt out or edit personal contact information easily throughout the school year. MobileArq gives the user the added convenience of calling, emailing, texting, and messaging groups directly from the app.
MobileArq provides PTAs an all-in-one portal to manage their communication, fundraising and memberships. MobileArq is the first to also provide APIs for third parties to write applications for parents, teachers and students.
ASEAN Women in Tech
Voices 2015 www.globaltechwomen.com
Tue March 10 20:00 PST
Tue March 10 23:00 EST
Wed March 11 3:00 UTC
Wed March 11 8:30 IST
Wed March 11 14:00 Sydney
Lucie Newcomb CEO The NewComm Global Group Inc. (USA), Moderator
Nuraizah Shamsul Baharin, Managing Director of Madcat World Sdn. Bhd. (Malaysia),
Jennifer Kenny, CEO BizTh!nk (Indonesia)
Au Soriano, CEO Pinoy Travel (Philippines)
Haslina Taib, CEO BAG Networks Sdn Bhd (Brunei)
Session length: 1 Hour
In celebration of International Women's Day, from Brunei to Malaysia, Indonesia to the Philippines, ASEAN to California and around the world, we will share our experiences and recommendations; celebrate our unique contexts as well as common ground.
Empowering Women Technology Startup Founders to Succeed - Voices 2015Deanna Kosaraju
Empowering Women Technology Startup Founders to Succeed
Ari Horie, CEO & Founder, Women's Startup Lab
Voices 2015 - www.globaltechwomen.com
Mon March 9 10:00 PST
Mon March 9 13:00 EST
Mon March 9 17:00 UTC
Tue March 10 22:30 IST
Tue March 10 4:00 Sydney
Session Length: 1 Hour
As Founder and CEO of Women's Startup lab, Ari Horie will share her knowledge on the entrepreneurial journey, obstacles facing female technology founders, and her tips and tricks for women looking to overcome these barriers. In addition, Ari will be able to discuss her philosophy, “The Hito Rule,” which calls on women to not only lean in, but on and up their communities to gather the skills, network, and support needed to advance their companies and how she has successfully implemented this philosophy during her own entrepreneurial journey.
Innovation a Destination and a Journey - Voices 2015Deanna Kosaraju
Innovation – A Destination and a Journey
Valeria Mihalache, Xilinx
Voices 2015 - www.globaltechwomen.com
Mon March 9 9:00 PST
Mon March 9 12:00 EST
Mon March 9 16:00 UTC
Tue March 10 21:30 IST
Tue March 10 3:00 Sydney
Session Length: 1 Hour
We live in a time when change happens at lightning speed, all around us. The only way for companies to be competitive is to innovate, not only with respect to their product offering, but also with respect to their marketing and to their work processes. Moreover, all organizations need innovation in order to stay ahead in the game. Non-profit organizations need to bring innovation in their processes to attract more donors; schools need to innovate with respect to the teaching methods, as well as with respect to the programs they offer, for instance to take advantage of the latest technologies available, or to engage the students more in the learning process.
In this presentation we will focus on innovation and on its two good companions, creativity and invention. We will talk about the differences between the three, while also showing the intrinsic connection between them. Giving real-life examples, we will talk about different types on innovation. We recognize that innovativeness is an elusive trait, that one cannot predict how innovative somebody will be in his or her position. Nevertheless, there are certain things that organizations can do to allow for and to stimulate the innovativeness of their employees/members. We will talk about processes that organizations can use in order to inspire and foster innovation.
Agility and Cloud Computing
Ambs Kesavan, Xilinx
Voices 2015 www.globaltechwomen.com
Session Length: 45 minutes
The objective of this talk is to share technology trends in cloud computing industry and the opportunities they provide to innovate at scale. The presentation highlights the productivity and economic benefits from adopting this disruptive technology to create a sustained competitive advantage for businesses of all sizes ranging from SMB segment to high end enterprises.
The Confidence Gap: Igniting Brilliance through Feminine Leadership - Voices...Deanna Kosaraju
The Confidence Gap: Igniting Brilliance through Feminine Leadership
Lisa Marie Jenkins, Cisco Systems, Lisa Marie Jenkins Consulting
Voices Conference www.globaltechwomen.com
Mon March 9 15:00 PST
Mon March 9 18:00 EST
Mon March 9 22:00 UTC
Tue March 10 3:30 IST
Tue March 10 9:00 Sydney
Session Length: 1 Hour
Evidence shows that women are less self-assured than men, however studies prove confidence is just as important as competence when it comes to advancing a successful career. The confidence gap between men and women is caused by factors that range from upbringing, social conditioning, and biology. Today the 8 most desired characteristics for the modern leader are considered to be feminine, so when women become more courageous, they will naturally become more effective leaders and everyone benefits.
In this session, you will:
• Begin to learn what it takes to overcome self-doubt and become unstoppable in the face of fear
• Develop the courage and clarity to take action and take risks
• Understand how confidence combined with core strengths creates powerful and authentic women
Business Intelligence Engineering - Voices 2015Deanna Kosaraju
Business Intelligence Engineering for Big Data
Ramya Bommareddy
Voices 2015 - www.globaltechwomen.com
March 9th 2015
Session Length: 1 hour
Beyond building and developing information technology applications and teams, we will share our experiences of thriving in a globally distributed organization. As a dynamic duo of Engineering Manager and Engineering lead, we will showcase how we are delivering innovation in Business Intelligence and Data Engineering in an Enterprise Data warehouse setting. We cater to the data and information needs of the business community whose priorities are moving targets. The myth that Agile methodology only works for co-located teams has been busted. Promoting a culture of outside-in-thinking through customer centricity, with focus on quality is the key to success.
3. What is MapReduce?
• Simple data-parallel programming model designed for
scalability and fault-tolerance
• Pioneered by Google
• Processes 20 petabytes of data per day
• Popularized by open-source Hadoop project
• Used at Yahoo!, Facebook, Amazon, …
5. Outline
• Cloud And MapReduce
• MapReduce architecture
• Example applications
• Getting started with Hadoop
• Tuning MapReduce
6. Cloud Computing
• The emergence of cloud computing
has made a tremendous impact on
the Information Technology (IT) industry
• Cloud computing moves work away from personal computers and
the individual enterprise application server to services
provided by a cloud of computers
• Resources such as CPU and storage are provided to users as
general utilities, on demand, over the Internet
• Cloud computing is still in its early stages, with many issues yet to
be addressed.
10. MapReduce History
• Historically, data processing was completely done using
database technologies. Most of the data had a well-defined
structure and was often stored in relational databases
• Data soon reached terabytes and then petabytes
• Google developed a new programming model called
MapReduce to handle large-scale data analysis, and later
introduced the model in the seminal paper
“MapReduce: Simplified Data Processing on Large Clusters”.
13. What is MapReduce used for?
• At Google:
• Index construction for Google Search
• Article clustering for Google News
• Statistical machine translation
• At Yahoo!:
• “Web map” powering Yahoo! Search
• Spam detection for Yahoo! Mail
• At Facebook:
• Data mining
• Ad optimization
• Spam detection
14. MapReduce Framework
• A computing paradigm for processing data that resides on hundreds of
computers
• Popularized recently by Google, Hadoop, and many others
• More of a framework than a tool
• Makes problem solving both easier and harder: developers still have to
think about inter-cluster network utilization and the performance of a
job that will be distributed
• Published by Google without any actual source code
16. Outline
• Cloud And MapReduce
• MapReduce Basics
• Example applications
• Getting started with Hadoop
• Tuning MapReduce
17. Word Count: the "Hello World" of MapReduce
• The word count job accepts an input directory, a mapper
function, and a reducer function as inputs.
• We use the mapper function to process the data in parallel,
and we use the reducer function to collect results of the
mapper and produce the final results.
• Mapper sends its results to reducer using a key-value based
model.
• $ bin/hadoop -cp hadoop-microbook.jar microbook.wordcount.WordCount amazon-meta.txt wordcount-output1
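The mapper, key-value hand-off, and reducer described above can be sketched in plain Java. This is only an illustration of the data flow under simplifying assumptions: it uses no Hadoop classes, runs in a single process, and the class name `WordCountSketch` and the sample input lines are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the word count data flow; no Hadoop classes are used.
final class WordCountSketch {

    // "Map" step: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // "Reduce" step: sum all counts that share the same key.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Stand-in for the framework: map every line, group values by key, then reduce.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("hello world", "hello MapReduce world")));
    }
}
```

In a real job the grouping step is done by the framework's shuffle, and the map and reduce calls run on different machines.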
19. Example: Word Count
• Job: Count the occurrences of each word in a data set
[Figure: map tasks feeding reduce tasks]
20. Outline
• Cloud And MapReduce
• MapReduce Basics
• Example applications
• MapReduce Architecture
• Getting started with Hadoop
• Tuning MapReduce
21. How MapReduce Works
At the highest level, there are four independent entities:
• The client, which submits the MapReduce job.
• The jobtracker, which coordinates the job run. The jobtracker
is a Java application whose main class is JobTracker.
• The tasktrackers, which run the tasks that the job has been
split into.
• The distributed filesystem (normally HDFS), which is used
for sharing job files between the other entities.
23. Developing a MapReduce Application
• The Configuration API
Configuration conf = new Configuration();
conf.addResource("configuration-1.xml");
conf.addResource("configuration-2.xml");
• GenericOptionsParser, Tool, and ToolRunner
• Writing a Unit Test
• Testing the Driver
• Launching a Job
% hadoop jar hadoop-examples.jar v3.MaxTemperatureDriver -conf conf/hadoop-cluster.xml Input/ncdc/all max-temp
• Retrieving the Results
24. This is where the Magic Happens
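import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MaxTemperatureDriver extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    // Build the job from the configuration that ToolRunner parsed (e.g. -conf).
    Job job = new Job(getConf(), "Max temperature");
    job.setJarByClass(getClass());
    // Input and output paths come from the command line.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // The reducer is also used as the combiner.
    job.setMapperClass(MaxTemperatureMapper.class);
    job.setCombinerClass(MaxTemperatureReducer.class);
    job.setReducerClass(MaxTemperatureReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    return job.waitForCompletion(true) ? 0 : 1;
  }
  public static void main(String[] args) throws Exception {
    // ToolRunner strips generic options before calling run().
    int exitCode = ToolRunner.run(new MaxTemperatureDriver(), args);
    System.exit(exitCode);
  }
}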
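The driver registers MaxTemperatureReducer as both the combiner and the reducer. That is safe because taking a maximum gives the same answer whether it is applied once to all values or first to partial groups and then to the partial results. The plain-Java sketch below illustrates this property only; the class name `CombinerSketch` and the temperature readings are made-up values, and no Hadoop classes are involved.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration; the readings below are made-up values, not NCDC data.
final class CombinerSketch {

    // The same logic a max-temperature reducer applies to the values of one key.
    static int max(List<Integer> values) {
        int best = Integer.MIN_VALUE;
        for (int v : values) {
            best = Math.max(best, v);
        }
        return best;
    }

    public static void main(String[] args) {
        // Readings for one year, as seen by two separate map tasks.
        List<Integer> mapTask1 = Arrays.asList(0, 20, 10);
        List<Integer> mapTask2 = Arrays.asList(25, 15);

        // Without a combiner: the reducer receives all five values.
        int withoutCombiner = max(Arrays.asList(0, 20, 10, 25, 15));

        // With a combiner: each map task forwards only its local maximum.
        int withCombiner = max(Arrays.asList(max(mapTask1), max(mapTask2)));

        System.out.println(withoutCombiner + " == " + withCombiner);
    }
}
```

A function such as the mean does not have this property, so a mean reducer could not be reused as a combiner unchanged.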
30. Why Hadoop is able to compete?
Hadoop:
• Scalability (petabytes of data, thousands of machines)
• Flexibility in accepting all data formats (no schema)
• Commodity inexpensive hardware
• Efficient and simple fault-tolerant mechanism
vs.
Database:
• Performance (tons of indexing, tuning, data organization tech.)
• Features: provenance tracking, annotation management, …
31. What is Hadoop
• Hadoop is a software framework for distributed processing of large
datasets across large clusters of computers
• Large datasets: terabytes or petabytes of data
• Large clusters: hundreds or thousands of nodes
• Hadoop is an open-source implementation of Google's MapReduce
• HDFS is a filesystem designed for storing very large files with
streaming data access patterns, running on clusters of commodity
hardware
32. What is Hadoop (Cont’d)
• The Hadoop framework consists of two main layers
• Distributed file system (HDFS)
• Execution engine (MapReduce)
• Hadoop is designed as a master-slave, shared-nothing architecture
33. Design Principles of Hadoop
• Automatic parallelization & distribution
• Computation runs across thousands of nodes, hidden from the end user
• Fault tolerance and automatic recovery
• Nodes/tasks will fail and will recover automatically
• Clean and simple programming abstraction
• Users only provide two functions, “map” and “reduce”
• Need to process big data
• Commodity hardware
• Large number of low-end cheap machines working in parallel to solve a
computing problem
34. Hardware Specs
• Memory
• RAM
• Total tasks
• No RAID required
• No blade servers
• Dedicated Switch
• Dedicated 1GB line
35. Who Uses MapReduce/Hadoop
• Google: Inventors of MapReduce computing paradigm
• Yahoo: Developing Hadoop, the open-source implementation of MapReduce
• IBM, Microsoft, Oracle
• Facebook, Amazon, AOL, Netflix
• Many others + universities and research labs
• Many enterprises are turning to Hadoop
• Especially applications generating big data
• Web applications, social networks, scientific applications
36. Hadoop: How it Works
• Hadoop implements Google’s MapReduce, using HDFS
• MapReduce divides applications into many small blocks of work.
• HDFS creates multiple replicas of data blocks for reliability, placing them
on compute nodes around the cluster.
• MapReduce can then process the data where it is located.
• Hadoop’s target is to run on clusters on the order of 10,000 nodes.
Sathya Sai University, Prashanti Nilayam
38. Hadoop: Assumptions
It is written with large clusters of computers in mind and is built
around the following assumptions:
• Hardware will fail.
• Processing will be run in batches.
• Applications that run on HDFS have large data sets.
• It should provide high aggregate data bandwidth
• Applications need a write-once-read-many access model.
• Moving Computation is Cheaper than Moving Data.
• Portability is important.
40. Hadoop Distributed File System (HDFS)
Centralized namenode
- Maintains metadata info about files
Many datanodes (1000s)
- Store the actual data
- Files are divided into blocks (64 MB)
- Each block is replicated N times (default = 3)
[Figure: File F divided into blocks 1 to 5]
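The block and replication figures above translate into simple arithmetic. The sketch below is a back-of-the-envelope calculation only; the class name `HdfsBlockSketch` and the 1 GB example file are assumptions made for illustration, while the 64 MB block size and replication factor of 3 come from the slide.

```java
// Back-of-the-envelope HDFS arithmetic; the 1 GB file size is an illustrative assumption.
final class HdfsBlockSketch {
    static final long MB = 1024L * 1024L;

    // Number of blocks needed for a file (the last block may be only partly filled).
    static long blockCount(long fileBytes, long blockBytes) {
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    // Raw bytes stored across the cluster once every block is replicated.
    static long rawStorage(long fileBytes, int replication) {
        return fileBytes * replication;
    }

    public static void main(String[] args) {
        long file = 1024 * MB;   // hypothetical 1 GB input file
        long block = 64 * MB;    // block size from the slide
        int replication = 3;     // default replication from the slide

        long blocks = blockCount(file, block);
        System.out.println("blocks: " + blocks);
        System.out.println("block replicas: " + blocks * replication);
        System.out.println("raw MB stored: " + rawStorage(file, replication) / MB);
    }
}
```

Each of those blocks is also a natural unit of work for one map task, which is why block size influences the number of map tasks in a job.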
41. Main Properties of HDFS
• Large: An HDFS instance may consist of thousands of server
machines, each storing part of the file system’s data
• Replication: Each data block is replicated many times
(default is 3)
• Failure: Failure is the norm rather than the exception
• Fault Tolerance: Detection of faults and quick, automatic
recovery from them is a core architectural goal of HDFS
• The namenode constantly checks the datanodes
42. Outline
• Cloud And MapReduce
• MapReduce architecture
• Example applications
• Getting started with Hadoop
• Tuning MapReduce
44. Mapping Workers to Processors
• The input data (on HDFS) is stored on the local disks of the machines
in the cluster. HDFS divides each file into 64 MB blocks, and stores
several copies of each block (typically 3 copies) on different
machines.
• The MapReduce master takes the location information of the input
files into account and attempts to schedule a map task on a machine
that contains a replica of the corresponding input data. Failing that, it
attempts to schedule a map task near a replica of that task's input
data. When running large MapReduce operations on a significant
fraction of the workers in a cluster, most input data is read locally and
consumes no network bandwidth.
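The placement preference described above (a machine holding a replica first, then a machine near a replica, then anything free) can be sketched as a small selection function. This is a simplified illustration, not the scheduler's actual code: the class name `LocalitySketch`, the node and rack names, and the use of "same rack" as the meaning of "near" are all assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified locality-aware placement; node and rack names are hypothetical.
final class LocalitySketch {

    // Pick a free node for a map task whose input block lives on replicaNodes.
    static String pickNode(Set<String> replicaNodes,
                           List<String> freeNodes,
                           Map<String, String> rackOf) {
        // 1. Data-local: a free node that already stores a replica.
        for (String node : freeNodes) {
            if (replicaNodes.contains(node)) {
                return node;
            }
        }
        // 2. Near a replica: here assumed to mean "in the same rack".
        for (String node : freeNodes) {
            String rack = rackOf.get(node);
            for (String replica : replicaNodes) {
                if (rack != null && rack.equals(rackOf.get(replica))) {
                    return node;
                }
            }
        }
        // 3. Fallback: any free node, which reads its input over the network.
        return freeNodes.isEmpty() ? null : freeNodes.get(0);
    }

    public static void main(String[] args) {
        Map<String, String> rackOf =
            Map.of("n1", "r1", "n2", "r1", "n3", "r2", "n4", "r2");
        Set<String> replicas = Set.of("n1");

        System.out.println(pickNode(replicas, List.of("n4", "n1"), rackOf)); // data-local
        System.out.println(pickNode(replicas, List.of("n4", "n2"), rackOf)); // same rack
        System.out.println(pickNode(replicas, List.of("n4", "n3"), rackOf)); // remote
    }
}
```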
45. Task Granularity
• The map phase has M pieces and the reduce phase has R pieces.
• M and R should be much larger than the number of worker
machines.
• Having each worker perform many different tasks improves dynamic
load balancing, and also speeds up recovery when a worker fails.
• The larger M and R are, the more scheduling decisions the master must make
• R is often constrained by users because the output of each reduce task
ends up in a separate output file.
• Typically (at Google), M = 200,000 and R = 5,000, using 2,000
worker machines.
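The figures quoted above give a feel for why fine-grained tasks help. The sketch below does the arithmetic under a deliberately simple assumption, namely that tasks are spread evenly over workers and that a failed worker's tasks are spread evenly over the survivors. The class and method names are hypothetical.

```java
// Arithmetic on the figures quoted above (M = 200,000, R = 5,000, 2,000 workers).
// Even spreading of tasks is a simplifying assumption.
final class GranularitySketch {

    // Average number of tasks each worker handles over the whole job.
    static double tasksPerWorker(int tasks, int workers) {
        return (double) tasks / workers;
    }

    // If one worker fails, its tasks are spread over the survivors; this is the
    // largest number of extra tasks any single survivor has to pick up.
    static int extraTasksPerSurvivor(int tasks, int workers) {
        int lost = (int) Math.ceil(tasksPerWorker(tasks, workers));
        int survivors = workers - 1;
        return (lost + survivors - 1) / survivors;
    }

    public static void main(String[] args) {
        int m = 200_000;
        int r = 5_000;
        int workers = 2_000;

        System.out.println("map tasks per worker: " + tasksPerWorker(m, workers));
        System.out.println("reduce tasks per worker: " + tasksPerWorker(r, workers));
        System.out.println("extra map tasks per survivor after one failure: "
            + extraTasksPerSurvivor(m, workers));
    }
}
```

With about 100 small map tasks per worker, the work of one failed machine can be re-executed in parallel by many others instead of being redone by a single machine.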
46. Speculative Execution – One approach
• Tasks may be slow for various reasons, including hardware
degradation or software misconfiguration, but the causes
may be hard to detect since the tasks still complete
successfully, albeit after a longer time than expected.
• Hadoop doesn’t try to diagnose and fix slow-running tasks;
instead, it tries to detect when a task is running slower than
expected and launches another, equivalent task as a backup.
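One way to picture "running slower than expected" is to compare each task's progress with the average progress of its peers. The sketch below is a toy heuristic written for illustration, not Hadoop's actual scheduling logic; the class name `SpeculationSketch`, the task names, and the 0.2 slack value are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy straggler detection; the slack threshold and task names are hypothetical,
// and this is not Hadoop's actual scheduling logic.
final class SpeculationSketch {

    // Return the tasks whose progress (0.0 to 1.0) is more than `slack` below
    // the average, i.e. candidates for a backup (speculative) copy.
    static List<String> backupCandidates(Map<String, Double> progress, double slack) {
        double sum = 0;
        for (double p : progress.values()) {
            sum += p;
        }
        double average = progress.isEmpty() ? 0 : sum / progress.size();

        List<String> slow = new ArrayList<>();
        for (Map.Entry<String, Double> e : progress.entrySet()) {
            if (e.getValue() < average - slack) {
                slow.add(e.getKey());
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        Map<String, Double> progress = new TreeMap<>();
        progress.put("task-1", 0.90);
        progress.put("task-2", 0.85);
        progress.put("task-3", 0.30); // lagging well behind its peers

        System.out.println(backupCandidates(progress, 0.2));
    }
}
```

Whichever copy of a task finishes first supplies the result, so the backup costs extra resources but not correctness.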
47. Problem Statement
The problem at hand is defining a resource provisioning
framework for MapReduce jobs running in a cloud keeping in
mind performance goals such as
Resource utilization with
-optimal number of map and reduce slots
-improvements in execution time
-Highly scalable solution
48. References
[1] E. Bortnikov, A. Frank, E. Hillel, and S. Rao, “Predicting execution bottlenecks in map-
reduce clusters” In Proc. of the 4th USENIX conference on Hot Topics in Cloud computing,
2012.
[2] R. Buyya, S. K. Garg, and R. N. Calheiros, “SLA-Oriented Resource Provisioning for Cloud
Computing: Challenges, Architecture, and Solutions” In International Conference on Cloud and
Service Computing, 2011.
[3] S. Chaisiri, Bu-Sung Lee, and D. Niyato, “Optimization of Resource Provisioning Cost in
Cloud Computing” in Transactions On Service Computing, Vol. 5, No. 2, IEEE, April-June 2012
[4] L Cherkasova and R.H. Campbell, “Resource Provisioning Framework for MapReduce Jobs
with Performance Goals”, in Middleware 2011, LNCS 7049, pp. 165–186, 2011
[5] J. Dean, and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters”,
Communications of the ACM, Jan 2008
[6] Y. Hu, J. Wong, G. Iszlai, and M. Litoiu, “Resource Provisioning for Cloud Computing” In
Proc. of the 2009 Conference of the Center for Advanced Studies on Collaborative Research,
2009.
[7] K. Kambatla, A. Pathak, and H. Pucha, “Towards optimizing hadoop provisioning in the
cloud” in Proc. of the First Workshop on Hot Topics in Cloud Computing, 2009
[8] Kuyoro S. O., Ibikunle F. and Awodele O., “Cloud Computing Security Issues and
Challenges” in International Journal of Computer Networks (IJCN), Vol. 3, Issue 5, 2011
Editor's Notes
When you run the MapReduce job, Hadoop first reads the input files from the input directory
line by line. Then Hadoop invokes the mapper once for each line passing the line as the
argument. Subsequently, each mapper parses the line, and extracts words included in the
line it received as the input. After processing, the mapper sends the word count to the reducer
by emitting the word and word count as name-value pairs.
Writing a program in MapReduce has a certain flow to it. You start by writing your
map and reduce functions, ideally with unit tests to make sure they do what you expect.
Then you write a driver program to run a job, which can run from your IDE using a
small subset of the data to check that it is working. If it fails, then you can use your
IDE’s debugger to find the source of the problem. With this information, you can
expand your unit tests to cover this case and improve your mapper or reducer as appropriate
to handle such input correctly.
When the program runs as expected against the small dataset, you are ready to unleash
it on a cluster. Running against the full dataset is likely to expose some more issues,
which you can fix as before, by expanding your tests and mapper or reducer to handle
the new cases. Debugging failing programs in the cluster is a challenge, so we look at
some common techniques to make it easier.
We solve problems involving large datasets using many computers, so that we can process
the dataset in parallel across those computers. However, writing a program that processes a
dataset in a distributed setup is a heavy undertaking. The challenges of such a program are
shown as follows:
Although it is possible to write such a program, it is a waste to write such programs again
and again. MapReduce-based frameworks like Hadoop let users write only the