Wayne Schroeder is an expert in distributed data management and iRODS with over 38 years of experience in software engineering, data management, and scientific computing. He designed and implemented major components of iRODS and provided support to the international iRODS user community for over 12 years. He has extensive experience managing projects and leading teams, and currently owns a consulting business providing data management consulting services and products.
Wayne Schroeder
E-mail: w.schroede@gmail.com
Phone: (858) 484-3427
Profile
Expertise in iRODS/SRB, distributed data management/digital repositories, computer security, DBMS technology, software engineering,
optimization, authentication systems, technical communication (written and verbal), and remote distributed technical support. Designed,
implemented, and supported major components of the iRODS and SRB systems, managed iRODS development and support, and was
Principal Investigator on a series of awards and sub-contracts.
Experience
Over 38 years of experience in software engineering, high-performance computing, data management, scientific applications, computer
security, networking, and systems support and administration.
Over 10 years of experience as a team lead and manager, including recent role as Principal Investigator and proposal developer for multiple
awards for the DICE-UCSD team, 2010-2014.
A leading expert in iRODS, the integrated Rule-Oriented Data System.
Over 12 years of experience in providing support to the international iRODS and SRB user community primarily via email and
documentation (wiki.irods.org), some local replication/debug of issues, and occasional remote debugging sessions.
Extensive experience with a wide variety of programming languages including C, C++, Java, Perl, Python, an extended Fortran, and
various assembly languages.
Gave many presentations and wrote or co-authored many papers over the past 25+ years.
Experience working as part of, and co-leading, a physically distributed team: DICE-UCSD and DICE at UNC, 2008-2014.
Appointments
2014 Business Owner. Launched Integrated Data Management Solutions
LLC providing consulting services and developing products.
2002 Research Programmer/Analyst (PA IV), Data Grid Technologies Group
at the San Diego Supercomputer Center which then became Data
Intensive Cyber Environments team (DICE at UNC and DICE-UCSD at
UCSD), Institute for Neural Computing, University of California
San Diego
2000 Senior Software Engineer, Entropia, San Diego, CA
1997 Research Programmer/Analyst, Data Intensive Computing/Security
Systems, San Diego Supercomputer Center, General Atomics/UC San Diego
1994 Systems Software Special Projects, San Diego Supercomputer
Center, General Atomics/UC San Diego
1988 Principal Scientist, Manager, Production Systems/Central Systems
Software, San Diego Supercomputer Center, General Atomics/UC San Diego
1985 Staff Systems Programmer, San Diego Supercomputer Center,
General Atomics/UC San Diego
1981 Systems Programmer, National Magnetic Fusion Energy Computer Center,
Lawrence Livermore National Laboratory
1980 Applications Project leader, Applications Programming Division II,
Lawrence Livermore National Laboratory
1979 Applications systems programmer, Applications Programming Division II,
Lawrence Livermore National Laboratory
1976 Programmer, Sperry*Univac, Saint Paul, Minnesota
Education
University of Nebraska at Omaha B.S. Computer Science (magna cum laude) 1976. Minor: Psychology.
Career Highlights
Designed and implemented major components of the Integrated Rule-Oriented Data-management System (iRODS), a network, database,
and rule based data-management/data-grid system used in many large-scale research projects world-wide. Under Dr. Reagan Moore and in
collaboration with Michael Wan (lead architect), Dr. Arcot Rajasekar, and others, quickly and efficiently developed the iRODS system to
become robust and useful enough to attract follow-on funding from NARA and NSF to help sustain the team and continue iRODS
development for many years. Specialist in iRODS Database subsystem (queries, SQL, ODBC, iRODS-RDBMS API), iRODS
installation/testing/QA, computer security/authentication, software engineering in C/Java/scripting languages, Unix/Linux, user support, and
documentation. DICE iRODS Product Manager and release coordinator 2006-2014. See the iRODS web site (wiki) http://wiki.irods.org
and http://wiki.irods.org/index.php/Science_and_Engineerng_Domains .
As manager and senior research engineer at SDSC (1986-2000), developed a series of innovative HPC, network, and security systems and
presented those results at many international conferences in Europe, Asia, Australia and the U.S., including ten presentations at eight Cray
User Group meetings.
Successfully developed critical networking software for the initial phase for the San Diego Supercomputer Center (1985). This was SDSC's
largest technical challenge in creating this large NSF-funded national High Performance Computing (HPC) center.
Developed and ported key software components at Entropia, a mid-sized start-up (with about 30 staff members at the peak), over a two year
period; was retained through all three rounds of layoffs (two affecting engineers) and only resigned when it was clear that Entropia would
fail. We created an innovative, well-performing software system, but the company failed due to some inaccurate initial business
assumptions.
As a software engineer at the National Magnetic Fusion Energy Computer Center (NMFECC) at the Lawrence Livermore National
Laboratory (LLNL) (1981-1985), maintained networking software and developed one of the first email systems (1983/1984), running on the
Cray Supercomputers at NMFECC. Extended this into an early bulletin board system.
As software engineer and Project Leader for an important large-scale HPC scientific modelling application used in nuclear weapons design
at LLNL, led the porting of the code from the CDC 7600 to the Cray-1 (1980).
Successfully developed and maintained software for the simulation system used in ship-board command and control testing and training at
Sperry-Univac (1976-1979).
Developed IEEE/Cray data conversion routines (convieee) that were selected from multiple contenders for use in Gigabit testbeds at the
Third Gigabit Testbed Workshop (1992); these routines were rated as very high performance, well structured, and well documented.
A Sampling of Presentations
iRODS User Group meeting in Berlin, Germany, February 28 and March 1, 2013, three presentations: iRODS Version 3.2 Features and
Bug Fixes (Wayne Schroeder, UCSD), New PAM/LDAP Authentication (Wayne Schroeder, UCSD), and Emerging businesses (Reagan
Moore, UNC-CH, Wayne Schroeder, Archive Analytics). See https://wiki.irods.org/index.php/iRODS_User_Group_Meeting_2013 .
Schroeder, Wayne, "iRODS, the Integrated Rule Oriented Data-management System", UCSD BigData Inaugural Workshop, November
25, 2013.
One Mind Data Sharing Platform Review & Discussion, "iRODS, Integrated Rule-Oriented Data-management System", January 10, 2012,
UCSD.
CineGrid International Workshop 2011, "iRODS Update", Wayne Schroeder, December 5, 2011, UCSD.
Invited Talk, PGCon2008 (PostgreSQL), "iRODS - A Large-Scale Rule-Oriented Data Management System", Wayne Schroeder, May
23, 2008, Ottawa, Canada
Brief talk on SRB/iRODS, first annual NMI Build & Test Facility workshop, April 29, 2008, Madison, WI
APAC07 iRODS Tutorial, Reagan Moore, Wayne Schroeder (co-authors Arcot Rajasekar and Mike Wan), Perth, Australia, October 11,
2007
ISGC 2006, "An Intelligent Rule-Oriented Data Management System", Wayne Schroeder, May 3, 2006, Taipei, Taiwan
"SDSC SRB Core Technology", Michael Wan, Wayne Schroeder, Aug 23, 2005
"Overview of the SDSC Storage Resource Broker", Wayne Schroeder, Spring 2004 HEPix meeting, and as part of a tutorial and a public
lecture at the eScience Institute, Edinburgh, UK, May, 2004.
A Sampling of Publications
"NITRD iRODS Demonstration", Reagan W. Moore, Richard Marciano, Arcot Rajasekar, Antoine de Torcy, Chien-Yi Hou, Leesa
Brieger, Jon Crabtree, Jewel Ward, Mason Chua, UNC Chapel Hill; Wayne Schroeder, Michael Wan, Sheau-Yen Chen, UCSD,
sponsored by NARA at NSF, 2009.
Rajasekar, A., M. Wan, R. Moore, W. Schroeder, "A Prototype Rule-based Distributed Data Management System", HPDC workshop on
"Next Generation Distributed Data Management", May 2006, Paris, France.
Schroeder, Wayne, "The SDSC Encryption/Authentication (SEA) System", Concurrency: Practice and Experience - Aspects of Seamless
Computing, John Wiley & Sons Ltd., Volume 11 Number 15, December 25, 1999.
Moore, R., C. Baru, A. Rajasekar, B. Ludascher, R. Marciano, M. Wan, W. Schroeder, and A. Gupta, "Collection-Based Persistent Digital
Archives - Part 1", D-Lib Magazine, March 2000, http://www.dlib.org/ (Part 2 appeared in April 2000).
Rajasekar, A., M. Wan, Reagan Moore, W. Schroeder, G. Kremenek, A. Jagatheesan, C. Cowart, B. Zhu, S.-Y. Chen, R. Olschanowsky,
"Storage Resource Broker - Managing Distributed Data in a Grid," Computer Society of India Journal, special issue on SAN, 2003.
Data Grid Federation, Rajasekar, A., M. Wan, R. Moore, W. Schroeder, PDPTA, Las Vegas NV, June 2004 - Special Session on New
Trends in Distributed Data Access
Data Grid Management Systems, Moore, R.W., Jagatheesan, A., Rajasekar, A., Wan, M. and Schroeder, W., Proceedings of the 21st
IEEE/NASA Conference on Mass Storage Systems and Technologies (MSST), April 13-16, 2004, College Park, Maryland, USA.
"Analysis of HPSS Performance Based on Per-File Transfer Logs", W. Schroeder, R. Marciano, J. Lopez, M. Gleicher, G. Kremenek, C.
Baru, R. Moore, Procs. Seventh NASA Goddard Conference on Mass Storage Systems & Technologies, and Sixteenth IEEE Mass Storage
Systems Symposium, March 15-18, 1999, San Diego, CA.
Schroeder, Wayne, "Kerberos/DCE, the Secure Shell, and Practical Internet Security," Proceedings, Thirty-eighth Semiannual Cray User
Group Meeting, Charlotte, North Carolina (October 1996).
Baru, C.K., Moore, R.W., Rajasekar, A., Schroeder, W., Wan, M., "A Data Handling Architecture for a Prototype Federal Application,"
Proceedings of the IEEE Conference on Mass Storage Systems, College Park, MD., March 23-27, 1998.
Moore, Reagan, Joseph Lopez, Charles Lofton, Wayne Schroeder, George Kremenek, Michael K. Gleicher, "Configuring and Tuning
Archival Storage Systems," Sixteenth IEEE Mass Storage Systems Symposium held jointly with the Seventh NASA Goddard Conference
on Mass Storage Systems & Technologies, March 9-11, 1999.
Additional publications with SRB/iRODS team, 1997-2000, 2002-2014. See https://wiki.irods.org/index.php/Publications.
Additional Cray User Group publications/presentations (about ten total), 1987-1996, in Europe, Asia and the U.S. A sampling of these
include: "SDSC UNICOS Queued File Transport and Distributed Computing Tools", "SDSC Enhancements to NSL UniTree" and
"Kerberos/DCE, the Secure Shell, and Practical Internet Security".
Additional information
Also see https://wiki.irods.org/index.php/Wayne_Schroeder .
For an older resume with more detail on the early career: http://users.sdsc.edu/~schroede/resume2000.txt .
A brief profile is available at: https://www.linkedin.com/pub/wayne-schroeder/8/900/a5a
This document was last updated in December 2014.