- The document discusses the Rensselaer Institute for Data Exploration and Applications (IDEA) and its work in applying data science across various domains like healthcare, business, and the sciences.
- It outlines graduate projects in IDEA that involve collaborations with other Rensselaer research centers and applying data exploration tools.
- It also discusses changes made to Rensselaer's computer science and information technology curriculum to incorporate more training in data analytics, data science challenges, and working with large, unstructured datasets. This includes new concentrations in data science and information dominance.
Big Data and Computer Science Education
1. Big Data Meets Computer Science
Jim Hendler
Tetherless World Professor of Computer, Web and Data Sciences
Director, Rensselaer Institute for Data Exploration and Applications
@jahendler
3. The Rensselaer IDEA
… Across Applications (corresponding to Challenges Identified in the Rensselaer Plan 2024)
• Healthcare Analytics
• Business Systems
• Built and Natural Environments
• Virtual and Augmented Reality
• Cyber-Resiliency
• Policy, Ethics and Open Government
• Materials Informatics
• Data-driven Physical/Life Sciences
4. The Rensselaer IDEA
Developing a Comprehensive “Data Science” Research Agenda
P. Fox and J. Hendler, The Science of Data Science, Big Data, 2(2), in press
5. The Rensselaer IDEA
Graduate Projects in IDEA
• IDEA and CCI (HPC): technologies to enable Rensselaer researchers to work with data at larger scales and in new ways
– Population-scale cognitive computing models for “human intensive” agent-based simulations
• IDEA and EMPAC (Performing arts center): provide next generation data exploration tools
– Multi-person data visualization tools for big-data applications
• IDEA and Watson: New direction in Cognitive Computation
– How do we go from Question/Answering to Open Web Data exploration?
• IDEA and CBIS (Ctr for Biotechnology & Interdisciplinary Studies): Data-driven Informatics
– Can we couple semantics and big data to find new medical uses for already approved drugs?
6. The Rensselaer IDEA
External Projects and partnerships
Emergency Room Care
Language and Agents
Large-scale Healthcare Analytics
In Discussion Jumpstart (Proposal underway)
Built and Natural Biome data-driven science and engineering
Cognitive Computing Collaborative Research Initiative
7. Campus Data Infrastructure
Metadata
• Title
• Author
• Author Email
• Licence
• Subject
• Keyword
• Data Type
Dataset
CDF
RPI Object Deposit: RPI Research Object Registration and Deposit
• RPI-ID Request
• Allocate a universally accessible RPI-ID
• Register Metadata
• Upload Any Data
RPI Research Network: RPI Research Collaboration and Community Network
• RPI-ID Request
• Share Knowledge
• Join Network
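The deposit workflow on this slide (request an RPI-ID, register metadata, upload data) can be illustrated with a minimal Python sketch. This is not the campus system's actual implementation: the class names `DatasetMetadata` and `ResearchObjectRegistry`, the `RPI-` identifier format, and the in-memory storage are all assumptions made for illustration. Only the metadata field names come from the slide.

```python
import uuid
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class DatasetMetadata:
    """Metadata fields listed on the slide; the types are assumed."""
    title: str
    author: str
    author_email: str
    licence: str
    subject: str
    keywords: List[str] = field(default_factory=list)
    data_type: str = "unspecified"


class ResearchObjectRegistry:
    """Hypothetical in-memory stand-in for a research object deposit service."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}

    def request_id(self) -> str:
        # Assumed identifier format; the real RPI-ID scheme is not described.
        return "RPI-" + uuid.uuid4().hex[:12]

    def deposit(self, metadata: DatasetMetadata, payload: bytes) -> str:
        # Step 1: allocate an identifier. Step 2: register the metadata.
        # Step 3: store the uploaded data alongside it.
        rpi_id = self.request_id()
        self._records[rpi_id] = {"metadata": asdict(metadata), "data": payload}
        return rpi_id

    def lookup(self, rpi_id: str) -> dict:
        # Return only the metadata, which is what discovery tools would search.
        return self._records[rpi_id]["metadata"]


registry = ResearchObjectRegistry()
record_id = registry.deposit(
    DatasetMetadata(
        title="Example sensor readings",
        author="A. Researcher",
        author_email="researcher@example.edu",
        licence="CC-BY-4.0",
        subject="Environmental monitoring",
        keywords=["sensors", "example"],
        data_type="CSV",
    ),
    b"timestamp,value\n0,1.5\n",
)
```

Keeping metadata separate from the payload reflects the slide's split between registering metadata and uploading "any data": a dataset can be found by its description regardless of its format.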
8. Requires going Beyond the Database
Discovery
Integrate
Visualize
Explain
Thinking outside the Database box
Strata talk, 2013 - https://www.youtube.com/watch?v=Cob5oltMGMc
9. At new scales (and in new ways)
Fox and Hendler, Changing the Equation on Scientific Visualization, Science, 2/11 - http://www.sciencemag.org/content/331/6018/705.short
10. A Whole New World
• But what about undergraduate education
– where do we train the students who can take on projects needing
• statistics and analytics
• informatics
• data science challenges
• machine learning
• unstructured data
• cognitive computation
• …
11. Computer Science Education?
• Programming is a necessary skill
– not sufficient
• and we mostly teach it wrong…
– (For my heresies about teaching programming, see “Let’s Help Computer Science Students Crack the Code,” 3/13, http://chronicle.com/article/Lets-Help-Computer-Science/137649/)
• The computing environment of today is nothing like the computing environment of the 70s,
– but the curriculum hasn’t changed much since I was in school
– but the fundamentals are NOT all the same
– data-oriented computations involve graphs, memory intensive algorithms, machine learning, …
12. Deploying these ideas at
RPI
• Innovation in the interdisciplinary Information
Technology Program
– Renamed Information Technology and Web
Science, 2011
• for more on Web Science, see
– Berners-Lee et al., Creating a Science of the World Wide Web,
Science, 2006,
https://www.sciencemag.org/content/313/5788/769.summary;
– Hendler et al., Web Science: An Interdisciplinary Approach to
Understanding the Web, CACM, 7/2008,
http://cacm.acm.org/magazines/2008/7/5366-web-science/fulltext
13. IT and Web Science
• First IT academic program in U.S.
• First web science degree program in
U.S.; First undergraduate web science
degree anywhere
• BS in ITWS (20 concentrations) and MS
in IT (10 concentrations)
• PhD in Multi-Disciplinary Sciences
• http://itws.rpi.edu
– I was Director 2008-2012
– Now directed by Peter Fox (whose slides I stole
for this section)
14.
Technical Track: Computer Engineering Track
Courses:
1) ECSE-2610 Computer Components and Operations
2) ENGR-2350 Embedded Control
3) ECSE-2660 Computer Architecture, Networking and Operating Systems
Concentrations: Civil Engineering; Computer Hardware; Computer Networking (hardware focus); Mechanical/Aeronautical Eng.
Technical Track: Computer Science Track
Courses:
1) CSCI-2200 Foundations of Computer Science
2) CSCI-2300 Introduction to Algorithms
3) CSCI-2500 Computer Organization
Concentrations: Cognitive Science; Computer Networking (software focus); Information Security; Machine and Computational Learning
Technical Track: Information Systems Track
Courses:
1) CSCI-2200 Foundations of Computer Science
2) CSCI-2500 Computer Organization
3) Four credits from the following:
• CSCI-2220 Programming in Java (2 credits)
• CSCI-2961 Programming in Python (2 credits)
• CSCI-2300 Introduction to Algorithms (4 credits)
• ITWS-49XX Web Systems Development II (4 credits)
Concentrations: Arts; Communication; Economics; Entrepreneurship; Finance; Management Information Systems; Medicine; Pre-law; Psychology; STS
Technical Track: Web Science Track
Courses:
1) CSCI-2200 Foundations of Computer Science
2) CSCI-2500 Computer Organization
3) One of the following:
• CSCI-49XX Web Systems Development II
• Web/Data Course approved by ITWS Curriculum Committee
Concentrations: Data Science; Science Informatics; Web Technologies
15. CHANGES TO THE MASTER’S IN INFORMATION TECHNOLOGY PROGRAM
• In Spring 2013 the MS in IT core curriculum was revised
to include Data Analytics.
• Networking core classes were replaced with Data
Analytics core classes: Data Science, Database Mining,
X-informatics, and Data Analytics (a new class offered in
Spring 2014).
• The MS in IT program also added two new
concentrations: Data Science and Analytics and
Information Dominance.
• The Information Dominance concentration was
developed for a new Navy program that will train a select
group of 5-10 naval officers a year in the skills needed
for military cyberspace operations. Two officers
started in Fall 2013 and three began in Spring 2014.
16. MS in IT Required Core Courses
IT Core Area | Course Number | Course Title | Term(s) Offered
Database Systems | CSCI-4380 | Database Systems | Fall/Spring
Data Analytics | ITWS-6350 | Data Science | Fall
Software Design and Engineering | CSCI-4440 | Software Design and Documentation | Fall
Software Design and Engineering | ITWS-6400 | X-Informatics | Spring
Management of Technology* | ITWS-6300 | Business Issues for Engineers and Scientists (Professional Track Only) | Fall/Spring
Human Computer Interaction | COMM-6420 | Foundations of HCI Usability | Fall
Human Computer Interaction | COMM-696X | Human Media Interaction | Spring
* For the research track, replace ITWS-6300 Business Issues for Engineers and Scientists with one of the two semester courses ITWS-6980 Master’s Project or ITWS-6990 Master’s Thesis.
Advanced Core options for students who have previously completed a Core Course
IT Core Area | Course Number | Course Title | Term(s) Offered
Database Systems | CSCI-6390 | Database Mining | Fall
Database Systems | ITWS-6350 | Data Science | Fall
Database Systems | ITWS-696X | Semantic E-Science | Fall
Data Analytics | CSCI-6390 | Database Mining | Fall
Data Analytics | ITWS-6400 | X-Informatics | Spring
Data Analytics | ITWS-696X | Data Analytics | Spring
Software Design | CSCI-6500 | Distributed Computing Over the Internet | Fall
Software Design | ECSE-6780 | Software Engineering II | Fall
Software Design | ITWS-696X | Semantic E-Science | Fall
Management of Technology | MGMT-6080 | Networks, Innovation and Value Creation | Fall
Management of Technology | MGMT-6140 | Information Systems for Management | Spring
Human Computer Interaction | COMM-6620 | Information Architecture | Spring
Human Computer Interaction | COMM-6770 | User-Centered Design | Fall
Human Computer Interaction | COMM-696X | Interactive Media Design | Summer
17.
Concentration: Data Science and Analytics
Data and information analytics extends analysis (descriptive and predictive models to obtain knowledge from data) by using insight from analyses to recommend action or to guide and communicate decision-making. Thus, analytics is not so much concerned with individual analyses or analysis steps, but with an entire methodology. Key topics include: advanced statistical computing theory, multivariate analysis, and application of computer science courses such as data mining and machine learning, and change detection by uncovering unexpected patterns in data.
Select two or three of the following courses:
Course Number | Course Name | Term(s) Offered
ITWS-6350 | Data Science | Fall
ITWS-6400 | X-Informatics | Spring
ITWS-696X | Data Analytics | Spring
ITWS-696X | Semantic E-Science | Fall
ITWS-696X | Advanced Semantic Technologies* | Spring
If only two of the above were chosen, select one more of the following courses:
Course Number | Course Name | Term(s) Offered
COMM-6620 | Information Architecture | Spring
CSCI-4020 | Computer Algorithms | Spring
CSCI-4150 | Introduction to AI | Fall
CSCI-6390 | Database Mining | Fall
CSCI-4220 or CSCI-6220 | Network Programming or Parallel Algorithm Design | Spring
ISYE-4220 | Optimization Algorithms and Applications | Fall
ISYE-6180 | Knowledge Discovery with Data Mining | Spring
MGMT-696X | Technology Foundations for Business Analytics | Fall
MGMT-696X | Predictive Analytics Using Social Media | Spring
Concentration: Information Dominance
The Information Dominance concentration prepares students for careers designing, building, and managing secure information systems and networks. The concentration includes advanced study in encryption and network security, formal models and policies for access control in databases and application systems, secure coding techniques, and other related information assurance topics. The combination of coursework provides comprehensive coverage of issues and solutions for utilizing high assurance systems for tactical decision-making. It prepares students for careers ranging from secure information systems analyst, to information security engineer, to field information manager and chief information officer. It is also appropriate for all IT professionals who want to enhance their knowledge of how to use pervasive information in situational awareness, operations scenarios, and decision-making.
Select two or three of the following courses:
Course Number | Course Name | Term(s) Offered
ISYE-6180 | Knowledge Discovery with Data Mining | Spring
CSCI-6960 | Cryptography and Network Security I | Fall
ITWS-4370 | Information System Security | Spring
CSCI-4650 | Networking Laboratory I | Fall/Spring
MGMT-7760 | Risk Management | Fall
ISYE-4310 | Ethics of Modeling for Industrial Systems Engineering | Fall
If only two of the above were chosen, select one more of the following courses:
Course Number | Course Name | Term(s) Offered
CSCI-6390 | Database Mining | Fall
CSCI-6968 | Cryptography and Network Security II | Spring
CSCI-4660 | Networking Laboratory II | Fall/Spring
ECSE-6860 | Evaluation Methods for Decision Making | Fall
ISYE-6500 | Information and Decision Technologies for Industrial and Service Systems | Fall/Spring
CSCI-496X | Computational Analysis of Social Processes | Fall
Two New MS in IT Concentrations
18. Also at RPI
• Data Science Research Center and Data Science
Education Center (dsrc.rpi.edu, 2009)
• http://www.rpi.edu/about/inside/issue/v4n17/datacenter.html
– Over 45: research faculty, post-docs, grad students, staff,
undergraduates…
• Data is one of the Rensselaer Plan’s five thrusts
• Other key faculty
– Fran Berman (Center for Digital Society and RDA)
– Bulent Yener (DSRC Director)
– Peter Fox (ITWS Director)
19. More RPI Curricula
• Environmental Science with Geoinformatics
concentration
• Bio, geo, chem, astro, materials - informatics
• GIS for Science
• Visualization (new summer program)
• Multi-disciplinary science program - PhD in
Data and Web Science
• DATUM: Data in Undergraduate Math! (Bennett)
• Missing – intermediate statistics
• Graphs – significant potential here – must teach!
20. 5-6 years in…
• Science and interdisciplinary from the start!
– Not a question of: do we train scientists to be
technical/data people, or do we train technical
people to learn the science
– It’s a skill/course-level approach that is needed
• We teach methodology and principles over
technology
• Data science must be a skill, as natural as
using instruments or writing/using code
• Team/collaboration aspects are key
• Foundations and theory must be taught
– for data, as well as programming