B.Sc (Data Science)
CURRICULUM AND SYLLABUS
SEMESTER 1
S.No Course Code Course Title Contact Hours L T P C M
THEORY
1. UDASC01 Communicative English 2 2 0 0 2 100
2. UDASC02 Linear Algebra & Calculus 4 3 1 0 4 100
3. UDASC03 Computer Architecture 3 3 0 0 3 100
4. UDASC04 Problem Solving and Programming using C 3 3 0 0 3 100
5. UDASC05 Digital System Design 4 3 1 0 4 100
6. UDASC06 Ethics and Human Values 2 2 0 0 2 100
PRACTICAL
7. UDASC1PA Problem Solving using C Laboratory 4 0 0 4 2 100
8. UDASC1PB Communicative Skills and Language Laboratory 2 0 0 2 1 100
TOTAL 24 16 2 6 21
SEMESTER 2
S.No Course Code Course Title Contact Hours L T P C M
THEORY
1. UDASC21 Principles of Data Science 3 3 0 0 3 100
2. UDASC22 Fundamentals of Statistics 4 3 1 0 4 100
3. UDASC23 Operating Systems 3 3 0 0 3 100
4. UDASC24 Database Management System 3 3 0 0 3 100
5. UDASC25 Computer Networks 3 3 0 0 3 100
PRACTICAL
6. UDASC2PA Statistics and Data Science Lab 4 0 0 4 2 100
7. UDASC2PB Database Management System (DBMS) Laboratory 4 0 0 4 2 100
8. UDASC2PC Operating Systems and Networks Laboratory 4 0 0 4 2 100
TOTAL 28 15 1 12 22
SEMESTER 3
S.No Course Code Course Title Contact Hours L T P C M
THEORY
1. UDASC31 Probability Theory 3 3 0 0 3 100
2. UDASC32 Cloud Computing 3 3 0 0 3 100
3. UDASC33 Advanced Database Technologies 3 3 0 0 3 100
4. UDASC34 Web Programming 3 3 0 0 3 100
5. UDASC35 Data Mining 3 3 0 0 3 100
6. UDASC36 Operations Research 3 3 0 0 3 100
PRACTICAL
7. UDASC3PA Data Mining Lab 4 0 0 4 2 100
8. UDASC3PB Cloud Computing and Web Programming Lab 4 0 0 4 2 100
TOTAL 26 18 0 8 22
SEMESTER 4
S.No Course Code Course Title Contact Hours L T P C M
THEORY
1. UDASC41 Data Handling and Visualization 3 3 0 0 3 100
2. UDASC42 Machine Learning 3 3 0 0 3 100
3. PEC1 3 3 0 0 3 100
4. UDASC44 Optimization Techniques 3 3 0 0 3 100
5. UDASC45 Big Data Analytics 3 3 0 0 3 100
PRACTICAL
6. UDASC4PA Machine Learning Lab 4 0 0 4 2 100
7. UDASC4PB Big Data Analytics Lab 4 0 0 4 2 100
8. UDASC4PC Data Handling and Visualization Lab 4 0 0 4 2 100
TOTAL 27 15 0 12 21
SEMESTER 5
S.No Course Code Course Title Contact Hours L T P C M
THEORY
1. UDASC51 Deep Learning 3 3 0 0 3 100
2. UDASC52 Natural Language Processing 3 3 0 0 3 100
3. PEC 2 3 3 0 0 3 100
4. PEC 3 3 3 0 0 3 100
PRACTICAL
5. UDASC5PA Deep Learning Lab 4 0 0 4 2 100
6. UDASC5PB Natural Language Processing Lab 4 0 0 4 2 100
7. UDASC5PC Phase I Project 6 0 0 6 6 100
TOTAL 26 12 0 14 22
SEMESTER 6
S.No Course Code Course Title Contact Hours L T P C M
THEORY/PRACTICAL
1. UDASC61 Stream Processing Analytics 3 3 0 0 3 100
2. UDASC62 PEC 4 3 3 0 0 3 100
3. UDASC63 PEC 5 3 3 0 0 3 100
4. UDASC6P Phase II project 12 0 0 12 12 100
TOTAL 21 9 0 12 21
LIST OF ELECTIVE COURSES
Sl. No. Course Code Course Title Contact Hours L T P C M
PROGRAM ELECTIVE COURSE-1
1. UDASC46 Cloud Services for IOT 3 3 0 0 3 100
2 UDASC47 Business Analytics 3 3 0 0 3 100
3 UDASC48 Business Intelligence 3 3 0 0 3 100
4 UDASC49 Intelligent Database System 3 3 0 0 3 100
5 UDASC50 Digital Marketing Analytics 3 3 0 0 3 100
6 UDASC55 Internet of Things 3 3 0 0 3 100
PROGRAM ELECTIVE COURSE-2 & 3
1 UDASC53 Augmented Reality & Virtual Reality 3 3 0 0 3 100
2 UDASC54 Linux Programming 3 3 0 0 3 100
3 UDASC57 Image Processing and Analysis 3 3 0 0 3 100
4 UDASC58 Health Care Analytics 3 3 0 0 3 100
5 UDASC59 Data mining using R 3 3 0 0 3 100
6 UDASC60 Text Analytics 3 3 0 0 3 100
PROGRAM ELECTIVE COURSE-4 & 5
1 UDASC62 High-Dimensional Data Analysis 3 3 0 0 3 100
2 UDASC65 Cyber Forensic analytics 3 3 0 0 3 100
3 UDASC66 Social Network Analytics 3 3 0 0 3 100
4 UDASC67 IoT cloud and data analytics 3 3 0 0 3 100
5 UDASC68 Predictive Modeling Analysis 3 3 0 0 3 100
SEMESTER-I
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC01/ COMMUNICATIVE ENGLISH
YEAR / SEMESTER I / I
L T P C
2 0 0 2
COURSE OBJECTIVES:
1. To help students learn to speak grammatically correct English, and to guide and support
the development of their listening, speaking, reading and writing skills in English.
2. To help them realize the importance of English as a global language in today's
world.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Outline the importance of communication skill.
2. Illustrate technical and general vocabulary.
3. Distinguish different tenses and identify common errors
4. Infer the skill for writing formal and informal letters
5. Develop good listening and speaking skills
6. Apply the skills to speak and write English grammatically
UNIT I INTRODUCTION (6HRS)
Listening – short texts – formal and informal conversations - Speaking – basics in speaking –
speaking on given topics & situations – recording speeches and strategies to improve - Reading –
critical reading – finding key information in a given text – sifting facts from opinions - Writing
– free writing on any given topic – autobiographical writing - Language Development – tenses – voices -
word formation: prefixes and suffixes – parts of speech – developing hints
UNIT II READING AND LANGUAGE DEVELOPMENT (6HRS)
Listening - long texts - TED talks - extensive speech on current affairs and discussions -
Speaking – describing a simple process – asking and answering questions - Reading
comprehension – skimming / scanning / predicting & analytical reading – question & answers –
objective and descriptive answers – identifying synonyms and antonyms - process description -
Writing instructions – Language Development – writing definitions – compound words -
articles – prepositions.
UNIT III SPEAKING AND INTERPRETATION SKILLS (6HRS)
Listening - dialogues & conversations - Speaking – role plays – asking about routine actions and
expressing opinions - Reading longer texts & making a critical analysis of the given text -
Writing – types of paragraph and writing essays – rearrangement of jumbled sentences -
writing recommendations – Language Development – use of sequence words - cause & effect
expressions - sentences expressing purpose - picture based and newspaper based activities –
single word substitutes.
UNIT IV VOCABULARY BUILDING AND WRITING SKILLS (6HRS)
Listening - debates and discussions – practicing multiple tasks – self introduction – Speaking
about friends/places/hobbies - Reading – Making inference from the reading passage –
Predicting the content of the reading passage - Writing – informal letters/e-mails - Language
Development - synonyms & antonyms - conditionals – if, unless, in case, when and others –
framing questions.
UNIT-V LANGUAGE DEVELOPMENT AND TECHNICAL WRITING (6HRS)
Listening - popular speeches and presentations - Speaking – impromptu speeches & debates -
Reading - articles – magazines/newspapers - Writing – essay writing on technical topics -
channel conversion – bar diagram/graph – picture interpretation - process description - Language
Development – modal verbs - fixed/semi-fixed expressions – collocations.
TOTAL:30 Hours
TEXT BOOKS:
1. Board of Editors. Using English: A Coursebook for Undergraduate Engineers and
Technologists. Orient BlackSwan Limited, Hyderabad, 2018.
2. Dhanavel, S.P. English and Communication Skills for Students of Science and
Engineering. Orient BlackSwan, Chennai, 2011.
REFERENCE BOOKS:
1. Anderson, Paul V. Technical Communication: A Reader-Centered Approach. Cengage,
New Delhi, 2008.
2. Smith-Worthington, Darlene & Sue Jefferson. Technical Writing for Success. Cengage,
Mason, USA, 2007.
3. Grussendorf, Marion. English for Presentations. Oxford University Press, Oxford, 2007.
4. Chauhan, Gajendra Singh et al. Technical Communication (Latest Revised Edition).
Cengage Learning India Pvt. Limited, 2018.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC02/ LINEAR ALGEBRA & CALCULUS
YEAR / SEMESTER I / I
L T P C
3 1 0 4
COURSE OBJECTIVES:
This course introduces students to some basic mathematical ideas and tools which are at the core
of any engineering course. A brief course in Linear Algebra familiarises students with some
basic techniques in matrix theory which are essential for analysing linear systems. The calculus
of functions of one or more variables taught in this course is useful in modelling and analysing
physical phenomena involving continuous change of variables or parameters and has
applications across all branches.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Apply the Matrix Methods to solve the system of linear equations
2. Test the convergence and divergence of the infinite Series.
3. Determine the extreme values of functions of two variables.
4. Apply the vector differential operator to scalar and vector functions.
5. Solve line, surface and volume integrals using Green’s, Gauss’s and Stokes’ theorems.
UNIT-I Matrices: (12HRS)
Rank of a matrix, Echelon form, consistency of linear System of equations, Linear dependence
of vectors, Eigen values, Eigenvectors, Properties of Eigen values, Cayley-Hamilton theorem,
Quadratic forms, Reduction of quadratic form to canonical form by linear transformation, Nature
of quadratic form.
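As an illustration of the Cayley-Hamilton theorem listed in this unit, the following is a minimal sketch that checks the 2x2 case, where the characteristic equation is λ² − tr(A)·λ + det(A) = 0. The matrix A and the helper names are hypothetical and chosen only for this example; they are not part of the syllabus.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def cayley_hamilton_residual(a):
    """Return A^2 - tr(A)*A + det(A)*I for a 2x2 matrix A.

    By the Cayley-Hamilton theorem this should be the zero matrix.
    """
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    a_squared = mat_mul(a, a)
    identity = [[1, 0], [0, 1]]
    return [[a_squared[i][j] - trace * a[i][j] + det * identity[i][j]
             for j in range(2)] for i in range(2)]

# Hypothetical example matrix (integer entries keep the arithmetic exact).
A = [[2, 1], [1, 3]]
residual = cayley_hamilton_residual(A)
print(residual)
```

For this A the trace is 5 and the determinant is 5, so the residual is A² − 5A + 5I.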
UNIT-II Infinite Series: (12HRS)
Definition of convergence of sequences and series. Series of positive terms – Necessary condition
for convergence, Comparison tests, limit form of the comparison test, D’Alembert’s Ratio test, Raabe’s
test, Cauchy’s root test, alternating series, Leibniz’s rule, absolute and conditional
convergence.
UNIT-III Partial Differentiation and Its Applications: (12HRS)
Functions of two or more variables, Partial derivatives, Higher order partial derivatives, Total
derivative, Differentiation of implicit functions, Jacobians, Taylor’s expansion of functions of
two variables, Maxima and minima of functions of two variables.
UNIT-IV Vector Differential Calculus: (12HRS)
Scalar and vector point functions, vector operator Del, Gradient, Directional derivative,
Divergence, Curl, Del applied twice to point functions, Del applied to product of point functions
(vector identities). Applications: Irrotational fields and Solenoidal fields.
UNIT-V Vector Integral Calculus: (12HRS)
Line integral, Surface integral and Volume integral. Green’s theorem in the plane, verification
of Stokes’ theorem (without proof) and Gauss’s divergence theorem (without proof).
TOTAL-60 Hours
TEXT BOOKS:
1. B.S. Grewal, Higher Engineering Mathematics, Khanna Publishers, 44th Edition, 2017.
2. Erwin Kreyszig, Advanced Engineering Mathematics, 10th Edition, John Wiley & Sons, 2010.
3. Ramana B.V., Higher Engineering Mathematics, Tata McGraw Hill New Delhi, Reprint, 2017.
4. Kreyszig Erwin, "Advanced Engineering Mathematics ", John Wiley and Sons, 10th Edition,
New Delhi, 2016.
REFERENCE BOOKS
1. Sastry, S.S., “Engineering Mathematics”, Vol. I & II, PHI Learning Pvt. Ltd, 4th Edition, New
Delhi, 2014.
2. Wylie, R.C. and Barrett, L.C., “Advanced Engineering Mathematics”, Tata McGraw Hill
Education Pvt. Ltd, 6th Edition, New Delhi, 2012.
3. Dean G. Duffy., “Advanced Engineering Mathematics with MATLAB”, CRC Press, Third
Edition 2013
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC03/ COMPUTER ARCHITECTURE
YEAR / SEMESTER I / I
L T P C
3 0 0 3
COURSE OBJECTIVES:
• To learn the basic structure and operations of a computer.
• To learn the arithmetic and logic unit and implementation of fixed-point and floating
point arithmetic unit.
• To learn the basics of pipelined execution.
• To understand parallelism and multi-core processors.
• To understand the memory hierarchies, cache memories and virtual memories.
• To learn the different ways of communication with I/O devices.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Understand the basic structure of computers, operations and instructions.
2. Design arithmetic and logic unit.
3. Understand pipelined execution and design control unit.
4. Understand parallel processing architectures.
5. Understand the various memory systems and I/O communication.
UNIT I BASIC STRUCTURE OF A COMPUTER SYSTEM (9HRS)
Functional Units – Basic Operational Concepts – Performance – Instructions: Language of the
Computer – Operations, Operands – Instruction representation – Logical operations – decision
making – MIPS Addressing.
UNIT II ARITHMETIC FOR COMPUTERS (9HRS)
Addition and Subtraction – Multiplication – Division – Floating Point Representation – Floating
Point Operations – Subword Parallelism
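As an illustration of floating point representation, here is a small sketch that splits a number into its sign, exponent and fraction fields. It assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits with a bias of 127, 23 fraction bits), which the unit itself does not name; the function name is hypothetical.

```python
import struct

def float32_fields(x):
    """Split x (rounded to single precision) into sign, exponent and fraction bits.

    Assumes the IEEE 754 binary32 layout: 1 + 8 + 23 bits.
    """
    # Pack as a big-endian 32-bit float, then reinterpret the bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # stored with a bias of 127
    fraction = bits & 0x7FFFFF       # 23 explicit fraction bits
    return sign, exponent, fraction

# -0.75 = -1.1 (binary) x 2^-1, so the stored exponent is -1 + 127 = 126.
sign, exponent, fraction = float32_fields(-0.75)
print(sign, exponent, hex(fraction))
```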
UNIT III PROCESSOR AND CONTROL UNIT (9HRS)
A Basic MIPS implementation – Building a Datapath – Control Implementation Scheme –
Pipelining – Pipelined datapath and control – Handling Data Hazards & Control Hazards –
Exceptions.
UNIT IV PARALLELISM (9HRS)
Parallel processing challenges – Flynn’s classification – SISD, MIMD, SIMD, SPMD, and
Vector Architectures - Hardware multithreading – Multi-core processors and other Shared
Memory Multiprocessors - Introduction to Graphics Processing Units, Clusters, Warehouse Scale
Computers and other Message-Passing Multiprocessors.
UNIT V MEMORY & I/O SYSTEMS (9HRS)
Memory Hierarchy - memory technologies – cache memory – measuring and improving cache
performance – virtual memory, TLBs – Accessing I/O Devices – Interrupts – Direct Memory
Access – Bus structure – Bus operation – Arbitration – Interface circuits - USB.
TOTAL : 45 Hours
TEXT BOOKS:
1. David A. Patterson and John L. Hennessy, Computer Organization and Design: The
Hardware/Software Interface, Fifth Edition, Morgan Kaufmann / Elsevier, 2014.
2. Carl Hamacher, Zvonko Vranesic, Safwat Zaky and Naraig Manjikian, Computer
Organization and Embedded Systems, Sixth Edition, Tata McGraw Hill, 2012.
REFERENCE BOOKS:
1. William Stallings, Computer Organization and Architecture – Designing for Performance,
Eighth Edition, Pearson Education, 2010.
2. John P. Hayes, Computer Architecture and Organization, Third Edition, Tata McGraw Hill,
2012.
3. John L. Hennessy and David A. Patterson, Computer Architecture – A Quantitative
Approach, Morgan Kaufmann / Elsevier Publishers, Fifth Edition, 2012.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC04/ PROBLEM SOLVING AND
PROGRAMMING USING C
YEAR / SEMESTER I / I
L T P C
3 0 0 3
COURSE OBJECTIVES:
• To acquire problem solving skills
• To be able to develop flowcharts
• To understand structured programming concepts
• To be able to write programs in C Language
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Apply appropriate looping and conditional constructs for given problems
2. Use pointers, arrays and strings to solve complex problems
3. Use Structures, unions and files for problem solving
4. Apply problem solving techniques to real world problems
5. Make use of functions to build modular programming
UNIT I –PROBLEM SOLVING FUNDAMENTALS (9HRS)
Introduction to problem solving - Flow Chart, Algorithm, Pseudo code - Procedural
Programming (Modular and Structural)- Program Compilation, Execution, Debugging, Testing –
Preprocessors -Basic features of C, Structure of C program - Data types- Storage Classes-Tokens
in C- Input and Output Statements in C, Operators- Bitwise, Unary, Binary and Ternary
Operators, Precedence and Associativity –Expression Evaluation
UNIT II – CONDITIONAL STATEMENTS AND LOOPING CONSTRUCTS (9HRS)
Problem solving using Conditional or Selection or Branching Statements: Structure of if, if-else,
else-if ladder, nested-if, switch constructs - Looping constructs: Structure of for, while, do-while
constructs, usage of break, return, goto and continue keywords
UNIT III – ARRAYS AND STRINGS (9HRS)
1D Array – Declaration, Initialization, 2D Array - Declaration, Initialization, Multi-dimensional
Arrays - Strings: Declaration, Initialization, String operations: length, compare, concatenate, copy
UNIT IV – FUNCTIONS AND POINTERS (9HRS)
Functions: Built-in Functions, User defined functions – Function Prototypes –Recursion –
Command Line Arguments - Arrays and Functions – Strings and Functions. Pointers: Declaration
– Pointer operators – Pointer Arithmetic-Passing Pointers to a function-Pointers and one-
dimensional arrays-Dynamic memory allocation.
UNIT V – STRUCTURES, UNION AND FILE HANDLING (9HRS)
Structure: Create a Structure-Member initialization - Accessing Structure Members - Nested
structures – Pointer and Structures – Array of structures - Self Referential Structures – typedef -
Unions, Files – Opening and Closing a Data File, Reading and writing a data file.
TOTAL : 45 Hours
TEXT BOOKS:
1. Jeyapoovan T, “Fundamentals of Computing and Programming in C”, Vikas Publishing house,
2015
2. Mark Siegesmund, "Embedded C Programming", first edition, Elsevier publications, 2014.
REFERENCE BOOKS
1. Ashok Kamthane, “Computer Programming”, Pearson Education, 7th Edition, Inc 2017.
2. Yashavant Kanetkar, “Let Us C”, 15th edition, BPB Publications, 2016.
3. S.Sathyalakshmi, S.Dinakar, “Computer Programming Practicals – Computer Lab Manual”,
Dhanam Publication, First Edition, July 2013
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC05/ DIGITAL SYSTEM DESIGN
YEAR / SEMESTER I / I
L T P C
3 1 0 4
COURSE OBJECTIVES:
• To design digital circuits using simplified Boolean functions
• To analyze and design combinational circuits
• To analyze and design synchronous and asynchronous sequential circuits
• To understand Programmable Logic Devices
• To write HDL code for combinational and sequential circuits
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Explain the fundamentals of number systems, codes and digital logic families
2. Develop combinational circuits.
3. Design synchronous sequential circuits using flip-flops.
4. Demonstrate Asynchronous Sequential circuits and Programmable Logic Devices.
5. Apply simulation tools for designing digital logic circuits.
UNIT I BOOLEAN ALGEBRA AND LOGIC GATES (12HRS)
Number Systems – Arithmetic Operations – Binary Codes- Boolean Algebra and Logic Gates –
Theorems and Properties of Boolean Algebra – Boolean Functions – Canonical and Standard
Forms – Simplification of Boolean Functions using Karnaugh Map – Logic Gates – NAND and
NOR Implementations.
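To make the Boolean algebra theorems and NAND implementations in this unit concrete, the sketch below checks one of De Morgan's theorems and a NAND-only realization of OR by exhaustive truth table. The function names are hypothetical, and bits are modelled as the integers 0 and 1.

```python
from itertools import product

def nand(a, b):
    """NAND gate on single bits (0 or 1)."""
    return 1 - (a & b)

def or_from_nand(a, b):
    """OR built only from NAND gates: OR(a, b) = NAND(NAND(a, a), NAND(b, b))."""
    return nand(nand(a, a), nand(b, b))

inputs = list(product((0, 1), repeat=2))

# De Morgan: NOT(a AND b) equals (NOT a) OR (NOT b) for every input pair.
de_morgan_holds = all((1 - (a & b)) == ((1 - a) | (1 - b)) for a, b in inputs)

# The NAND-only circuit agrees with OR on the whole truth table.
nand_or_matches = all(or_from_nand(a, b) == (a | b) for a, b in inputs)

print(de_morgan_holds, nand_or_matches)
```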
UNIT II COMBINATIONAL LOGIC (12HRS)
Combinational Circuits – Analysis and Design Procedures – Binary Adder-Subtractor – Decimal
Adder – Binary Multiplier – Magnitude Comparator – Decoders – Encoders – Multiplexers –
Introduction to HDL – HDL Models of Combinational circuits.
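The binary adder in this unit can be sketched in software as a chain of full adders (a ripple-carry adder). This is an illustrative model only, not HDL; the function names and the least-significant-bit-first list convention are assumptions made for the example.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out

def ripple_carry_add(x_bits, y_bits):
    """Add two equal-length bit lists, least significant bit first."""
    carry = 0
    result = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        result.append(s)
    return result, carry

# 6 (0110) + 7 (0111) with bits listed LSB first; 13 is 1101.
total, carry_out = ripple_carry_add([0, 1, 1, 0], [1, 1, 1, 0])
print(total, carry_out)
```

The carry produced by each stage feeds the next stage, which is why the delay of this design grows with the word length.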
UNIT III SYNCHRONOUS SEQUENTIAL LOGIC (12HRS)
Sequential Circuits – Storage Elements: Latches, Flip-Flops – Analysis of Clocked Sequential
Circuits – State Reduction and Assignment – Design Procedure – Registers and Counters – HDL
Models of Sequential Circuits.
UNIT IV ASYNCHRONOUS SEQUENTIAL LOGIC (12HRS)
Analysis and Design of Asynchronous Sequential Circuits – Reduction of State and Flow Tables
– Race-free State Assignment – Hazards.
UNIT V MEMORY AND PROGRAMMABLE LOGIC (12HRS)
RAM – Memory Decoding – Error Detection and Correction – ROM – Programmable Logic
Array – Programmable Array Logic – Sequential Programmable Devices.
TOTAL : 60 Hours
TEXT BOOKS:
1. M. Morris Mano, Michael D. Ciletti, “Digital Design: With an Introduction to the Verilog
HDL, VHDL, and SystemVerilog”, 6th Edition, Pearson Education, 2017.
REFERENCE BOOKS:
1. G. K. Kharate, Digital Electronics, Oxford University Press, 2010
2. John F. Wakerly, Digital Design Principles and Practices, Fifth Edition, Pearson Education,
2017.
3. Charles H. Roth Jr, Larry L. Kinney, Fundamentals of Logic Design, Sixth Edition,
CENGAGE Learning, 2013
4. Donald D. Givone, Digital Principles and Design, Tata McGraw Hill, 2003.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC06/ ETHICS AND HUMAN VALUES
YEAR / SEMESTER I / I
L T P C
2 0 0 2
COURSE OBJECTIVES:
To create awareness among students of Engineering Ethics and Human Values, to instill
Moral and Social Values and Loyalty, and to help them appreciate the rights of others.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
• Apply ethics in society, discuss the ethical issues related to engineering, and realize the
responsibilities and rights in society.
UNIT I HUMAN VALUES (6HRS)
Morals, values and Ethics – Integrity – Work ethic – Service learning – Civic virtue – Respect
for others – Living peacefully – Caring – Sharing – Honesty – Courage – Valuing time –
Cooperation – Commitment – Empathy – Self-confidence – Character – Spirituality –
Introduction to Yoga and meditation for professional excellence and stress management.
UNIT II ENGINEERING ETHICS (6HRS)
Senses of ‘Engineering Ethics’ – Variety of moral issues – Types of inquiry – Moral dilemmas –
Moral Autonomy – Kohlberg’s theory – Gilligan’s theory – Consensus and Controversy –
Models of professional roles - Theories about right action – Self-interest – Customs and Religion
– Uses of Ethical Theories
UNIT III ENGINEERING AS SOCIAL EXPERIMENTATION (6HRS)
Engineering as Experimentation – Engineers as responsible Experimenters – Codes of Ethics – A
Balanced Outlook on Law.
UNIT IV SAFETY, RESPONSIBILITIES AND RIGHTS (6HRS)
Safety and Risk – Assessment of Safety and Risk – Risk Benefit Analysis and Reducing Risk -
Respect for Authority – Collective Bargaining – Confidentiality – Conflicts of Interest –
Occupational Crime – Professional Rights – Employee Rights – Intellectual Property
Rights (IPR) – Discrimination
UNIT V GLOBAL ISSUES (6HRS)
Multinational Corporations – Environmental Ethics – Computer Ethics – Weapons Development
– Engineers as Managers – Consulting Engineers – Engineers as Expert Witnesses and Advisors
– Moral Leadership –Code of Conduct – Corporate Social Responsibility
TOTAL: 30 HOURS
TEXT BOOKS:
1. Mike W. Martin and Roland Schinzinger, “Ethics in Engineering”, Tata McGraw Hill,
New Delhi, 2017.
2. Govindarajan M, Natarajan S, Senthil Kumar V. S, “Engineering Ethics”, Prentice Hall of
India, New Delhi, 2004.
REFERENCE BOOKS:
1. Charles B. Fleddermann, “Engineering Ethics”, Pearson Prentice Hall, New Jersey, 2004.
2. Charles E. Harris, Michael S. Pritchard and Michael J. Rabins, “Engineering Ethics – Concepts
and Cases”, Cengage Learning, 2009.
3. John R Boatright, “Ethics and the Conduct of Business”, Pearson Education, New Delhi, 2003.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC1PA / PROBLEM SOLVING USING C
LABORATORY
YEAR / SEMESTER I / I
L T P C
0 0 4 2
COURSE OBJECTIVES:
• To acquire problem solving skills
• To be able to develop flowcharts
• To understand structured programming concepts
• To be able to write programs in C Language
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
• Solve problems using data types and operators
• Apply appropriate looping and conditional constructs for given C programs
• Use functions to build modular programs
• Use appropriate IDE and tools to write, compile, debug and execute a C Program
• Implement structures, unions and File Operations
LIST OF EXPERIMENTS:
1. Problem solving design using Scratch tool
2. Conditional Statements- if-if else-else if ladder- nested if- switch
3. Looping Constructs – for – while- do-while
4. One dimensional Arrays
5. Two dimensional Arrays
6. Functions- Modular Programming
7. Pointers and arrays
8. Dynamic Memory Allocation
9. Programs to illustrate File operations
10. Structures and Union
TOTAL : 60 Hours
TEXT BOOKS:
1. Kernighan B. W. and Ritchie D. M., “C Programming Language (ANSI C)”, Prentice Hall
of India Private Limited, New Delhi, 2015.
2. Herbert Schildt, “C – The Complete Reference”, Tata McGraw Hill Publishing Company,
New Delhi, 2017.
REFERENCE BOOKS:
1. Deitel and Deitel, “C How to Program”, Pearson Education, New Delhi, 2011.
2. Byron S. Gottfried and Jitendar Kumar Chhabra, “Programming with C”, Tata McGraw Hill
Publishing Company, New Delhi, 2011.
3. Pradip Dey and Manas Ghosh, “Programming in C”, Oxford University Press, New Delhi,
2009.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC1PB/ COMMUNICATIVE SKILLS AND
LANGUAGE LABORATORY
YEAR / SEMESTER I / I
L T P C
0 0 2 1
COURSE OBJECTIVES:
1. To introduce students to the nuances of phonetics and give them sufficient practice in correct pronunciation.
2. To introduce word stress and intonation.
3. To use IELTS and TOEFL material for honing their listening skills.
4. To provide activities that enable them to overcome their inhibitions while speaking in English, with the
focus being on fluency rather than accuracy.
5. To encourage teamwork and role behaviour while developing their ability to discuss in groups and
make oral presentations.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Define the speech sounds in English and understand the nuances of pronunciation in English
2. Apply stress correctly and speak with the proper tone, intonation and rhythm.
3. Analyze IELTS and TOEFL listening comprehension texts to enhance their listening skills.
4. Determine the context and speak appropriately in various situations.
5. Design and present effective posters while working in teams, and discuss and participate in
Group discussions.
LIST OF EXERCISES:
1. Introduction to English Phonetics: Introduction to auditory, acoustic and articulatory
phonetics, organs of speech: the respiratory, articulatory and phonatory systems.
2. Sound system of English: Phonetic sounds and phonemic sounds, introduction to
international phonetic alphabet, classification and description of English phonemic sounds,
minimal pairs. The syllable: types of syllables, consonant clusters.
3. Word stress: Primary stress, secondary stress, functional stress, rules of word stress.
4. Rhythm & Intonation: Introduction to Rhythm and Intonation. Major patterns, intonation
of English with the semantic implications.
5. Listening skills – Practice with IELTS and TOEFL material
6. Public speaking – Speaking with confidence and clarity in different contexts on various
issues.
7. Group Discussions - Dynamics of a group discussion, group discussion techniques, body
language.
8. Pictionary – weaving an imaginative story around a given picture.
9. Information Gap Activity – Writing a brief report on a newspaper headline by building on
the hints given
10. Poster presentation – Theme, poster preparation, team work and presentation.
TOTAL : 30 Hours
REFERENCE BOOKS:
1. T Balasubramanian. A Textbook of English Phonetics for Indian Students, Macmillan,
2017.
2. J Sethi et al. A Practical Course in English Pronunciation (with CD), Prentice Hall India,
2013.
3. Priyadarshi Patnaik. Group Discussions and Interviews, Cambridge University Press Pvt.
Ltd., 2011.
4. Aruna Koneru, Professional Speaking Skills, Oxford University Press, 2016.
SEMESTER-II
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC21/ PRINCIPLES OF DATA SCIENCE
YEAR / SEMESTER I / II
L T P C
3 0 0 3
COURSE OBJECTIVES:
To provide a strong foundation in data science and its application areas related to information
technology, and to understand the underlying core concepts and emerging technologies in data
science.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Explore the fundamental concepts of data science
2. Understand data analysis techniques for applications handling large data
3. Understand various machine learning algorithms used in data science process
4. Visualize and present the inference using various tools
5. Learn to think through the ethics surrounding privacy, data sharing and
algorithmic decision-making
UNIT-1-INTRODUCTION TO DATA SCIENCE (9HRS)
Definition – Big Data and Data Science Hype – Why data science – Getting Past the
Hype – The Current Landscape – Who is a Data Scientist? - Data Science Process
Overview – Defining goals – Retrieving data – Data preparation – Data exploration –
Data modeling – Presentation.
UNIT-2 -BIG DATA (9HRS)
Problems when handling large data – General techniques for handling large data – Case study
– Steps in big data – Distributing data storage and processing with Frameworks – Case study.
UNIT-3-MACHINE LEARNING (9HRS)
Machine learning – Modeling Process – Training model – Validating model – Predicting
new observations – Supervised learning algorithms – Unsupervised learning algorithms.
UNIT-4-DEEP LEARNING (9HRS)
Introduction – Deep Feedforward Networks – Regularization – Optimization of Deep
Learning – Convolutional Networks – Recurrent and Recursive Nets – Applications of
Deep Learning.
UNIT-5 - DATA VISUALIZATION (9HRS)
Introduction to data visualization – Data visualization options – Filters – MapReduce –
Dashboard development tools – Creating an interactive dashboard with dc.js – Summary.
TOTAL -45 Hours
TEXT BOOKS:
1. Introducing Data Science, Davy Cielen, Arno D. B. Meysman, Mohamed Ali,
Manning Publications Co., 1st edition, 2016
2. An Introduction to Statistical Learning: with Applications in R, Gareth James, Daniela
Witten, Trevor Hastie, Robert Tibshirani, Springer, 1st edition, 2013
3. Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, MIT Press, 1st
edition, 2016
4. Ethics and Data Science, D J Patil, Hilary Mason, Mike Loukides, O’Reilly, 1st
edition, 2018
REFERENCE BOOKS:
1. Data Science from Scratch: First Principles with Python, Joel Grus, O’Reilly, 1st
edition, 2015
2. Doing Data Science: Straight Talk from the Frontline, Cathy O'Neil, Rachel Schutt,
O’Reilly, 1st edition, 2013
3. Mining of Massive Datasets, Jure Leskovec, Anand Rajaraman, Jeffrey David
Ullman, Cambridge University Press, 2nd edition, 2014
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC22/ FUNDAMENTALS OF STATISTICS
YEAR / SEMESTER I / II
L T P C
3 1 0 4
COURSE OBJECTIVES:
To enable the students to understand the fundamentals of statistics to apply descriptive measures
and probability for data analysis.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Understand the science of studying & analyzing numbers.
2. Identify and use various visualization tools for representing data.
3. Describe various statistical formulas.
4. Compute various statistical measures.
UNIT I Statistics and Probability: (12HRS)
Introduction to Statistics – Origin of Statistics, Features of Statistics, Scope of Statistics,
Functions of Statistics, Uses and importance of Statistics, Limitation of Statistics, Distrust of
Statistics
UNIT –II Collection of Data: (12HRS)
Introduction to Collection of Data, Primary and Secondary Data, Methods of Collecting Primary
Data, Methods of Collecting Secondary Data, Statistical Errors, Rounding off Data (Approximation).
UNIT III Classification of Data and Frequency Distribution: (12HRS)
Introduction to Classification of Data, Objectives of Classification, Methods of Classification,
Ways to Classify Numerical Data or Raw Data. Tabular, Diagrammatic and Graphic Presentation
of Data: Introduction to Tabular Presentation of Data, Objectives of Tabulation, Components of
a Statistical Table, General Rules for the Construction of a Table, Types of Tables, Introduction
to Diagrammatic Presentation of Data, Advantage and Disadvantage of Diagrammatic
Presentation, Types of Diagrams, Introduction to Graphic Presentation of Data, Advantage and
Disadvantage of Graphic Presentation, Types of Graphs.
UNIT IV Measures of Central tendency: (12HRS)
Introduction to Central Tendency, Purpose and Functions of Average, Characteristics of a Good
Average, Types of Averages, Meaning of Arithmetic Mean, Calculation of Arithmetic Mean,
Merit and Demerits of Arithmetic Mean, Meaning of Median, Calculation of Median, Merit and
Demerits of Median, Meaning of Mode, Calculation of Mode, Merit and Demerits of Mode,
Harmonic Mean – Properties, Merit and Demerits.
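The averages named in this unit can be computed with Python's standard `statistics` module, as in the short sketch below. The list of marks is made-up illustrative data, not taken from the syllabus.

```python
import statistics

# Hypothetical marks of five students (illustrative data only).
marks = [40, 50, 50, 60, 100]

mean = statistics.mean(marks)                    # arithmetic mean: sum / count
median = statistics.median(marks)                # middle value of the sorted data
mode = statistics.mode(marks)                    # most frequent value
harmonic_mean = statistics.harmonic_mean(marks)  # count / sum of reciprocals

print(mean, median, mode, round(harmonic_mean, 2))
```

In this data set the single high value (100) pulls the arithmetic mean above the median, which is one of the demerits of the arithmetic mean covered in the unit; the harmonic mean is smaller than the arithmetic mean for any positive data that is not all equal.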
UNIT V Measures of Dispersion: (12HRS)
Meaning of Dispersion, Objectives of Dispersion, Properties of a good Measure of Dispersion,
Methods of Measuring Dispersion, Introduction to Range, Calculation of Range, Merit and
Demerits of Range, Mean Deviation, Calculation of Mean Deviation, Merit and Demerits of
Mean Deviation, Meaning of Standard Deviation, Calculation of Standard Deviation, Merit and
Demerits of Standard Deviation, Coefficient of Variation, Calculation of Coefficient of Variation,
Merit and Demerits of Coefficient of Variation.
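The dispersion measures listed in this unit can be sketched as follows. The data set is made up for illustration, and the sketch assumes the population form of the standard deviation (dividing by n) and mean deviation taken about the arithmetic mean; the syllabus does not state which conventions the course uses.

```python
import statistics

# Hypothetical observations (illustrative data only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Mean deviation about the mean: average absolute distance from the mean.
mean_deviation = sum(abs(x - mean) for x in data) / len(data)

# Population standard deviation (assumption: divide by n, not n - 1).
std_dev = statistics.pstdev(data)

# Coefficient of variation, expressed as a percentage of the mean.
coefficient_of_variation = std_dev / mean * 100

print(data_range, mean_deviation, std_dev, coefficient_of_variation)
```

Because the coefficient of variation is a ratio, it has no units and can be used to compare the spread of data sets measured on different scales.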
TOTAL: 60 Hours
TEXT BOOKS:
1. Statistics and Data Analysis, A.Abebe, J. Daniels, J.W.Mckean, December 2000.
2. Statistics, Tmt. S. EzhilarasiThiru, 2005, Government of Tamilnadu.
3. Introduction to Statistics, David M. Lane.
4. Weiss, N.A., Introductory Statistics. Addison Wesley, 1999.
5. Clarke, G.M. & Cooke, D., A Basic course in Statistics. Arnold, 1998.
REFERENCE BOOKS:
1. Banfield J.(1999), Rweb: Web-based Statistical Analysis, Journal of Statistical Software.
2. Bhattacharyya, G.K. and Johnson, R.A. (1977), Statistical Concepts and Methods, New York,
John Wiley & Sons.
E-Books/ Online learning material
1. http://onlinestatbook.com/Online_Statistics_Education.pdf
2. https://textbookcorp.tn.gov.in/Books/12/Std12-Stat-EM.pdf
3. https://3lihandam69.files.wordpress.com/2015/10/introductorystatistics.pdf
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC23/ OPERATING SYSTEMS
YEAR / SEMESTER I / II
L T P C
3 0 0 3
COURSE OBJECTIVES:
• To understand the basic concepts and functions of operating systems.
• To understand processes and threads.
• To analyze scheduling algorithms.
• To understand the concept of deadlocks.
• To analyze various memory management schemes.
• To understand I/O management and file systems.
• To be familiar with the basics of the Linux system and mobile operating systems such as iOS and
Android.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
• Characterize the basic functions of operating systems.
• Design the concepts of process management.
• Implement the concepts of deadlocks.
• Describe virtual memory and file systems.
• Analyze file system implementation and disk I/O techniques.
UNIT 1 - INTRODUCTION (9HRS)
Introduction ‐ Computer System Organization ‐ Computer System Architecture ‐ Computer
System Structure ‐ Operating System Operations ‐ Process Management ‐ Memory Management
‐ Storage Management ‐ Distributed Systems ‐ Operating System Services ‐ User Operating
System Interface ‐ System Calls ‐ Types of System calls ‐ System Programs ‐ Process Concept ‐
Process Scheduling ‐ Operations on Processes ‐ Inter‐process Communication
UNIT 2 - SCHEDULING (9HRS)
Threads ‐ Overview ‐ Multithreading Models ‐ CPU Scheduling ‐ Basic Concepts ‐ Scheduling
Criteria ‐ Scheduling Algorithms ‐ Thread Scheduling ‐ Multiple‐Processor Scheduling ‐ The
Critical‐Section Problem ‐ Peterson's Solution ‐ Synchronization Hardware ‐ Semaphores
UNIT 3 - DEADLOCKS (9HRS)
System Model ‐ Deadlock Characterization ‐ Methods for handling Deadlocks ‐ Deadlock
Prevention ‐ Deadlock Avoidance ‐ Deadlock Detection ‐ Recovery from Deadlock ‐ Storage
Management ‐ Swapping‐ Contiguous Memory allocation
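Deadlock avoidance is commonly illustrated with the safety check of the Banker's algorithm. The sketch below is a minimal Python version under stated assumptions: the function name `find_safe_sequence` is hypothetical, and the five-process, three-resource state is illustrative data, not something prescribed by this syllabus.

```python
def find_safe_sequence(available, allocation, maximum):
    """Return a safe order of process indices, or None if the state is unsafe."""
    n = len(allocation)
    # Need = Max - Allocation for every process and resource type.
    need = [[m - a for m, a in zip(maximum[i], allocation[i])] for i in range(n)]
    work = list(available)
    finished = [False] * n
    sequence = []
    progressed = True
    while progressed:
        progressed = False
        for i in range(n):
            # A process can finish if its remaining need fits in the free resources.
            if not finished[i] and all(nd <= w for nd, w in zip(need[i], work)):
                # On finishing, the process releases everything it holds.
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                sequence.append(i)
                progressed = True
    return sequence if all(finished) else None


# Illustrative system state: 5 processes, 3 resource types.
available = [3, 3, 2]
allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
maximum = [[7, 5, 3], [3, 2, 2], [9, 0, 2], [2, 2, 2], [4, 3, 3]]
print(find_safe_sequence(available, allocation, maximum))
```

A request is granted under deadlock avoidance only if the state that would result still has a safe sequence; when none exists the function returns `None`.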
UNIT 4 - PAGING AND FILE SYSTEM (9HRS)
Paging ‐ Demand Paging ‐ Copy‐on‐Write ‐ Page Replacement ‐ Allocation of Frames ‐
Thrashing ‐ Virtual Memory ‐ File Concept ‐ Access Methods ‐ Directory and Disk Structure
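Page replacement policies are easiest to compare by counting page faults on the same reference string. The Python sketch below does this for FIFO and LRU; the function names and the reference string are illustrative assumptions.

```python
from collections import OrderedDict, deque


def fifo_faults(refs, frames):
    """Count page faults when the oldest loaded page is evicted first."""
    memory = set()
    order = deque()  # arrival order of resident pages
    faults = 0
    for page in refs:
        if page not in memory:
            faults += 1
            if len(memory) == frames:
                memory.discard(order.popleft())
            memory.add(page)
            order.append(page)
    return faults


def lru_faults(refs, frames):
    """Count page faults when the least recently used page is evicted."""
    memory = OrderedDict()  # keys ordered from least to most recently used
    faults = 0
    for page in refs:
        if page in memory:
            memory.move_to_end(page)  # a hit refreshes recency
        else:
            faults += 1
            if len(memory) == frames:
                memory.popitem(last=False)
            memory[page] = True
    return faults


# Illustrative reference string with three frames.
reference_string = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_faults(reference_string, 3), lru_faults(reference_string, 3))
```

On this reference string LRU incurs fewer faults than FIFO, because it keeps pages that were touched recently and are likely to be reused.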
UNIT 5 - FILE MANAGEMENT (9HRS)
File System Structure ‐ File System Implementation ‐ Directory Implementation ‐ Allocation
Methods ‐ Free‐space Management ‐ Disk Structure ‐ Disk Attachment ‐ Disk Scheduling ‐ Disk
Management ‐ Swap‐Space Management ‐ RAID Structure
TOTAL: 45 Hours
TEXT BOOKS
1. Abraham Silberschatz, Peter Baer Galvin and Greg Gagne, "Operating System Concepts",
Eighth Edition, John Wiley & Sons (ASIA) Pvt. Ltd, 2009.
REFERENCE BOOKS
1. Harvey M. Deitel, "Operating Systems", Second Edition, Pearson Education, 2002.
2. William Stallings, "Operating System", Prentice Hall of India, 4th Edition, 2003.
3. Andrew S. Tanenbaum, "Modern Operating Systems", Prentice Hall of India, 2003.
E-BOOKS
1.http://www.freebookcentre.net/CompuScience/Free‐Operating‐Systems‐Books‐Download.html
MOOC
1. https://www.coursera.org/learn/web‐applications‐php
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC24/ DATABASE MANAGEMENT SYSTEM
YEAR / SEMESTER I / II
L T P C
3 0 0 3
COURSE OBJECTIVES:
1. To explain basic database concepts, applications, data models, schemas and instances.
2. To demonstrate the use of constraints and relational algebra operations.
3. To describe the basics of SQL and construct queries using SQL.
4. To emphasize the importance of normalization in databases.
5. To facilitate students in Database design
6. To familiarize issues of concurrency control and transaction management
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Recall the basic concepts of database systems.
2. Identify the SQL queries for a given scenario.
3. Illustrate relational database theory, and be able to write relational algebra
expressions for queries.
4. Summarize the various data storage devices and types of indexes.
5. Demonstrate transaction processing and concurrency control.
6. Explain object-oriented databases, distributed databases, XML, data warehousing and mobile
databases.
UNIT 1: INTRODUCTION AND CONCEPTUAL MODELING (9HRS)
Introduction to File and Database systems- Database system structure – Data Models –
Introduction to Network and Hierarchical Models – ER model – Relational Model – Relational
Algebra and Calculus.
UNIT 2: RELATIONAL MODEL (9HRS)
SQL – Data definition- Queries in SQL- Updates- Views – Integrity and Security – Relational
Database design – Functional dependencies and Normalization for Relational Databases (up to
BCNF).
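The data definition, update, view and integrity topics of this unit can be tried out without installing a database server. The sketch below is a minimal illustration that assumes SQLite through Python's built-in `sqlite3` module; the syllabus does not prescribe a particular RDBMS, and the `student` table, its columns and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Data definition with integrity constraints (primary key, NOT NULL, CHECK).
conn.execute(
    """
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        marks   INTEGER CHECK (marks BETWEEN 0 AND 100)
    )
    """
)

# Insert hypothetical rows using parameter placeholders.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", 82), (2, "Ravi", 47), (3, "Meena", 65)],
)

# Update: add 5 grace marks to every student below 50.
conn.execute("UPDATE student SET marks = marks + 5 WHERE marks < 50")

# View: a stored query that behaves like a virtual table.
conn.execute(
    "CREATE VIEW passed AS SELECT name, marks FROM student WHERE marks >= 50"
)

passed = conn.execute("SELECT name, marks FROM passed ORDER BY name").fetchall()
print(passed)
```

An attempt to insert a row with `marks` outside 0 to 100 is rejected by the CHECK constraint, which is the kind of integrity enforcement the unit refers to.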
UNIT-3: DATA STORAGE AND QUERY PROCESSING (9HRS)
Record storage and Primary file organization – Secondary storage Devices – Operations on
Files – Heap File – Sorted Files – Hashing Techniques – Index Structure for files – Different types
of Indexes – B-Tree – B+Tree – Query Processing.
UNIT 4: TRANSACTION MANAGEMENT (9HRS)
Transaction Processing – Introduction – Need for Concurrency control – Desirable
properties of Transaction – Schedule and Recoverability – Serializability and Schedules –
Concurrency Control – Types of Locks – Two-Phase Locking – Deadlock – Recovery Techniques.
UNIT 5: CURRENT TRENDS (9HRS)
Object Oriented Databases – Need for Complex Data types – OO data Model – Nested relations –
Complex Types – Inheritance – Reference Types – Distributed databases – Distributed data Storage
– Querying and Transformation – Data Mining and Data Warehousing and Mobile Database.
TOTAL: 45 Hours
TEXT BOOKS
1. Abraham Silberschatz, Henry F. Korth and S. Sudarshan, “Database System Concepts”,
Seventh Edition, 2019.
REFERENCE BOOKS
1. Ramez Elmasri and Shamkant B. Navathe, “Fundamentals of Database Systems”, Seventh
Edition, Pearson Education, 2016.
2. Raghu Ramakrishnan, “Database Management Systems”, Tata McGraw-Hill Publishing
Company, Third Edition, 2014.
3. Jiawei Han, Micheline Kamber, Jian Pei, “Data Mining: Concepts and
Techniques”, Morgan Kaufmann, Third Edition, 2012.
E BOOKS
1. https://ff.tusofia.bg/~bogi/knigi/BD/Database%20Management%20Systems.%202nd%20Ed.pdf
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC25/ COMPUTER NETWORKS
YEAR / SEMESTER I / II
L T P C
3 0 0 3
COURSE OBJECTIVES:
The students should be able to
• Understand the division of network functionalities into layers.
• Be familiar with the components required to build different types of networks.
• Be exposed to the required functionality at each layer.
• Learn the flow control and congestion control algorithms.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Identify the components required to build different types of networks
2. Choose the required functionality at each layer for given application
3. Identify solution for each functionality at each layer
4. Trace the flow of information from one node to another node in the network
UNIT I FUNDAMENTALS & LINK LAYER (9HRS)
Building a network – Requirements – Layering and protocols – Internet Architecture – Network
software – Performance; Link layer Services – Framing – Error Detection – Flow control
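Framing at the link layer is often illustrated with bit stuffing, where the sender inserts a 0 after every run of five consecutive 1s so that the payload can never imitate the frame delimiter 01111110. The Python sketch below is a minimal illustration; the function names are hypothetical, and bit strings are represented as text made of '0' and '1' characters for readability.

```python
def bit_stuff(bits):
    """Insert a '0' after every run of five consecutive '1' bits."""
    out = []
    run = 0  # length of the current run of 1s
    for bit in bits:
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            out.append("0")  # stuffed bit
            run = 0
    return "".join(out)


def bit_unstuff(bits):
    """Remove the '0' that follows every run of five consecutive '1' bits."""
    out = []
    run = 0
    skip_next = False
    for bit in bits:
        if skip_next:
            # This bit is the stuffed 0 added by the sender; drop it.
            skip_next = False
            run = 0
            continue
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            skip_next = True
    return "".join(out)


payload = "0111111111100"
stuffed = bit_stuff(payload)
print(stuffed, bit_unstuff(stuffed) == payload)
```

The receiver applies the inverse rule, so the original payload is recovered exactly as long as the stuffed stream arrives without bit errors.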
UNIT II MEDIA ACCESS & INTERNETWORKING (9HRS)
Media access control – Ethernet (802.3) – Wireless LANs – 802.11 – Bluetooth – Switching and
bridging – Basic Internetworking (IP, CIDR, ARP, DHCP, ICMP)
UNIT III ROUTING (9HRS)
Routing (RIP, OSPF, metrics) – Switch basics – Global Internet (Areas, BGP, IPv6), Multicast –
addresses – multicast routing (DVMRP, PIM)
UNIT IV TRANSPORT LAYER (9HRS)
Overview of Transport layer – UDP – Reliable byte stream (TCP) – Connection management –
Flow control – Retransmission – TCP Congestion control – Congestion avoidance (DECbit,
RED) – QoS – Application requirements
UNIT V APPLICATION LAYER (9HRS)
Traditional applications -Electronic Mail (SMTP, POP3, IMAP, MIME) – HTTP – Web Services
– DNS – SNMP
TOTAL: 45 HOURS
TEXT BOOK:
1. Larry L. Peterson, Bruce S. Davie, “Computer Networks: A Systems Approach”, Fifth
Edition, Morgan Kaufmann Publishers, 2011.
REFERENCES:
1. James F. Kurose, Keith W. Ross, “Computer Networking – A Top-Down Approach
Featuring the Internet”, Fifth Edition, Pearson Education, 2009.
2. Nader. F. Mir, “Computer and Communication Networks”, Pearson Prentice Hall
Publishers, 2010.
3. Ying-Dar Lin, Ren-Hung Hwang, Fred Baker, “Computer Networks: An Open Source
Approach”, McGraw Hill Publisher, 2011.
4. Behrouz A. Forouzan, “Data Communication and Networking”, Fourth Edition, Tata
McGraw-Hill, 2011.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC2PA / STATISTICS AND DATA SCIENCE
LAB
YEAR / SEMESTER I / II
L T P C
0 0 4 2
COURSE OBJECTIVES:
The students should be able to:
I. Understand the R programming language.
II. Gain exposure to solving data science problems.
III. Understand classification and regression models.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Explore the fundamental concepts of data science
2. Understand data analysis techniques for applications handling large data
3. Understand various machine learning algorithms used in the data science process
4. Visualize and present the inference using various tools
5. Learn to think through the ethics surrounding privacy, data sharing and
algorithmic decision-making
LIST OF EXPERIMENTS:
1. R AS CALCULATOR APPLICATION a. Using with and without R objects on
console b. Using mathematical functions on console c. Write an R script, to create R
objects for calculator application and save in a specified location in disk
2. DESCRIPTIVE STATISTICS IN R a. Write an R script to find basic descriptive
statistics using summary b. Write an R script to find subset of dataset by using subset ()
3. READING AND WRITING DIFFERENT TYPES OF DATASETS a. Reading
different types of data sets (.txt, .csv) from web and disk and writing in file in specific
disk location. b. Reading Excel data sheet in R. c. Reading XML dataset in R.
4. VISUALIZATIONS a. Find the data distributions using box and scatter plot. b. Find
the outliers using plot. c. Plot the histogram, bar chart and pie chart on sample data
5. CORRELATION AND COVARIANCE a. Find the correlation matrix. b. Plot the
correlation plot on the iris dataset to give a visual overview of the relationships among
its variables. c. Analysis of variance (ANOVA), if the data have categorical
variables, on the iris data
6. REGRESSION MODEL Import data from web storage. Name the dataset and
perform logistic regression to find the relation between the variables that affect the
admission of a student to an institute, based on his or her GRE score, GPA obtained and
rank of the student. Also check whether the model fits. Use require(foreign) and require(MASS).
7. MULTIPLE REGRESSION MODEL Apply multiple regression if the data have a
continuous independent variable. Apply it to the above dataset.
8. REGRESSION MODEL FOR PREDICTION Apply regression model techniques to
predict the data on the above dataset
9. CLASSIFICATION MODEL a. Install relevant package for classification. b. Choose
classifier for classification problem. c. Evaluate the performance of classifier
10. CLUSTERING MODEL a. Clustering algorithms for unsupervised classification. b.
Plot the cluster data using R visualizations.
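The laboratory itself is conducted in R. Purely to show the arithmetic behind the correlation matrix in experiment 5, the sketch below computes Pearson correlation coefficients in plain Python. The function names and the small data set are hypothetical; in R the same result would normally come from a built-in function rather than hand-written code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations, and the two deviation norms.
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (norm_x * norm_y)


def correlation_matrix(columns):
    """Return a nested dict of pairwise correlations for named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical measurements (illustrative data only).
columns = {
    "length": [1.0, 2.0, 3.0, 4.0, 5.0],
    "width": [2.0, 4.0, 6.0, 8.0, 10.0],
    "depth": [5.0, 4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(columns)
print(matrix["length"]["width"], matrix["length"]["depth"])
```

Values near +1 or -1 indicate a strong linear relationship, while values near 0 indicate little linear relationship between the two variables.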
TOTAL :60 Hours
Reference Books:
Yanchang Zhao, “R and Data Mining: Examples and Case Studies”, Elsevier, 1st Edition, 2012
Web References:
1.http://www.r-bloggers.com/how-to-perform-a-logistic-regression-in-r/
2.http://www.ats.ucla.edu/stat/r/dae/rreg.htm
3.http://www.coastal.edu/kingw/statistics/R-tutorials/logistic.html
4. http://www.ats.ucla.edu/stat/r/data/binary.csv
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC2PB / DATABASE MANAGEMENT SYSTEM
(DBMS) LABORATORY
YEAR / SEMESTER I / II
L T P C
0 0 4 2
COURSE OBJECTIVES:
1. To explain basic database concepts, applications, data models, schemas and instances.
2. To demonstrate the use of constraints and relational algebra operations.
3. To describe the basics of SQL and construct queries using SQL.
4. To emphasize the importance of normalization in databases.
5. To facilitate students in Database design
6. To familiarize issues of concurrency control and transaction management
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Populate and query a database using SQL commands.
2. Declare and enforce integrity constraints on a database using a state-of-the-art RDBMS.
3. Implement indexing on a table.
4. Program in PL/SQL, including stored procedures, stored functions, cursors and packages.
5. Solve basic issues of simple database applications and construct a real-time database
application using current techniques.
LIST OF EXPERIMENTS:
1. To study basic SQL commands (create table, use, drop, insert) and execute the following
queries using these commands:
Create a table ‘Emp’ with attributes ‘ename’, ‘ecity’, ‘salary’, ‘enumber’, ‘eaddress’, ‘depttname’.
Create another table ‘Company’ with attributes ‘cname’, ‘ccity’, ‘empnumber’ in the
database ‘Employee’.
2. To study the viewing commands (select, update) and execute the following
queries using these commands:
Find the names of all employees who live in Delhi.
Increase the salary of all employees by Rs. 5,000.
Find the company names where the number of employees is greater than 10,000.
Change the Company City to Gurgaon where the Company name is ‘TCS’.
3. To study the commands to modify the structure of a table (alter, delete) and execute the
following queries using these commands:
Add an attribute named ‘Designation’ to the table ‘Emp’.
Modify the table ‘Emp’: change the datatype of the ‘salary’ attribute to float.
Drop the attribute ‘depttname’ from the table ‘Emp’.
Delete the entries from the table ‘Company’ where the number of employees is less than 500.
4. To study the commands that involve compound conditions (and, or, in, not in,
between, not between, like, not like) and execute the following queries using these
commands:
Find the names of all employees who live in ‘Gurgaon’ and whose salary is between
Rs. 20,000 and Rs. 30,000.
Find the names of all employees whose names begin with either letter ‘A’ or ‘B’.
Find the company names where the company city is ‘Delhi’ and the number of employees
is not between 5000 and 10,000.
Find the names of all companies that do not end with letter ‘A’.
5. To study the aggregate functions (sum, count, max, min, average) and execute the
following queries using these commands:
Find the sum and average of salaries of all employees in computer science department.
Find the number of all employees who live in Delhi.
Find the maximum and the minimum salary in the HR department.
6. To study the grouping commands (group by, order by) and execute the following queries
using these commands:
List all employee names in descending order.
Find number of employees in each department where number of employees is greater
than 5.
List all the department names where average salary of a department is Rs.10,000.
7. To study the commands involving data constraints and execute the following queries
using these commands:
Alter table ‘Emp’ and make ‘enumber’ as the primary key.
Alter table ‘Company’ and add the foreign key constraint.
Add a check constraint in the table ‘Emp’ such that salary has the value between 0 and
Rs.1,00,000
Alter table ‘Company’ and add unique constraint to column cname
Add a default constraint to column ccity of table company with the value ‘Delhi’
8. To study the commands for joins (cross join, inner join, outer join) and execute the
following queries using these commands:
Retrieve the complete record of an employee and its company from both the table using
joins.
List all the employees working in the company ‘TCS’.
9. To study the various set operations and execute the following queries using these commands:
List the enumber of all employees who live in Delhi and whose company is in Gurgaon or
if both conditions are true.
List the enumber of all employees who live in Delhi but whose company is not in Gurgaon.
10. To study the various scalar functions and string functions (power, square, substring,
reverse, upper, lower, concatenation) and execute the following queries using these
commands:
Reverse the names of all employees.
Change the names of company cities to uppercase.
Concatenate name and city of the employee.
11. To study the commands involving indexes and execute the following queries:
Create an index with attribute ename on the table employee.
Create a composite index with attributes cname and ccity on table company.
Drop all indexes created on table company.
12. To study the conditional controls and case statement in PL-SQL and execute the following
queries:
Calculate the average salary from table ‘Emp’ and print increase the salary if the average
salary is less than 10,000.
Display the deptno from the employee table using the case statement if the deptname is
‘Technical’ then deptno is 1, if the deptname is ‘HR’ then the deptno is 2 else deptno is 3.
13. To study procedures and triggers in PL-SQL and execute the following queries:
Create a procedure on table employee to display the details of employees by providing
the value of salaries during execution.
Create a trigger on table company for deletion where the whole table is displayed when
delete operation is performed.
14. Consider the tables given below. The primary keys are made bold and the data types are
specified.
PERSON( driver_id:string , name:string , address:string )
CAR( regno:string , model:string , year:int )
ACCIDENT( report_number:int , accd_date:date , location:string )
OWNS( driver_id:string , regno:string )
PARTICIPATED( driver_id:string , regno:string , report_number:int , damage_amount:int )
Create the above tables by properly specifying the primary keys and foreign keys.
Enter at least five tuples for each relation.
Demonstrate how you update the damage amount for the car with a specific regno in the accident
with report number 12 to 25000.
Find the total number of people who owned cars that were involved in accidents in the year
2008.
Find the number of accidents in which cars belonging to a specific model were involved.
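One possible shape of experiment 14 is sketched below. It is an illustration under stated assumptions rather than a model answer: it uses SQLite through Python's `sqlite3` module instead of the RDBMS used in the laboratory, it guesses the primary keys (the bold markings are not visible in this text), it stores dates as ISO text, and it inserts only three hypothetical tuples per relation for brevity.

```python
import sqlite3

SCHEMA = """
CREATE TABLE person (driver_id TEXT PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE car (regno TEXT PRIMARY KEY, model TEXT, year INTEGER);
CREATE TABLE accident (report_number INTEGER PRIMARY KEY, accd_date TEXT, location TEXT);
CREATE TABLE owns (
    driver_id TEXT REFERENCES person(driver_id),
    regno     TEXT REFERENCES car(regno),
    PRIMARY KEY (driver_id, regno)
);
CREATE TABLE participated (
    driver_id     TEXT REFERENCES person(driver_id),
    regno         TEXT REFERENCES car(regno),
    report_number INTEGER REFERENCES accident(report_number),
    damage_amount INTEGER,
    PRIMARY KEY (driver_id, regno, report_number)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(SCHEMA)

# Hypothetical tuples (illustrative data only).
conn.executemany("INSERT INTO person VALUES (?, ?, ?)",
                 [("D1", "Anil", "Delhi"), ("D2", "Bina", "Chennai"), ("D3", "Chitra", "Pune")])
conn.executemany("INSERT INTO car VALUES (?, ?, ?)",
                 [("R1", "Swift", 2005), ("R2", "Swift", 2007), ("R3", "City", 2006)])
conn.executemany("INSERT INTO accident VALUES (?, ?, ?)",
                 [(11, "2008-03-10", "Delhi"), (12, "2008-07-22", "Chennai"), (13, "2009-01-05", "Pune")])
conn.executemany("INSERT INTO owns VALUES (?, ?)",
                 [("D1", "R1"), ("D2", "R2"), ("D3", "R3")])
conn.executemany("INSERT INTO participated VALUES (?, ?, ?, ?)",
                 [("D1", "R1", 11, 5000), ("D2", "R2", 12, 8000), ("D3", "R3", 13, 3000)])

# Update the damage amount for car R2 in the accident with report number 12.
conn.execute("UPDATE participated SET damage_amount = 25000 "
             "WHERE regno = ? AND report_number = ?", ("R2", 12))

# Number of distinct owners whose cars were involved in accidents in 2008.
owners_2008 = conn.execute(
    """
    SELECT COUNT(DISTINCT o.driver_id)
    FROM owns o
    JOIN participated p ON p.regno = o.regno
    JOIN accident a ON a.report_number = p.report_number
    WHERE strftime('%Y', a.accd_date) = '2008'
    """
).fetchone()[0]

# Number of accidents involving cars of a given model.
swift_accidents = conn.execute(
    """
    SELECT COUNT(DISTINCT p.report_number)
    FROM participated p
    JOIN car c ON c.regno = p.regno
    WHERE c.model = ?
    """,
    ("Swift",),
).fetchone()[0]

print(owners_2008, swift_accidents)
```

The SQL statements carry over to other relational systems with minor changes; the year extraction in particular is SQLite-specific and would use that system's own date functions.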
TOTAL :60 Hours
TEXT BOOKS
1. Abraham Silberschatz, Henry F. Korth and S. Sudarshan, “Database System Concepts”,
Seventh Edition, 2019.
REFERENCE BOOKS
1. Ramez Elmasri and Shamkant B. Navathe, “Fundamentals of Database Systems”, Seventh
Edition, Pearson Education, 2016.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC2PC/ OPERATING SYSTEMS AND
NETWORKS LABORATORY
YEAR / SEMESTER I / II
L T P C
0 0 4 2
COURSE OBJECTIVES:
To understand the functionalities of various layers of OSI model
To explain the difference between hardware, software; operating systems, programs and files.
Identify the purpose of different software applications.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Understand fundamental underlying principles of computer networking.
2. Understand details and functionality of layered network architecture.
3. Apply mathematical foundations to solve computational problems in computer
networking.
4. Describe and demonstrate the functions and features of current operating
systems.
5. Demonstrate proficiency in common industry software applications (word processing,
spreadsheet, presentation, and database) to effectively communicate in a professional
business setting.
6. Demonstrate skills that meet industry standards and certification requirements in the use
of system hardware, operating systems technologies, and application systems.
LIST OF EXPERIMENTS IN NETWORKS
1. Implement the data link layer framing methods such as character count, character stuffing
and bit stuffing
2. Implement on a data set of characters the three CRC polynomials CRC-12, CRC-16 and
CRC-CCITT
3. Implement Dijkstra’s algorithm to compute the shortest path through a graph
4. Take an example subnet graph with weights indicating delay between nodes
5. Now obtain the routing table at each node using the distance vector routing algorithm
6. Take an example subnet of hosts. Obtain broadcast tree for
7. Take a 64-bit plain text and encrypt the same using the DES algorithm.
8. Write a program to break the above DES coding
9. Using RSA algorithm Encrypt a text data and Decrypt the same.
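For the shortest-path experiment, a minimal Python sketch of Dijkstra's algorithm with a binary heap is given below. The function name, the adjacency-dictionary representation and the four-node subnet with its delays are illustrative assumptions.

```python
import heapq


def dijkstra(graph, source):
    """Return the least total delay from source to every node.

    graph maps each node to a dict of {neighbour: non-negative edge weight}.
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    heap = [(0, source)]  # (distance so far, node)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry; a shorter path was already found
        for v, weight in graph[u].items():
            candidate = d + weight
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


# Hypothetical subnet: edge weights are link delays.
subnet = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
print(dijkstra(subnet, "A"))
```

The algorithm assumes non-negative weights, which holds for link delays; recording each node's predecessor during relaxation would additionally give the paths themselves.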
LIST OF EXPERIMENTS IN OPERATING SYSTEM
10. Simulate the following CPU scheduling algorithms: a) FCFS b) SJF c) Round Robin d)
Priority
11. Simulate MVT & MFT
12. Simulate all page replacement algorithms: a) FIFO b) LRU c) OPTIMAL
13. Simulate all file organization techniques: a) Single level b) Two level
14. Simulate all file allocation strategies: a) Sequential b) Indexed c) Linked
15. Simulate Banker's Algorithm for Deadlock Avoidance
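A scheduling simulation of the kind asked for in experiment 10 can be sketched as follows. This is a minimal Python illustration that assumes all processes arrive at time 0 and have positive burst times; the function names, burst times and time quantum are hypothetical.

```python
from collections import deque


def fcfs_waiting_times(bursts):
    """Waiting time of each process when served in arrival order."""
    waits = []
    clock = 0
    for burst in bursts:
        waits.append(clock)  # a process waits until all earlier ones finish
        clock += burst
    return waits


def round_robin_waiting_times(bursts, quantum):
    """Waiting time of each process under Round Robin with a fixed quantum."""
    remaining = list(bursts)
    completion = [0] * len(bursts)
    ready = deque(range(len(bursts)))
    clock = 0
    while ready:
        i = ready.popleft()
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            ready.append(i)  # not finished: go to the back of the queue
        else:
            completion[i] = clock
    # With arrival at time 0, waiting time = completion time - burst time.
    return [completion[i] - bursts[i] for i in range(len(bursts))]


bursts = [24, 3, 3]
print(fcfs_waiting_times(bursts), round_robin_waiting_times(bursts, quantum=4))
```

Here the long first job makes FCFS waiting times much larger than Round Robin's, which is the usual motivation for time slicing.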
TOTAL: 60 HOURS
TEXT BOOKS / REFERENCES / WEBSITES:
1. An Introduction to Operating Systems, P.C.P Bhatt, 2nd edition, PHI.
2. Modern Operating Systems, Andrew S Tanenbaum, 3rd Edition, PHI
SEMESTER-III
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC31/ PROBABILITY THEORY
YEAR / SEMESTER II/III
L T P C
3 0 0 3
COURSE OBJECTIVES:
To enable the students to understand the properties and applications of various probability
functions.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Demonstrate random variables and their functions.
2. Infer the expectations for random variable functions and generating functions.
3. Demonstrate various discrete and continuous distributions and their usage.
UNIT-1 ALGEBRA OF PROBABILITY (9HRS)
Algebra of sets - fields and sigma - fields, Inverse function -Measurable function –
Probability measure on a sigma field – simple properties - Probability space - Random
variables and Random vectors – Induced Probability space – Distribution functions –
Decomposition of distribution functions.
UNIT-2 EXPECTATION AND MOMENTS OF RANDOM VARIABLES (9HRS)
Definitions and simple properties - Moment inequalities – Hölder, Jensen Inequalities
– Characteristic function – definition and properties – Inversion formula. Convergence
of a sequence of random variables - convergence in distribution - convergence in
probability, almost sure convergence and convergence in quadratic mean - Weak and
Complete convergence of distribution functions – Helly - Bray theorem.
UNIT-3 LAW OF LARGE NUMBERS (9HRS)
Khinchin's weak law of large numbers, Kolmogorov strong law of large numbers
(statement only) – Central Limit Theorem – Lindeberg – Levy theorem, Lindeberg –
Feller theorem (statement only), Liapounov theorem – Relation between Liapounov
and Lindeberg – Feller forms – Radon Nikodym theorem and derivative (without proof)
– Conditional expectation – definition and simple properties.
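The weak law of large numbers can be observed numerically: the proportion of heads in n fair coin flips settles near 1/2 as n grows. The Python sketch below is a simulation for intuition only, not a proof; the function name and the fixed seed are illustrative assumptions.

```python
import random


def sample_mean_of_coin_flips(n, seed=0):
    """Proportion of heads in n simulated fair coin flips."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return heads / n


for n in (10, 1_000, 100_000):
    mean = sample_mean_of_coin_flips(n)
    print(n, mean, abs(mean - 0.5))
```

The standard deviation of the sample mean is 0.5 divided by the square root of n, so the typical deviation from 1/2 shrinks roughly tenfold for every hundredfold increase in n.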
UNIT-4 DISTRIBUTION THEORY (9HRS)
Distribution of functions of random variables – Laplace, Cauchy, Inverse Gaussian,
Lognormal, Logarithmic series and Power series distributions - Multinomial
distribution - Bivariate Binomial – Bivariate Poisson – Bivariate Normal - Bivariate
Exponential of Marshall and Olkin - Compound, truncated and mixture of
distributions, Concept of convolution - Multivariate normal distribution (Definition
and Concept only)
UNIT-5 SAMPLING DISTRIBUTION (9HRS)
Sampling distributions: Non - central chi - square, t and F distributions and their
properties - Distributions of quadratic forms under normality -independence of
quadratic form and a linear form - Cochran’s theorem.
TOTAL: 45 HOURS
TEXT BOOKS
1. Modern Probability Theory, B.R. Bhat, New Age International, 4th Edition, 2014.
2. An Introduction to Probability and Statistics, V.K. Rohatgi and Saleh, 3rd Edition, 2015.
REFERENCE BOOKS
1. Introduction to the Theory of Statistics, A.M. Mood, F.A. Graybill and D.C. Boes, Tata
McGraw-Hill, 3rd Edition (Reprint), 2017.
2. Order Statistics, H.A. David and H.N. Nagaraja, John Wiley & Sons, 3rd Edition, 2003.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC32/ CLOUD COMPUTING
YEAR / SEMESTER II / III
L T P C
3 0 0 3
COURSE OBJECTIVES:
• To understand the concept of cloud computing.
• To appreciate the evolution of cloud from the existing technologies.
• To have knowledge on the various issues in cloud computing.
• To be familiar with the lead players in cloud.
• To appreciate the emergence of cloud as the next generation computing paradigm.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
• Articulate the main concepts, key technologies, strengths and limitations of
cloud computing.
• Learn the key and enabling technologies that help in the development of cloud.
• Develop the ability to understand and use the architecture of compute and storage cloud,
service and delivery models.
• Explain the core issues of cloud computing such as resource management and security.
• Be able to install and use current cloud technologies.
• Evaluate and choose the appropriate technologies, algorithms and approaches
for implementation and use of cloud.
UNIT I INTRODUCTION (9HRS)
Introduction to Cloud Computing – Definition of Cloud – Evolution of Cloud Computing –
Underlying Principles of Parallel and Distributed Computing – Cloud Characteristics – Elasticity
in Cloud – On-demand Provisioning.
UNIT II CLOUD ENABLING TECHNOLOGIES (9HRS)
Service Oriented Architecture – REST and Systems of Systems – Web Services – Publish
Subscribe Model – Basics of Virtualization – Types of Virtualization – Implementation Levels of
Virtualization – Virtualization Structures – Tools and Mechanisms – Virtualization of CPU –
Memory – I/O Devices –Virtualization Support and Disaster Recovery.
UNIT III CLOUD ARCHITECTURE, SERVICES AND STORAGE (9HRS)
Layered Cloud Architecture Design – NIST Cloud Computing Reference Architecture – Public,
Private and Hybrid Clouds – IaaS – PaaS – SaaS – Architectural Design Challenges – Cloud
Storage – Storage-as-a-Service – Advantages of Cloud Storage – Cloud Storage Providers – S3.
UNIT IV RESOURCE MANAGEMENT AND SECURITY IN CLOUD (9HRS)
Inter Cloud Resource Management – Resource Provisioning and Resource Provisioning Methods
– Global Exchange of Cloud Resources – Security Overview – Cloud Security Challenges –
Software-as-a-Service Security – Security Governance – Virtual Machine Security – IAM –
Security Standards.
UNIT V CLOUD TECHNOLOGIES AND ADVANCEMENTS (9HRS)
Hadoop – MapReduce – VirtualBox – Google App Engine – Programming Environment for
Google App Engine – OpenStack – Federation in the Cloud – Four Levels of Federation –
Federated Services and Applications – Future of Federation.
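The MapReduce programming model named in this unit can be illustrated with the classic word count. The sketch below runs in a single Python process and does not use the Hadoop API; it only mirrors the map, shuffle and reduce stages, and all function names and input documents are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run the three stages in sequence over a list of documents."""
    pairs = [pair for document in documents for pair in map_phase(document)]
    return reduce_phase(shuffle(pairs))


print(word_count(["cloud data cloud", "Data science"]))
```

In a real cluster the map and reduce functions run in parallel on different machines and the framework performs the shuffle; the functions themselves keep the same shape.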
TOTAL: 45 HOURS
REFERENCE BOOKS:
1. Kai Hwang, Geoffrey C. Fox, Jack G. Dongarra, "Distributed and Cloud Computing,
From Parallel Processing to the Internet of Things", Morgan Kaufmann Publishers, 2013.
2. Rittinghouse, John W., and James F. Ransome, “Cloud Computing: Implementation,
Management and Security”, CRC Press, 2017.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC33 / ADVANCED DATABASE
TECHNOLOGIES
YEAR / SEMESTER II / III
L T P C
3 0 0 3
COURSE OBJECTIVES:
• Be familiar with a commercial relational database system (Oracle) by writing SQL using the
system.
• Be familiar with the relational database theory, and be able to write relational algebra
expressions for queries.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Understand the fundamental concepts of Database Management Systems and Entity
Relationship Model and develop ER Models.
2. Build SQL Queries to perform data creation and data manipulation operations on databases.
3. Understand the concepts of functional dependencies, normalization and apply such knowledge
to the normalization of a database.
4. Identify the issues related to Query processing and Transaction management in database
management systems.
5. Analyze the trends in data storage, query processing and concurrency control of modern
database technologies
UNIT I PARALLEL AND DISTRIBUTED DATABASES (9HRS)
Database System Architectures: Centralized and Client-Server Architectures – Server
System Architectures – Parallel Systems- Distributed Systems – Parallel Databases: I/O
Parallelism – Inter and Intra Query Parallelism – Inter and Intra operation Parallelism –
Distributed Database Concepts - Distributed Data Storage – Distributed Transactions –
Commit Protocols – Concurrency Control – Distributed Query Processing – Three Tier
Client Server Architecture- Case Studies.
UNIT II OBJECT AND OBJECT RELATIONAL DATABASES (9HRS)
Concepts for Object Databases: Object Identity – Object structure – Type
Constructors – Encapsulation of Operations – Methods – Persistence – Type and Class
Hierarchies – Inheritance – Complex Objects – Object Database Standards, Languages
and Design: ODMG Model – ODL – OQL – Object Relational and Extended –
Relational Systems : Object Relational features in SQL / Oracle – Case Studies.
UNIT III XML DATABASES (9HRS)
XML Databases: XML Data Model – DTD - XML Schema - XML Querying – Web
Databases – JDBC– Information Retrieval – Data Warehousing – Data Mining.
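To make the XML data model and XML querying topics concrete, the sketch below parses a small document and queries it with Python's standard `xml.etree.ElementTree` module, which supports a limited subset of XPath. The catalogue contents, element names and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue used only for illustration.
CATALOG_XML = """
<library>
  <book id="b1"><title>Database System Concepts</title><year>2019</year></book>
  <book id="b2"><title>Modern Operating Systems</title><year>2003</year></book>
</library>
"""

root = ET.fromstring(CATALOG_XML)


def titles_published_since(root, year):
    """Titles of all books whose <year> is at least the given year."""
    return [
        book.findtext("title")
        for book in root.findall("book")
        if int(book.findtext("year")) >= year
    ]


def title_by_id(root, book_id):
    """Title of the book with the given id attribute, or None if absent."""
    book = root.find(f"./book[@id='{book_id}']")  # attribute predicate (XPath subset)
    return None if book is None else book.findtext("title")


print(titles_published_since(root, 2015), title_by_id(root, "b2"))
```

Full XPath and XQuery offer far richer queries than this subset; a DTD or XML Schema would additionally constrain which elements and attributes such a document may contain.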
UNIT IV MOBILE DATABASES (9HRS)
Mobile Databases: Location and Handoff Management - Effect of Mobility on Data
Management - Location Dependent Data Distribution - Mobile Transaction Models
- Concurrency Control - Transaction Commit Protocols- Mobile Database Recovery
Schemes.
UNIT V INTELLIGENT DATABASES (9HRS)
Active databases – Deductive Databases – Knowledge bases – Multimedia Databases-
Multidimensional Data Structures – Image Databases – Text/Document Databases-
Video Databases– Audio Databases – Multimedia Database Design.
TOTAL: 45 HOURS
TEXT BOOKS
[1]. Henry F. Korth and Abraham Silberschatz, “Database System Concepts”, McGraw
Hill, 2019.
[2]. Thomas Connolly and Carolyn Begg, “Database Systems: A Practical Approach to
Design, Implementation and Management”, Third Edition, Pearson Education, 2001.
[3]. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, 2nd Edition,
John Wiley & Sons, Inc., New York, USA, 2002.
REFERENCE BOOKS
[1]. Lior Rokach and Oded Maimon, Data Mining and Knowledge Discovery Handbook,
Springer, 2nd Edition, 2010.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC34/ WEB PROGRAMMING
YEAR / SEMESTER II / III
L T P C
3 0 0 3
COURSE OBJECTIVES:
The students should be able to
1. Understand the technologies used in Web Programming.
2. Know the importance of object oriented aspects of Scripting.
3. Understand creating database connectivity using JDBC.
4. Learn the concepts of web based application using sockets.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Design web pages.
2. Use technologies of Web Programming.
3. Apply object oriented aspects to Scripting.
4. Create databases with connectivity using JDBC.
5. Build web based application using sockets.
UNIT I SCRIPTING (9HRS)
Web page designing using HTML, Scripting basics - Client side and server side scripting. JavaScript -
Objects, names, literals, operators and expressions - statements and features - events -
windows - documents - frames - data types - built-in functions - Browser object model -
Verifying forms - HTML5 - CSS3 - HTML5 canvas - Web site creation using tools.
UNIT II JAVA (9HRS)
Introduction to object oriented programming-Features of Java – Data types, variables and arrays
–Operators – Control statements – Classes and Methods – Inheritance. Packages and Interfaces
–Exception Handling – Multithreaded Programming – Input/Output – Files – Utility Classes –
String Handling.
UNIT III JDBC (9HRS)
JDBC Overview – JDBC implementation – Connection class – Statements - Catching Database
Results, handling database Queries. Networking– InetAddress class – URL class- TCP sockets
– UDP sockets, Java Beans –RMI.
UNIT IV APPLETS (9HRS)
Java applets- Life cycle of an applet – Adding images to an applet – Adding sound to an applet.
Passing parameters to an applet. Event Handling. Introducing AWT: Working with Windows
Graphics and Text. Using AWT Controls, Layout Managers and Menus. Servlet – life cycle of a
servlet. The Servlet API, Handling HTTP Request and Response, using Cookies, Session
Tracking. Introduction to JSP.
UNIT V XML AND WEB SERVICES (9HRS)
Xml – Introduction-Form Navigation-XML Documents- XSL – XSLT- Web services-UDDI-
WSDL-Java web services – Web resources.
TOTAL :45 HOURS
TEXT BOOKS:
1. Harvey Deitel, Abbey Deitel, Internet and World Wide Web: How To Program, 5th Edition.
2. Herbert Schildt, Java - The Complete Reference, 7th Edition, Tata McGraw-Hill Edition.
3. Michael Morrison, XML Unleashed, Techmedia SAMS.
REFERENCE BOOKS:
1. John Pollock, JavaScript - A Beginner's Guide, 3rd Edition, Tata McGraw-Hill Edition.
2. Keyur Shah, Gateway to Java Programmer Sun Certification, Tata McGraw Hill, 2002.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC35/ DATA MINING
YEAR / SEMESTER II / III
L T P C
3 0 0 3
COURSE OBJECTIVES:
• To understand data mining principles and techniques and introduce DM as a cutting-edge
business intelligence tool
• To expose the students to the concepts of data warehousing architecture and
implementation
• To study the overview of developing areas – Web mining, Text mining and ethical
aspects of Data mining
• To identify business applications and trends of data mining
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Evolve Multidimensional Intelligent model from typical system
2. Discover the knowledge imbibed in the high dimensional system
3. Evaluate various mining techniques on complex data objects
UNIT I INTRODUCTION TO DATA MINING (9HRS)
Data mining – KDD versus data mining, Stages of the Data Mining Process – task primitives,
Data Mining Techniques -Data mining knowledge representation – Data mining query
languages, Integration of a Data Mining System with a Data Warehouse – Issues, Data
preprocessing – Data cleaning, Data transformation, Feature selection, Dimensionality
reduction, Discretization and generating concept hierarchies-Mining frequent patterns-
association-correlation
UNIT II CLASSIFICATION AND CLUSTERING (9HRS)
Decision Tree Induction - Bayesian Classification – Rule Based Classification – Classification
by Back propagation – Support Vector Machines – Associative Classification – Lazy Learners
– Other Classification Methods – Clustering techniques – Partitioning methods – k-means –
Hierarchical Methods – distance-based agglomerative and divisive clustering, Density-Based
Methods – expectation maximization – Grid-Based Methods – Model-Based Clustering
Methods – Constraint-Based Cluster Analysis – Outlier Analysis
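As an illustration of the k-means partitioning method listed in this unit, the following is a minimal Python sketch of the assign-then-update loop (Lloyd's algorithm) on one-dimensional data. The function name `kmeans_1d`, the sample points and the starting centroids are assumptions chosen for the example, not material from the prescribed texts.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data; returns (centroids, clusters)."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # no centroid moved, so the assignment is stable
        centroids = updated
    return centroids, clusters


# Hypothetical sample data with two obvious groups.
centroids, clusters = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], centroids=[1.0, 2.0])
print(centroids, clusters)
```

The result depends on the starting centroids; a poor start can leave k-means in a worse local optimum, which is why several restarts are commonly used in practice.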
UNIT III PREDICTIVE MODELING OF BIG DATA AND TRENDS IN
DATA MINING (9HRS)
Statistics and Data Analysis – EDA – Small and Big Data –Logistic Regression Model -
Ordinary Regression Model-Mining complex data objects – Spatial databases – Temporal
databases – Multimedia databases – Time series and sequence data – Text mining – Web
mining – Applications in Data mining
UNIT IV INTRODUCTION TO DATA WAREHOUSING (9HRS)
Evolution of Decision Support Systems- Data warehousing Components – Building a Data
warehouse, Data Warehouse and DBMS, Data marts, Metadata, Multidimensional data model,
OLAP vs OLTP, OLAP operations, Data cubes, Schemas for Multidimensional Database:
Stars, Snowflakes and Fact constellations
UNIT V DATA WAREHOUSE PROCESS AND ARCHITECTURE (9HRS)
Types of OLAP servers, 3–Tier data warehouse architecture, distributed and virtual data
warehouses. Data warehouse implementation, tuning and testing of data warehouse. Data
Staging (ETL) Design and Development, data warehouse visualization, Data Warehouse
Deployment, Maintenance, Growth, Business Intelligence Overview- Data Warehousing and
Business Intelligence Trends - Business Applications- tools-SAS
TOTAL : 45 HOURS
TEXT BOOKS:
1. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan
Kaufmann Publishers, third edition 2011, ISBN: 1558604898.
2. Alex Berson and Stephen J. Smith, “ Data Warehousing, Data Mining & OLAP”, Tata
McGraw Hill Edition, Tenth Reprint 2007.
3. G. K. Gupta, “Introduction to Data Mining with Case Studies”, Eastern Economy
Edition, Prentice Hall of India, 2006.
4. Ian H. Witten, Eibe Frank and Mark A. Hall, Data Mining: Practical Machine Learning
Tools and Techniques, Third edition (The Morgan Kaufmann Series in Data Management
Systems), 2011.
5. Statistical and Machine-Learning Data Mining: Techniques for Better Predictive
Modeling and Analysis of Big Data
REFERENCE BOOKS:
1. Mehmed Kantardzic, “Data Mining: Concepts, Models, Methods, and Algorithms”, Wiley
Interscience, 2003.
2. Ian Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques,
third edition, Morgan Kaufmann, 2011.
3. George M Marakas, Modern Data Warehousing, Mining and Visualization, Prentice Hall,
2003.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC36/ OPERATION RESEARCH
YEAR / SEMESTER II / III
L T P C
3 0 0 3
COURSE OBJECTIVES:
• To provide basic knowledge of computer operating system structures and functioning
• To study about process management
• To learn the basics of memory management
• To understand the structure of file and I/O systems
• To be familiar with some operating systems
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Implement the various system calls and inter-process communication
2. Apply various processor scheduling algorithms and handle process synchronization problems
3. Apply various memory management techniques to a given situation
4. Apply various disk management techniques
5. Apply various processor scheduling algorithms and memory management techniques for
popular operating systems – Linux, Windows and Android
UNIT I OPERATING SYSTEMS OVERVIEW (9HRS)
Introduction to operating systems – Computer system organization - architecture – Operating
system structure - operations – Process, memory, storage management – Protection and security
– Distributed systems – Computing environments – Open source operating systems – OS
services – User interface – System calls – System programs – Process concept - scheduling –
Operations on processes – Cooperating processes – Inter-process communication – Threads.
UNIT II PROCESS MANAGEMENT (9HRS)
Basic concepts – Scheduling criteria – Scheduling algorithms – Multiple processor scheduling –
Algorithm evaluation – The critical section problem – Synchronization hardware – Semaphores –
Classic problems of synchronization – Critical regions – Monitors – Deadlocks – Deadlock
characterization – Methods for handling deadlocks – Deadlock prevention – Deadlock avoidance
– Deadlock detection – Recovery from deadlock.
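To make the scheduling criteria in this unit concrete, here is a small Python sketch that computes per-process waiting times under first-come, first-served order and compares the average with shortest-job-first order. It assumes all processes arrive at time 0 and are non-preemptive; the function names and burst times are illustrative assumptions.

```python
def fcfs_waiting_times(burst_times):
    """Waiting time of each process when all arrive at time 0 and run in list order."""
    waits, clock = [], 0
    for burst in burst_times:
        waits.append(clock)  # a process waits until everything before it finishes
        clock += burst
    return waits


def average(values):
    return sum(values) / len(values)


bursts = [24, 3, 3]                              # hypothetical CPU bursts in ms
fcfs_waits = fcfs_waiting_times(bursts)          # run in arrival order
sjf_waits = fcfs_waiting_times(sorted(bursts))   # shortest job first, same arrival time
print(average(fcfs_waits), average(sjf_waits))
```

Running the long job first makes both short jobs wait behind it, which is why the shortest-job-first ordering gives the lower average waiting time here.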
UNIT III MEMORY MANAGEMENT (9HRS)
Memory management – Swapping – Contiguous memory allocation – Paging – Segmentation-
Segmentation with paging – Virtual memory - Demand paging – Copy on write – Page
replacement – Allocation of frames – Thrashing
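The page replacement topic can be illustrated by counting page faults for a reference string. The sketch below simulates the least-recently-used (LRU) policy with a fixed number of frames; `lru_page_faults` is a hypothetical helper name, and LRU is only one of the policies such a unit typically covers.

```python
from collections import OrderedDict


def lru_page_faults(reference_string, frame_count):
    """Count page faults under LRU replacement with `frame_count` frames."""
    frames = OrderedDict()  # keys ordered from least to most recently used
    faults = 0
    for page in reference_string:
        if page in frames:
            frames.move_to_end(page)        # hit: mark as most recently used
        else:
            faults += 1
            if len(frames) == frame_count:
                frames.popitem(last=False)  # evict the least recently used page
            frames[page] = True
    return faults


reference = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(lru_page_faults(reference, frame_count=3))
```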
UNIT IV FILE AND I/O SYSTEMS (9HRS)
File concept – Access methods – Directory structure – File-system mounting –Protection –
Directory implementation – Allocation methods – Free space management – Disk scheduling –
Disk management – Swap space management – Protection - I/O Systems – I/O Hardware –
Application I/O Interface – Kernel I/O subsystem
UNIT V CASE STUDY (9HRS)
Linux system – History – Design principles – Kernel modules – Process management –
Scheduling – Memory management – File systems – Input and output – Inter Process
Communication – Network structure – Security - Windows 8 – History – Design principles -
Android OS - History – Design principles.
TOTAL: 45 HOURS
TEXT BOOKS:
1. Abraham Silberschatz, Peter B. Galvin, Greg Gagne, “Operating System Concepts Essentials”,
John Wiley & Sons Inc., Ninth Edition, 2013.
2. Reto Meier, “Professional Android 4 Application Development”, John Wiley & Sons, 2012.
REFERENCE BOOKS:
1. Andrew S. Tanenbaum, “Modern Operating Systems”, Pearson Education, Fourth Edition,
2015.
2. Charles Crowley, “Operating Systems: A Design-Oriented Approach”, Tata McGraw Hill
Education, 2017.
3. D M Dhamdhere, “Operating Systems: A Concept-based Approach”, Tata McGraw-Hill
Education, Third Edition, 2012.
4. William Stallings, “Operating Systems: Internals and Design Principles”, Prentice Hall,
Seventh Edition, 2011.
Extensive Reading:
1. http://nptel.ac.in
2. http://nptel.ac.in/downloads/106108055/
3. http://cseweb.ucsd.edu/classes/fa06/cse120/lectures/120-fa06-l13.pdf
4. http://www.cs.kent.edu/~farrell/osf03/oldnotes/
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC3PA / DATA MINING LAB
YEAR / SEMESTER II/ III
L T P C
0 0 4 2
COURSE OBJECTIVES:
The students should be able to
1. Be familiar with the algorithms of data mining,
2. Be acquainted with the tools and techniques used for Knowledge Discovery in Databases.
3. Be exposed to web mining and text mining
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
• Apply data mining techniques and methods to large data sets.
• Use data mining tools.
• Compare and contrast the various classifiers.
LIST OF EXPERIMENTS:
1. Creation of a Data Warehouse.
2. Apriori Algorithm.
3. FP-Growth Algorithm.
4. K-means clustering.
5. One Hierarchical clustering algorithm.
6. Bayesian Classification.
7. Decision Tree.
8. Support Vector Machines.
9. Applications of classification for web mining.
10. Case Study on Text Mining or any commercial application.
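For the Apriori experiment in the list above, a compact Python sketch of the level-wise search may help: count candidate itemsets, keep those that meet the minimum support count, then join and prune to form the next level. The function name, the grocery transactions and the support threshold are assumptions made for illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


# Hypothetical market-basket data.
baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
frequent = apriori(baskets, min_support=2)
for itemset, count in frequent.items():
    print(sorted(itemset), count)
```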
TOTAL : 60 HOURS
TEXT BOOKS:
1. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan
Kaufmann Publishers, third edition 2011, ISBN: 1558604898.
2. Alex Berson and Stephen J. Smith, “ Data Warehousing, Data Mining & OLAP”, Tata
McGraw Hill Edition, Tenth Reprint 2007.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC3PB / CLOUD COMPUTING AND WEB
PROGRAMMING LAB
YEAR / SEMESTER II / III
L T P C
0 0 4 2
COURSE OBJECTIVES:
• To develop web applications in cloud
• To learn the design and development process involved in creating a cloud based
application
• To learn to implement and use parallel programming using Hadoop
• To learn to implement embedded devices in IoT
• To design and develop IoT Devices
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
• Design and deploy a web application in a PaaS environment.
• Learn how to simulate a cloud environment to implement new schedulers.
• Install and use a generic cloud environment that can be used as a private cloud.
• Make use of Cloud platform to upload and analyse any sensor data.
• Use cascading style sheets to design web pages.
LIST OF EXPERIMENTS:
CLOUD COMPUTING
1. Install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS
on top of Windows 7 or 8.
2. Install a C compiler in the virtual machine created using VirtualBox and execute simple
programs.
3. Install Google App Engine. Create a hello world app and other simple web applications
using Python/Java.
4. Use GAE launcher to launch the web applications.
5. Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not
present in CloudSim.
6. Find a procedure to transfer the files from one virtual machine to another virtual
machine.
7. Find a procedure to launch a virtual machine using TryStack (online OpenStack demo
version).
8. Install Hadoop single node cluster and run simple applications like wordcount.
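The word count application named in experiment 8 follows the map, shuffle and reduce pattern. The sketch below imitates those three phases in plain Python, in a single process, so the data flow can be studied before running it on a cluster. It does not use the Hadoop API, and all function names are assumptions for this illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data big ideas", "data wins"])
print(counts)
```

On a real cluster the mapper and reducer run on different nodes and the shuffle is performed by the framework; only the two small functions would need to be supplied.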
WEB PROGRAMMING
1. Develop and demonstrate an XHTML file that includes JavaScript for the following
problems:
a) Input: A number n obtained using prompt. Output: The first n Fibonacci numbers.
b) Input: A number n obtained using prompt. Output: A table of numbers from 1 to n and their
squares, using alert.
2. a) Develop and demonstrate, using JavaScript, an XHTML document that collects the
USN (the valid format is: a digit from 1 to 4 followed by two upper-case characters followed
by two digits followed by two upper-case characters followed by three digits; no embedded
spaces allowed) of the user. An event handler must be included for the form element that collects
this information to validate the input. Messages in the alert windows must be produced when
errors are detected.
b) Modify the above program to get the current semester also (restricted to
be a number from 1 to 8).
3. a) Develop and demonstrate, using JavaScript, an XHTML document that contains three
short paragraphs of text, stacked on top of each other, with only enough of each showing so that
the mouse cursor can be placed over some part of them. When the cursor is placed over the
exposed part of any paragraph, it should rise to the top to become completely visible.
b) Modify the above document so that when a paragraph is moved from the top stacking
position, it returns to its original position rather than to the bottom.
4. a) Design an XML document to store information about a student in an engineering college
affiliated to VTU. The information must include USN, Name, Name of the College, Branch,
Year of Joining, and e-mail id. Make up sample data for 3 students. Create a CSS style sheet
and use it to display the document.
b) Create an XSLT style sheet for one student element of the
above document and use it to create a display of that element.
5. a) Write a Perl program to display various server information such as Server Name, Server
Software, Server Protocol, CGI Revision etc.
b) Write a Perl program to accept a UNIX
command from an HTML form and to display the output of the command executed.
6. a) Write a Perl program to accept the User Name and display a greeting message randomly
chosen from a list of 4 greeting messages.
b) Write a Perl program to keep track of the number of visitors visiting the web page and to
display this count of visitors, with proper headings.
7. Write a Perl program to display a digital clock which displays the current time of the server.
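The USN format stated in experiment 2 maps directly onto a regular expression. The experiment itself asks for JavaScript; the sketch below only illustrates the validation rule, written in Python, and the names `USN_PATTERN`, `is_valid_usn` and `is_valid_semester` are assumptions for the example. The same pattern text should carry over to a JavaScript `RegExp` once anchored with `^` and `$`.

```python
import re

# One digit 1-4, two upper-case letters, two digits, two upper-case letters, three digits.
USN_PATTERN = re.compile(r"[1-4][A-Z]{2}[0-9]{2}[A-Z]{2}[0-9]{3}")
SEMESTER_PATTERN = re.compile(r"[1-8]")


def is_valid_usn(text):
    """True only if the whole string matches the USN format (no embedded spaces)."""
    return USN_PATTERN.fullmatch(text) is not None


def is_valid_semester(text):
    """True only if the string is a single digit from 1 to 8."""
    return SEMESTER_PATTERN.fullmatch(text) is not None


for candidate in ["1AB12CD345", "5AB12CD345", "1ab12cd345", "1AB 12CD345"]:
    print(candidate, is_valid_usn(candidate))
```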
TOTAL: 60 HOURS
REFERENCE BOOKS:
1.Kai Hwang, Geoffrey C. Fox, Jack G. Dongarra, "Distributed and Cloud Computing,
2. Harvey Deitel, Abbey Deitel, Internet and World Wide Web: How To Program, 5th Edition.
SEMESTER-IV
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC41/ DATA HANDLING AND
VISUALIZATION
YEAR / SEMESTER II / IV
L T P C
3 0 0 3
COURSE OBJECTIVE:
The course is designed to enable students to know the basics of data visualization and
understand the importance of data visualization and the design and use of visual components
and basic algorithms.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Understand basics of Data Visualization
2. Implement visualization of distributions
3. Write programs on visualization of time series, proportions & associations
4. Apply visualization on trends and uncertainty
5. Explain principles of proportions
UNIT 1: INTRODUCTION TO VISUALIZATION (9HRS)
Visualizing Data-Mapping Data onto Aesthetics, Aesthetics and Types of Data, Scales Map
Data Values onto Aesthetics, Coordinate Systems and Axes- Cartesian Coordinates, Nonlinear
Axes, Coordinate Systems with Curved Axes, Color Scales-Color as a Tool to Distinguish,
Color to Represent Data Values ,Color as a Tool to Highlight, Directory of Visualizations-
Amounts, Distributions, Proportions, x–y relationships, Geospatial Data
UNIT 2: VISUALIZING DISTRIBUTIONS (9HRS)
Visualizing Amounts-Bar Plots, Grouped and Stacked Bars, Dot Plots and Heatmaps,
Visualizing Distributions: Histograms and Density Plots- Visualizing a Single Distribution,
Visualizing Multiple Distributions at the Same Time, Visualizing Distributions: Empirical
Cumulative Distribution Functions and Q-Q Plots-Empirical Cumulative Distribution Functions,
Highly Skewed Distributions, Quantile- Quantile Plots, Visualizing Many Distributions at
Once-Visualizing Distributions Along the Vertical Axis, Visualizing Distributions Along the
Horizontal Axis
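Behind an empirical cumulative distribution plot is a simple computation: sort the observations and record, for each distinct value x, the fraction of observations less than or equal to x. A minimal Python sketch follows; the function name `ecdf` and the sample values are assumptions, and plotting is left to whichever library the course uses.

```python
def ecdf(values):
    """Return sorted (x, F(x)) pairs, where F(x) is the fraction of values <= x."""
    ordered = sorted(values)
    n = len(ordered)
    points = []
    for rank, x in enumerate(ordered, start=1):
        if points and points[-1][0] == x:
            points[-1] = (x, rank / n)  # tied values: keep the highest rank
        else:
            points.append((x, rank / n))
    return points


points = ecdf([3, 1, 2, 2])
print(points)
```

Drawing these pairs as a step function gives the ECDF plot; unlike a histogram, it needs no choice of bin width.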
UNIT 3: VISUALIZING ASSOCIATIONS & TIME SERIES (9HRS)
Visualizing Proportions-A Case for Pie Charts, A Case for Side-by-Side Bars, A Case for
Stacked Bars and Stacked Densities, Visualizing Proportions Separately as Parts of the Total
,Visualizing Nested Proportions- Nested Proportions Gone Wrong, Mosaic Plots and Treemaps,
Nested Pies ,Parallel Sets. Visualizing Associations Among Two or More Quantitative
Variables-Scatterplots, Correlograms, Dimension Reduction, Paired Data. Visualizing Time
Series and Other Functions of an Independent Variable-Individual Time Series , Multiple Time
Series and Dose–Response Curves, Time Series of Two or More Response Variables
UNIT 4: VISUALIZING UNCERTAINTY (9HRS)
Visualizing Trends-Smoothing, Showing Trends with a Defined Functional Form, Detrending
and Time-Series Decomposition, Visualizing Geospatial Data-Projections, Layers, Choropleth
Mapping, Cartograms, Visualizing Uncertainty-Framing Probabilities as Frequencies,
Visualizing the Uncertainty of Point Estimates, Visualizing the Uncertainty of Curve Fits,
Hypothetical Outcome Plots
UNIT 5: PRINCIPLE OF PROPORTIONAL INK (9HRS)
The Principle of Proportional Ink-Visualizations Along Linear Axes, Visualizations Along
Logarithmic Axes, Direct Area Visualizations, Handling Overlapping Points-Partial
Transparency and Jittering, 2D Histograms, Contour Lines, Common Pitfalls of Color Use-
Encoding Too Much or Irrelevant Information ,Using Nonmonotonic Color Scales to Encode
Data Values, Not Designing for Color-Vision Deficiency.
TOTAL: 45 HOURS
TEXT BOOKS
1.Claus Wilke, “Fundamentals of Data Visualization: A Primer on Making Informative
and Compelling Figures”, 1st edition, O’Reilly Media Inc, 2019.
REFERENCE BOOKS
1. Tony Fischetti, Brett Lantz, R: Data Analysis and Visualization, O’Reilly, 2016
2. Ossama Embarak, Data Analysis and Visualization Using Python: Analyze Data to
Create Visualizations for BI Systems, Apress, 2018
E BOOKS
1. https://www.netquest.com/hubfs/docs/ebook-data-visualization-EN.pdf
MOOC
1. https://www.coursera.org/learn/data-visualization
2. https://www.coursera.org/learn/python-for-data-visualization#syllabus
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC42/ MACHINE LEARNING
YEAR / SEMESTER II / IV
L T P C
3 0 0 3
COURSE OBJECTIVES:
The objective of this course is to provide introduction to the principles and design of machine
learning algorithms. The course is aimed at providing foundations for conceptual aspects of
machine learning algorithms along with their applications to solve real world problems.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Identify various machine learning algorithms and terminologies and perform data
pre-processing using standard ML library.
2. Design a predictive model using appropriate supervised learning algorithms to solve
any given problem.
3. Develop an application using appropriate unsupervised learning algorithms for
performing clustering and dimensionality reduction.
4. Solve complex problems using artificial neural networks and kernel machines.
5. Implement probabilistic graphical models for suitable applications.
UNIT-I INTRODUCTION TO ARTIFICIAL NEURAL NETWORKS (9HRS)
Fundamentals Of Neural Networks – Model of Artificial Neuron – Neural Network Architectures
– Learning Methods – Taxonomy Of Neural Network Architectures – Applications
UNIT II FEED FORWARD NEURAL NETWORKS (9HRS)
Perceptron Models: Discrete, Continuous and Multi-Category –Training Algorithms: Discrete
and Continuous Perceptron Networks – Limitations of the Perceptron – Model. Credit
Assignment Problem – Generalized Delta Rule, Derivation of Back propagation (BP) Training,
and Summary of Back propagation Algorithm –Kolmogorov Theorem
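The discrete perceptron training rule in this unit can be sketched in a few lines of Python: compute the thresholded output, and whenever it differs from the target, move the weights by the error times the input. The AND-gate data, the learning rate of 1.0 and the function names are assumptions chosen so the example converges quickly; a single perceptron cannot learn functions that are not linearly separable, such as XOR.

```python
def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """Train a single discrete perceptron; samples are (inputs, target) with targets 0 or 1."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation > 0 else 0
            delta = target - output
            if delta != 0:
                errors += 1
                # Perceptron rule: shift the weights towards the misclassified target.
                weights = [w + learning_rate * delta * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * delta
        if errors == 0:
            break  # every sample is classified correctly
    return weights, bias


def predict(weights, bias, inputs):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0


AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_GATE)
print([predict(weights, bias, inputs) for inputs, _ in AND_GATE])
```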
UNIT III: MACHINE LEARNING (9HRS)
Machine Learning Fundamentals –Types of Machine Learning - Supervised, Unsupervised,
Reinforcement- The Machine Learning process. Terminologies in ML- Testing ML algorithms:
Overfitting, Training, Testing and Validation Sets- Confusion matrix -Accuracy metrics- ROC
Curve- Basic Statistics: Averages, Variance and Covariance, The Gaussian- The Bias-Variance
trade off- Applications of Machine Learning.
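The confusion matrix and the accuracy metrics named in this unit reduce to four counts for a binary classifier. The following Python sketch derives accuracy, precision and recall from those counts; the label lists are made-up sample data and the function names are assumptions.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, fn, tn) for binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(actual, predicted, positive=1):
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


actual = [1, 1, 1, 0, 0, 0, 1, 0]      # hypothetical ground-truth labels
predicted = [1, 0, 1, 0, 1, 0, 1, 0]   # hypothetical classifier output
print(confusion_counts(actual, predicted))
print(classification_metrics(actual, predicted))
```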
UNIT IV: SUPERVISED LEARNING (9HRS)
Regression: Linear Regression – Multivariate Regression- Classification: Linear Discriminant
Analysis, Logistic Regression- K-Nearest Neighbor classifier. Decision Tree based methods for
classification and Regression- Ensemble methods.
UNIT V: UNSUPERVISED LEARNING (9HRS)
Clustering- K-Means clustering, Hierarchical clustering - The Curse of Dimensionality -
Dimensionality Reduction - Principal Component Analysis - Probabilistic PCA- Independent
Components analysis
TOTAL: 45 HOURS
TEXT BOOKS
1. Charu C. Aggarwal, “Neural Networks and Deep Learning”, Springer International Publishing,
2018.
2. Satish Kumar, “Neural Networks, A Classroom Approach”, Tata McGraw-Hill, 2007.
3. Kevin P. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.
4. Stephen Marsland, “Machine Learning – An Algorithmic Perspective”, CRC Press, 2009.
5. Saikat Dutt, Subramanian Chandramouli, Amit Kumar Das, “Machine Learning”,
Pearson Education, 2018.
6. Christopher Bishop, “Pattern Recognition and Machine Learning”, Springer, 2011.
REFERENCE BOOKS
1. Andreas C. Muller, “Introduction to Machine Learning with Python: A Guide for Data
Scientists”, O'Reilly, 2016.
2. Sebastian Raschka, “Python Machine Learning”, Packt Publishing, 2015.
3. Hastie, Tibshirani, Friedman, “The Elements of Statistical Learning: Data Mining,
Inference, and Prediction”, 2nd Edition, Springer, 2017.
4. Ethem Alpaydin, “Introduction to Machine Learning”, 2nd Revised edition, MIT
Press, 2010.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC44/ OPTIMIZATION TECHNIQUES
YEAR / SEMESTER II / IV
L T P C
3 0 0 3
COURSE OBJECTIVES:
To impart knowledge on various categories of existing engineering problems and solutions to
such problems through different optimization techniques and approaches
COURSE OUTCOMES:
At the end of the course, the students should be able to:
1. Relate key concepts and applications of various optimization techniques
2. Identify the appropriate optimization technique for the given problem
3. Formulate appropriate objective functions and constraints to solve real life optimization
problems
UNIT I INTRODUCTION (9HRS)
Statement of an optimization problem – classification of optimization problems –
classical optimization techniques: single-variable optimization, multivariable
optimization with equality constraints, inequality constraints, and no constraints.
UNIT II LINEAR PROGRAMMING (9HRS)
Graphical method for two-dimensional problems – central problems of Linear Programming –
Definitions – Simplex Algorithm – Phase I and II of Simplex Method – Revised Simplex
Method. Simplex Multipliers – Dual and Primal – Dual Simplex Method – Sensitivity Analysis
– Transportation problem and its solution – Assignment problem and its solution by
Hungarian method – Karmarkar’s method – statement,
Conversion of the Linear Programming problem into the required form, Algorithm.
UNIT III NON LINEAR PROGRAMMING (9HRS)
NON LINEAR PROGRAMMING (ONE DIMENSIONAL MINIMIZATION): Introduction –
Unrestricted search – Exhaustive search – Interval halving method – Fibonacci method. NON
LINEAR PROGRAMMING (UNCONSTRAINED OPTIMIZATION): Introduction
– Random search method – Univariate method – Pattern search methods – Hooke and Jeeves
method, simplex method – Gradient of a function – Steepest descent method – Conjugate
gradient method. NON LINEAR PROGRAMMING (CONSTRAINED OPTIMIZATION):
Introduction – Characteristics of the problem – Random search method – Conjugate gradient
method.
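Among the one-dimensional minimization methods listed in this unit, interval halving is short enough to sketch in full. Each pass evaluates the function at the quarter points of the current interval and discards half of it. The Python version below assumes the function is unimodal on the starting interval; the function name, tolerance and test function are illustrative assumptions.

```python
def interval_halving(f, a, b, tolerance=1e-6):
    """Minimise a unimodal function f on [a, b]; returns the midpoint of the final interval."""
    x0 = (a + b) / 2
    f0 = f(x0)
    while b - a > tolerance:
        quarter = (b - a) / 4
        x1, x2 = a + quarter, b - quarter
        f1, f2 = f(x1), f(x2)
        if f1 < f0:
            # Minimum lies in [a, x0]; x1 is the midpoint of that half.
            b, x0, f0 = x0, x1, f1
        elif f2 < f0:
            # Minimum lies in [x0, b]; x2 is the midpoint of that half.
            a, x0, f0 = x0, x2, f2
        else:
            # Minimum lies in [x1, x2]; x0 stays the midpoint.
            a, b = x1, x2
    return x0


x_min = interval_halving(lambda x: (x - 2.0) ** 2, a=0.0, b=5.0)
print(round(x_min, 4))
```

Every pass halves the interval at the cost of two new function evaluations, which makes the method easy to reason about, though the Fibonacci method achieves a similar reduction with fewer evaluations.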
UNIT IV DYNAMIC PROGRAMMING (9HRS)
Introduction – multistage decision processes – Principles of optimality – Computation
procedures.
UNIT V DECISION MAKING (9HRS)
Decisions under uncertainty, under certainty and under risk – Decision trees – Expected value
of perfect information and imperfect information.
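The expected value of perfect information (EVPI) mentioned in this unit is the gap between the expected payoff when the state of nature is known before choosing and the best expected payoff without that knowledge. A small Python sketch under assumed data follows; the payoff table, probabilities and function names are hypothetical.

```python
def expected_values(payoffs, probabilities):
    """Expected monetary value of each action, given state probabilities."""
    return {action: sum(p * v for p, v in zip(probabilities, row))
            for action, row in payoffs.items()}


def evpi(payoffs, probabilities):
    """Expected payoff with perfect information minus the best expected payoff without it."""
    best_without_info = max(expected_values(payoffs, probabilities).values())
    expected_with_info = sum(
        probabilities[state] * max(row[state] for row in payoffs.values())
        for state in range(len(probabilities))
    )
    return expected_with_info - best_without_info


# Rows are actions; columns are payoffs under (favourable, unfavourable) market states.
payoffs = {
    "large plant": [200, -180],
    "small plant": [100, -20],
    "do nothing": [0, 0],
}
probabilities = [0.5, 0.5]
print(expected_values(payoffs, probabilities))
print(evpi(payoffs, probabilities))
```

With these numbers the small plant is the best choice without further information, and the EVPI is an upper bound on what a decision maker should pay for any forecast of the market state.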
TOTAL: 45 HOURS
TEXT BOOKS:
1. Kalyanmoy Deb, “Optimization for Engineering Design: Algorithms and
Examples”, Prentice Hall, 2012.
2. Hamdy A Taha, “Operations Research – An introduction”, Pearson Education ,2017
REFERENCE BOOKS:
1. Hillier / Lieberman, “Introduction to Operations Research”, Tata McGraw Hill
Publishing company Ltd, 2002.
2. Singiresu S Rao, “Engineering optimization Theory and Practice”, New Age
International, 1996.
3. Mik Wisniewski, “Quantitative Methods for Decision Makers”, Macmillan Press
Ltd., 1994.
4. Kambo N S, “Mathematical Programming Techniques”, Affiliated East – West press, 1991.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC45/ BIG DATA ANALYTICS
YEAR / SEMESTER II/ IV
L T P C
3 0 0 3
COURSE OBJECTIVES:
To optimize business decisions and create competitive advantage with Big Data analytics
• To explore the fundamental concepts of big data analytics.
• To learn to analyze big data using intelligent techniques.
• To understand the various search methods and visualization techniques.
• To learn to use various techniques for mining data streams.
• To understand the applications using MapReduce concepts.
• To introduce the programming tools Pig and Hive in the Hadoop ecosystem.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Work with a big data platform and explore big data analytics techniques for business
applications.
2. Design efficient algorithms for mining the data from large volumes.
3. Analyze the HADOOP and Map Reduce technologies associated with big data analytics.
4. Explore on Big Data applications Using Pig and Hive.
5. Understand the fundamentals of various big data analytics techniques.
6. Build a complete business data analytics solution
UNIT-I INTRODUCTION TO BIG DATA (9HRS)
Introduction to Big Data Platform – Challenges of Conventional Systems - Intelligent data
analysis – Nature of Data - Analytic Processes and Tools - Analysis vs Reporting.
UNIT-II MINING DATA STREAMS (9HRS)
Introduction To Streams Concepts – Stream Data Model and Architecture - Stream Computing -
Sampling Data in a Stream – Filtering Streams – Counting Distinct Elements in a Stream –
Estimating Moments – Counting Ones in a Window – Decaying Window - Real time
Analytics Platform(RTAP) Applications - Case Studies - Real Time Sentiment Analysis- Stock
Market Predictions.
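A common technique for the "Filtering Streams" topic is the Bloom filter, which tests set membership in fixed memory and may report false positives but never false negatives. The Python sketch below is a teaching-sized version; the class name, the bit-array size and the use of seeded SHA-256 hashes are assumptions made for the example rather than a prescribed design.

```python
import hashlib


class BloomFilter:
    """Fixed-size membership filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, hash_count=3):
        self.size_bits = size_bits
        self.hash_count = hash_count
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive several hash positions by prefixing the item with a seed.
        for seed in range(self.hash_count):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for position in self._positions(item):
            self.bits[position] = 1

    def might_contain(self, item):
        return all(self.bits[position] for position in self._positions(item))


allowed = BloomFilter()
for address in ["alice@example.com", "bob@example.com"]:
    allowed.add(address)
print(allowed.might_contain("alice@example.com"))
```

An item that was added is always reported as present; an item that was never added is usually, but not always, reported as absent, and the false-positive rate falls as the bit array grows.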
UNIT-III HADOOP (9HRS)
History of Hadoop – The Hadoop Distributed File System – Components of Hadoop – Analysing
the Data with Hadoop – Scaling Out – Hadoop Streaming – Design of HDFS – Java interfaces to
HDFS Basics – Developing a Map Reduce Application – How Map Reduce Works – Anatomy of a
Map Reduce Job run – Failures – Job Scheduling – Shuffle and Sort – Task execution – Map Reduce
Types and Formats – Map Reduce Features – Hadoop environment.
UNIT-IV FRAMEWORKS (9HRS)
Applications on Big Data Using Pig and Hive – Data processing operators in Pig – Hive services
– HiveQL – Querying Data in Hive - fundamentals of HBase and ZooKeeper - IBM InfoSphere
BigInsights and Streams.
UNIT-V PREDICTIVE ANALYTICS (9HRS)
Predictive Analytics -Simple linear regression- Multiple linear regression- Interpretation
of regression coefficients. Visualizations - Visual data analysis techniques- interaction
techniques - Systems and applications
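Simple linear regression, listed in this unit, fits a line y = intercept + slope * x by least squares: the slope is the sum of cross-deviations divided by the sum of squared x-deviations. A minimal Python sketch with made-up data follows; the function name is an assumption.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

The slope is read as the expected change in y for a one-unit increase in x, and the intercept as the predicted y when x is zero.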
TEXT BOOK
1. Michael Berthold, David J. Hand, “Intelligent Data Analysis”, Springer, 2007.
2. Tom White, “Hadoop: The Definitive Guide”, Third Edition, O’Reilly Media, 2012.
References:
1. Chris Eaton, Dirk DeRoos, Tom Deutsch, George Lapis, Paul Zikopoulos, “Understanding Big
Data: Analytics for Enterprise Class Hadoop and Streaming Data”, McGrawHill Publishing,
2012.
2. Anand Rajaraman and Jeffrey David Ullman, “Mining of Massive Datasets”, CUP, 2012.
3. Bill Franks, “Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams
with Advanced Analytics”, John Wiley& sons, 2012.
4. Glenn J. Myatt, “Making Sense of Data”, John Wiley & Sons, 2007.
5. Pete Warden, “Big Data Glossary”, O’Reilly, 2011.
6. Jiawei Han, Micheline Kamber, “Data Mining Concepts and Techniques”, 2nd Edition,
Elsevier, Reprinted 2008.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC4PA / MACHINE LEARNING LAB
YEAR / SEMESTER II / IV
L T P C
0 0 4 2
COURSE OBJECTIVES:
• Make use of data sets in implementing the machine learning algorithms
• Implement the machine learning concepts and algorithms in any suitable language of
choice.
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Understand the implementation procedures for the machine learning algorithms.
2. Design Java/Python programs for various Learning algorithms.
3. Apply appropriate data sets to the Machine Learning algorithms.
4. Identify and apply Machine Learning algorithms to solve real world problems.
LIST OF EXPERIMENTS
1. Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis
based on a given set of training data samples. Read the training data from a .CSV file.
2. For a given set of training data examples stored in a .CSV file, implement and demonstrate
the Candidate-Elimination algorithm to output a description of the set of all hypotheses
consistent with the training examples.
3. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use
an appropriate data set for building the decision tree and apply this knowledge to classify a new
sample.
4. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test
the same using appropriate data sets.
5. Write a program to implement the naïve Bayesian classifier for a sample training data set
stored as a .CSV file. Compute the accuracy of the classifier, considering a few test data sets.
6. Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier
model to perform this task. Built-in Java classes/API can be used to write the program.
Calculate the accuracy, precision, and recall for your data set.
7. Write a program to construct a Bayesian network considering medical data. Use this model to
demonstrate the diagnosis of heart patients using standard Heart Disease Data Set. You can use
Java/Python ML library classes/API.
8. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for
clustering using k-Means algorithm. Compare the results of these two algorithms and comment
on the quality of clustering. You can add Java/Python ML library classes/API in the program.
9. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set.
Print both correct and wrong predictions. Java/Python ML library classes can be used for this
problem.
10. Implement the non-parametric Locally Weighted Regression algorithm in order to
fit data points. Select appropriate data set for your experiment and draw graphs.
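As a starting point for experiment 1, the FIND-S procedure can be sketched in Python as follows: begin with the first positive example as the hypothesis and generalise an attribute to "?" whenever a later positive example disagrees. The sketch assumes the training rows are already in memory rather than read from a .CSV file, and it uses the familiar EnjoySport example as sample data; the function name is an assumption.

```python
def find_s(examples):
    """Return the most specific hypothesis consistent with the positive examples.

    examples: list of (attribute_tuple, label) pairs with label "Yes" or "No".
    """
    hypothesis = None
    for attributes, label in examples:
        if label != "Yes":
            continue  # FIND-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # first positive example, taken as is
        else:
            # Keep matching attribute values; generalise mismatches to "?".
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis


training_data = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), "Yes"),
]
hypothesis = find_s(training_data)
print(hypothesis)
```

To meet the experiment's requirement, the in-memory list could be replaced by rows loaded with Python's `csv` module, treating the last column as the label.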
TOTAL : 60 HOURS
REFERENCES:
1. Christopher Bishop, “Pattern Recognition and Machine Learning”, Springer, 2007.
2. Stephen Marsland, “Machine Learning – An Algorithmic Perspective”, Chapman and Hall,
CRC Press, Second Edition, 2014.
3. Kevin P. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.
4. Ethem Alpaydin, “Introduction to Machine Learning”, MIT Press, Third Edition, 2014.
5. Tom Mitchell, “Machine Learning”, McGraw-Hill, 1997.
PROGRAM NAME B.SC-DATA SCIENCE
COURSE CODE / NAME UDASC4PB / BIG DATA ANALYTICS LAB
YEAR / SEMESTER II / IV
L T P C
0 0 4 2
COURSE OBJECTIVES:
• Optimize business decisions and create competitive advantage with Big Data analytics
• Impart the architectural concepts of Hadoop and introduce the MapReduce paradigm
• Introduce Java concepts required for developing MapReduce programs
• Derive business benefit from unstructured data
• Introduce the programming tools Pig and Hive in the Hadoop ecosystem
• Develop Big Data applications for streaming data using Apache Spark
COURSE OUTCOMES:
Upon completion of this course, the students should be able to:
1. Prepare data for summarization, query, and analysis.
2. Apply data modelling techniques to large data sets.
3. Create applications for Big Data analytics.
4. Build a complete business data analytics solution.
LIST OF EXPERIMENTS
1. (i) Set up and install Hadoop in two operating modes: pseudo-distributed and fully
distributed.
(ii) Use web-based tools to monitor your Hadoop setup.
2. (i) Implement the following file management tasks in Hadoop: adding files and directories,
retrieving files, deleting files.
(ii) Benchmark and stress test an Apache Hadoop cluster.
3. Run a basic Word Count MapReduce program to understand the MapReduce paradigm.
Find the number of occurrences of each word appearing in the input file(s).
Performing a MapReduce Job for word search count (look for specific keywords in a file
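The word count and keyword search tasks above can be sketched without a cluster. The following plain-Python example only imitates the map, shuffle, and reduce phases in a single process; it is not Hadoop code, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical names chosen for this illustration. In the lab itself the same logic would normally be written as Java Mapper and Reducer classes.

```python
import re
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in re.findall(r"[a-z0-9']+", line.lower()):
        yield word, 1

def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(word, counts):
    """Reducer: sum the counts collected for one word."""
    return word, sum(counts)

def word_count(lines, keywords=None):
    """Count every word, or only the given keywords if a list is passed."""
    pairs = (pair for line in lines for pair in map_phase(line))
    totals = dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())
    if keywords is not None:
        wanted = {k.lower() for k in keywords}
        return {w: totals.get(w, 0) for w in wanted}
    return totals

if __name__ == "__main__":
    # Sample input invented for this sketch
    sample = ["Hadoop stores data", "Spark and Hadoop process data"]
    print(word_count(sample))
    print(word_count(sample, keywords=["Hadoop", "Hive"]))
```

Keeping the mapper and reducer as separate functions mirrors the structure of a real MapReduce job, where the framework, not the program, performs the grouping step between them.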
Amet University-  B.Sc Data Science Syllabus

Amet University- B.Sc Data Science Syllabus

  • 2. SEMESTER 1 S.No Course Code Course Title Contact Hours L T P C M THEORY 1. UDASC01 Communicative English 2 2 0 0 2 100 2. UDASC02 Linear Algebra & Calculus 4 3 1 0 4 100 3. UDASC03 Computer Architecture 3 3 0 0 3 100 4. UDASC04 Problem Solving and Programming using C 3 3 0 0 3 100 5. UDASC05 Digital System Design 4 3 1 0 4 100 6. UDASC06 Ethics and Human Values 2 2 0 0 2 100 PRACTICAL 7. UDASC1PA Problem Solving using C Laboratory 4 0 0 4 2 100 8. UDASC1PB Communicative skills and Language Laboratory 2 0 0 4 1 100 TOTAL 24 16 2 8 21
  • 3. SEMESTER 2 S.No Course Code Course Title Contact Hours L T P C M THEORY 1. UDASC21 Principles of Data Science 3 3 0 0 3 100 2. UDASC22 Fundamentals of Statistics 4 3 1 0 4 100 3. UDASC23 Operating Systems 3 3 0 0 3 100 4. UDASC24 Database Management System 3 3 0 0 3 100 5. UDASC25 Computer Networks 3 3 0 0 3 100 PRACTICAL 6. UDASC2PA Statistics and Data Science Lab 4 0 0 4 2 100 7. UDASC2PB Database Management System (DBMS) Laboratory 4 0 0 4 2 100 8. UDASC2PC Operating Systems and Networks Laboratory 4 0 0 4 2 100 TOTAL 28 15 1 12 22
  • 4. SEMESTER 3 S.No Course Code Course Title Contact Hours L T P C M THEORY 1. UDASC31 Probability Theory 3 3 0 0 3 100 2. UDASC32 Cloud Computing 3 3 0 0 3 100 3. UDASC33 Advanced Database Technologies 3 3 0 0 3 100 4. UDASC34 Web Programming 3 3 0 0 3 100 5. UDASC35 Data Mining 3 3 0 0 3 100 6. UDASC36 Operation Research 3 3 0 0 3 100 PRACTICAL 7. UDASC3PA Data Mining Lab 4 0 0 4 2 100 8. UDASC3PB Cloud Computing and Web Programming Lab 4 0 0 4 2 100 TOTAL 26 18 0 8 22
  • 5. SEMESTER 4 S.No Course Code Course Title Contact Hours L T P C M THEORY 1. UDASC41 Data Handling and Visualization 3 3 0 0 3 100 2. UDASC42 Machine Learning 3 3 0 0 3 100 3. PEC1 3 3 0 0 3 100 4. UDASC44 Optimization Techniques 3 3 0 0 3 100 5. UDASC45 Big Data Analytics 3 3 0 0 3 100 PRACTICAL 6 UDASC4PA Machine Learning Lab 4 0 0 4 2 100 7. UDASC4PB Big Data Analytics Lab 4 0 0 4 2 100 8. UDASC4PC Data Handling and Visualization lab 4 0 0 4 2 100 TOTAL 27 15 0 12 21
  • 6. SEMESTER 5 S.No Course Code Course Title Contact Hours L T P C M THEORY 1. UDASC51 Deep Learning 3 3 0 0 3 100 2. UDASC52 Natural Language Processing 3 3 0 0 3 100 3. PEC 2 3 3 0 0 3 100 4. PEC 3 3 3 0 0 3 100 PRACTICAL 5. UDASC5PA Deep Learning Lab 4 0 0 4 2 100 6 UDASC5PB Natural Language processing Lab 4 0 0 4 2 100 7. UDASC5PC Phase I Project 6 0 0 6 6 100 TOTAL 26 12 0 14 22
  • 7. SEMESTER 6 S.No Course Code Course Title Contact Hours L T P C M THEORY/PRACTICAL 1. UDASC61 Stream Processing Analytics 3 3 0 0 3 100 2. UDASC62 PEC 4 3 3 0 0 3 100 3. UDASC63 PEC 5 3 3 0 0 3 100 4. UDASC6P Phase II project 12 0 0 12 12 100 TOTAL 21 9 0 12 21
  • 8. LIST OF ELECTIVE COURSES Sl. No. Course Code Course Title Contact Hours L T P C M PROGRAM ELECTIVE COURSE-1 1. UDASC46 Cloud Services for IOT 3 3 0 0 3 100 2 UDASC47 Business Analytics 3 3 0 0 3 100 3 UDASC48 Business Intelligence 3 3 0 0 3 100 4 UDASC49 Intelligent Database System 3 3 0 0 3 100 5 UDASC50 Digital Marketing Analytics 3 3 0 0 3 100 6 UDASC55 Internet of Things 3 3 0 0 3 100 PROGRAM ELECTIVE COURSE-2 & 3 1 UDASC53 Augmented Reality &Virtual reality 3 3 0 0 3 100 2 UDASC54 Linux Programming 3 3 0 0 3 100 3 UDASC57 Image Processing and Analysis 3 3 0 0 3 100 4 UDASC58 Health Care Analytics 3 3 0 0 3 100 5 UDASC59 Data mining using R 3 3 0 0 3 100 6 UDASC60 Text Analytics 3 3 0 0 3 100
  • 9. PROGRAM ELECTIVE COURSE-4 & 5 1 UDASC62 High-Dimensional Data Analysis 3 3 0 0 3 100 2 UDASC65 Cyber Forensic analytics 3 3 0 0 3 100 3 UDASC66 Social Network Analytics 3 3 0 0 3 100 4 UDASC67 IoT cloud and data analytics 3 3 0 0 3 100 5 UDASC68 Predictive Modeling Analysis 3 3 0 0 3 100
  • 10. SEMESTER-I PROGRAM NAME B.SC-DATA SCIENCE COURSE CODE / NAME UDASC01/ COMMUNICATIVE ENGLISH YEAR / SEMESTER I / I L T P C 2 0 0 2 COURSE OBJECTIVES: 1. To make the students learn to speak grammatically correct English. Guiding and supporting their skill development –Listening, speaking, reading and writing in English. 2. Making them realize the importance of English as Global language and its importance in today‘s scenario COURSE OUTCOMES: Upon completion of this course, the students should be able to: 1. Outline the importance of communication skill. 2. Illustrate technical and general vocabulary. 3. Distinguish different tenses and identification of common errors 4. Infer the skill for writing formal and informal letters 5. Develop good listening and speaking skills 6. Apply the skills to speak and write English grammatically UNIT I INTRODUCTION (6HRS) Listening – short texts – formal and informal conversations - Speaking –basics in speaking– speaking on given topics & situations –recording speeches and strategies to improve-Reading– critical eading–finding key information in a given text – shifting facts from opinions - Writing – freewritingonanygiventopic–autobiographicalwriting-LanguageDevelopment–tenses–voices- wordformation: prefixes and suffixes– parts of speech– developing hints UNIT II READING AND LANGUAGEDEVELOPMENT(6HRS) Listening - long texts - TED talks - extensive speech on current affairs and discussions-
  • 11. Speaking–describing a simple process–asking and answering questions - Reading comprehension – skimming / scanning / predicting &analytical reading–question & answers– objective and descriptive answers –identifyingsynonymsandantonyms-processdescription- Writinginstructions – Language Development – writing definitions – compound words- articles–prepositions. UNIT III SPEAKING AND INTERPRETATION SKILLS (6HRS) Listening-dialogues & conversations-Speaking–role plays–asking about routine actions and expressing opinions - Reading longer texts & making a critical analysis of the given text - Writing – types of paragraph and writing essays – rearrangement of jumbled sentences - writing recommendations –Language Development–use of sequence words-cause & effect expressions -sentences expressing purpose-picture based and news paper based activities– single word substitutes. UNIT IV VOCABULARY BUILDING AND WRITING SKILLS (6HRS) Listening-debates and discussions–practicing multiple tasks–self introduction – Speaking about friends/places/hobbies - Reading –Making inference from the reading passage – Predicting the content of the reading passage - Writing – informal letters/e-mails - Language Development -synonyms &antonyms - conditionals – if, unless, in case, when and others – framing questions. UNIT-V LANGUAGE DEVELOPMENT AND TECHNICAL WRITING (6HRS) Listening - popular speeches and presentations -Speaking – impromptu speeches & debates - Reading - articles – magazines/newspapers Writing –essay writing on technical topics- channel conversion–bar diagram/graph–picture interpretation-process description-Language Development–modal verbs-fixed/semi-fixed expressions–collocations. TOTAL:30 Hours TEXT BOOKS: 1. Board of Editors. Using English: A Course book for Undergraduate Engineers and Technologists. Orient Black swan Limited, Hyderabad:2018 2. Dhanavel, S.P. English and Communication Skills for Students of Science and Engineering. Orient Black swan, Chennai,2011.
  • 12. REFERENCE BOOKS: 1. Anderson, Paul V.Technical Communication :A Reader Centered Approach. Cengage, NewDelhi,2008. 2. Smith Worthington, Darlene& SueJefferson. Technical Writing for Success. Cengage, Mason, USA,2007. 3. Grussendorf, Marion, English for Presentations, Oxford University Press, Oxford,2007. 4. Chauhan, Gajendra Singhandet.al. Technical Communication (Latest Revised Edition). Cengage LearningIndiaPvt.Limited,2018.
  • 13. PROGRAM NAME B.SC-DATA SCIENCE COURSE CODE / NAME UDASC02/ LINEAR ALGEBRA & CALCULUS YEAR / SEMESTER I / I L T P C 3 1 0 4 COURSE OBJECTIVES: This course introduces students to some basic mathematical ideas and tools which are at the core of any engineering course. A brief course in Linear Algebra familiarises students with some basic techniques in matrix theory which are essential for analysing linear systems. The calculus of functions of one or more variables taught in this course are useful in modelling and analysing physical phenomena involving continuous change of variables or parameters and have applications across all branches. COURSE OUTCOMES: Upon completion of this course, the students should be able to: 1. Apply the Matrix Methods to solve the system of linear equations 2. Test the convergence and divergence of the infinite Series. 3. Determine the extreme values of functions of two variables. 4. Apply the vector differential operator to scalar and vector functions . 5.Solve line, surface & volume integrals by Greens, Gauss and Stoke’s theorems. UNIT-I Matrices: (12HRS) Rank of a matrix, Echelon form, consistency of linear System of equations, Linear dependence of vectors, Eigen values, Eigenvectors, Properties of Eigen values, Cayley-Hamilton theorem, Quadratic forms, Reduction of quadratic form to canonical form by linear transformation, Nature of quadratic form. UNIT-II Infinite Series: (12HRS) Definition of Convergence of sequence and series. Series of positive terms –Necessary condition for convergence, Comparison tests, limit form comparison test, D’Alembert’s Ratio test, Raabe’s test, Cauchy’s root test, alternating series, Leibnitz’s rule, absolutely and conditionally convergence.
  • 14. UNIT-III Partial Differentiation and Its Applications: (12HRS) Functions of two or more variables, Partial derivatives, Higher order partial derivatives, Total derivative, Differentiation of implicit functions, Jacobians, Taylor’s expansion of functions of two variables, Maxima and minima of functions of two variables. UNIT-IV Vector Differential Calculus: (12HRS) Scalar and vector point functions, vector operator Del, Gradient, Directional derivative, Divergence, Curl, Del applied twice to point functions, Del applied to product of point functions (vector identities). Applications: Irrotational fields and Solenoidal fields. UNIT-V Vector Integral Calculus: (12HRS) Line integral, Surface integral and Volume integral. Green’s theorem in the plane, verifications of Stroke’s theorem (without proof) and Gauss’s divergence theorem(without proof). TOTAL-60 Hours TEXT BOOKS: 1. B.S. Grewal, Higher Engineering Mathematics, Khanna Publishers, 44th Edition, 2017. 2. Erwin kreyszig, Advanced Engineering Mathematics, 10th Edition, John Wiley & Sons, 2010. 3. Ramana B.V., Higher Engineering Mathematics, Tata McGraw Hill New Delhi, Reprint, 2017. 4. Kreyszig Erwin, "Advanced Engineering Mathematics ", John Wiley and Sons, 10th Edition, New Delhi, 2016. REFERENCE BOOKS 1. Sastry, S.S, ―Engineering Mathema[cs", Vol. I & II, PHI Learning Pvt. Ltd, 4th Edition, New Delhi, 2014. 2. Wylie, R.C. and Barre, L.C., ―Advanced Engineering Mathema[cs ―Tata McGraw Hill Education Pvt. Ltd, 6th Edition, New Delhi, 2012. 3. Dean G. Duffy., “Advanced Engineering Mathematics with MATLAB”, CRC Press, Third Edition 2013
  • 15. PROGRAM NAME B.SC-DATA SCIENCE COURSE CODE / NAME UDASC03/ COMPUTER ARCHITECTURE YEAR / SEMESTER I / I L T P C 3 0 0 3 COURSE OBJECTIVES:  To learn the basic structure and operations of a computer.  To learn the arithmetic and logic unit and implementation of fixed-point and floating point arithmetic unit.  To learn the basics of pipelined execution.  To understand parallelism and multi-core processors.  To understand the memory hierarchies, cache memories and virtual memories.  To learn the different ways of communication with I/O devices. COURSE OUTCOMES: Upon completion of this course, the students should be able to: 1. Understand the basics structure of computers, operations and instructions. 2. Design arithmetic and logic unit. 3. Understand pipelined execution and design control unit. 4. Understand parallel processing architectures. 5. Understand the various memory systems and I/O communication. UNIT I BASIC STRUCTURE OF A COMPUTER SYSTEM (9HRS) Functional Units – Basic Operational Concepts – Performance – Instructions: Language of the Computer – Operations, Operands – Instruction representation – Logical operations – decision making – MIPS Addressing. UNIT II ARITHMETIC FOR COMPUTERS (9HRS) Addition and Subtraction – Multiplication – Division – Floating Point Representation – Floating Point Operations – Subword Parallelism UNIT III PROCESSOR AND CONTROL UNIT (9HRS)
  • 16. A Basic MIPS implementation – Building a Datapath – Control Implementation Scheme – Pipelining – Pipelined datapath and control – Handling Data Hazards & Control Hazards – Exceptions. UNIT IV PARALLELISIM (9HRS) Parallel processing challenges – Flynn‘s classification – SISD, MIMD, SIMD, SPMD, and Vector Architectures - Hardware multithreading – Multi-core processors and other Shared Memory Multiprocessors - Introduction to Graphics Processing Units, Clusters, Warehouse Scale Computers and other Message-Passing Multiprocessors. UNIT V MEMORY & I/O SYSTEMS (9HRS) Memory Hierarchy - memory technologies – cache memory – measuring and improving cache performance – virtual memory, TLB‘s – Accessing I/O Devices – Interrupts – Direct Memory Access – Bus structure – Bus operation – Arbitration – Interface circuits - USB. TOTAL : 45 Hours TEXT BOOKS: 1. David A. Patterson and John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, Fifth Edition, Morgan Kaufmann / Elsevier, 2014. 2. Carl Hamacher, Zvonko Vranesic, Safwat Zaky and Naraig Manjikian, Computer Organization and Embedded Systems, Sixth Edition, Tata McGraw Hill, 2012. REFERENCE BOOKS: 1. William Stallings, Computer Organization and Architecture – Designing for Performance, Eighth Edition, Pearson Education, 2010. 2. John P. Hayes, Computer Architecture and Organization, Third Edition, Tata McGraw Hill, 2012. 3. John L. Hennessey and David A. Patterson, Computer Architecture – A Quantitative Approach‖, Morgan Kaufmann / Elsevier Publishers, Fifth Edition, 2012.
  • 17. PROGRAM NAME B.SC-DATA SCIENCE COURSE CODE / NAME UDASC04/ PROBLEM SOLVING AND PROGRAMMING USING C YEAR / SEMESTER I / I L T P C 3 0 0 3 COURSE OBJECTIVES:  To acquire problem solving skills  To be able to develop flowcharts  To understand structured programming concepts  To be able to write programs in C Language COURSE OUTCOMES: Upon completion of this course, the students should be able to: 1. Apply appropriate looping and conditional constructs for given problems 2. Use pointers, arrays and strings to solve complex problems 3. Use Structures, unions and files for problem solving 4. Apply problem solving techniques to real world problems 5. Make use of functions to build modular programming UNIT I –PROBLEM SOLVING FUNDAMENTALS (9HRS) Introduction to problem solving - Flow Chart, Algorithm, Pseudo code - Procedural Programming (Modular and Structural)- Program Compilation, Execution, Debugging, Testing – Preprocessors -Basic features of C, Structure of C program - Data types- Storage Classes-Tokens in C- Input and Output Statements inC, Operators- Bitwise, Unary, Binary and Ternary Operators, Precedence and Associativity –Expression Evaluation UNIT II – CONDITIONAL STATEMENTS AND LOOPING CONSTRUCTS (9HRS) Problem solving using Conditional or Selection or Branching Statements: Structure of if, if-else, else-if ladder, nested-if, switch constructs - Looping constructs: Structure of for, while, do-while constructs, usage of break, return, go to and continue keywords UNIT III – ARRAYS AND STRINGS (9HRS)
  • 18. 1D Array – Declaration, Initialization, 2D Array – Declaration, Initialization, Multi-dimensional Arrays. Strings: Declaration, Initialization, String operations: length, compare, concatenate, copy
    UNIT IV – FUNCTIONS AND POINTERS (9HRS)
    Functions: Built-in Functions, User defined functions – Function Prototypes – Recursion – Command Line Arguments – Arrays and Functions – Strings and Functions. Pointers: Declaration – Pointer operators – Pointer Arithmetic – Passing Pointers to a function – Pointers and one-dimensional arrays – Dynamic memory allocation.
    UNIT V – STRUCTURES, UNION AND FILE HANDLING (9HRS)
    Structure: Create a Structure – Member initialization – Accessing Structure Members – Nested structures – Pointer and Structures – Array of structures – Self Referential Structures – typedef – Unions. Files – Opening and Closing a Data File, Reading and writing a data file.
    TOTAL: 45 Hours
    TEXT BOOKS:
    1. Jeyapoovan T, "Fundamentals of Computing and Programming in C", Vikas Publishing House, 2015.
    2. Mark Siegesmund, "Embedded C Programming", First Edition, Elsevier Publications, 2014.
    REFERENCE BOOKS:
    1. Ashok Kamthane, "Computer Programming", 7th Edition, Pearson Education Inc., 2017.
    2. Yashavant Kanetkar, "Let us C", 15th Edition, BPB Publications, 2016.
    3. S. Sathyalakshmi, S. Dinakar, "Computer Programming Practicals – Computer Lab Manual", Dhanam Publication, First Edition, July 2013.
  • 19. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC05 / DIGITAL SYSTEM DESIGN
    YEAR / SEMESTER: I / I
    L T P C: 3 1 0 4
    COURSE OBJECTIVES:
    - To design digital circuits using simplified Boolean functions
    - To analyze and design combinational circuits
    - To analyze and design synchronous and asynchronous sequential circuits
    - To understand Programmable Logic Devices
    - To write HDL code for combinational and sequential circuits
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Explain the fundamentals of number systems, codes and digital logic families.
    2. Develop combinational circuits.
    3. Design synchronous sequential circuits using flip-flops.
    4. Demonstrate asynchronous sequential circuits and Programmable Logic Devices.
    5. Apply simulation tools for designing digital logic circuits.
    UNIT I BOOLEAN ALGEBRA AND LOGIC GATES (12HRS)
    Number Systems – Arithmetic Operations – Binary Codes – Boolean Algebra and Logic Gates – Theorems and Properties of Boolean Algebra – Boolean Functions – Canonical and Standard Forms – Simplification of Boolean Functions using Karnaugh Map – Logic Gates – NAND and NOR Implementations.
    UNIT II COMBINATIONAL LOGIC (12HRS)
    Combinational Circuits – Analysis and Design Procedures – Binary Adder-Subtractor – Decimal Adder – Binary Multiplier – Magnitude Comparator – Decoders – Encoders – Multiplexers – Introduction to HDL – HDL Models of Combinational circuits.
    UNIT III SYNCHRONOUS SEQUENTIAL LOGIC (12HRS)
    Sequential Circuits – Storage Elements: Latches, Flip-Flops – Analysis of Clocked Sequential Circuits – State Reduction and Assignment – Design Procedure – Registers and Counters – HDL
  • 20. Models of Sequential Circuits.
    UNIT IV ASYNCHRONOUS SEQUENTIAL LOGIC (12HRS)
    Analysis and Design of Asynchronous Sequential Circuits – Reduction of State and Flow Tables – Race-free State Assignment – Hazards.
    UNIT V MEMORY AND PROGRAMMABLE LOGIC (12HRS)
    RAM – Memory Decoding – Error Detection and Correction – ROM – Programmable Logic Array – Programmable Array Logic – Sequential Programmable Devices.
    TOTAL: 60 Hours
    TEXT BOOKS:
    1. M. Morris Mano, Michael D. Ciletti, "Digital Design: With an Introduction to the Verilog HDL, VHDL, and SystemVerilog", 6th Edition, Pearson Education, 2017.
    REFERENCE BOOKS:
    1. G. K. Kharate, Digital Electronics, Oxford University Press, 2010.
    2. John F. Wakerly, Digital Design Principles and Practices, Fifth Edition, Pearson Education, 2017.
    3. Charles H. Roth Jr, Larry L. Kinney, Fundamentals of Logic Design, Sixth Edition, CENGAGE Learning, 2013.
    4. Donald D. Givone, Digital Principles and Design, Tata McGraw Hill, 2003.
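The binary adder listed under Unit II (Combinational Logic) can be modelled in software before it is written in an HDL. The sketch below builds a one-bit full adder from XOR, AND and OR operations and chains it into a ripple-carry adder. It is written in Python rather than an HDL, and the function names and the 4-bit operands are illustrative assumptions.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder built from XOR, AND and OR operations."""
    partial = a ^ b                       # first XOR gate
    total = partial ^ carry_in            # second XOR gate gives the sum bit
    carry_out = (a & b) | (partial & carry_in)
    return total, carry_out


def ripple_carry_add(x_bits, y_bits):
    """Add two equal-length bit lists, least-significant bit first."""
    carry = 0
    result = []
    for a, b in zip(x_bits, y_bits):
        total, carry = full_adder(a, b, carry)
        result.append(total)
    return result, carry


# 6 (0110) + 7 (0111), written least-significant bit first.
sum_bits, carry_out = ripple_carry_add([0, 1, 1, 0], [1, 1, 1, 0])
print(sum_bits, carry_out)  # [1, 0, 1, 1] 0, i.e. 1101 = 13
```

Each call to `full_adder` corresponds to one adder stage in the circuit; the carry passed from stage to stage is what gives the ripple-carry adder its name.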
  • 21. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC06 / ETHICS AND HUMAN VALUES
    YEAR / SEMESTER: I / I
    L T P C: 2 0 0 2
    COURSE OBJECTIVES: To enable the students to create an awareness of Engineering Ethics and Human Values, to instill Moral and Social Values and Loyalty, and to appreciate the rights of others.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    - Apply ethics in society, discuss the ethical issues related to engineering, and realize the responsibilities and rights in society
    UNIT I HUMAN VALUES (6HRS)
    Morals, values and Ethics – Integrity – Work ethic – Service learning – Civic virtue – Respect for others – Living peacefully – Caring – Sharing – Honesty – Courage – Valuing time – Cooperation – Commitment – Empathy – Self-confidence – Character – Spirituality – Introduction to Yoga and meditation for professional excellence and stress management.
    UNIT II ENGINEERING ETHICS (6HRS)
    Senses of 'Engineering Ethics' – Variety of moral issues – Types of inquiry – Moral dilemmas – Moral Autonomy – Kohlberg's theory – Gilligan's theory – Consensus and Controversy – Models of professional roles – Theories about right action – Self-interest – Customs and Religion – Uses of Ethical Theories
    UNIT III ENGINEERING AS SOCIAL EXPERIMENTATION (6HRS)
    Engineering as Experimentation – Engineers as responsible Experimenters – Codes of Ethics – A Balanced Outlook on Law.
    UNIT IV SAFETY, RESPONSIBILITIES AND RIGHTS (6HRS)
    Safety and Risk – Assessment of Safety and Risk – Risk Benefit Analysis and Reducing Risk –
  • 22. Respect for Authority – Collective Bargaining – Confidentiality – Conflicts of Interest – Occupational Crime – Professional Rights – Employee Rights – Intellectual Property Rights (IPR) – Discrimination
    UNIT V GLOBAL ISSUES (6HRS)
    Multinational Corporations – Environmental Ethics – Computer Ethics – Weapons Development – Engineers as Managers – Consulting Engineers – Engineers as Expert Witnesses and Advisors – Moral Leadership – Code of Conduct – Corporate Social Responsibility
    TOTAL: 30 Hours
    TEXT BOOKS:
    1. Mike W. Martin and Roland Schinzinger, "Ethics in Engineering", Tata McGraw Hill, New Delhi, 2017.
    2. Govindarajan M, Natarajan S, Senthil Kumar V. S, "Engineering Ethics", Prentice Hall of India, New Delhi, 2004.
    REFERENCE BOOKS:
    1. Charles B. Fleddermann, "Engineering Ethics", Pearson Prentice Hall, New Jersey, 2004.
    2. Charles E. Harris, Michael S. Pritchard and Michael J. Rabins, "Engineering Ethics – Concepts and Cases", Cengage Learning, 2009.
    3. John R Boatright, "Ethics and the Conduct of Business", Pearson Education, New Delhi, 2003.
  • 23. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC1PA / PROBLEM SOLVING USING C LABORATORY
    YEAR / SEMESTER: I / I
    L T P C: 0 0 4 2
    COURSE OBJECTIVES:
    - To acquire problem solving skills
    - To be able to develop flowcharts
    - To understand structured programming concepts
    - To be able to write programs in C Language
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    - Solve problems using data types and operators
    - Apply appropriate looping and conditional constructs for given C programs
    - Use functions to build modular programs
    - Use appropriate IDE and tools to write, compile, debug and execute a C program
    - Implement structures, unions and file operations
    LIST OF EXPERIMENTS:
    1. Problem solving design using Scratch tool
    2. Conditional Statements – if, if-else, else-if ladder, nested if, switch
    3. Looping Constructs – for, while, do-while
    4. One dimensional Arrays
    5. Two dimensional Arrays
    6. Functions – Modular Programming
    7. Pointers and arrays
    8. Dynamic Memory Allocation
    9. Programs to illustrate File operations
    10. Structures and Union
    TOTAL: 60 Hours
  • 24. TEXT BOOKS:
    1. Kernighan B. W. and Ritchie D. M., "C Programming Language (ANSI C)", Prentice Hall of India Private Limited, New Delhi, 2015.
    2. Herbert Schildt, "C – The Complete Reference", Tata McGraw Hill Publishing Company, New Delhi, 2017.
    REFERENCE BOOKS:
    1. Deitel and Deitel, "C How to Program", Pearson Education, New Delhi, 2011.
    2. Byron S. Gottfried and Jitendar Kumar Chhabra, "Programming with C", Tata McGraw Hill Publishing Company, New Delhi, 2011.
    3. Pradip Dey and Manas Ghosh, "Programming in C", Oxford University Press, New Delhi, 2009.
  • 25. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC1PB / COMMUNICATIVE SKILLS AND LANGUAGE LABORATORY
    YEAR / SEMESTER: I / I
    L T P C: 0 0 2 1
    COURSE OBJECTIVES:
    1. To introduce students to the nuances of phonetics and give them sufficient practice in correct pronunciation.
    2. To introduce word stress and intonation.
    3. To use IELTS and TOEFL material for honing their listening skills.
    4. To provide activities enabling them to overcome their inhibitions while speaking in English, with the focus being on fluency rather than accuracy.
    5. To encourage team work and role behaviour while developing their ability to discuss in groups and make oral presentations.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Define the speech sounds in English and understand the nuances of pronunciation in English.
    2. Apply stress correctly and speak with the proper tone, intonation and rhythm.
    3. Analyze IELTS and TOEFL listening comprehension texts to enhance their listening skills.
    4. Determine the context and speak appropriately in various situations.
    5. Design and present effective posters while working in teams, and discuss and participate in group discussions.
    LIST OF EXERCISES:
    1. Introduction to English Phonetics: Introduction to auditory, acoustic and articulatory phonetics, organs of speech: the respiratory, articulatory and phonatory systems.
    2. Sound system of English: Phonetic sounds and phonemic sounds, introduction to the International Phonetic Alphabet, classification and description of English phonemic sounds, minimal pairs. The syllable: types of syllables, consonant clusters.
    3. Word stress: Primary stress, secondary stress, functional stress, rules of word stress.
    4. Rhythm & Intonation: Introduction to Rhythm and Intonation. Major patterns, intonation
  • 26. of English with the semantic implications.
    5. Listening skills – Practice with IELTS and TOEFL material
    6. Public speaking – Speaking with confidence and clarity in different contexts on various issues.
    7. Group Discussions – Dynamics of a group discussion, group discussion techniques, body language.
    8. Pictionary – weaving an imaginative story around a given picture.
    9. Information Gap Activity – Writing a brief report on a newspaper headline by building on the hints given
    10. Poster presentation – Theme, poster preparation, team work and presentation.
    TOTAL: 30 Hours
    REFERENCE BOOKS:
    1. T Balasubramanian, A Textbook of English Phonetics for Indian Students, Macmillan, 2017.
    2. J Sethi et al., A Practical Course in English Pronunciation (with CD), Prentice Hall India, 2013.
    3. Priyadarshi Patnaik, Group Discussions and Interviews, Cambridge University Press Pvt. Ltd., 2011.
    4. Aruna Koneru, Professional Speaking Skills, Oxford University Press, 2016.
  • 27. SEMESTER-II
    PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC21 / PRINCIPLES OF DATA SCIENCE
    YEAR / SEMESTER: I / II
    L T P C: 3 0 0 3
    COURSE OBJECTIVES: To provide a strong foundation for data science and application areas related to information technology, and to understand the underlying core concepts and emerging technologies in data science.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Explore the fundamental concepts of data science
    2. Understand data analysis techniques for applications handling large data
    3. Understand various machine learning algorithms used in the data science process
    4. Visualize and present the inference using various tools
    5. Learn to think through the ethics surrounding privacy, data sharing and algorithmic decision-making
    UNIT 1 – INTRODUCTION TO DATA SCIENCE (9HRS)
    Definition – Big Data and Data Science Hype – Why data science – Getting Past the Hype – The Current Landscape – Who is a Data Scientist? – Data Science Process Overview – Defining goals – Retrieving data – Data preparation – Data exploration – Data modeling – Presentation.
    UNIT 2 – BIG DATA (9HRS)
    Problems when handling large data – General techniques for handling large data – Case study – Steps in big data – Distributing data storage and processing with Frameworks – Case study.
  • 28. UNIT 3 – MACHINE LEARNING (9HRS)
    Machine learning – Modeling Process – Training model – Validating model – Predicting new observations – Supervised learning algorithms – Unsupervised learning algorithms.
    UNIT 4 – DEEP LEARNING (9HRS)
    Introduction – Deep Feedforward Networks – Regularization – Optimization of Deep Learning – Convolutional Networks – Recurrent and Recursive Nets – Applications of Deep Learning.
    UNIT 5 – DATA VISUALIZATION (9HRS)
    Introduction to data visualization – Data visualization options – Filters – MapReduce – Dashboard development tools – Creating an interactive dashboard with dc.js – Summary.
    TOTAL: 45 Hours
    TEXT BOOKS:
    1. Introducing Data Science, Davy Cielen, Arno D. B. Meysman, Mohamed Ali, Manning Publications Co., 1st edition, 2016
    2. An Introduction to Statistical Learning: with Applications in R, Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, Springer, 1st edition, 2013
    3. Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, MIT Press, 1st edition, 2016
    4. Ethics and Data Science, D J Patil, Hilary Mason, Mike Loukides, O'Reilly, 1st edition, 2018
    REFERENCE BOOKS:
    1. Data Science from Scratch: First Principles with Python, Joel Grus, O'Reilly, 1st edition, 2015
    2. Doing Data Science: Straight Talk from the Frontline, Cathy O'Neil, Rachel Schutt, O'Reilly, 1st edition, 2013
    3. Mining of Massive Datasets, Jure Leskovec, Anand Rajaraman, Jeffrey David Ullman, Cambridge University Press, 2nd edition, 2014
  • 29. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC22 / FUNDAMENTALS OF STATISTICS
    YEAR / SEMESTER: I / II
    L T P C: 3 1 0 4
    COURSE OBJECTIVES: To enable the students to understand the fundamentals of statistics and to apply descriptive measures and probability for data analysis.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Understand the science of studying and analyzing numbers.
    2. Identify and use various visualization tools for representing data.
    3. Describe various statistical formulas.
    4. Compute various statistical measures.
    UNIT I Statistics and Probability: (12HRS)
    Introduction to Statistics – Origin of Statistics, Features of Statistics, Scope of Statistics, Functions of Statistics, Uses and Importance of Statistics, Limitations of Statistics, Distrust of Statistics
    UNIT II Collection of Data: (12HRS)
    Introduction to Collection of Data, Primary and Secondary Data, Methods of Collecting Primary Data, Methods of Collecting Secondary Data, Statistical Errors, Rounding off Data (Approximation).
    UNIT III Classification of Data and Frequency Distribution: (12HRS)
    Introduction to Classification of Data, Objectives of Classification, Methods of Classification, Ways to Classify Numerical Data or Raw Data. Tabular, Diagrammatic and Graphic Presentation of Data: Introduction to Tabular Presentation of Data, Objectives of Tabulation, Components of a Statistical Table, General Rules for the Construction of a Table, Types of Tables, Introduction to Diagrammatic Presentation of Data, Advantages and Disadvantages of Diagrammatic Presentation, Types of Diagrams, Introduction to Graphic Presentation of Data, Advantages and Disadvantages of Graphic Presentation, Types of Graphs.
  • 30. UNIT IV Measures of Central Tendency: (12HRS)
    Introduction to Central Tendency, Purpose and Functions of Average, Characteristics of a Good Average, Types of Averages, Meaning of Arithmetic Mean, Calculation of Arithmetic Mean, Merits and Demerits of Arithmetic Mean, Meaning of Median, Calculation of Median, Merits and Demerits of Median, Meaning of Mode, Calculation of Mode, Merits and Demerits of Mode, Harmonic Mean – Properties, Merits and Demerits.
    UNIT V Measures of Dispersion: (12HRS)
    Meaning of Dispersion, Objectives of Dispersion, Properties of a Good Measure of Dispersion, Methods of Measuring Dispersion, Range – Introduction, Calculation of Range, Merits and Demerits of Range, Mean Deviation, Calculation of Mean Deviation, Merits and Demerits of Mean Deviation, Standard Deviation – Meaning, Calculation of Standard Deviation, Merits and Demerits of Standard Deviation, Coefficient of Variation, Calculation of Coefficient of Variation, Merits and Demerits of Coefficient of Variation.
    TOTAL: 60 Hours
    TEXT BOOKS:
    1. Statistics and Data Analysis, A. Abebe, J. Daniels, J. W. Mckean, December 2000.
    2. Statistics, Tmt. S. EzhilarasiThiru, 2005, Government of Tamilnadu.
    3. Introduction to Statistics, David M. Lane.
    4. Weiss, N. A., Introductory Statistics. Addison Wesley, 1999.
    5. Clarke, G. M. & Cooke, D., A Basic Course in Statistics. Arnold, 1998.
    REFERENCE BOOKS:
    1. Banfield J. (1999), Rweb: Web-based Statistical Analysis, Journal of Statistical Software.
    2. Bhattacharya, G. K. and Johnson, R. A. (1977), Statistical Concepts and Methods, New York, John Wiley & Sons.
    E-Books / Online learning material:
    1. http://onlinestatbook.com/Online_Statistics_Education.pdf
    2. https://textbookcorp.tn.gov.in/Books/12/Std12-Stat-EM.pdf
    3. https://3lihandam69.files.wordpress.com/2015/10/introductorystatistics.pdf
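The measures of central tendency and dispersion named in Units IV and V can be computed in a few lines. The sketch below uses Python's standard `statistics` module. The `describe` helper and the sample data are illustrative assumptions. It also assumes the population form of the standard deviation (dividing by n); a course that uses the sample form (dividing by n - 1) would call `statistics.stdev` instead.

```python
import statistics


def describe(data):
    """Return the Unit IV and Unit V measures for a list of numbers."""
    mean = statistics.mean(data)
    # Mean deviation: average absolute distance from the arithmetic mean.
    mean_deviation = sum(abs(x - mean) for x in data) / len(data)
    # Population standard deviation (assumption: divide by n, not n - 1).
    std_dev = statistics.pstdev(data)
    return {
        "mean": mean,
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "harmonic_mean": statistics.harmonic_mean(data),
        "range": max(data) - min(data),
        "mean_deviation": mean_deviation,
        "std_dev": std_dev,
        # Coefficient of variation, expressed as a percentage of the mean.
        "cv_percent": 100 * std_dev / mean,
    }


marks = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample data
for name, value in describe(marks).items():
    print(f"{name}: {value}")
```

For this data set the mean is 5, the median 4.5, the mode 4, the range 7, the mean deviation 1.5, the standard deviation 2 and the coefficient of variation 40%.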
  • 31. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC23 / OPERATING SYSTEMS
    YEAR / SEMESTER: I / II
    L T P C: 3 0 0 3
    COURSE OBJECTIVES:
    - To understand the basic concepts and functions of operating systems.
    - To understand Processes and Threads.
    - To analyze Scheduling algorithms.
    - To understand the concept of Deadlocks.
    - To analyze various memory management schemes.
    - To understand I/O management and File systems.
    - To be familiar with the basics of the Linux system and Mobile OS like iOS and Android.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    - Characterize the basic functions of operating systems.
    - Design the concepts of process management.
    - Implement the concepts of deadlocks.
    - Describe virtual memory and file system.
    - Analyze the File system implementation and disk I/O technique.
    UNIT 1 – INTRODUCTION (9HRS)
    Introduction - Computer System Organization - Computer System Architecture - Computer System Structure - Operating System Operations - Process Management - Memory Management - Storage Management - Distributed Systems - Operating System Services - User Operating System Interface - System Calls - Types of System calls - System Programs - Process Concept -
  • 32. Process Scheduling - Operations on Processes - Inter-process Communication
    UNIT 2 – SCHEDULING (9HRS)
    Threads - Overview - Multithreading Models - CPU Scheduling - Basic Concepts - Scheduling Criteria - Scheduling Algorithms - Thread Scheduling - Multiple-Processor Scheduling - The Critical-Section Problem - Peterson's Solution - Synchronization Hardware - Semaphores
    UNIT 3 – DEADLOCKS (9HRS)
    System Model - Deadlock Characterization - Methods for handling Deadlocks - Deadlock Prevention - Deadlock avoidance - Deadlock detection - Recovery from Deadlock - Storage Management - Swapping - Contiguous Memory allocation
    UNIT 4 – PAGING AND FILE SYSTEM (9HRS)
    Paging - Demand Paging - Copy-on-Write - Page Replacement - Allocation of frames - Thrashing - Virtual Memory - File Concept - Access Methods - Directory and Disk Structure
    UNIT 5 – FILE MANAGEMENT (9HRS)
    File System Structure - File System Implementation - Directory Implementation - Allocation Methods - Free-space Management - Disk Structure - Disk Attachment - Disk Scheduling - Disk Management - Swap-Space Management - RAID Structure
    TOTAL: 45 Hours
    TEXT BOOKS:
    1. Abraham Silberschatz, Peter Baer Galvin and Greg Gagne, "Operating System Concepts", Eighth Edition, John Wiley & Sons (ASIA) Pvt. Ltd, 2009.
    REFERENCE BOOKS:
    1. Harvey M. Deitel, "Operating Systems", Second Edition, Pearson Education, 2002.
    2. William Stallings, "Operating System", Prentice Hall of India, 4th Edition, 2003.
    3. Andrew S. Tanenbaum, "Modern Operating Systems", Prentice Hall of India, 2003.
    E-BOOKS:
    1. http://www.freebookcentre.net/CompuScience/Free-Operating-Systems-Books-Download.html
    MOOC:
    1. https://www.coursera.org/learn/web-applications-php
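To make the Unit 2 scheduling criteria concrete, the sketch below computes per-process waiting times under first-come, first-served (FCFS) scheduling and compares the average with a shortest-job-first ordering. It assumes that every process arrives at time 0 and runs to completion without preemption; the function name and burst times are illustrative.

```python
def fcfs_waiting_times(burst_times):
    """Waiting time of each process when all arrive at t = 0 and run in list order."""
    waits = []
    elapsed = 0
    for burst in burst_times:
        waits.append(elapsed)  # a process waits for everything queued before it
        elapsed += burst
    return waits


def average(values):
    return sum(values) / len(values)


bursts = [24, 3, 3]  # hypothetical CPU burst times in milliseconds

fcfs = fcfs_waiting_times(bursts)
# With simultaneous arrivals, non-preemptive shortest-job-first is simply
# FCFS applied to the bursts sorted in ascending order.
sjf = fcfs_waiting_times(sorted(bursts))

print("FCFS waits:", fcfs, "average:", average(fcfs))
print("SJF waits: ", sjf, "average:", average(sjf))
```

With these bursts the average wait drops from 17 ms under FCFS to 3 ms under shortest-job-first, which illustrates how a long job at the head of the queue delays everything behind it.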
  • 33. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC24 / DATABASE MANAGEMENT SYSTEM
    YEAR / SEMESTER: I / II
    L T P C: 3 0 0 3
    COURSE OBJECTIVES:
    1. To explain basic database concepts, applications, data models, schemas and instances.
    2. To demonstrate the use of constraints and relational algebra operations.
    3. To describe the basics of SQL and construct queries using SQL.
    4. To emphasize the importance of normalization in databases.
    5. To facilitate students in database design.
    6. To familiarize students with issues of concurrency control and transaction management.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Recall the basic concepts of database systems.
    2. Identify the SQL queries for a given scenario.
    3. Illustrate relational database theory, and be able to write relational algebra expressions for queries.
    4. Summarize the various data storage devices and types of indexes.
    5. Demonstrate transaction processing and concurrency control.
    6. Explain Object-oriented DB, Distributed DB, XML, data warehousing and Mobile database.
    UNIT 1: INTRODUCTION AND CONCEPTUAL MODELING (9HRS)
    Introduction to File and Database systems – Database system structure – Data Models – Introduction to Network and Hierarchical Models – ER model – Relational Model – Relational Algebra and Calculus.
    UNIT 2: RELATIONAL MODEL (9HRS)
  • 34. SQL – Data definition – Queries in SQL – Updates – Views – Integrity and Security – Relational Database design – Functional dependencies and Normalization for Relational Databases (up to BCNF).
    UNIT 3: DATA STORAGE AND QUERY PROCESSING (9HRS)
    Record storage and Primary file organization – Secondary storage Devices – Operations on Files – Heap File – Sorted Files – Hashing Techniques – Index Structure for files – Different types of Indexes – B-Tree – B+Tree – Query Processing.
    UNIT 4: TRANSACTION MANAGEMENT (9HRS)
    Transaction Processing – Introduction – Need for Concurrency control – Desirable properties of Transaction – Schedule and Recoverability – Serializability and Schedules – Concurrency Control – Types of Locks – Two-Phase Locking – Deadlock – Recovery Techniques.
    UNIT 5: CURRENT TRENDS (9HRS)
    Object Oriented Databases – Need for Complex Data types – OO data Model – Nested relations – Complex Types – Inheritance – Reference Types – Distributed databases – Distributed data Storage – Querying and Transformation – Data Mining and Data Warehousing and Mobile Database.
    TOTAL: 45 Hours
    TEXT BOOKS:
    1. Abraham Silberschatz, Henry F. Korth and S. Sudarshan, "Database System Concepts", Seventh Edition, 2019.
    REFERENCE BOOKS:
    1. Ramez Elmasri and Shamkant B. Navathe, "Fundamentals of Database Systems", Seventh Edition, Pearson Education, 2016.
    2. Raghu Ramakrishnan, "Database Management System", Tata McGraw-Hill Publishing Company, Third Edition, 2014.
    3. Jiawei Han, Micheline Kamber, Jian Pei, "Data Mining: Concepts and Techniques", Morgan Kaufmann, Third Edition, 2012.
    E-BOOKS:
    1. https://ff.tusofia.bg/~bogi/knigi/BD/Database%20Management%20Systems.%202nd%20Ed.pdf
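Normalization up to BCNF (Unit 2) rests on reasoning about functional dependencies, and the standard tool for that is the attribute closure: the set of attributes determined by a given attribute set. The sketch below is a minimal Python version of that fixed-point computation; the relation R(A, B, C, D) and its dependencies are hypothetical examples, not part of the syllabus.

```python
def attribute_closure(attributes, dependencies):
    """Return every attribute functionally determined by `attributes`.

    `dependencies` is a list of (left side, right side) pairs of sets.
    """
    closure = set(attributes)
    changed = True
    while changed:  # repeat until no dependency adds anything new
        changed = False
        for lhs, rhs in dependencies:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


# Hypothetical relation R(A, B, C, D) with A -> B and B -> C.
relation = {"A", "B", "C", "D"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

print(sorted(attribute_closure({"A"}, fds)))       # ['A', 'B', 'C']
print(attribute_closure({"A", "D"}, fds) == relation)  # True: {A, D} is a superkey
```

An attribute set is a superkey exactly when its closure is the whole relation, so the same routine can be used to test the BCNF condition that the left side of every nontrivial dependency is a superkey.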
  • 35. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC25 / COMPUTER NETWORKS
    YEAR / SEMESTER: I / II
    L T P C: 3 0 0 3
    COURSE OBJECTIVES: The students should be able to
    - Understand the division of network functionalities into layers.
    - Be familiar with the components required to build different types of networks.
    - Be exposed to the required functionality at each layer.
    - Learn the flow control and congestion control algorithms.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Identify the components required to build different types of networks
    2. Choose the required functionality at each layer for a given application
    3. Identify a solution for each functionality at each layer
    4. Trace the flow of information from one node to another node in the network
    UNIT I FUNDAMENTALS & LINK LAYER (9HRS)
    Building a network – Requirements – Layering and protocols – Internet Architecture – Network software – Performance; Link layer Services – Framing – Error Detection – Flow control
    UNIT II MEDIA ACCESS & INTERNETWORKING (9HRS)
    Media access control – Ethernet (802.3) – Wireless LANs – 802.11 – Bluetooth – Switching and bridging – Basic Internetworking (IP, CIDR, ARP, DHCP, ICMP)
    UNIT III ROUTING (9HRS)
    Routing (RIP, OSPF, metrics) – Switch basics – Global Internet (Areas, BGP, IPv6), Multicast – addresses – multicast routing (DVMRP, PIM)
    UNIT IV TRANSPORT LAYER (9HRS)
    Overview of Transport layer – UDP – Reliable byte stream (TCP) – Connection management –
  • 36. Flow control – Retransmission – TCP Congestion control – Congestion avoidance (DECbit, RED) – QoS – Application requirements
    UNIT V APPLICATION LAYER (9HRS)
    Traditional applications – Electronic Mail (SMTP, POP3, IMAP, MIME) – HTTP – Web Services – DNS – SNMP
    TOTAL: 45 Hours
    TEXT BOOK:
    1. Larry L. Peterson, Bruce S. Davie, "Computer Networks: A Systems Approach", Fifth Edition, Morgan Kaufmann Publishers, 2011.
    REFERENCES:
    1. James F. Kurose, Keith W. Ross, "Computer Networking – A Top-Down Approach Featuring the Internet", Fifth Edition, Pearson Education, 2009.
    2. Nader F. Mir, "Computer and Communication Networks", Pearson Prentice Hall Publishers, 2010.
    3. Ying-Dar Lin, Ren-Hung Hwang, Fred Baker, "Computer Networks: An Open Source Approach", McGraw Hill Publisher, 2011.
    4. Behrouz A. Forouzan, "Data Communication and Networking", Fourth Edition, Tata McGraw-Hill, 2011.
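CIDR addressing, listed under Basic Internetworking in Unit II, can be explored with Python's standard `ipaddress` module. The sketch below inspects one prefix, tests address membership and splits the block into smaller subnets; the specific prefix and host addresses are arbitrary private-range examples chosen for illustration.

```python
import ipaddress

# Hypothetical /22 block from the private 192.168.0.0/16 range.
network = ipaddress.ip_network("192.168.4.0/22")

print("netmask:          ", network.netmask)            # 255.255.252.0
print("addresses in block:", network.num_addresses)     # 1024
print("broadcast address:", network.broadcast_address)  # 192.168.7.255

# Membership test: does a host address fall inside the prefix?
inside = ipaddress.ip_address("192.168.7.25") in network
outside = ipaddress.ip_address("192.168.8.1") in network
print("192.168.7.25 in block:", inside)   # True
print("192.168.8.1 in block: ", outside)  # False

# Split the /22 into four /24 subnets.
subnets = list(network.subnets(new_prefix=24))
for subnet in subnets:
    print(subnet)
```

A /22 prefix leaves 10 host bits, hence 2^10 = 1024 addresses, and splitting it into /24 networks yields 2^(24-22) = 4 subnets.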
  • 37. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC2PA / STATISTICS AND DATA SCIENCE LAB
    YEAR / SEMESTER: I / II
    L T P C: 0 0 4 2
    COURSE OBJECTIVES: The students should be able to:
    I. Understand the R programming language.
    II. Gain exposure to solving data science problems.
    III. Understand classification and regression models.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Explore the fundamental concepts of data science
    2. Understand data analysis techniques for applications handling large data
    3. Understand various machine learning algorithms used in the data science process
    4. Visualize and present the inference using various tools
    5. Learn to think through the ethics surrounding privacy, data sharing and algorithmic decision-making
    LIST OF EXPERIMENTS:
    1. R AS CALCULATOR APPLICATION
       a. Using the console with and without R objects
       b. Using mathematical functions on the console
       c. Write an R script to create R objects for the calculator application and save it in a specified location on disk
    2. DESCRIPTIVE STATISTICS IN R
       a. Write an R script to find basic descriptive statistics using summary
       b. Write an R script to find a subset of a dataset by using subset()
    3. READING AND WRITING DIFFERENT TYPES OF DATASETS
       a. Reading different types of data sets (.txt, .csv) from the web and disk and writing to a file in a specific disk location.
       b. Reading an Excel data sheet in R.
       c. Reading an XML dataset in R.
    4. VISUALIZATIONS
       a. Find the data distributions using box and scatter plot.
       b. Find
  • 38. the outliers using plot.
       c. Plot the histogram, bar chart and pie chart on sample data
    5. CORRELATION AND COVARIANCE
       a. Find the correlation matrix.
       b. Plot the correlation plot on the dataset and visualize it, giving an overview of relationships among data on the iris data.
       c. Analysis of covariance: variance (ANOVA), if the data have categorical variables, on the iris data
    6. REGRESSION MODEL
       Import data from web storage. Name the dataset and perform Logistic Regression to find the relation between the variables that affect the admission of a student to an institute, based on his or her GRE score, GPA obtained and rank of the student. Also check whether the model fits. require(foreign), require(MASS).
    7. MULTIPLE REGRESSION MODEL
       Apply multiple regression if the data have a continuous independent variable. Apply it to the above dataset.
    8. REGRESSION MODEL FOR PREDICTION
       Apply regression model techniques to predict the data on the above dataset
    9. CLASSIFICATION MODEL
       a. Install the relevant package for classification.
       b. Choose a classifier for the classification problem.
       c. Evaluate the performance of the classifier
    10. CLUSTERING MODEL
       a. Clustering algorithms for unsupervised classification.
       b. Plot the cluster data using R visualizations.
    TOTAL: 60 Hours
    REFERENCE BOOKS:
    1. Yanchang Zhao, "R and Data Mining: Examples and Case Studies", Elsevier, 1st Edition, 2012
    WEB REFERENCES:
    1. http://www.r-bloggers.com/how-to-perform-a-logistic-regression-in-r/
    2. http://www.ats.ucla.edu/stat/r/dae/rreg.htm
    3. http://www.coastal.edu/kingw/statistics/R-tutorials/logistic.html
    4. http://www.ats.ucla.edu/stat/r/data/binary.csv
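The lab itself is conducted in R, but the arithmetic behind Experiment 5 (correlation and covariance) is language independent. The sketch below works it out in plain Python as an illustration only; it is not the lab's prescribed solution. It uses the sample covariance with an n - 1 denominator and Pearson's correlation coefficient, and the function names and data are hypothetical.

```python
import math


def covariance(xs, ys):
    """Sample covariance with an n - 1 denominator."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical measurements; y is exactly 2 * x, so the correlation is 1.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

print("covariance: ", covariance(x, y))
print("correlation:", correlation(x, y))
```

A correlation matrix, as asked for in Experiment 5a, is this same coefficient evaluated for every pair of numeric columns in the dataset.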
  • 39. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC2PB / DATABASE MANAGEMENT SYSTEM (DBMS) LABORATORY
    YEAR / SEMESTER: I / II
    L T P C: 0 0 4 2
    COURSE OBJECTIVES:
    1. To explain basic database concepts, applications, data models, schemas and instances.
    2. To demonstrate the use of constraints and relational algebra operations.
    3. To describe the basics of SQL and construct queries using SQL.
    4. To emphasize the importance of normalization in databases.
    5. To facilitate students in database design.
    6. To familiarize students with issues of concurrency control and transaction management.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Populate and query a database using SQL commands.
    2. Declare and enforce integrity constraints on a database using a state-of-the-art RDBMS.
    3. Implement indexing on a table.
    4. Program in PL/SQL, including stored procedures, stored functions, cursors and packages.
    5. Solve basic issues of simple database applications and construct a real-time database application using current techniques.
    LIST OF EXPERIMENTS:
    1. To study basic SQL commands (create table, use, drop, insert) and execute the following queries using these commands:
       - Create a table 'Emp' with attributes 'ename', 'ecity', 'salary', 'enumber', 'eaddress', 'depttname'.
       - Create another table 'Company' with attributes 'cname', 'ccity', 'empnumber' in the database 'Employee'.
    2. To study the viewing commands (select, update) and execute the following queries using these commands:
  • 40.
       - Find the names of all employees who live in Delhi.
       - Increase the salary of all employees by Rs. 5,000.
       - Find the company names where the number of employees is greater than 10,000.
       - Change the company city to Gurgaon where the company name is 'TCS'.
    3. To study the commands to modify the structure of a table (alter, delete) and execute the following queries using these commands:
       - Add an attribute named 'Designation' to the table 'Emp'.
       - Modify the table 'Emp': change the datatype of the 'salary' attribute to float.
       - Drop the attribute 'depttname' from the table 'Emp'.
       - Delete the entries from the table 'Company' where the number of employees is less than 500.
    4. To study the commands that involve compound conditions (and, or, in, not in, between, not between, like, not like) and execute the following queries using these commands:
       - Find the names of all employees who live in 'Gurgaon' and whose salary is between Rs. 20,000 and Rs. 30,000.
       - Find the names of all employees whose names begin with either letter 'A' or 'B'.
       - Find the company names where the company city is 'Delhi' and the number of employees is not between 5,000 and 10,000.
       - Find the names of all companies that do not end with letter 'A'.
    5. To study the aggregate functions (sum, count, max, min, average) and execute the following queries using these commands:
       - Find the sum and average of salaries of all employees in the computer science department.
       - Find the number of all employees who live in Delhi.
       - Find the maximum and the minimum salary in the HR department.
    6. To study the grouping commands (group by, order by) and execute the following queries using these commands:
       - List all employee names in descending order.
       - Find the number of employees in each department where the number of employees is greater than 5.
  • 41. List all the department names where the average salary of a department is Rs. 10,000.
    7. To study the commands involving data constraints and execute the following queries using these commands:
       Alter table ‘Emp’ and make ‘enumber’ the primary key.
       Alter table ‘Company’ and add the foreign key constraint.
       Add a check constraint in the table ‘Emp’ such that salary has a value between 0 and Rs. 1,00,000.
       Alter table ‘Company’ and add a unique constraint to column cname.
       Add a default constraint to column ccity of table ‘Company’ with the value ‘Delhi’.
    8. To study the commands for joins (cross join, inner join, outer join) and execute the following queries using these commands:
       Retrieve the complete record of an employee and its company from both tables using joins.
       List all the employees working in the company ‘TCS’.
    9. To study the various set operations and execute the following queries using these commands:
       List the enumber of all employees who live in Delhi and whose company is in Gurgaon or if both conditions are true.
       List the enumber of all employees who live in Delhi but whose company is not in Gurgaon.
    10. To study the various scalar functions and string functions (power, square, substring, reverse, upper, lower, concatenation) and execute the following queries using these commands:
       Reverse the names of all employees.
       Change the names of company cities to uppercase.
       Concatenate name and city of the employee.
    11. To study the commands involving indexes and execute the following queries:
       Create an index with attribute ename on the table employee.
       Create a composite index with attributes cname and ccity on table company.
       Drop all indexes created on table company.
  • 42. 12. To study the conditional controls and case statement in PL/SQL and execute the following queries:
       Calculate the average salary from table ‘Emp’ and print ‘increase the salary’ if the average salary is less than 10,000.
       Display the deptno from the employee table using the case statement: if the deptname is ‘Technical’ then deptno is 1, if the deptname is ‘HR’ then deptno is 2, else deptno is 3.
    13. To study procedures and triggers in PL/SQL and execute the following queries:
       Create a procedure on table employee to display the details of employees by providing the value of salaries during execution.
       Create a trigger on table company for deletion where the whole table is displayed when a delete operation is performed.
    14. Consider the tables given below. The primary keys are made bold and the data types are specified.
       PERSON(driver_id:string, name:string, address:string)
       CAR(regno:string, model:string, year:int)
       ACCIDENT(report_number:int, accd_date:date, location:string)
       OWNS(driver_id:string, regno:string)
       PARTICIPATED(driver_id:string, regno:string, report_number:int, damage_amount:int)
       Create the above tables by properly specifying the primary keys and foreign keys.
       Enter at least five tuples for each relation.
       Demonstrate how you update the damage amount for the car with a specific regno in the accident with report number 12 to 25000.
       Find the total number of people who owned cars that were involved in accidents in the year 2008.
       Find the number of accidents in which cars belonging to a specific model were involved.
    TOTAL: 60 HOURS
    TEXT BOOKS
    1. Abraham Silberschatz, Henry F. Korth and S. Sudarshan, “Database System Concepts”, Seventh Edition, 2017.
  • 43. REFERENCE BOOKS
    1. Ramez Elmasri and Shamkant B. Navathe, “Fundamentals of Database Systems”, Seventh Edition, Pearson Education, 2016.
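A minimal sketch of Experiments 1, 2 and 5, assuming SQLite (through Python's standard `sqlite3` module) as a stand-in for the RDBMS used in the lab. The sample rows and the helper name `build_demo_db` are hypothetical, and SQLite syntax differs in places from Oracle or MySQL (for example, it has no `USE` statement).

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database holding the Emp and Company tables of Experiment 1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Emp (
            enumber   INTEGER,
            ename     TEXT,
            ecity     TEXT,
            salary    REAL,
            eaddress  TEXT,
            depttname TEXT
        );
        CREATE TABLE Company (
            cname     TEXT,
            ccity     TEXT,
            empnumber INTEGER
        );
    """)
    # Illustrative rows only; they are not part of the syllabus.
    conn.executemany(
        "INSERT INTO Emp VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, "Asha", "Delhi", 25000, "12 Ring Road", "HR"),
            (2, "Bala", "Gurgaon", 28000, "4 MG Road", "Technical"),
            (3, "Chitra", "Delhi", 40000, "9 Park Street", "HR"),
        ],
    )
    conn.executemany(
        "INSERT INTO Company VALUES (?, ?, ?)",
        [("TCS", "Mumbai", 12000), ("Acme", "Delhi", 300)],
    )
    return conn

conn = build_demo_db()

# Experiment 2: names of all employees who live in Delhi.
delhi = [row[0] for row in conn.execute(
    "SELECT ename FROM Emp WHERE ecity = 'Delhi' ORDER BY ename")]

# Experiment 2: raise every salary by Rs. 5,000 and move TCS to Gurgaon.
conn.execute("UPDATE Emp SET salary = salary + 5000")
conn.execute("UPDATE Company SET ccity = 'Gurgaon' WHERE cname = 'TCS'")

# Experiment 5: maximum and minimum salary in the HR department.
hr_max, hr_min = conn.execute(
    "SELECT MAX(salary), MIN(salary) FROM Emp WHERE depttname = 'HR'"
).fetchone()

print(delhi, hr_max, hr_min)
```

The same statements can be typed into any SQL console; the Python wrapper is only there to make the sketch self-contained.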
  • 44. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC2PC / OPERATING SYSTEMS AND NETWORKS LABORATORY
    YEAR / SEMESTER: I / II
    L T P C: 0 0 4 2
    COURSE OBJECTIVES:
    1. To understand the functionalities of the various layers of the OSI model.
    2. To explain the differences between hardware, software, operating systems, programs and files.
    3. To identify the purpose of different software applications.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Understand the fundamental underlying principles of computer networking.
    2. Understand the details and functionality of layered network architecture.
    3. Apply mathematical foundations to solve computational problems in computer networking.
    4. Describe and demonstrate the functions and features of current operating systems.
    5. Demonstrate proficiency in common industry software applications (word processing, spreadsheet, presentation, and database) to communicate effectively in a professional business setting.
    6. Demonstrate skills that meet industry standards and certification requirements in the use of system hardware, operating systems technologies, and application systems.
    LIST OF EXPERIMENTS IN NETWORKS
    1. Implement the data link layer framing methods such as character count, character stuffing and bit stuffing.
    2. Implement on a data set of characters the three CRC polynomials CRC-12, CRC-16 and CRC-CCITT.
    3. Implement Dijkstra’s algorithm to compute the shortest path through a graph.
  • 45. 4. Take an example subnet graph with weights indicating delay between nodes.
    5. Now obtain the routing table at each node using the distance vector routing algorithm.
    6. Take an example subnet of hosts. Obtain the broadcast tree for it.
    7. Take a 64-bit plain text and encrypt it using the DES algorithm.
    8. Write a program to break the above DES coding.
    9. Using the RSA algorithm, encrypt a text data and decrypt the same.
    LIST OF EXPERIMENTS IN OPERATING SYSTEM
    10. Simulate the following CPU scheduling algorithms: a) FCFS b) SJF c) Round Robin d) Priority
    11. Simulate MVT and MFT.
    12. Simulate all page replacement algorithms: a) FIFO b) LRU c) Optimal
    13. Simulate all file organization techniques: a) Single level b) Two level
    14. Simulate all file allocation strategies: a) Sequential b) Indexed c) Linked
    15. Simulate Banker's Algorithm for deadlock avoidance.
    TOTAL: 60 HOURS
    TEXT BOOKS / REFERENCES / WEBSITES:
    1. An Introduction to Operating Systems, P.C.P. Bhatt, 2nd Edition, PHI.
    2. Modern Operating Systems, Andrew S. Tanenbaum, 3rd Edition, PHI.
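Two of the listed experiments are small enough to sketch in a few lines. The block below is an illustration in Python, not the lab's prescribed solution: it assumes HDLC-style bit stuffing (a 0 inserted after every run of five 1s, with `01111110` as the frame flag) for the framing experiment, and it assumes all processes arrive at time 0 for the FCFS experiment. The function names are hypothetical.

```python
FLAG = "01111110"  # assumed HDLC-style frame delimiter

def bit_stuff(bits):
    """Insert a 0 after every run of five consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        if b == "1":
            run += 1
            if run == 5:
                out.append("0")  # stuffed bit breaks the run
                run = 0
        else:
            run = 0
    return "".join(out)

def bit_unstuff(bits):
    """Remove the 0 that follows every run of five consecutive 1s."""
    out, run, skip = [], 0, False
    for b in bits:
        if skip:
            # This bit is the stuffed 0; drop it and start a new run.
            skip, run = False, 0
            continue
        out.append(b)
        if b == "1":
            run += 1
            if run == 5:
                skip = True
        else:
            run = 0
    return "".join(out)

def frame(bits):
    """Wrap the stuffed payload in start and end flags."""
    return FLAG + bit_stuff(bits) + FLAG

def fcfs_waiting_times(bursts):
    """Waiting time per process when all arrive at time 0 and run in list order."""
    waits, clock = [], 0
    for burst in bursts:
        waits.append(clock)
        clock += burst
    return waits

payload = "0111111011111111"
print(frame(payload))
waits = fcfs_waiting_times([24, 3, 3])
print(waits, sum(waits) / len(waits))
```

Because the stuffed payload can never contain six consecutive 1s, the flag pattern cannot appear inside a frame, which is the point of the technique.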
  • 46. SEMESTER-III
    PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC31 / PROBABILITY THEORY
    YEAR / SEMESTER: II / III
    L T P C: 3 0 0 3
    COURSE OBJECTIVES: To enable the students to understand the properties and applications of various probability functions.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Demonstrate random variables and their functions.
    2. Infer the expectations for random variable functions and generating functions.
    3. Demonstrate various discrete and continuous distributions and their usage.
    UNIT-1 ALGEBRA OF PROBABILITY (9HRS)
    Algebra of sets – fields and sigma-fields – Inverse function – Measurable function – Probability measure on a sigma-field – simple properties – Probability space – Random variables and random vectors – Induced probability space – Distribution functions – Decomposition of distribution functions.
    UNIT-2 EXPECTATION AND MOMENTS OF RANDOM VARIABLES (9HRS)
    Definitions and simple properties – Moment inequalities – Hölder, Jensen inequalities – Characteristic function – definition and properties – Inversion formula. Convergence of a sequence of random variables – convergence in distribution – convergence in probability – almost sure convergence and convergence in quadratic mean – Weak and complete convergence of distribution functions – Helly-Bray theorem.
    UNIT-3 LAW OF LARGE NUMBERS (9HRS)
  • 47. Khintchine's weak law of large numbers, Kolmogorov's strong law of large numbers (statement only) – Central Limit Theorem – Lindeberg-Lévy theorem, Lindeberg-Feller theorem (statement only), Liapounov theorem – Relation between Liapounov and Lindeberg-Feller forms – Radon-Nikodym theorem and derivative (without proof) – Conditional expectation – definition and simple properties.
    UNIT-4 DISTRIBUTION THEORY (9HRS)
    Distribution of functions of random variables – Laplace, Cauchy, Inverse Gaussian, Lognormal, Logarithmic series and Power series distributions – Multinomial distribution – Bivariate Binomial – Bivariate Poisson – Bivariate Normal – Bivariate Exponential of Marshall and Olkin – Compound, truncated and mixture distributions – Concept of convolution – Multivariate normal distribution (definition and concept only).
    UNIT-5 SAMPLING DISTRIBUTION (9HRS)
    Sampling distributions: non-central chi-square, t and F distributions and their properties – Distributions of quadratic forms under normality – Independence of a quadratic form and a linear form – Cochran’s theorem.
    TOTAL: 45 HOURS
    TEXT BOOKS
    1. Modern Probability Theory, B.R. Bhat, New Age International, 4th Edition, 2014.
    2. An Introduction to Probability and Statistics, V.K. Rohatgi and Saleh, 3rd Edition, 2015.
    REFERENCE BOOKS
    1. Introduction to the Theory of Statistics, A.M. Mood, F.A. Graybill and D.C. Boes, Tata McGraw-Hill, 3rd Edition (Reprint), 2017.
    2. Order Statistics, H.A. David and H.N. Nagaraja, John Wiley & Sons, 3rd Edition, 2003.
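The weak law of large numbers in Unit 3 can be observed numerically. The sketch below is an illustration rather than course material: it assumes a fair six-sided die (mean 3.5) as the i.i.d. sequence and uses a fixed seed so the run is repeatable; `running_means` is a hypothetical helper name.

```python
import random

def running_means(n_rolls, checkpoints, seed=0):
    """Roll a fair die n_rolls times and record the sample mean at each checkpoint."""
    rng = random.Random(seed)
    total = 0
    means = {}
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)
        if i in checkpoints:
            means[i] = total / i
    return means

means = running_means(100_000, {10, 1_000, 100_000})
for n in sorted(means):
    # The gap to the true mean 3.5 typically shrinks as n grows.
    print(n, round(means[n], 4), round(abs(means[n] - 3.5), 4))
```

A simulation of this kind only illustrates convergence in probability for one sample path; it is no substitute for the proof covered in the unit.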
  • 48. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC32 / CLOUD COMPUTING
    YEAR / SEMESTER: II / III
    L T P C: 3 0 0 3
    COURSE OBJECTIVES:
    - To understand the concept of cloud computing.
    - To appreciate the evolution of cloud from the existing technologies.
    - To have knowledge on the various issues in cloud computing.
    - To be familiar with the lead players in cloud.
    - To appreciate the emergence of cloud as the next generation computing paradigm.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    - Articulate the main concepts, key technologies, strengths and limitations of cloud computing.
    - Learn the key and enabling technologies that help in the development of cloud.
    - Develop the ability to understand and use the architecture of compute and storage cloud, service and delivery models.
    - Explain the core issues of cloud computing such as resource management and security.
    - Be able to install and use current cloud technologies.
    - Evaluate and choose the appropriate technologies, algorithms and approaches for implementation and use of cloud.
    UNIT I INTRODUCTION (9HRS)
    Introduction to Cloud Computing – Definition of Cloud – Evolution of Cloud Computing – Underlying Principles of Parallel and Distributed Computing – Cloud Characteristics – Elasticity in Cloud – On-demand Provisioning.
    UNIT II CLOUD ENABLING TECHNOLOGIES (9HRS)
  • 49. Service Oriented Architecture – REST and Systems of Systems – Web Services – Publish-Subscribe Model – Basics of Virtualization – Types of Virtualization – Implementation Levels of Virtualization – Virtualization Structures – Tools and Mechanisms – Virtualization of CPU – Memory – I/O Devices – Virtualization Support and Disaster Recovery.
    UNIT III CLOUD ARCHITECTURE, SERVICES AND STORAGE (9HRS)
    Layered Cloud Architecture Design – NIST Cloud Computing Reference Architecture – Public, Private and Hybrid Clouds – IaaS – PaaS – SaaS – Architectural Design Challenges – Cloud Storage – Storage-as-a-Service – Advantages of Cloud Storage – Cloud Storage Providers – S3.
    UNIT IV RESOURCE MANAGEMENT AND SECURITY IN CLOUD (9HRS)
    Inter Cloud Resource Management – Resource Provisioning and Resource Provisioning Methods – Global Exchange of Cloud Resources – Security Overview – Cloud Security Challenges – Software-as-a-Service Security – Security Governance – Virtual Machine Security – IAM – Security Standards.
    UNIT V CLOUD TECHNOLOGIES AND ADVANCEMENTS (9HRS)
    Hadoop – MapReduce – VirtualBox – Google App Engine – Programming Environment for Google App Engine – OpenStack – Federation in the Cloud – Four Levels of Federation – Federated Services and Applications – Future of Federation.
    TOTAL: 45 HOURS
    REFERENCE BOOKS:
    1. Kai Hwang, Geoffrey C. Fox, Jack G. Dongarra, “Distributed and Cloud Computing, From Parallel Processing to the Internet of Things”, Morgan Kaufmann Publishers, 2013.
    2. John W. Rittinghouse and James F. Ransome, “Cloud Computing: Implementation, Management and Security”, CRC Press, 2017.
  • 50. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC33 / ADVANCED DATABASE TECHNOLOGIES
    YEAR / SEMESTER: II / III
    L T P C: 3 0 0 3
    COURSE OBJECTIVES:
    - Be familiar with a commercial relational database system (Oracle) by writing SQL using the system.
    - Be familiar with relational database theory, and be able to write relational algebra expressions for queries.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Understand the fundamental concepts of Database Management Systems and the Entity Relationship Model, and develop ER models.
    2. Build SQL queries to perform data creation and data manipulation operations on databases.
    3. Understand the concepts of functional dependencies and normalization, and apply such knowledge to the normalization of a database.
    4. Identify the issues related to query processing and transaction management in database management systems.
    5. Analyze the trends in data storage, query processing and concurrency control of modern database technologies.
    UNIT I PARALLEL AND DISTRIBUTED DATABASES (9HRS)
    Database System Architectures: Centralized and Client-Server Architectures – Server System Architectures – Parallel Systems – Distributed Systems – Parallel Databases: I/O Parallelism – Inter and Intra Query Parallelism – Inter and Intra Operation Parallelism – Distributed Database Concepts – Distributed Data Storage – Distributed Transactions –
  • 51. Commit Protocols – Concurrency Control – Distributed Query Processing – Three Tier Client Server Architecture – Case Studies.
    UNIT II OBJECT AND OBJECT RELATIONAL DATABASES (9HRS)
    Concepts for Object Databases: Object Identity – Object Structure – Type Constructors – Encapsulation of Operations – Methods – Persistence – Type and Class Hierarchies – Inheritance – Complex Objects – Object Database Standards, Languages and Design: ODMG Model – ODL – OQL – Object-Relational and Extended-Relational Systems: Object Relational features in SQL / Oracle – Case Studies.
    UNIT III XML DATABASES (9HRS)
    XML Databases: XML Data Model – DTD – XML Schema – XML Querying – Web Databases – JDBC – Information Retrieval – Data Warehousing – Data Mining.
    UNIT IV MOBILE DATABASES (9HRS)
    Mobile Databases: Location and Handoff Management – Effect of Mobility on Data Management – Location Dependent Data Distribution – Mobile Transaction Models – Concurrency Control – Transaction Commit Protocols – Mobile Database Recovery Schemes.
    UNIT V INTELLIGENT DATABASES (9HRS)
    Active Databases – Deductive Databases – Knowledge Bases – Multimedia Databases – Multidimensional Data Structures – Image Databases – Text/Document Databases – Video Databases – Audio Databases – Multimedia Database Design.
    TOTAL: 45 HOURS
    TEXT BOOKS
    1. Henry F. Korth and Abraham Silberschatz, “Database System Concepts”, McGraw Hill, 2019.
    2. Thomas Connolly and Carolyn Begg, “Database Systems, A Practical Approach to Design, Implementation and Management”, Third Edition, Pearson Education, 2001.
    3. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, 2nd Edition, John
  • 52. Wiley & Sons, Inc., New York, USA, 2002.
    REFERENCE BOOKS
    1. Lior Rokach and Oded Maimon, Data Mining and Knowledge Discovery Handbook, Springer, 2nd Edition, 2010.
  • 53. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC34 / WEB PROGRAMMING
    YEAR / SEMESTER: II / III
    L T P C: 3 0 0 3
    COURSE OBJECTIVES: The students should be able to
    1. Understand the technologies used in Web Programming.
    2. Know the importance of object oriented aspects of scripting.
    3. Understand creating database connectivity using JDBC.
    4. Learn the concepts of web based applications using sockets.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Design web pages.
    2. Use technologies of Web Programming.
    3. Apply object oriented aspects to scripting.
    4. Create databases with connectivity using JDBC.
    5. Build web based applications using sockets.
    UNIT I SCRIPTING (9HRS)
    Web page designing using HTML – Scripting basics – Client side and server side scripting. JavaScript: objects, names, literals, operators and expressions – statements and features – events – windows – documents – frames – data types – built-in functions – Browser object model – Verifying forms – HTML5 – CSS3 – HTML5 canvas – Web site creation using tools.
    UNIT II JAVA (9HRS)
    Introduction to object oriented programming – Features of Java – Data types, variables and arrays – Operators – Control statements – Classes and Methods – Inheritance. Packages and Interfaces – Exception Handling – Multithreaded Programming – Input/Output – Files – Utility Classes – String Handling.
    UNIT III JDBC (9HRS)
  • 54. JDBC Overview – JDBC implementation – Connection class – Statements – Catching Database Results, handling database queries. Networking – InetAddress class – URL class – TCP sockets – UDP sockets, Java Beans – RMI.
    UNIT IV APPLETS (9HRS)
    Java applets – Life cycle of an applet – Adding images to an applet – Adding sound to an applet. Passing parameters to an applet. Event Handling. Introducing AWT: Working with Windows, Graphics and Text. Using AWT Controls, Layout Managers and Menus. Servlet – Life cycle of a servlet. The Servlet API, Handling HTTP Request and Response, using Cookies, Session Tracking. Introduction to JSP.
    UNIT V XML AND WEB SERVICES (9HRS)
    XML – Introduction – Form Navigation – XML Documents – XSL – XSLT – Web services – UDDI – WSDL – Java web services – Web resources.
    TOTAL: 45 HOURS
    TEXT BOOKS:
    1. Harvey Deitel, Abbey Deitel, Internet and World Wide Web: How To Program, 5th Edition.
    2. Herbert Schildt, Java - The Complete Reference, 7th Edition, Tata McGraw-Hill Edition.
    3. Michael Morrison, XML Unleashed, Techmedia, SAMS.
    REFERENCE BOOKS:
    1. John Pollock, JavaScript - A Beginner's Guide, 3rd Edition, Tata McGraw-Hill Edition.
    2. Keyur Shah, Gateway to Java Programmer Sun Certification, Tata McGraw Hill, 2002.
  • 55. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC35 / DATA MINING
    YEAR / SEMESTER: II / III
    L T P C: 3 0 0 3
    COURSE OBJECTIVES:
    - To understand data mining principles and techniques and introduce data mining as a cutting edge business intelligence tool.
    - To expose the students to the concepts of data warehousing architecture and implementation.
    - To study the overview of developing areas: web mining, text mining and ethical aspects of data mining.
    - To identify business applications and trends of data mining.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Evolve a multidimensional intelligent model from a typical system.
    2. Discover the knowledge imbibed in the high dimensional system.
    3. Evaluate various mining techniques on complex data objects.
    UNIT I INTRODUCTION TO DATA MINING (9HRS)
    Data mining – KDD versus data mining, Stages of the Data Mining Process – task primitives, Data Mining Techniques – Data mining knowledge representation – Data mining query languages, Integration of a Data Mining System with a Data Warehouse – Issues, Data preprocessing – Data cleaning, Data transformation, Feature selection, Dimensionality reduction, Discretization and generating concept hierarchies – Mining frequent patterns – association – correlation.
    UNIT II CLASSIFICATION AND CLUSTERING (9HRS)
    Decision Tree Induction – Bayesian Classification – Rule Based Classification – Classification by Back propagation – Support Vector Machines – Associative Classification – Lazy Learners – Other Classification Methods – Clustering techniques – Partitioning methods – k-means –
  • 56. Hierarchical Methods – distance based agglomerative and divisive clustering, Density-Based Methods – expectation maximization – Grid-Based Methods – Model-Based Clustering Methods – Constraint-Based Cluster Analysis – Outlier Analysis.
    UNIT III PREDICTIVE MODELING OF BIG DATA AND TRENDS IN DATA MINING (9HRS)
    Statistics and Data Analysis – EDA – Small and Big Data – Logistic Regression Model – Ordinary Regression Model – Mining complex data objects – Spatial databases – Temporal databases – Multimedia databases – Time series and sequence data – Text mining – Web mining – Applications in Data mining.
    UNIT IV INTRODUCTION TO DATA WAREHOUSING (9HRS)
    Evolution of Decision Support Systems – Data warehousing Components – Building a Data warehouse, Data Warehouse and DBMS, Data marts, Metadata, Multidimensional data model, OLAP vs OLTP, OLAP operations, Data cubes, Schemas for Multidimensional Database: Stars, Snowflakes and Fact constellations.
    UNIT V DATA WAREHOUSE PROCESS AND ARCHITECTURE (9HRS)
    Types of OLAP servers, 3-Tier data warehouse architecture, distributed and virtual data warehouses. Data warehouse implementation, tuning and testing of data warehouse. Data Staging (ETL) Design and Development, data warehouse visualization, Data Warehouse Deployment, Maintenance, Growth, Business Intelligence Overview – Data Warehousing and Business Intelligence Trends – Business Applications – tools – SAS.
    TOTAL: 45 HOURS
    TEXT BOOKS:
    1. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, third edition, 2011, ISBN: 1558604898.
    2. Alex Berson and Stephen J. Smith, “Data Warehousing, Data Mining & OLAP”, Tata McGraw Hill Edition, Tenth Reprint, 2007.
    3. G. K. Gupta, “Introduction to Data Mining with Case Studies”, Eastern Economy Edition, Prentice Hall of India, 2006.
    4. Data Mining: Practical Machine Learning Tools and Techniques, Third Edition (The Morgan Kaufmann Series in Data Management Systems), Ian H. Witten, Eibe Frank and
  • 57. Mark A. Hall, 2011.
    5. Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data.
    REFERENCE BOOKS:
    1. Mehmed Kantardzic, “Data Mining: Concepts, Models, Methods, and Algorithms”, Wiley Interscience, 2003.
    2. Ian Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques, third edition, Morgan Kaufmann, 2011.
    3. George M. Marakas, Modern Data Warehousing, Mining and Visualization, Prentice Hall, 2003.
  • 58. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC36 / OPERATION RESEARCH
    YEAR / SEMESTER: II / III
    L T P C: 3 0 0 3
    COURSE OBJECTIVES:
    - To provide basic knowledge of computer operating system structures and functioning.
    - To study process management.
    - To learn the basics of memory management.
    - To understand the structure of file and I/O systems.
    - To be familiar with some operating systems.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Implement the various system calls and inter-process communication.
    2. Apply various processor scheduling algorithms and handle process synchronization problems.
    3. Apply various memory management techniques to a given situation.
    4. Apply various disk management techniques.
    5. Apply various processor scheduling algorithms and memory management techniques for popular operating systems: Linux, Windows and Android.
    UNIT I OPERATING SYSTEMS OVERVIEW (9HRS)
    Introduction to operating systems – Computer system organization – architecture – Operating system structure – operations – Process, memory, storage management – Protection and security – Distributed systems – Computing environments – Open source operating systems – OS services – User interface – System calls – System programs – Process concept – scheduling – Operations on processes – Cooperating processes – Inter-process communication – Threads.
    UNIT II PROCESS MANAGEMENT (9HRS)
    Basic concepts – Scheduling criteria – Scheduling algorithms – Multiple processor scheduling – Algorithm evaluation – The critical section problem – Synchronization hardware – Semaphores – Classic problems of synchronization – Critical regions – Monitors – Deadlocks – Deadlock
  • 59. characterization – Methods for handling deadlocks – Deadlock prevention – Deadlock avoidance – Deadlock detection – Recovery from deadlock.
    UNIT III MEMORY MANAGEMENT (9HRS)
    Memory management – Swapping – Contiguous memory allocation – Paging – Segmentation – Segmentation with paging – Virtual memory – Demand paging – Copy on write – Page replacement – Allocation of frames – Thrashing.
    UNIT IV FILE AND I/O SYSTEMS (9HRS)
    File concept – Access methods – Directory structure – File-system mounting – Protection – Directory implementation – Allocation methods – Free space management – Disk scheduling – Disk management – Swap space management – Protection – I/O Systems – I/O Hardware – Application I/O Interface – Kernel I/O subsystem.
    UNIT V CASE STUDY (9HRS)
    Linux system – History – Design principles – Kernel modules – Process management – Scheduling – Memory management – File systems – Input and output – Inter-Process Communication – Network structure – Security – Windows 8 – History – Design principles – Android OS – History – Design principles.
    TOTAL: 45 HOURS
    TEXT BOOKS:
    1. Abraham Silberschatz, Peter B. Galvin, Greg Gagne, “Operating System Concepts Essentials”, John Wiley & Sons Inc., Ninth Edition, 2013.
    2. Reto Meier, “Professional Android 4 Application Development”, John Wiley and Sons, 2012.
    REFERENCE BOOKS:
    1. Andrew S. Tanenbaum, “Modern Operating Systems”, Pearson Education, Fourth Edition, 2015.
    2. Charles Crowley, “Operating Systems: A Design-Oriented Approach”, Tata McGraw Hill Education, 2017.
    3. D M Dhamdhere, “Operating Systems: A Concept-based Approach”, Tata McGraw-Hill Education, Third Edition, 2012.
    4. William Stallings, “Operating Systems: Internals and Design Principles”, Prentice Hall,
  • 60. Seventh Edition, 2011.
    Extensive Reading:
    1. http://nptel.ac.in
    2. http://nptel.ac.in/downloads/106108055/
    3. http://cseweb.ucsd.edu/classes/fa06/cse120/lectures/120-fa06-l13.pdf
    4. http://www.cs.kent.edu/~farrell/osf03/oldnotes/
  • 61. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC3PA / DATA MINING LAB
    YEAR / SEMESTER: II / III
    L T P C: 0 0 4 2
    COURSE OBJECTIVES: The students should be able to
    1. Be familiar with the algorithms of data mining.
    2. Be acquainted with the tools and techniques used for Knowledge Discovery in Databases.
    3. Be exposed to web mining and text mining.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    - Apply data mining techniques and methods to large data sets.
    - Use data mining tools.
    - Compare and contrast the various classifiers.
    LIST OF EXPERIMENTS:
    1. Creation of a Data Warehouse.
    2. Apriori Algorithm.
    3. FP-Growth Algorithm.
    4. K-means clustering.
    5. One Hierarchical clustering algorithm.
    6. Bayesian Classification.
    7. Decision Tree.
    8. Support Vector Machines.
    9. Applications of classification for web mining.
    10. Case Study on Text Mining or any commercial application.
    TOTAL: 60 HOURS
    TEXT BOOKS:
  • 62. 1. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, third edition, 2011, ISBN: 1558604898.
    2. Alex Berson and Stephen J. Smith, “Data Warehousing, Data Mining & OLAP”, Tata McGraw Hill Edition, Tenth Reprint, 2007.
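As a starting point for Experiment 4 (K-means clustering), here is a minimal sketch of Lloyd's algorithm in plain Python. It assumes 2-D points, squared Euclidean distance and, to keep the run deterministic, seeds the centroids with the first k points; a lab solution would more likely use a data mining tool or random initialisation. All names and the sample data are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def assign(points, centroids):
    """Group points by the index of their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, k, max_iterations=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(p) for p in points[:k]]  # assumed deterministic seeding
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        updated = [
            (sum(x for x, _ in members) / len(members),
             sum(y for _, y in members) / len(members))
            if members else centroids[i]  # keep an empty cluster's centroid in place
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

# Made-up sample: two visibly separated groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

The result of K-means depends on the initial centroids, so with other data the first-k seeding used here may settle on a poorer partition than a randomised restart would.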
  • 63. PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC3PB / CLOUD COMPUTING AND WEB PROGRAMMING LAB
    YEAR / SEMESTER: II / III
    L T P C: 0 0 4 2
    COURSE OBJECTIVES:
    - To develop web applications in cloud.
    - To learn the design and development process involved in creating a cloud based application.
    - To learn to implement and use parallel programming using Hadoop.
    - To learn to implement embedded devices in IoT.
    - To design and develop IoT devices.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    - Design and deploy a web application in a PaaS environment.
    - Learn how to simulate a cloud environment to implement new schedulers.
    - Install and use a generic cloud environment that can be used as a private cloud.
    - Make use of a cloud platform to upload and analyse any sensor data.
    - Use cascading style sheets to design web pages.
    LIST OF EXPERIMENTS:
    CLOUD COMPUTING
    1. Install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of Windows 7 or 8.
    2. Install a C compiler in the virtual machine created using VirtualBox and execute simple programs.
    3. Install Google App Engine. Create a hello world app and other simple web applications
  • 64. using Python/Java.
    4. Use the GAE launcher to launch the web applications.
    5. Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in CloudSim.
    6. Find a procedure to transfer files from one virtual machine to another virtual machine.
    7. Find a procedure to launch a virtual machine using TryStack (Online OpenStack Demo Version).
    8. Install a Hadoop single node cluster and run simple applications like wordcount.
    WEB PROGRAMMING
    1. Develop and demonstrate an XHTML file that includes JavaScript script for the following problems:
       a) Input: A number n obtained using prompt. Output: The first n Fibonacci numbers.
       b) Input: A number n obtained using prompt. Output: A table of numbers from 1 to n and their squares using alert.
    2. a) Develop and demonstrate, using JavaScript script, an XHTML document that collects the USN (the valid format is: a digit from 1 to 4 followed by two upper-case characters followed by two digits followed by two upper-case characters followed by three digits; no embedded spaces allowed) of the user. An event handler must be included for the form element that collects this information to validate the input. Messages in the alert windows must be produced when errors are detected.
       b) Modify the above program to get the current semester also (restricted to be a number from 1 to 8).
    3. a) Develop and demonstrate, using JavaScript script, an XHTML document that contains three short paragraphs of text, stacked on top of each other, with only enough of each showing so that the mouse cursor can be placed over some part of them. When the cursor is placed over the exposed part of any paragraph, it should rise to the top to become completely visible.
       b) Modify the above document so that when a paragraph is moved from the top stacking position, it returns to its original position rather than to the bottom.
    4. a) Design an XML document to store information about a student in an engineering college affiliated to VTU. The information must include USN, Name, Name of the College, Branch,
  • 65. Year of Joining, and e-mail id. Make up sample data for 3 students. Create a CSS style sheet and use it to display the document.
       b) Create an XSLT style sheet for one student element of the above document and use it to create a display of that element.
    5. a) Write a Perl program to display various server information like Server Name, Server Software, Server Protocol, CGI Revision etc.
       b) Write a Perl program to accept a UNIX command from an HTML form and to display the output of the command executed.
    6. a) Write a Perl program to accept the User Name and display a greeting message randomly chosen from a list of 4 greeting messages.
       b) Write a Perl program to keep track of the number of visitors visiting the web page and to display this count of visitors, with proper headings.
    7. Write a Perl program to display a digital clock which displays the current time of the server.
    TOTAL: 60 HOURS
    REFERENCE BOOKS:
    1. Kai Hwang, Geoffrey C. Fox, Jack G. Dongarra, “Distributed and Cloud Computing, From Parallel Processing to the Internet of Things”, Morgan Kaufmann Publishers, 2013.
    2. Harvey Deitel, Abbey Deitel, Internet and World Wide Web: How To Program, 5th Edition.
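The logic behind Web Programming experiments 1 a) and 2 a) can be prototyped before writing the XHTML/JavaScript the lab asks for. The sketch below uses Python only to illustrate that logic; it assumes the Fibonacci sequence starts at 0, and the function names are hypothetical. The regular expression translates the stated USN format directly and should carry over to a JavaScript `RegExp` with little change.

```python
import re

# One digit 1-4, two upper-case letters, two digits, two upper-case letters, three digits.
USN_PATTERN = re.compile(r"[1-4][A-Z]{2}[0-9]{2}[A-Z]{2}[0-9]{3}")

def is_valid_usn(text):
    """True only if the whole string matches the USN format (no embedded spaces)."""
    return USN_PATTERN.fullmatch(text) is not None

def first_fibonacci(n):
    """Return the first n Fibonacci numbers, assuming the sequence starts 0, 1."""
    numbers, a, b = [], 0, 1
    for _ in range(n):
        numbers.append(a)
        a, b = b, a + b
    return numbers

for candidate in ["1RV17CS001", "5RV17CS001", "1rv17cs001", "1RV 17CS001"]:
    print(candidate, is_valid_usn(candidate))
print(first_fibonacci(7))
```

Using `fullmatch` rather than `search` matters here: a partial match would accept input with leading or trailing characters, which the experiment's format rules out.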
  • 66. SEMESTER-IV
    PROGRAM NAME: B.SC-DATA SCIENCE
    COURSE CODE / NAME: UDASC41 / DATA HANDLING AND VISUALIZATION
    YEAR / SEMESTER: II / IV
    L T P C: 3 0 0 3
    COURSE OBJECTIVE: The course is designed to enable students to know the basics of data visualization and to understand the importance of data visualization and the design and use of visual components and basic algorithms.
    COURSE OUTCOMES: Upon completion of this course, the students should be able to:
    1. Understand the basics of data visualization.
    2. Implement visualization of distributions.
    3. Write programs on visualization of time series, proportions and associations.
    4. Apply visualization to trends and uncertainty.
    5. Explain principles of proportions.
    UNIT 1: INTRODUCTION TO VISUALIZATION (9HRS)
    Visualizing Data: Mapping Data onto Aesthetics – Aesthetics and Types of Data, Scales Map Data Values onto Aesthetics. Coordinate Systems and Axes – Cartesian Coordinates, Nonlinear Axes, Coordinate Systems with Curved Axes. Color Scales – Color as a Tool to Distinguish, Color to Represent Data Values, Color as a Tool to Highlight. Directory of Visualizations – Amounts, Distributions, Proportions, x–y Relationships, Geospatial Data.
    UNIT 2: VISUALIZING DISTRIBUTIONS (9HRS)
    Visualizing Amounts – Bar Plots, Grouped and Stacked Bars, Dot Plots and Heatmaps. Visualizing Distributions: Histograms and Density Plots – Visualizing a Single Distribution,
  • 67. Visualizing Multiple Distributions at the Same Time; Visualizing Distributions (Empirical Cumulative Distribution Functions and Q-Q Plots): Empirical Cumulative Distribution Functions, Highly Skewed Distributions, Quantile-Quantile Plots; Visualizing Many Distributions at Once: Visualizing Distributions Along the Vertical Axis, Visualizing Distributions Along the Horizontal Axis
UNIT 3: VISUALIZING ASSOCIATIONS & TIME SERIES (9HRS)
Visualizing Proportions: A Case for Pie Charts, A Case for Side-by-Side Bars, A Case for Stacked Bars and Stacked Densities, Visualizing Proportions Separately as Parts of the Total; Visualizing Nested Proportions: Nested Proportions Gone Wrong, Mosaic Plots and Treemaps, Nested Pies, Parallel Sets; Visualizing Associations Among Two or More Quantitative Variables: Scatterplots, Correlograms, Dimension Reduction, Paired Data; Visualizing Time Series and Other Functions of an Independent Variable: Individual Time Series, Multiple Time Series and Dose–Response Curves, Time Series of Two or More Response Variables
UNIT 4: VISUALIZING UNCERTAINTY (9HRS)
Visualizing Trends: Smoothing, Showing Trends with a Defined Functional Form, Detrending and Time-Series Decomposition; Visualizing Geospatial Data: Projections, Layers, Choropleth Mapping, Cartograms; Visualizing Uncertainty: Framing Probabilities as Frequencies, Visualizing the Uncertainty of Point Estimates, Visualizing the Uncertainty of Curve Fits, Hypothetical Outcome Plots
UNIT 5: PRINCIPLE OF PROPORTIONAL INK (9HRS)
The Principle of Proportional Ink: Visualizations Along Linear Axes, Visualizations Along Logarithmic Axes, Direct Area Visualizations; Handling Overlapping Points: Partial Transparency and Jittering, 2D Histograms, Contour Lines; Common Pitfalls of Color Use: Encoding Too Much or Irrelevant Information, Using Nonmonotonic Color Scales to Encode Data Values, Not Designing for Color-Vision Deficiency.
TOTAL: 45 HOURS
TEXT BOOKS
1. Claus Wilke, “Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures”, 1st edition, O’Reilly Media Inc, 2019.
  • 68. REFERENCE BOOKS
1. Tony Fischetti, Brett Lantz, “R: Data Analysis and Visualization”, O’Reilly, 2016.
2. Ossama Embarak, “Data Analysis and Visualization Using Python: Analyze Data to Create Visualizations for BI Systems”, Apress, 2018.
E BOOKS
1. https://www.netquest.com/hubfs/docs/ebook-data-visualization-EN.pdf
MOOC
1. https://www.coursera.org/learn/data-visualization
2. https://www.coursera.org/learn/python-for-data-visualization#syllabus
  • 69. PROGRAM NAME: B.SC-DATA SCIENCE
COURSE CODE / NAME: UDASC42 / MACHINE LEARNING
YEAR / SEMESTER: II / IV
L T P C: 3 0 0 3
COURSE OBJECTIVES: The objective of this course is to introduce the principles and design of machine learning algorithms. The course aims to provide foundations for the conceptual aspects of machine learning algorithms, along with their applications to real-world problems.
COURSE OUTCOMES: Upon completion of this course, the students should be able to:
1. Identify various machine learning algorithms and terminologies and perform data pre-processing using a standard ML library.
2. Design a predictive model using appropriate supervised learning algorithms to solve a given problem.
3. Develop an application using appropriate unsupervised learning algorithms for performing clustering and dimensionality reduction.
4. Solve complex problems using artificial neural networks and kernel machines.
5. Implement probabilistic graphical models for suitable applications.
UNIT-I INTRODUCTION TO ARTIFICIAL NEURAL NETWORKS (9HRS)
Fundamentals of Neural Networks – Model of Artificial Neuron – Neural Network Architectures – Learning Methods – Taxonomy of Neural Network Architectures – Applications
UNIT II FEED FORWARD NEURAL NETWORKS (9HRS)
Perceptron Models: Discrete, Continuous and Multi-Category – Training Algorithms: Discrete and Continuous Perceptron Networks – Limitations of the Perceptron Model – Credit Assignment Problem – Generalized Delta Rule, Derivation of Backpropagation (BP) Training, and Summary of Backpropagation Algorithm – Kolmogorov Theorem
UNIT III: MACHINE LEARNING (9HRS)
Machine Learning Fundamentals – Types of Machine Learning: Supervised, Unsupervised,
  • 70. Reinforcement – The Machine Learning process. Terminologies in ML – Testing ML algorithms: Overfitting, Training, Testing and Validation Sets – Confusion matrix – Accuracy metrics – ROC Curve – Basic Statistics: Averages, Variance and Covariance, The Gaussian – The Bias-Variance trade-off – Applications of Machine Learning.
UNIT IV: SUPERVISED LEARNING (9HRS)
Regression: Linear Regression – Multivariate Regression – Classification: Linear Discriminant Analysis, Logistic Regression – K-Nearest Neighbor classifier. Decision Tree based methods for classification and Regression – Ensemble methods.
UNIT V: UNSUPERVISED LEARNING (9HRS)
Clustering – K-Means clustering, Hierarchical clustering – The Curse of Dimensionality – Dimensionality Reduction – Principal Component Analysis – Probabilistic PCA – Independent Component Analysis
TOTAL: 45 HOURS
TEXT BOOKS
1. Charu C. Aggarwal, “Neural Networks and Deep Learning”, Springer International Publishing, 2018.
2. Satish Kumar, “Neural Networks, A Classroom Approach”, Tata McGraw-Hill, 2007.
3. Kevin P. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.
4. Stephen Marsland, “Machine Learning – An Algorithmic Perspective”, CRC Press, 2009.
5. Saikat Dutt, Subramanian Chandramouli, Amit Kumar Das, “Machine Learning”, Pearson Education, 2018.
6. Christopher Bishop, “Pattern Recognition and Machine Learning”, Springer, 2011.
REFERENCE BOOKS
1. Andreas C. Muller, “Introduction to Machine Learning with Python: A Guide for Data Scientists”, O'Reilly, 2016.
2. Sebastian Raschka, “Python Machine Learning”, Packt Publishing, 2015.
3. Hastie, Tibshirani, Friedman, “The Elements of Statistical Learning: Data Mining, Inference, and Prediction”, 2nd Edition, Springer, 2017.
4. Ethem Alpaydin, “Introduction to Machine Learning”, 2nd Revised edition, MIT Press, 2010.
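As an illustration of the Unit III topics on testing ML algorithms (confusion matrix and accuracy metrics), the following is a minimal sketch in plain Python. The function names `confusion_counts` and `metrics`, and the sample labels, are hypothetical and chosen only for this example; they are not part of the syllabus.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall); undefined ratios become 0.0."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts, metrics(*counts))
```

Returning 0.0 for an undefined ratio is a simplifying choice made here; a library implementation may instead warn or raise in that case.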
  • 71. PROGRAM NAME: B.SC-DATA SCIENCE
COURSE CODE / NAME: UDASC44 / OPTIMIZATION TECHNIQUES
YEAR / SEMESTER: II / IV
L T P C: 3 0 0 3
COURSE OBJECTIVES: To impart knowledge of the various categories of existing engineering problems and of solutions to such problems through different optimization techniques and approaches.
COURSE OUTCOMES: At the end of the course, the students should be able to:
1. Relate key concepts and applications of various optimization techniques
2. Identify the appropriate optimization technique for the given problem
3. Formulate appropriate objective functions and constraints to solve real-life optimization problems
UNIT I INTRODUCTION (9HRS)
Statement of an optimization problem – classification of optimization problems – classical optimization techniques: single variable optimization, multivariable optimization with equality constraints, inequality constraints, and no constraints.
UNIT II LINEAR PROGRAMMING (9HRS)
Graphical method for two-dimensional problems – central problems of Linear Programming – Definitions – Simplex Algorithm – Phase I and II of Simplex Method – Revised Simplex Method – Simplex Multipliers – Dual and Primal – Dual Simplex Method – Sensitivity Analysis – Transportation problem and its solution – Assignment problem and its solution by Hungarian method – Karmarkar’s method: statement, conversion of the Linear Programming problem into the required form, algorithm.
UNIT III NON LINEAR PROGRAMMING (9HRS)
NON LINEAR PROGRAMMING (ONE DIMENSIONAL MINIMIZATION): Introduction –
  • 72. Unrestricted search – Exhaustive search – Interval halving method – Fibonacci method.
NON LINEAR PROGRAMMING (UNCONSTRAINED OPTIMIZATION): Introduction – Random search method – Univariate method – Pattern search methods – Hooke and Jeeves method, simplex method – Gradient of a function – Steepest descent method – Conjugate gradient method.
NON LINEAR PROGRAMMING (CONSTRAINED OPTIMIZATION): Introduction – Characteristics of the problem – Random search method – Conjugate gradient method.
UNIT IV DYNAMIC PROGRAMMING (9HRS)
Introduction – Multistage decision processes – Principles of optimality – Computation procedures.
UNIT V DECISION MAKING (9HRS)
Decisions under uncertainty, under certainty and under risk – Decision trees – Expected value of perfect information and imperfect information.
TOTAL: 45 HOURS
TEXT BOOKS:
1. Kalyanmoy Deb, “Optimization for Engineering Design, Algorithms and Examples”, Prentice Hall, 2012.
2. Hamdy A Taha, “Operations Research – An Introduction”, Pearson Education, 2017.
REFERENCE BOOKS:
1. Hillier / Lieberman, “Introduction to Operations Research”, Tata McGraw Hill Publishing Company Ltd, 2002.
2. Singiresu S Rao, “Engineering Optimization Theory and Practice”, New Age International, 1996.
3. Mik Wisniewski, “Quantitative Methods for Decision Makers”, Macmillan Press Ltd., 1994.
4. Kambo N S, “Mathematical Programming Techniques”, Affiliated East-West Press, 1991.
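To make the one-dimensional minimization topics of Unit III more concrete, here is a minimal sketch of the interval halving method in Python. It assumes the objective is unimodal on the starting interval; the function name `interval_halving`, the tolerance, and the test function are illustrative assumptions, not taken from the syllabus or its textbooks.

```python
def interval_halving(f, a, b, tol=1e-6, max_iter=200):
    """Approximate the minimizer of a unimodal function f on [a, b].

    Each iteration evaluates f at the quarter points and keeps the half
    of the interval that must contain the minimum.
    """
    x0 = (a + b) / 2.0          # current midpoint
    f0 = f(x0)
    for _ in range(max_iter):
        length = b - a
        if length < tol:
            break
        x1 = a + length / 4.0   # left quarter point
        x2 = b - length / 4.0   # right quarter point
        f1, f2 = f(x1), f(x2)
        if f1 < f0:
            # Minimum lies in [a, x0]; x1 is the midpoint of that half.
            b, x0, f0 = x0, x1, f1
        elif f2 < f0:
            # Minimum lies in [x0, b]; x2 is the midpoint of that half.
            a, x0, f0 = x0, x2, f2
        else:
            # Minimum lies in [x1, x2]; the midpoint x0 is unchanged.
            a, b = x1, x2
    return x0


# Example: minimize (x - 2)^2 on [0, 5].
print(interval_halving(lambda x: (x - 2.0) ** 2, 0.0, 5.0))
```

The interval length halves on every iteration, so the number of iterations needed grows only logarithmically with the required precision.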
  • 73. PROGRAM NAME: B.SC-DATA SCIENCE
COURSE CODE / NAME: UDASC45 / BIG DATA ANALYTICS
YEAR / SEMESTER: II / IV
L T P C: 3 0 0 3
COURSE OBJECTIVES:
• To optimize business decisions and create competitive advantage with Big Data analytics.
• To explore the fundamental concepts of big data analytics.
• To learn to analyze big data using intelligent techniques.
• To understand the various search methods and visualization techniques.
• To learn to use various techniques for mining data streams.
• To understand the applications using Map Reduce concepts.
• To introduce the programming tools PIG & HIVE in the Hadoop ecosystem.
COURSE OUTCOMES: Upon completion of this course, the students should be able to:
1. Work with a big data platform and explore big data analytics techniques for business applications.
2. Design efficient algorithms for mining data from large volumes.
3. Analyze the HADOOP and Map Reduce technologies associated with big data analytics.
4. Explore Big Data applications using Pig and Hive.
5. Understand the fundamentals of various big data analytics techniques.
6. Build a complete business data analytics solution.
UNIT-I INTRODUCTION TO BIG DATA (9HRS)
Introduction to Big Data Platform – Challenges of Conventional Systems – Intelligent data analysis – Nature of Data – Analytic Processes and Tools – Analysis vs Reporting.
UNIT-II MINING DATA STREAMS (9HRS)
Introduction to Streams Concepts – Stream Data Model and Architecture – Stream Computing – Sampling Data in a Stream – Filtering Streams – Counting Distinct Elements in a Stream –
  • 74. Estimating Moments – Counting Ones in a Window – Decaying Window – Real Time Analytics Platform (RTAP) Applications – Case Studies – Real Time Sentiment Analysis – Stock Market Predictions.
UNIT-III HADOOP (9HRS)
History of Hadoop – The Hadoop Distributed File System – Components of Hadoop – Analysing the Data with Hadoop – Scaling Out – Hadoop Streaming – Design of HDFS – Java interfaces to HDFS Basics – Developing a Map Reduce Application – How Map Reduce Works – Anatomy of a Map Reduce Job run – Failures – Job Scheduling – Shuffle and Sort – Task execution – Map Reduce Types and Formats – Map Reduce Features – Hadoop environment.
UNIT-IV FRAMEWORKS (9HRS)
Applications on Big Data Using Pig and Hive – Data processing operators in Pig – Hive services – HiveQL – Querying Data in Hive – Fundamentals of HBase and ZooKeeper – IBM InfoSphere BigInsights and Streams.
UNIT-V PREDICTIVE ANALYTICS (9HRS)
Predictive Analytics – Simple linear regression – Multiple linear regression – Interpretation of regression coefficients. Visualizations – Visual data analysis techniques – Interaction techniques – Systems and applications
TEXT BOOKS
1. Michael Berthold, David J. Hand, “Intelligent Data Analysis”, Springer, 2007.
2. Tom White, “Hadoop: The Definitive Guide”, Third Edition, O’Reilly Media, 2012.
REFERENCES:
1. Chris Eaton, Dirk DeRoos, Tom Deutsch, George Lapis, Paul Zikopoulos, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”, McGraw-Hill Publishing, 2012.
2. Anand Rajaraman and Jeffrey David Ullman, “Mining of Massive Datasets”, CUP, 2012.
3. Bill Franks, “Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics”, John Wiley & Sons, 2012.
4. Glenn J. Myatt, “Making Sense of Data”, John Wiley & Sons, 2007.
5. Pete Warden, “Big Data Glossary”, O’Reilly, 2011.
6. Jiawei Han, Micheline Kamber, “Data Mining Concepts and Techniques”, 2nd Edition, Elsevier, Reprinted 2008.
  • 75. PROGRAM NAME: B.SC-DATA SCIENCE
COURSE CODE / NAME: UDASC4PA / MACHINE LEARNING LAB
YEAR / SEMESTER: II / IV
L T P C: 0 0 4 2
COURSE OBJECTIVES:
• Make use of data sets in implementing the machine learning algorithms.
• Implement the machine learning concepts and algorithms in any suitable language of choice.
COURSE OUTCOMES: Upon completion of this course, the students should be able to:
1. Understand the implementation procedures for the machine learning algorithms.
2. Design Java/Python programs for various learning algorithms.
3. Apply appropriate data sets to the machine learning algorithms.
4. Identify and apply machine learning algorithms to solve real-world problems.
LIST OF EXPERIMENTS
1. Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis based on a given set of training data samples. Read the training data from a .CSV file.
2. For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.
3. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.
4. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.
5. Write a program to implement the naïve Bayesian classifier for a sample training data set
  • 76. stored as a .CSV file. Compute the accuracy of the classifier, considering a few test data sets.
6. Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier model to perform this task. Built-in Java classes/API can be used to write the program. Calculate the accuracy, precision, and recall for your data set.
7. Write a program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set. You can use Java/Python ML library classes/API.
8. Apply the EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using the k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering. You can add Java/Python ML library classes/API in the program.
9. Write a program to implement the k-Nearest Neighbour algorithm to classify the iris data set. Print both correct and wrong predictions. Java/Python ML library classes can be used for this problem.
10. Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select an appropriate data set for your experiment and draw graphs.
TOTAL: 60 HOURS
REFERENCES:
1. Christopher Bishop, “Pattern Recognition and Machine Learning”, Springer, 2007.
2. Stephen Marsland, “Machine Learning – An Algorithmic Perspective”, Chapman and Hall, CRC Press, Second Edition, 2014.
3. Kevin P. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.
4. Ethem Alpaydin, “Introduction to Machine Learning”, MIT Press, Third Edition, 2014.
5. Tom Mitchell, “Machine Learning”, McGraw-Hill, 1997.
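Experiment 1 in the list above asks for the FIND-S algorithm with training data read from a .CSV file. The following is a minimal sketch in Python, not a reference solution. The inline CSV text, its column names, the `yes`/`no` label values, and the helper names `load_rows` and `find_s` are all assumptions made for this illustration; in the lab the data would come from an actual file, and the last column is assumed to hold the class label.

```python
import csv
import io

# Illustrative training data; a real run would read this from a .CSV file.
SAMPLE_CSV = """sky,temp,humidity,wind,label
sunny,warm,normal,strong,yes
sunny,warm,high,strong,yes
rainy,cold,high,strong,no
sunny,warm,high,weak,yes
"""


def load_rows(text):
    """Parse CSV text and return the data rows without the header."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [row for row in reader if row]


def find_s(rows, positive="yes"):
    """Return the most specific hypothesis consistent with the positives.

    '?' marks an attribute that accepts any value. Negative examples are
    ignored, as in the FIND-S algorithm. Returns None if there are no
    positive examples.
    """
    hypothesis = None
    for *attributes, label in rows:
        if label != positive:
            continue
        if hypothesis is None:
            # The first positive example is the most specific hypothesis.
            hypothesis = list(attributes)
        else:
            # Generalize every attribute that disagrees with the example.
            hypothesis = [h if h == a else "?"
                          for h, a in zip(hypothesis, attributes)]
    return hypothesis


print(find_s(load_rows(SAMPLE_CSV)))  # ['sunny', 'warm', '?', '?']
```

To use a file instead of the inline text, the rows could be read with `csv.reader(open(path, newline=""))`, where `path` is whatever file name the lab provides.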
  • 77. PROGRAM NAME: B.SC-DATA SCIENCE
COURSE CODE / NAME: UDASC4PB / BIG DATA ANALYTICS LAB
YEAR / SEMESTER: II / IV
L T P C: 0 0 4 2
COURSE OBJECTIVES:
• Optimize business decisions and create competitive advantage with Big Data analytics.
• Impart the architectural concepts of Hadoop and introduce the map reduce paradigm.
• Introduce Java concepts required for developing map reduce programs.
• Derive business benefit from unstructured data.
• Introduce the programming tools PIG & HIVE in the Hadoop ecosystem.
• Develop Big Data applications for streaming data using Apache Spark.
COURSE OUTCOMES: Upon completion of this course, the students should be able to:
1. Prepare for data summarization, query, and analysis.
2. Apply data modelling techniques to large data sets.
3. Create applications for Big Data analytics.
4. Build a complete business data analytic solution.
LIST OF EXPERIMENTS
1. (i) Set up and install Hadoop in its two operating modes: pseudo-distributed and fully distributed. (ii) Use web-based tools to monitor your Hadoop setup.
2. (i) Implement the following file management tasks in Hadoop: adding files and directories, retrieving files, deleting files. (ii) Benchmark and stress test an Apache Hadoop cluster.
3. Run a basic Word Count Map Reduce program to understand the Map Reduce paradigm. Find the number of occurrences of each word appearing in the input file(s). Perform a MapReduce job for word search count (look for specific keywords in a file