Internship Report
Ghulam Ishaq Khan Institute of Engineering
Sciences and Technology
Name: Salman Khan
Registration Number: 2012338
Organization: Teradata
Duration: 1 Month (Four Weeks)
Submission Date: 30th November 2015
Faculty of Computer Science and Engineering (Fall 2016)
Acknowledgement:
First, I would like to thank Sir Hassan Waqar and Awais Ijaz, Professional
Services Consultant, for giving me the opportunity to do an internship
within the organization. For me it was a unique experience to be at
Teradata Pakistan and to study data warehousing. It also helped me
regain my interest in databases and to make new plans for my future
career.
I would also like to thank all the people who worked in the Teradata office
in Lahore. With their patience and openness they created an enjoyable
working environment.
Furthermore, I want to thank all the students with whom I did the fieldwork.
We experienced great things together, and they showed me their final
year projects.
Finally, I would like to thank all the administration staff of Ghulam Ishaq
Khan Institute of Engineering Sciences and Technology and the faculty
members of the Computer Science department, especially Sir Fawad.
EXECUTIVE SUMMARY:
This report covers my internship program. It is a brief study of the
operations, functions, and tasks I performed during the internship.
Teradata plays a leading role in providing powerful enterprise big data
analytics and services, including Data Warehousing, Data-Driven
Marketing, BI, and CRM.
In preparing this report I have tried my best to provide all possible
information about the operations, functions, tasks, and corporate
information of Teradata Pakistan in a brief and comprehensive form.
Letter of Undertaking:
Internship Certificate:
About Teradata:
Introduction:
Teradata Corporation is a publicly held international computer company
that sells analytic data platforms, marketing applications and related
services. Its analytics products are meant to consolidate data from different
sources and make the data available for analysis. Teradata marketing
applications are meant to support marketing teams that use data analytics
to inform and develop programs.
Teradata is an enterprise software company that develops and sells
a relational database management system (RDBMS) with the same name.
Teradata is publicly traded on the New York Stock Exchange (NYSE) under
the stock symbol TDC.
Teradata Products:
The Teradata product is referred to as a "data warehouse system" and
stores and manages data. The data warehouses use a "shared nothing"
architecture, which means that each server node has its own memory and
processing power. Adding more servers and nodes increases the amount
of data that can be stored. The database software sits on top of the servers
and spreads the workload among them. Teradata sells applications and
software to process different types of data. In 2010, Teradata added text
analytics to track unstructured data, such as word processor documents,
and semi-structured data, such as spreadsheets.
Teradata's product can be used for business analysis. Data warehouses
can track company data, such as sales, customer preferences, product
placement, etc.
Teradata Database:
Teradata is a relational database management system (RDBMS) that:
• is an open system, running on a UNIX MP-RAS or Windows server
platform;
• is capable of supporting many concurrent users from various client
platforms;
• is compatible with industry standards (ANSI compliant);
• is completely built on a parallel architecture.
Why Teradata?
There are plenty of reasons why customers choose Teradata:
 Teradata supports more warehouse data than all its competitors
combined.
 Teradata Database can scale from 100 gigabytes to over 100
petabytes of data on a single system without losing performance.
This is called scalability.
 Provides a parallel-aware Optimizer that makes query tuning
unnecessary to get a query to run.
 Automatic and even data distribution eliminates complex indexing
schemes or time-consuming reorganizations.
 Teradata Database can handle the most concurrent users, who are
often running multiple, complex queries.
 Designed and built with parallelism.
 Supports ad-hoc queries using SQL.
 Single point of control for the DBA (Teradata Manager).
 Unconditional parallelism (parallel architecture).
 Teradata provides the lowest total cost of ownership (TCO).
 High availability of data because there is no single point of failure -
fault tolerance is built-in to the system.
Teradata Database can be used for:
 Enterprise data warehousing
 Active data warehousing
 Customer relationship management
 Internet and e-business
 Data marts
OBJECTIVE OR PURPOSE OF INTERNSHIP:
Two cogent purposes of the study are the following.
1: General Purpose / Objective
 To learn how people work in an organization.
 To gain work experience at Teradata, which will help me in the job
process.
 To know what skills employers want from an employee.
 To see the application of our professional studies.
2: Specific Purpose / Objective
The specific purposes of the study include:
 To learn how the employees in a large organization handle a problem.
 To get a certificate from the Teradata organization.
 To use their database software and to test its queries.
 To objectively observe the operations of Teradata in general.
Interview Questions:
 Tell me about yourself.
 What can you do for us that other candidates can't?
 What is parallelism in Teradata?
 Can we load a multiset table using MLOAD?
 What is the use of BI in Teradata?
 What is a snowflake schema in a database?
 What is a star schema?
 Why is normalization necessary?
 Why is de-normalization necessary?
 What are views in a database?
Description of the internship:
This report is a short description of my four-week internship, carried out as
a compulsory component of the BS in Computer Science. The internship
was carried out within the organization Teradata in summer 2015. As I am
interested in databases, the work concentrated on data warehousing.
At the beginning of the internship I formulated several learning goals, which
I wanted to achieve:
 to understand the functioning and working conditions of the
organization;
 to see what it is like to work in a professional environment;
 to see if this kind of work is a possibility for my future career;
 to use my gained skills and knowledge;
 to see what skills and knowledge I still need to work in a professional
environment;
 to learn about the organization of a research project (planning,
preparation, permissions, etc.);
 to learn about research methodologies (field methods/methods to
analyze data)
 to get fieldwork experience/collect data in an environment unknown
to me;
 to enhance my communication skills;
 to build a network.
This internship report describes the activities that contributed to achieving
a number of my stated goals.
1st Week:
During the first week I revised basic database concepts and practiced
writing complex queries.
This task was given to me as homework, while in the office I attended a
training session on using the software Tableau.
Tableau Software:
Tableau Software is an American computer software company
headquartered in Seattle, Washington. It produces a family of
interactive data visualization products focused on business intelligence.
Products:
Tableau offers five main products: Tableau Desktop, Tableau Server,
Tableau Online, Tableau Reader and Tableau Public. Tableau Public and
Tableau Reader are free to use, while both Tableau Server and Tableau
Desktop come with a 14-day fully functional free trial period, after which the
user must pay for the software. Tableau Desktop comes in both a
Professional and a lower cost Personal edition. Tableau Online is available
with an annual subscription for a single user, and scales to support
thousands of users.
2nd Week:
The picture below shows my assignment no. 1.
The list below was sent to me by Miss Maria; it contains the names of
different companies.
Conclusion:
This was my 2nd week task, which I did with full dedication and hard work.
3rd Week:
In the 3rd and 4th weeks I was given the task of creating a data warehouse.
In the 3rd week I created a schema diagram for the normalized data and
then created the tables. After the creation of the database it was time to
populate the database with up to 500,000 rows per table in the normalized
database.
Here is the schema diagram for the normalized data.
Fact Tables:
A fact table is the central table in a star schema of a data warehouse. A
fact table stores quantitative information for analysis and is often
denormalized.
Dimension Tables:
In contrast to fact tables, dimension tables contain descriptive attributes (or
fields) that are typically textual fields (or discrete numbers that behave like
text). These attributes are designed to serve two critical purposes: query
constraining and/or filtering, and query result set labeling.
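As a concrete sketch of these two roles, the following uses SQLite and hypothetical table names (not the actual assignment schema): two dimension tables hold descriptive attributes, and a central fact table holds the measures, keyed to the dimensions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables: descriptive attributes used for constraining and labeling.
cur.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT, city TEXT)")
cur.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT)")

# Fact table: quantitative measures, with foreign keys into the dimensions.
cur.execute("""CREATE TABLE fact_sales (
    sale_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity INTEGER,
    amount REAL)""")

cur.execute("INSERT INTO dim_customer VALUES (1, 'Ali', 'Lahore')")
cur.execute("INSERT INTO dim_product VALUES (1, 'Laptop', 'Electronics')")
cur.execute("INSERT INTO fact_sales VALUES (1, 1, 1, 2, 1500.0)")

# A typical star-schema query: label by a dimension, aggregate the fact.
cur.execute("""SELECT c.city, SUM(f.amount)
               FROM fact_sales f
               JOIN dim_customer c ON f.customer_id = c.customer_id
               GROUP BY c.city""")
print(cur.fetchall())  # [('Lahore', 1500.0)]
```

The fact table carries only keys and numbers; all human-readable labels come from the dimensions at query time.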
Code:
The following code generates 500,000 rows, stores them in a text file, and
the file is then loaded into the database tables.
#include <iostream>
#include <cstdlib>
#include <ctime>
#include <fstream>
using namespace std;

static const char alphanum[] =
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz";
const int stringLength = sizeof(alphanum) - 1;

// Return one random alphanumeric character.
char genRandom()
{
    return alphanum[rand() % stringLength];
}

int main()
{
    srand(time(NULL));
    ofstream myfile;
    myfile.open("Name.txt");
    for (int i = 0; i < 500000; i++) {
        int x = rand() % 8 + 4;            // random name length between 4 and 11
        myfile << i << " | ";
        for (int j = 0; j < x; j++) {      // generate a random upper-case name
            int num = rand() % 26;
            char upper = static_cast<char>('A' + num);
            myfile << upper;
        }
        myfile << " | ";
        for (int z = 0; z < 21; z++)       // generate alphanumeric data
            myfile << genRandom();
        myfile << "\n";
    }
    myfile.close();
    return 0;
}
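Each generated line has the form id | NAME | alphanumeric, so a loader only has to split on the separator. A minimal parsing sketch (the sample values are hypothetical; the actual loading used the database's own load facility):

```python
def parse_line(line):
    # Each generated line has the form: "<id> | <NAME> | <alphanumeric>"
    ident, name, code = (field.strip() for field in line.split("|"))
    return int(ident), name, code

# A sample line in the same shape as the generator's output (made-up values).
sample = "0 | QWERTY | ab12CD34ef56GH78ij90k"
print(parse_line(sample))  # (0, 'QWERTY', 'ab12CD34ef56GH78ij90k')
```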
4th Week:
In the last week the task was to denormalize the above database, make a
warehouse, and compare query times between the normalized and
denormalized data.
Schema diagram for denormalized data.
Comparison of Normalized and Denormalized queries:
Normalized Query:
Denormalized Query:
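The two query screenshots do not reproduce in this copy of the report. The shape of the comparison can still be sketched with SQLite and hypothetical tables (not the actual assignment schema): the normalized design answers the question with a join at query time, while the denormalized design pre-computes that join into one wide table.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized design: two tables, joined at query time.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT)")
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")

# Denormalized design: the join is pre-computed into one wide table.
cur.execute("CREATE TABLE orders_denorm (id INTEGER PRIMARY KEY, city TEXT, amount REAL)")

for i in range(10000):
    cur.execute("INSERT INTO customers VALUES (?, ?)", (i, f"city{i % 50}"))
    cur.execute("INSERT INTO orders VALUES (?, ?, ?)", (i, i, float(i)))
    cur.execute("INSERT INTO orders_denorm VALUES (?, ?, ?)", (i, f"city{i % 50}", float(i)))

start = time.perf_counter()
cur.execute("""SELECT c.city, SUM(o.amount) FROM orders o
               JOIN customers c ON o.customer_id = c.id GROUP BY c.city""")
normalized = cur.fetchall()
t_norm = time.perf_counter() - start

start = time.perf_counter()
cur.execute("SELECT city, SUM(amount) FROM orders_denorm GROUP BY city")
denorm = cur.fetchall()
t_denorm = time.perf_counter() - start

# Both designs give the same answer; the denormalized query avoids the join.
print(sorted(normalized) == sorted(denorm))
```

On a toy in-memory dataset the timing gap is small, but the structural point matches the assignment: the denormalized query reads one table while the normalized one must join.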