The document is an internship report submitted by Salman Khan detailing his one-month internship at Teradata. It includes an acknowledgements section, an executive summary, and sections describing Teradata's products and database system. Salman's internship objectives were to learn about working in an organization and to gain experience with databases. Over four weeks, he completed tasks involving Tableau training, creating normalized and denormalized databases, and comparing query performance between the two structures.
Report for internship
Internship Report
Ghulam Ishaq Khan Institute of Engineering Sciences and Technology
Name: Salman Khan
Registration Number: 2012338
Organization: Teradata
Duration: 1 Month (Four Weeks)
Submission Date: 30th November 2015
Faculty of Computer Science and Engineering (Fall 2016)
Acknowledgement:
First, I would like to thank Sir Hassan Waqar and Awais Ijaz, Professional Services Consultant, for giving me the opportunity to do an internship within the organization. For me it was a unique experience to be at Teradata Pakistan and to study the interesting field of data warehousing. It also helped me regain my interest in databases and to make new plans for my future career.
I would also like to thank all the people who worked in the Teradata office in Lahore. With their patience and openness they created an enjoyable working environment.
Furthermore, I want to thank all the students with whom I did the fieldwork. We experienced great things together and they showed me their final-year projects.
Finally, I would like to thank the administration staff of Ghulam Ishaq Khan Institute of Engineering Sciences and Technology and the faculty members of the Computer Science department, especially Sir Fawad.
EXECUTIVE SUMMARY:
This report is meant for my internship program. It presents a brief study of the operations, functions, and tasks I performed during my internship.
Teradata plays a leading role in providing powerful enterprise big data analytics and services, including Data Warehousing, Data-Driven Marketing, BI and CRM.
In preparing this report I have tried my best to provide all possible information about the operations, functions, tasks and corporate information of Teradata Pakistan in a brief and comprehensive form.
About Teradata:
Introduction:
Teradata Corporation is a publicly held international computer company
that sells analytic data platforms, marketing applications and related
services. Its analytics products are meant to consolidate data from different
sources and make the data available for analysis. Teradata marketing
applications are meant to support marketing teams that use data analytics
to inform and develop programs.
Teradata is an enterprise software company that develops and sells
a relational database management system (RDBMS) with the same name.
Teradata is publicly traded on the New York Stock Exchange (NYSE) under
the stock symbol TDC.
Teradata Products:
The Teradata product is referred to as a "data warehouse system" and
stores and manages data. The data warehouses use a "shared nothing"
architecture, which means that each server node has its own memory and
processing power. Adding more servers and nodes increases the amount
of data that can be stored. The database software sits on top of the servers
and spreads the workload among them. Teradata sells applications and
software to process different types of data. In 2010, Teradata added text
analytics to track unstructured data, such as word processor documents,
and semi-structured data, such as spreadsheets.
Teradata's product can be used for business analysis. Data warehouses
can track company data, such as sales, customer preferences, product
placement, etc.
Teradata Database:
Teradata is a relational database management system (RDBMS) that:
• is an open system, running on a UNIX MP-RAS or Windows server platform;
• is capable of supporting many concurrent users from various client platforms;
• is compatible with industry standards (ANSI compliant);
• is completely built on a parallel architecture.
Why Teradata?
There are plenty of reasons why customers choose Teradata:
• Teradata supports larger data warehouse volumes than all competitors combined.
• Teradata Database can scale from 100 gigabytes to over 100 petabytes of data on a single system without losing any performance. This is called scalability.
• It provides a parallel-aware Optimizer that makes query tuning unnecessary to get a query to run.
• Automatic and even data distribution eliminates complex indexing schemes and time-consuming reorganizations.
• Teradata Database can handle the most concurrent users, who are often running multiple, complex queries.
• It is designed and built with parallelism.
• It supports ad-hoc queries using SQL.
• It offers a single point of control for the DBA (Teradata Manager).
• Unconditional parallelism (parallel architecture).
• Teradata provides the lowest total cost of ownership (TCO).
• High availability of data, because there is no single point of failure; fault tolerance is built into the system.
Teradata Database can be used for:
• Enterprise data warehousing
• Active data warehousing
• Customer relationship management
• Internet and e-business
• Data marts
OBJECTIVE OR PURPOSE OF INTERNSHIP:
Two cogent purposes of the study are the following.
1: General Purpose / Objective
• To learn how people work in an organization.
• To gain work experience at Teradata, which will help me in the job process.
• To know what skills they want from an employee.
• To see the application of our professional studies.
2: Specific Purpose / Objective
The specific purposes of the study include:
• To know how the employees in a large organization handle a problem.
• To get a certificate from the Teradata organization.
• To use their database software and to test its queries.
• To objectively observe the operations of Teradata in general.
Interview Questions:
• Tell me about yourself.
• What can you do for us that other candidates can't?
• What is parallelism in Teradata?
• Can we load a multiset table using MLOAD?
• What is the use of BI in Teradata?
• What is a snowflake schema in a database?
• What is a star schema?
• Why is normalization necessary?
• Why is de-normalization necessary?
• What are views in a database?
Description of the internship:
This report is a short description of my four-week internship, carried out as a compulsory component of the BS in Computer Science. The internship was carried out within the organization Teradata in summer 2015. As I am interested in databases, the work was concentrated on data warehousing.
At the beginning of the internship I formulated several learning goals, which I wanted to achieve:
• to understand the functioning and working conditions of a large organization;
• to see what it is like to work in a professional environment;
• to see if this kind of work is a possibility for my future career;
• to use the skills and knowledge I have gained;
• to see what skills and knowledge I still need to work in a professional environment;
• to learn about the organization of a research project (planning, preparation, permissions, etc.);
• to learn about research methodologies (field methods / methods to analyze data);
• to get fieldwork experience and collect data in an environment unknown to me;
• to enhance my communication skills;
• to build a network.
This internship report describes the activities that contributed to achieving a number of these goals.
1st Week:
During the first week I revised basic database concepts and practiced writing complex queries. This task was given to me as homework, while in the office I attended a training session on using the software named Tableau.
Tableau Software:
Tableau Software is an American computer software company
headquartered in Seattle, Washington. It produces a family of
interactive data visualization products focused on business intelligence.
Products:
Tableau offers five main products: Tableau Desktop, Tableau Server,
Tableau Online, Tableau Reader and Tableau Public. Tableau Public and
Tableau Reader are free to use, while both Tableau Server and Tableau
Desktop come with a 14-day fully functional free trial period, after which the
user must pay for the software. Tableau Desktop comes in both a
Professional and a lower cost Personal edition. Tableau Online is available
with an annual subscription for a single user, and scales to support
thousands of users.
2nd Week:
The picture below shows my assignment no. 1.
The list below was sent to me by Miss Maria; it contains the names of different companies.
Conclusion:
This was my 2nd-week task, which I completed with full dedication and hard work.
3rd Week:
In the 3rd and 4th weeks I was given the task of creating a data warehouse. In the 3rd week I created a schema diagram for the normalized data and then created the tables. After the creation of the database it was time to populate it with up to 500,000 rows per table.
Here is the schema diagram for the normalized data.
Fact Tables:
A fact table is the central table in a star schema of a data warehouse. A
fact table stores quantitative information for analysis and is often
denormalized.
Dimension Tables:
Contrary to fact tables, dimension tables contain descriptive attributes (or
fields) that are typically textual fields (or discrete numbers that behave like
text). These attributes are designed to serve two critical purposes: query
constraining and/or filtering, and query result set labeling.
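The fact/dimension split described above can be sketched with a small, self-contained example. SQLite (via Python) stands in for the Teradata system used in the internship, and all table and column names here (sales_fact, customer_dim, product_dim) are hypothetical illustrations, not the schema from the assignment:

```python
# Minimal star-schema sketch: two dimension tables plus one fact table.
# All names are hypothetical examples; SQLite stands in for Teradata.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables hold descriptive attributes used for constraining,
# filtering, and labeling query results.
cur.execute("""CREATE TABLE customer_dim (
    customer_id INTEGER PRIMARY KEY,
    name TEXT,
    city TEXT)""")
cur.execute("""CREATE TABLE product_dim (
    product_id INTEGER PRIMARY KEY,
    name TEXT,
    category TEXT)""")

# The fact table holds the quantitative measures plus foreign keys
# pointing at the dimensions.
cur.execute("""CREATE TABLE sales_fact (
    sale_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer_dim(customer_id),
    product_id INTEGER REFERENCES product_dim(product_id),
    quantity INTEGER,
    amount REAL)""")

cur.execute("INSERT INTO customer_dim VALUES (1, 'Salman', 'Lahore')")
cur.execute("INSERT INTO product_dim VALUES (1, 'Widget', 'Hardware')")
cur.execute("INSERT INTO sales_fact VALUES (1, 1, 1, 3, 29.97)")

# A typical star-schema query: constrain on a dimension attribute,
# aggregate a measure from the fact table.
row = cur.execute("""SELECT c.city, SUM(f.amount)
                     FROM sales_fact f
                     JOIN customer_dim c ON f.customer_id = c.customer_id
                     GROUP BY c.city""").fetchone()
print(row)  # ('Lahore', 29.97)
```

The query shape is the point: dimensions supply the filtering and labeling columns, the fact table supplies the numbers being summed.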
Code:
The following code generates 500,000 rows of data and stores them in a text file, which is then loaded into the database tables.
#include <iostream>
#include <cstdlib>
#include <ctime>
#include <fstream>
using namespace std;

// Character set for the random alphanumeric column.
static const char alphanum[] =
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz";
int stringLength = sizeof(alphanum) - 1;

// Return one random character from the alphanumeric set.
char genRandom()
{
    return alphanum[rand() % stringLength];
}

int main()
{
    srand(time(NULL));
    ofstream myfile;
    myfile.open("Name.txt");
    for (int i = 0; i < 500000; i++) {
        myfile << i << " | ";
        int x = rand() % 8 + 4;              // name length between 4 and 11
        for (int j = 0; j < x; j++) {        // generate a random upper-case name
            int num = rand() % 26;
            char upper = static_cast<char>('A' + num);
            myfile << upper;
        }
        myfile << " | ";
        for (int z = 0; z < 21; z++)         // generate a 21-character alphanumeric field
            myfile << genRandom();
        myfile << "\n";                      // newline after each record
    }
    myfile.close();
    return 0;
}
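A sketch of the loading step that follows the generator. The report loaded the file into Teradata; SQLite (via Python) stands in here, and the table name `person` is a hypothetical example. The parsing matches the generator's "id | NAME | code" record format:

```python
# Load pipe-delimited records (as produced by the generator) into a table.
# SQLite stands in for Teradata; the table name "person" is hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, code TEXT)")

# Two sample lines in the generator's "id | NAME | code" format;
# in practice these would be read from Name.txt.
sample = [
    "0 | QWERTY | A1b2C3d4E5f6G7h8I9j0K",
    "1 | ASDF | L1m2N3o4P5q6R7s8T9u0V",
]
rows = [(int(i), name, code)
        for i, name, code in (line.split(" | ") for line in sample)]
cur.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)
conn.commit()

count = cur.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(count)  # 2
```

On Teradata itself this step would typically use a bulk-load utility rather than row-by-row inserts, but the parse-and-insert shape is the same.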
4th Week:
In the last week the task was to denormalize the above database, build a warehouse, and check the time difference between queries on the normalized and the denormalized data.
Schema diagram for the denormalized data.
Comparison of Normalized and Denormalized Queries:
Normalized Query:
Denormalized Query:
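The week-4 experiment can be illustrated with a small, self-contained sketch. SQLite (via Python) stands in for the Teradata system used in the internship, and the table names, columns, and row counts are illustrative, not the ones from the assignment. The normalized design must join at query time, while the denormalized copy answers the same question from a single table:

```python
# Time the same aggregate question against a normalized (joined) design
# and a denormalized (flat) copy. SQLite stands in for Teradata; all
# table/column names and row counts here are illustrative.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, city TEXT)")
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, "
            "customer_id INTEGER, amount REAL)")
# Denormalized copy: the customer's city is stored on every order row.
cur.execute("CREATE TABLE orders_flat (id INTEGER PRIMARY KEY, "
            "city TEXT, amount REAL)")

# Synthetic rows (far fewer than the 500,000 per table used in the report).
n = 50000
cur.executemany("INSERT INTO customer VALUES (?, ?)",
                ((i, f"city{i % 100}") for i in range(n)))
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                ((i, i, float(i % 10)) for i in range(n)))
cur.executemany("INSERT INTO orders_flat VALUES (?, ?, ?)",
                ((i, f"city{i % 100}", float(i % 10)) for i in range(n)))
conn.commit()

def timed(sql):
    """Run a query and return (result rows, elapsed seconds)."""
    start = time.perf_counter()
    result = cur.execute(sql).fetchall()
    return result, time.perf_counter() - start

# Normalized: the city lives in another table, so a join is required.
norm, t_norm = timed("""SELECT c.city, SUM(o.amount) FROM orders o
                        JOIN customer c ON o.customer_id = c.id
                        GROUP BY c.city""")

# Denormalized: the same answer comes from a single table scan.
flat, t_flat = timed("SELECT city, SUM(amount) FROM orders_flat GROUP BY city")

assert sorted(norm) == sorted(flat)  # both designs give identical answers
print(f"normalized: {t_norm:.4f}s  denormalized: {t_flat:.4f}s")
```

Which design wins, and by how much, depends on the engine, indexing, and data volume; the point of the exercise is that the answers are identical while the access paths differ.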