Unix commands are useful for automating and managing Informatica workflows. The pmcmd command can be used to connect to a repository, start, stop, or abort workflows. The pmrep command updates repository information and performs repository functions. Common Unix commands like cp can be used to copy files between paths during workflows. Overall, Unix commands are preferred for automating Informatica jobs through tools like Control-M or AutoSys.
BTEQ can export data directly from Teradata to external files using different export formats. The export formats include record mode, field mode, and indicator mode. Record mode exports data in its native format as a flat file, field mode exports data in a human-readable format similar to SQL with headers and spacing, and indicator mode exports data in record mode along with a bitmap to identify null values and allow proper loading into other database systems.
This document provides an overview of advanced dimensional modelling techniques. It discusses:
1) Dimension structures such as slowly changing dimension type 6, using one or two dimensions, and when to snowflake dimensions.
2) Fact table considerations like primary keys, snapshotting transaction fact tables, aggregate fact tables, and vertical fact tables.
3) Dimension behaviors like rapidly changing dimensions, very large dimensions, banding and stamping dimension rows, and dimensions with multi-valued attributes.
4) Combination techniques involving real-time fact tables, dealing with currency rates and status values. The document covers several sections and many modelling patterns in 44 slides.
Learning R via Python…or the other way around (Sid Xing)
The document discusses similarities and differences between R and Python programming languages. Both support functional and object-oriented programming paradigms. However, Python is designed more for general application development while R focuses on data analysis functions. Their syntax also differs, with Python emphasizing readability through whitespace and R favoring brevity through nesting. Examples are provided to demonstrate equivalent functionality in each language.
This document provides a reference for the command line programs that are used to interact with Informatica PowerCenter. It includes copyright information for the various software components, a table of contents, and describes how to use the command line programs. The document is published by Informatica and provides technical documentation for administrators of the PowerCenter platform.
Chandan Das is a developer/designer with over 7 years of experience in IT implementation projects using technologies like Teradata, Oracle, Hadoop, Pig, Hive, and Sqoop. He has extensive experience in data warehousing, ETL, and database administration. His career objective is to obtain a productive role in an IT organization where he can implement his expertise in developing complex projects efficiently and meeting expectations. He provides details of his professional experience, technical skills, key achievements and completed projects.
2. Document Control
Change Record
1
Date Author Version Change Reference
19-Aug-08 1.0 No Previous Document
Reviewers
Name Position
Distribution
Copy No. Name Location
1
2
3
4
Note To Holders:
If you receive an electronic copy of this document and print it out, please write your
name on the equivalent of the cover page, for document control purposes.
If you receive a hard copy of this document, please write your name on the front
cover, for document control purposes.
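7. Creating the Physical Layer
In this chapter we use the Oracle BI Administration tool to create the physical layer of the repository.
The physical layer can be created by importing metadata or built manually. It defines the data sources that queries run against and the joins between them; the data sources can be of different types.
When the physical layer is built by import, many properties are obtained automatically from the data source. After importing, you can define other properties, including join conditions. Data can come from databases, spreadsheets, and XML documents.
Creating the Repository
Stop Oracle BI Server
Select Start > Programs > Administrative Tools > Services to open the Services console, select the Oracle BI Server service, and click Stop.
Create a New Repository
Select Start > Programs > Oracle Business Intelligence > Administration to open Oracle BI Administration.
Select File > New to open the New Repository dialog box.
In the New Repository dialog box, enter SH.rpd as the File Name and leave the directory at its default.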
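15. Creating the Business Model and Mapping Layer
In this chapter we use the Oracle BI Administration tool to build the Business Model and Mapping layer.
The Business Model and Mapping layer defines the business model and its mapping to the physical layer; here the physical layer is simplified before it is presented to users.
Each logical column can map to one or more data sources in the physical layer.
There are two main kinds of logical tables: fact tables and dimension tables. Logical fact tables hold the measures; logical dimension tables are used to constrain the fact data.
Creating the Business Model
Create the Business Logic and Mapping
In the Business Model and Mapping layer, right-click an empty area and select New Business Model.
In the Business Model dialog box, enter SH as the name.
Click OK to close the Business Model dialog box.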
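16. Creating Logical Tables
Create a Logical Table
In the Business Model and Mapping layer, right-click SH and select New Object > Logical Table.
In the Logical Table dialog box, enter Sales Facts as the Name.
Click OK to close the Logical Table dialog box.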
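25. Capitalize the First Letter of Each Word
Select First letter of each word capital and click Add.
Special Characters
Select Change specified text. In the Find field enter a space followed by Id, in Replace with enter a space followed by ID, and select Case sensitive.
Click Add.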
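The effect of the two rename rules above can be sketched in Python. This is only an illustration, not the Administration tool's own logic: the function name `apply_rename_rules` is hypothetical, and the sketch assumes that the capitalization rule changes only the first letter of each space-separated word and that the rules are applied in the order they were added.

```python
def apply_rename_rules(name: str) -> str:
    """Apply the two rename rules described above, in order (illustrative only)."""
    # Rule 1: capitalize the first letter of each space-separated word.
    # Assumption: the remaining letters of each word are left unchanged.
    capitalized = " ".join(word[:1].upper() + word[1:] for word in name.split(" "))
    # Rule 2: case-sensitive replacement of " Id" (space + Id) with " ID".
    return capitalized.replace(" Id", " ID")


if __name__ == "__main__":
    for column in ["cust id", "prod category", "country total"]:
        print(column, "->", apply_rename_rules(column))
```

Because this sketch uses a plain substring replacement, it would also rewrite a word that merely begins with "Id" after a space, so the order and scope of the rules are worth checking against real column names.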
76. Double-click the Connection Pool, or click Select, to close the Select Connection Pool dialog box and return to the Session Variable Initialization Block Data Source dialog box.
In Default Initialization String, enter:
select ':USER', case when upper(':USER')='KURT' then 'Germany' when
upper(':USER')='KEIKO' then 'Japan' when upper(':USER')='CHARLES' then
'United Kingdom' when upper(':USER')='KAREN' then 'United States of America'
end, 'CountryManagers', 2 from Dual
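To make the logic of the initialization string easier to follow, here is a minimal Python sketch of the row it returns for a given user. The names `COUNTRY_BY_MANAGER` and `init_block_row` are hypothetical and exist only for this illustration; the sketch mirrors the CASE expression, including the fact that a CASE without an ELSE branch yields NULL (here `None`) for any other user.

```python
from typing import Optional, Tuple

# User-to-country mapping taken from the CASE expression in the initialization string.
COUNTRY_BY_MANAGER = {
    "KURT": "Germany",
    "KEIKO": "Japan",
    "CHARLES": "United Kingdom",
    "KAREN": "United States of America",
}


def init_block_row(user: str) -> Tuple[str, Optional[str], str, int]:
    """Return the four values the query selects for the given user name."""
    # upper(':USER') makes the comparison case-insensitive.
    country = COUNTRY_BY_MANAGER.get(user.upper())  # None models SQL NULL
    return (user, country, "CountryManagers", 2)


if __name__ == "__main__":
    print(init_block_row("Kurt"))
    print(init_block_row("someone_else"))
```

The first value is the user name exactly as supplied, the second is the looked-up country, and the last two are the constants from the query.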
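109. Close the Physical Diagram window.
Save the repository.
Mapping Logical Columns and Setting Aggregation Content
Map Existing Logical Columns
Drag columns from the physical layer onto the corresponding columns in the Business Model and Mapping layer:
Business Model and Mapping layer        Physical layer
Logical table   Logical column          Physical table             Physical column
Sales Facts     Amount Sold             AGG_ST_CAT_DAY_SALES_F     AMOUNT_SOLD
CUSTOMERS       Cust State Province     AGG_CUSTOMER_STATE_D       CUST_STATE_PROVINCE
                Country                                            COUNTRY_NAME
                Country Subregion                                  COUNTRY_SUBREGION
                Country Region                                     COUNTRY_REGION
                Country Total                                      COUNTRY_TOTAL
Products        Prod Category           AGG_PRODUCTS_CATEGORY_D    PROD_CATEGORY
                Prod Total                                         PROD_TOTAL
Verify the Logical Table Source
In the Business Model and Mapping layer, expand Sales Facts > Sources and confirm that the logical table source AGG_ST_CAT_DAY_SALES_F has been added.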
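112. Testing the Results
Create a Query
Return to Answers, click the Reload Server Metadata link to reload the presentation layer, and create a new query as shown in the figure.
Add a Filter
On Calendar Year, open the Create/Edit Filter dialog box and set the condition to Calendar Year is equal to / is 2001.
Click OK to add the filter to the query.
View the Results
Click the Results tab to view the query results.
Check the Query Log
Check whether the aggregate tables AGG_CUSTOMER_STATE_D and AGG_ST_CAT_DAY_SALES_F were used.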
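The log check above can also be done with a small script instead of reading the log by eye. The following Python sketch assumes the relevant query log text has already been loaded into a string; the function name `aggregates_in_log` and the sample log excerpt are hypothetical, and the real log layout may differ.

```python
from typing import Dict, Iterable

# Aggregate tables named in the step above.
AGGREGATE_TABLES = ("AGG_CUSTOMER_STATE_D", "AGG_ST_CAT_DAY_SALES_F")


def aggregates_in_log(log_text: str,
                      tables: Iterable[str] = AGGREGATE_TABLES) -> Dict[str, bool]:
    """Report, for each table name, whether it appears anywhere in the log text."""
    haystack = log_text.upper()  # compare case-insensitively
    return {table: table.upper() in haystack for table in tables}


if __name__ == "__main__":
    # Hypothetical excerpt standing in for the physical SQL written to the query log.
    sample = "select ... from AGG_ST_CAT_DAY_SALES_F T1, AGG_CUSTOMER_STATE_D T2 ..."
    print(aggregates_in_log(sample))
```

If either value is False for a query that should be answerable from the aggregates, the logical table source mappings and aggregation content settings are the first things to recheck.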