Ab initio training Ab-initio Architecture
1. Ab Initio Overview
[Diagram: users work with four components]
GDE: Create all your graphs. A graph, when deployed, generates a .ksh script.
Co>Operating System: Run all your graphs.
EME: Stores all variables in a repository; it is also used for control and collects all metadata about graphs developed in the GDE.
DTM: Used to schedule graphs developed in the GDE. It can also maintain dependencies between graphs.
2. Continue To Next Slide
More details:
blog: http://sandyclassic.wordpress.com
linkedin: https://www.linkedin.com/in/sandepsharma
slideshare: http://www.slideshare.net/SandeepSharma65
facebook: https://facebook.com/sandeepclassic
google+: http://google.com/+SandeepSharmaa
Twitter: https://twitter.com/sandeeclassic
http://thedatawarehouseclassics.wordpress.com
http://businessintelligencetechnologytrend.wordpress.com
3. About Ab Initio
Ab Initio is a general purpose data processing platform for enterprise class, mission critical applications such as data warehousing, clickstream processing, data movement, data transformation and analytics.
Supports integration of arbitrary data sources and programs, and provides complete metadata management across the enterprise.
Proven best of breed ETL solution.
Applications of Ab Initio:
– ETL for data warehouses, data marts and operational data sources.
– Parallel data cleansing and validation.
– Parallel data transformation and filtering.
– High performance analytics.
– Real time, parallel data capture.
4. Ab Initio Architecture
[Diagram: layered architecture, top to bottom]
– Applications
– Application Development Environments: Graphical, C++, Shell
– Component Library, User-defined Components, Third Party Components
– Ab Initio Co>Operating System
– Native Operating System: UNIX, Windows NT
The Ab Initio Metadata Repository sits alongside these layers.
5. Co>Operating System
The Co>Operating System is core software that unites a network of computing resources (CPUs, storage disks, programs, datasets) into a production-quality data processing system with scalable performance and mainframe reliability.
The Co>Operating System is layered on top of the native operating systems of a collection of computers. It provides a distributed model for process execution, file management, process monitoring, check-pointing, and debugging.
6. Co>Operating System
The Graphical Development Environment (GDE) provides a graphical user interface into the services of the Co>Operating System.
Unlimited scalability: Data parallelism results in speedups proportional to the hardware resources provided; double the number of CPUs and execution time is halved.
Flexibility: Provides a powerful and efficient data transformation engine and an open component model for extending and customizing Ab Initio’s functionality.
Portability: Runs heterogeneously across a huge variety of operating system and hardware platforms.
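The scalability claim on this slide can be checked with a little arithmetic. The Python sketch below is only an illustration, not Ab Initio code: `ideal_runtime` and `partition_round_robin` are hypothetical helper names, and the model covers only the ideal case the slide states, where runtime is inversely proportional to the number of CPUs.

```python
def ideal_runtime(serial_seconds: float, cpus: int) -> float:
    """Runtime under ideal linear speedup: time scales as 1 / cpus."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return serial_seconds / cpus


def partition_round_robin(records, ways):
    """Deal records into `ways` partitions so each worker gets an equal share."""
    partitions = [[] for _ in range(ways)]
    for index, record in enumerate(records):
        partitions[index % ways].append(record)
    return partitions


if __name__ == "__main__":
    # Doubling the CPUs halves the ideal runtime.
    for cpus in (1, 2, 4, 8):
        print(cpus, "CPUs ->", ideal_runtime(120.0, cpus), "seconds")
    # Each partition can then be processed by its own worker.
    print(partition_round_robin(list(range(6)), 2))
```

Splitting the data evenly is what makes the proportional speedup possible: each worker handles 1/N of the records.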
7. Graphical Development Environment (GDE)
The GDE lets you create applications by dragging and dropping components onto a canvas, configuring them with familiar, intuitive point and click operations, and connecting them into executable flowcharts.
These diagrams are architectural documents that developers and managers alike can understand and use, but they are not mere pictures: the Co>Operating System executes these flowcharts directly. This means that there is a seamless and solid connection between the abstract picture of the application and the concrete reality of its execution.
8. Ab Initio S/w Versions & File Extensions
Software Versions
– Co>Operating System Version =>
– GDE Version =>
File Extensions
– .mp: Stored Ab Initio graph or graph component
– .mpc: Program or custom component
– .mdc: Dataset or custom dataset component
– .dml: Data Manipulation Language file or record type definition
– .xfr: Transform function file
– .dat: Data file (either serial file or multifile)
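When browsing a project directory, the extension list above works as a simple lookup table. The Python sketch below is a hypothetical helper, not part of any Ab Initio tooling: the descriptions come from the slide, while the names `AB_INITIO_EXTENSIONS` and `describe_file` and the example paths are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Extension -> description, as listed on the slide.
AB_INITIO_EXTENSIONS = {
    ".mp": "Stored Ab Initio graph or graph component",
    ".mpc": "Program or custom component",
    ".mdc": "Dataset or custom dataset component",
    ".dml": "Data Manipulation Language file or record type definition",
    ".xfr": "Transform function file",
    ".dat": "Data file (either serial file or multifile)",
}


def describe_file(path: str) -> str:
    """Return the slide's description for a file, based on its extension."""
    suffix = PurePosixPath(path).suffix.lower()
    return AB_INITIO_EXTENSIONS.get(suffix, "Unknown file type")


if __name__ == "__main__":
    # The paths below are made up for the example.
    for name in ("mp/load_customers.mp", "dml/customer.dml", "notes.txt"):
        print(name, "->", describe_file(name))
```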
10. Host Profile Setting
1. Choose Settings from the Run menu.
2. Check the Use Host Profile Setting checkbox.
3. Click the Edit button to open the Host Profile dialog.
4. If running Ab Initio on your local NT system, check the Local Execution (NT) checkbox and go to step 6.
5. If running Ab Initio on a remote UNIX system, fill in the path to the Host and the Host Login and Password.
6. Type the full path of the Host directory.
7. Select the Shell Type from the pull-down menu.
8. Test Login and, if necessary, make changes.
14. Create Graph - DML
Propagate from Neighbors: Copy record formats from the connected flow.
Same As: Copy record formats from a specific component’s port.
Path: Store record formats in a Local file, Host File, or in the Ab Initio repository.
Embedded: Type the record format directly in a string.
[Screenshot: Specify the .dml file]
15. Creating Graph - DML
DML is Ab Initio’s Data Manipulation Language.
DML describes data in terms of:
– Record Formats that list the fields and format of input, output, and intermediate records.
– Expressions that define simple computations, for example, selection.
– Transform Functions that control reformatting, aggregation, and other data transformations.
– Keys that specify groupings, ordering, and partitioning relationships between records.
Editing the .dml file through Record Format Editor – Grid View
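To make the idea of a record format concrete, here is a rough Python analogue (this is not DML syntax). It describes a record as an ordered list of named, typed fields and uses that description to parse one delimited line. The field names, the comma delimiter, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Field:
    name: str
    convert: Callable[[str], object]  # turns the raw text into a typed value


# Hypothetical record format: three comma-delimited fields.
CUSTOMER_FORMAT = [
    Field("id", int),
    Field("name", str),
    Field("balance", float),
]


def parse_record(line, record_format, delimiter=","):
    """Split one delimited line and convert each piece using the record format."""
    parts = line.rstrip("\n").split(delimiter)
    if len(parts) != len(record_format):
        raise ValueError(f"expected {len(record_format)} fields, got {len(parts)}")
    return {f.name: f.convert(p) for f, p in zip(record_format, parts)}


def has_positive_balance(rec):
    """An 'expression' in the slide's sense: a simple selection test."""
    return rec["balance"] > 0


record = parse_record("42,Alice,100.5", CUSTOMER_FORMAT)
print(record, has_positive_balance(record))
```

Keeping the format separate from the parsing code mirrors the slide's point: the same record description can be reused for input, output, and intermediate records.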
16. Creating Graph - Transform
A transform function is either a DML file or a DML string that describes how you manipulate your data.
Ab Initio transform functions mainly consist of a series of assignment statements. Each statement is called a business rule.
When Ab Initio evaluates a transform function, it performs the following tasks:
– Initializes local variables
– Evaluates statements
– Evaluates rules
Transform function files have the .xfr extension.
Specify the .xfr file
17. Creating Graph - xfr
Transform functions: A set of rules that compute output values from input values.
Business rule: Part of a transform function that describes how you manipulate one field of your output data.
Variable: Optional part of a transform function that provides storage for temporary values.
Statement: Optional part of a transform function that assigns values of variables in a specific order.
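The parts of a transform function (variables, statements, business rules) can be pictured with a small Python analogue. This is not .xfr syntax; the function name and the input and output field names are hypothetical.

```python
def reformat_customer(in_rec):
    """Python analogue of a transform function: one input record in, one output record out."""
    # 1. Initialize local variables (temporary storage).
    full_name = ""

    # 2. Evaluate statements: assign values to variables, in order.
    full_name = f"{in_rec['first_name']} {in_rec['last_name']}"

    # 3. Evaluate rules: each business rule computes one output field.
    out = {}
    out["id"] = in_rec["id"]
    out["name"] = full_name.upper()
    out["balance_cents"] = round(in_rec["balance"] * 100)
    return out


sample = {"id": 1, "first_name": "Ada", "last_name": "Lovelace", "balance": 12.5}
print(reformat_customer(sample))
```

Each assignment to an output field plays the role of one business rule, while full_name is the optional variable that holds a temporary value.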
18. Sample Components
Sort
Dedup
Join
Replicate
Rollup
Filter by Expression
Merge
Lookup
Reformat etc.
19. Creating Graph – Sort Component
Sort: The Sort component reorders data. It has two parameters: Key and max-core.
Key: The Key parameter describes the collation order.
Max-core: The max-core parameter controls how often the Sort component dumps data from memory to disk.
Specify Key for the Sort
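The interplay between the key and max-core can be sketched as an external sort in Python. This is an illustration of the general technique, not Ab Initio's implementation. The parameter max_records_in_memory stands in for max-core and counts records rather than memory, which is a simplifying assumption: the smaller it is, the more often sorted runs are dumped to disk.

```python
import heapq
import pickle
import tempfile
from operator import itemgetter


def _read_run(f):
    """Yield records back from one spilled run file."""
    f.seek(0)
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            return


def external_sort(records, key, max_records_in_memory):
    """Sort an iterable of dicts by `key`, spilling sorted runs to temporary files."""
    key_fn = itemgetter(key)
    runs, buffer = [], []

    def spill():
        # Sort what is in memory and dump it to disk as one run.
        buffer.sort(key=key_fn)
        f = tempfile.TemporaryFile()
        for rec in buffer:
            pickle.dump(rec, f)
        runs.append(f)
        buffer.clear()

    for rec in records:
        buffer.append(rec)
        if len(buffer) >= max_records_in_memory:
            spill()

    buffer.sort(key=key_fn)
    # Merge the on-disk runs with whatever is still in memory.
    merged = list(heapq.merge(*[_read_run(f) for f in runs], buffer, key=key_fn))
    for f in runs:
        f.close()
    return merged


data = [{"k": 5}, {"k": 1}, {"k": 3}, {"k": 2}, {"k": 4}]
print(external_sort(data, "k", max_records_in_memory=2))
```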
20. Creating Graph – Dedup Component
The Dedup component removes duplicate records.
The dedup criterion is one of unique-only, First, or Last.
Select Dedup criteria.
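The three dedup criteria can be sketched in Python as follows. The sketch assumes the input is already sorted (or at least grouped) by the key; the option strings and the field names are chosen for this example only.

```python
from itertools import groupby
from operator import itemgetter


def dedup_sorted(records, key, keep="first"):
    """Remove duplicates from records that are already sorted by `key`.

    keep: "first", "last", or "unique-only" (drop every key that occurs more than once).
    """
    if keep not in ("first", "last", "unique-only"):
        raise ValueError(f"unknown keep option: {keep}")
    out = []
    for _, group in groupby(records, key=itemgetter(key)):
        group = list(group)
        if keep == "first":
            out.append(group[0])
        elif keep == "last":
            out.append(group[-1])
        elif len(group) == 1:  # unique-only
            out.append(group[0])
    return out


rows = [
    {"id": 1, "v": "a"},
    {"id": 1, "v": "b"},
    {"id": 2, "v": "c"},
]
print(dedup_sorted(rows, "id", keep="first"))
print(dedup_sorted(rows, "id", keep="last"))
print(dedup_sorted(rows, "id", keep="unique-only"))
```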
21. Creating Graph – Replicate Component
Replicate combines the data records from its inputs into one flow and writes a copy of that flow to each of its output ports.
Use Replicate to support component parallelism.
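In Python terms, the behaviour described for Replicate resembles chaining the inputs and then handing an independent copy of the combined flow to each consumer. A minimal sketch, with the function name and sample data assumed:

```python
from itertools import chain, tee


def replicate(inputs, n_outputs):
    """Combine the input flows into one and return n_outputs independent copies of it."""
    combined = chain.from_iterable(inputs)
    return tee(combined, n_outputs)


out_a, out_b = replicate([[1, 2], [3]], n_outputs=2)
print(list(out_a))  # each output sees every record
print(list(out_b))
```

Each returned iterator can be consumed by a different downstream step, which is how separate components end up working from the same source data.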
22. Creating Graph – Join Component
• Specify the key for join
• Specify Type of Join
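The two settings above, the key and the join type, can be illustrated with a small hash join in Python. The join type names used here ("inner", "left_outer", "full_outer") are labels chosen for this sketch and are not necessarily the options shown in the GDE; unmatched sides are represented by None.

```python
from collections import defaultdict


def join(left, right, key, join_type="inner"):
    """Hash join of two lists of dicts on `key`, returning (left, right) pairs."""
    if join_type not in ("inner", "left_outer", "full_outer"):
        raise ValueError(f"unknown join type: {join_type}")

    # Index the right side by key value.
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)

    matched_keys = set()
    out = []
    for l in left:
        matches = index.get(l[key], [])
        if matches:
            matched_keys.add(l[key])
            out.extend((l, r) for r in matches)
        elif join_type in ("left_outer", "full_outer"):
            out.append((l, None))

    if join_type == "full_outer":
        # Add right-side records that never found a partner.
        out.extend((None, r) for r in right if r[key] not in matched_keys)
    return out


customers = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
orders = [{"id": 2, "item": "x"}, {"id": 3, "item": "y"}]
print(join(customers, orders, "id", "inner"))
print(join(customers, orders, "id", "full_outer"))
```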
23. Database Configuration (.dbc)
A file with a .dbc extension provides the GDE with the information it needs to connect to a database. A configuration file contains the following information:
– The name and version number of the database to which you want to connect.
– The name of the computer on which the database instance or server to which you want to connect runs, or on which the database remote access software is installed.
– The name of the database instance, server, or provider to which you want to connect.
You generate a configuration file by using the Properties dialog box for one of the Database components.
24. Creating Parallel Applications
Types of Parallel Processing
– Component-level Parallelism: An application with multiple components running simultaneously on separate data uses component parallelism.
– Pipeline Parallelism: An application with multiple components running simultaneously on the same data uses pipeline parallelism.
– Data Parallelism: An application with data divided into segments that operates on each segment simultaneously uses data parallelism.
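The three kinds of parallelism can be contrasted with a toy Python sketch. It only illustrates the shapes of the three patterns: the stage functions and data are invented, and the generator pipeline interleaves stages record by record rather than running them on separate processors.

```python
from concurrent.futures import ThreadPoolExecutor


def parse(lines):
    """Pipeline stage 1: turn text into numbers, one record at a time."""
    for line in lines:
        yield int(line)


def double(values):
    """Pipeline stage 2: consumes each record as soon as stage 1 produces it."""
    for v in values:
        yield v * 2


def pipeline(lines):
    """Pipeline parallelism (in spirit): stages chained on the same data stream."""
    return list(double(parse(lines)))


def data_parallel(partitions):
    """Data parallelism: the same pipeline runs on every data segment at once."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        return list(pool.map(pipeline, partitions))


def component_parallel(customers, orders):
    """Component parallelism: different components run at once on separate data."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        sorted_customers = pool.submit(sorted, customers)
        total_orders = pool.submit(sum, orders)
        return sorted_customers.result(), total_orders.result()


print(pipeline(["1", "2"]))
print(data_parallel([["1"], ["2", "3"]]))
print(component_parallel([3, 1, 2], [10, 20]))
```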
25. Partition Components
Partition by Expression: Dividing data according to a DML expression.
Partition by Key: Grouping data by a key.
Partition with Load Balance: Dynamic load balancing.
Partition by Percentage: Distributing data so that the output is proportional to fractions of 100.
Partition by Range: Dividing data evenly among nodes, based on a key and a set of partitioning ranges.
Partition by Round-robin: Distributing data evenly, in block-size chunks, across the output partitions.
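Four of these partitioning strategies can be sketched in a few lines of Python (percentage and load-balance partitioning are left out). These are illustrations of the ideas only: the function names, the CRC-based hash, and the treatment of split points as inclusive upper bounds are assumptions of this sketch.

```python
import zlib
from bisect import bisect_left


def partition_by_key(records, key, n):
    """Records with equal key values always land in the same partition."""
    parts = [[] for _ in range(n)]
    for rec in records:
        h = zlib.crc32(repr(rec[key]).encode("utf-8"))  # stable across runs
        parts[h % n].append(rec)
    return parts


def partition_by_round_robin(records, n, block_size=1):
    """Deal records out in block_size chunks, cycling over the partitions."""
    parts = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        parts[(i // block_size) % n].append(rec)
    return parts


def partition_by_range(records, key, split_points):
    """split_points are sorted upper bounds; larger values go to the last partition."""
    parts = [[] for _ in range(len(split_points) + 1)]
    for rec in records:
        parts[bisect_left(split_points, rec[key])].append(rec)
    return parts


def partition_by_expression(records, expression, n):
    """`expression` is any function that returns a partition number in 0..n-1."""
    parts = [[] for _ in range(n)]
    for rec in records:
        parts[expression(rec)].append(rec)
    return parts


rows = [{"k": 5}, {"k": 10}, {"k": 15}, {"k": 25}]
print(partition_by_range(rows, "k", [10, 20]))
print(partition_by_round_robin(list(range(6)), 2, block_size=2))
```

Partitioning by key is the one to reach for when later steps (such as a join or rollup on that key) need all records with the same key value in one place; round-robin only balances volume.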
26. Departition Components
Concatenate: The Concatenate component produces a single output flow that contains first all the records from the first input partition, then all the records from the second input partition, and so on.
Gather: The Gather component collects inputs from multiple partitions in an arbitrary manner and produces a single output flow; it does not maintain sort order.
Interleave: The Interleave component collects records from many sources in round-robin fashion.
Merge: The Merge component collects inputs from multiple sorted partitions and maintains the sort order.
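The difference between the departition components is mostly about output order, which a short Python sketch can show. Gather is omitted because its order is arbitrary by definition. The function names and the sample partitions are assumptions of this sketch.

```python
import heapq
from itertools import chain, zip_longest

_MISSING = object()  # marks exhausted partitions during interleaving


def concatenate(partitions):
    """All records of partition 0, then all of partition 1, and so on."""
    return list(chain.from_iterable(partitions))


def interleave(partitions):
    """Take one record from each partition in turn (round robin)."""
    out = []
    for row in zip_longest(*partitions, fillvalue=_MISSING):
        out.extend(rec for rec in row if rec is not _MISSING)
    return out


def merge(partitions, key):
    """Combine partitions that are each sorted by `key`, keeping the sort order."""
    return list(heapq.merge(*partitions, key=key))


parts = [[1, 4], [2, 5, 6]]
print(concatenate(parts))
print(interleave(parts))
print(merge(parts, key=lambda x: x))
```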
27. Multifile Systems
A multifile system is a specially created set of directories, possibly on different machines, which have identical substructure.
Each directory is a partition of the multifile system. When a multifile is placed in a multifile system, its partitions are files within each of the partitions of the multifile system.
A multifile system leads to better performance than flat file systems because it can divide your data among multiple disks or CPUs.
Typically (an SMP machine is the exception) a multifile system is created with the control partition on one node and data partitions on other nodes to distribute the work and improve performance.
To do this, use full internet URLs that specify file and directory names and locations on remote machines.
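The layout described above can be mimicked on a single machine with ordinary directories. This Python sketch only illustrates the "identical substructure" idea; real multifile systems are created with Co>Operating System tooling, and the directory names (control, partition_0, main) are hypothetical.

```python
import tempfile
from pathlib import Path


def create_multifile_system(root, n_partitions, subdirs=("main",)):
    """Create a control directory and n data partition directories with identical substructure."""
    root = Path(root)
    control = root / "control"
    partitions = [root / f"partition_{i}" for i in range(n_partitions)]
    for directory in [control, *partitions]:
        for sub in subdirs:
            (directory / sub).mkdir(parents=True, exist_ok=True)
    return control, partitions


def write_multifile(partitions, relative_path, partitioned_records):
    """Write one file per partition, all at the same relative path."""
    for part_dir, records in zip(partitions, partitioned_records):
        (part_dir / relative_path).write_text("\n".join(records) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    control, parts = create_multifile_system(tmp, 3)
    write_multifile(parts, "main/customer_info.dat", [["a"], ["b"], ["c"]])
    written = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*.dat"))

print(written)
```

The "multifile" here is the set of three customer_info.dat files taken together, one per partition directory.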
29. SANDBOX
A sandbox is a collection of graphs and related files that are stored in a single directory tree and treated as a group for purposes of version control, navigation, and migration.
A sandbox can be a file system copy of a datastore project.
In the graph, instead of specifying the entire path for a file location, we specify only the sandbox parameter variable, for example $AI_IN_DATA/customer_info.dat, where $AI_IN_DATA contains the entire path with reference to the sandbox $AI_HOME variable.
The actual in_data directory is $AI_HOME/in_data in the sandbox.
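How a path such as $AI_IN_DATA/customer_info.dat resolves can be imitated with Python's string.Template. This is only a sketch of nested parameter expansion; the value given to AI_HOME is a made-up path, and the resolve helper is not an Ab Initio function.

```python
from string import Template


def resolve(value, params, max_depth=10):
    """Expand $NAME references repeatedly until nothing changes."""
    for _ in range(max_depth):
        expanded = Template(value).substitute(params)
        if expanded == value:
            return expanded
        value = expanded
    raise ValueError("parameter references nest too deeply or form a cycle")


sandbox_params = {
    "AI_HOME": "/home/dev/sandboxes/customer",  # hypothetical sandbox root
    "AI_IN_DATA": "$AI_HOME/in_data",           # defined relative to AI_HOME
}

path = resolve("$AI_IN_DATA/customer_info.dat", sandbox_params)
print(path)
```

Because the graph only mentions $AI_IN_DATA, moving the sandbox means changing AI_HOME in one place.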
30. SANDBOX
The sandbox provides an excellent mechanism to maintain uniqueness while moving from the development to the production environment by means of switch parameters.
We can define parameters in the sandbox that can be used across all the graphs pertaining to that sandbox.
The topmost variable $PROJECT_DIR contains the path of the home directory.
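The effect of switching environments can be sketched in Python: only the topmost parameter changes, and every shared parameter defined relative to it follows. The environment names, directory paths, and the IN_DATA and OUT_DATA parameters are invented for this illustration.

```python
from string import Template

# Hypothetical per-environment values for the topmost sandbox parameter.
ENVIRONMENTS = {
    "dev": {"PROJECT_DIR": "/home/dev/sandboxes/customer"},
    "prod": {"PROJECT_DIR": "/apps/abinitio/customer"},
}

# Parameters shared by every graph in the sandbox, defined relative to PROJECT_DIR.
SHARED = {
    "IN_DATA": "$PROJECT_DIR/in_data",
    "OUT_DATA": "$PROJECT_DIR/out_data",
}


def sandbox_parameters(env):
    """Return the full parameter set for one environment."""
    base = ENVIRONMENTS[env]
    resolved = {name: Template(value).substitute(base) for name, value in SHARED.items()}
    return {**base, **resolved}


print(sandbox_parameters("dev")["IN_DATA"])
print(sandbox_parameters("prod")["IN_DATA"])
```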
32. Deploying
Every graph, after validation and testing, has to be deployed as a .ksh file into the run directory on UNIX.
This .ksh file is an executable file that is the backbone of the entire automation/wrapper process.
The wrapper automation consists of .run, .env, dependency list, job list, etc.
For a detailed description of the wrapper and the different directories and files, please refer to the documentation on the wrapper / UNIX presentation.