Day 1 Consolidation 1695814006998

The IDMC Associate Bootcamp is a comprehensive training program designed to enhance data integration skills across various levels. It spans three days, covering topics such as cloud data integration, governance, and modernization, culminating in a hands-on exam. Participants are encouraged to enroll in a mandatory Cloud Data Integration Services course to gain practical experience.

IDMC Associate Bootcamp Intro

Melina Oliveira · Senior Solutions Consultant

Global Partner Technical Sales · GPTS

2023
IDMC Associate Bootcamp Setup
A single-stop, comprehensive program to serve all levels of learning needs!

DAY 1 → DAY 2 → DAY 3 → Take the EXAM!

Enroll in the hands-on Cloud Data Integration course.


Housekeeping for this session!
A few tips on how to get the best ON24 user experience

Key Facts
• Key documentation under Related Content
• Q&A box available for live sessions
• Access to speaker information
• Resize and change the layout as you wish
• Toolbox always available

Panels: Presentation · Q&A · Docs · Toolbox
Agenda - DAY 1

Bootcamp Intro · IDMC Intro · IDMC Data Integration · IPU Pricing Model

4 © Informatica. Proprietary and Confidential.


Agenda - DAY 2

DAY 1 Recap · Cloud Data Governance · Cloud Modernization · Modernization Panel Discussion



Agenda - DAY 3

DAY 2 Recap · MDM & 360 Applications · What’s New in IDMC · What’s Next?



IMPORTANT: Action Required!

IICS: Cloud Data Integration Services Course

This course is MANDATORY to gain hands-on Data Integration experience.
Register TODAY! It’s FREE of charge!



Intelligent Data Management Cloud

Salvatore Moretto · PreSales Director


Global Partner Technical Sales · GPTS

2023
Technology Challenges
• Data is difficult to find and understand
• Poor data quality, not trusted
• Can’t scale for volume and variety of data
• Data and applications siloed and fragmented
• Difficult to share data; data not governed or protected

Business Challenges
• Reducing costs and growing revenue with access to trusted data for business insights
• Increasing customer acquisition & retention with a single source of customer data
• Improving customer experience and optimizing supply chains
• Reducing costs and improving business process efficiency
• Ensuring data is trusted and used responsibly by enforcing data governance
• Empowering data consumers to find, understand, trust and access data

68% of data leaders predict an increase in data management investments in 2023.
$169B in data management software spend by 2026, with a 5-year CAGR of 16.1%.
Why is Data Management Hard and Complex?

DATA CATALOG · DATA INTEGRATION · API & APP INTEGRATION · DATA QUALITY · MDM & 360 APPLICATIONS · GOVERNANCE & PRIVACY · DATA MARKETPLACE

Organizations typically rely on a different set of vendors and tools for each category:
• 50% of organizations rely on 5+ tools
• 55% have 1,000+ data sources, and 78% predict more in 2023

Statistics are based on 600 CDOs surveyed around the world – November 2022
Achieving Business Outcomes with Data

TRANSFORM it into a trusted resource

EVOLVE & BUILD on its intelligence

DEMOCRATIZE its use for all

ILLUMINATE insights & opportunities



DATA CONSUMERS

ETL Developer Data Engineer Citizen Integrator Data Scientist Data Analyst Business Users

Intelligent Data Management Cloud

DISCOVER & UNDERSTAND · ACCESS & INTEGRATE · CONNECT & AUTOMATE · CLEANSE & TRUST · MASTER & RELATE · GOVERN & PROTECT · SHARE & DEMOCRATIZE

DATA CATALOG · DATA INTEGRATION · API & APP INTEGRATION · DATA QUALITY · MDM & 360 APPLICATIONS · GOVERNANCE & PRIVACY · DATA MARKETPLACE

10K+ Metadata-Aware Connectors


AI-Powered Metadata Intelligence & Automation

Connectivity
Metadata System of Record

Multi-Cloud Hybrid

On-premises Enterprise Cloud



The Industry’s Only Intelligent Data Management Cloud

• Open and Flexible: designed to work within your reference architecture
• API and Microservices Based: modern architecture for optimal performance and resiliency
• Low Code, No Code: increased productivity for data engineers and citizen integrators
• Elastic and Serverless: cost-optimized performance and scalability
• Best of Breed: industry-leading products and services, all in one unified platform
• Cloud Native: built from the ground up for enterprise cloud workloads
• Multi-cloud and Hybrid: runs on and interoperates with what you have today and tomorrow
• Secure: highest level of cloud security certifications and attestations
• Leverages Best of Breed Open Source: delivers open-source innovations without the complexity
• Single Uniform Consumption Pricing: predictable, flexible pricing that adjusts to just what you need today
Intelligent and Scalable for the Most Demanding Enterprises

49 Trillion transactions per month
18 Petabytes of metadata


Intelligent and Scalable for the Most Demanding Enterprises

AI-powered Metadata Intelligence & Automation:
• Analysts and data scientists find trusted data faster
• Boosts productivity for data engineers and data stewards
• Manual efforts reduced by up to 10X
Simple Consumption Based Pricing

IPU = Informatica Processing Unit: pay for only what you use, with access to all platform services.

• Cloud Friendly: cloud-native services aligned to customer needs
• Usage Based: pay only for what you use; includes connectors
• Comprehensive: use the full breadth of services available on the platform
• Chargebacks: track consumption for departmental chargebacks
• Elastic: scale up or down to meet business demands


Cloud Only, Consumption Driven!



Customer Leadership—Global and Across All Key Verticals

Customer Success:
• 5,000+ customers
• 9 of the Fortune 10
• 85 of the Fortune 100


IDMC · Technical Intro

Kilian Ingelfinger · Sr Solutions Consultant


Global Partner Technical Sales · GPTS

2023
DATA CONSUMERS

ETL Developer Data Engineer Citizen Integrator Data Scientist Data Analyst Business Users

Intelligent Data Management Cloud

DISCOVER & UNDERSTAND · ACCESS & INTEGRATE · CONNECT & AUTOMATE · CLEANSE & TRUST · MASTER & RELATE · GOVERN & PROTECT · SHARE & DEMOCRATIZE

DATA CATALOG · DATA INTEGRATION · API & APP INTEGRATION · DATA QUALITY · MDM & 360 APPLICATIONS · GOVERNANCE & PRIVACY · DATA MARKETPLACE

10K+ Metadata-Aware Connectors


AI-Powered Metadata Intelligence & Automation

Connectivity
Metadata System of Record

DATA SOURCES

• SaaS Apps
• On-premises Sources: Mainframe, Applications, Databases
• Real-time / Streaming Sources: IoT, Machine Data, Logs


IDMC Architecture

[Architecture diagram] IDMC services (Data Ingestion, Data Integration, Data Quality, SaaS MDM, 360 Applications, Data Delivery and Marketplace, API & Application Integration, Data Catalog, Governance and Privacy) run on CLAIRE, the intelligence and automation layer. They serve business users, data analysts, lines of business, data engineers, data scientists and governance managers, and connect to stream processing (Google Cloud Pub/Sub), cloud data lakes (Azure Data Lake Storage Gen2, Google Cloud Storage; landing, enrichment and enterprise zones), cloud data warehouses (Google BigQuery, Azure Synapse Analytics), data science / AI platforms (Azure Machine Learning), IoT sources, and elastic Spark compute.


Informatica Cloud Integration Reference Architecture

Runtime options:
1. Locally hosted agent, managed by the customer (Secure Agent Group inside the corporate network, behind the firewall)
2. Cloud hosted agent, managed by the customer
3. Serverless Spark/Kubernetes cluster, managed by Informatica
4. Pushdown optimization to cloud lakehouses

Metadata is held in the Informatica-hosted Intelligent Cloud Services (IDMC); data flows between cloud applications, the cloud data lake / warehouse, and the chosen runtime.
IDMC – A Secure Platform

Encryption
• Encryption at rest: metadata & data encrypted with AES-256
• Encryption in transit: TLS 1.2, AES256-SHA (256-bit) cipher
• Envelope encryption; data and metadata encryption
• Different encryption keys per IDMC service
• HTTPS SSL long polling; no inbound firewalls

Key Management
• Data keys are tenant specific, with automatic rotation every year
• Customer-controlled key rotation
• Two levels of master keys; master key stored in AWS KMS / Azure Key Vault
• Data storage logically separated per tenant

Traffic Routing (network capabilities*)
• DirectConnect / ExpressRoute
• PrivateLink (currently on AWS)
• Customer-managed VPN
• Connector-level options (Azure Synapse, Snowflake, etc.)
*Note: network and infrastructure related capabilities have customer dependencies.

Authentication
• SSO (Single Sign On): token based, certificate based, SAML 2.0 based
• MFA (Multi Factor Authentication)
Intelligent Data Management Cloud – IDMC Common Services

Intelligent and scalable for the most demanding enterprises: 49 trillion transactions per month; 18 petabytes of metadata.
Operational Insights

Operational Analytics · Monitoring & Alerting · CLAIRE Recommendations · Hybrid Cloud · Integrated Dashboard



Monitor



Administrator
External Authentication Support

SAML Authentication

Support for SSO based authentication through a


variety of SAML 2.0 compliant Identity providers
• ADFS
• LDAP with SAML 2.0 support
• Okta, SSO Circle, Shibboleth, AAD
• Other compliant IDP with SAML 2.0 support
IDMC
Support for
• Identity provider initiated SSO
• Service provider initiated SSO
• Attribute Mapping
• Role & Default Group Mapping

IP Address Filtering
Allow only trusted IP ranges to access the tenant



Administrator
Role Based Access Model

Roles consist of privileges. Roles are assigned to users and user groups, and permissions are granted on resources.

Hot Tips:
• When a new asset is created, the default permissions for that asset allow access to anyone with a role having privilege(s) to the asset.
• To restrict permission to an asset, simply add only the users or user groups who should have access to this asset.
• Restrict Admin role access to only those who absolutely need it, and revisit it when there is a change in roles and responsibilities for those with Admin access.
• Revisit roles and audit logs regularly.
• Create custom roles when necessary.
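The model above (roles hold privileges, users hold roles, assets optionally restrict who may use them) can be sketched in a few lines. This is an illustrative stand-in, not the IDMC API; all role, user and asset names are made up.

```python
# Hypothetical role/privilege tables mirroring the slide's model.
ROLES = {
    "Designer": {"read", "create", "update"},
    "Admin": {"read", "create", "update", "delete", "administer"},
}

USER_ROLES = {"alice": {"Designer"}, "bob": {"Admin"}}

# Per-asset restriction lists; an empty list means "anyone whose role
# grants the privilege", mirroring the default behavior described above.
ASSET_ACL = {"mapping_sales": [], "mapping_hr": ["bob"]}

def can_access(user: str, asset: str, privilege: str) -> bool:
    """Check role privileges first, then the asset's restriction list."""
    privileges = set().union(*(ROLES[r] for r in USER_ROLES.get(user, set())))
    if privilege not in privileges:
        return False
    acl = ASSET_ACL.get(asset, [])
    return not acl or user in acl  # empty ACL = default open access
```

With this shape, restricting an asset is exactly the "hot tip" above: add only the intended users to its list, and everyone else is denied even if their role carries the privilege.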



Cloud Data Integration

Cliff Darroch · Principal Product Specialist

2023
IDMC Security Architecture Diagram

[Architecture diagram] The Intelligent Cloud Services front end is multi-tenant, with micro services and per-tenant metadata repositories; metadata is encrypted with AES (256-bit). A Secure Agent (Windows, Linux) runs behind the customer firewall next to on-premise data and applications (data warehouse, legacy databases), optionally through a proxy; credentials and logs travel over an encrypted connection. Business data moves over HTTPS; design and administration traffic uses TLS 1.2. Users reach the browser-based interface through a web client, with identity/access via an optional SAML 2.0 provider. An Informatica-managed cloud runtime can host the Secure Agent and Process Server.
IDMC Micro Services
What are they?

• Data Integration
- Cloud Data Integration (CDI) – Batch mainly, with a few RT-enabled connectors. Most closely resembles
PowerCenter in feel and execution
- Advanced Mappings – Almost all of the same functionality as Cloud Data Integration, with much better
hierarchical data handling and execution on Spark
• Application Integration
- Cloud Application Integration (CAI) – Designed for orchestration or transaction-style patterns on an event-
driven basis
• Mass Ingestion (MI)
- File Mass Ingestion – Database Ingestion – App Mass Ingestion – Streaming Mass Ingestion
- Low-touch, wizard-driven tool for moving large amounts of data from source to target with no
transformation. Generally the first step in a full ELT pattern
• Data Quality
• Data Profiling


IDMC Services
By Pattern

Pattern                          IDMC Micro Service
Load Data From File (Batch)      Cloud Data Integration; Advanced Mappings; Mass Ingestion - Files
Load Data From Database (Batch)  Cloud Data Integration; Advanced Mappings
Load Data From Database (CDC)    Cloud Data Integration – CDC; Mass Ingestion - Database
Load Data from Queue             Cloud Data Integration; Mass Ingestion - Streaming; Cloud Application Integration
ELT Patterns                     Cloud Data Integration; Advanced Mappings
FTP/SFTP Pattern                 Cloud Application Integration; Cloud Data Integration; Mass Ingestion - Files


IDMC Services
By Latency

Latency                      IDMC Micro Service
Event – API                  Cloud Data Integration – SOAP/REST; Cloud Application Integration – SOAP/REST; API Gateway – SOAP/REST
Event – File Watch           Cloud Data Integration; Advanced Mappings; Cloud Application Integration; Mass Ingestion - Files
Event – Object Store Watch   Cloud Data Integration; Cloud Application Integration
CDC                          Cloud Data Integration; Mass Ingestion - Databases
Realtime Stream              Cloud Data Integration; Mass Ingestion - Streaming


Use Case
A Day in the Life

• Data must be retrieved from SFTP and archived
• Data must be transformed and standardized
• Data will be loaded into Snowflake
• Data will be aggregated into a summary table
• Any errors must trigger a case in ServiceNow
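The five requirements above form a simple ordered pipeline with an error hook at every step. Here is a hedged sketch of that shape as a plain Python driver; the step functions are hypothetical stand-ins for the IDMC tasks (SFTP fetch, transform, Snowflake load, aggregation), and the error callback records what a ServiceNow case would contain.

```python
def run_pipeline(steps, on_error):
    """Run named steps in order; stop and report on the first failure."""
    completed = []
    for name, step in steps:
        try:
            step()
            completed.append(name)
        except Exception as exc:
            on_error(name, exc)  # e.g. open a ServiceNow case
            break
    return completed

def load_snowflake():
    raise RuntimeError("load failed")  # simulated failure for the demo

cases = []
steps = [
    ("retrieve_and_archive", lambda: None),   # SFTP fetch + archive
    ("transform_standardize", lambda: None),  # cleanse / standardize
    ("load_snowflake", load_snowflake),       # load into Snowflake
    ("aggregate_summary", lambda: None),      # build the summary table
]
done = run_pipeline(steps, lambda name, exc: cases.append((name, str(exc))))
```

In IDMC this orchestration role is played by a taskflow, with the error branch wired to the ServiceNow integration; the sketch just shows the control flow.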



Use Case
Flow



Use Case
Data must be retrieved from SFTP and Archived



DEMO
File Movement
Cloud Mass Ingestion



Mass Ingestion Services

Files (CMI-F) · RDBMS (CMI-DB) · Streaming (CMI-S) · Applications (CMI-A)



Mass Ingestion Files
Overview

• Provides file transfer capabilities for exchanging files between on-premise and cloud repositories, using standard protocols
• Transfer any file type with high performance and cloud scalability
• Job- and file-level tracking and monitoring
• Orchestrate file transfer and ingestion in hybrid/cloud as a managed and secure service

Flow: (1) an MI Task triggers the File Mass Ingestion service via the Secure Agent; (2–3) data is ingested through the advanced FTP/SFTP/FTPS connector into targets such as GCS, Redshift, S3, and Azure DW, Blob and Data Lake; (4) the job log is updated in the MI metadata.
Mass Ingestion Files

• Unified user experience for all ingestion types (Streaming, Database, File, Application)
• Simple, wizard-based task definition
• Wide list of supported sources/targets
• Advanced, highly scalable connectors for handling FTP/SFTP/FTPS
• Filter files by file name pattern, file size, file date
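The last bullet, selecting files by name pattern, size and date, amounts to a glob match plus two threshold checks. A minimal stdlib sketch of that selection rule (not the CMI-F implementation; the file inventory is made up):

```python
import fnmatch
from dataclasses import dataclass

@dataclass
class FileInfo:
    name: str
    size: int    # bytes
    mtime: float # seconds since epoch

def select_files(files, pattern="*", min_size=0, newer_than=0.0):
    """Keep files matching the glob pattern, size and date thresholds."""
    return [
        f for f in files
        if fnmatch.fnmatch(f.name, pattern)
        and f.size >= min_size
        and f.mtime > newer_than
    ]

inventory = [
    FileInfo("orders_2023.csv", 2_048, 1_700_000_000.0),
    FileInfo("orders_2022.csv", 1_024, 1_600_000_000.0),
    FileInfo("readme.txt", 64, 1_700_000_100.0),
]
picked = select_files(inventory, pattern="orders_*.csv",
                      newer_than=1_650_000_000.0)  # only the recent orders file
```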
Mass Ingestion Files

• API, schedule or file event triggered
• File actions:
- Compress/decompress (Zip, Gzip, Tar)
- Encrypt/decrypt (PGP)
• Highly scalable, any file type
• Unified monitoring and tracking experience: job- and file-level tracking and monitoring

Leverage the File Listener
• A platform-level asset that provides file listener capabilities that can be used by different services
• Users can define/manage file listeners, and different apps/services can register/invoke file listeners (via UI or API)
• Usage:
- File Mass Ingestion, as a scheduling option: move files when they land in a specific folder
- Taskflow: trigger a taskflow when a file event occurs, or file watch inside a taskflow process
- B2B Gateway, as a scheduling option: process files when they land in a specific folder
Cloud Mass Ingestion Databases

• Provides database ingestion capabilities as part of the IICS Mass Ingestion service
• Ingest relational database data from Oracle, SQL Server, MySQL and Teradata, with schema drift support on CDC-supported databases
• Real-time monitoring of ingestion jobs, with lifecycle management and alerting in case of issues
• Orchestrate database data ingestion in hybrid/cloud as a managed and secure service

On-premises sources (via the Secure Agent): databases, mainframe, app servers, data warehouses.
Supported cloud targets: Amazon S3, Azure ADLS & Synapse, Apache Kafka, Snowflake — covering data warehouses, data lakes and Kafka-based messaging.
Benefits of Mass Ingestion Databases

1. Supports both data synchronization & real-time analytics use cases: faster decision making
2. Wizard-driven experience for ingestion: increase business agility
3. Efficiently ingest CDC data from 1000’s of tables: no expensive maintenance
4. Automatic schema drift addressing: increased trust in data assets
5. OOTB connectivity to CDC sources, data lake & DWH targets: no need to hand code
6. Real-time monitoring and alerting: faster troubleshooting
Mass Ingestion Streaming - Overview

• Provides streaming ingestion capabilities as part of the IICS Mass Ingestion service
• Ingest streaming data (logs, clickstream, social media, sensor/IoT and machine data, weblogs, messaging systems) from sources such as Kafka into Kinesis, S3, ADLS, Firehose, messaging systems, etc.
• Real-time monitoring of ingestion jobs, with lifecycle management and alerting in case of issues
• Orchestrate streaming data ingestion in hybrid/cloud as a managed and secure service

Feeds real-time analytics, the data lake and ML consumption.


Benefits of Mass Ingestion Streaming

1. Single ingestion solution for all patterns: save time and money
2. Wizard-driven experience for ingestion: increase business agility
3. Enables the business to ingest streaming data for their usage: faster decision making
4. Edge transformations for cleansing data: increased trust in data assets
5. Connectivity to streaming sources & targets: no need to hand code
6. Real-time monitoring and alerting: faster troubleshooting
Mass Ingestion Applications
• CMI-A can transfer data from Software-as-a-Service (SaaS) and on-premise applications to cloud-based data warehouses.
• The SaaS and on-premises applications used in your business or organization store large amounts of business-critical data on a daily basis. You can use CMI-A to transfer the data stored by your applications to cloud-based targets that can handle large volumes of data.
• After you transfer the data to the target, you can consolidate the data and use it for various purposes, such as advanced data analytics and data warehousing.

CMI-A can perform the following types of load operations:
• Initial load
- Loads source data read at a single point in time to a target.
• Incremental load
- Loads data changes continuously, or until the ingestion job is stopped or ends.
• Initial and incremental load
- Performs an initial load of point-in-time data to the target, and then automatically switches to propagating incremental data changes made to the same source objects on a continuous basis.
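The difference between the load types above can be shown over an in-memory "source" of key/value rows. This is an illustrative sketch of the semantics only, not the CMI-A service; an initial load copies a snapshot, and an incremental load then applies a stream of change events.

```python
def initial_load(source, target):
    """Initial load: copy a point-in-time snapshot to the target."""
    target.update(source)

def incremental_load(changes, target):
    """Incremental load: apply a stream of change events to the target."""
    for op, key, value in changes:
        if op == "delete":
            target.pop(key, None)
        else:  # insert or update
            target[key] = value

source = {1: "alpha", 2: "beta"}
target = {}
initial_load(source, target)  # "initial and incremental": snapshot first...
incremental_load(
    [("update", 2, "beta2"), ("insert", 3, "gamma"), ("delete", 1, None)],
    target,
)                             # ...then continuous change propagation
```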
Summary

Cloud-native ingestion
• Unified service for ingestion from various sources
• Orchestration for ingestion from a variety of patterns

Connectivity
• On-prem database & CDC
• On-prem & cloud files
• IoT & streaming
• Cloud data lakes, data warehouses and messaging hubs

Wizard-driven design
• Simple, easy-to-use wizard
• Edge transformations
• Intent-driven ingestion

Real-time monitoring
• Pictorial view of the ingestion job
• Real-time flow visualization
• Lifecycle management
Use Case
Data must be Transformed and Standardized & Loaded into Snowflake



DEMO
Data Transformation
Cloud Data Integration



With a modern, role-based unified experience

• Uniform front end for cloud services
• Role-based, easy-access, individualized “Home Page”
• Integrated access to Marketplace, Community and guided tutorials

Unified experience across all cloud services:
• Integration task wizards for citizen integrators
• Cloud Mapping Designer for integration experts
• Transformations
• Task flows
• Multi-cloud integrations using CDI

Key Platform Capabilities

• Ease of Use
• Templates and Wizards
• Micro-service Architecture
• Reusability
• Broad Hybrid and Multi-Cloud
Connectivity
• No coding across the platform
• Performance optimizations such as CDC, parallel processing, pushdown optimization, Mass Ingestion, etc.

Hybrid, Multi-Cloud Integrations using CDI · Transformations and Patterns


Tools For Making Mapping Easier And More Robust

• Tools for defining complex data


- Hierarchy Parser
- Complex Flat File (For multi-record Mainframe Data)
- Intelligent Structure Discovery/Parser *
- And more…
• Tools for Automating Repetitive Tasks
- User Defined Functions
- Mapplets
- Parameters and Variables
- Dynamic Mapping Tasks *
- Sub Processes (reusable taskflows)
Intelligent Structure Discovery
Mapping Challenges

• Mappings are tightly bound to schemas
• A change in metadata (data type, column, etc.) may involve manual changes to 100s of transformations and mappings
• Multiple mappings/workflows are created, tested and maintained for each source
Dynamic Mapping – Goals

• Support Any Data Integration Pattern


- Give customers the ability to develop a highly parameterized mapping
• Schema Drift
- Use one mapping to support multiple file formats
- Discover the schema at run-time
• Simplify Maintenance
- Turn hundreds of mappings into 1
- Support table changes without changing the related mappings
Efficiency & Flexibility with Dynamic Mapping

• Data Integration: build a template once, then automate mapping execution for 1000’s of sources with different schemas
• The mapping self-adjusts dynamically to external schema changes and column characteristics
• Rule-based ports and links, e.g., include all String ports
• Design time: generic source and target with varying schemas. Run time: varying logic, e.g., apply TRIM for a varying number of String fields in the source
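The "apply TRIM to all String ports" idea above can be sketched as one generic function that works against any schema discovered at run time. This is a conceptual stand-in (schemas as plain dicts, rows as dicts), not the IDMC dynamic mapping API:

```python
def apply_string_rules(row, schema, rule=str.strip):
    """Apply the rule to all String-typed columns; pass others through."""
    return {
        col: rule(val) if schema.get(col) == "String" else val
        for col, val in row.items()
    }

# Two sources with different schemas flow through the same "mapping":
schema_a = {"name": "String", "qty": "Integer"}
schema_b = {"city": "String", "country": "String", "pop": "Integer"}

out_a = apply_string_rules({"name": "  Ada ", "qty": 3}, schema_a)
out_b = apply_string_rules({"city": " Rome", "country": "IT ", "pop": 2}, schema_b)
```

The point is that the rule is bound to a column *type*, not to named columns, so a schema change (more or fewer String fields) needs no change to the mapping logic.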
Advanced Integration
Previously CDI-Elastic



Advanced Integration

• Single design-time experience for all your data integration needs (Advanced Design Experience for Data Integration: Advanced Mappings + CDI)
• 250+ purpose-built, cloud-native connectors with purpose-built transformations for any type of workload, at any scale
• Support for optimized mixed-mode execution (part DTM, part Spark)
• Intelligent (CLAIRE-driven) optimization at runtime for the best cost-performance

Execution paths, driven by the CLAIRE FinOps Optimization Engine: Spark processing on an auto-scaling elastic Spark cluster, ELT (pushdown optimization), or ETL on Secure Agent processing.
Advanced Mappings
Enabling Kubernetes for auto-scaling and provisioning

Same, familiar Informatica design time; a serverless Kubernetes cluster is deployed to your cloud network.

Architecture
• IDMC-based Spark serverless solution
• Cutting-edge technology: open source and best-in-class, built on the cloud, built for the future
• Lowers the overall TCO for customers with CLAIRE-based auto scaling and provisioning; Informatica manages the compute cluster
• Vendor-neutral architecture: Kubernetes-based orchestration for serverless Advanced Mappings, ready for multi-cloud from the get-go
Advanced Mappings Deployment Options

Azure · AWS · GCP

• The compute cluster is launched by the Secure Agent in the customer network
• The customer has complete control over network peering and assigning roles and privileges
Advanced Mappings: Automated Performance Tuning
Powered by CLAIRE

Advanced Mappings: Why tune?

The manual tuning loop (the developer picks new parameters, runs the job, analyzes the logs) is:
• Manual work: 30% of your engineers’ time
• Frequent outages: pager ringing at 3 AM
• Slow and expensive: missing SLAs every week
Advanced Mappings: What is tuned?

Optimal cluster parameters:
• Size
• Instance type
• Number of processors
• Amount of memory
• Disks
• …

Optimal Spark configuration:
• Parallelism
• Shuffle
• Storage
• JVM tuning
• Feature flags
• …
Advanced Mappings: Auto Scaling

• Auto scaling to meet your SLA at the least possible cost
• Dynamically responds to changes in environment and workloads to meet data volume and compute requirements
• Algorithm for scaling up/down effectively
• Auto-adjusts based on concurrency
• Horizontal and vertical scaling
• Increases/decreases parallelism by arriving at the optimum number of nodes and Spark executors based on the job demands
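As a rough illustration of "scale out for backlog, scale in when idle, within cluster bounds", here is a toy scaling heuristic. It is not the CLAIRE algorithm; it only shows the clamp-to-bounds shape such a policy takes.

```python
import math

def desired_nodes(pending_tasks, tasks_per_node, min_nodes=1, max_nodes=10):
    """Pick a node count from backlog and per-node capacity, clamped to bounds."""
    needed = math.ceil(pending_tasks / tasks_per_node) if pending_tasks else 0
    return max(min_nodes, min(max_nodes, needed))
```

A real autoscaler would also smooth over time and weigh concurrency and SLA targets, as the bullets above describe.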
Advanced Mappings:
Incremental File Load

• Challenge:
- I want to load data (different flat files) into cloud storage
- Files which have already been processed should be ignored
- I cannot just delete them, since they are used by other processes as well

• Solution:
- Advanced Mappings can track data that has been processed during a previous run of an MCT by persisting the state information of the job run
- Incremental File Load is a feature of Advanced Mappings which maintains this state information and prevents reprocessing of old data
- Time travel helps to go back in time and re-process files
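The state-keeping behind this feature can be sketched with a small JSON state file: remember which files earlier runs processed, and skip them next time. The real service persists this internally; the sketch only mirrors the behavior.

```python
import json, os, tempfile

def run_incremental(files, state_path):
    """Process only files not recorded in the persisted state."""
    seen = set()
    if os.path.exists(state_path):
        with open(state_path) as fh:
            seen = set(json.load(fh))
    new_files = [f for f in files if f not in seen]
    # ... process new_files here ...
    with open(state_path, "w") as fh:
        json.dump(sorted(seen | set(new_files)), fh)
    return new_files

state = os.path.join(tempfile.mkdtemp(), "state.json")
first = run_incremental(["a.csv", "b.csv"], state)            # both are new
second = run_incremental(["a.csv", "b.csv", "c.csv"], state)  # only c.csv
```

"Time travel" then amounts to restoring an earlier version of the state, so previously seen files become eligible again.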
Use Case
Data will be aggregated into a summary table



DEMO
Leverage Investment in CDW
Advanced Pushdown Optimization



ETL vs ELT
ETL and ELT differ in where the data processing occurs.

• Extract, Transform, Load (ETL): raw source data is extracted, transformed on the integration server, and then loaded into the data warehouse.
• Extract, Load, Transform (ELT): raw source data is extracted and loaded into the data warehouse, which then performs the transformation itself.

Advanced Pushdown Optimization (APDO)

Converts and processes data pipelines as native ecosystem commands and SQL queries, for faster processing at lower cost, while ensuring data stays within the ecosystem.

Features
• Data pipeline logic gets translated into cloud-ecosystem-native SQL (SQL-based PDO) or native ecosystem APIs/commands (ecosystem PDO), depending on the data integration pattern
• Support for full, source, and partial PDO
• Broadest array of connectors, and support for all major ecosystems (CDLs/CDWs, e.g. on S3, ADLS, GCS)
• Ecosystem agnostic
• Simple drop-down option in the GUI, with no need to learn proprietary commands

Enable faster processing with zero data egress charges through advanced pushdown optimization.
ODBC PDO vs APDO

ODBC Based Pushdown Optimization:
• Developed 15+ years ago; no further plans to expand transformation/function support
• An ODBC connection needs to be created and used in mappings
• Classical CDW patterns only
• Requires a Secure Agent
• Supports only ODBC connection features
• No separate license required

Advanced Pushdown Optimization:
• Specifically designed to support CDW/CDL patterns; major expansion plan for transformation/function support, with more features on the roadmap
• A native connector feature; no separate ODBC connection required
• Multiple patterns within CDW and CDL, including classical CDW
• All-cloud: works on Informatica Runtime and Informatica Advanced Serverless (supports a Secure Agent as well)
• Existing connector features are supported (for example, any advanced authentication options)
• Enabled with the IPU-based model; for non-IPU, requires a separate license


Use Case 1: Ecosystem Pushdown

Ecosystem pushdown transfers data from a cloud data lake to a data warehouse using native ecosystem commands.

• Without pushdown: data is loaded from the data lake (S3) to the data warehouse (Redshift) through the Informatica engine, at higher cost ($$$).
• With pushdown: data is loaded from the data lake to the data warehouse using AWS commands (COPY), at lower cost ($).
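For the S3 → Redshift case, "native ecosystem command" means the service emits a warehouse-side COPY statement instead of streaming rows through the engine. Here is a sketch of the kind of statement generated; the bucket, table and IAM role are made-up examples, not values the product produces.

```python
def redshift_copy_sql(table, s3_uri, iam_role, fmt="CSV"):
    """Build a native Redshift COPY statement for a data-lake file set."""
    return (
        f"COPY {table} FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS {fmt}"
    )

sql = redshift_copy_sql(
    "sales", "s3://example-bucket/sales/", "arn:aws:iam::123456789012:role/load"
)
```

Because the COPY runs inside AWS, the data never leaves the ecosystem, which is where the egress-cost saving in the diagram comes from.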
Use Case 2: Data Warehouse Pushdown

Data warehouse pushdown uses SQL queries to move data from the staging area to the ODS, and from the ODS to the EDW, within a data warehouse (CDW).

• Without pushdown: data is loaded from staging to the ODS in Snowflake using the Informatica engine ($$$).
• With pushdown: data is loaded from staging to the ODS in Snowflake using the Snowflake engine.
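For SQL-based pushdown, the mapping logic is rendered as a single in-warehouse statement (e.g. INSERT ... SELECT) that Snowflake executes, so no data leaves the warehouse. A simplified sketch of that rendering; table names, columns and the TRIM transform are illustrative, not generated product output.

```python
def pushdown_sql(target, source, columns, transforms):
    """Render mapping logic as one in-warehouse SQL statement."""
    select_list = ", ".join(
        transforms.get(c, c) + f" AS {c}" for c in columns
    )
    return f"INSERT INTO {target} SELECT {select_list} FROM {source}"

sql = pushdown_sql(
    "ods.customers",          # target: ODS table
    "staging.customers",      # source: staging table
    ["id", "name"],
    {"name": "TRIM(name)"},   # mapping logic becomes a SQL expression
)
```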
Use Case
Any errors must trigger a case in ServiceNow



DEMO
APIs and Error Handling
Cloud Application Integration



Critical Capabilities for Application Modernization
Data-Driven, Process and Event Integration APIs

• Data APIs: for sharing data using no-code data access APIs. Example: cloud and mobile apps consume data from legacy CRM, ERP and database systems through integration and API middleware.

• Process APIs and Integration: for integrating business processes that span applications and automating user tasks. Example, employee onboarding: an employee signs and accepts an offer; health insurance, benefits and training (HR systems), laptop, phone, screen and onboarding access (IT systems), and space, desk and chair (office space application) are all provisioned, so that on the first day “magically” everything works. The magic is application integration.

• Event Integration: for app-to-app data interchange in real time using data events and messaging. Example, real-time integration across order and ERP systems: a quote is presented digitally on a portal or email, converted to an order with a single click, fulfilled through integrated and automated fulfillment, and completed via payment and invoice integration (Salesforce, ERP).
API and Application Integration
Where things run

Design, deployment and management are cloud-based (iPaaS).

• Multi-tenant service (Cloud Process Service / Kubernetes, fronted by the API Gateway): exposes REST, SOAP and OData APIs and connects to cloud applications.
• Single-tenant applications (Agent Process Service): exposes REST, SOAP and OData APIs; connects to databases (JDBC, Data Access Service), messaging (AMQP, JMS, Kafka, RabbitMQ, AWS SNS/SQS, Azure Service Bus / Event Hub, Salesforce Streaming API), file listeners and writers (File, FTP, AWS S3), built-in Java and Shell services, applications (SAP BAPI and others, including outbound messaging), and services and microservices.
What is Cloud Application Integration?
API and App Integration: a process starts with a start event, runs business
logic, and finishes with an end event.

Start Events
• Dequeue message
• File event / read
• HTTPS request
• Salesforce OBM/Streaming

Business Logic
• Call another API
• Call subprocess
• Read/write to DB
• Mediation/routing

End Events
• Enqueue message
• Write files
• HTTPS response
• Throw error
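The start-event / business-logic / end-event shape above can be sketched as plain code. This is only an illustration of the control flow; a real CAI process is modeled graphically in Process Designer, and every name below (`run_process`, the step functions, the payload fields) is a hypothetical stand-in.

```python
# Minimal sketch of a process: a start event delivers a payload, each
# business-logic step transforms it, and the end event emits a response.

def run_process(request, steps, respond):
    """Run each business-logic step on the payload, then emit the end event."""
    payload = dict(request)          # start event: e.g. an incoming HTTPS request
    for step in steps:               # business logic: API calls, routing, DB I/O
        payload = step(payload)
    return respond(payload)          # end event: e.g. the HTTPS response

# Hypothetical business-logic steps
def enrich_order(p):
    p["status"] = "validated"
    return p

def route_by_region(p):
    p["queue"] = "emea" if p.get("region") == "EU" else "global"
    return p

result = run_process(
    {"order_id": 42, "region": "EU"},
    steps=[enrich_order, route_by_region],
    respond=lambda p: {"http_status": 200, "body": p},
)
print(result["http_status"])        # 200
print(result["body"]["queue"])      # emea
```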
API Implementation and Management
Service Hierarchy

Application Services Tier: Cloud API Proxy and Management
• API Gateway with Manager, Portal, and Registry
• Exposes SOAP, REST, and Data/OData APIs to 3rd-party and
  customer-facing consumers

Business Services Tier: Cloud API and Application Integration
• Business processes exposed through REST, SOAP, and OData APIs
• Processes can be composed; they consume services and are themselves
  consumable
• Provide and consume synchronous and asynchronous services

Data Services / Microservices Tier: Cloud and On-Premises Application,
Messaging and Data Services
• Application services and APIs
• Data services and Data Hub
• Queues / topics
IDMC Pricing Model

Kilian Ingelfinger · Sr Solutions Consultant


Global Partner Technical Sales · GPTS

2023
Intelligent Data Management Cloud

DATA CONSUMERS

DATA CATALOG · DATA INTEGRATION · API & APP INTEGRATION · DATA PREP ·
DATA QUALITY · MASTER DATA MANAGEMENT · CUSTOMER & BUSINESS 360 ·
DATA MARKETPLACE · GOVERNANCE & PRIVACY

10K+ Metadata-Aware Connectors
AI-Powered Metadata Intelligence & Automation
CLOUD-NATIVE • MICROSERVICES-BASED • API-DRIVEN • ELASTIC • SERVERLESS

DATA SOURCES
Why is Data Management Hard and Complex?

DATA CATALOG · DATA INTEGRATION · API & APP INTEGRATION · DATA QUALITY ·
MDM & 360 APPLICATIONS · GOVERNANCE & PRIVACY · DATA MARKETPLACE

Each capability is covered by its own crowd of point-solution vendors,
while the INTELLIGENT DATA MANAGEMENT CLOUD spans them all in one platform.



Modern Consumption-Based Pricing Model

INTELLIGENT DATA MANAGEMENT CLOUD



Informatica Processing Unit – IPU
A new currency for consumption

• IPUs are a flexible and interchangeable exchange currency
• Consumption of any IDMC service (Service A, Service B, ... Service X)
  is measured and converted into IPUs



Data Management Landscape is Fragmented
Each capability is typically served today by separate point tools:

• CATALOG: discover, catalog, and curate all enterprise data
• INGEST: multi-latency data ingestion and edge computing
• INTEGRATE: integrate all types of data
• CLEANSE: make data fit for purpose
• RELATE: match and relate identities and entities
• GOVERN: define and verify data governance policies
• PROTECT: detect and protect sensitive data
• PREPARE: for analytics and collaborate on projects
• SHARE & DELIVER: publish and manage APIs and Data Services



We can do all this with one cloud-native platform: IDMC
The same nine capabilities (catalog, ingest, integrate, cleanse, relate,
govern, protect, prepare, share & deliver) are delivered together by the
INTELLIGENT DATA MANAGEMENT CLOUD.



IPU Meters and Scalars

• IPU meters are the services and features included in the IDMC
• This table lists the IPU meters and their applicable scalars

METER                         SCALAR
Advanced Data Integration     Compute Units
Advanced Serverless           Compute Units
API Management                API Calls
Advanced Pushdown             Rows Processed
Application Integration       Compute Units
B2B Gateway                   Compute Units
CDGC Governance               Daily Assets Stored
CDGC Catalog                  Daily Assets Stored
CDGC Scanner                  Compute Units
Data Integration              Compute Units
Data Integration with CDC     Rows Processed
Data Marketplace              Daily Assets Stored
Data Masking                  Compute Units
Data Quality                  Compute Units
Integration Hub               Events Processed
Mass Ingestion                Data Volume
Mass Ingestion DB with CDC    Rows Processed
Sandbox/Sub-Org               Organization
Primary Scalars and their Units of Measure

Scalar               Unit of Measure      Description
Compute Units        Hour                 Processing capacity used or consumed. A minimum
                                          of four physical or logical CPU cores is used to
                                          calculate Compute Units.
API Calls            Number               HTTP/HTTPS requests triggered by consuming
                                          applications and end-clients, using the API Gateway
Rows Processed       Million Rows         Number of rows that are transferred, transformed,
                                          or incorporated
Events Processed     1k Events            Inbound and outbound instances of data accessed
                                          in an intermediate storage layer
Daily Assets Stored  100/1k/100k Assets   Number of assets that are stored by the service
                                          on a single day
Data Volume          Gigabyte             Volume of data that is transferred, transformed,
                                          or incorporated
Organization         Number               Number of sub-organizations and sandbox
                                          organizations
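The unit sizes in the table (million rows, 1k events, gigabytes) mean raw metered usage has to be normalized before any IPU conversion. A minimal sketch of that normalization; the function and dictionary names are illustrative, not an Informatica API.

```python
# Normalize raw metered usage into the scalar's billing unit.
# Unit sizes follow the "Primary Scalars" table; names are assumptions.

UNIT_SIZE = {
    "Compute Units": 1,            # already measured in hours
    "Rows Processed": 1_000_000,   # scalar unit = million rows
    "Events Processed": 1_000,     # scalar unit = 1k events
    "Data Volume": 1,              # already measured in gigabytes
}

def to_scalar_units(scalar, raw_usage):
    """Convert raw metered usage into the scalar's unit of measure."""
    return raw_usage / UNIT_SIZE[scalar]

print(to_scalar_units("Rows Processed", 4_000_000))   # 4.0 (million rows)
print(to_scalar_units("Events Processed", 2_500))     # 2.5 (1k-event units)
```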
Calculating IPU Consumption

• The "Rate Card" defines the conversion rate of metered consumption to IPUs
• The "Rate Card", aka the Cloud and Product Description Schedule, is part of
  the customer quote
• The IPU rate of a scalar can depend on the amount of consumption
• Example of a tiered rate:
  • 1 Compute Hour = 0.16 IPUs
  • 1,000 Compute Hours = 160 IPUs
  • 10,000 Compute Hours = 320 + 200 = 520 IPUs
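The tiered example above can be reproduced with a small calculation. The real tier boundaries live in the customer's Rate Card; the two-tier numbers below (0.16 IPU/hour for the first 2,000 hours, 0.025 IPU/hour beyond) are assumptions chosen only so the arithmetic matches the slide's figures.

```python
# Tiered rate-card sketch: convert metered compute hours to IPUs.
# Tier boundaries and rates are illustrative assumptions.

TIERS = [
    (2_000, 0.16),          # first 2,000 compute hours
    (float("inf"), 0.025),  # every compute hour after that
]

def ipus_for_compute_hours(hours):
    """Accumulate IPUs tier by tier until the metered hours are exhausted."""
    total, remaining, prev_cap = 0.0, hours, 0
    for cap, rate in TIERS:
        in_tier = min(remaining, cap - prev_cap)
        total += in_tier * rate
        remaining -= in_tier
        prev_cap = cap
        if remaining <= 0:
            break
    return total

print(ipus_for_compute_hours(1))        # 0.16
print(ipus_for_compute_hours(1_000))    # 160.0
print(ipus_for_compute_hours(10_000))   # 320 + 200 = 520.0
```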
Price per IPU decreases with increasing Pre-Commit

[Chart: price per IPU per month in $ (incl. Premium Support), plotted against
the IPU volume package; the unit price falls as the committed volume grows]

Whether a customer pre-commits to 120 IPUs or 10K IPUs, they all get access to the same functionality
Informatica IPU Commercial Model

What we charge for: usage of the platform
• Processing hours for any integration pattern, Data Quality, or Masking
• Data volume for Mass Ingestion
• Number of rows for change data capture

What we do NOT charge for
• Connect to any system at no additional cost
• No charge per user
• Swap between functionality at no additional cost
• Install as many Secure Agents as you wish
Informatica IPU Example Uses

120 IPU/month, example usage:
• 750 hrs/month of batch integration, OR
• 280 hrs/month of real-time integration, OR
• 315 hrs/month of Data Quality, OR
• 20 million rows/month of change data capture, OR
• a mix: 150 hours batch, 36 hours real time, 58 hours Data Quality,
  4 million rows CDC

200 IPU/month, example usage:
• 1250 hrs/month of batch integration, OR
• 740 hrs/month of real-time integration, OR
• 525 hrs/month of Data Quality, OR
• 33 million rows/month of change data capture, OR
• a mix: 240 hours batch, 60 hours real time, 80 hours Data Quality,
  10 million rows CDC

400 IPU/month, example usage:
• 5000 hrs/month of batch integration, OR
• 4000 hrs/month of real-time integration, OR
• 1050 hrs/month of Data Quality, OR
• 205 million rows/month of change data capture, OR
• a mix: 500 hours batch, 450 hours real time, 120 hours Data Quality,
  21 million rows CDC
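Mixes like the ones above can be sanity-checked against a monthly IPU budget. The per-meter rates in this sketch are hypothetical placeholders; real rates come from the customer's Rate Card and may be tiered, so this is an illustration of the bookkeeping, not of actual Informatica pricing.

```python
# Check whether a planned monthly usage mix fits an IPU budget.
# All rates here are invented for illustration.

HYPOTHETICAL_RATES = {           # IPUs per unit of usage (illustrative only)
    "batch_hours": 0.16,
    "realtime_hours": 0.43,
    "dq_hours": 0.38,
    "cdc_million_rows": 6.0,
}

def fits_budget(mix, budget_ipus):
    """Return (total IPUs consumed by the mix, whether it fits the budget)."""
    total = sum(HYPOTHETICAL_RATES[m] * qty for m, qty in mix.items())
    return total, total <= budget_ipus

total, ok = fits_budget(
    {"batch_hours": 150, "realtime_hours": 36, "dq_hours": 58,
     "cdc_million_rows": 4},
    budget_ipus=120)
print(round(total, 2), ok)
```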
IPU Usage Summary Page
1. Current billing period
2. Allocated IPUs across pro/sub-orgs/sandboxes
3. IPUs consumed
4. Remaining IPUs
5. Projection of days available at the current usage rate
6. Split view by service/major feature
7. Daily plot of IPU usage in the given billing period
8. List of meters by scalar
9. IPUs consumed and scalar usage (current period)
10. Current rate
11. IPUs consumed and scalar usage (previous period)
12. Count and usage of sub/sandbox orgs
13. Use the switcher to change to a sub-org
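Item 5 is a simple linear projection: remaining IPUs divided by the average daily burn rate so far. A minimal sketch, with illustrative names; the console may use a more refined model.

```python
# Project how many days remain before the IPU allocation is exhausted.
import math

def projected_days_remaining(allocated_ipus, consumed_ipus, days_elapsed):
    """Linear projection: remaining IPUs divided by the average daily burn."""
    if days_elapsed == 0 or consumed_ipus == 0:
        return math.inf                  # no usage yet, nothing to project
    daily_rate = consumed_ipus / days_elapsed
    return (allocated_ipus - consumed_ipus) / daily_rate

# 120 IPUs allocated, 40 consumed in 10 days -> 4 IPUs/day -> 20 days left
print(projected_days_remaining(120, 40, 10))  # 20.0
```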
DEMO: IPU Metering

Wrap up Day 1



Agenda - DAY 1
• Bootcamp Intro
• IDMC Intro
• IDMC Data Integration
• IPU Pricing Model



Agenda - DAY 2
• DAY 1 Recap
• Cloud Data Governance
• Cloud Modernization
• Modernization Panel Discussion

