+++> 3. Intro: SAP BO Data Integrator

SAP BO Data Integrator / Data Services
Data Services is integrated with SAP BI, SAP R/3, SAP applications, and non-SAP warehouses.
Purpose:- It performs ETL via batch jobs and online (real-time) methods, through bulk and delta load processing of both structured and unstructured data, to generate a warehouse (SAP or non-SAP). Data Services is the combination of Data Integrator and Data Quality. Previously these were separate tools: Data Integrator did the ETL part and Data Quality did the data profiling and data cleansing. Now, with Data Services, both DI and DQ are combined into one interface, so it provides the complete solution (data integration and quality) under one platform. This even combines the separate job servers and repositories of DI and DQ into one.
Data Federator:- The output of Data Federator is virtual data. Federator provides data as input to Data Services, and using Federator we can project data from multiple sources as a single source.
Data Services Scenarios:-
Source    -- DS -- Target Warehouse
SQL       -- DS -- SQL
Flat File -- DS -- SQL
Flat File -- DS -- BI
R/3       -- DS -- BI
R/3       -- DS -- SQL
SQL       -- DS -- BI
We can move the data from any source to any target DB using Data Services.
Data Services is a utility for the ETL process. It is not a warehouse, so it does not stage any data itself.
Data Services can create the ETL process and build a warehouse (SAP / non-SAP).
DS is used mainly for 3 sorts of projects:
  1. Migration
  2. Warehouse or DB building
  3. Data Quality
Data Profiling:- Pre-processing of data before the ETL to check the health of the data. By profiling we check whether the data is good or bad.
Advantages of Data Services over the SAP BI/BW ETL process
It is a GUI-based framework
It has built-in configurations for multiple data sources
It has numerous inbuilt transformations (Integrator, Quality, Platform)
It performs the data profiling activity
It easily adds external systems
It supports the Export Execution Command to load data into the warehouse via a batch-mode process
It generates ABAP code automatically
It recognizes structured and unstructured data
It can generate a warehouse (SAP / non-SAP)
It supports large-scale data cleansing / consolidation / transformation
It can do real-time data loads / full data loads / incremental data loads
Data Integrator / Data Services Architecture

There is no concept of process chains / DTPs / InfoPackages if you use Data Services to load the data.
Data Integrator Components
Designer
It creates the ETL process
It has a wide set of transformations
It includes all the artifacts of the project (Work Flow, Data Flow, Data Store, Tables)
It is a gateway for profiling
All the Designer objects are reusable
Management Console (URL-based / web-based tool)
It is used to activate the repositories
It allows us to assign user profiles to a specific environment
It allows us to create users and user groups and assign the users to the user groups with privileges
It allows us to auto-schedule or execute the jobs
We can execute jobs from any geographic location, as this is a web-based tool
It allows us to connect the repositories to connections (Dev / Qual / Prod)
It allows us to customize the data stores


Access Server
It is used to run the real-time jobs
It receives the XML input (real-time data)
XML inputs can be loaded into the warehouse using the Access Server
It is responsible for the execution of online / real-time jobs
Repository Manager
It allows us to create the repositories (Local, Central, and Profiler)
Repositories are created using a standard database
The Data Services system tables are available here
Metadata Integrator
It generates auto-documentation
It generates sample reports and semantic layers
It generates job-based statistics dashboards
Job Server
This is the server responsible for executing the jobs. Without assigning the local / central repository to a job server, we cannot execute the job.
Data Integrator Objects
Projects:-
A project is a folder where you store all the related jobs in one place. We can call it a folder to organize jobs.
Jobs:-
Jobs are the executable part of Data Services. A job is present under a project. There are two types:
Batch jobs
Online (real-time) jobs
Work Flows:-
A workflow acts as a folder to contain the related data flows. Workflows are reusable.
Conditionals:-
A conditional contains workflows or data flows, and a script-style expression controls whether they are triggered or not, as in the sketch below.
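As an illustration (a minimal sketch; $GV_LOAD_TYPE is a hypothetical global variable set by an earlier script), the conditional's IF expression might be:

    $GV_LOAD_TYPE = 'DELTA'

If this evaluates to true, the Then branch (say, a delta-load data flow) runs; otherwise the Else branch runs.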
Scripts:-
Scripts are sets of code used to define or initialize the global variables, control the flow of conditionals or the flow of execution, print statements at run time, and assign specific default values to variables.
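A minimal script sketch in the Data Services script language (the variable names and the 'DELTA' convention are illustrative assumptions, not fixed by the tool):

    # Initialize global variables (declared beforehand in the Designer's
    # Variables and Parameters window)
    $GV_LOAD_TYPE = 'DELTA';
    $GV_START_TIME = sysdate();

    # Print a message to the trace log at run time
    print('Job started at ' || to_char($GV_START_TIME, 'YYYY.MM.DD HH24:MI:SS')
          || ', load type: ' || $GV_LOAD_TYPE);

A script like this typically sits at the start of a job, so the conditionals and data flows that follow can read the variables it sets.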
Data Flow:-
The actual data processing happens here.
Source Data Store:-
It is the placeholder used to import data from the database / SAP into the Data Services local repository.
Target Data Store:-
It is the collection of dimension and fact tables that make up the data warehouse.
Transformations:-
These are the transformations used to carry out the ETL process.
They are broadly categorized into 3 groups (Platform, Quality, and Integrator).
File Format:-
It contains various legacy-system file formats
Variables:-
We can create local and global variables and use them in the project.
Variable names start with the “$” symbol, for example $GV_LOAD_DATE.
Functions:-
We have numerous inbuilt functions (string, math, lookup, enrich, and so on).
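A small sketch of a few of these built-in functions used in a script (the variable names are illustrative assumptions):

    # String functions: trim leading blanks, then upper-case
    $GV_NAME = upper(ltrim_blanks('  smith'));      # 'SMITH'

    # Date function: format the current date as a string
    $GV_TODAY = to_char(sysdate(), 'YYYYMMDD');

    # Math function: round to 2 decimal places
    $GV_AMOUNT = round(123.456, 2);                 # 123.46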
Template Table:-
These are temporary tables used to hold intermediate data or the final data.
Data Store:-
A data store acts as a port through which you define the connections to the source or target systems.
You can create multiple configurations in one data store to connect it to different systems.
ATL:-
ATL files are like BIAR files. The format is named after a company; unlike BIAR, ATL does not expand to a full form. A Project / Job / Work Flow / Data Flow / Table can be exported to ATL so that it can be moved from Dev to Qual and from Qual to Prod.
Similarly, you can also import the Projects / Jobs / Work Flows / Data Flows / Tables that were exported to ATL back into Data Services.