How SSIS works

Last updated on Nov 26 2021
Ashutosh Patil

Table of Contents

How SSIS works

We know that SSIS is a platform for two functions, i.e., Data Integration and workflow. Both the tasks Data transformations and workflow creation are carried by using the SSIS package. SSIS package consists of three components:

image1 24

Operational data

Operational data is a database used to integrate the data from multiple data sources to perform additional operations on the data. It is the place where the data is housed for current operation before sending to the data warehouse for storing, reporting, or archiving.

ETL

  • ETL is the most important process in SSIS tool. ETL is used to Extract, Transform, and Load the data into a data warehouse.
  • ETL is a process responsible for pulling out the data multiple data sources, transforming the data into useful data, and then storing the data into a data warehouse. The data can be in any format xml file, flat file, or any database file.
  • It also ensures that the data stored in the data warehouse is relevant, accurate, high quality, and useful to the business users.
  • It can be easily accessed so that the data warehouse can be used effectively and efficiently.
  • It also helps the organization to make data-driven decisions by retrieving the structured and unstructured data from multiple data sources.

An ETL is a three-word concept, but it is divided into four phases:

image2 21

Capture: Capture phase is also known as Extract phase. In this phase, it picks the source data or metadata, and the data can be in any format such as xml file, flat file, or any database file.

Scrub: In this phase, the original data is checked. It checks the data, whether it consists of any errors or not. It checks for the errors or inconsistency of data by using some artificial intelligence techniques. In short, it verifies whether the quality of the product is met or not.

Transform: It is the third phase in ETL. Transformation is the process in which the original format is converted into a required format that you want. Transformation is modelling or changing the data according to the user requirements. The changes can be either change in the number of columns or rows.

Load and index: The fourth phase is Load and index. It loads the data and validates the number of rows that have been processed. Once the loading of data is completed, the indexing is used. Indexing helps you to track the number of rows that are loaded in the data warehouse. Indexing also helps to identify the data, whether it is in the correct format or not.

Data Warehouse

Data warehouse is a single, complete, and consistent store of data which is formulated by combining the data from multiple data sources.

Difference between Database and Data warehouse

The answer can be yes as well as no. Both the database and data warehouse have a large unit of data and similar physical representation but the response time of complex queries in Data warehouse is faster than the database.

Requirements for SQL Server Integration Services

The following are the requirements to install the SQL Server Integration Services:

  • Install the SQL Server
  • Install the SQL Server Data Tools

Follow the below steps to install the SQL Server Data tools:

Step 1: Click on the link https://docs.microsoft.com/en-us/sql/ssdt/previous-releases-of-sql-server-data-tools-ssdt-and-ssdt-bi?view=sql-server-2017 to download the SQL Server data tools.

Step 2: When you click on the above link, the screen appears shown below:

image3 17

In the above screen, select the version of SSDT that you want to install.

Step 3: Once the downloading is completed, run the downloaded file. When you run the downloaded file, the screen appears which is shown below:

image4 12

Step 4: Click on the Next button.

Step 5: Select the visual studio instance and the tools that you want to install in the visual studio 2017.

image5 10

Step 6: Click on the Install button.

What is the SSIS Package?

The Package is a fundamental block where you code in SSIS. Here, code does not mean that you are coding in some programming language; it means the development that you do. The development is done in the SSIS package. SSIS is mainly used for the ETL process, and the ETL process is performed inside the SSIS package.

SSIS package is composed of three parts:

image6 9

  • Connections
    SSIS package will have some connections, and these connections are used to connect to various data sources.
  • Control flow elements
    SSIS package is composed of two elements, i.e., control flow elements and data flow elements. Control flow elements handle workflows. Workflow means that we are performing some tasks in steps, so the sequence is done through control flow.
  • Data flow elements
    The data flow elements perform transformations.

SSIS Tasks

In SSIS package, we can add the tasks. A task is a unit of work and we will have different types of tasks to perform different kinds of work. There are various types of tasks, but we will discuss the most common tasks used in SSIS:

  • Execute SQL Task
    It is used to execute the SQL statements against a relational database.
  • Data Flow Task
    It is mainly used to read the data from one or multiple data sources, transform the data and can also load the data to one or more destinations.
  • Analysis Services Processing Task
    It is used to process objects of an SSAS cube or Tabular model.
  • Execute Package Task
    This task is used to call the other packages within the same project. You can even pass the variable values to the called package.
  • Execute Process Task
    It allows you to run an application or batch scripts as a SQL Server Integration Services. It can be used to open the standard application such as Microsoft Excel, Microsoft Word, etc. It is also used to unzip the compressed file.
  • File System Task
    It can be used to perform the manipulations in the file system such as moving files, deleting files, renaming the files, changing the directory, etc.
  • FTP Tasks
    IFTP tasks are used to perform the operations on files and folders. For example, if you want to send or receive the file from the FTP server to the local directory, then the SSIS FTP task is used.
  • Script Task
    This task allows you to write the .Net code that you want to perform.
  • Send Mail Task
    This task is used to send an email. It is mainly used when you want to notify the users regarding the state of the task whether it is in a running state or some error has occurred.

Let’s see some working example of a Data integration service.

Example of Data Flow Task

Step 1: First, we create an excel file in Microsoft Excel. Suppose I create the student database in Microsoft Excel, and fields in student table are student_id, student_name, marks, and gender.

image7 7

Step 2: Open the visual studio (SQL Server Data Tools).

image8 6

Step 3: Click on the File option and then move to New->Project.

image9 7

Step 4: When you click on the Project, the screen appears, which is shown as below:

image10 6

In the above screen, click on the Integration Services appearing at the leftmost side of the panel and then click on the Integration Services Project.

Step 5: After clicking on the Integration Services Project option, the screen appears shown as below:

image11 6

The above screen consists of five tabs such as Control flow, Data flow, Parameters, Event Handler, and Package Explorer. We have already discussed the Control flow and Data flow, both the control flow and data flow exist inside the SSIS package.

step 6: To import the data, we need first to create the database in which we create the table in SQL Server database. As we know that student data is available inside the Excel file, and we want to import the data in SQL Server database. In order to achieve this, we need first to create the table in SQL Server. Open the SQL Server Management studio.

Step 7: Create the student database. Right-click on the database, and then click on the New Database option.

image12 6

Step 8: Enter the database name in the below screenshot:

image13 6

I have given the StudentWarehouse as a database name.

Step 9: Click on the OK button.

Step 10: Now, we create the table in a StudentWarehouse database. Right-click on the Tables, and then click on the New Table.

image14 6

Step 11: Now we add the fields in a table such as student_id, student_name, marks, and gender. We set the student_id as a primary key which uniquely identifies each row.

image15 5

The above screen shows that we add four fields in a table. After adding all the fields in a table, we save the table by providing some useful name. Suppose I have provided the student as a table name.

Step 12: Move back to the visual studio. Control flow contains the Data Flow Task, drag and drop the Data Flow Task on the main window.

image16 4

Step 13: When we double-click on the Data Flow Task, the control automatically moves to the Data Flow from the Control Flow, so we can say that Control Flow is a container of Data Flow.

Step 14: Now, we can perform the transformations in Data Flow. We want to extract the data from Excel file that we created earlier, so drag and drop the Excel source from the other sources appearing at the leftmost side of the panel.

image17 3

From the above screen, we observe that the red cross appears inside the Excel Source, it means that the component is configured with the Excel Source.

Step 15: To configure the component, right-click on the Excel source and click on the Edit option as shown in the below screenshot:

image18 3

Step 16: Fill the details shown in the below screenshot:

image19 2

Step 17: After entering all the details, the screen appears shown as below:

image20 2

Step 18: Click on the Excel Souce.

image21 2

The above screen shows two arrows, red and blue arrow. The blue arrow defines the actual data that we receive from the Excel file, and the red arrow denotes the errors that we have. Now we have to perform the transformations, and before performing the transformations, we need to load the database.

Step 19: To load the database, click on the Other Destinations and then click on the Destination.

image22 2

Step 20: After adding the destination, connect the source to the Destination.

image23 2

Step 21: In the above step, we noticed the red cross in ADO NET Destination, which means that it is not configured with a component. Right-click on the ADO NET Destination and then click on the Edit option as shown in the below screenshot:

image24 2

Step 22: You need to specify the configuration manager.

image25 2

Step 23: Click on the New button to add the connection manager. When you click on the New button, the screen appears shown below:

image26 2

Step 24: Add the Server name and select the database name from the SQL Server.

image27 2

Step 25: Click on the OK button.

Step 26: Add the table name in which you want to load the data.

image28 2

Step 27: In the below screen, map the field of input source to the field of the destination source.

image29 2

Step 28: After adding a mapping, the source is connected to the Destination, as shown in the below screenshot:

image30 2

Step 29: Click on the Start button to run the Integration Services Project. When you run the project, the data gets loaded to the destination, i.e., SQL Server.

Example of Execute SQL Task

  • Follow the steps from step 2 to step 5 of the previous example.
  • Drag and drop the Execute SQL Task appearing at the leftmost panel to the main window.

image31 2

In the above screen, we observe that the red-cross appears in the Execute SQL Task, which means that the Connection Manager is not configured with the Execute SQL Task. To configure it, we need to add the Connection Manager.

  • To add the Connection Manager, Right-click on the Connection Manager and then click on the New Connection Manager as shown in the below screenshot:

image32 2

  • Select the OLEDB as a Connection Manager type and then click on the Add button as shown in the below screenshot:

 

image33 1

  • After clicking on the Add button, the screen appears shown in the below screenshot:

image34

In the above screen, we observe that the data connection of StudentWarehouse database is already available as we created in the previous step. Click on the OK button.

  • When you click on the OK button, the screen appears as shown below:

image35

In the above screen, you need to enter the details such as the name of your task, description, SQL statement (SQL task that you need to perform on the table).

  • I entered the following details in the above screenshot:

In the above case, I have provided the insert command in SQLStatement.

  • Click on the OK button.
  • When you configure the component with the Execute SQL Task, the red cross disappears as shown in the below screenshot:

image37

  • Run the project by clicking on the Start.

image38

  • When you click on the Start, the screen appears shown as below:

image39

In the above screen, the tick mark shows that your SQL Task has been successfully completed. The insert statement which we added in the previous steps has been executed, and the data is added in table. i.e., student1.

  • Output of this task is shown below

image40

So, this brings us to the end of blog. This Tecklearn ‘How SSIS Works’ blog helps you with commonly asked questions if you are looking out for a job in Microsoft BI. If you wish to learn Microsoft BI and build a career in Business Intelligence domain, then check out our interactive, Microsoft SSIS Course Training, that comes with 24*7 support to guide you throughout your learning period. Please find the link for course details:

https://www.tecklearn.com/course/microsoft-ssis-course-content/

Microsoft SSIS Course Training

About the Course

Tecklearn’s Microsoft SSIS training equips you with skills needed to work with SQL Server Integration Services (SSIS) for Business Intelligence. SQL Server Integration Services (SSIS) is a component of the Microsoft SQL Server database, it is used for performing a wide range of data integration tasks. This module provides detailed knowledge of data migration techniques, how to work with dataflow transformations, SSIS packages, event handling, implement checklists, deployment procedures, and much more through best practices.

Why Should you take Microsoft SSIS Training?

  • The Average salary for a SQL SSIS Developer is $104,740 – ZipRecruiter.com
  • Wells Fargo, United Health Group, Aetna, Conduent and many other MNC’s are using Microsoft SSIS.
  • Microsoft BI is a Leader in 2018 Gartner Magic Quadrant for Business Intelligence & Analytics Platforms (9th Consecutive Year).

What you will Learn in this Course?

Introduction to MSBI and Data Warehousing

  • Fundamentals of Data Warehousing
  • Concepts of Dimensions, Measures, Metadata and Schemas
  • Data Marts and Design approaches
  • Normalization and Denormalization and Schema types
  • Concepts of Online Analytical Processing and Transactional Processing
  • OLAP Cube
  • Explain ETL process and various tasks involved in it
  • Slowly Changing Dimensions Types
  • Business Intelligence Concepts
  • Working of BI with data-warehouse

Introduction to SSIS

  • Understanding of the MSBI Architecture
  • Import and Export wizard
  • Understand SSIS Architecture
  • Control Flow and its Components (Tasks, Containers and Precedence Constraints)
  • Data Flow and its Components (Source and Destination Connections and types of Transformations)
  • System Variables and User-defined variables
  • Scenarios by combining Control Flow and Data Flow components
  • Hands On

Transformations and Use-Cases

  • Data Conversion transformation
  • Multicast transformation
  • Union all transformation
  • Conditional Split Transformation
  • Merge and Merge Join Transformation
  • Lookup transformation
  • Cached Lookup transformation
  • Foreach loop and use-cases
  • Bulk-insert task
  • Archival process using dynamic variables and FST
  • Advancing Execute SQL Task with Object return type
  • Types of Outputs usage
  • Hands On

Slowly Changing Dimensions

  • Understanding data that slowly changes over time
  • Learning the process of how new data is written over old data
  • Detail explanation of three types of SCDs –Type1, Type2 and Type3, and their differences
  • Hands On

Overview of Fuzzy Look-up Transformation and Lookup and Term Extraction

  • Concept of Fuzzy matching
  • How Fuzzy Lookup Transformation varies from Lookup Transformation
  • Hands On

Concepts of Logging & Configuration

  • Learning about error rows configuration
  • Package logging
  • Defining package configuration
  • Understanding constraints and event handlers
  • Hands On

Got a question for us? Please mention it in the comments section and we will get back to you.

0 responses on "How SSIS works"

Leave a Message

Your email address will not be published. Required fields are marked *