Friday, April 1, 2011

SSIS - Overview & Interview Questions

About SSIS

-          Introduced in 2005
-          Advanced version of DTS (Data Transformation Services)
-          An easy to use, highly flexible, extremely capable, highly scalable ETL (Extract Transform Load) tool
-          Creates structure called package
-          Package contains mainly following things
o   Control Flow (defines overall flow)
o   Data Flow (extracts data from multiple sources and putting into multiple destinations)
o   Event Handler
o   Package Explorer
o   Connections Tray (holds all kind of connections)

Connection Manager

-          OLEDB connection
-          Flat file connection
-          Excel connection
-          Cache connection
-          FTP connection
-          Multifile connection
-          SMTP connection

Difference between Control Flow & Data Flow Task
      Click Here


       - Variables store values in SSIS Package at package, container, task level
       - Variables can be used in
             a. Script task - we can use the variable using Dts.Variables["Blah"]
             b. Expression in any transformation
             c. Expression in any Connection Manager
             d. Execute SQL Task to pass value to SP, and store value from SP
             e. Precedence Constraint to define success/failure flow
             f. Error output
       - 2 type of variables
             a. System Variable
             b. User-defined Variable

       - Variable name is case-sensitive
       - We can create 2 variables with same name at different level. e.g. Container level and then  
          Data Flow Task Level
       - In that case, local scope will get more priority.

What are the different control flow tasks you have used?

         -          Execute SQL Task (more details)
         -          Execute Package Task (more details)
         -          Send Mail Task (more details)
         -          Data Profiling Task (more details)
         -          Transfer Logins Task (more details)
         -          Bulk Insert Task (more details)
         -          Data Flow Task
         -          For Loop Container (more details)
         -          For Each Loop Container (more details)
         -          Sequence Container
         -          File System Task (more details)
         -          Script Task
         -          FTP Task
         -          Backup Database Task
         -          Rebuild Index Task
         -          Reorganize Index Task
         -          Shrink Database Task
         -          Update Statistics Task

What are the different data flow transformations you have used?

            -          Simple DFT

         -          Copy Column Transformation (more details)
         -          Derived Column & Data Conversion Transformation (more details)
         -          Conditional Split Transformation (more details)
         -          Character Map Transformation (more details)
         -          Audit Transformation (more details)
         -          Aggregate & Multicast Transformation (more details)
         -          Merge & Sort Transformation (more details)
         -          Merge Join Transformation (more details)
         -          Union All Transformation (more details)
         -          Row Count Transformation (more details)
         -          Row Sampling Transformation
         -          Fuzzy Lookup Transformation
         -          Fuzzy Grouping Transformation
         -          Cache Transformation (more details)
         -          LookupTransformation (more details)

        How would you do logging in SSIS?
             click here

   Approach for Error Handling
             click here

          Debugging and Using Data Viewer
                  click here

         How to deploy SSIS package on production server?
             To be added

        New features of SSIS 2008?
             click here

        How to pass a variable to child package?
             To be added

         Performance improvement in SSIS
             click here

         How would you restart package from point of failure? What is checkpoint?
             To be added 

      How you can achieve parallelism in SSIS?
               To be added 

         Difference between Merge, Merge Join & Union All

         Difference between Conditional Split & Multicast
               To be added