Quick Tip – C# Batch Parallel Processing

There’s a fair few creative solutions around for executing data processing tasks in parallel in SQL and SSIS. As good as they are they’re not really necessary if you use Parallel.ForEach in the C# library System.Threading.Tasks. This is a parallel loop that can be used to iterate items in a collection and run some code for each item. All the thread allocation and loop tracking control is handled internally without having to do anything.

The only other thing you could do is pass it a parameter to specify how many concurrent executions you want making it adjustable and easy to configure at runtime:

int maxP = 5;


new ParallelOptions { MaxDegreeOfParallelism = maxP},
(currentProcess) =>

//Excecute SSIS Package
//Execute SQL Proc
//Azure Data Lake Analytics Job



The Basics – BIML : Preview Pain & Multiple File Dependencies

BIML Express 2017 has been formally release and a great new feature is the preview pain that lets you see how you’re BIML will be rendered during development. For metaprogramming this is awesome and saves a lot of pain. You can read about it here and here.


So what if you have multiple dependent BIML files? e.g. I have a file that retrieves my table meta data to build table def BIML and a file that consumes the table def BIML to create my packages. This way I can use the same table definition BIML file for multiple package patterns. I just thought this would be too much for it to cope with for the 1st release – but it does in fact work which is super awesome.

The trick is the preview pain operates on all the open files. So all the dependent files have to be open to fully rendered the desired result in the preview pain.

So with just my package definition BIML file open no package BIML code is rendered in the preview… this is because there are no table defs to iterate to create the packages.


If I open both the table definition BIML Script and the packages definition script then we’re all good.

Table definition BIML


Packages definition BIML Preview now has packages!


Auto DW – Metaprogramming

This is a high level consideration on using metaprogramming to build Automated Data Provisioning frameworks.

Once you’ve mastered writing code and applying solution patterns to solve real world problems a next natural progressive step is to write code that writes code! A lot of implementation technologies might even dip into this without you being aware or you may just start dipping into a natural innovative way so solve a certain problem. The topic is part of an advanced knowledge domain called Metaprogramming; the wiki-post discusses it pro’s and challenges. Kathleen Dollard has a great course on Pluralsight called Understanding Metaprogramming.

My own experience and perhaps the most common is that you’ll start metaprogramming before you’ve given the topic it’s full attention. I don’t remember getting out bed and thinking… “today I will do some metaprogramming”. What happened is that chasing the benefits as a result of experiencing pain provided the motivation. The next thing to say about code writing code is that it can go fantastically well or horrifically bad. Without giving the knowledge domain the respect that it deserves chances are it will be the latter.

Another fundamental software engineering trap I learnt the hard way is don’t program generic solutions to very specific problems you’ll pay for it in complexity and performance. The temptation can be quite strong because we’re taught to abstract and conquer particularly if the problem looks the same – but is it? This is particularly relevant for data provisioning platforms because (not exhaustive):

  1. Performance is high on the agenda. We attempt to routinely and frequently move and change tons of data; performance is crucial for success and it’s directly related to costs
  2. The repetition is obvious and can appear to constitute a large proportion of man hours; to an economist it seems to be the same solution over and over again e.g. stage data, build snapshot fact, build type 1 dimension, build type 2 dimension, etc…
  3. The content (the data) is fluid and dynamic over a large temporal period. Beyond it’s schema definition it has a low level and often an incomplete persistence of semantics that can be illusive that are a product of real world economic and human behaviour
  4. The expectations and requirements are also fluid and dynamic; they seek recover semantic meaning or information from data using reports, interactive data visualisation tools, semantic layers or other system to system interfaces.

So bringing this all into context:

  • Design patterns are common but the solutions are never the same. A type 2 dimension is a design pattern not a solution. This isn’t helped by teams running bad agile delivery. A type 2 dimension is not a backlog story and neither is a fact table.
  • The solution is to provide specific information to meet a business requirement. Not only is it different on every single project, it’s different in the same project over time. The business has to respond to it’s market which in turn motivates expectation and influences human behaviour which  in turn churns the raw solution content; the data. A static solution is not a solution.
  • The solution content is the data which is also different on every single implementation and within an implementation over time. It has features and they can all change either explicitly or implicitly
  • Performance in data platforms rely on issuing the correct amount of physical computing resources at exactly the right time. What this means is that a physical implementation needs to know about the features of the data very explicitly in order to allocate the correct amount of resources. Get it wrong in an on premise architecture and a job hogs limited resources causing other processes to suffer. Get it wrong on cloud MPP architecture and you’ll pay too much money. This is not going away; why? because information has entropy and you can’t cheat the law of physics.

In Conclusion

Building a generic solution to solve the problem of repetition in Data Platform delivery isn’t the answer. The data is specific, the requirements are specific and if you take this approach the solution is abstract leading to overly complicated and poor performing technical architectures. At their very worst the try to shoe horn the specifics into an architecture that hinders the goal and completely misses the requirements. I’d stick my neck out based on my own experience and state that 2 solutions are never the same; even in the same industry using the same operational systems.

Be very wary of magical all singing and all dancing products that claim to be a generic solution to data provisioning. AI is long way off being able derive specific semantics about the real world based on data. It’s just not possible right now… a lot of AI is approximate based on population statistics; the features of data and information are very specific.

Metaprogamming solves the problem of repetition but delivers specific solution artefacts that don’t sacrifice what Data Platforms implementations need in order to succeed which is:

  • Perform within their budget
  • Meet the business requirements

We aim to solve the repetition problem (and a whole host secondary problems) during the development process and recognise that there is the following:

  • Specific metadata about the technical features of the raw data
  • Specific metadata about the technical features of the deliverables
  • Generic implementation patterns

Development frameworks can collect the metadata specifics and combine them with generic implementation patterns to automatically generate the code of our specific solution artefacts. No product or framework however can do the following:

  • Semantically analyse the data to determine the code required to perform the transformations required to meet the information requirement. This requires real intelligence i.e. a human! It can also be extremely hard if the data and requirements are particularly challenging – this is where the real business value sits in your solution
  • Decide what are the best design patterns to use and how to construct them into a into a solution to meet the requirements. This requires knowledge and experience – An experienced solution architect

There are number of technical ways to achieve metaprogramming. I generally work in the Microsoft data platform space. Here are some I’ve used before I knew about metaprogramming:

  • XML/XSLT creating JavaScript!! Not data platform and a long time ago. Wouldn’t recommend it
  • SQL creating SQL (Dynamic SQL)
  • C# creating SSIS and SQL
  • T4 Templates
  • BIML (a Varigence creation)

I’ve built a few automated development frameworks using the above. Some of them were awful. I found myself neck deep in some crazy code maintenance and debugging hell which motivated me to learn more about the in’s and out’s of metaprogramming. I strongly recommend Kathleen’s course Understanding Metaprogramming if you’re heading down this road since it goes into detail about the approaches and the best classes of solutions for code generating code. Now I only use to the following:

The way that BIML works is actually a very similar T4 Templates it’s just that BIML brings a really useful mark-up language and development IDE to the party for scripting the creation of SSIS packages and database objects. They have also just released their automated development framework called BIML Flex if you don’t have the bandwidth/time to build your own.

As it turns out tackling metadata as a technical requirement during the development cycle lends itself to solving other common difficult problems in the data provisioning space which is integrating the following:

  • Data Catalog
  • Data Lineage
  • Operational Logging
  • Operational Auditing

Because the metadata collection is required and the assets are created from it, integrating these data platform features becomes a by-product of the development process itself. It’s a very proactive and effective solution. Retrospective solutions in this space can never keep up to the pace of change or are too pervasive requiring constant maintenance and support over and above the solution itself.




ADP Framework : Schema & Object Mapping

This is documentation for Schema and Object meta data mappings for the Automated Data Provisioning (ADP) Framework using BIML, SSIS and SQL. The getting started documentation can be found here.

The ADP Framework has a meta data repository for meta data and how data transfer is mapped across the meta data which is ultimately used to generate data loads and provide data lineage logging. The meta data repository is a bunch of tables that are created as an extension in the SSISDB. This blog documents these tables and describes how they are intended to be used.

Here is the diagram:



Data Object Tables

semanticinsight.system_component – This holds details and self referencing mappings of logical system components.

It’s common for data provision platform to be implemented in a hierarchy of system components such as data sources, stage databases, data marts ODS’s, data vaults and/or data warehouses. Sometimes logical sub-groupings are required in order to meet load provisioning and dependency requirements. The table is designed and intended to store components that may be a simple logical grouping or logically represent a physical component e.g. database, file share, blob storage or data lake store. Currently the framework is setup for traditional RDBMS data provisioning but the intention is to extend it for other nosql system components such as file shares, data lakes, etc.

semanticinsight.data_object – This holds details of objects that are at table level e.g. tables, views and procedures. It also holds details about how the data is formatted and should be loaded.

semanticinsight.data_schema – Data objects may be further classified into logical groups for security, maintenance and logical readability. Currently this table isn’t fully de-normalised and also holds the database name. This is for convenience since this table is intended to be configured for the solution and there is no front end for database as of yet.

semanticinsight.data_object_type – Defines what type a data object can be. Currently it can only be a Table, View and StoredProcedure.

semanticinsight.data_attribute – Defines the columns or attributes that data object can have and also the their data type constraints.


Data Load Mapping Tables

These tables hold details about how the meta data is mapped into the data provisioning solution.

semanticinsight.data_schema_mapping – maps data flow from source system component schema to another.

semanticinsight.data_object_mapping – maps data flow from a source schema data object to another.

semanticinsight.data_attribute_mapping – You’ve guessed it; maps data flow from a source data object attribute to another.

The framework and solution does not allow breaking the hierarchy i.e. sharing objects or attributes across schemas and databases. This is by design because I hate spaghetti data platforms – solutions should have clean logical layers. A skill in building data platforms is providing simple solutions to complex problems.

The database designers amongst us may notice that the higher level mappings of data objects and data schemas could just be implied by the data attribute mapping which is the lowest level mapping. Mappings at multiple levels is a very deliberate design decision.

The objective is to automate delivery but we need to give the framework some high level config to go on. This is what the following tables are for that should be manually configured for a specific solution. These tables should can be populated by modifying the stored procedure called semanticinsight.configure_system_component which is called in the meta data BIML scripts provided:

  • semanticinsight.system_component
  • semanticinsight.data_schema
  • semanticinsight.data_schema_mapping

The following tables can be automatically populated and mapped with careful design by the framework which saves us a lot of time since data objects and their attributes can run into their 1000’s.

  • semanticinsight.data_object
  • semanticinsight.data_attribute
  • semanticinsight.data_object_mapping
  • semanticinsight.data_attribute_mapping


Example 1

This is the demo and getting started setup that the GitHub project comes with. It should be fairly intuitive.


Basically it shows we have 2 system components grouped into a solution called “Adventure Works BI”. System components must have a root node. The table has a relationship for it’s parent system component. It also has a relationship directly to the root component for any system component which I added and found it made my coding life a lot easier for querying the data that is needed.

We can see in the schema and schema mapping tables that the 2 databases Adventure Works and Stage are mapped together across schema’s with identical names. This is not mandatory and schema’s can be mapped as required and the automated development framework will create the data objects in the schema’s as described in this table.


Example 2

Here is another example. This example might be relevant if we have multiple operational source databases in multiple geo-graphic regions loading into a single stage database for further integration. In this case we can’t use like for like schema names because the table names will clash. We could use the table names to make the distinction but for many reasons I won’t go into here (the biggest one code re-use) it’s better to keep common table names and use the schema name to make the distinction.



ADP Framework: Getting Started

This is the getting started guide for the Automated Data Provisioning (ADP) Framework using BIML, SSIS and SQL; Scroll to the bottom to see a growing list of detailed docs.

Much of the what is referred to like database names is configurable. However the code currently in the GitHub repo will work immediately without any extra tweaking. I recommend following this to the letter before attempting to reconfigure the framework just to build familiarity.

Recommended Tools

Visual Studio 2017 SSDT now has support for SSIS. However we still currently need VS installs for both 2015 and 2017 because BIML Express doesn’t have an installer yet for Visual Studio 2017. I recommend installing Visual Studio 2017 for the full blown development experience and Visual Studio 2015 SQL Server Data Tool (SSDT) for developing SSIS. Since it keeps the disk footprint as light as possible.

You can also install Visual Studio 2017 SSDT if you wish but you won’t be able to use BIML Express with it until Varigence release an installer.

Step 1 – Create Databases


Step 1

Step 2 – Build Semantic Weave


  • Open the semantic.weave solution in Visual Studio 2017
  • Review the connection string in SemanticInsight.Weave.dll.Config and ensure it points at your SSISDB
  • Rebuild the solution
  • Publish the SSISDB project to the SSISDB database; ensure the database deploys successfully and that the SemanticInsight schema tables and procs exist in the database – see images below
  • Navigate to the bin folder for the Semantic.Weave project and get the path since we’re going to need to reference it in our BIML scripts later e.g

step 2

Step 2 a

Step 3 – Reconfigure

NOTE: The global file generally works with biml script without having to select it. Currently however, it has to be selected because we’re using to house our assembly reference in a single place.

  • Open the BIML solution in Visual Studio 2015
  • The “1 – ProjectConnections.biml” in both projects BIML ETL and BIML Utility contains the environment connections that will be used to create the DB connections. Open this file and review the connection strings to ensure they are correct for each database:
    • SSISDB
    • Stage
    • AdventureWorks
  • You’ll notice the connections also have annotation for GUID’s. This is so we can control and ensure the GUIDs are consistent when connections are created. If you want to generate new GUID’s this can be by simply running the following in SQL Server Management Studio
SELECT newid()

step 3

  • Now open the following “0 – Global.biml” and ensure the Assembly Name reference points to where the Semantic.Weave.dll is on your machine. See Step 2. Ensure to escape replace the backslashes in the path using double backslashes.


Step 4 – Scrape Adventure Works Meta Data


  • Press ctrl and select the following 2 files in the BIML Utility project in the order listed, right click and select “Generate SSIS Packages”. The order is important and they are named with a number to make it more intuitive:
    • 1 – ProjectConnections.biml
    • 2 – MetaDataScrape.SQLServer.ADWorks.biml
  • Execute the package called Sssidb_SqlServerScrape_AdventureWorksBI_AdventureWorks.dtsx that is created


The BIML script created a package and 2 project connections. This package does 2 things:

1. It executes the the stored procedure [semanticinsight].[configure_system_component]. This procedure populates the table [semanticinsight].[system_component] that defines the solution component architecture in a parent child structure. e.g. Source Databases, Stage Databases, etc. This procedure is an inflection point to building your own solution. Use this procedure to place logic to populate the component structure for your own solution.

select * from [semanticinsight].[system_component]


[semanticinsight].[configure_system_component] also populates the system component schema’s and maps them together for data loading. How you configure these tables will subsequently affect how meta data is mapped, what database objects are created and what load packages are created.

select * from [semanticinsight].[data_schema]
select * from [semanticinsight].[data_schema_mapping]


2. The package queries the AdventureWorks for schema information and loads that data into the SSISDB meta data repository tables. You can check this by looking at the data in the tables:

select * from [semanticinsight].[data_object]
select * from [semanticinsight].[data_attribute]


Step 5 – Build the Stage Database


  • Press ctrl and select the following 3 files in the BIML Utility project in the order listed, right click and select “Generate SSIS Packages”. The order is important and they are named with a number to make it more intuitive:
    • 0 – Global.biml
    • 1 – ProjectConnections.biml
    • 2 – StageTableDefinitions.ADWorks.biml
    • 3 – CreateTable.biml
  • Execute the package called CreateTables_Stage.dtsx that is created
  • Check the stage data base for newly created tables and schemas



The StageTableDefinitions.ADWorks.biml script uses the meta data and component schema mappings created in step 4 to provide target table definitions for the stage database. The

Step 6 – Scrape Stage Meta Data

  • Press ctrl and select the following 2 files in the BIML Utility project in the order listed, right click and select “Generate SSIS Packages”. The order is important and they are named with a number to make it more intuitive:
    • 1 – ProjectConnections.biml
    • 2 – MetaDataScrape.SQLServer.ADWorksStage.biml
  • Execute the package called Sssidb_SqlServerScrape_AdventureWorksBI_AdventureWorksStage.dtsx that is created

This package works in much the same way as the explanation in step however this time we’ve scraped the schema information from the Stage database tables that we created in the Step 5.


Step 7 – Create Packages & Map Meta Data


  • Press ctrl and select the following 3 files in the BIML ETL project in the order listed, right click and select “Generate SSIS Packages”. The order is important and they named with a number to make it more intuitive:
    • 0 – Global.biml
    • 1 – ProjectConnections.biml
    • 2 – Table.Stage.ADWorks.biml
    • 3 – Package.OLEDBBulk.Single.biml


If you’ve completed all the steps to here you’ve just automatically created the stage load of 88 tables and nearly a 1000 columns. The package will have events embedded in to capture row stats, operational logging and data lineage.

If you’d prefer to have 1 package per table load repeat this step but use the 3 – Package.OLEDBBulk.Multiple.biml instead of the 3 – Package.OLEDBBulk.Single.biml. This will create 88 packages 1 for each table.


You can then repeat the step again with the 3 – Package.OLEDBBulk.Multiple.Master.biml and this will create a master package that executes all of the child load packages.


As the packages are created the BIML script calls the [semanticinsight].[]map_data_attributes] procedure using Semantic.Weave that maps the meta data together in the meta data repository exactly as the data is provisioned by the load packages. This mapping id is then tied into the package execution logging giving a fully integrated meta data repo, logging and data lineage.

select * from [semanticinsight].[data_object_mapping]
select * from [semanticinsight].[data_attribute_mapping]


Step 8 – Load Stage


  • Execute either the Component_OLEDBBulk.dtsx or the Master Adventure Works Stage.dtsx if you’re using the multple package approach.


  • Review the logging information e.g. please note the id’s will obviously be different in your implementation change them as required. Note that the execution_id in the semanticinsight.execution_id is the SSIS server execution id. This is captured so SSIS extension logging can be tied to the native SSISDB logging when required. It will default to 0 when run locally and not on the SSIS server.

select * from Stage.Production.Product

select * from [semanticinsight].[process] where process_id = 1
select * from semanticinsight.data_object_mapping where data_object_mapping_id = 67
select * from semanticinsight.process_data_object_stats s where s.process_id = 1 and s.data_object_mapping_id = 67


Other Docs



Automated Data Provisioning Framework – Release 1

I’m releasing my automated development framework for data provisioning onto GitHub. I’m doing it in stages just to make it more manageable for myself. Why am I doing this? because I like coding, building things and maybe someone will get some value from it.

Setting Expectation


To use it expect to have or bulid a level of knowledge with the following skills:

  • Data Warehouse, ETL & ELT Architectural Design Patterns
  • SSIS
  • SQL Server
  • C#
  • BIML Express
  • T4 Templates

It’s a development framework for techies and whilst it is setup ready to go with examples with all  projects there are always subtle design differences that will require configuration tweaks and or extensions. The aim of the framework is tailored code re-use thus:

  • Saving many (in fact rather a lot of) man hours
  • Provide a flexible framework
  • Provide an agile framework – steam ahead and don’t worry having to rework stuff
  • Provide robust and high quality deliverable’s with less human error
  • Don’t waste time on level plumbing and allow the team to focus on the difficult bits – e.g. data integration & BI transforms

It is not a tool for someone with no knowledge, experience or requirements to create an off the shelf MI platform. I’ve spent a long time delivering MI platforms and in my humble experience every project has subtle differences that will make or break it, hence a highly flexible and agile framework is the way to go. Trying to shoe horn specific requirements into generic solution or even worse, data into a generic data model never leads to happiness for anyone.

I’ll assist as much as possible (if asked) to help folks understand and make use of the assets.

Release 1


This release focuses on the core assets for delivering a simple bulk loaded stage layer in less than 2 minutes with full a meta data repository and ETL with data lineage and logging. In this release:

  • Metadata management repository
  • Metadata SQL Server scrapers to automatically fill the repository and map data flows at attribute level
  • Automated DDL creation of database tables
  • Automated ETL creation of OLEDB bulk load packages
  • .Net assembly to manage BIML integration with metadata repository


Framework Stage

It’s set up to use adventure works and can very quickly be changed to use any other SQL Server database(s) as source databases. This is because the metadata is scraped automatically from SQL Server. As the framework is extended I’ll add other source scrapers.

As it turns out Adventure Works was a good database to use because it uses all of the SQL Server datatypes and some custom data types too.

Release n


There’s loads more to add that will come in further releases. This is my initial list:

  • Patterns for loading other layers – probably the DW layer initially
  • MDS integration for metadata repository
  • Other stage loading BIML templates for MDS, Incremental Loads, CDC Loads
  • Automated stage indexing
  • Staging archive & retrieval
  • Meta scrapers to support other data source types
  • Tools to help generate meta data for flat files
  • Isolated test framework for loading patterns
  • Data lineage, dictionary, metadata and processing reports
  • Statistical process control – track and predict loading performance

The Good Stuff


I don’t want to procrastinate over documentation too much but will flesh out more detail as and when I can. Onto the good stuff.


The Basics – BIML: Project Connection GUIDs

Was in the process of just updating my framework to get it released and documented and came across a slight issue with  my code and Project Connection GUID’s. Am not sure if something has changed since I last used it but I’m sure I didn’t have this issue before but nevertheless.

The Problem


Right… if you’ve been doing SSIS a while then you’ll know connections are hooked up to tasks using a unique identifier that is a GUID. It doesn’t use the connection name which I’ll be honest for practical purposes is pretty annoying sometimes. If you re-create the connection you have to edit all the package tasks that use the connection because it has a different GUID. Well with BIML that ain’t so bad because you can just re-create all the packages quite easily

However… I do get the issue where I have a BIML script that has a code nugget that creates many packages. However on each iteration for each package it creates new connections whether I choose to overwrite them or not this means only one of my packages (the last one) has the connections hooked up properly. Also I may create assets in a few different hits that use the same connections and it’s just to easy to break the GUID’s by re-creating the connections.

The Solution


The solution I’ve settled on is to force the connection GUID’s. The reason being GUID’s constantly being dynamic and out of sync is annoying and it suits me better to have explicit control of what they are. At the <BIML> root the <connection> node doesn’t allow the definition of a GUID or Id. To get around this I’ve used an annotation and then used the annotation in a code nugget in the <package> node to force the GUID’s

This suits my purpose for my framework because I don’t want to repeat code. I can basically have 1 single BIML file that is my global connections definition for all my other BIML scripts that depend on those connections.

So for a very simple example we 2 files:

  1. ProjectConnections.biml – defines my connections
  2. Package.biml – defines my package that uses the connection



<Biml xmlns="http://schemas.varigence.com/biml.xsd">

 <Connection Name="Adventure Works" 
 ConnectionString="Data Source=.;Initial Catalog=AdventureWorks2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"
 <Annotation AnnotationType="Tag" Tag="GUID">0284217D-A653-4400-87F3-529A569B8F05</Annotation> 
 <Connection Name="Stage" 
 ConnectionString="Data Source=.;Initial Catalog=Stage;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"
 <Annotation AnnotationType="Tag" Tag="GUID">46A16E16-E57A-4D98-82CA-13AB3F011980</Annotation> 
 <Connection Name="SSISDB" 
 ConnectionString="Data Source=.;Initial Catalog=SSISDB;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"
 <Annotation AnnotationType="Tag" Tag="GUID">CF98B9CE-7D19-4493-9E33-1026C688F874</Annotation> 



<#@ template language="C#" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd"> 
        <Package Name="Package Connections 1">
    <ExecuteSQL Name="Test" ConnectionName="SSISDB" ResultSet="None">
                  SELECT TOP 0 1
            <# foreach(var connection in RootNode.Connections) {#> 
            <Connection ConnectionName="<#=connection.Name#>" Id="<#=connection.GetTag("GUID")#>">
            </Connection> <#}#> 

        <Package Name="Package Connections 2"> 
    <ExecuteSQL Name="Test" ConnectionName="SSISDB" ResultSet="None">
                  SELECT TOP 0 1
            <# foreach(var connection in RootNode.Connections) {#> 
            <Connection ConnectionName="<#=connection.Name#>" Id="<#=connection.GetTag("GUID")#>">
            </Connection> <#}#> 

With BIML express we can select ProjectConnections.biml then Package.biml, right click, hit Generate SSIS Packages and everything is all good. The cool thing is that if you’ve already created the connections as part of the framework build then it doesn’t matter if you overwrite or not when you get message below because the GUID’s are explicitly defined.

connections overwrite

Wrap Up


This is a very basic example of the pattern I’m using in my framework – which I am trying to get out but just have to update and polish a few things. Fundamentally I’m forcing the project connection GUID’s using annotations. I did look for an Id property on the BIML connection object so I don’t have to use annotations but couldn’t find one. I may have overlooked it during my haste and if there is one I’m sure someone will correct me.

There’s a bit more to it when you start handling table definitions and I’ve used linq to sub-select the connections I need for a particular package templates. These more sophisticated examples will be in my framework code.

I do have a metadata repo and you could declare the connections in the repo and control the creation from there using custom C# assembly. However I really don’t like re-producing features that SSIS already has. It already has a repo for connections and environments called SSISDB and my framework extends SSISDB rather than overlapping it which I much prefer since there is no confusion or complexity in the end product of how environments are configured and administrated. I don’t see the point or creating features that SSIS already has and that can used within SSMS.