Slides and Demo Files from SPTechCon DC 2017

   Big thanks to Stacy and her crew of organizers, all of the attendees, and fellow speakers at SPTechCon DC 2017.  This was one of the best ones I’ve attended from an engagement and networking perspective.  Below are my slides and source code.  Feel free to let me know about any follow up questions or comments.

PowerApps and Microsoft Flow for Developers

GitHub link to demo project files

https://github.com/BrianTJackett/BTJ.PowerApps.AzureDBSample

Slides

Intro to Power BI for Office 365 Developers

Slides

Sample Financial Data file

http://go.microsoft.com/fwlink/?LinkID=521962

Blog posts on SMAT data report

https://aka.ms/SMAT2013BTJpart1

https://aka.ms/SMAT2013BTJpart2

      -Frog Out

Slides and Demo Files from Dogfood Con 2017

   A big thanks to Cassandra, Trey, Danillo, and all of the other organizers of Dogfood Con 2017.  Thanks also to all of the folks who attended my two sessions.  Below are my slides and source code.  Feel free to let me know about any follow up questions or comments.

PowerApps and Microsoft Flow for Developers

GitHub link to demo project files

https://github.com/BrianTJackett/BTJ.PowerApps.AzureDBSample

Slides

Intro to Power BI for Office 365 Developers

Slides

Sample Financial Data file

http://go.microsoft.com/fwlink/?LinkID=521962

Blog posts on SMAT data report

https://aka.ms/SMAT2013BTJpart1

https://aka.ms/SMAT2013BTJpart2

-Frog Out

Upcoming Speaking Events in 2017

I’m honored to be accepted to speak at the following upcoming events.  Here are the abstracts for these presentations.  If you are at any of these events feel free to stop by and chat.  Registration is still open as of the time of writing as well.

 

Cincy.Develop()

Website: Cincy Day of Agile and Cincy.Develop()

When: Fri. Jul 28, 2:40pm-3:40pm

Title: Integrate All the Things: PowerApps and Microsoft Flow for Developers

Abstract: Securely start a server farm of Azure dev VMs with the tap of a button on your phone. Update data in an on-prem DB from any device without writing any plumbing code or UI layer. In this session we will overview PowerApps and Microsoft Flow which are low-code / no-code solutions that allow forms and workflow development that integrate to hundreds of services including Twilio, Twitter, Azure, GitHub, and more. We will also talk about the developer story for integrating with Custom Connectors and on-prem data sources such as SQL Server and SharePoint. Lastly we will demo a number of scenarios to show how easily you can create and consume apps across Windows, iOS, Android, and web. Prior experience with PowerApps and Flow is not required.

 

DogFood Conference

Website: DogFoodCon

When: Oct 5-6, timeslots TBA

Title: Integrate All the Things: PowerApps and Microsoft Flow for Developers

Abstract: Securely start a server farm of Azure dev VMs with the tap of a button on your phone. Update data in an on-prem DB from any device without writing any plumbing code or UI layer. In this session we will overview PowerApps and Microsoft Flow which are low-code / no-code solutions that allow forms and workflow development that integrate to hundreds of services including Twilio, Twitter, Azure, GitHub, and more. We will also talk about the developer story for integrating with Custom Connectors and on-prem data sources such as SQL Server and SharePoint. Lastly we will demo a number of scenarios to show how easily you can create and consume apps across Windows, iOS, Android, and web. Prior experience with PowerApps and Flow is not required.

 

Title: Intro to Power BI for Office 365 Developers

Abstract: “Power BI is SSRS in the cloud, right?” “I’ll throw a few pie charts on the screen and call it a dashboard, right?” Wrong. In this session we will introduce Office 365 developers to the Power BI service and Power BI Desktop. We will cover data sources that can be used (Azure SQL, SharePoint, files, etc.), direct query vs. imported dataset, report and dashboard creation, and the on-prem data gateway. We will also break down misconceptions of good dashboard design (hint: pie charts and 3D are almost always a bad design). Prior experience with Power BI or Power BI desktop are not required.

 

SPTechCon DC

Website: SPTechCon DC

When: Nov 12-15, timeslots TBA

Title: Integrate All the Things: PowerApps and Microsoft Flow for Developers

Abstract: Securely start a server farm of Azure dev VMs with the tap of a button on your phone. Update data in an on-prem DB from any device without writing any plumbing code or UI layer. In this session we will overview PowerApps and Microsoft Flow which are low-code / no-code solutions that allow forms and workflow development that integrate to hundreds of services including Twilio, Twitter, Azure, GitHub, and more. We will also talk about the developer story for integrating with Custom Connectors and on-prem data sources such as SQL Server and SharePoint. Lastly we will demo a number of scenarios to show how easily you can create and consume apps across Windows, iOS, Android, and web. Prior experience with PowerApps and Flow is not required.

 

Title: Intro to Power BI for Office 365 Developers

Abstract: “Power BI is SSRS in the cloud, right?” “I’ll throw a few pie charts on the screen and call it a dashboard, right?” Wrong. In this session we will introduce Office 365 developers to the Power BI service and Power BI Desktop. We will cover data sources that can be used (Azure SQL, SharePoint, files, etc.), direct query vs. imported dataset, report and dashboard creation, and the on-prem data gateway. We will also break down misconceptions of good dashboard design (hint: pie charts and 3D are almost always a bad design). Prior experience with Power BI or Power BI desktop are not required.

 

      -Frog Out

How To Analyze SharePoint 2013 Migration Assessment Tool (SMAT) Data in Power BI – Part 2

<Update 2017-02-28> Updated PBIX file with “Top N site collections by risk page” based on feedback from Microsoft peers.  Also minor updates to documentation for report creation. </Update>

In this set of posts I will walk through a process for consuming SMAT output files into a Power BI report.  This post will give a high level overview of the report pages to create and Power BI visuals used.  In order to limit the length of this post (I was at 60+ screenshots before even finishing) I am providing a separate more in-depth Word document with the steps to create the report pages contained.

  • Part 1 – importing the data and then querying / modeling the data
  • Part 2 – designing the Power BI Report

In a previous post I talked about the SharePoint Migration Assessment Tool for SharePoint 2013 being released to web.  Please read that post for context on working with SMAT in general.

Disclaimer: I am by no means a Power BI expert.  In fact this report is the first time I’ve spent any significant time developing a report.  Most of the technical credit goes to my coworkers Ken Kilty and Gene Livshin.

Note: Power BI Desktop is being updated regularly.  For the purposes of this article I am using the January 2017 build (version 2.42.4611.701 64-bit).

 

Sample report and data files (updated 2017-02-28)

In-depth Power BI report creation documentation (document in progress, not finalized).

Sample SMAT source files (CSV, PBIX, walkthrough DOCX)

For the PBIX file be sure to update the source of data file via the Edit Query screen on the SiteAssessmentReport query as follows:

SMATPowerBIReport101

SMATPowerBIReport100

 

Overview

The SMAT tool outputs a summary CSV file along with 1 detailed CSV per risk being analyzed.  Reading and interpreting hundred or thousands or tens of thousands of rows in a CSV file is not optimal.  Instead it will be helpful to model and report on that data in an easier to consume fashion.  At a high level our process will involve the following:

  1. Import Data [part 1] – Import SMAT output files into Power BI
  2. Query Data [part 1] – Create multiple queries on SMAT output files to filter, aggregate, and enhance into different perspectives
  3. Design Reports [part 2] – Design dashboard pages to consume the query outputs

 

Design Reports

Continuing from the queries that were modeled in part 1 now it is time to create reports to visualize the SMAT data to tell a story.  As mentioned in the previous post I had a number of questions I wanted to answer.

  • Content database statistics
    • Number of sites per database?
    • Size (storage) of databases?
  • Web application statistics
    • Number of site collections and subsites?
    • Size (storage) of site collections?
  • Web application risks
    • Breakdown view of risks per web application?
    • Slicer by web application and risk type
  • Site collection risks
    • Number of risks per site collection?
    • Slicer by risk type

 

These questions can be answered across a number of different pages that focus on each.  Start by creating 5 total pages with the names such as the following:

  1. Content DB overview
  2. Site collection overview
  3. Risk overview
  4. Risks by web application
  5. Risks by site collection

 

SMATPowerBIReport31

 

**Mea Culpa – PLEASE READ**

Before moving into the Power BI report I have to issue a mea culpa.  As I reviewed the report pages I found that a few of my queries required additional columns or small modifications in order to improve usability or readability of the reports.  These changes involved new calculated columns (convert MB to GB) and a conditional column to group sites by the number of risks relative to each other (small, medium, large, etc.).  Please reference the sample files to see these changes.

 

Content DB overview page

The content DB overview page will give an overview size of the content database (storage used) and can be filtered by individual server if multiple database servers are used.  Notice that databases have been re-ordered by their size rather than default sorting by name.

SMATPowerBIReport41

 

Site collection overview page

The site collection overview page will give an overview of the number site collections in a given web application compared to the size (storage) of site collections / web applications.  Notice that a pattern is immediately visible when viewing this page.  The majority of site collections are contained within sub0.domain.com while sub1.domain.com contains the bulk of remainder.  Inversely sub1.domain.com is where the bulk of the storage is used while sub0.domain.com only uses a fraction of the storage.

SMATPowerBIReport57

 

Risk overview page

The Risk overview page will give a high level view of risks per site collection and by risk type.  The histogram visual used (more on this below) filters the groups of site collections by ranges of how many risks were identified.

SMATPowerBIReport80

Add custom histogram visualization

The Risk overview page makes use of a custom visual called Histogram from the Power BI gallery.  Again thanks to Ken Kilty for this advice.  Knowing the number of risks found per site could be useful, but more likely it would be better to see trends and distributions of risks per grouping of sites and look for anomalies.  How does one group sites together?  Put them into “buckets” based on how many risks the sites have.  Ex. Column one is count of sites with 0-100 risks, column two is 101-200 risks, and so on.  In the reporting world a histogram visual can provide this for us.

Navigate to “https://app.powerbi.com/visuals/” and search for “Histogram”.  There is a custom visual provided by Microsoft (v0.3.1 as of the time of this writing).  Download that visual file.  Back in Power BI on the visuals gallery click the ellipses (…) in bottom right and then select “Import a custom visual”.  Navigate to the location where you downloaded the histogram custom visual and select it.

SMATPowerBIReport58

 

Risks by web application page

The Risks by web application report page will show a breakdown of risks per web application and then within the web application and also a breakdown of the risk types.  Filter by individual web application or risk name.  Note that there is a report level filter for the supporting query so that any risk name with no occurrences is hidden.

SMATPowerBIReport98

 

Risks by site collection page

The Risks by site collection page provides an overview of risk types, number of risks identified for selected risk type, and number of site collections that have the selected risks.  Filter by risk name to find the number of identified risks and sites where that risk was identified.  Note that there is a report level filter for the supporting query so that any risk name with no occurrences is hidden.

SMATPowerBIReport99

 

Top N site collections by risk page

The Top N site collections by risk page provides an overview of the top N (in my example N = 25, more on this below) site collections with the highest number of risks.  Filter by risk name to find identified risks across these top 25 sites or filter by site collection to find risks on selected site collection.  Note that there is a report level filter for the supporting query so that any risk name with no occurrences is hidden.

SMATPowerBIReport130

 

Add query parameter for number of site collections

One of the features of the Top N site collections by risk page is to sort site collections by total number of risks and only keep the top N records.  As such it is preferable to use a query parameter within Power BI desktop to allow for a variable number of site collections to be shown.  Power BI Desktop introduced query parameters with the April 2016 update.  Query parameters allow a report creator to define what is essentially a variable to then be used in data source dialogs (primarily filtering operations).  See below for an example of defining query parameter and usage in filter for top N rows.

SMATPowerBIReport128

SMATPowerBIReport129

 

Conclusion

I will be one of the first ones to admit that at the start of this project I knew next to nothing about proper dashboard creation and ways to model data in different ways.  Speaking as a general developer and not as a Microsoft employee, I was very impressed with the capabilities Power BI has to offer.  Given a sufficiently large and diverse enough set of data it was a relatively fast process to transform the SMAT output files into a usable (and now re-usable) format which can then show insights into the data.  I know for my customer and many others this means they can make business critical decisions about what, when, and where to migrate their SharePoint data to SharePoint Online.  If you have any feedback on this series of blog posts or suggestions for additional report pages please let me know in the comments.

 

-Frog Out

How To Analyze SharePoint 2013 Migration Assessment Tool (SMAT) Data in Power BI – Part 1

In this set of posts I will walk through a process for consuming SMAT output files into a Power BI report.

  • Part 1 – importing the data and then querying / modeling the data
  • Part 2 – designing the Power BI Report (and also provide sample files to use)

In a previous post I talked about the SharePoint Migration Assessment Tool for SharePoint 2013 being released to web.  Please read that post for context on working with SMAT.

Disclaimer: I am by no means a Power BI expert.  In fact this report is the first time I’ve spent any significant time developing a report.  Most of the technical credit goes to my coworkers Ken Kilty and Gene Livshin.

Note: Power BI Desktop is being updated regularly.  For the purposes of this article I am using the January 2017 build (version 2.42.4611.701 64-bit).

Overview

The SMAT tool outputs a summary CSV file along with 1 detailed CSV per risk being analyzed.  Reading and interpreting hundred or thousands or tens of thousands of rows in a CSV file is not optimal.  Instead it will be helpful to model and report on that data in an easier to consume fashion.  At a high level our process will involve the following:

  1. Import Data – Import SMAT output files into Power BI
  2. Query Data – Create multiple queries on SMAT output files to filter, aggregate, and enhance into different perspectives
  3. Design Reports – Design dashboard pages to consume the query outputs

 

Import Data

Importing data into Power BI is a fairly simple process.  After launching Power BI Desktop click the Get Data link from the welcome screen (or click the Get Data button in top ribbon menu).

SMATPowerBIReport1

 

Select the CSV data type.

SMATPowerBIReport2

 

Navigate to the folder where the SMAT output files are located.  In this walkthrough use the SiteAssessmentReport.csv file.

SMATPowerBIReport3

 

A preview of the data should appear.  I have run a PowerShell script to clean up the data from any customer sensitive information (ex. server names, database names, URLs, users, etc.)

Note: At a later date I may blog out that PowerShell script and link it back to this article.  If this is data scrubbing script is of interest please let me know in comments or contact me to prioritize this.

SMATPowerBIReport4

 

If everything was successful a list of fields should appear on the right hand of the screen.  These fields will correspond to the comma separated fields in the CSV file.

SMATPowerBIReport5

 

Query data

As mentioned earlier the raw CSV data by itself is not easily readable to gain insights about the farm.  The data needs to be filtered and queried differently to support the report that will be built.  I can not stress enough that having a set of questions you are trying to answer is extremely important.  It is not sufficient to start designing reports without some initial planning.  Think of this step as modeling the data in different ways to get different perspectives on the data.  Initially my personal goals with this process were to identify the following items:

  • Content database statistics
    • Number of sites per database?
    • Size (storage) of databases?
  • Web application statistics
    • Number of site collections and subsites?
    • Size (storage) of site collections?
  • Web application risks
    • Breakdown view of risks per web application?
    • Slicer by web application and risk type
  • Site collection risks
    • Number of risks per site collection?
    • Slicer by risk type

Having these goals / questions in mind I diagramed how to split the data into different queries that could then answer those questions.

SMATPowerBIReport6

 

SiteAssessmentReport Query

Power BI has a nice set of features that allows taking a set of data, making some changes, and then creating a reference (shallow copy) or duplicate (deep copy) that altered set of data.  As such I loaded the CSV data and then referenced that data into 3 separate queries, 1 of which was further split into multiple queries for risk data by web application and by site collection.  The following are example steps to accomplish this.

After the data has been loaded from previous step click on Edit Queries from the ribbon menu.

SMATPowerBIReport7

 

Notice that the first query for the loaded CSV file includes a few modifications to promote the header row and change datatype of columns that should be numbers.  Leave those steps in place.

SMATPowerBIReport8

 

Create three references from the source data load query (SiteAssessmentSummary) and rename the referenced queries appropriately.  In this example I named them RiskFilteredSource, DatabaseMetrics, and WebApplicationMetrics (respectively the 3 boxes in the 2nd row on my hierarchy diagram above).

SMATPowerBIReport9

SMATPowerBIReport10

 

RiskFilteredSource Query

The next step is to start adding modifications to the queries to shape the data into the different results which will be reported on.  Start with the RiskFilteredSource.  Only the SiteURL and individual risk columns (Alerts, Apps, etc.) are necessary.  Rather than removing the columns that aren’t needed it is recommended to select the columns that are needed (see reference for more information on why).  Click Choose Columns from the ribbon menu.  On the following dialog box select the columns for SiteUrl  and all of the individual risk columns (Alerts, Apps, etc.) ending with WorkflowRunning2013.

Note: the set of individual risks in the CSV will be dependent on which ones were enabled when the SMAT tool was run.  This examples uses the out of the box settings as of SMAT v1.1 release.

SMATPowerBIReport11

 

The query result should look like the following with only the columns selected still showing.  That is all for the RiskFilteredSource query.

SMATPowerBIReport12

 

DatabaseMetrics Query

Next will be modifications to the DatabaseMetrics query.  Similar to the previous query start by selecting only the columns relating to databases (e.g. ContentDbName, ContentDbServerName, and ContentDBSizeInMB) by using the Select Columns button from the ribbon menu.

SMATPowerBIReport13

 

There will likely be duplicate rows in the result because each row in the original CSV corresponds to a single site collection and each content database can contain 1 or more site collections.  In order to clean up the duplicate rows select all three columns and then (with columns still selected) click Remove Rows –> Remove Duplicates from the ribbon menu.

SMATPowerBIReport14

 

The DatabaseMetrics query result will contain only a single record per content databases with the associated server name and database size.

SMATPowerBIReport15

 

WebApplicationMetrics Query

The last of the original four queries, WebApplicationMetrics, will take extra steps to shape the data as intended.  The SiteUrl column is not in the format needed since it contains full site collection URLs.  For the purposes of this example assume that the authority (ex. subdomain.domain.com) portion of URL represents an individual web application.  Continuing with that assumption the SiteUrl column can be manipulated to show data per web application.  Click Choose Columns from the ribbon menu and select only the columns for SiteUrl, SiteCollectionSizeInMB, SiteCollectionCount, and SubWebCount.

Note: if the farm contains host named site collections (HNSCs) then the assumption regarding “URL authority = web application” may be invalid.

SMATPowerBIReport16

 

Next remove the scheme (ex. http:// or https:// most likely) from the SiteUrl column by clicking the Replace Values button from the ribbon menu.  For Replace With specify an empty string (“”).  In my example all sites are using https:// but a farm may have a mix of both http:// and https:// so an additional Replace Values step may be needed.  I’m not aware of a way to combine both into one operation but if a reader knows please do share in the comments.

SMATPowerBIReport17

 

In order to separate out the authority portion of URL from the rest of the site collection (e.g. the path) click the Split Column –> By Delimiter button from the ribbon menu.  Choose a custom delimiter of forward slash (/) and choose to only split at the left most delimiter.

SMATPowerBIReport18

SMATPowerBIReport19

 

To make it easier to read these newly split columns click “Rename column” from the ribbon menu and rename the first column to SiteUrlDomain and the second column to SiteUrlPath.  Alternately the Rename column functionality can be found by right-clicking the column and choosing Rename from the list of actions.

SMATPowerBIReport20

 

The final modification for the WebApplicationMetrics query will be to group the data by multiple aggregations.  Select the SiteUrlDomain column and click Group By in the ribbon menu.  Make sure that SiteUrlDomain is selected for the top Group By box.  Proceed to add additional aggregations for three columns:

New Column Name Operation Column
SiteCollectionSizeInMB Sum SiteSizeInMB
SiteCollectionCount Count Rows
SubWebsSum Sum NumOfWebs

 

SMATPowerBIReport21

 

The results for the WebApplicationMetrics query should look similar to the following.

SMATPowerBIReport22

 

RiskByWebApp Query

Referring back to the original planning hierarchy there are two final queries to be created: “Risks by Web Application” and “Risks by Site Collection”.  These queries are children from the “Filter for Risk Related Columns”.  This translates to creating 2 references from RiskFilteredSource query.  Rename these referenced queries to RiskByWebApp and RiskBySiteCollection.

SMATPowerBIReport23

 

Starting with RiskByWebApp perform the similar steps used with WebApplicationMetrics.

  1. Replace the https:// or http:// scheme
  2. Split SiteUrl column on leftmost “/” delimiter

Since this query is focused on web application based information the “path” portion of URL (ex. “/sites/HRMarketing”) is not needed.  Click Choose Columns and select all columns except “SiteUrl.2”.

SMATPowerBIReport24

 

After the columns have been chosen rename the “SiteUrl.1” column to “SiteUrlDomain”.  The data should look similar to the following at this point.

SMATPowerBIReport25

 

Now is where some of the magic happens.  Credit for this specifically goes to Ken Kilty and Gene Livshin.  I want to aggregate the individual risk counts per web application while also maintaining the individual risk name.  Welcome to Unpivot Columns (<insert fireworks and fanfare>).  Think of the unpivot columns operation as transpose for a key-value pairing.  Select all of the columns for the individual risks (Alerts, Apps, etc.) but be sure to not select SiteUrlDomain.  Then click Unpivot Columns.

SMATPowerBIReport26

 

The resulting query will have SiteUrlDomain, Attribute, and Value for columns.  Rename Attribute and Value to RiskName and RiskCount respectively.

SMATPowerBIReport27

 

Notice that RiskName values will repeat for if there are multiple site collections in that web application.  Click Group By from the ribbon menu.  Specify SiteUrlDomain and RiskName for the top grouping dropdowns (add a new grouping for RiskName since the default is one grouping.)  For the New Column Name, Operation, and Column specify RiskCountAggregated, Sum, and RiskCount respectively.

SMATPowerBIReport28

 

The RiskByWebApp query result is finally in a usable format now.  See the following for an example of the output.

SMATPowerBIReport29

 

RiskBySiteCollection Query

The RiskBySiteCollection query will mirror the RiskByWebApp query in every step except there will not be a split column or the clean up of columns immediately afterwards.  The order of steps will be as follows:

  • Source
  • Replaced value – remove scheme from SiteUrl
  • Unpivoted Columns – select all columns except for SiteUrl for the unpivot
  • Rename Columns – Attribute –> RiskName, Value –> RiskCount
  • Grouped Rows – group by SiteUrl and RiskName, sum RiskCount into RiskCountAggregate.

 

The RiskBySiteCollection query should end up looking like the following:

SMATPowerBIReport30

 

Conclusion

Quite a bit of work goes into transforming and modeling the initial CSV data into more usable formats.  Hopefully this post has been helpful in illustrating steps that can be taken to plan out the end goal as well as model the queries of the SMAT output data.  In Part 2 of this post I will walk through creating a number of Power BI report pages that leverage what has been created thus far.  I’ll also attempt to share out sample source CSV data and a Power BI PBIX file but no guarantees.  Please share any feedback or questions in the comments.

 

Continue reading part 2 of this series.

 

-Frog Out