Slides and Demo Files from SPTechCon DC 2017

November 16, 2017 Brian T. Jackett Conference/Tech Event, Microsoft Flow, Power BI, PowerApps, Presentation, SharePoint

Big thanks to Stacy and her crew of organizers, all of the attendees, and fellow speakers at SPTechCon DC 2017. This was one of the best ones I’ve attended from an engagement and networking perspective. Below are my slides and source code. Feel free to let me know about any follow up questions or comments.

PowerApps and Microsoft Flow for Developers

GitHub link to demo project files

https://github.com/BrianTJackett/BTJ.PowerApps.AzureDBSample

Slides

Intro to Power BI for Office 365 Developers

Slides

Sample Financial Data file

http://go.microsoft.com/fwlink/?LinkID=521962

Blog posts on SMAT data report

https://aka.ms/SMAT2013BTJpart1

https://aka.ms/SMAT2013BTJpart2

-Frog Out

Slides and Demo Files from Dogfood Con 2017

October 5, 2017 Brian T. Jackett Conference/Tech Event, Power BI, PowerApps, Presentation

A big thanks to Cassandra, Trey, Danillo, and all of the other organizers of Dogfood Con 2017. Thanks also to all of the folks who attended my two sessions. Below are my slides and source code. Feel free to let me know about any follow up questions or comments.

PowerApps and Microsoft Flow for Developers

GitHub link to demo project files

https://github.com/BrianTJackett/BTJ.PowerApps.AzureDBSample

Slides

Intro to Power BI for Office 365 Developers

Slides

Sample Financial Data file

http://go.microsoft.com/fwlink/?LinkID=521962

Blog posts on SMAT data report

https://aka.ms/SMAT2013BTJpart1

https://aka.ms/SMAT2013BTJpart2

-Frog Out

Upcoming Speaking Events in 2017

July 24, 2017 Brian T. Jackett Conference/Tech Event, Power BI, PowerApps, Presentation

I’m honored to be accepted to speak at the following upcoming events. Here are the abstracts for these presentations. If you are at any of these events feel free to stop by and chat. Registration is still open as of the time of writing as well.

Cincy.Develop()

Website: Cincy Day of Agile and Cincy.Develop()

When: Fri. Jul 28, 2:40pm-3:40pm

Title: Integrate All the Things: PowerApps and Microsoft Flow for Developers

Abstract: Securely start a server farm of Azure dev VMs with the tap of a button on your phone. Update data in an on-prem DB from any device without writing any plumbing code or UI layer. In this session we will overview PowerApps and Microsoft Flow which are low-code / no-code solutions that allow forms and workflow development that integrate to hundreds of services including Twilio, Twitter, Azure, GitHub, and more. We will also talk about the developer story for integrating with Custom Connectors and on-prem data sources such as SQL Server and SharePoint. Lastly we will demo a number of scenarios to show how easily you can create and consume apps across Windows, iOS, Android, and web. Prior experience with PowerApps and Flow is not required.

DogFood Conference

Website: DogFoodCon

When: Oct 5-6, timeslots TBA

Title: Integrate All the Things: PowerApps and Microsoft Flow for Developers

Title: Intro to Power BI for Office 365 Developers

Abstract: “Power BI is SSRS in the cloud, right?” “I’ll throw a few pie charts on the screen and call it a dashboard, right?” Wrong. In this session we will introduce Office 365 developers to the Power BI service and Power BI Desktop. We will cover data sources that can be used (Azure SQL, SharePoint, files, etc.), direct query vs. imported dataset, report and dashboard creation, and the on-prem data gateway. We will also break down misconceptions of good dashboard design (hint: pie charts and 3D are almost always a bad design). Prior experience with Power BI or Power BI desktop are not required.

SPTechCon DC

Website: SPTechCon DC

When: Nov 12-15, timeslots TBA

Title: Integrate All the Things: PowerApps and Microsoft Flow for Developers

Title: Intro to Power BI for Office 365 Developers

-Frog Out

How To Analyze SharePoint 2013 Migration Assessment Tool (SMAT) Data in Power BI – Part 2

February 22, 2017September 10, 2019 Brian T. Jackett Power BI, SharePoint

<Update 2017-02-28> Updated PBIX file with “Top N site collections by risk page” based on feedback from Microsoft peers. Also minor updates to documentation for report creation. </Update>

In this set of posts I will walk through a process for consuming SMAT output files into a Power BI report. This post will give a high level overview of the report pages to create and Power BI visuals used. In order to limit the length of this post (I was at 60+ screenshots before even finishing) I am providing a separate more in-depth Word document with the steps to create the report pages contained.

Part 1 – importing the data and then querying / modeling the data
Part 2 – designing the Power BI Report

In a previous post I talked about the SharePoint Migration Assessment Tool for SharePoint 2013 being released to web. Please read that post for context on working with SMAT in general.

Disclaimer: I am by no means a Power BI expert. In fact this report is the first time I’ve spent any significant time developing a report. Most of the technical credit goes to my coworkers Ken Kilty and Gene Livshin.

Note: Power BI Desktop is being updated regularly. For the purposes of this article I am using the January 2017 build (version 2.42.4611.701 64-bit).

Sample report and data files (updated 2017-02-28)

In-depth Power BI report creation documentation (document in progress, not finalized).

Sample SMAT source files (CSV, PBIX, walkthrough DOCX)

For the PBIX file be sure to update the source of data file via the Edit Query screen on the SiteAssessmentReport query as follows:

Overview

The SMAT tool outputs a summary CSV file along with 1 detailed CSV per risk being analyzed. Reading and interpreting hundred or thousands or tens of thousands of rows in a CSV file is not optimal. Instead it will be helpful to model and report on that data in an easier to consume fashion. At a high level our process will involve the following:

Import Data [part 1] – Import SMAT output files into Power BI
Query Data [part 1] – Create multiple queries on SMAT output files to filter, aggregate, and enhance into different perspectives
Design Reports [part 2] – Design dashboard pages to consume the query outputs

Design Reports

Continuing from the queries that were modeled in part 1 now it is time to create reports to visualize the SMAT data to tell a story. As mentioned in the previous post I had a number of questions I wanted to answer.

Content database statistics
- Number of sites per database?
- Size (storage) of databases?
Web application statistics
- Number of site collections and subsites?
- Size (storage) of site collections?
Web application risks
- Breakdown view of risks per web application?
- Slicer by web application and risk type
Site collection risks
- Number of risks per site collection?
- Slicer by risk type

These questions can be answered across a number of different pages that focus on each. Start by creating 5 total pages with the names such as the following:

Content DB overview
Site collection overview
Risk overview
Risks by web application
Risks by site collection

Mea Culpa – PLEASE READ

Before moving into the Power BI report I have to issue a mea culpa. As I reviewed the report pages I found that a few of my queries required additional columns or small modifications in order to improve usability or readability of the reports. These changes involved new calculated columns (convert MB to GB) and a conditional column to group sites by the number of risks relative to each other (small, medium, large, etc.). Please reference the sample files to see these changes.

Content DB overview page

The content DB overview page will give an overview size of the content database (storage used) and can be filtered by individual server if multiple database servers are used. Notice that databases have been re-ordered by their size rather than default sorting by name.

Site collection overview page

The site collection overview page will give an overview of the number site collections in a given web application compared to the size (storage) of site collections / web applications. Notice that a pattern is immediately visible when viewing this page. The majority of site collections are contained within sub0.domain.com while sub1.domain.com contains the bulk of remainder. Inversely sub1.domain.com is where the bulk of the storage is used while sub0.domain.com only uses a fraction of the storage.

Risk overview page

The Risk overview page will give a high level view of risks per site collection and by risk type. The histogram visual used (more on this below) filters the groups of site collections by ranges of how many risks were identified.

Add custom histogram visualization

The Risk overview page makes use of a custom visual called Histogram from the Power BI gallery. Again thanks to Ken Kilty for this advice. Knowing the number of risks found per site could be useful, but more likely it would be better to see trends and distributions of risks per grouping of sites and look for anomalies. How does one group sites together? Put them into “buckets” based on how many risks the sites have. Ex. Column one is count of sites with 0-100 risks, column two is 101-200 risks, and so on. In the reporting world a histogram visual can provide this for us.

Navigate to “https://app.powerbi.com/visuals/” and search for “Histogram”. There is a custom visual provided by Microsoft (v0.3.1 as of the time of this writing). Download that visual file. Back in Power BI on the visuals gallery click the ellipses (…) in bottom right and then select “Import a custom visual”. Navigate to the location where you downloaded the histogram custom visual and select it.

Risks by web application page

The Risks by web application report page will show a breakdown of risks per web application and then within the web application and also a breakdown of the risk types. Filter by individual web application or risk name. Note that there is a report level filter for the supporting query so that any risk name with no occurrences is hidden.

Risks by site collection page

The Risks by site collection page provides an overview of risk types, number of risks identified for selected risk type, and number of site collections that have the selected risks. Filter by risk name to find the number of identified risks and sites where that risk was identified. Note that there is a report level filter for the supporting query so that any risk name with no occurrences is hidden.

Top N site collections by risk page

The Top N site collections by risk page provides an overview of the top N (in my example N = 25, more on this below) site collections with the highest number of risks. Filter by risk name to find identified risks across these top 25 sites or filter by site collection to find risks on selected site collection. Note that there is a report level filter for the supporting query so that any risk name with no occurrences is hidden.

Add query parameter for number of site collections

One of the features of the Top N site collections by risk page is to sort site collections by total number of risks and only keep the top N records. As such it is preferable to use a query parameter within Power BI desktop to allow for a variable number of site collections to be shown. Power BI Desktop introduced query parameters with the April 2016 update. Query parameters allow a report creator to define what is essentially a variable to then be used in data source dialogs (primarily filtering operations). See below for an example of defining query parameter and usage in filter for top N rows.

Conclusion

I will be one of the first ones to admit that at the start of this project I knew next to nothing about proper dashboard creation and ways to model data in different ways. Speaking as a general developer and not as a Microsoft employee, I was very impressed with the capabilities Power BI has to offer. Given a sufficiently large and diverse enough set of data it was a relatively fast process to transform the SMAT output files into a usable (and now re-usable) format which can then show insights into the data. I know for my customer and many others this means they can make business critical decisions about what, when, and where to migrate their SharePoint data to SharePoint Online. If you have any feedback on this series of blog posts or suggestions for additional report pages please let me know in the comments.

-Frog Out

How To Analyze SharePoint 2013 Migration Assessment Tool (SMAT) Data in Power BI – Part 1

February 13, 2017February 22, 2017 Brian T. Jackett Power BI, SharePoint

In this set of posts I will walk through a process for consuming SMAT output files into a Power BI report.

Part 1 – importing the data and then querying / modeling the data
Part 2 – designing the Power BI Report (and also provide sample files to use)

In a previous post I talked about the SharePoint Migration Assessment Tool for SharePoint 2013 being released to web. Please read that post for context on working with SMAT.

Note: Power BI Desktop is being updated regularly. For the purposes of this article I am using the January 2017 build (version 2.42.4611.701 64-bit).

Overview

Import Data – Import SMAT output files into Power BI
Query Data – Create multiple queries on SMAT output files to filter, aggregate, and enhance into different perspectives
Design Reports – Design dashboard pages to consume the query outputs

Import Data

Importing data into Power BI is a fairly simple process. After launching Power BI Desktop click the Get Data link from the welcome screen (or click the Get Data button in top ribbon menu).

Select the CSV data type.

Navigate to the folder where the SMAT output files are located. In this walkthrough use the SiteAssessmentReport.csv file.

A preview of the data should appear. I have run a PowerShell script to clean up the data from any customer sensitive information (ex. server names, database names, URLs, users, etc.)

Note: At a later date I may blog out that PowerShell script and link it back to this article. If this is data scrubbing script is of interest please let me know in comments or contact me to prioritize this.

If everything was successful a list of fields should appear on the right hand of the screen. These fields will correspond to the comma separated fields in the CSV file.

Query data

As mentioned earlier the raw CSV data by itself is not easily readable to gain insights about the farm. The data needs to be filtered and queried differently to support the report that will be built. I can not stress enough that having a set of questions you are trying to answer is extremely important. It is not sufficient to start designing reports without some initial planning. Think of this step as modeling the data in different ways to get different perspectives on the data. Initially my personal goals with this process were to identify the following items:

Content database statistics
- Number of sites per database?
- Size (storage) of databases?
Web application statistics
- Number of site collections and subsites?
- Size (storage) of site collections?
Web application risks
- Breakdown view of risks per web application?
- Slicer by web application and risk type
Site collection risks
- Number of risks per site collection?
- Slicer by risk type

Having these goals / questions in mind I diagramed how to split the data into different queries that could then answer those questions.

SiteAssessmentReport Query

Power BI has a nice set of features that allows taking a set of data, making some changes, and then creating a reference (shallow copy) or duplicate (deep copy) that altered set of data. As such I loaded the CSV data and then referenced that data into 3 separate queries, 1 of which was further split into multiple queries for risk data by web application and by site collection. The following are example steps to accomplish this.

After the data has been loaded from previous step click on Edit Queries from the ribbon menu.

Notice that the first query for the loaded CSV file includes a few modifications to promote the header row and change datatype of columns that should be numbers. Leave those steps in place.

Create three references from the source data load query (SiteAssessmentSummary) and rename the referenced queries appropriately. In this example I named them RiskFilteredSource, DatabaseMetrics, and WebApplicationMetrics (respectively the 3 boxes in the 2nd row on my hierarchy diagram above).

RiskFilteredSource Query

The next step is to start adding modifications to the queries to shape the data into the different results which will be reported on. Start with the RiskFilteredSource. Only the SiteURL and individual risk columns (Alerts, Apps, etc.) are necessary. Rather than removing the columns that aren’t needed it is recommended to select the columns that are needed (see reference for more information on why). Click Choose Columns from the ribbon menu. On the following dialog box select the columns for SiteUrl and all of the individual risk columns (Alerts, Apps, etc.) ending with WorkflowRunning2013.

Note: the set of individual risks in the CSV will be dependent on which ones were enabled when the SMAT tool was run. This examples uses the out of the box settings as of SMAT v1.1 release.

The query result should look like the following with only the columns selected still showing. That is all for the RiskFilteredSource query.

DatabaseMetrics Query

Next will be modifications to the DatabaseMetrics query. Similar to the previous query start by selecting only the columns relating to databases (e.g. ContentDbName, ContentDbServerName, and ContentDBSizeInMB) by using the Select Columns button from the ribbon menu.

There will likely be duplicate rows in the result because each row in the original CSV corresponds to a single site collection and each content database can contain 1 or more site collections. In order to clean up the duplicate rows select all three columns and then (with columns still selected) click Remove Rows –> Remove Duplicates from the ribbon menu.

The DatabaseMetrics query result will contain only a single record per content databases with the associated server name and database size.

WebApplicationMetrics Query

The last of the original four queries, WebApplicationMetrics, will take extra steps to shape the data as intended. The SiteUrl column is not in the format needed since it contains full site collection URLs. For the purposes of this example assume that the authority (ex. subdomain.domain.com) portion of URL represents an individual web application. Continuing with that assumption the SiteUrl column can be manipulated to show data per web application. Click Choose Columns from the ribbon menu and select only the columns for SiteUrl, SiteCollectionSizeInMB, SiteCollectionCount, and SubWebCount.

Note: if the farm contains host named site collections (HNSCs) then the assumption regarding “URL authority = web application” may be invalid.

Next remove the scheme (ex. http:// or https:// most likely) from the SiteUrl column by clicking the Replace Values button from the ribbon menu. For Replace With specify an empty string (“”). In my example all sites are using https:// but a farm may have a mix of both http:// and https:// so an additional Replace Values step may be needed. I’m not aware of a way to combine both into one operation but if a reader knows please do share in the comments.

In order to separate out the authority portion of URL from the rest of the site collection (e.g. the path) click the Split Column –> By Delimiter button from the ribbon menu. Choose a custom delimiter of forward slash (/) and choose to only split at the left most delimiter.

To make it easier to read these newly split columns click “Rename column” from the ribbon menu and rename the first column to SiteUrlDomain and the second column to SiteUrlPath. Alternately the Rename column functionality can be found by right-clicking the column and choosing Rename from the list of actions.

The final modification for the WebApplicationMetrics query will be to group the data by multiple aggregations. Select the SiteUrlDomain column and click Group By in the ribbon menu. Make sure that SiteUrlDomain is selected for the top Group By box. Proceed to add additional aggregations for three columns:

New Column Name	Operation	Column
SiteCollectionSizeInMB	Sum	SiteSizeInMB
SiteCollectionCount	Count Rows
SubWebsSum	Sum	NumOfWebs

The results for the WebApplicationMetrics query should look similar to the following.

RiskByWebApp Query

Referring back to the original planning hierarchy there are two final queries to be created: “Risks by Web Application” and “Risks by Site Collection”. These queries are children from the “Filter for Risk Related Columns”. This translates to creating 2 references from RiskFilteredSource query. Rename these referenced queries to RiskByWebApp and RiskBySiteCollection.

Starting with RiskByWebApp perform the similar steps used with WebApplicationMetrics.

Replace the https:// or http:// scheme
Split SiteUrl column on leftmost “/” delimiter

Since this query is focused on web application based information the “path” portion of URL (ex. “/sites/HRMarketing”) is not needed. Click Choose Columns and select all columns except “SiteUrl.2”.

After the columns have been chosen rename the “SiteUrl.1” column to “SiteUrlDomain”. The data should look similar to the following at this point.

Now is where some of the magic happens. Credit for this specifically goes to Ken Kilty and Gene Livshin. I want to aggregate the individual risk counts per web application while also maintaining the individual risk name. Welcome to Unpivot Columns (<insert fireworks and fanfare>). Think of the unpivot columns operation as transpose for a key-value pairing. Select all of the columns for the individual risks (Alerts, Apps, etc.) but be sure to not select SiteUrlDomain. Then click Unpivot Columns.

The resulting query will have SiteUrlDomain, Attribute, and Value for columns. Rename Attribute and Value to RiskName and RiskCount respectively.

Notice that RiskName values will repeat for if there are multiple site collections in that web application. Click Group By from the ribbon menu. Specify SiteUrlDomain and RiskName for the top grouping dropdowns (add a new grouping for RiskName since the default is one grouping.) For the New Column Name, Operation, and Column specify RiskCountAggregated, Sum, and RiskCount respectively.

The RiskByWebApp query result is finally in a usable format now. See the following for an example of the output.

RiskBySiteCollection Query

The RiskBySiteCollection query will mirror the RiskByWebApp query in every step except there will not be a split column or the clean up of columns immediately afterwards. The order of steps will be as follows:

Source
Replaced value – remove scheme from SiteUrl
Unpivoted Columns – select all columns except for SiteUrl for the unpivot
Rename Columns – Attribute –> RiskName, Value –> RiskCount
Grouped Rows – group by SiteUrl and RiskName, sum RiskCount into RiskCountAggregate.

The RiskBySiteCollection query should end up looking like the following:

Conclusion

Quite a bit of work goes into transforming and modeling the initial CSV data into more usable formats. Hopefully this post has been helpful in illustrating steps that can be taken to plan out the end goal as well as model the queries of the SMAT output data. In Part 2 of this post I will walk through creating a number of Power BI report pages that leverage what has been created thus far. I’ll also attempt to share out sample source CSV data and a Power BI PBIX file but no guarantees. Please share any feedback or questions in the comments.

Continue reading part 2 of this series.

-Frog Out

PowerApps and Microsoft Flow for Developers

Slides

Intro to Power BI for Office 365 Developers

Slides

Sample Financial Data file

Blog posts on SMAT data report

PowerApps and Microsoft Flow for Developers

Slides

Intro to Power BI for Office 365 Developers

Slides

Sample Financial Data file

Blog posts on SMAT data report

Sample report and data files (updated 2017-02-28)

Overview

Design Reports

**Mea Culpa – PLEASE READ**

Content DB overview page

Site collection overview page

Risk overview page

Add custom histogram visualization

Risks by web application page

Risks by site collection page

Top N site collections by risk page

Add query parameter for number of site collections

Conclusion

Overview

Import Data

Query data

SiteAssessmentReport Query

RiskFilteredSource Query

DatabaseMetrics Query

WebApplicationMetrics Query

RiskByWebApp Query

RiskBySiteCollection Query

Conclusion

Mea Culpa – PLEASE READ