MSPTDA 03: Power Query Introduction: Importing & Transformation Data in Excel & Power BI Desktop


[MUSIC PLAYING] Yes, this is MSPTDA
number three. And in this video, we get to use
Power Query for the first time. Look at what this
amazing tool can do. Now this is going to be an
introduction to all the things a Power Query can do. It can extract and
import the data, clean and transform the data. It is an ETL master tool, that
data warehousing term, extract, transform, and load. No other tool does it
better than Power Query. And if that’s not
amazing enough, Power Query is both
in Excel and Power BI. And there’s one other thing
that Power Query does. For Excel people, it can
replace crazy array formulas that we used to have to do with
a simple few clicks of Power Query. Now we want to take a look at
the ambitious set of topics we’re going to try and
cover in this video, because I want to give you
a glimpse of everything that Power Query can do. We’re going to start by cleaning
and transforming data in Excel. Then we’ll look at the Power
Query user interface and the M code behind the scenes. We’ll see how to load the data
to an Excel sheet, the data model, and as a connection. Then we’ll see
something so amazing. We can edit, delete,
and add steps at any time for any one of
our Power Query solutions. Then we’ll do ETL
extract, transform, and load data from Access into
the Excel Power Pivot Data Model. Then we’ll do ETL in
the data from text files and dump it into the Power
BI desktop data model. Then we’ll look at the Power
BI desktop Power Query user interface, which is slightly
different than in Excel and the M code. And we’ll finish it off
with an amazing example where we’ll replace
a crazy array formula with a simple
Power Query solution. Our first four
examples here, we’ll get to see on this sheet CT. Now here’s our situation. We have a data set in Excel,
description and amount. And look at this. They’ve combined
product date and region all in a single cell
with the delimiter space forward slash space. So we need to break it
apart into three columns. Then use the three columns and
the amount to make this report. Now before Power Query, we could
do this a few different ways. We could actually use
data text to columns. But the problem is, we
have a delimiter that’s more than one character. So with text to
column, we’d have to do it in multiple steps. We also could do
it with formulas. But look at all these crazy
formulas we’d have to do. That’s where cleaning and
transforming with Power Query comes in. Now this would be the first
time we’ve done Power Query. We’re using Office 365, the
latest version of Excel. So our get and transform,
that’s our Power Query. Now in order to use Power
Query on Excel data, we have to first convert
it to an Excel table. As we talked about
last video, we want any solution we
make to be dynamic. And actually, we’ll test it. We’ll convert this data set. And then later,
we’ll add new data and see that our report updates. All right, so I’m going to
go up to insert table or use the keyboard
control T and enter. I want to name it, so
I go up to design, over to properties, table
name, or use the keyboard, alt J, T, A. We’re going to
call this something smart, like start sales, and enter. There is our name. Now we can bring it
into Power Query. Actually, I have my
ribbon tabs collapsed. You can either right click
and point to collapse, or the keyboard is Control F1. That’s a toggle for
your ribbon tab. All right, data. Down to get and transform. And because our data
is from an Excel sheet, and it’s an Excel table, we use
the from table range button, click, to bring it into
the Power Query editor. Now let’s take a little tour. Over here on the right
are query settings. That’s the name. This would be the name of our
query if we didn’t change it, but that’s the actual
name from the Excel table. We’ll come back and change
it in just a second. There are two steps. These are two separate steps
in the transformation that were automatically created
when we clicked that from table button. Down here, it says
preview downloaded at a particular time. Over here there’s some
information about columns and number of rows. There’s our two columns that
are imported and waiting for us to transform them on the left. If I click this
query, later, we’ll have lots of different
queries we’re working with. And this is where we see
the different queries. I’m going to collapse that. We have data ribbon tabs just
like in Excel, the home ribbon tab, some common cleaning
and transformation features, the transform ribbon tab,
awesome transformation features, the add column. We’ll get to see a bunch of
different ways to add columns. And then there’s view. You can view the
code, whether or not you want to see the formula bar. I’m going to go
back over to home. And as we see,
the f of x, that’s the icon over in
Excel for formula, there’s our formula bar. Now we’ll come back and talk
about the formula bar later. But the first step
for us is always going to be name
the query smartly. So I’m going to name
this query something like transformed
sales table and enter. Now here’s our two steps. If we click on source,
we could actually look up to the formula bar and
see that the case sensitive M code that is behind the
scenes in Power Query used the function Excel
dot current workbook. Now we’ll come
back later and talk about functions in the M code. I’m going to come over
and click on change type. Up in the formula bar,
you can see table dot transform column types. Now both of those steps
were created automatically. And much of what we
can do in Power Query, we can click on
features or right click and do particular
features and the steps and the M code will be
created automatically. Now later, we’ll come back and
see how to edit, delete, add new steps to the applied steps. That’s a very powerful
feature in Power Query. And we’ll talk
more about M code. For this example,
we’re just going to use the built in features. Now one last thing
about change type. In Power Query, we have
to define the data type. If we don’t, our
calculations in Power Query and then later in an Excel
spreadsheet or the data model, those calculations
might not work unless we define the correct data type. Now this data type is fine. This is one, two, three. That’s the icon
for a whole number. Over here ABC, that’s
the icon for text. Now watch this. You can click, and
it shows you the icon and gives you the description
for the different data types. We will deal with
data types later. Right now, we have
our first two steps. Let’s get busy transforming. I’m going to click on
the description column. And we need to
split by delimiter. So we go up to home. There is the split by column. We could also go over to
transform, split by column. And for many of the
common features, we can right click
the column header, and there’s a bunch of cleaning
and transforming features. Let’s point to split by column. There’s the same two options
over in text to column. But over here, when we
select by delimiter, there’s much more
that we can do. Now what is our delimiter? I’m going to click the
dropdown, point to custom, and here’s where it is so much
better than text to columns. Space forward slash
space, three characters define our delimiter. Now we could split at the
leftmost, the rightmost, or in our case, at each
occurrence of the delimiter. Now these options are also
not available over in text to columns. And watch this. We can click advanced. And look at that. We have the options which
we don’t over in Excel to split by columns,
which is what we want. We want to split into
one, two, three columns. Or we could split by rows. And we’ll have great
examples for this row one later in the class. All right, we have what we want. I’m going to click OK. And look at that. One, two, three. Over here, we can
see two new steps. Split by column
delimiter is the name. Change type is the name. And look at that. It changed the type. ABC, which is text. That means date. There’s text, and
there is whole number. Now our last step is
to rename the columns. Now watch this. I’m going to rename
it one of three ways. You could right click, rename. I’m never going to
do it that way again. You could double click, escape. You can also use the F2 key. F2 puts it in please
give me a new name mode. Now F2 key works in Windows
Explorer and Access also. So this column is going
to be called product. Now watch this. I’m going to hit enter. And to move to the next column,
I’m not going to use my mouse. I’m going to use the arrow keys. Then with the
column selected, I’m going to use F2 to rename it. This is called date, enter,
arrow key, F2, region, and enter. And just like that, I’ve
renamed the three columns. Now it added just a single
step, even though we renamed three columns. There we have it. We have the desired columns
and, very importantly, the correct data types for
each one of our columns. Five steps were
automatically created. We have a good name
for this query. So now we want to
see how to load. We go over to home. Over here in Excel, Power
Query is called close and load. Over in Power BI Desktop, it’ll
be called close and apply. We have more options of where we
can load it over here in Excel. I want the second option,
close and load two. This dialog box will
be very important. It’s called import data. Now in earlier versions,
it was called load two. It had the same options though. And look at this. We can load it to a
table, which is sometimes exactly what we want,
especially in Excel when our data is coming externally. We could load it
directly to a pivot table report, which is what we’re
going to do in the end here, a pivot chart. Only create a connection. And then finally, we can totally
dump whatever we transform and clean into the data model. Now I’m going to start
off by selecting only create a connection. When I click OK, you’ll see
over here on the right queries and connection pane. We have one query. That pane can be
opened and closed in the queries and
connections group by selecting queries
and connection. So later, when you
come back to a workbook and you need to edit
queries, you click on this. Now it says connection only. And there are some
times where we do not want to dump the data anywhere. Most of the time,
connection only is when we’re
transforming data, and it becomes part of a
different table where we load that final
table to some location. Now this isn’t the location
we want to load to. So how do we edit that? Right click the
query and load to. Now this is really important. Because later, when you load
something to the data model, and you try to delete it over
in Power Pivot, it doesn’t work. You always have to come
back to Power Query to either edit, which
we’ll do in a second, or change the load to location. I’m going to click load to. I want to now dump
it in the data model and just notice that it
ends up in the data model. Only create a connection,
because that’s the Power Query part of it. But now, we’re going to
load it to the data model. So I click OK. It says 365 rows loaded. Now it doesn’t tell
you where it’s loaded, which I wish it did. But we put it into
the data model. Now Power Pivot ribbon tab,
we could click data model. I like over in data,
since I’m here so often, there it is, manage data model. Now the keyboard to get to
data, if I hit the alt key, notice A gets me to data. And when I hit A,
then I see, oh, that’s an easy keyboard to remember. So even though I may be over
in the Power Pivot ribbon tab or somewhere else, when
I want to see the data model and I want to use a
keyboard, I use alt A, D, M. Now this is the Power
Pivot for Excel window. Just like Power Query, it
opens in a completely separate window. We’ll come back and talk
about Power Pivot later. But I just want to notice that,
yes, in fact, the table got dumped in the data model. We could even go
over to diagram view, which is like relationships
view over in Power BI desktop. If we tried to
delete this table, it is trying to be polite. And we delete this
from the model, and it says, uh-uh, this
is connected to a query, so we have to go
over there to do it. Now I’m going to close
Power Pivot window. Come over, right click load
to, and I want to uncheck this. Now I’m going to click
table, and I want to load it to a new worksheet. We’re just looking at the
different places we can load. Now when I uncheck
that, it actually will delete it from
the data model. When I click OK,
it’s being polite. It’s telling us that
this is in the data model and this might really
wreck things if we do this. But when I click OK. Now notice over here,
the message is the same. 365 rows loaded. It doesn’t tell us that this is
to a sheet or the data model. I see the new sheet. If I use my keyboard, alt, A,
D, M, to open up the data model, I can sure enough see
that it has been deleted. I’m going to close this. Now I want to delete
this, because this is not where I want it. Oftentimes, if you’re
getting data externally and you need a proper data set,
this is exactly what you want. But here, we don’t want this. I’m going to right
click delete the sheet. Click delete. Notice it transformed
to connection only. Now I’m going to right click. And before we load it
to pivot table cache, I want to go back and edit. Now we can point to edit. And when I do that, it opens
up our Power Query window. I’m going to close it. You can also double click. This allows us to
edit the query. We might want to do things like
we forgot to add a data type, and we need to add one. We might be adding some other
columns or many other reasons. Now what I’d like to do
is delete all of the steps from change type
after, and then go through a slightly
different process where we see how to edit applied
steps and some of the M code. Now watch this. We can right click a particular
step, and say delete until end. This will delete all of the
steps from this point forward. So when I click delete
end, and then delete, those steps are removed. Now I want to show you something
else that’s very important. Let’s say I just did a
bunch of changes and edits, and they’re totally
wrong, no problem. Come up and click the red X. It’ll be polite. It says, do you want
to keep the changes? This will always come up. So you’re free to come
back here and play around. If you want to revert back to
the original, click discard. Now when I double
click to open this, I’m back to my starting point. Now that really
was what I wanted, so I’m going to right
click, delete until end. Delete. Now we deleted that change type. Because remember automatically,
it changed the type here. But we had a another step
later where we actually had to change the types. So instead of having
two separate steps, we’ll consolidate all
of the change types after we split the columns. Now right click description,
split column by delimiter. I’m going to select custom,
space, forward slash, space, click OK. I have my one,
two, three columns. It added two steps, split column
by delimiter and change type. Now we still need to
rename these columns. But guess what? Somehow in the code
already, the columns were given the name description
one, description two, and so on. If I go back to split
column by delimiter, come up to the formula
bar, click the down arrow to expose the formula
bar, I see the table dot split column function. And as I’m looking
through here, I can probably guess that this
last argument is the one that named those columns. Now if you want more information
about particular functions, you can simply go search Google. And sure enough, if we
look through the arguments, that fourth argument, it looks
like these two were left out, but this fourth argument
allows us to name the columns. So I’m actually going
to click on change type. Delete that. Actually, let’s leave
that step there. Let’s come over here. We’re going to highlight
up in the formula bar. We’re editing the
Power Query function, which is editing the M code. And I’m going to call this
one, and then the second column should be called date. And the third column
should be called region. And so when I hit
enter, I’ve edited this function up in the formula
bar, and thus saving a step. If the Power Query solution
for this transformation step was naming the column
anyway, we might as well go up and edit it. Now if I click back on change
type, there’s an error. And it’s fairly polite. Most of the time,
you can figure out what’s causing the error
from this yellow message. Yes, when we change
the column name, this step was using description
one, description two, and so on. Now we could edit it here. But forget that. I’m going to click the red
X. The moral of the story here is there are some
times where you can edit a previous step and logically it
has no effect on the remaining steps. In that case, it’s fine. But in this situation,
the column names were used in subsequent
steps, so we had a problem. I’m going to click the
red X. ABC text is fine. This is definitely
a date column. Notice it’s aligned to the left. Click the icon for data type. Decimal number, that’s like an
Excel number with 15 digits. Currency with four decimal. Whole number, percentage,
date and time together, just date, just time,
date time time zone. Duration, that’s
an amount of time. Text, Boolean
value, true, false. We’ll see what a binary
is later in this video. Also in other videos
later in the class, we’ll use using locale. That’s when we get dates
from a different country. We’re going to select
date data type. And there’s our data type. We can see it’s
aligned to the right. Now we want to come
up, and I’m going to select currency for this,
because this is a dollar amount. Now look at that. We have one, two, three steps. Now when we’re going
to reload this, I would like to
click the dropdown and choose this second option
to load it to a pivot table. But once we’ve
loaded it, and we’ve come back here to
edit it, we simply click close and load button. It loads to wherever it
was previously loaded. Now we can right
click load to, and I want to load this into
the pivot table cache. Our data is sitting in
an Excel spreadsheet. So just like we would
open up pivot table, and it would take it and put
it in the pivot table cache, that’s what it’s going to do. But it will be the
transformed proper data set. Now I want this on a new
sheet, so I click new sheet. Click OK. Now look at that. It’s loaded, but it doesn’t
tell me where it’s loaded to. There’s our new pivot table. There’s the new sheet. I’m going to double click
and call this PT and enter. Now this is a normal
pivot table cache. That’s a location
inside of Excel that stores the data
for a pivot table. Because it’s a
standard pivot table, I’m going to use the
grouping feature. When we get over
to the data model, we do not want to use
the grouping feature. It causes all sorts of trouble. But here, this is a
standard pivot table. So I’m going to click
and drag down to rows. Actually, watch this. I’m going drag the pivot
table field list over here. Now months, I want,
but I do not want date. I’m going to drag
product down below rows. Region over to columns. Amount down to values. Expand, expand. This is a standard pivot table,
not a data model pivot table. So we actually have
to add manually number formatting every
time we make a calculation. Right click and the
number formatting option. I’m going to select something
like currency, click OK. All right, so that’s
pretty amazing. We were allowed
to clean the data and then make a pivot
table report here in Excel. Now if we go back to
CT, that’s a data set. And for some reason,
we get this crazy data. If we come over to the side,
remember whatever solution we have, we want it to
update when new data arrives. I’m going to click in a
single cell control asterisk to highlight the whole table,
control C, scroll over. Click somewhere in the
description column. Remember this is
the bad data source. Control down arrow. This is an Excel table. So when I click directly
below and control V, instantly, that new data is
incorporated into our table. Control home. We want to come back over to PT. Right click somewhere
in the pivot table. Point to refresh,
and look at that. Click. Click. Our report has updated. And over in our queries
and connections, I can see Power
Query did its thing. It cleaned all that
data and loaded 481 rows into the pivot table cache. Now before we go and look at
our second and third example, I want to do one last thing. I want to double click
our transformation. We can see our three steps. And when we click on
any particular step, we can look up into
the formula bar. But let’s go look at
the complete M code in the home ribbon tab
advanced editor over on view advanced editor. Click advanced editor, and
there’s our complete M code. We have three steps that
result in our output. These transformations
will start with let. And everything here
is case sensitive. Little l-e-t,
followed by each one of the steps in
the transformation. In is followed by
the end result. Now notice each transformations
step has a name source. This one is split
column by delimiter. Why is this one not in
double quotes and pound sign? Because that has spaces. Further, if you
look over at text, like when we entered product and
date and region as the column names, double
quote, double quote, that’s reserved for text. So when we want to distinguish
the name of a transformation step from actual
text, they use pound. Now if there’s no spaces,
that makes it easy. It’s just the word. So we have the name
of the transformation step, equal sign, and we can
see a function based language, Excel dot current workbook,
table dot split column, table dot transform
column types. Also very important, at the end
of each transformation line, there’s a comma. That comma means go on to
the next transformation. Comma, go on to the
next transformation. And then the last one
does not have a comma, because the name of that
step comes after in. That last step is the
end result, the output, or our final transformation. One other thing to notice that
each time we have a function, it’s going to refer to the
name of the previous step. Because this line of code will
be acting on the previous step. When we get down
to the final line, split column from delimiter,
that’s the previous line that this line of
code is acting on. All right, I’m going
to click cancel. Click the red X. So that
is example number one, where we had an introduction
to Power Query transforming Excel data into a pivot table. Now we want to go to
example number two. I’m going to click
on the sheet access. Example number two, we’re
going to take an import a snowflake data
model from Access, and import it into the Excel
Power Pivot data model. Now before we go
up to data and use Power Query to
import Access data, let’s go look at
the Access database. You, of course, can download all
of these files from the links below the video. We have three tables
over here in Access. If I double click F sales,
date, product ID, sales, rep ID, those are foreign keys,
units, and revenue discount. Down here, we can see
there’s about 37,000 rows. That F sales table is
related to D products. There’s the product ID. This is the primary key that’s
connected to the foreign key over in our F sales, product
name, price description, and so on. Category is a foreign key. And if I close this,
and open up D category, that’s the primary key. So the categories
for each product are stored in a separate table. If I go over to database,
tools, relationships, there’s the many to
one relationship. Now notice over here in Access,
they have the infinity symbol. Over in the Power Pivot data
model and Power BI Desktop data model, they’ll actually
use an asterisk to one. But over here, F sales to
product, and then product to category. Now when we import this, we want
to pull the category into the D products table, so we
have a proper star schema. That will be no problem
with Power Query. Because guess what? When Power Query
sees this database, it not only will
pull in the tables, it’ll pull in the
relationships too. Let’s close this. Now in Excel, we want to
go up to our Power Query. Both of these sections here. But here it is, get data. This get data button will also
be over in Power BI Desktop. We click the dropdown, and
we have all sorts of options. From file, we can get it from
an Excel workbook, a text file, XML, JSON. This from folder
will be profound. We’ll actually be
able to simultaneously important multiple files. That actually will
be our next example. We have all sorts of
databases we can connect to from Azure, for example,
an SQL database, from online services,
other sources. There’s the new
from web, as opposed to our old legacy from
web user interface, and a bunch of other
options, including a blank query down here. There’s the legacy wizards. You can actually have
these show up in your get data dropdown by
going to options in Excel. There’s the old
legacy web import. Then combine queries. Now what we want here
is from database. And we’re going to connect to
a Microsoft Access database. So I click. I navigate to
wherever that file is. There it is. That’s the file
you can download. Double click. This is the navigator
part of Power Query. I want to select
multiple tables, and then check each one of them. Now we want to come
down here, and I do not want to click the load button. I want to click transform. Now transform data, that
button used to say edit, but now it says transform data. When I click this button, it
will bring the three tables into our Power Query editor. As we saw in the
first example, queries is where we list all
the different queries in this workbook. That is the first query
from our first example. If I click on D category, I can
see over here in properties, there’s the name. There’s the two steps. D product, there’s the name. There’s the two steps. We’re going to click on F
sales, and we’ll start here. There’s the name
and the two steps. Now this is coming from
an external data source, and I want to keep the
exact same name, F sales. Now if we want a
little bit more room, we can pull this task
pane in a little bit. Maybe we can do
that over here also. Now F sales, we look through
each one of the columns at the name and the data type. Now we’re going to look
through each column and decide what
we’re going to do. This is a primary key,
and it’s in a fact table. We really don’t need it for
any of our calculations. Occasionally, you might need
it for some calculations. But we don’t, so we’re
going to remove it. Not only that, but
because our goal is to take the data
from these tables and dump it into the data model,
since the data gets stored in a columnar database, that
means each one of these columns will be stored as a unique list. And if there is a unique
identifier as a column and we don’t need
it, why increase the size of our
columnar database storing the raw data
by keeping this column? So I’m going to
right click remove. Date, you know what? These are actual dates, and
it came in as date time. So I’m going to
change the data type. Click the dropdown, and I’m
going to change it to date. Product ID, sales rep ID,
those are the foreign keys. We do not have a sales
rep table in this example, so I’m getting rid of it. We do not need it for any
of our reporting goals. Right click remove. We do need this one,
because we’ll connect it to the product table. Units as decimal, you know what? I’m going to change
this to whole number, because units are whole numbers. For that matter, we can
change this one also, because it’s a product
ID as a whole number. Revenue discount, that’s
perfectly all right to keep it as a decimal. Now look at this. This is something we’re
seeing for the first time. Because fact sales table is
connected on the many side here to the product table,
and the product table is on the one
side, we’re allowed to see the entire
record on the one side over here in the sales table. This is because
there’s a relationship. So if I click off to the
side, not on the value, but off to the side, I
get a preview down here. This transaction right here
sold product number four. And there’s the record over
on the one side in the product table. Now we actually do not
need this column here. But for some transformations,
that’s really helpful. Because that means
through the relationship here on Power Query, we can
pull this data into this table. I’m going to right
click remove this. Now notice over
here on the steps, we have remove, remove,
remove, and change, change. I actually could
delete all those. Go through and remove
all of them at once. And then change the
data types all at once. That will be more efficient. So I’m going to go to
remove column, right click, delete until end. Delete. Now I’m going to click
on sales primary key. Holding control, I’m going
to click on sales rep ID. Scrolling over,
clicking on D products, holding control the whole time. Now I can simply right
click remove columns. And there it is one step. Now we can go through and
change units to whole number, product ID to whole number,
date time to actual date. And there we go. We have our two steps. Now let’s go over to D products. And guess what, here, we are
going to need the primary key. I’m going to change
this to a whole number. Product name is
fine and data type. Same with retail price
and standard cost, both are good names with
the correct data type. ABC for description is fine. Data type text color is fine. Category key, now we’ll come
back to this in just a moment, but look at what we have here. Because product is in
between sales and category. And the product in its
relationship to sales, product is the one. Sales is the many. So we get to see over
here in the product table, an actual table of
values from the many side. So if we look down
here, remember this is product number one. If I look down here
in the preview, these are all of the
sales for product one. If we needed to make
a transformation, we could use all of
those records over here. Now we don’t need that. If I click on category off
to the side, not on values, this is where we want to pull in
general into the product table. So now I can use the
expand button up here. Uncheck this if it’s checked. Uncheck everything. And all we want to pull
over here is category. Remember that record down
there showed us the one side. As soon as I click OK, that
means this new column here will have duplicates. When I click OK, we’ve pulled
the category for each product into the product table. Now again, the
columnar database will store this as a unique
list, so it’s no problem. And that means we have the
column we need for analysis in the product table. And that will match
the star schema. We don’t want to
use the snowflake. We’re using the star schema. Now again, for a small
data set, if you’re able to navigate it
easily, it really has no performance
downside at all. Now we need to delete. Category key, hold control,
F sales, right click remove columns. And there we have our steps. Now if I click on D category,
I can click off to the side here on the table. This is the one side
of the category. So when I’m looking at the
D products table down here, I see a preview of
all of the products that fit that category. Now here’s the thing,
this category table, when we were over in products,
we pulled in category. But I want you to
notice something. If we go back to navigation,
and this is another great thing about Power Query, we
can go back and look at the individual steps. Look at what happened. When we in imported D products
from a relational database, there were relationships
over in Access, it actually pulled
in from the one side in the category table,
the data we needed. We actually didn’t even need
to import the D category table. Now in step number one, we could
have just not clicked that, but we wanted to take a
look at that over here. Now guess what? We can simply
right click delete. I’m deleting that query. I click delete, and
that query is gone. We have our F sales table
just the way we want. We have our D products
table just the way we want, including the category. Now we need to load these. Now we already loaded this one. So now when I go up to close and
load, click close and load to. I very carefully want to say,
only create a connection. That’s for Power Query. Now I come down here,
and for the data model, I check, add this
to the data model. Both of those tables
will be added. When I click OK, I
can see over here, it’s loading, 11
rows and 36,942. Now either on the Power
Pivot ribbon tab data, I can click data
model button, or I can use the keyboard, alt,
A, D, M. Here’s the Power Pivot for Excel window. We can see that both
tables were imported. Remember the columnar database
is an in-memory database that stores the data that we can
manipulate here in Power Pivot or over in an
Excel sheet when we have a data model pivot table. Over in diagram view,
there’s are two tables. Now remember, we
need star schema. There’s the fact
table right there. Many side, one side,
click either one. Click and drag, and I’ve
created a relationship. Now later in the
class, we’ll get to do lots of amazing things
with this relationship. We can look up on the
one side, the many side. We can drag conditions
and criteria to our reports and a
bunch of other things. But what have we done? We used Power Query to import
from a relational access database just the
data we wanted. We took that snowflake
and converted it into a star schema. Control S, we’re not going
to build any reports. We just wanted to see
the power of Power Query. Now I’m going to close
this Power Pivot window. And now we get to see
our third example. We’re going to import multiple
text files and append. Append means to stack
up the different tables all into a single table. And we’ll do that
in Power BI desktop. Now you guys had
to buy Office 365 to get Power Pivot and Power
Query and a regular Excel. But Power BI Desktop, you
just go search Google. It is a free download. Not only that, but it’s one
of the fastest free downloads and installations
I’ve ever experienced. So this is what you’ll get. There it is, Power BI Desktop. I’m going to click to open. We’ll close this. Now later, just
like Power Pivot, we’ll have an introduction and
talk about Power BI Desktop. Over here on the left, we’ll
build our reports there. This is where we’ll
see our tables. This is where we see
our relationships. For this video, we’re just
talking about Power Query. And get this, that
little external data, that’s our Power Query. We can edit the queries. And there’s our get data. Now there’s a shorter
list over here. But if you come
down to more, here’s all the categories, get data,
just like we saw over in Excel. Now the one we want
is file from folder. Now what is so
amazing about this is that when we point
Power Query to a folder either here in Power
BI Desktop or over in Power Query in Excel, Power
Query sees every single file in that folder. So we could have 10 or 20
Excel files, or in our case, many different
text files that we need to simultaneously import. Now why this is so amazing,
well, for a couple reasons. We might have monthly or yearly
data each in a separate file. Also, for text files why it’s
so important is because all systems in the world that
store data cannot communicate directly with each other. But all systems can
export text files and then subsequently
import those text files. So one of the main ways that
systems have shared data with each other is
through text files. Now before I click
this, I’m going to stop here and go over
and look at the text files. Now you get to download
this as a zipped folder, and then unzip it. Inside this folder, there
are two other folders. If I double click
on start, there’s our sales data,
2015, ’16, and ’17. Now I put some
other files in here also, because another amazing
thing about Power Query is we can filter. Even though Power Query will
point towards that start folder, we may not
want all the files. So we can filter and
get just the text files. Now let me double click and
open one of these text files. In Notepad, you
could see that we have date, product ID,
category merged together, units, revenue, cost of
goods sold, and profit. This is a tab delimited file. If I were to right
click open with Word, you could see the tabs. Now what’s so amazing is
we have 2015 to ’17 data. We’ll tell Power Query
to look at this folder. Later, when we add more
text files with new data, over in a report or Power Query
or wherever we dump this data, we just refresh it and
everything updates. So when we go back later,
we’ll add these files and see if everything updates. All right, I’m going to close. Back over here in Power BI
Desktop, with folders selected, I click connect. It’s as easy as browsing. I’m going to come down and open
up MSPTDA03, and very carefully point to start. Now remember what are we doing? We’re telling Power Query
to import everything in one folder. I’m going to click OK. Click OK. I do not want to
combine, because I have some filtering to do. I definitely don’t want to load. I want to edit. Almost always, you want to click
the edit or the transform data button. At least go look at it and make
sure everything’s in order. I’m going to click edit. Now this is similar to the
way it looks over in Excel. The actual column
is a binary column. And each row contains the file. I can see the text file. If I click the second to last
one, I can see the document. So each one of these
is a file that we want to extract and get things. Now later, we’ll see there’s
problems with Excel files, because Excel files can
have all sorts of objects. But text files, that’s simply
going to be one proper data set. Now it might not contain
a proper data set, but we definitely
exported it from a system where that’s the case. Now it gives us some
attributes of those files. The one we’re interested
in is extension. Now in some of the other
examples in this class, we’ll actually have to
extract data from the name, like the year or something
or the name of the city. But that’s not
what we have here. Each one of those text
files has a proper data set. Now remember Power
Query is case sensitive. So if we have something dot
TXT capital and dot TXT, well, we would have to
filter and say, hey, please, give me this one or this one. But instead of doing
that, we’re going to right click, just like we did
over in Power Query in Excel, and there’s a bunch
of options here. We want to go down to transform. And look at that. There’s lowercase. Now actually, transform has
it over here too under format. But we’re going to right
click, transform, lower case. That way, any file
extensions that we bring in will always be lowercase. Now it’s a simple matter
of filtering text filters. And I’m going to say, it
must be equal to dot TXT. Now I probably could have just
done the dropdown up there, but this will give me
the same line of code. Click OK, and I have filtered
out the non-text files. Now up in the binary column
the double downward arrows mean combine files or
append these files. And when we click
combine, it’s first going to be polite
and ask us things like, what’s the example file? The first one is fine for us. What’s the file origin? That one’s fine for us. What’s the delimiter? We definitely have tab. And it’s going to
ask us, do you want to base this on
the first 200 rows? Now we got this data
from a database, so it spit out all
the correct way, so the first 200 rows is fine. We can already see
a preview over here. Now when I click OK, it’s going
to do a bunch of steps for us. I’m going to click OK. Look it over here. It created a bunch of queries. Now it actually took the
sample file the first one, it built some code. We can see over here,
source and promote headers. And then it built
a custom function. Now later in the class, we’ll
see how to build our own custom functions. And finally, it used that
custom function to combine. And we can see over here,
invoke custom function, it did that for each
one of those files. Now if you wanted at
any one of these steps, you could actually go
and look at the M code. Over to view. There’s our advanced editor. And there is the M code, just
like we saw over in Excel. I’m going to click cancel. Now for text files,
it’s safe to click that double downward
pointing arrow. All of these steps
work just fine. Our table will be one
fully appended table. Later when we see how
to do it with Excel, we’ll see we’re not going to
use the double downward pointing arrows to combine. We’ll do something different. Now we’re ready to load. Notice we really didn’t have
to do many transformation steps at all here. When their text files, and the
data is consistent, it is easy. Now home, close and
apply, not close and load. We really can just
click close and apply, because it will automatically
go to the data model in Power BI Desktop. So I’m going to click
close and apply. Loading to the data model. And now we can see we’re
in relationship view. We only have one table here. We come over to our table view. This is actually brand
new, just in the latest, I think it’s June 2018 update. We used to have these dropdowns
for filtering and sorting over in Power Pivot,
but we didn’t have them over here in Power BI
Desktop, but now we do. Now if we look down
here, we can see we have 524,000 rows loaded
from those text files. Now actually, look at this,
we forgot to do a few things. We forgot to split this. And if we go over and look
at the fields on the right, that is a terrible name. No problem. Over here in Power
Query, we can go back and edit the query anytime. Edit queries. Edit queries. I’m going to come
over here to name. Call it something like F sales. Come over to the product ID
category column, right click. And just as we saw over
in Excel Power Query, split columns by delimiter. We have a single forward slash. And it guessed right. I’m going to click OK. Just as we saw
over in Excel, I’m going to get rid of
this change type. And then come up to split
column by delimiter, up in the formula
bar, we’re totally allowed to edit the M code. So I’m going to call this
first column here product ID. The second column,
we’ll call category. And enter. Now it added the change
type step automatically. And actually, both
in Power Pivot and over here in
Power BI Desktop, if you don’t like that
automatic setting, you can change it in options. Later in the class, we
will change some options. All right, so we
finished our edits. Now we come up to
close and apply. And there, we have our two
columns and our proper name. Now we’re not going to build
any reports in this video. We’re just looking at Power
Query here in Power BI Desktop. But guess what? 524 rows. Let’s go drop the next two
text files in and then refresh. Over here in Windows Explorer,
I’m going to go to add later. Select, hold shift or
control, click, control C, click the Back button, double
click start, control V. Now this folder has five text files. Power Query’s pointing
to that start folder. So when we go back
over and click refresh, it is totally
running Power Query and getting the latest files. Down here, we can see 912,000
rows, almost a million rows from those text files. All right, so we saw over
here in Power BI Desktop that we can use Power Query
to get and transform and load data. Now before we go look
at our last example, uh oh, we forgot to save. Guess what? I’m going to use the same
keyboard as in Excel or Word or anywhere, F12
to open up save as. And I’m going to call it
something like MSPTDA03. And notice the
extension dot pbix. When I click save,
now I have saved this. Now we’ll go look at our
last example over in Excel. Back over in our
Excel workbook, I’m on the sheet no array formula. I got to show you something. I wrote a whole book
about array formulas called Control, Shift,
Enter Mastery in Excel Array Formulas. And there is one
advantage to formulas that Power Query
can’t really beat, but let me just show you here. If our goal is to
from this table always extract a unique list
and have it sorted, this is how crazy it
gets with array formulas. That’s crazy enough. That’s counting unique. But this formula, wow, that
even makes my head spin. But that’s the formula to
extract text items resulting in assorted unique lists. Let’s just see how easy
this is with Power Query. Because what are we doing? We’re replacing a
complex array formula. Now this is already an Excel
table, so I click from table. You know what? I do this so often,
there is a keyboard, alt A, P, T. That just brings
the table into Power Query. Now I don’t need change type. The only column I’m
interested in is sales rep. I’m going to right click and
say, remove other columns. Now what I want? I want a unique list. Well, guess what? Up in remove rows, there
is remove duplicates. By the way, that
icon right there is the same over in Excel. I can also right click
the column header. And remove duplicates is
common, so there it is. Now I want it sorted. I’m going to use a
regular sort feature. This is just amazing. There it is. How many clicks was that? A few clicks? I’m going to change the
data type also as text. There are the steps. We can close and load,
close and load to. I want to see as a table
on the existing sheet. I definitely don’t
want it in C 13. Maybe right there. Uncollapse, click OK. And there it is. Now the moral of the story
is that that is super simple. This is super hard. But watch this. Let’s come down and
add a new record. Tab, we’ll add a date, abdi. Look at that. That’s how come sometimes,
for some solutions, we still need array formulas. It just instantly updates as
soon as we change the data. It’s easy enough to come
over and right click refresh. Then just like that, we
get our sorted unique list. Now look at this. That’s a terrible name. With the query highlighted,
hit the F2 key. And I’m going to call
this something smart, like unique list sales rep. Hey, that was an epic video
all about Power Query. We saw how an example for
to replace a crazy array formula with a simple
Power Query solution. Over in Power BI
Desktop, we saw how to import from
multiple text files almost a million rows of
data, imported, transformed, and appended into one table. And then with Power
Query, we connected to an axis relational database,
imported, transformed, dumped the data in Power
Pivot, built a relationship, and got our star
schema data model. And we started it off
taking bad data in Excel, transforming it in Power
Query, and learning a lot about Power Query
in the process, and creating a cached
pivot table report. All right, if you
liked that video, be sure to click that thumbs
up, leave a comment, and sub, because there’s always lots
more videos to come from Excel is Fun, including video number
four, our next video where we’ll see how to take
multiple Excel workbook files and consolidate them all into a
single table using Power Query. All right, we’ll
see you next video. [MUSIC PLAYING]

You May Also Like

About the Author: Oren Garnes

Leave a Reply

Your email address will not be published. Required fields are marked *