TYIT BI Practical File

The document outlines practical exercises for Business Intelligence and Data Analytics using Microsoft Excel, Power BI, SQL Server, and R/Python. It covers data import, analysis using Pivot Tables and Pivot Charts, what-if analysis, data classification, clustering, linear regression, and logistic regression. Each practical includes step-by-step instructions and code snippets for executing the tasks.


Business Intelligence and Data Analytics

Practical 1.
Perform the analysis for the following:
a. Import the data warehouse data in Microsoft Excel and create the Pivot table and Pivot
Chart.

(MS Office Professional is used to ensure that Power View is enabled for visualization.)
Step 1: Open Excel 2013 (Professional).
Go to the Data tab → Get External Data → From Other Sources → From Data Connection Wizard.

Step 2: In the Data Connection Wizard, select Microsoft SQL Server and click Next.

Step 3: In Connect to Database Server, provide the server name (the Microsoft SQL Server instance name) and the password for the sa account as set during installation of SQL Server 2012 (full version).
Password: admin123
Click Next.

Step 4: In Select Database and Table, select Sales_DW (already created in SQL Server), check all dimension tables, and import the relationships between the selected tables.

Step 5: In Save Data Connection File, browse to a path and click Finish.

Step 6: In Import Data, select PivotChart and click OK.

Step 7: In the field list, place SalesDateKey in Filters, FullDateUK in Axis, and Sum of ProductActualCost in Values.

Step 8: On the Insert tab, go to PivotTable.

Step 9: Click Choose Connection, select the existing connection to Sales_DW, and click Open.
The Pivot Table and Pivot Chart are created.
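
For reference, the aggregation that this PivotTable/PivotChart performs (total ProductActualCost per date) can be sketched in a few lines of pandas. This is only an illustrative analogue, not part of the Excel workflow; the file name and column names below are assumptions based on the fields used in Step 7.

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical flat export of the fact table joined with the date dimension.
fact = pd.read_csv("fact_product_sales.csv")   # assumed columns: SalesDateKey, FullDateUK, ProductActualCost

# Equivalent of the PivotTable: sum of ProductActualCost for each FullDateUK.
pivot = fact.pivot_table(values="ProductActualCost", index="FullDateUK", aggfunc="sum")
print(pivot)

# The PivotChart of Step 7 is essentially a line plot of this aggregated series.
pivot.plot(kind="line")
plt.show()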
Practical 1 b.

Import the cube in Microsoft Excel and create the Pivot Table and Pivot Chart to perform data analysis.
Step 1: Open Excel 2013 (Professional).
Go to the Data tab → Get External Data → From Other Sources → From Analysis Services.

Step 2: Enter the server name, select Windows Authentication, and click Next.

Step 3: Select the OLAP cube (created earlier) and click Next.

Step 4: Browse to a path, assign a file name, and click Finish.

Step 5: Select PivotTable Report → OK.

Step 6: Drag and drop fields into Rows, Columns, and Values.

Step 7: Go to the Insert tab → PivotChart and select Pivot Chart from the drop-down.

Step 8: Select the existing connection OLAP Sales_DW and click Open.

Step 9: Click Choose Connection to select the path.

Step 10: Click OK.

Practical 2.

Apply the what-if analysis for data visualization. Design and generate necessary reports based on the data warehouse data. Use Excel.

Suppose you own a book store and have 100 books in storage. You sell a certain % at the highest price of $50 and a certain % at the lower price of $20.
If you sell 60% at the highest price, cell D10 calculates a total profit of 60 * 50 + 40 * 20 = 3800.
Create Different Scenarios: But what if you sell 70% at the highest price? And what if you sell 80%? Or 90%, or even 100%? Each different percentage is a different scenario. You can use the Scenario Manager to create these scenarios.
Note: Instead of typing each percentage into cell C4 to see the corresponding result in cell D10, we use what-if analysis.
What-if analysis enables you to easily compare the results of different scenarios.
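
The arithmetic behind these scenarios can be sketched in a few lines of Python (a minimal illustration of what the Scenario Manager automates; the 100 books, the $50 and $20 prices, and the percentage list come from the example above):

# Total profit for each "percent sold at the highest price" scenario.
BOOKS = 100
HIGH_PRICE = 50
LOW_PRICE = 20

for pct_high in (0.6, 0.7, 0.8, 0.9, 1.0):
    sold_high = BOOKS * pct_high          # books sold at $50
    sold_low = BOOKS * (1 - pct_high)     # remaining books sold at $20
    profit = sold_high * HIGH_PRICE + sold_low * LOW_PRICE
    print(f"{pct_high:.0%} at highest price -> total profit = {profit:.0f}")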
Step 1: In Excel, on the Data tab, in the Data Tools group, click What-If Analysis.
Step 2: Click What-If Analysis and select Scenario Manager.

The Scenario Manager dialog box appears.

Step 3: Add a scenario by clicking Add.

Step 4: Type a name (60percent), select cell F10 (% sold for the highest price) as the Changing cells and click OK.
Click the collapse-dialog icon (circled in the screenshot), select cell F10, click the icon again, and then click OK.

Step 5: Enter the corresponding value 0.6 and click OK again.

Step 6: To apply a scenario, click Show.
Step 7: Next, add 4 other scenarios (70%, 80%, 90% and 100%).
Finally, your Scenario Manager should be consistent with the picture below:

Practical 3

Perform the data classification using classification algorithm using R/Python.


Data Analysis using Time Series Analysis

Software required: R 3.5.1


A time series is a series of data points in which each data point is associated with a timestamp. A simple example is the price of a stock in the stock market at different points of time on a given day. Another example is the amount of rainfall in a region in different months of the year. The R language provides many functions to create, manipulate and plot time series data. The data for a time series is stored in an R object called a time-series object, which is an R data object like a vector or data frame.

The time series object is created by using the ts() function.

Syntax

The basic syntax for ts() function in time series analysis is −


timeseries.object.name <- ts(data, start, end, frequency)

Following is the description of the parameters used −


• data is a vector or matrix containing the values used in the time series.
• start specifies the start time for the first observation in time series.
• end specifies the end time for the last observation in time series.
• frequency specifies the number of observations per unit time.

Except for the "data" parameter, all other parameters are optional.

Consider the monthly rainfall details at a place starting from January 2012.

We create an R time series object for a period of 12 months and plot it.
Code to run in R

# Get the data points in the form of an R vector.
rainfall <- c(799, 1174.8, 865.1, 1334.6, 635.4, 918.5, 685.5, 998.6, 784.2, 985, 882.8, 1071)

# Convert it to a time series object.
rainfall.timeseries <- ts(rainfall, start = c(2012, 1), frequency = 12)

# Print the time series data.
print(rainfall.timeseries)

# Give the chart file a name.
png(file = "rainfall.png")

# Plot a graph of the time series.
plot(rainfall.timeseries)

# Save the file.
dev.off()

# Plot again to display the chart in the R graphics window.
plot(rainfall.timeseries)
Output:
When we execute the above code, it produces the following result and chart:

          Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
2012    799.0 1174.8  865.1 1334.6  635.4  918.5  685.5  998.6  784.2  985.0  882.8 1071.0
Practical 4.
Perform the data clustering using a clustering algorithm using R/Python.

k-means clustering using R

# Use the iris dataset; remove the Species label before clustering.
newiris <- iris
newiris$Species <- NULL

# Apply k-means with 3 clusters to iris and store the result.
(kc <- kmeans(newiris, 3))

# Compare the Species labels with the clustering result.
table(iris$Species, kc$cluster)

# Plot the clusters and their centres.
plot(newiris[c("Sepal.Length", "Sepal.Width")], col = kc$cluster)
points(kc$centers[, c("Sepal.Length", "Sepal.Width")], col = 1:3, pch = 8, cex = 2)
Practical 5

Perform the Linear regression on the given data warehouse data using R/Python.

Input Data
Below is the sample data representing the observations:
# Values of height
151, 174, 138, 186, 128, 136, 179, 163, 152, 131
# Values of weight
63, 81, 56, 91, 47, 57, 76, 72, 62, 48
lm() Function:
This function creates the relationship model between the predictor and the response variable.
Syntax:
The basic syntax for the lm() function in linear regression is −
lm(formula, data)
Following is the description of the parameters used −
• formula is a symbol presenting the relation between x and y.
• data is the vector on which the formula will be applied.
A. Create Relationship Model & get the Coefficients
# Values of height
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)

# Values of weight
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.
relation <- lm(y~x)
print(relation)
B. Get the Summary of the Relationship
# Values of height
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
# Values of weight
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
# Apply the lm() function.
relation <- lm(y~x)
print(summary(relation))

Output:

predict() Function
Syntax
The basic syntax for predict() in linear regression is −
predict(object, newdata)

Following is the description of the parameters used −

• object is the model which has already been created using the lm() function.
• newdata is the data frame containing the new values for the predictor variable.
C. Predict the weight of new persons
# The predictor vector.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
# The response vector.
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
# Apply the lm() function.
relation <- lm(y~x)
# Find weight of a person with height 170.
a <- data.frame(x = 170)
result <- predict(relation,a)
print(result)

OUTPUT:

D. Visualize the Regression Graphically


# Create the predictor and response variables.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y~x)

# Give the chart file a name.
png(file = "linearregression.png")

# Plot the chart.
plot(y, x, col = "blue", main = "Height & Weight Regression",
     abline(lm(x~y)), cex = 1.3, pch = 16,
     xlab = "Weight in Kg", ylab = "Height in cm")

# Save the file.
dev.off()

# Plot again to display the chart in the R graphics window.
plot(y, x, col = "blue", main = "Height & Weight Regression",
     abline(lm(x~y)), cex = 1.3, pch = 16,
     xlab = "Weight in Kg", ylab = "Height in cm")
Practical 6

Perform the logistic regression on the given data warehouse data using R/Python.

To perform this practical, download the quality.csv file from the following link:

https://github.com/TarekDib03/Analytics/tree/master/Week3%20-%20Logistic%20Regression/Data

#provide path of file where it is saved on your machine


quality <- read.csv('C:/Users/Gauri/Downloads/quality.csv')
> #analysing the quality dataset
> str(quality)

'data.frame': 131 obs. of 14 variables:


$ MemberID : int 1 2 3 4 5 6 7 8 9 10 ...
$ InpatientDays : int 0 1 0 0 8 2 16 2 2 4 ...
$ ERVisits : int 0 1 0 1 2 0 1 0 1 2 ...
$ OfficeVisits : int 18 6 5 19 19 9 8 8 4 0 ...
$ Narcotics : int 1 1 3 0 3 2 1 0 3 2 ...
$ DaysSinceLastERVisit: num 731 411 731 158 449 ...
$ Pain : int 10 0 10 34 10 6 4 5 5 2 ...
$ TotalVisits : int 18 8 5 20 29 11 25 10 7 6 ...
$ ProviderCount : int 21 27 16 14 24 40 19 11 28 21 ...
$ MedicalClaims : int 93 19 27 59 51 53 40 28 20 17 ...
$ ClaimLines : int 222 115 148 242 204 156 261 87 98 66 ...
$ StartedOnCombination: logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ AcuteDrugGapSmall : int 0 1 5 0 0 4 0 0 0 0 ...
$ PoorCare : int 0 0 0 0 0 1 0 0 1 0 ...

> table(quality$PoorCare)
 0  1
98 33

> 98/131
[1] 0.7480916

> install.packages("caTools")

The caTools package provides a collection of tools for data analysis, including functions for splitting data, running moving averages, and performing various mathematical and statistical operations.

Installing package into ‘C:/Users/Gauri/Documents/R/win-library/3.5’


(as ‘lib’ is unspecified)
--- Please select a CRAN mirror for use in this session ---
also installing the dependency ‘bitops’

trying URL
'http://mirror.its.dal.ca/cran/bin/windows/contrib/3.5/bitops_1.0-6.zip'
Content type 'application/zip' length 38894 bytes (37 KB)
downloaded 37 KB
trying URL
'http://mirror.its.dal.ca/cran/bin/windows/contrib/3.5/caTools_1.17.1.1.zi
p'
Content type 'application/zip' length 329665 bytes (321 KB)
downloaded 321 KB
package ‘bitops’ successfully unpacked and MD5 sums checked
package ‘caTools’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\Gauri\AppData\Local\Temp\RtmpmUN9oK\downloaded_packages
> library(caTools)
Warning message:
package ‘caTools’ was built under R version 3.5.2
> set.seed(88)

The set.seed() function in R is used to make the random numbers that are generated reproducible.

> split = sample.split(quality$PoorCare, SplitRatio = 0.75)

sample.split() is used to split the data used during classification into train and test subsets, preserving the ratio of the outcome variable in both subsets.
>
> split
[1] TRUE TRUE TRUE TRUE FALSE TRUE FALSE TRUE FALSE FALSE TRUE
FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
TRUE FALSE TRUE TRUE
[28] TRUE TRUE FALSE FALSE FALSE FALSE TRUE TRUE TRUE FALSE
TRUE TRUE TRUE FALSE FALSE TRUE TRUE FALSE TRUE FALSE TRUE
FALSE TRUE TRUE FALSE FALSE TRUE
[55] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
FALSE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE
TRUE TRUE TRUE TRUE TRUE
[82] TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE
TRUE TRUE TRUE FALSE
[109] TRUE FALSE FALSE TRUE TRUE FALSE TRUE TRUE TRUE FALSE
TRUE TRUE FALSE TRUE TRUE FALSE TRUE TRUE FALSE TRUE TRUE TRUE
FALSE
> qualityTrain = subset(quality, split == TRUE)
> qualityTest = subset(quality, split == FALSE)
> nrow(qualityTrain)
[1] 99

The nrow() function in R is used to return the number of rows of a matrix or data frame.

> nrow(qualityTest)
[1] 32
> QualityLog = glm(PoorCare ~ OfficeVisits + Narcotics,data=qualityTrain,
family=binomial)

glm is used to fit generalized linear models, specified by giving a symbolic description of
the linear predictor and a description of the error distribution.

> summary(QualityLog)
Call:
glm(formula = PoorCare ~ OfficeVisits + Narcotics, family = binomial,
data = qualityTrain)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.06303 -0.63155 -0.50503 -0.09689 2.16686
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.64613 0.52357 -5.054 4.33e-07 ***
OfficeVisits 0.08212 0.03055 2.688 0.00718 **
Narcotics 0.07630 0.03205 2.381 0.01728 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 111.888 on 98 degrees of freedom
Residual deviance: 89.127 on 96 degrees of freedom
AIC: 95.127
Number of Fisher Scoring iterations: 4

> predictTrain = predict(QualityLog, type="response")


> summary(predictTrain)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.06623 0.11912 0.15967 0.25253 0.26765 0.98456
> tapply(predictTrain, qualityTrain$PoorCare, mean)
        0         1
0.1894512 0.4392246

tapply() is used to apply a function over subsets of a vector. It is primarily used when we
have the following circumstances: A dataset that can be broken up into groups

> table(qualityTrain$PoorCare, predictTrain > 0.5)

    FALSE TRUE
  0    70    4
  1    15   10

With a threshold of 0.5, the sensitivity (true positive rate: 10 of the 25 actual poor-care cases) and specificity (true negative rate: 70 of the 74 actual good-care cases) are:

> 10/25
[1] 0.4
> 70/74
[1] 0.9459459

Raising the threshold to 0.7 lowers the sensitivity and raises the specificity:

> table(qualityTrain$PoorCare, predictTrain > 0.7)

    FALSE TRUE
  0    73    1
  1    17    8
> 8/25
[1] 0.32
> 73/74
[1] 0.9864865

Lowering the threshold to 0.2 has the opposite effect:

> table(qualityTrain$PoorCare, predictTrain > 0.2)

    FALSE TRUE
  0    54   20
  1     9   16
> 16/25
[1] 0.64
> 54/74
[1] 0.7297297
> install.packages("ROCR")

ROCR is a flexible evaluation package for R, a statistical language that is widely used in biomedical data analysis.

Installing package into ‘C:/Users/Gauri/Documents/R/win-library/3.5’


(as ‘lib’ is unspecified)
also installing the dependencies ‘gtools’, ‘gdata’, ‘gplots’
trying URL
'http://mirror.its.dal.ca/cran/bin/windows/contrib/3.5/gtools_3.8.1.zip'
Content type 'application/zip' length 325812 bytes (318 KB)
downloaded 318 KB
trying URL
'http://mirror.its.dal.ca/cran/bin/windows/contrib/3.5/gdata_2.18.0.zip'
Content type 'application/zip' length 1260728 bytes (1.2 MB)
downloaded 1.2 MB
trying URL
'http://mirror.its.dal.ca/cran/bin/windows/contrib/3.5/gplots_3.0.1.1.zip'
Content type 'application/zip' length 656764 bytes (641 KB)
downloaded 641 KB
trying URL
'http://mirror.its.dal.ca/cran/bin/windows/contrib/3.5/ROCR_1.0-7.zip'
Content type 'application/zip' length 201823 bytes (197 KB)
downloaded 197 KB
package ‘gtools’ successfully unpacked and MD5 sums checked
package ‘gdata’ successfully unpacked and MD5 sums checked
package ‘gplots’ successfully unpacked and MD5 sums checked
package ‘ROCR’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\Gauri\AppData\Local\Temp\RtmpmUN9oK\downloaded_packages
> library(ROCR)
Loading required package: gplots
Attaching package: ‘gplots’
The following object is masked from ‘package:stats’:
lowess
Warning messages:
1: package ‘ROCR’ was built under R version 3.5.2
2: package ‘gplots’ was built under R version 3.5.2
> ROCRpred = prediction(predictTrain, qualityTrain$PoorCare)
> ROCRperf = performance(ROCRpred, "tpr", "fpr")
> plot(ROCRperf)
> plot(ROCRperf, colorize=TRUE)
> plot(ROCRperf, colorize=TRUE, print.cutoffs.at=seq(0,1,by=0.1),
text.adj=c(-0.2,1.7))
>
Practical 7.
Write a Python program to read data from a CSV file, perform simple data analysis, and
generate basic insights. (Use Pandas is a Python library).

Csv file link:

https://drive.google.com/file/d/1hrrLkdbAqXg0Y0K3M_6_MzcKrzPzuj6p/view?usp=sharing

import pandas as pd

# Step 1: Read the CSV file

df = pd.read_csv('data.csv')  # Ensure the file 'data.csv' is in the same directory

# Step 2: Basic Exploration of the Data

print("First 5 rows of the dataset:")

print(df.head()) # View the first 5 rows of the dataset

print("\nData Information (Types, Non-null counts, etc.):")

print(df.info()) # Basic info about the dataset (columns, data types, null values)

print("\nStatistical Summary of the dataset:")

print(df.describe()) # Summary statistics for numerical columns

# Step 3: Data Analysis

# 3.1: Check for missing values in each column

print("\nMissing values in each column:")

print(df.isnull().sum()) # Check how many missing values in each column

# 3.2: Calculate mean, median, and mode for the 'Sales' column

print("\nMean of 'Sales' column:")

print(df['Sales'].mean())  # Calculate the mean of 'Sales'

print("\nMedian of 'Sales' column:")

print(df['Sales'].median()) # Calculate the median of 'Sales'

print("\nMode of 'Sales' column:")

print(df['Sales'].mode()[0]) # Calculate the mode of 'Sales'

# Step 4: Insights

# 4.1: Find the row with the maximum sales

max_sales_row = df[df['Sales'] == df['Sales'].max()]

print("\nRow with maximum sales:")

print(max_sales_row)

# 4.2: Find the product with the highest sales

print("\nProduct with the highest sales:")

max_sales_product = df[df['Sales'] == df['Sales'].max()]['Product'].iloc[0]

print(max_sales_product)

# 4.3: Number of unique categories

print("\nUnique categories in the dataset:")

print(df['Category'].unique()) # Display unique values in the 'Category' column

Output:
Practical 8 a:

8. Perform data visualization

a. Perform data visualization using Python on any sales data.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Generate sample sales data


np.random.seed(42)
dates = pd.date_range(start="2024-01-01", end="2024-01-31", freq='D')
categories = ['Electronics', 'Clothing', 'Groceries']
regions = ['North', 'South', 'East', 'West']
data = {
'Date': np.random.choice(dates, 500),
'Product_Category': np.random.choice(categories, 500),
'Sales_Amount': np.random.randint(50, 500, 500),
'Region': np.random.choice(regions, 500),
}

sales_data = pd.DataFrame(data)

# Summarizing sales data


daily_sales = sales_data.groupby('Date')['Sales_Amount'].sum().reset_index()
category_sales = sales_data.groupby('Product_Category')['Sales_Amount'].sum().reset_index()
region_sales = sales_data.groupby('Region')['Sales_Amount'].sum().reset_index()

# 1. Sales trend over time


plt.figure(figsize=(12, 6))
sns.lineplot(data=daily_sales, x='Date', y='Sales_Amount', marker='o')
plt.title('Daily Sales Trend')
plt.xlabel('Date')
plt.ylabel('Total Sales Amount')
plt.xticks(rotation=45)
plt.grid()
plt.show()

# 2. Sales by product category


plt.figure(figsize=(8, 6))
sns.barplot(data=category_sales, x='Product_Category', y='Sales_Amount', palette='viridis')
plt.title('Sales by Product Category')
plt.xlabel('Product Category')
plt.ylabel('Total Sales Amount')
plt.show()

# 3. Sales distribution across regions


plt.figure(figsize=(8, 6))
sns.barplot(data=region_sales, x='Region', y='Sales_Amount', palette='coolwarm')
plt.title('Sales Distribution by Region')
plt.xlabel('Region')
plt.ylabel('Total Sales Amount')
plt.show()

# 4. Sales distribution heatmap


pivot_table = sales_data.pivot_table(values='Sales_Amount', index='Product_Category',
columns='Region', aggfunc='sum')
plt.figure(figsize=(8, 6))
sns.heatmap(pivot_table, annot=True, fmt=".0f", cmap="YlGnBu", linewidths=0.5)
plt.title('Sales Distribution Heatmap')
plt.show()
Output:
Practical 8 b:
Perform data visualization using Power BI on any sales data.

File link:

https://docs.google.com/spreadsheets/d/18_AdbAbYgaPRSLNbjwwjXiYPAVTH3jRv/edit?
usp=sharing&ouid=116572858177712191934&rtpof=true&sd=true

1. Starting window of Power BI:

2. Load the data from Get Data and select the Load option:
3. Sales shown by a pie chart, selecting the Country & Sales fields (a rough Python analogue is sketched after this list):
4. Adding a filter for clear output:
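
For readers working without Power BI, the country-wise sales pie chart from step 3 can be approximated in Python. This is only an illustrative sketch; the workbook file name and the Country and Sales column names are assumptions based on the step description.

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical local copy of the sales workbook linked above.
sales = pd.read_excel("sales_data.xlsx")   # assumed columns: Country, Sales

# Total sales per country, shown as a pie chart (mirroring the Power BI visual).
country_sales = sales.groupby("Country")["Sales"].sum()
country_sales.plot(kind="pie", autopct="%1.1f%%", ylabel="")
plt.title("Sales by Country")
plt.show()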
Practical 9.

Create the Data staging area for the selected database using SQL.
Step 1.
Click File -> New -> Project ->Business Intelligence Projects ->select Analysis Services
Project-> Assign Project Name -> Click OK.
Step 2. In Solution Explorer, Right click on Data Source -> Click New Data Source.

Step 3. Click on New.


Step 4. Creating New connection.

Step 5. Click on Test Connection and verify for its success.

Step 6. Select Connection created in Data Connections-> Click Next.


Step 7. Select Option Inherit.

Step 8. Assign Data Source Name -> Click Finish.


Step 9. In the Solution Explorer, Right Click on Data Source View -> Click on New Data
Source View.

Step 10. Click Next.


Step 11. Select Relational Data Source we have created previously (Sales_DW)-> Click
Next.

Step 12. First move your Fact Table to the right side to include in object list.
Step 13.Select Fact Table in Right Pane (Fact product Sales) -> Click On Add Related
Tables.

Step 14. Assign a name (SalesDW DSV) -> Click Finish.


Step 15. Now Data Source View is ready to use.

Step 16. In Solution Explorer -> Right Click on Cube-> Click New Cube
Step 17. Click Next.
Step 18. Select the option Use Existing Tables -> Click Next.
Step 19. Select Fact Table Name from Measure Group Tables (FactProductSales) -> Click
Next.

Step 20. Choose Measures from the List which you want to place in your Cube --> Click
Next.

Step 21. Select All Dimensions here which are associated with your Fact Table-> Click Next.
Step 22. Assign Cube Name (SalesDW2) -> Click Finish.

Step 23. Now your Cube is ready, you can see the newly created cube and dimensions
added in your solution explorer.
Step 24.In Solution Explorer, double click on dimension Dim Product -> Drag and Drop
Product Name from Table in Data Source View and Add in Attribute Pane at left side.

Step 25.Double click On Dim Date dimension -> Drag and Drop Fields from Table shown in
Data Source View to Attributes-> Drag and Drop attributes from leftmost pane of attributes to
middle pane of Hierarchy.
Step 26. In Solution Explorer, right click on Project Name (Analysis Services Project3) -- >
Click Properties.

Step 27. In Configuration Properties, select Deployment -> assign your SQL Server instance name where Analysis Services is installed (mubin-pc\fairy) (Machine Name\Instance Name) -> choose Deployment Mode "Deploy All" for now -> select Processing Option "Do Not Process".
Step 28. In Solution Explorer, right click on Project Name (AnalysisServicesProject) -- >

Click Deploy.

Step 29. Once Deployment will finish, you can see the message Deployment Completed in
deployment Properties.
Step 30. In Solution Explorer, right click on Project Name (AnalysisServicesProject3) -- >
Click Process.

Step 31. Click on Run button to process the Cube.


Step 32. Once processing is complete, you can see Status as Process Succeeded -->Click
Close to close both the open windows for processing one after the other.

Step 33. In Solution Explorer, right click on Cube Name (SalesDW2) -- > Click Browse.
Step 34. Drag and drop measures in to Detail fields, & Drag and Drop Dimension Attributes
in Row Field or Column fields.

Practical 10.

Create the cube with suitable dimension and fact tables based on ROLAP, MOLAP and
HOLAP model.

Step 1: Creating Data Warehouse


Let us execute our T-SQL Script to create data warehouse with fact tables,
dimensions and populate them with appropriate test values.
Download T-SQL script attached with this article for creation of Sales Data
Warehouse or download from this article “Create First Data Warehouse” and
run it in your SQL Server.
Downloading "Data_WareHouse__SQLScript.zip" from the article

https://www.codeproject.com/Articles/652108/Create-First-Data-WareHouse

After downloading extract file in folder.


Follow the given steps to run the query in SSMS (SQL Server Management
Studio).
1. Open SQL Server Management Studio 2012
2. Connect Database Engine
Password for sa : admin123 (as given during installation)
Click Connect.
3. Open New Query editor
4. Copy paste Scripts given below in various steps in new query editor
window one by one
5. To run the given SQL Script, press F5
6. It will create and populate “Sales_DW” database on your SQL Server
OR
1. Go to the extracted sql file and double click on it.
2. New Sql Query Editor will be opened containing Sales_DW Database.
3. Click on execute or press F5 by selecting query one by one or directly
click on Execute.
4. After completing execution save and close SQL Server Management
studio & Reopen to see Sales_DW in Databases Tab.

In Select Measure Group Tables → Select FactProductSales → Click Next

In Select Measures → check all measures → Next


In Select New Dimensions → Check all Dimensions → Next
Click on Finish
Sales_DW.cube is created
Step 5: Dimension Modification
In dimension tab → Double Click Dim Product.dim
Drag and Drop Product Name from Table in Data Source View and Add in
Attribute Pane at left side
Step 6: Creating Attribute Hierarchy in Date Dimension
Double click On Dim Date dimension -> Drag and Drop Fields from Table
shown in Data Source View to Attributes-> Drag and Drop attributes from
leftmost pane of attributes to middle pane of Hierarchy.
Drag fields in sequence from Attributes to Hierarchy window (Year, Quarter
Name, Month Name, Week of the Month, Full Date UK)
Step 7: Deploy the Cube
Right click on the project name → Properties
The properties window appears.
Make the following changes, then click Apply and OK.
Right click on the project name → Deploy
Deployment successful.
To process the cube, right click on Sales_DW.cube → Process.
Click Run.
Browse the cube for analysis in Solution Explorer.
