0% found this document useful (0 votes)
9 views21 pages

c164 Biva Exp2

The document outlines a practical assignment for students at SVKM's NMIMS, focusing on data manipulation using SAS. It includes tasks for creating and filtering datasets, sorting data, removing duplicates, and using titles and footnotes in output. The objective is to enhance students' understanding of SAS procedures like PROC PRINT and PROC SORT for effective data analysis.

Uploaded by

dviti mehta
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
9 views21 pages

c164 Biva Exp2

The document outlines a practical assignment for students at SVKM's NMIMS, focusing on data manipulation using SAS. It includes tasks for creating and filtering datasets, sorting data, removing duplicates, and using titles and footnotes in output. The objective is to enhance students' understanding of SAS procedures like PROC PRINT and PROC SORT for effective data analysis.

Uploaded by

dviti mehta
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 21

SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025


Name: Dviti Mehta
Roll: C164
Semester: X
Class: D
Branch: Computer
Practical-2 Part A
Aim: Working with SAS Dataset and write Program to subset Report data and Sort data.

Section A
a. Create a new library SAS_XXX to access any dataset in SASHELP.
b. Create a new dataset Sales in your library. (Help file given)
 Employee_ID : Unique identifier for each employee (Numeric)
 First_Name : Employee first name (Character)
 Last_Name : Employee last name (Character)
 Gender : Gender of the employee (Character, M/F)
 Salary : Employee's salary (Numeric)
 Job_Title : The title of the employee's position (Character)
 Country : Country code (Character)
 Birth_Date : Employee's birth date (Numeric, represented as SAS date)
 Hire_Date : The date the employee was hired (Numeric, represented as SAS date)

Section B: Filtering Data


1. Display the entire sales dataset.
2. Print records of employees who have a salary greater than 50000.
3. Print records of employees whose salary is between 25000 and 30000.
4. Print records of employees whose first name starts with "A".
5. Print records of employees hired after January 1, 2018.
6. Print records of employees whose birthdate is between January 1, 1970, and December
31, 1990.

Section C: Sorting and Title


7. Sort the sales dataset by Salary in descending order and print the results.
8. Sort the sales dataset first by Job_Title and then by Salary in ascending order.
9. Print the dataset with a title indicating it lists all employees.
10. Print the sales dataset with a title and a footnote.
11. Print the sales dataset and include a subtotal for each job title, showing the total salary
for each title.

Section D: Removing Duplicate data


Write SAS program to create dataset has shown below and performed following operation.
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

1. Sort the dataset class_test2 by removing duplicate records and create test_clean and
test_duplicate dataset.
2. Print all three datasets with appropriate title.

Objective:
This lab manual aims to help students understand how to manipulate and analyze data using SAS
Base, focusing on the PROC PRINT and PROC SORT procedures, as well as various operators for
data filtering and sorting.

Outcome:To create dataset and performed data analysis and examine SAS Datasets and
Library

Theory:

In SAS programming, data management is a crucial skill. This section covers various
powerful procedures that allow you to manipulate datasets effectively. We’ll discuss the
SORT procedure, the WHERE statement, removing duplicates, and using titles and
footnotes in your SAS output.

1. SORT Procedure
The SORT procedure (PROC SORT) is used to order the observations in a dataset based on
the values of specified variables. Sorting is an important step before performing certain
operations, such as merging datasets or preparing data for reporting.
Syntax
The basic syntax of PROC SORT is:
proc sort data=dataset_name out=sorted_dataset_name;
by variable(s);
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025


run;
Example Code
/* Creating a sample dataset */
data work.students;
input StudentID Name $ Age Major $;
datalines;
3 John 21 Math
2 Alice 22 Science
1 Bob 20 Arts
4 Charlie 23 Math
;
run;

/* Sorting the dataset by Name */


proc sort data=work.students out=work.sorted_students;
by Name;
run;

/* Displaying the sorted dataset */


proc print data=work.sorted_students;
title 'Sorted Students Dataset by Name';
run;

2. WHERE Statement
The WHERE statement is used to filter observations in a dataset. It helps in selecting
specific rows based on conditions applied to the variables.
Syntax
The syntax for the WHERE statement is:
data new_dataset;
set original_dataset;
where condition;
run;
Example Code
/* Filtering dataset for students older than 21 */
data work.filtered_students;
set work.students;
where Age > 21;
run;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025


/* Displaying the filtered dataset */
proc print data=work.filtered_students;
title 'Students Older than 21';
run;

3. Removing Duplicates with PROC SORT


You can also use PROC SORT to remove duplicate observations based on specific
variables. The NODUPLICATES option in the PROC SORT statement does this.
Example Code
/* Creating a dataset with duplicates */
data work.students_with_duplicates;
input StudentID Name $ Age Major $;
datalines;
1 John 21 Math
1 John 21 Math
2 Alice 22 Science
3 Bob 20 Arts
3 Bob 20 Arts
;
run;
/* Removing duplicates based on StudentID */
proc sort data=work.students_with_duplicates nodupkey
out=work.students_no_duplicates;
by StudentID;
run;

/* Displaying the dataset without duplicates */


proc print data=work.students_no_duplicates;
title 'Students Dataset without Duplicates';
run;

4. Title and Footnote


You can add titles and footnotes to your output to enhance clarity and context. Titles are
displayed at the top of the output, while footnotes appear at the bottom.
Syntax
title 'Your Title Here';
footnote 'Your Footnote Here';
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

Example Code
/* Adding a title and footnote */
title 'Summary of Students Data';
footnote 'Data reflects student demographics for 2023';

/* Displaying the sorted dataset */


proc print data=work.students_no_duplicates;
run;

/* Clearing the title and footnote */


title;
footnote;
Procedure:
1. Open SAS Studio and write the SAS program
Instructions:
1. Write source code of all procedure in your workbook

Part B
Questions & Answers:

proc print data=sales;


RUN;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

proc print data = sales;


where salary > 50000;
run;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

proc print data=SAS_XXX.Sales;


where Salary between 25000 and 30000;
run;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

proc print data=SAS_XXX.Sales;


where upcase(substr(First_Name, 1, 1)) = 'A';
run;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

proc print data=sales;


where Hire_Date > '01JAN2018'd;
run;

proc print data=sales;


where Birth_Date between '01JAN1970'd and '31DEC1990'd;
run;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

proc sort data=sales out=SortedSales;


by descending Salary;
run;

proc print data=SortedSales;


run;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

proc sort data=SAS_XXX.Sales out=SortedSales;


by Job_Title Salary;
run;

proc print data=SortedSales;


run;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

title 'List of All Employees';


proc print data=sales;
run;
title;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

title 'List of All Employees';


footnote 'Data sourced from the Sales dataset';
proc print data=sales;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025


run;
title;
footnote;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

proc sort data=sales out=SortedSales;


by Job_Title;
run;

proc print data=SortedSales;


by Job_Title;
sum Salary;
title 'Employee Salary by Job Title';
run;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

data class_test2;
input Name $ Subject $ TestScore;
datalines;
Judy Math 97
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025


Judy Reading 97
Barbara Math 91
Barbara Reading 86
Louise Math 92
Louise Reading 99
James Math 90
James Reading 85
;
RUN;

proc sort data=class_test2 out=test_clean nodupkey;


by Name Subject;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025


RUN;

data test_duplicate;
set class_test2;
by Name Subject;
if first.Subject = 0;
RUN;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

title "Original Dataset: class_test2";


proc print data=class_test2;
run;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

title "Cleaned Dataset: test_clean (Duplicates Removed)";


proc print data=test_clean;
run;

title "Dataset with Duplicates: test_duplicate";


proc print data=test_duplicate;
run;

title;
SVKM’S NMIMS

MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&


ENGINEERING

Academic Year: 2024-2025

Conclusion:

Question:
1. What is difference between proc print and proc contents
2. How to represent missing values in SAS Dataset
3. List the options used in proc print and proc contents
4. Which statement is used to create library. Is it global statement.

You might also like