SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
Name: Dviti Mehta
Roll: C164
Semester: X
Class: D
Branch: Computer
Practical-2 Part A
Aim: Working with SAS Dataset and write Program to subset Report data and Sort data.
Section A
a. Create a new library SAS_XXX to access any dataset in SASHELP.
b. Create a new dataset Sales in your library. (Help file given)
Employee_ID : Unique identifier for each employee (Numeric)
First_Name : Employee first name (Character)
Last_Name : Employee last name (Character)
Gender : Gender of the employee (Character, M/F)
Salary : Employee's salary (Numeric)
Job_Title : The title of the employee's position (Character)
Country : Country code (Character)
Birth_Date : Employee's birth date (Numeric, represented as SAS date)
Hire_Date : The date the employee was hired (Numeric, represented as SAS date)
Section B: Filtering Data
1. Display the entire sales dataset.
2. Print records of employees who have a salary greater than 50000.
3. Print records of employees whose salary is between 25000 and 30000.
4. Print records of employees whose first name starts with "A".
5. Print records of employees hired after January 1, 2018.
6. Print records of employees whose birthdate is between January 1, 1970, and December
31, 1990.
Section C: Sorting and Title
7. Sort the sales dataset by Salary in descending order and print the results.
8. Sort the sales dataset first by Job_Title and then by Salary in ascending order.
9. Print the dataset with a title indicating it lists all employees.
10. Print the sales dataset with a title and a footnote.
11. Print the sales dataset and include a subtotal for each job title, showing the total salary
for each title.
Section D: Removing Duplicate data
Write SAS program to create dataset has shown below and performed following operation.
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
1. Sort the dataset class_test2 by removing duplicate records and create test_clean and
test_duplicate dataset.
2. Print all three datasets with appropriate title.
Objective:
This lab manual aims to help students understand how to manipulate and analyze data using SAS
Base, focusing on the PROC PRINT and PROC SORT procedures, as well as various operators for
data filtering and sorting.
Outcome:To create dataset and performed data analysis and examine SAS Datasets and
Library
Theory:
In SAS programming, data management is a crucial skill. This section covers various
powerful procedures that allow you to manipulate datasets effectively. We’ll discuss the
SORT procedure, the WHERE statement, removing duplicates, and using titles and
footnotes in your SAS output.
1. SORT Procedure
The SORT procedure (PROC SORT) is used to order the observations in a dataset based on
the values of specified variables. Sorting is an important step before performing certain
operations, such as merging datasets or preparing data for reporting.
Syntax
The basic syntax of PROC SORT is:
proc sort data=dataset_name out=sorted_dataset_name;
by variable(s);
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
run;
Example Code
/* Creating a sample dataset */
data work.students;
input StudentID Name $ Age Major $;
datalines;
3 John 21 Math
2 Alice 22 Science
1 Bob 20 Arts
4 Charlie 23 Math
;
run;
/* Sorting the dataset by Name */
proc sort data=work.students out=work.sorted_students;
by Name;
run;
/* Displaying the sorted dataset */
proc print data=work.sorted_students;
title 'Sorted Students Dataset by Name';
run;
2. WHERE Statement
The WHERE statement is used to filter observations in a dataset. It helps in selecting
specific rows based on conditions applied to the variables.
Syntax
The syntax for the WHERE statement is:
data new_dataset;
set original_dataset;
where condition;
run;
Example Code
/* Filtering dataset for students older than 21 */
data work.filtered_students;
set work.students;
where Age > 21;
run;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
/* Displaying the filtered dataset */
proc print data=work.filtered_students;
title 'Students Older than 21';
run;
3. Removing Duplicates with PROC SORT
You can also use PROC SORT to remove duplicate observations based on specific
variables. The NODUPLICATES option in the PROC SORT statement does this.
Example Code
/* Creating a dataset with duplicates */
data work.students_with_duplicates;
input StudentID Name $ Age Major $;
datalines;
1 John 21 Math
1 John 21 Math
2 Alice 22 Science
3 Bob 20 Arts
3 Bob 20 Arts
;
run;
/* Removing duplicates based on StudentID */
proc sort data=work.students_with_duplicates nodupkey
out=work.students_no_duplicates;
by StudentID;
run;
/* Displaying the dataset without duplicates */
proc print data=work.students_no_duplicates;
title 'Students Dataset without Duplicates';
run;
4. Title and Footnote
You can add titles and footnotes to your output to enhance clarity and context. Titles are
displayed at the top of the output, while footnotes appear at the bottom.
Syntax
title 'Your Title Here';
footnote 'Your Footnote Here';
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
Example Code
/* Adding a title and footnote */
title 'Summary of Students Data';
footnote 'Data reflects student demographics for 2023';
/* Displaying the sorted dataset */
proc print data=work.students_no_duplicates;
run;
/* Clearing the title and footnote */
title;
footnote;
Procedure:
1. Open SAS Studio and write the SAS program
Instructions:
1. Write source code of all procedure in your workbook
Part B
Questions & Answers:
proc print data=sales;
RUN;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
proc print data = sales;
where salary > 50000;
run;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
proc print data=SAS_XXX.Sales;
where Salary between 25000 and 30000;
run;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
proc print data=SAS_XXX.Sales;
where upcase(substr(First_Name, 1, 1)) = 'A';
run;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
proc print data=sales;
where Hire_Date > '01JAN2018'd;
run;
proc print data=sales;
where Birth_Date between '01JAN1970'd and '31DEC1990'd;
run;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
proc sort data=sales out=SortedSales;
by descending Salary;
run;
proc print data=SortedSales;
run;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
proc sort data=SAS_XXX.Sales out=SortedSales;
by Job_Title Salary;
run;
proc print data=SortedSales;
run;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
title 'List of All Employees';
proc print data=sales;
run;
title;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
title 'List of All Employees';
footnote 'Data sourced from the Sales dataset';
proc print data=sales;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
run;
title;
footnote;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
proc sort data=sales out=SortedSales;
by Job_Title;
run;
proc print data=SortedSales;
by Job_Title;
sum Salary;
title 'Employee Salary by Job Title';
run;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
data class_test2;
input Name $ Subject $ TestScore;
datalines;
Judy Math 97
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
Judy Reading 97
Barbara Math 91
Barbara Reading 86
Louise Math 92
Louise Reading 99
James Math 90
James Reading 85
;
RUN;
proc sort data=class_test2 out=test_clean nodupkey;
by Name Subject;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
RUN;
data test_duplicate;
set class_test2;
by Name Subject;
if first.Subject = 0;
RUN;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
title "Original Dataset: class_test2";
proc print data=class_test2;
run;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
title "Cleaned Dataset: test_clean (Duplicates Removed)";
proc print data=test_clean;
run;
title "Dataset with Duplicates: test_duplicate";
proc print data=test_duplicate;
run;
title;
SVKM’S NMIMS
MUKESH PATEL SCHOOL OF TECHNOLOGY MANAGEMENT&
ENGINEERING
Academic Year: 2024-2025
Conclusion:
Question:
1. What is difference between proc print and proc contents
2. How to represent missing values in SAS Dataset
3. List the options used in proc print and proc contents
4. Which statement is used to create library. Is it global statement.