Frequency Distributions and Cross Tabulations in SAS

Last updated on Sep 14 2022
Nitin Pawar


A frequency distribution is a table showing the frequency of the data points in a data set. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way, the table summarizes the distribution of values in the sample.

SAS provides a procedure called PROC FREQ to calculate the frequency distribution of data points in a data set.

Syntax

The basic syntax for calculating a frequency distribution in SAS is:

PROC FREQ DATA = Dataset;
TABLES Variable_1;
BY Variable_2;
RUN;

The parameters are as follows:

• Dataset is the name of the dataset.

• Variable_1 is the name of the variable (or variables) whose frequency distribution needs to be calculated.

• Variable_2 is the variable used to split the result into groups, with one frequency table per value. The BY statement is optional, and the dataset must already be sorted by this variable.
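Because the BY statement expects sorted input, a common pattern is to sort first and then run PROC FREQ. The following is a minimal sketch, assuming the SASHELP.CLASS sample dataset is available; WORK.CLASS_SORTED is a hypothetical name chosen here for the sorted copy.

```sas
/* Sort a copy of the sample data by the grouping variable */
proc sort data=sashelp.class out=work.class_sorted;
    by sex;
run;

/* One frequency table of age for each value of sex */
proc freq data=work.class_sorted;
    tables age;
    by sex;
run;
```

Without the PROC SORT step, SAS stops with an error if it finds the BY values out of order.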

Single Variable Frequency Distribution

We can determine the frequency distribution of a single variable by using PROC FREQ. In this case the result will show the frequency of each value of the variable. The result also shows the percentage distribution, cumulative frequency and cumulative percentage.
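To see how the four columns relate, here is a small sketch on toy data. The dataset WORK.COLORS and its values are made up for illustration and are not part of the CARS example.

```sas
/* Hypothetical toy data: six observations of one character variable */
data work.colors;
    input color $;
    datalines;
red
blue
red
green
red
blue
;
run;

/* One-way frequency table.
   Counts are blue=2, green=1, red=3 out of 6 observations,
   so the percentages are 33.33, 16.67 and 50.00, and the
   cumulative frequencies run 2, 3, 6. */
proc freq data=work.colors;
    tables color;
run;
```

The rows appear in sorted order of the values (blue, green, red), and each cumulative column is a running total of the column to its left.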

Example

In the example below we find the frequency distribution of the variable horsepower for the dataset named CARS1, which is created from SASHELP.CARS. Because of the BY statement, the result is split into two tables, one for each make of car.

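PROC SQL;
create table CARS1 as
SELECT make, model, type, invoice, horsepower, length, weight
FROM
SASHELP.CARS
WHERE make in ('Audi','BMW')
ORDER BY make /* the BY statement in PROC FREQ expects sorted input */
;
QUIT; /* PROC SQL is ended with QUIT, not RUN */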

proc FREQ data = CARS1 ;
tables horsepower;
by make;
run;

When the above code is executed, we get the following result −

 

[Image: PROC FREQ output for horsepower, one table per make]

Multiple Variable Frequency Distribution

We can request frequency distributions for several variables in one TABLES statement. Each variable listed gets its own separate one-way table; the variables are not combined with each other.

Example

In the example below we calculate the frequency distribution of the make of the cars and, separately, the frequency distribution of the car type.

proc FREQ data = CARS1 ;

tables make type;

run;

When the above code is executed, we get the following result −

[Image: PROC FREQ output with one table for make and one for type]

Frequency Distribution with Weight

With the WEIGHT statement we can calculate a weighted frequency distribution. The value of the weight variable is treated as the number of observations that each row represents, so the table shows the sum of the weights for each value instead of a count of rows.

Example

In the example below we calculate the frequency distribution of the variables make and type using horsepower as the weight, so each table shows total horsepower per make and per type rather than the number of cars.
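The WEIGHT statement is most natural when the data is already summarized, with one row per category and a count column. A minimal sketch on made-up data follows; WORK.CAR_COUNTS and the variable n are hypothetical names used only for this illustration.

```sas
/* Hypothetical pre-aggregated data: n is the number of cars each row stands for */
data work.car_counts;
    input make $ type $ n;
    datalines;
Audi Sedan 3
Audi Wagon 2
BMW Sedan 5
;
run;

/* With WEIGHT n the frequencies are sums of n:
   make: Audi=5, BMW=5      type: Sedan=8, Wagon=2
   Without the WEIGHT statement the tables would count rows instead:
   make: Audi=2, BMW=1      type: Sedan=2, Wagon=1 */
proc freq data=work.car_counts;
    tables make type;
    weight n;
run;
```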

proc FREQ data = CARS1 ;

tables make type;

weight horsepower;

run;

When the above code is executed, we get the following result −

[Image: PROC FREQ output for make and type, weighted by horsepower]

SAS – Cross Tabulations

Cross tabulation involves producing cross tables, also called contingency tables, from all possible combinations of the values of two or more variables. In SAS they are created using PROC FREQ with the TABLES statement, joining the variables with an asterisk. For example, if we need the frequency of each car type for each make, we request make*type in the TABLES statement.

Syntax

The basic syntax for applying cross tabulation in SAS is:

PROC FREQ DATA = dataset;
TABLES Variable_1*Variable_2;
RUN;

The parameters are as follows:

• dataset is the name of the dataset.

• Variable_1 and Variable_2 are the names of the variables to be cross tabulated. The values of Variable_1 form the rows of the table and the values of Variable_2 form the columns.
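To make the layout concrete, here is a small sketch on toy data. WORK.TOY_CARS and its four rows are invented for this illustration.

```sas
/* Hypothetical toy data: one row per car */
data work.toy_cars;
    input make $ type $;
    datalines;
Audi Sedan
Audi Sedan
Audi Wagon
BMW Sedan
;
run;

/* Two-way table: make values form the rows, type values form the columns.
   Cell counts: Audi/Sedan=2, Audi/Wagon=1, BMW/Sedan=1, BMW/Wagon=0.
   Row totals: Audi=3, BMW=1.  Column totals: Sedan=3, Wagon=1.  Total=4. */
proc freq data=work.toy_cars;
    tables make*type;
run;
```

By default each cell also lists the overall percentage, the row percentage and the column percentage below the count.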

Example

Consider the case of finding how many cars of each type are available under each brand in the dataset CARS1, which is created from SASHELP.CARS as shown below. Here we need the individual frequency values as well as their totals across the makes and across the types. We can observe that the result shows the makes along the rows and the types along the columns, with totals in the margins.

PROC SQL;
create table CARS1 as
SELECT make, type, invoice, horsepower, length, weight
FROM
SASHELP.CARS
WHERE make in ('Audi','BMW')
;
QUIT; /* PROC SQL is ended with QUIT, not RUN */

proc FREQ data = CARS1;

tables make*type;

run;

When the above code is executed, we get the following result −

[Image: PROC FREQ cross table of make by type]

Cross tabulation of 3 Variables

When we have three variables, we can group two of them in parentheses and cross tabulate each of the two with the third variable. The result then contains two cross tables.

Example

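In the example below we find the frequency of each type of car and each model of car with respect to the make of the car. The code reads a dataset named CARS2, which must contain the variables make, type and model. The NOCOL, NOROW and NOPERCENT options suppress the column, row and overall percentages, so each cell shows only the count.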
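The dataset CARS2 is not created in any of the earlier steps. One way to build it, assuming it is meant to be the same Audi and BMW subset of SASHELP.CARS as CARS1 with the model column included, is sketched here; the column list is an assumption based on the variables the later examples use.

```sas
/* Assumed definition of CARS2: the Audi/BMW subset of SASHELP.CARS,
   keeping model, length and horsepower for the examples that follow */
proc sql;
    create table CARS2 as
    select make, model, type, invoice, horsepower, length, weight
    from SASHELP.CARS
    where make in ('Audi','BMW');
quit;
```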

proc FREQ data = CARS2 ;

tables make * (type model) / nocol norow nopercent;

run;

When the above code is executed, we get the following result −

[Image: PROC FREQ cross tables of make by type and make by model, counts only]

Cross tabulation of 4 Variables

With four variables arranged as two groups of two, the number of pairings rises to four. Each variable in the first group is paired with each variable in the second group, giving four cross tables.

Example

In the example below we find the frequency of each car length for each make and each model, and similarly the frequency of each horsepower value for each make and each model.
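The grouped request is shorthand for listing each pairing separately. The sketch below runs both forms against SASHELP.CARS directly, using a WHERE statement for the Audi and BMW subset so that it does not depend on any dataset created earlier.

```sas
/* Grouped form: every variable on the left is paired with every variable on the right */
proc freq data=sashelp.cars;
    where make in ('Audi','BMW');
    tables (make model) * (length horsepower) / nocol norow nopercent;
run;

/* Written out: the same four two-way tables requested one by one */
proc freq data=sashelp.cars;
    where make in ('Audi','BMW');
    tables make*length make*horsepower model*length model*horsepower
           / nocol norow nopercent;
run;
```

The grouped form is easier to maintain when the lists grow, since adding one variable to either group adds a whole set of tables.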

proc FREQ data = CARS2 ;

tables (make model) * (length horsepower) / nocol norow nopercent;

run;

When the above code is executed, we get the following result −

[Image: PROC FREQ cross tables of make and model by length and horsepower, counts only]

This brings us to the end of the blog. This Tecklearn 'Frequency Distributions and Cross Tabulations in SAS' blog covers questions that are commonly asked of candidates looking for a job in SAS. If you wish to learn SAS and build a career in the data analytics domain, check out our interactive SAS Training for SAS BASE Certification, which comes with 24*7 support to guide you throughout your learning period. Please find the link for course details:

SAS Training for SAS BASE Certification

