Lesson 4: Assignment Statements and Numeric Functions
Once you've read your data into a SAS data set, surely you want to do something with it. A common thing to do is to change the original data in some way in an attempt to answer a research question of interest to you. You can change the data in one of two ways:
- You can use a basic assignment statement in which you add some information to all of the observations in the data set. Some assignment statements may take advantage of the numerous SAS functions that are available to make programming certain calculations easier (e.g., taking an average).
- Alternatively, you can use an if-then-else statement to add some information to some but not all of the observations.
In this lesson, we will learn how to use assignment statements and numeric SAS functions to change your data. In the next lesson, we will learn how to use if-then-else statements to change a subset of your data.
Modifying your data may involve not only changing the values of a particular variable, but also the type of the variable. That is, you might need to change a character variable to a numeric variable. For that reason, we'll investigate how to use the INPUT function to convert character data values to numeric values. (We'll learn how to use the PUT function to convert numeric values to character values in Stat 481 when we study character functions in depth.)
Upon completing this lesson, you should be able to do the following:
- write a basic assignment statement involving a numeric variable
- write an assignment statement that involves an arithmetic calculation
- write an assignment statement that utilizes one of the many numeric SAS functions that are available
- know how SAS handles missing values for various arithmetic calculations and functions
- write an assignment statement involving nested functions
- write a basic assignment statement involving a character variable
- use the INPUT function to convert character data values to numeric values
4.1 - Assignment Statement Basics
The fundamental method of modifying the data in a data set is by way of a basic assignment statement. Such a statement always takes the form:
variable = expression;
where variable is any valid SAS name and expression is the calculation that is necessary to give the variable its values. The variable must always appear to the left of the equal sign and the expression must always appear to the right of the equal sign. As always, the statement must end with a semicolon (;).
Because assignment statements involve changing the values of variables, in the process of learning about assignment statements we'll get practice with working with both numeric and character variables. We'll also learn how using numeric SAS functions can help to simplify some of our calculations.
Throughout this lesson, we'll work on modifying various aspects of the temporary data set grades that is created in the following DATA step:
The data set contains student names (name), each of their four exam grades (e1, e2, e3, e4), their project grade (p1), and their final exam grade (f1).
A couple of comments. For the sake of the examples that follow, we'll use the DATALINES statement to read in the data. We could have just as easily used the INFILE statement. Additionally, for the sake of ease, we'll create temporary data sets rather than permanent ones. Finally, after each SAS DATA step, we'll use the PRINT procedure to print all or part of the resulting SAS data set for your perusal.
The following SAS program illustrates a very simple assignment statement in which SAS adds up the four exam scores of each student and stores the result in a new numeric variable called examtotal.
Note that, as previously described, the new variable name examtotal appears to the left of the equal sign, while the expression that adds up the four exam scores (e1 + e2 + e3 + e4) appears to the right of the equal sign.
Launch and run the SAS program. Review the output from the PRINT procedure to convince yourself that the new numeric variable examtotal is indeed the sum of the four exam scores for each student appearing in the data set. Also note what SAS does when it is asked to calculate something when some of the data are missing. Rather than add up the three exam scores that do exist for John Simon, SAS instead assigns a missing value to his examtotal. If you think about it, that's a good thing! Otherwise, you'd have no way of knowing that his examtotal differed in some fundamental way from that of the other students. The important lesson here is to always be aware of how SAS is going to handle the missing values in your data set when you perform various calculations!
In the previous example, the assignment statement created a new variable in the data set by simply using a variable name that didn't already exist in the data set. You need not always use a new variable name. Instead, you could modify the values of a variable that already exists. The following SAS program illustrates how the instructor would modify the existing variable e2 if, for example, she wanted to add 8 points to each student's grade on the second exam:
Note again that the name of the variable being modified (e2) appears to the left of the equal sign, while the arithmetic expression that tells SAS to add 8 to the second exam score (e2 + 8) appears to the right of the equal sign. In general, when a variable name appears on both sides of the equal sign, the original value on the right side is used to evaluate the expression. The result of the expression is then assigned to the variable on the left side of the equal sign.
Launch and run the SAS program. Review the output from the PRINT procedure to convince yourself that the values of the numeric variable e2 are indeed eight points higher than the values in the original data set.
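The "evaluate the right side with the old value, then store the result" behavior is not peculiar to SAS. As a rough illustration, here is the same idea in Python (not SAS), using a single hypothetical second-exam score of 82 rather than the lesson's data set:

```python
# Hypothetical second-exam score; the name e2 simply mirrors the SAS variable.
e2 = 82

# The right-hand side is evaluated first using the current value of e2 (82).
# The result (90) is then stored back into e2, replacing the old value.
e2 = e2 + 8

print(e2)  # 90
```

In SAS the assignment statement is applied to every observation as the DATA step runs, whereas this sketch handles just one value.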
4.2 - Arithmetic Calculations Using Arithmetic Operators
All we've done so far is add variables together. Of course, we could also subtract, multiply, divide or exponentiate variables. We just have to make sure that we use the symbols that SAS recognizes. They are:
- addition: +
- subtraction: -
- multiplication: *
- division: /
- exponentiation: **
As is the case in other programming languages, you can perform more than one operation in an assignment statement. The operations are performed as they are for any mathematical expression, namely:
- exponentiation is performed first, then multiplication and division, and finally addition and subtraction
- if multiple instances of addition, multiple instances of subtraction, or addition and subtraction appear together in the same expression, the operations are performed from left to right
- if multiple instances of multiplication, multiple instances of division, or multiplication and division appear together in the same expression, the operations are performed from left to right
- if multiple instances of exponentiation occur in the same expression, the operations are performed right to left
- operations in parentheses are performed first
It's that last bullet that I think is the most helpful to know. If you use parentheses to specifically tell SAS what you want calculated first, then you needn't worry as much about the other rules. Let's take a look at two examples.
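If you'd like to check the rules on a few small numbers, here is a minimal sketch in Python (not SAS). Python happens to use the same five symbols with the same precedence and the same left-to-right or right-to-left direction, so the results match what the rules above predict. The numbers are made up purely for illustration:

```python
# Exponentiation first, then multiplication, then addition.
a = 2 + 3 * 4 ** 2      # 4**2 = 16, then 3*16 = 48, then 2+48 = 50

# Subtraction is performed from left to right.
b = 10 - 4 - 3          # (10 - 4) - 3 = 3

# Division is performed from left to right.
c = 24 / 4 / 2          # (24 / 4) / 2 = 3.0

# Exponentiation is performed from right to left.
d = 2 ** 3 ** 2         # 2 ** (3 ** 2) = 2 ** 9 = 512

# Parentheses are performed first and override the default order.
e = (2 + 3) * 4         # 5 * 4 = 20

print(a, b, c, d, e)    # 50 3 3.0 512 20
```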
The following example contains a calculation that illustrates the standard order of operations. Suppose a statistics instructor calculates the final grade by weighting the average exam score by 0.6, the project score by 0.2, and the final exam by 0.2. The following SAS program illustrates how the instructor (incorrectly) calculates the students' final grades:
Well, okay, so the instructor should stick to statistics and not mathematics. As you can see in the assignment statement, the instructor is attempting to tell SAS to average the four exam scores by adding them up and dividing by 4, and then multiplying the result by 0.6. Let's see what SAS does instead. Launch and run the SAS program, and review the output to see if you can figure out what SAS did, say, for the first student Alexander Smith. If you're still not sure, review the rules for the order of the operations again. The rules tell us that SAS first:
- takes Alexander's first exam score 78 and multiplies it by 0.6 to get 46.8
- takes Alexander's fourth exam score 69 and divides it by 4 to get 17.25
- takes Alexander's project score 97 and multiplies it by 0.2 to get 19.4
- takes Alexander's final exam score 80 and multiplies it by 0.2 to get 16.0
Then, SAS performs all of the addition:
46.8 + 82 + 86 + 17.25 + 19.4 + 16.0
to get his final score of 267.45. Now, maybe that's a final score that Alexander wants, but it is still fundamentally wrong. Let's see if we can help set the statistics instructor straight by taking advantage of that last rule that says operations in parentheses are performed first.
The following example revisits the same calculation and shows how parentheses override the standard order of operations. Again, the statistics instructor calculates the final grade by weighting the average exam score by 0.6, the project score by 0.2, and the final exam by 0.2. The following SAS program illustrates how the instructor (correctly) calculates the students' final grades:
Let's dissect the calculation of Alexander's final score again. The assignment statement for final tells SAS:
- to first add Alexander's four exam scores (78, 82, 86, 69) to get 315
- and then divide that total 315 by 4 to get an average exam score of 78.75
- and then multiply the average exam score 78.75 by 0.6 to get 47.25
- and then take Alexander's project score 97 and multiply it by 0.2 to get 19.4
- and then take Alexander's final exam score 80 and multiply it by 0.2 to get 16.0
Then, SAS performs the addition of the last three items:
47.25 + 19.4 + 16.0
to get his final score of 82.65. There, that sounds much better. Sorry, Alexander.
Launch and run the SAS program to see how we did. Review the output from the PRINT procedure to convince yourself that the final grades have been calculated as the instructor wishes. By the way, note again that SAS assigns a missing value to the final grade for John Simon.
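You can double-check both of Alexander's totals outside of SAS. The sketch below is Python (not SAS), and the two expressions are reconstructed from the steps described in the text rather than copied from the instructor's programs, so treat them as an assumption about what those programs contained:

```python
# Alexander Smith's scores as quoted in the lesson.
e1, e2, e3, e4 = 78, 82, 86, 69
p1, f1 = 97, 80

# Without parentheses: only e1 is multiplied by 0.6 and only e4 is divided by 4.
wrong = 0.6 * e1 + e2 + e3 + e4 / 4 + 0.2 * p1 + 0.2 * f1

# With parentheses: the four exams are summed and averaged before weighting.
right = 0.6 * ((e1 + e2 + e3 + e4) / 4) + 0.2 * p1 + 0.2 * f1

# Round for display, since floating-point sums can carry tiny errors.
print(round(wrong, 2))  # 267.45
print(round(right, 2))  # 82.65
```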
In this last example, we calculated the students' average exam scores by adding up their four exam grades and dividing by 4. We could have instead taken advantage of one of the many numeric functions that are available in SAS, namely that of the MEAN function.
4.3 - Numeric Functions
Just as is the case for other programming languages, such as C++ or S-Plus, a SAS function is a pre-programmed routine that returns a value computed from one or more arguments. The standard form of any SAS function is:
functionname(argument1, argument2, ...)
For example, if we want to add three variables, a, b and c, using the SAS function SUM and assign the resulting value to a variable named d, the correct form of our assignment statement is:
d = sum(a, b, c);
In this case, sum is the name of the function, d is the target variable to which the result of the SUM function is assigned, and a, b, and c are the function's arguments. Some functions require a specific number of arguments, whereas other functions, such as SUM, can contain any number of arguments. Some functions require no arguments. As you'll see in the examples that follow, the arguments can be variable names, constants, or even expressions.
SAS offers arithmetic, financial, statistical and probability functions. There are far too many of these functions to explore them all in detail, but let's take a look at some examples.
In the previous example, we calculated students' average exam scores by adding up their four exam grades and dividing by 4. Alternatively, we could use the MEAN function. The following SAS program illustrates the calculation of the average exam scores in two ways — by definition and by using the MEAN function:
Launch and run the SAS program. Review the output from the PRINT procedure to convince yourself that the two methods of calculating the average exam scores do indeed yield the same results:
Oooops! What happened? SAS reports that the average exam score for John Simon is 82 when the average is calculated using the MEAN function, but reports a missing value when the average is calculated using the definition. If you study the results, you'll soon figure out that when calculating an average using the MEAN function, SAS ignores the missing values and goes ahead and calculates the average based on the available values.
We can't really make some all-conclusive statement about which method is more appropriate, as it really depends on the situation and the intent of the programmer. Instead, the (very) important lesson here is to know how missing values are handled for the various methods that are available in SAS! We can't possibly address all of the possible calculations and functions in this course. So ... you would be wise to always check your calculations out on a few representative observations to make sure that your SAS programming is doing exactly as you intended. This is another one of those good programming practices to jot down.
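To make the two behaviors concrete, here is a small emulation in Python (not SAS). The value None stands in for a SAS missing value, the function names are hypothetical, and John Simon's individual scores are made up so that the three available scores average 82, since his actual scores are not listed here:

```python
def mean_by_definition(scores):
    """Add and divide: any missing score (None) makes the result missing."""
    if any(s is None for s in scores):
        return None
    return sum(scores) / len(scores)


def mean_ignoring_missing(scores):
    """Average only the scores that are present."""
    present = [s for s in scores if s is not None]
    if not present:
        return None  # nothing to average
    return sum(present) / len(present)


# Hypothetical scores with the third exam missing.
john = [88, 72, None, 86]

print(mean_by_definition(john))     # None
print(mean_ignoring_missing(john))  # 82.0
```

Running both versions on an observation that you know has a missing value is exactly the kind of spot check recommended above.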
Although you can refer to SAS Help and Documentation (under "functions, by category") for a full accounting of the built-in numeric functions that are available in SAS, here is a list of just some of the numeric functions that can be helpful when performing statistical analyses:
I have used the INT function a number of times when dealing with numbers whose first few digits contain some additional information that I need. For example, the area code in this part of Pennsylvania is 814. If I have phone numbers that are stored as numbers, say, as 8142341230, then I can use the INT function to extract the area code from the number. Let's take a look at an example of this use of the INT function.
The following SAS program uses the INT function to extract the area codes from a set of ten-digit telephone numbers:
In short, the INT function returns the integer part of the expression contained within parentheses. So, if the phone number is 8145562314, then int(phone/10000000) becomes int(814.5562314) which becomes, as claimed, the area code 814. Now, launch and run the SAS program, and review the output from the PRINT procedure to convince yourself that the area codes are calculated as claimed.
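The same arithmetic can be checked in Python (not SAS). Here math.trunc plays the role of the INT function by dropping the fractional part, and the variable names are only illustrative:

```python
import math

phone = 8145562314

# Dividing by 10,000,000 shifts the last seven digits behind the decimal point.
shifted = phone / 10000000          # 814.5562314

# Dropping the fractional part leaves the area code.
area_code = math.trunc(shifted)     # 814

# For positive whole numbers, integer division gives the same answer in one step.
area_code_alt = phone // 10000000   # 814

print(area_code, area_code_alt)     # 814 814
```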
One really cool thing is that you can nest functions in SAS (as you can in most programming languages). That is, you can compute a function within another function. When you nest functions, SAS works from the inside out. That is, SAS performs the action in the innermost function first. It uses the result of that function as the argument of the next function, and so on. You can nest any function as long as the function that is used as the argument meets the requirements for the argument.
The following SAS program illustrates nested functions when it rounds the students' exam average to the nearest unit:
For example, the average of Alexander's four exams is 78.75 (the sum of 78, 82, 86, and 69 all divided by 4). Thus, in calculating avg for Alexander, 78.75 becomes the argument for the ROUND function. That is, 78.75 is rounded to the nearest one unit to get 79. Launch and run the SAS program, and review the output from the PRINT procedure to convince yourself that the exam averages avg are rounded as claimed.
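Nesting works the same way in most languages. As a sketch in Python (not SAS), the inner call to mean is evaluated first and its result becomes the argument of round:

```python
from statistics import mean

# Alexander's four exam scores as quoted in the lesson.
exams = [78, 82, 86, 69]

# Inside out: mean(exams) gives 78.75, and round(78.75) gives 79.
avg = round(mean(exams))

print(avg)  # 79
```

One caution if you experiment further: Python's built-in round sends exact ties such as 2.5 to the nearest even integer. The value 78.75 is not a tie, so that detail does not affect this example.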
4.4 - Assigning Character Variables
So far, all of our examples have pertained to numeric variables. Now, let's take a look at adding a new character variable to your data set or modifying an existing character variable in your data set. In the previous lessons, we learned how to read the values of a character variable by putting a dollar sign ($) after the variable's name in the INPUT statement. Now, you can update a character variable (or create a new character variable!) by specifying the variable's values in an assignment statement.
When creating a new character variable in a data set, most often you will want to assign the values based on certain conditions. For example, suppose an instructor wants to create a character variable called status which indicates whether a student "passed" or "failed" based on their overall final grade. A grade below 65, say, might be considered a failing grade, while a grade of 65 or higher might be considered a passing grade. In this case, we would need to make use of an if-then-else statement. We'll learn more about this kind of statement in the next lesson, but you'll get the basic idea here. The following SAS program illustrates the creation of a new character variable called status using an assignment statement in conjunction with an if-then-else statement:
Launch and run the SAS program. Review the output from the PRINT procedure to convince yourself that the values of the character variable status have been assigned correctly. As you can see, to specify a character variable's value using an assignment statement, you enclose the value in quotes. Some comments:
- You can use either single quotes or double quotes. Change the single quotes in the above program to double quotes, and re-run the SAS program to convince yourself that the character values are similarly assigned.
- If you forget to specify the closing quote, it is typically a show-stopper as SAS continues to scan the program looking for the closing quote. Delete the closing quote in the above program, and re-run the SAS program to convince yourself that the program fails to accomplish what is intended. Check your log window to see what kind of a warning statement is generated.
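The pass/fail rule itself is easy to sketch. The version below is Python (not SAS), the function name is hypothetical, and the grades passed to it are made up apart from Alexander's 82.65. It uses double quotes for one value and single quotes for the other to show that, as in SAS, either kind works:

```python
def status(final):
    """Return 'passed' for a final grade of 65 or higher, otherwise 'failed'."""
    if final >= 65:
        return "passed"   # double quotes
    else:
        return 'failed'   # single quotes work equally well


for grade in (82.65, 65, 64.9):
    print(grade, status(grade))
```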
4.5 - Converting Data
Suppose you are asked to calculate sales income using the price and the number of units sold. Pretty straightforward, eh? As long as price and units are stored in your data set as numeric variables, then you could just use the assignment statement:
It may be the case, however, that price and units are instead stored as character variables. Then, you can imagine it being a little odd trying to multiply price by units . In that case, the character variables price and units first need to be converted to numeric variables price and units . How SAS helps us do that is the subject of this section. To be specific, we'll learn how the INPUT function converts character values to numeric values.
The reality though is that SAS is a pretty smart application. If you try to do something to a character variable that should only be done to a numeric variable, SAS automatically tries first to convert the character variable to a numeric variable for you. The problem with taking this lazy person's approach is that it doesn't always work the way you'd hoped. That's why, by the end of our discussion, you'll appreciate that the moral of the story is that it is always best for you to perform the conversions yourself using the INPUT function.
The following SAS program illustrates how SAS tries to perform an automatic character-to-numeric conversion of standtest and e1, e2, e3, and e4 so that arithmetic operations can be performed on them:
Okay, first note that for some crazy reason all of the data in the data set have been read in as character data. That is, even the exam scores (e1, e2, e3, e4) and the standardized test scores (standtest) are stored as character variables. Then, when SAS goes to calculate the average exam score (avg), SAS first attempts to convert e1, e2, e3, and e4 to numeric variables. Likewise, when SAS goes to calculate a new standardized test score (std), SAS first attempts to convert standtest to a numeric variable. Let's see how it does. Launch and run the SAS program, and before looking at the output window, take a look at the log window. You should see something that looks like this:
The first NOTE that you see is a standard message that SAS prints in the log to warn you that it performed an automatic character-to-numeric conversion on your behalf. Then, you see three NOTES about invalid numeric data concerning the standtest values 1,210, 1,010, and 1,180. In case you haven't figured it out yourself, it's the commas in those numbers that are throwing SAS for a loop. In general, the automatic conversion produces a numeric missing value from any character value that does not conform to standard numeric values (containing only digits 0, 1, ..., 9, a decimal point, and plus or minus signs). That's why that fifth NOTE is there about missing values being generated. The output itself:
shows the end result of the attempted automatic conversion. The calculation of avg went off without a hitch because e1, e2, e3, and e4 contain standard numeric values, whereas the calculation of std did not because standtest contains nonstandard numeric values. Let's take this character-to-numeric conversion into our own hands.
The following SAS program illustrates the use of the INPUT function to convert the character variable standtest to a numeric variable explicitly so that an arithmetic operation can be performed on it:
The only difference between the calculation of std here and of that in the previous example is that the standtest variable has been inserted here into the INPUT function. The general form of the INPUT function is:
INPUT(source, informat)
where:
- source is the character variable, constant or expression to be converted to a numeric variable
- informat is a numeric informat that allows you to read the values stored in source
In our case, standtest is the character variable we are trying to convert to a numeric variable. The values in standtest conform to the comma5. informat, and hence its specification in the INPUT function.
Let's see how we did. Launch and run the SAS program, and again before looking at the output window, take a look at the log window. You should see something that now looks like this:
Ahhaa! No warnings about SAS taking over our program and performing automatic conversions. That's because we are in control this time! Now, looking at the output:
we see that we successfully calculated std this time around. That's much better!
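The same lesson shows up in other languages. As a loose analogy in Python (not SAS), a default conversion chokes on the comma, while an explicit conversion that says how to read the value succeeds. The helper name to_number is hypothetical, and this sketch only strips commas, which is far less than what a SAS informat does:

```python
def to_number(text):
    """Explicitly convert a character value such as '1,210' to a number."""
    return int(text.replace(",", ""))


standtest = "1,210"  # one of the values flagged in the log above

# The default conversion does not know what to do with the comma.
try:
    converted_directly = float(standtest)
except ValueError:
    converted_directly = None  # plays the role of a missing value here

print(converted_directly)    # None
print(to_number(standtest))  # 1210
```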
A couple of closing comments. First, I might use our discussion here to add another item to your growing list of good programming practices. Whenever possible, make sure that you are the one who is in control of your program. That is, know what your program is doing at all times, and if it's not doing what you'd expect it to do for all situations, then rewrite it in such a way to make sure that it does.
Second, you might be wondering "geez, we just spent all this time talking about character-to-numeric conversions, but what happens if I have to do a numeric-to-character conversion instead?" Don't worry ... SAS doesn't let you down. If you try to do something to a numeric variable that should only be done to a character variable, SAS automatically tries first to convert the numeric variable to a character variable. If that doesn't work the way you'd hoped, then you'll want to use the PUT function to convert your numeric values to character values explicitly. We'll address the PUT function in Stat 481 when we learn about character functions in depth.
4.6 - Summary
In this lesson, we learned how to write basic assignment statements, as well as use numeric SAS functions, in order to change the contents of our SAS data set. One thing you might have noticed is that almost all of our examples involved assignment statements that changed every observation in our data set. There may be situations, however, when you don't want to change every observation, but rather want to change just a subset of observations, those that meet a certain condition. To do so, you have to use if-then-else statements, which we'll learn about in the next lesson. In doing so, we'll also learn a few more good programming practices.
The homework for this lesson will give you practice with assignment statements and numeric functions so that you become even more familiar with how they work and can use them in your own SAS programming.
MATLAB Short Course
2. Simple Mathematics
The preceding MATLAB commands that assign the value of the expression after the ’=’ sign to the variable before the ’=’ sign are assignment statements. Note that all variables in the expression after the ’=’ sign must have previously been allocated a value, or else an error occurs. For example, enter the following commands:
You will see the error message ??? Undefined function or variable ’b’. Consider the following:
The last line has nothing at all to do with a mathematical equation. It is a MATLAB assignment statement that calculates x^2 − 12 at x = 7 and stores the result in the variable x, thereby overwriting the previous value.
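The two points above, that a variable can be overwritten using its own previous value and that every variable on the right-hand side must already have a value, hold in many languages. The sketch below is written in Python rather than MATLAB, and the name not_yet_assigned is a hypothetical stand-in for the undefined variable:

```python
x = 7

# Not an equation: the right-hand side is evaluated with x = 7,
# and the result overwrites the previous value of x.
x = x ** 2 - 12   # 49 - 12 = 37
print(x)          # 37

# Using a name that has never been assigned raises an error,
# much like MATLAB's "Undefined function or variable" message.
error_seen = False
try:
    a = not_yet_assigned + 1
except NameError as err:
    error_seen = True
    print(err)    # name 'not_yet_assigned' is not defined
```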
- Assignment Statement
An assignment statement is a statement that sets the value of a named variable in a program.
An assignment statement allows a variable to hold different values during the program's lifespan. Put another way, an assignment statement stores a value in the memory location denoted by a variable name.
The symbol used in an assignment statement is called an operator. In most languages, the assignment operator is ‘=’.
Note: The assignment operator ‘=’ should not be confused with the equality operator, which in many languages is the double equal sign ‘==’.
The basic syntax of an assignment statement in a programming language is:
variable = expression;
where:
- variable is the variable name
- expression is a direct value, a math expression or formula, or a function call
A few programming languages, such as Java, C, and C++, require the data type to be specified for the variable, so that memory can be allocated and the values stored during program execution.
data_type variable_name = value;
In the above-given examples, Variable ‘a’ is assigned a value in the same statement as per its defined data type. A data type is only declared for Variable ‘b’. In the 3rd line of code, Variable ‘a’ is reassigned the value 25. The 4th line of code assigns the value for Variable ‘b’.
Assignment Statement Forms
This is one of the most common forms of Assignment Statements. Here the Variable name is defined, initialized, and assigned a value in the same statement. This form is generally used when we want to use the Variable quite a few times and we do not want to change its value very frequently.
Generally, we use this form when we want to define and assign values for more than 1 variable at the same time. This saves time and is an easy method. Note that here every individual variable has a different value assigned to it.
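As a minimal sketch of this form in Python, where it is usually called tuple assignment, the names and values below are purely illustrative:

```python
# Each name on the left receives the value in the matching position on the right.
name, score, passed = "Alexander", 82.65, True
print(name, score, passed)

# Because the whole right-hand side is evaluated before anything is stored,
# the same form can swap two variables without a temporary variable.
a, b = 1, 2
a, b = b, a
print(a, b)  # 2 1
```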
Multiple-target Assignment or Chain Assignment
In this format, a single value is assigned to two or more variables.
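A short Python sketch of chain assignment follows, with illustrative names. It also shows a point worth keeping in mind: in Python every target refers to the same object, which matters when that object is mutable, such as a list:

```python
# One value, three targets.
x = y = z = 0
print(x, y, z)  # 0 0 0

# Reassigning one name afterwards does not affect the others.
x = 5
print(x, y, z)  # 5 0 0

# With a mutable value, both names refer to the very same list.
first = second = []
first.append(1)
print(second)   # [1]
```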
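In this format, an arithmetic or bitwise operator is combined with the assignment operator, so that the variable is updated using its own current value. For example, x += 5 has the same effect as x = x + 5. Other augmented assignment forms are: &=, -=, **=, etc.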
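Here is a brief Python sketch of several augmented forms applied in sequence, using made-up starting values:

```python
total = 10
total += 5     # same as total = total + 5   -> 15
total -= 3     # same as total = total - 3   -> 12
total **= 2    # same as total = total ** 2  -> 144
print(total)   # 144

flags = 0b1100
flags &= 0b1010  # bitwise AND keeps only the bits set in both -> 0b1000
print(flags)     # 8
```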
Few Rules for Assignment Statement
A few rules to follow when writing assignment statements:
- Variable names must begin with a letter or an underscore, never a digit. Each language has its own conventions.
- The declared data type and the assigned value must match.
- In statically typed languages, a variable name can be declared only once in a given scope. You cannot declare it again to store another type of value.
- If you assign a new value to an existing variable, it will overwrite the previous value and assign the new value.
FAQs on Assignment Statement
Q1. Which of the following shows the syntax of an assignment statement?
A. variablename = expression;
B. expression = variable;
C. datatype = variablename;
D. expression = datatype variable;
Answer – Option A.
Q2. What is an expression?
A. Same as statement
B. List of statements that make up a program
C. Combination of literals, operators, variables, math formulas used to calculate a value
D. Numbers expressed in digits
Answer – Option C.
Q3. What are the two steps that take place when an assignment statement is executed?
A. Evaluate the expression, store the value in the variable
B. Reserve memory, fill it with value
C. Evaluate variable, store the result
D. Store the value in the variable, evaluate the expression.
Answer – Option A.
Variables and Assignment Statements
Read this chapter, which covers variables and arithmetic operations and order precedence in Java.
In all but the smallest programs, an executing program is constantly working with values. These values are kept in little sections of main memory called variables.
- Assignment Statements
- Arithmetic Operators
Do you imagine that a variable can change its value?
Yes, that is why it is called a variable.
The billions of bytes of main storage in your home computer are used to store both machine instructions and data. The electronic circuits of main memory (and all other types of memory) make no distinction between the two. It just holds bit patterns.
When a program is running, some memory locations are used for machine instructions and others for data. Later, when another program is running some of the bytes that previously held machine instructions may now hold data, and some that previously held data may now hold machine instructions.
Using the same memory for both instructions and data was the idea of John von Neumann, a computer pioneer. (If you are unclear about bytes and memory locations, please read Chapter 3.)
Recall that a data type is a scheme for using bit patterns to represent a value. Think of a variable as a little box made of one or more bytes that can hold a value using a particular data type.
To put a value in memory, and later to get it back, a program must have a name for each variable. Variables have names such as payAmount . (Details will be given in a few pages.)
Variables come and go as a running program needs them. When a running program no longer needs a variable, that section of memory may be reused for some other purpose.
Yes. Otherwise it would not be clear what its bits represent.
Remember that the meaning of a bit pattern is determined by its context. It is the data type that gives the bits in a variable a context.
Declaration of a Variable
The example program uses the variable payAmount. The statement
long payAmount = 123;
is a declaration of a variable. A declaration of a variable is where a program says that it needs a variable. For our small programs, place declaration statements between the two braces of the main method.
The declaration gives a name and a data type for the variable. It may also ask that a particular value be placed in the variable. In a high level language (such as Java) the programmer does not need to worry about how the computer hardware actually does what was asked. If you ask for a variable of type long , you get it. If you ask for the value 123 to be placed in the variable, that is what happens. Details about bytes, bit patterns, and memory addresses are up to the Java compiler.
In the example program, the declaration requests an eight-byte section of memory named payAmount which uses the primitive data type long for representing values. When the program starts running, the variable will initially have the value 123 stored in it.
The name for a variable must be a legal Java identifier. (Details in a few pages.)
A variable cannot be used in a program unless it has been declared.
The variable contains: 123
Simulated Java Program
Try this page: http://ideone.com/ . There, you can paste Java code into a web page text box, and then compile and run it with actual Java.
A similar site is http://javabat.com/ .
Syntax of variable declaration.
The word syntax means the grammar of a programming language. We can talk about the syntax of just a small part of a program, such as the syntax of variable declaration.
There are several ways to declare variables:
- This declares a variable, declares its data type, and reserves memory for it. It says nothing about what value is put in memory. (Later in these notes you will learn that in some circumstances the variable is automatically initialized, and that in other circumstances the variable is left uninitialized.)
- This declares a variable, declares its data type, reserves memory for it, and puts an initial value into that memory. The initial value must be of the correct data type.
- This declares two variables, both of the same data type, reserves memory for each, but puts nothing in any variable. You can do this with more than two variables, if you want.
- This declares two variables, both of the same data type, reserves memory, and puts an initial value in each variable. You can do this all on one line if there is room. Again, you can do this for more than two variables as long as you follow the pattern.
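The four forms listed above might look like this in practice. This is only a sketch: the class name, the variable names, and the values are all made up for illustration.

```java
class DeclarationForms {
    // Hypothetical helper that uses each declaration form once.
    static int sumOfDeclared() {
        int count;                    // form 1: no initial value is given
        int total = 10;               // form 2: an initial value is given
        int low, high;                // form 3: two variables, no initial values
        int width = 3, height = 4;    // form 4: two variables, each initialized

        // A variable declared inside a method must be given a value before it is read.
        count = 1;
        low = 2;
        high = 5;
        return count + total + low + high + width + height;
    }

    public static void main(String[] args) {
        System.out.println("sum = " + sumOfDeclared());
    }
}
```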
If you have several variables of different types, use several declaration statements. You can even use several declaration statements for several variables of the same type.
Is the following correct?
Yes — as long as the names answer and rate have not already been used.
The programmer picks a name for each variable in a program. Various things in a program are given names. A name chosen by a programmer is called an identifier . Here are the rules for identifiers:
- Use only the characters 'a' through 'z', 'A' through 'Z', '0' through '9', and characters '$' and '_'.
- An identifier can not contain the space character.
- Do not start with a digit.
- An identifier can be any length.
- Upper and lower case letters count as different characters, so SUM and Sum are different identifiers.
- An identifier can not be a reserved word.
- An identifier must not already be in use in this part of the program.
A reserved word is a word which has a predefined meaning in Java. For example int , double , true , and import are reserved words. Rather than worry about the complete list of reserved words, just remember to avoid using names that you know already mean something, and be prepared to make a change if you accidentally use a reserved word you didn't know.
Although dollar sign is legal in identifiers, by convention it is used for special purposes. Don't use it for ordinary variables, even if the compiler lets you.
Although legal, it is unwise to start an identifier with underscore. Also, older Java compilers allowed a single underscore as an identifier, but recent ones do not.
As a matter of programming style, a name for a variable usually starts with a lower case letter. If a name for a variable is made of several words, capitalize each word except the first. For example, payAmount and grandTotal . These conventions are not required by syntax, but make programs easier to read.
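Here is a small sketch of these rules and conventions. The names are invented for illustration, and the illegal declarations are commented out so that the example compiles.

```java
class IdentifierSketch {
    // Hypothetical helper showing that case matters in identifiers.
    static int caseMatters() {
        int SUM = 1;                  // legal, though all capitals is unusual for a variable
        int Sum = 2;                  // a different identifier from SUM
        int grandTotal = SUM + Sum;   // conventional style: lower case first word
        // int 2fast = 0;             // illegal: starts with a digit
        // int grand total = 0;       // illegal: contains a space
        // int double = 0;            // illegal: double is a reserved word
        return grandTotal;
    }

    public static void main(String[] args) {
        System.out.println("grandTotal = " + caseMatters());
    }
}
```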
Which of the following variable declarations are correct?
The first value was stored in a variable of data type long , an integer type. Integers do not have fractional parts. The second forty was the result of a calculation involving a variable of data type double , a floating point type, which does have a fractional part. (Here, the fractional part was zero.)
Look carefully at the statement highlighted in red. The parentheses around
force the multiplication to be done first. After it is done, the result is converted to characters and appended to the string "pay Amount : ".
When you have a calculation as part of a System.out.println() statement, it is a good idea to surround the calculation with parentheses to show that you want it done first. Sometimes this is not necessary, but it does not hurt, and makes the program more readable.
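To see why the parentheses can matter, here is a sketch that uses addition instead of multiplication (the class and method names are assumptions made for illustration). Without parentheses, the + next to a string appends characters rather than adding numbers.

```java
class ParenthesesInPrint {
    // The parentheses make the sum happen first: 2 + 3 is 5.
    static String withParens() {
        return "Result: " + (2 + 3);
    }

    // Left to right: "Result: " + 2 gives "Result: 2", then 3 is appended.
    static String withoutParens() {
        return "Result: " + 2 + 3;
    }

    public static void main(String[] args) {
        System.out.println(withParens());      // Result: 5
        System.out.println(withoutParens());   // Result: 23
    }
}
```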
Several lines per statement.
You can use several lines for a statement. Anywhere a space character is OK you can split a statement. This means you can't split a statement in the middle of a name, nor between the quote marks of a string literal, nor in the middle of a numeric literal. Recall that a "literal" is an explicit value in the program. Here is the program with some statements correctly put on two lines:
It is a good idea to indent the second half of a split statement further than the start of the statement.
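A sketch of statements split across lines, with the second half indented further (the class and method names are hypothetical):

```java
class SplitStatement {
    // Hypothetical helper that builds the output text.
    static String message() {
        long payAmount = 123;
        String text = "The variable contains: "
                + payAmount;          // split where a space is allowed
        return text;
    }

    public static void main(String[] args) {
        System.out.println(
                message());           // split after the opening parenthesis
    }
}
```

Neither split falls inside a name, a string literal, or a numeric literal, so the compiler treats each statement exactly as if it were on one line.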
No. The incorrect splittings are highlighted in red:
So far, the example programs have been using the value initially put into a variable. Programs can change the value in a variable. An assignment statement changes the value that is held in a variable. The program uses an assignment statement.
The assignment statement puts the value 123 into the variable. In other words, while the program is executing there will be a 64-bit section of memory that holds the value 123.
Remember that the word "execute" is often used to mean "run". You speak of "executing a program" or "executing" a line of the program.
The program prints out the same thing as the first example program. However, this program did not initialize the variable and so had to put a value into it later.
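A program of this kind could look like the following sketch (the class name and helper are assumptions, not the original example). The declaration gives no initial value, so an assignment statement supplies one later.

```java
class AssignmentSketch {
    // Hypothetical helper: declare first, assign afterwards.
    static long assignLater() {
        long payAmount;       // declaration without an initial value
        payAmount = 123;      // assignment statement puts 123 into the variable
        return payAmount;
    }

    public static void main(String[] args) {
        System.out.println("The variable contains: " + assignLater());
    }
}
```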
Assignment Statement Syntax
Assignment statements look like this:
variableName = expression ;
- The equal sign = is the assignment operator.
- variableName is the name of a variable that has been declared previously in the program.
- expression is a collection of characters that calls for a value to be calculated.
Here are some example assignment statements (assume that the variables have already been declared):
In the source file, the variable must be declared before any assignment statement that uses that variable.
Assignment statement semantics.
The syntax of a programming language says what programs look like. It is the grammar of how to arrange the symbols. The semantics of a programming language says what the program does as it executes. It says what the symbols mean. This page explains the semantics of the assignment statement.
An assignment statement does its work in two steps:
- First, do the calculation on the RIGHT of the equal sign. If there is nothing to calculate, use the value on the right.
- Next, replace the contents of the variable on the LEFT of the equal sign with the result of the calculation.
For example, the statement total = 3 + 5; works like this:
- Do the calculation 3+5 to get 8.
- Put 8 in the variable named total .
It does not matter if total already has a number in it. Step 2 will replace whatever is already in total .
Similarly, the statement points = 23; works like this:
- Use the value 23.
- Put 23 in the variable named points .
What happens FIRST when the following statement executes?
FIRST, do the multiplication 2*3 to get the value 6.
NEXT, put the result of the calculation into the "little box of memory" used for the variable value :
It will really, really help you to think carefully about these two steps. Sometimes even second year computer science majors get confused about this and write buggy code.
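The two steps can be traced in a short sketch. The class name, the helper run, and the starting value 100 are assumptions made for illustration.

```java
class TwoStepAssignment {
    // Hypothetical helper: returns { total, points, value } after the assignments.
    static int[] run() {
        int total = 100;      // total already has a number in it
        total = 3 + 5;        // step 1: calculate 8; step 2: replace 100 with 8

        int points;
        points = 23;          // nothing to calculate: use 23, then store it

        int value;
        value = 2 * 3;        // step 1: 2*3 is 6; step 2: store 6 in value

        return new int[] { total, points, value };
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("total = " + result[0]);
        System.out.println("points = " + result[1]);
        System.out.println("value = " + result[2]);
    }
}
```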
What will this program fragment write?
Here is another program fragment:
The assignment statement is correct. It matches the syntax:
The expression is the literal 5 . No calculation needs to be done. But the assignment statement still takes two steps.
FIRST, get the 5:
NEXT, put the 5 in the variable:
Adding a number to a variable.
Assume that extra already contains the value 5.
Here is another statement:
The statement will be performed in two steps (as always). The first step performs the calculation extra + 2 by first copying the 5 from extra , and then adding 2 to it:
The result of the calculation is 7. The second step puts 7 into the variable value :
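In code, the two steps might look like this sketch (the class and helper names are hypothetical). Note that extra is only copied from, so it still holds 5 afterwards.

```java
class AddToVariable {
    // Hypothetical helper: returns { extra, value } after the assignment.
    static int[] run() {
        int extra = 5;
        int value;
        value = extra + 2;    // step 1: copy 5 from extra and add 2; step 2: store 7
        return new int[] { extra, value };
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("extra is: " + result[0]);
        System.out.println("value is: " + result[1]);
    }
}
```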
What will this program print out:
The program will print out:
Look at the statements:
Assume that value has already been declared. The two statements execute one after the other, and each statement performs two steps.
The first statement:
- Gets the number on the RIGHT of the equal sign: 5
- Look on the LEFT of the equal sign to see where to put the result.
- Put the 5 in the variable value .
The second statement:
- Look into the variable value to get the number 5.
- Perform the sum: 12 + 5 to get 17
- Put the 17 in the variable value .
Note: A variable can be used on both the LEFT and the RIGHT of the = in the same assignment statement. When it is used on the right, it provides a number used to calculate a value. When it is used on the left, it says where in memory to save that value.
The two roles are in separate steps, so they don't interfere with each other. Step 1 uses the original value in the variable. Then step 2 puts the new value (from the calculation) into the variable.
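The two statements discussed above can be sketched as follows (the class name and helper are assumptions made for illustration):

```java
class BothSidesSketch {
    // Hypothetical helper: the same variable appears on both sides of the =.
    static int run() {
        int value;
        value = 5;            // get 5, then put it in value
        value = 12 + value;   // step 1: 12 + 5 is 17; step 2: store 17 in value
        return value;
    }

    public static void main(String[] args) {
        System.out.println("value is: " + run());
    }
}
```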
What does the following program fragment write?
value is: 5 value is: 15
A Sequence that Counts
Look at this program fragment.
Here is how the program works:
- Statement 1 puts 0 into count .
- Statement 2 writes out the 0 in count .
- Statement 3 first gets the 0 from count , adds 1 to it, and puts the result back in count .
- Statement 4 writes out the 1 that is now in count .
When the fragment runs, it writes:
Think of a way to write 0, 1, and 2.
Put a copy of statements 3 and 4 at the end.
The following fragment:
increments the number in count. Sometimes programmers call this "incrementing a variable" although (of course) it is really the number in the variable that is incremented.
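A sketch of a counting sequence that writes 0, 1, and 2 by repeating the increment and the print (the class name and helper are hypothetical):

```java
class CountingSketch {
    // Hypothetical helper: prints 0, 1, 2 and returns the final count.
    static int countTwice() {
        int count = 0;
        System.out.println(count);   // writes 0

        count = count + 1;           // get 0, add 1, put 1 back in count
        System.out.println(count);   // writes 1

        count = count + 1;           // get 1, add 1, put 2 back in count
        System.out.println(count);   // writes 2

        return count;
    }

    public static void main(String[] args) {
        countTwice();
    }
}
```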
What does the following assignment statement do:
(What are the two steps, in order?)
- Evaluate the expression: get the value in sum and multiply it by two.
- Then, put that value into sum .
Sometimes you need to think carefully about the two steps of an assignment statement. The first step is to evaluate the expression on the right of the assignment operator.
This (slightly incomplete) definition needs some explanation:
- literal — characters that directly give you a value, like: 3.456
- operator — a symbol like plus + or times * that asks for an arithmetic operation.
- variable — a section of memory containing a value.
- parentheses — ( and ) .
This might sound awful. Actually, this is stuff that you know from algebra, like:
In the above, the character / means division .
Not just any mess of symbols will work. The following
is not a syntactically correct expression. There are rules for this, but the best rule is that an expression must look OK as algebra.
However, multiplication must always be shown by using a * operator. You can't multiply two variables by placing them next to each other. So, although xy might be correct in algebra, you must use x*y in Java.
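As an illustration, here are two expressions written the way Java requires. The class, the method names, and the parameter names are made up for this sketch.

```java
class ExpressionSketch {
    // Algebra would write xy; Java needs the * operator.
    static double product(double x, double y) {
        return x * y;         // xy would be read as one identifier named xy
    }

    // Parentheses group the sum, just as in algebra; / means division.
    static double midpoint(double x, double y) {
        return (x + y) / 2;
    }

    public static void main(String[] args) {
        System.out.println(product(3.0, 4.0));
        System.out.println(midpoint(3.0, 4.0));
    }
}
```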
Which of the following expressions are correct? (Assume that the variables have been properly declared elsewhere.)
Let's try some more. Again, assume that all the variables have been correctly declared.
Spaces don't much matter.
An expression can be written without using any spaces. Operators and parentheses are enough to separate the parts of an expression. You can use one or more spaces in an expression to visually separate the parts without changing the meaning. For example, the following is a correct expression:
The following means exactly the same:
Use spaces wisely to make it clear what the expression means. By making things clear, you might save yourself hours of debugging. Spaces can't be placed in the middle of identifiers.
The following is NOT correct:
It is possible (but unwise) to be deceptive with spaces. For example, in the following:
it looks as if 4 is subtracted from 12, and then that the result, 8, is divided by 4. However, the spaces don't count, and the expression is the same as:
This improved arrangement of spaces makes it clear what the expression means.
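The effect of spacing can be checked with a short sketch (the class and method names are assumptions). The first two methods return the same value because spaces do not change the order of evaluation; only parentheses do.

```java
class SpacingSketch {
    // Deceptive spacing: looks like (12-4)/4, but division is still done first.
    static int deceptive() {
        return 12-4 / 4;
    }

    // Clearer spacing for exactly the same expression.
    static int clear() {
        return 12 - 4/4;
    }

    // Parentheses, not spaces, are what change the grouping.
    static int grouped() {
        return (12 - 4) / 4;
    }

    public static void main(String[] args) {
        System.out.println(deceptive());   // 11
        System.out.println(clear());       // 11
        System.out.println(grouped());     // 2
    }
}
```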
Based on what you know about algebra, what is the value of this expression:
An arithmetic operator is a symbol that asks for two numbers to be combined by arithmetic. As the previous question illustrates, if several operators are used in an expression, the operations are done in a specific order.
Operators of higher precedence are done first. The table shows the precedence of some Java operators.
Some operators have equal precedence. For example, addition + and subtraction - have the same precedence.
The unary minus and unary plus operators are used as part of a negative or a positive number. For example, -23 means negative twenty-three and +23 means positive twenty-three. More on this later.
When both operands (the numbers) are integers, these operators do integer arithmetic . If one operand or both operands are floating point, then these operators do floating point arithmetic . This is especially important to remember with division, because the result of integer division is an integer.
5/2 is 2 (not 2.5 )
5/10 is 0 (not 0.5 ).
5.0/2.0 is 2.5
5.0/10.0 is 0.5 .
What is the value of the following expressions? In each expression, do the highest precedence operator first.
The last result is correct. First 6/8 is done using integer division, resulting in 0 . Then that 0 is added to 2 .
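The difference between integer and floating point division can be seen in a sketch like this one (the class and helper names are hypothetical):

```java
class DivisionSketch {
    // Both operands are integers, so this is integer division.
    static int intQuotient(int a, int b) {
        return a / b;
    }

    // Both operands are floating point, so the fractional part is kept.
    static double doubleQuotient(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        System.out.println(intQuotient(5, 2));             // 2
        System.out.println(intQuotient(5, 10));            // 0
        System.out.println(doubleQuotient(5.0, 2.0));      // 2.5
        System.out.println(doubleQuotient(5.0, 10.0));     // 0.5
        System.out.println(2 + intQuotient(6, 8));         // 2, because 6/8 is 0
    }
}
```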
Evaluation by Rewriting
When evaluating an expression, it can be helpful to do it one step at a time and to rewrite the expression after each step. Look at:
Do the division first, since it has highest precedence. Next, rewrite the expression replacing the division with its value:
Now evaluate the resulting expression:
You can write the process like this:
The dashed lines show what was done at each step.
What is the value of the following expression?
Evaluate Equal Precedence from Left to Right
When there are two (or more) operators of equal precedence, evaluate the expression from left to right.
Since both operators are * , each has equal precedence, so calculate 2 * 7 first, then use that result with the second operator.
Here is a second example:
Now the operators are different, but they both have the same precedence, so they are evaluated left to right.
Usually it doesn't matter if evaluation is done left to right or in any other order. In algebra it makes no difference. But with floating point math it sometimes makes an important difference. Also, when an expression uses a method it can make a very big difference. (Methods are discussed in part 6 of these notes.)
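Here is a sketch of cases where the order does matter. The expressions are chosen for illustration and are not the examples from this page. With integer division, regrouping changes the answer, and with floating point addition, regrouping can change the last digits of the result.

```java
class LeftToRightSketch {
    // Equal precedence, so left to right: (8 / 4) * 2.
    static int leftToRight() {
        return 8 / 4 * 2;
    }

    // Parentheses force the multiplication first: 8 / 8.
    static int regrouped() {
        return 8 / (4 * 2);
    }

    // The two groupings of the same sum give slightly different doubles.
    static boolean floatOrderMatters() {
        double a = (0.1 + 0.2) + 0.3;
        double b = 0.1 + (0.2 + 0.3);
        return a != b;
    }

    public static void main(String[] args) {
        System.out.println(leftToRight());         // 4
        System.out.println(regrouped());           // 1
        System.out.println(floatOrderMatters());   // true
    }
}
```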
If you look in the table of operators you will see that the character - is listed twice. That is because - is used for two purposes. In some contexts, - is the unary minus operator. In other contexts, - is the subtraction operator.
The unary minus is used to show a negative number. For example:
means "negative ninety seven point thirty four." The subtraction operator is used to show a subtraction of one number from another. For example:
asks for 12 to be subtracted from 95.
The unary minus operator has high precedence. Addition and subtraction have low precedence. For example
means add 3 to negative 12 (resulting in -9). The unary minus is done first, so it applies only to the twelve.
A unary plus + can be applied to a number to show that it is positive. It also has high precedence. It is rarely used.
The first + is a unary plus, so it has high precedence and applies only to the 24.
The - is a unary minus, so it applies only to the 4.
Next in order of precedence is the * , so three times negative four is calculated yielding negative twelve.
Finally the low precedence + combines the positive twenty four with the negative twelve.
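The steps above can be checked with a sketch like the following. It assumes the expression being described is +24 + 3 * -4 , and the class and method names are made up.

```java
class UnarySketch {
    // Unary + and - first, then *, then the low precedence +.
    static int mixed() {
        return +24 + 3 * -4;      // 24 + (3 * (-4)) is 24 + (-12)
    }

    // The unary minus applies only to the 12.
    static int negativePlusThree() {
        return -12 + 3;           // (-12) + 3
    }

    public static void main(String[] args) {
        System.out.println(mixed());               // 12
        System.out.println(negativePlusThree());   // -9
    }
}
```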
Arrange what you want with Parentheses
To show exactly what numbers go with each operator, use parentheses. For example
means do 9 - 2 first. The ( ) groups together what you want done first. After doing the subtraction, the ( 9 - 2 ) becomes a 7:
Now follow the left-to-right rule for operators of equal precedence:
What is the value of each of the following expressions?
Notice that the last expression (above) did not need parentheses. It does not hurt to use parentheses that are not needed. For example, all of the following are equivalent:
Warning! Look out for division. The / operator is a constant source of bugs! Parentheses will help.
The last answer is correct. It is done like this:
Sometimes in a complicated expression one set of parentheses is not enough. In that case use several nested sets to show what you want. The rule is:
The innermost set of parentheses is evaluated first .
Start evaluating at the most deeply nested set of parentheses, and then work outward until there are no parentheses left. If there are several sets of parentheses at the same level, evaluate them left to right. For example:
(Ordinarily you would not do this in such detail.)
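As an illustration of working outward from the innermost parentheses, here is a sketch with an expression chosen for this purpose (it is an assumption, not necessarily the example from this page). The comments rewrite the expression after each step.

```java
class NestedParensSketch {
    static int nested() {
        // ((16 / (2 * 2)) - (4 - 8)) + 7
        // ((16 / 4)       - (4 - 8)) + 7     innermost set first
        // (4              - (4 - 8)) + 7
        // (4              - (-4))    + 7
        // 8                          + 7
        // 15
        return ((16 / (2 * 2)) - (4 - 8)) + 7;
    }

    public static void main(String[] args) {
        System.out.println(nested());   // 15
    }
}
```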
What is the value of this expression:
End of the Chapter
Here is a list of subjects you may wish to review.
- variable, definition of What variables are.
- declaration Declaring variables.
- variable, declaration Syntax of declaring variables.
- variable, names Names for variables.
- reserved word Reserved words.
- assignment statement Assignment statements
- semantics Semantics
- expressions Expressions.
- operators, table of Table of operators and their precedence.
- parentheses How parentheses change the order of evaluation.
The next chapter will continue with arithmetic expressions.