Lab 5

Floating Point Library

Introduction

An 8-bit microcontroller (such as the 68HC11) generally has a word size of 8 bits, with the consequence that a single word (whether signed or unsigned) is limited to 256 different values.  There are a few 16-bit math instructions available, but even with these the largest (unsigned) number possible is 65535.  Tasks often require dealing with numbers that are very much larger than this, or with fractional numbers that cannot be expressed at all with integers. One solution to this problem is a fixed scaling scheme, whereby every number in the microcontroller is assumed to be multiplied or divided by some power of 2.  However, such schemes are arbitrary, and the scaling factor must often be changed to accommodate different ranges.

Floating-point number representations present a robust solution to the problem. The idea is that each number is represented by two parts: a mantissa that expresses the value and an exponent that expresses the scale. This scheme handles negative numbers simply with a sign bit; the representation is thus signed magnitude.  Several standards exist for representation of floating point numbers.  One of the most popular is the IEEE standard, which requires four bytes (32 bits) to store a single-precision value and eight bytes (64 bits) to store a double-precision value.  While this approach is very accurate and powerful, it is also complicated and requires the use of a lot of memory for storage of both numbers and the executable code.  As such, it is not really suitable for use with an eight-bit microprocessor.  The scheme presented in this lab trades power for compactness and efficiency, and it is suitable for most applications that do not require serious number-crunching.

You will be given fp01.asm, which contains subroutines for floating point operations (if you cannot obtain the file, ask your TA about it). For your convenience, these routines have already been programmed into ROM in the HC11-VDK board. Your TA will give you the ROM addresses of these routines. In the fp01 library, floating point numbers are held in three bytes. One byte contains the exponent, represented in signed, two's complement notation, with a range of -128 to +127. The remaining two bytes contain the mantissa (15 bits), with the sign bit for the number in the highest-order bit. The scheme is shown as follows:

signed exponent    sign (*)    mantissa
8 bits             1 bit       15 bits

The value of the floating-point number represented is the mantissa value multiplied by the power of two represented by the exponent.  As an example, assume an exponent of $14 and a mantissa of $4B72.  The exponent translates to decimal 20.  By definition, the mantissa is in normalized form.  It represents a binary fraction, with the binary point assumed to fall immediately to the left of the most significant digit.  Normalization requires that the most significant bit of the 15-bit mantissa is set (=1); this ensures the most accurate representation.  Our example mantissa thus represents 0.100101101110010 (binary).  This equals 1/2 + 1/16 + 1/64 + ... or 0.5894165039.  Two raised to the 20th power is 1,048,576.  The product of these two numbers is then 618,048.  Note that since the mantissa is normalized:

  1.  Its magnitude (with the sign bit clear) must lie between $4000 and $7FFF.

  2. These values can be translated into a decimal fraction by conversion of the mantissa to decimal and division by 32,768, which is equivalent to shifting the binary point fifteen places to the left to form the binary fraction.
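The decoding rule above can be checked with a short script. This is only a sketch in Python (not part of fp01.asm); the function name decode_fp and the choice to pass the exponent byte and the mantissa word as separate arguments are assumptions made for illustration, since the byte order in memory is not specified here.

```python
def decode_fp(exponent_byte, mantissa_word):
    """Decode the three-byte format described above into a Python float.

    exponent_byte: 8-bit two's complement exponent (0..255 as stored).
    mantissa_word: 16-bit word, sign in bit 15, 15-bit fraction in bits 14..0.
    """
    # Interpret the stored byte as a signed value in the range -128..+127.
    exponent = exponent_byte - 256 if exponent_byte >= 0x80 else exponent_byte
    # Bit 15 is the sign of the whole number (signed magnitude).
    sign = -1.0 if mantissa_word & 0x8000 else 1.0
    # Dividing by 32,768 places the binary point left of the 15-bit mantissa.
    fraction = (mantissa_word & 0x7FFF) / 32768.0
    return sign * fraction * 2.0 ** exponent


# The worked example from the text: exponent $14, mantissa $4B72.
print(decode_fp(0x14, 0x4B72))  # 618048.0
```

Setting bit 15 of the mantissa word (for example $CB72 instead of $4B72) yields the same magnitude with a negative sign.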

Conversion of a decimal floating point number to this binary code is accomplished by performing the following five steps: 

  1. Multiply (or divide) the number by 2 repeatedly until the product (or quotient) is at least 0.5 and less than 1. This is equivalent to normalization.

  2. Count the number of times you did the operation. If you multiplied, the exponent is the NEGATIVE of that absolute value; if you divided, the exponent is that absolute value.

  3. Multiply the (decimal) fraction of the mantissa by 32,768.

  4. Truncate (or round) to integer and convert to hex.

  5. Set the sign bit (the highest-order bit of the two mantissa bytes) if the floating point number is negative.
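The five steps above can be sketched in Python as follows. The name encode_fp and its round_result parameter are hypothetical, and the handling of zero and of a rounding carry are assumptions of this sketch; fp01.asm may treat those cases differently.

```python
def encode_fp(x, round_result=True):
    """Return (exponent_byte, mantissa_word) for a nonzero number x."""
    if x == 0:
        raise ValueError("zero has no normalized form in this sketch")
    negative = x < 0
    frac = abs(x)
    exponent = 0
    # Steps 1 and 2: divide or multiply by 2 until 0.5 <= frac < 1,
    # counting up for each division and down for each multiplication.
    while frac >= 1.0:
        frac /= 2.0
        exponent += 1
    while frac < 0.5:
        frac *= 2.0
        exponent -= 1
    # Steps 3 and 4: scale the fraction by 32,768 and make it an integer.
    scaled = frac * 32768
    mantissa = int(scaled + 0.5) if round_result else int(scaled)
    if mantissa == 0x8000:
        # Rounding carried out of 15 bits: renormalize (assumption).
        mantissa = 0x4000
        exponent += 1
    if not -128 <= exponent <= 127:
        raise OverflowError("exponent does not fit in a signed byte")
    # Step 5: set the sign bit for negative numbers.
    if negative:
        mantissa |= 0x8000
    return exponent & 0xFF, mantissa


exp_byte, mant = encode_fp(10.0)
print(f"${exp_byte:02X} ${mant:04X}")  # $04 $5000
```

Here 10 is divided by 2 four times to reach 0.625, so the exponent is 4 and the mantissa is 0.625 x 32,768 = 20,480 = $5000.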

Before the Lab

The floating-point library you will be given (fp01.asm) supports six primary functions:  the four math functions (addition, subtraction, multiplication and division) as well as conversion from hex integers (16 bit) to floating point and vice-versa.  Further functions are of course possible (many floating point libraries contain log and sin functions, for example) but would require more code than the HC11-VDK board can hold at one time.  You may either append this library to the code that you write for the exercises so that the assembler can find it, or call the routines in ROM using their base addresses. The library supports two three-byte floating-point "accumulators" in memory that hold the values operated on by the code.  These accumulators are referred to with the labels MANT1 (2 bytes) and EXP1, and MANT2 (2 bytes) and EXP2.  The operations are executed by loading the operands (here called OP1 and OP2) into the first and second FP accumulators, respectively, and calling the appropriate subroutine.  The subroutines and their functions are listed below:

FADD OP1 + OP2
FSUB OP1 - OP2
FMUL OP1 * OP2
FDIV OP1 / OP2

The answer is always placed in the first accumulator (OP1), so the number placed there initially is always destroyed. Conversion is done with two other subroutines. FLOATINT converts the FP number in OP1 to integer and places the answer in the D accumulator. In this case the original floating point operand (in OP1) is destroyed. INTFLOAT converts the 16-bit integer in D to floating point and places the answer in OP1.
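To see how an operation can work directly on the exponent and mantissa fields, here is a Python sketch of a multiplication. It is not the FMUL code from fp01.asm: the name fmul_fields is hypothetical, the result is truncated rather than rounded, and exponent overflow is not checked. Exponents are passed as signed integers for readability.

```python
def fmul_fields(exp1, mant1, exp2, mant2):
    """Multiply two numbers given as (signed exponent, 16-bit mantissa word)."""
    # Signed magnitude: the result sign is the XOR of the two sign bits.
    sign = (mant1 ^ mant2) & 0x8000
    m1 = mant1 & 0x7FFF
    m2 = mant2 & 0x7FFF
    if m1 == 0 or m2 == 0:
        return 0, 0  # treat a zero mantissa as the value zero (assumption)
    # Each mantissa is a fraction scaled by 2**15, so the product is
    # a fraction scaled by 2**30; the exponents simply add.
    product = m1 * m2
    exponent = exp1 + exp2
    mantissa = product >> 15
    # Two values in [0.5, 1) multiply to a value in [0.25, 1), so at
    # most one left shift is needed to set the top mantissa bit again.
    if mantissa < 0x4000:
        mantissa = product >> 14
        exponent -= 1
    return exponent, sign | mantissa


# 10.0 is (4, $5000) and 0.25 is (-1, $4000); their product is 2.5.
print(fmul_fields(4, 0x5000, -1, 0x4000))  # (2, 20480)
```

The result (2, $5000) is 0.625 x 4 = 2.5. As with the library routines, a real implementation would leave this result in the first accumulator and overwrite the original OP1.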

Write main programs that do the following:

  1. Verify the operation of the conversion utilities by converting two numbers (initially loaded into memory locations) into floating point and then converting them back again.  Load your answers into locations that are different from their origination point so that you can tell that something happened.

  2. Write programs that test the four functions (FADD, FSUB, FMUL and FDIV) to prove that they work.  Note that the inputs and outputs from the previous program can be used.

  3. Write a program that computes the area of a circle as a function of its radius.  While the input and output can be expressed as integers operated on with the code of program (1), note that you must create the floating-point values for a special constant on your own.

  4. Write a subroutine that calculates the total amount of sales tax for all items sold in a store. Assume that a list of 16-bit values starting at $C100 contains the prices of the items sold today. If the price of an item is 29.95, it is stored in the list as 2995 (decimal). The total number of items in the list is kept in $D100 (8 bits). You are asked to calculate the total sales tax collected today. (Assume that the sales tax rate is a variable and is currently 8.25%.)
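When checking Program 4, it helps to have a reference value computed off the board. The following Python sketch does this in full precision; the function name and the sample prices are made up for illustration. Keep in mind that the sum of the prices may not fit in 16 bits, and that the 15-bit mantissa will make the HC11 result differ slightly from this reference.

```python
def total_sales_tax_cents(prices_cents, rate_percent=8.25):
    """Reference sales tax, in cents, for a list of prices stored in cents."""
    total = sum(prices_cents)  # e.g. 29.95 is stored as 2995
    return total * rate_percent / 100.0


# Hypothetical contents of the price list starting at $C100.
sample_prices = [2995, 1050, 499]
tax = total_sales_tax_cents(sample_prices)
print(f"total = {sum(sample_prices)} cents, tax = {tax:.2f} cents")
```

Passing a different rate_percent models the requirement that the tax rate is a variable rather than a fixed constant.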

    Bonus programs (up to 20 points): Building on your previous results, you may attempt one of the following programs for extra credit (simply pick one).

    1. Based on Program 3 above (area of a circle), write a program that computes the area of three different geometric figures: circle, rectangle/square, and triangle, depending on the user's selection. The program should prompt the user for the proper inputs (radii, bases, heights, widths, etc.) required by your subroutines to perform the required calculations.

    2. Based on Program 4 above (sales tax), write a program that prompts the user for all items sold in a store, and the variable tax rates to be used (up to three), and computes the total amounts owed by the customer. For this program you must use the IEEE Floating Point Math Library routines on EEPROM (HC11-VDK only). If you plan to work on this bonus program, please feel free to ask your TA any questions you may have regarding the use of this library.

    3. Same as Bonus Program 1 above, but instead of using the SSS library in your calculations, use the IEEE Floating Point Library on EEPROM in the HC11-VDK.

    4. Based on the experience you have gained so far, present an idea for a program to your TA for approval, and implement it. Impress your TA, your classmates, and (why not?) your class instructor.

Run the programs on the simulator (Wookie) before coming to class to ensure that they work.  Examine the code for INTFLOAT and try to figure out how it works.  You may want to single step through its operation to verify your suspicions.  Note how many clock cycles are required to perform each of the FP subroutines.

In the Lab

Load and run your programs on the HC11-VDK board to confirm their operation. Try at least two pairs of numbers for the function tests.  Note the floating-point representation of each number (and each answer) for each test.  Test your circle area program for at least three different radii: one between 1 and 10, one between 10 and 50, and one between 50 and 100 (decimal).  Record the results.

After the Lab

  1. Describe in words, pseudo-code or a flow chart how the INTFLOAT subroutine works. 

  2. Verify in a table the accuracy of operation of all of your lab tests. Discuss any errors and their possible sources.  Note any proportional dependence of error on argument size in your area program.  Discuss the general accuracy of this form of floating point representation (i.e., to how many decimal places can it be trusted?) 

  3. Calculate the total range of decimal numbers (smallest and largest magnitude) that this floating point representation can handle reliably.

  4. The Texas Instruments TMS320C30-50 digital signal-processing chip can process 50 MFLOPS (50 million floating point operations per second) with its specialized hardware. How does the 68HC11 (at 2.4576 MHz) compare using the floating point software library?

  5. What are the smallest and largest numbers that can be represented with 8-bit unsigned exponents instead of signed ones? What about a 16-bit signed exponent and a 32-bit mantissa?

  6. Bonus only: Answer the same questions as above, with references to SSS replaced by references to the IEEE Floating Point library (except question 4).
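For question 4, the cycle counts you noted on the simulator can be turned into operations per second with a calculation like the one below. This Python sketch assumes that one counted cycle takes one period of the 2.4576 MHz clock, and the cycle count shown is a placeholder, not a measurement; substitute your own numbers for each subroutine.

```python
def flops_from_cycles(clock_hz, cycles_per_op):
    """Floating point operations per second for a software routine."""
    return clock_hz / cycles_per_op


HC11_CLOCK_HZ = 2_457_600      # 2.4576 MHz, from the question
DSP_FLOPS = 50e6               # 50 MFLOPS for the TMS320C30-50
EXAMPLE_CYCLES = 1000          # placeholder: use your measured cycle count

hc11_flops = flops_from_cycles(HC11_CLOCK_HZ, EXAMPLE_CYCLES)
print(f"68HC11: {hc11_flops:.1f} operations per second")
print(f"DSP is about {DSP_FLOPS / hc11_flops:.0f} times faster")
```

Because the four math routines take different numbers of cycles, it is worth repeating the calculation for each one rather than quoting a single figure.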




Department of Electrical Engineering and Computer Science
Box 1824 Station B
Nashville, TN 37235
Phone: 322-2771
Fax: 343-6702




Last Updated: Thursday, February 15, 2007

Juan J. Rodriguez-Moscoso

Copyright © 2005 Vanderbilt University