To become familiar with the underlying representation of various data types, and to learn how to examine these representations in the debugger.
In class we discussed how various data types – integers, characters, and floating point numbers – were represented in computers. In this lab we will use the debugger to examine some of these representations.
Go through Tutorial 4: Unix, part 2, which is sections 5-8. This tutorial is originally from the department of Electrical Engineering at the University of Surrey, and is available online here. You went through sections 1-4 in the last tutorial; this lab has you completing sections 5-8.
- sizeOfTest() function to view the sizes of various types
- overflow() function to investigate how C++ handles integer overflow
- outputBinary() function to display the binary representation of integers

There are three parts to the C++ file you will be submitting as a part of the pre-lab.
Your program should ask for a single integer value for input (via cin), which we will call x. The program will call the three functions below in order: sizeOfTest(), overflow(), and then outputBinary(x). Note that only outputBinary() takes in x as the parameter.
The size of C++ data types is dependent on the underlying hardware on which you are running. A programmer may determine the size of various data types by using the sizeof() operator. Although it looks like a function, it’s a language construct – somewhat like while() or if() – so it’s technically an operator. sizeof() returns the size, in bytes, of a given variable or data type. Note that you can use sizeof() with types, variables, pointers, classes, and objects.
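As a minimal sketch of the operator on a type, a variable, a pointer, and a user-defined type: the names `Point` and `showSizes` below are hypothetical and chosen only for illustration, and the numbers printed depend on the machine and compiler.

```cpp
#include <iostream>

// Hypothetical struct, used only to show sizeof on a user-defined type.
struct Point {
    double x;
    double y;
};

// Prints a few sizes in bytes; the exact values are platform-dependent.
void showSizes() {
    long value = 0;
    long* ptr = &value;
    std::cout << "Size of long: " << sizeof(long) << "\n";    // applied to a type
    std::cout << "Size of value: " << sizeof(value) << "\n";  // applied to a variable
    std::cout << "Size of long*: " << sizeof(ptr) << "\n";    // applied to a pointer
    std::cout << "Size of Point: " << sizeof(Point) << "\n";  // applied to a struct
}
```

Calling showSizes() from main() prints one line per item. Only sizeof(char) is guaranteed to be 1; every other size is up to the platform.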
Write a small C++ function that demonstrates the use of sizeof() with the following types: int, unsigned int, float, double, char, bool, int*, char*, and double*. Your function should print out all the types and their respective sizes. You will use the values outputted by your program to fill in the table in the in-lab section. The function should be called sizeOfTest() (note the capitalization!), so as not to confuse C++ with the sizeof() operator. This function should not take in any parameters.
What do you think will happen when you add 1 to a variable containing the maximum value of a type? Write a function called overflow() to answer the following questions:
unsigned int variable containing the maximum value of an unsigned int?

Your function should create an unsigned int, give it the max value, and add 1 to that. By printing out the result, you will effectively answer the first 3 of the 4 questions. Your cout statement should have the format shown below.
<max_number> + 1 = <result>
However, when you run the program, you will have actual numbers in place of <max_number> and <result>.
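One possible shape for this experiment is sketched below. It assumes the UINT_MAX constant from the climits header, and the helper names `maxPlusOne` and `printOverflow` are hypothetical. Unsigned arithmetic in C++ is defined to wrap around modulo 2^N, so the addition itself is well defined.

```cpp
#include <climits>
#include <iostream>

// Returns the result of adding 1 to the largest unsigned int.
// Unsigned arithmetic wraps around modulo 2^N, where N is the bit width.
unsigned int maxPlusOne() {
    unsigned int maxValue = UINT_MAX;
    return maxValue + 1;
}

// Prints the experiment in the "<max_number> + 1 = <result>" format.
void printOverflow() {
    std::cout << UINT_MAX << " + 1 = " << maxPlusOne() << std::endl;
}
```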
The third coding exercise for the pre-lab is a binary output program. The function to write is called outputBinary(), and it will take in one parameter, an unsigned int. It must be unsigned, or else your code may not work! You should then print out the 32-bit binary representation (this includes the leading 0s!) of the passed parameter in big-endian format. For example:
outputBinary(1) //=> 0000 0000 0000 0000 0000 0000 0000 0001
outputBinary(5) //=> 0000 0000 0000 0000 0000 0000 0000 0101
outputBinary(1000000) //=> 0000 0000 0000 1111 0100 0010 0100 0000
You can NOT use the bitset class for this, or any other class that does the work for you. You have to program this yourself.
Your program will have the printout of three separate methods, and the ordering of these printouts is very important: sizeOfTest() –> overflow() –> outputBinary(). Do not include any output that prompts the user for input. Below is a sample execution run to show you the input and output format we are looking for.
Input
1
Output
Size of int: 4
Size of unsigned int: 4
Size of float: 4
Size of double: 8
Size of char: 1
Size of bool: 1
Size of int*: 8
Size of char*: 8
Size of double*: 8
4294967295 + 1 = 0
0000 0000 0000 0000 0000 0000 0000 0001
Consider first how you might convert a number to binary using pencil and paper, and develop an algorithm. Next, take a look at left-shifts (<<) as well as right-shifts (>>) and see if they would be helpful in implementing your algorithm.
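To illustrate what a shift can do, here is a small sketch that looks only at the low 8 bits of a value. It is not the full 32-bit, space-grouped output the pre-lab asks for, and the names `bitAt` and `lowByteBits` are hypothetical.

```cpp
#include <string>

// Returns bit k of x, where k = 0 is the least significant bit.
// Shifting right by k moves that bit into position 0; the mask keeps only it.
unsigned int bitAt(unsigned int x, int k) {
    return (x >> k) & 1u;
}

// Builds the 8 low-order bits of x as text, most significant bit first.
std::string lowByteBits(unsigned int x) {
    std::string bits;
    for (int k = 7; k >= 0; --k) {
        bits += bitAt(x, k) ? '1' : '0';
    }
    return bits;
}
```

For example, lowByteBits(5) yields 00000101, which matches the last eight digits of the outputBinary(5) example above.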
Remember how we discussed that little-endian often makes more sense to represent numbers. Even though your function must print the final result out in big-endian, that does not prevent you from using little-endian for the conversion itself if you find that to be easier to reason about.
The header climits has constants containing the max values of many types.
For the in-lab, you will complete the inlab4.cpp (src). To complete this in-lab, you should write (but not submit) a separate cpp file that has a few small functions to help fill in some of the values in inlab4.cpp; you will use those functions and the debugger to fill in the inlab4.cpp file. The sections below named Representation in memory and Primitive Arrays in C++ describe what should be in this file. It should not take in any input, and should just print out the necessary values.
The inlab4.cpp (src) program asks you to fill in two arrays that describe certain features of primitive types in C++. The two arrays in table form are shown below:
C++ Type | Size in bytes? | Max value? (base 10) | Zero is stored as (in hex)? | One is stored as (in hex)? |
---|---|---|---|---|
int | | | | |
unsigned int | | | | |
float | | | | |
double | | | | |
char | | | | |
bool | | | | |

C++ Type | Size in bytes? | Max value? (base 16/hex) | NULL is stored as (in hex)? |
---|---|---|---|
int* | | | |
char* | | | |
double* | | | |
To fill in these blanks, we recommend using a combination of short “test” programs, the debugger, a header file containing max and min values of certain types, the Number Representation slides, and deductive reasoning.
Notes:
- Zero means 0 for an int, 0.0 for a float, false for a bool, the character '0' for a char, etc.
- For chars, we want the maximum integer value that may be stored therein. Finally, booleans only have two possible values, so choose the max and min from these two.

There are two parts to the C++ file you will be submitting as a part of the in-lab.
1. The two arrays that you will fill out with the appropriate information
2. A tableDump(string (*arr)[5], string (*arr1)[4]) method that prints out the two arrays in a format that we can autograde
You only need to replace the empty strings in those arrays with your findings; you do not need to modify anything else in the file. Since we take care of the output for you, don’t worry about any formatting for this part of the lab.
This exercise will show you how to read the contents of a particular memory address. This will be useful for debugging code and for understanding the underlying data representation of abstract data types.
Recall that almost all computers use little-endian processors. Thus, 0xd97c34a2 is stored as a2 34 7c d9, with the least significant byte listed first. However, when you examine the value in LLDB (using the x/x command), it will display it in big-endian format, as that is how humans typically think of numbers.
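The byte order can also be observed from inside a program. The sketch below copies the bytes of a value out in memory order and formats them as hex. The function name `bytesInMemory` is hypothetical, and the sketch assumes a 4-byte unsigned int.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Returns the bytes of value in the order they sit in memory, as hex text.
std::string bytesInMemory(unsigned int value) {
    unsigned char bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value));  // copy the raw representation

    std::string out;
    char buf[4];  // room for two hex digits plus the terminator
    for (std::size_t i = 0; i < sizeof(value); ++i) {
        std::snprintf(buf, sizeof(buf), "%02x", static_cast<unsigned>(bytes[i]));
        if (i > 0) out += ' ';
        out += buf;
    }
    return out;
}
```

On a little-endian machine, bytesInMemory(0xd97c34a2) should give a2 34 7c d9. A big-endian machine would give the bytes in the opposite order.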
Write a C++ program, called inlab4.cpp, where you consecutively declare variables of these types: bool, char, int, double, int*, and assign a value of your choosing to each of them. The last line(s) of the program should print out the values. Put a breakpoint near the end of the program, but before the last print statement(s). Once the breakpoint is hit, type expressions to examine the addresses of all of these variables (e.g. &i). Then for each of these variables, view the contents of their addresses (via the x/x command from above).
Find one of your int variables in memory. Change its value via the expr (var) = (value) (LLDB) or set variable <var> = <value> (GDB) command. Examine the new variable’s contents in memory. Is it what you expected? Continue the program execution – did it properly print the changed value?
After completing this section of the lab, you will be expected to understand how to use the debugger to:
If you feel you need a bit more background on arrays, there are readings available. Note how two (or higher) dimensional arrays are stored in row-major order (as described in the 04-arrays-bigoh slide set) in C++, as opposed to being stored as arrays of arrays in Java.
This section of the lab is not required, but is recommended since you will be required to know this information for the exams. Your code should declare a one dimensional array of ints and a one dimensional array of chars, as well as two-dimensional versions of each:
int IntArray[10];
char CharArray[10];
int IntArray2D[6][5];
char CharArray2D[6][5];
Assign different values of your choosing into each element of all four arrays. As above, put a breakpoint in your program after the four arrays have been assigned values. Find the address of the first element of each array, and type that address into LLDB (via the ‘p’ command).
Examine where the elements of the four arrays are in memory. You will be expected to understand and be able to explain this representation for the exams.
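Alongside the debugger, a program can report how far each element sits from the start of its array. The sketch below does this for IntArray2D. The helper names `byteOffset` and `printSomeOffsets` are hypothetical, and the offsets printed depend on sizeof(int) on your machine.

```cpp
#include <cstddef>
#include <iostream>

int IntArray2D[6][5];

// Byte distance from the start of IntArray2D to element (i, j).
std::ptrdiff_t byteOffset(int i, int j) {
    const char* base = reinterpret_cast<const char*>(&IntArray2D[0][0]);
    const char* elem = reinterpret_cast<const char*>(&IntArray2D[i][j]);
    return elem - base;
}

// Prints the offset of every element in the first two rows.
void printSomeOffsets() {
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 5; ++j) {
            std::cout << "(" << i << "," << j << ") -> " << byteOffset(i, j) << "\n";
        }
    }
}
```

Comparing the offsets of neighbours within a row against the offsets of neighbours in adjacent rows is a useful check on what you see in the debugger.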
After exploring the array element locations in the debugger, develop an expression for the address of the (i,j)th element of IntArray2D as declared above. You can assume that (0 ≤ i < 6), (0 ≤ j < 5), and an int is 4 bytes.
&(IntArray2D[i][j]) = _______________________________________________
For the size in bytes of each type, we can easily use the sizeof operator or the sizeOfTest function from the pre-lab.
climits will come in handy here again. For types not in climits, you should reason about how the data is stored and the size of that type.
For some parts of this lab, it is helpful to assign a value to a variable, then inspect that variable’s contents using a debugger. You can write a simple C++ program that creates the variables, and stores the appropriate value (zero, one, or NULL) into them. Compile (remember the -g flag!), load the debugger, set a breakpoint, and start the program execution.
When using LLDB, you can use the ‘x’ (for ‘eXamine’) command to print out the pointee of an address. Consider the C++ program that has two variables defined, int i and int *p. To print out the int variable i, you would enter x &i (as you have to enter the address of where the data is located). If p is a pointer to a value, you would enter x p to print out the pointee. This may print it using many more hexadecimal digits than you wanted, so you can add a parameter to the ‘x’ command to have it print only a certain amount:
- x/xb p: this will print the one byte at the address that is pointed to by p
- x/xh &i: this will print the two bytes of int variable i
- x/xw p: this will print the four bytes at the address that is pointed to by p
- x/xg &i: this will print the eight bytes of int variable i

Note that you don’t want to print out more bytes than the size of the type itself. If your int is 4 bytes, and you print out 8 bytes, then the other 4 bytes will be whatever arbitrary values are adjacent in memory.
In bitCounter.cpp, create a recursive function that returns the number of 1s in the binary representation of n, which will be passed in as a command-line parameter. Use the following fact: if n is even, the number of 1s in the representation of n is the same as that in n/2; if n is odd, the number of 1s is one more than that in floor(n/2).
You may assume that n is a non-negative integer that will be stored in two’s complement. However, n will be passed in the standard decimal (i.e. base-10) format. This should be a rather simple function that uses what you’ve learned about integer representation. If you find you need things like global variables or the pow() function to implement this then you are going too far.
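The even/odd fact above translates almost directly into a recursion. The following is one possible sketch, with the hypothetical name `countOnes`. It assumes n is non-negative, as stated.

```cpp
// Counts the 1s in the binary representation of n (n assumed non-negative).
// n % 2 is 1 exactly when n is odd, and n / 2 is floor(n / 2) for n >= 0.
int countOnes(int n) {
    if (n == 0) {
        return 0;  // base case: zero has no 1 bits
    }
    return (n % 2) + countOnes(n / 2);
}
```

For instance, countOnes(5) evaluates as 1 + countOnes(2) = 1 + 0 + countOnes(1) = 1 + 0 + 1 + countOnes(0) = 2.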
If your program is run without any command-line parameters, your program should gracefully exit with an appropriate error message. Your program need not handle an invalid number for the command-line parameter. Any additional command-line parameters beyond the first should be ignored.
In bitCounter.cpp, create a function that takes a number n from startBase and returns the number in endBase. This means that your program will take in four command-line parameters total (excluding the executable name) in the order shown below.
1. bitcount number (int)
2. number to convert (string)
3. start base (int)
4. end base (int)
Notice that the number we are converting will be passed in as a string; this is because many bases (like hexadecimal) require non-numeric characters for their representations (e.g. A, B, …). To make things simpler, you may assume that all characters are capitalized. For example, when we input our numbers for you to convert, we will input them as DEADBEAF123 instead of deadbeaf123 or deADbEAf123. Furthermore, we will not provide any bases less than 1 or greater than 36.
Your output will be split into two sections, the bit counter and the converter. You can run your methods in any order you please, but your program must print the results of the bit counter before the results of the converter. An example of the output when running ./a.out 1 ABCD 16 10 is:
1 has 1 bit(s)
ABCD (base 16) = 43981 (base 10)
So far, our main() method has had the following prototype:
int main()
We will now change that prototype to the following:
int main (int argc, char **argv)
These two parameters provide you with the command-line parameters. The first parameter, argc, is the number of parameters plus one – the 0th parameter is always the name of the executable itself (a.out, for example). The second parameter, argv, is an array of C-style strings (some people list the type as char *argv[] to make this more clear – either way is fine).
Command-line parameters are passed in as space-delimited values after the executable name:
./a.out 3 hi example.txt
Here, argc would be set to 4, argv[0] would be ./a.out (the executable name as it was typed), and argv[1], argv[2], and argv[3] would be the strings 3, hi, and example.txt, respectively.
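A quick way to see these values is to walk argv from index 0 up to argc - 1. The helper below, with the hypothetical name `describeArgs`, is a sketch of that loop. It returns the text rather than printing it, so that it can be called from main() with `std::cout << describeArgs(argc, argv);`.

```cpp
#include <sstream>
#include <string>

// Formats every command-line parameter as "argv[i] = value", one per line.
std::string describeArgs(int argc, char** argv) {
    std::ostringstream out;
    for (int i = 0; i < argc; ++i) {
        out << "argv[" << i << "] = " << argv[i] << "\n";
    }
    return out.str();
}
```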
Command-line parameters are discussed in more detail in the 04-arrays-bigoh slide set, along with a source code example showing how to use them.
Since argv is a char**, all parameters are stored as C-style strings. You will need some method of converting your string parameter to an integer that can be passed to your bit-counter function. Not sure what to do? Look back at Lab 3 for some clues.
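One option, sketched here, is std::atoi from the cstdlib header. Whether this is the approach Lab 3 used is an assumption, and the helper name `firstParamAsInt` and its -1 sentinel for a missing parameter are choices made only for this illustration.

```cpp
#include <cstdlib>

// Returns the first command-line parameter converted to an int,
// or -1 (a sentinel chosen for this sketch) when no parameter was given.
int firstParamAsInt(int argc, char** argv) {
    if (argc < 2) {
        return -1;  // only the executable name is present
    }
    return std::atoi(argv[1]);  // parses a leading decimal integer
}
```

A real program would print an error message and exit when the parameter is missing, rather than return a sentinel.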
In the real world, 5 / 2 = 2.5. In most programming languages, including C++, dividing two integers yields an integer with the fractional portion removed (which, for non-negative values, is the same thing as flooring). Hence, in C++, 5 / 2 = 2, as integer division implicitly discards the fractional part of the result.
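A small sketch makes the behaviour concrete. The function name `half` is hypothetical. C++ integer division truncates toward zero, so it agrees with flooring for non-negative operands, which is the only case this lab needs.

```cpp
// Integer division discards the fractional part (truncation toward zero).
int half(int n) {
    return n / 2;
}
```

Here half(5) is 2 and half(4) is 2. For a negative operand such as half(-5) the result is -2, whereas the floor of -2.5 would be -3.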
When converting bases, there are two steps that you should follow:
1. Convert the number from the start base to base 10.
2. Convert the base 10 number to the end base.
In many cases with conversion, you will need to convert characters to integers in order to correctly perform calculations. Instead of trying to use atoi or stoi like in previous assignments, it is easy to convert digit characters into their integer form by taking advantage of ASCII values. For example, the ASCII value for the character ‘9’ is 57, and the value for ‘0’ is 48. So converting the character ‘9’ to an integer is a simple subtraction: int converted = '9' - 48; or int converted = '9' - '0';. This same logic can be extended to the digits 0-8, and a similar approach can be used to convert letters to their correct numerical representation (e.g. ‘A’ = 10, ‘B’ = 11, …).
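Putting the two conversion steps and the character arithmetic together, one possible sketch follows. The function names `toBase10` and `fromBase10` are hypothetical. The sketch assumes uppercase input, bases from 2 to 36, and values that fit in a long.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Step 1: read digits left to right, multiplying the running total by the base.
long toBase10(const std::string& digits, int base) {
    assert(base >= 2 && base <= 36);
    long total = 0;
    for (char c : digits) {
        // '0'..'9' map to 0..9; 'A'..'Z' map to 10..35.
        int d = (c >= '0' && c <= '9') ? c - '0' : c - 'A' + 10;
        total = total * base + d;
    }
    return total;
}

// Step 2: peel off remainders, which produces digits least significant first,
// then reverse them so the most significant digit comes first.
std::string fromBase10(long value, int base) {
    assert(base >= 2 && base <= 36);
    if (value == 0) {
        return "0";
    }
    std::string out;
    while (value > 0) {
        int d = static_cast<int>(value % base);
        out += static_cast<char>(d < 10 ? '0' + d : 'A' + d - 10);
        value /= base;
    }
    std::reverse(out.begin(), out.end());
    return out;
}
```

With these helpers, fromBase10(toBase10("ABCD", 16), 10) gives 43981, matching the sample output shown earlier.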