please these 3 simple database calculation questions. Thank you so much
I provide tips for quality work
CS3743 Hwk 2 Physical Performance (30 pts) – due: SEE BLACKBOARD SCHEDULE
Consider a VehInterest table that contains the following attributes:
custNr 8 character customer number. When combined with vehType and vehMake, it is the unique key for
VehInterest.
vehType 5 character type of vehicle (e.g., TRUCK, SUV, VAN, ). When combined with custNr and vehMake, this is
the unique key for this table.
vehMake 12 character make of vehicle (e.g., 4Runner, Explorer, Prius, Connect). When combined with custNr and
vehType, this is the unique key for this table.
ageRangeCd customer’s age range (e.g., 1:under 20, 2:20-29, 3:30-39, 4:40-49, 5:50-59, 6:60-69, 7:over 70), this is 1
byte
lastActivityDt 4 byte date representing the last activity for this row
email customer ‘s email, max of 250 character, average of 36 characters
streetAddress customer’s street address, max of 150 characters. These average is 40 bytes.
Assumptions:
1. There are 10,000,000 rows and 1 row = 1 customer
2. Data and index blocks are 4096 bytes.
3. An index entry for a B+Tree contains: the key (size depends on the contents) and a Row Id which is 4 bytes.
4. There are two variable-sized attributes (email and streetAddress). Each will use a one byte length. Near the
beginning of the row, there will be an additional 2 byte offset to access the streetAddress that follows the email.
5. Assume there are 200,000 customers in ageRangeCd=5. 10% of those customers are interested in SUVs.
Questions Part 1: Space (show your work for partial credit)
A. How big is one data row? You must consider:
• size of each fixed-size attribute
• For each variable-sized attribute, consider the average size plus one byte for the length. Also, each
attribute after the first variable-sized attribute must have a two byte offset.
• Include a row size field of 2 bytes per row.
Average row size =
B. How many rows fit in a data block? You must consider:
• average size of a row (from part 1A)
• 20 byte overhead in the data block (leave out for my midterm)
• two bytes for each entry in the offset array per row in the data block (leave out for my midterm)
Rows per data block =
Number of data blocks =
C. For a unique index (called I1) on the combination of custNr,vehType, and vehMake:
• what is the size of an I1 index entry? (refer to assumption #3)
• how many entries fit in an index block if we assume 20 bytes of overhead per index block? (on my midterm
leave out overhead)
• HOW MANY RECORD ENTRIES ARE ON THE LEAF LEVEL OF THE INDEX (REMEMBER EACH RECORD IS AN
INDEX ENTRY)
• how many leaf level index blocks are needed?
• how many index entries are needed for the next higher level? index blocks?
o index entries on the next higher level =
• index blocks on the next higher level =
D. For index I1, how many total index blocks are needed including the leaf level?
leaf level =
next level =
root level =
======
index blocks for index I1
Part 2: Reads (show your work for partial credit)
Additional assumptions:
1. We have another index(called I2) on the combination of ageRangeCd and vehType:
• The index nodes for this index have 407 entries per index block. 10,000,000/407 = 24,571; 24571/407 =
60.4; 61 / 407 = 1 index block at root.
• There are three index levels for I2.
2. Do not worry if your answer to this is different in part 1. These values are arbitrary (but done to reduce taking off
for also having a mistake in the first part):
• Assume an average of 32 data rows per data block. The number of data blocks is 10,000,000 / 32 =
312,500 data blocks
• Assume three index levels in the unique index, I1, on custNr,vehType, and vehMake.
Part 2 Questions:
For A-F below, we want to find all customers in ageRangeCd=5 who are interested in SUVs using the index I2. State
your assumptions.
A. We have three levels in index I2 with 10,000,000 index entries.
• How many entries in index I2 will satisfy this request based on the query’s criteria and assumption #5?
Number of entries in I2 matching the criteria =
B. How many leaf level index blocks in I2 are there satisfying this request ? (approximate)
Leaf level index blocks = number of entries matching criteria (from 2A)/ number of entries per I2 index block
=
C. How many index blocks (in total) are read to satisfy this request?
• It isn’t necessary to re-read index blocks when processing this query.
Leaf level reads = (from part 2B)
Level 2 reads =
Level 1 reads (root level) =
Total =
D. Determine the percentage of data rows satisfying our criteria.
How many rows satisfy our criteria? (from 2A)
How many total rows in the data area? (from assumptions)
What is the percentage of data rows that satisfy our criteria?
E. Assuming I1 is the clustering index (i.e., I2 isn’t the clustering index), how many total reads are necessary to
read our rows that satisfy the criteria?
Index reads = (from Part 2C)
Data reads = (unlikely that two rows are in the same block since 0.2% of 32 is less than 1)
Total =
F. Assuming I2 is the clustering index, how many total reads are necessary to satisfy the query?
Index reads = (from part 2C)
Data Reads =
Total =
For G and H, how many reads are necessary to find a particular row by the combination of custNr, vehType, and
vehMake using index I1?
G. Assuming I1 is the clustering index, how many total reads are necessary to find a particular row using I1?
Index reads =
Data reads =
Total =
H. Assuming I2 is the clustering index, how many total reads are necessary to find a particular row using I1?
Index reads =
Data reads =
Total =