Q1)
A. Assembly SW Routine for Linear Search on 8085
The Linear Search Algorithm iteratively searches for a specific element in a dataset. Let's keep it
simple with these steps:
1. Load the dataset start address into a register pair (e.g., HL).
2. Load the desired search element into the accumulator.
3. Compare each element sequentially from memory.
4. If a match is found, store the address or value in the accumulator.
5. If no match is found by the end, load 0xFF as the "fail" code.
Would you like me to provide a sample structure for the 8085 assembly code?
B. HW Block Diagram for the Linear Search IP
If the algorithm is implemented as a hardware peripheral (IP), the block diagram will include:
• Input Registers: For the start address of the dataset and the search element.
• Control Logic: To orchestrate the search process and handle the comparison loop.
• Comparison Unit: For matching the search element with the dataset values.
• Output Register: To store the match address or return the "fail" code if no match is
found.
The simplified hardware block diagram includes:
1. Dataset Memory Interface: To access the dataset.
2. Comparator: To compare values.
3. Control FSM: To manage operations and generate signals.
C. Configurable Registers for HW Block
The following configurable registers can be included in the peripheral:
1. Start Address Register: Holds the memory address of the dataset's first element.
2. Search Key Register: Holds the value of the element to search for.
3. Control Register: Holds command inputs (e.g., "start search").
4. Status Register: Provides the status (e.g., "in progress," "done").
5. Result Register: Outputs the match address or "fail" code.
D. System-Level SoC Block Diagram
In a System-on-Chip (SoC) design, the processor interacts with the Linear Search IP. The block
diagram includes:
1. Processor Core: To initiate and control the search operation.
2. Memory: For storing the dataset.
3. Linear Search Peripheral (IP): As described in the HW block.
4. Bus Interface: To enable communication between the processor and peripherals.
Would you like me to draw this block diagram conceptually for further clarity?
E. Software Program for Processor with Linear Search IP
In this scenario, the processor interacts with the Linear Search IP using memory-mapped
registers. The software program would:
1. Write the start address to the Start Address Register.
2. Write the search key to the Search Key Register.
3. Set the control register to start the search.
4. Poll the status register until the operation is complete.
5. Read the result from the result register.
This program enables the processor to offload the linear search task to the hardware peripheral.
Q2) The Radix-2 butterfly is an atomic computation in DFT engines. The equation and the
diagram looks as the below.
A. Create a ‘data flow graph’ i.e, a task graph of how you would arrive at the solution of the
expression after evaluating the terms in the right order. B. How many multipliers, adders,
subtractors are needed. C. If the multiplier takes 3 clock cycles, and adder & subtractor
takes 1 clock cycles. What would be the #clock cycles for the complete expression to be
evaluated. D. Mention the schedule (time taken) at each level of the graph. Hints and
Basic assumptions: 1. Keep it simple. Optimization and operator sharing is not expected.
2. The problem is not to provide a solution to the Digital Fourier Transform (DFT), but to
create the “data flow graph” of the equation which is used repeatedly in carrying out the
DFT.
The Radix-2 butterfly computation is a fundamental building block for DFT engines. Let’s analyze
it step by step:
---
Q3) Binary Partitioning [20 Marks] A. For the Graph as below with Tasks T1, T2, T3, T4 and T5
the HW / SW partitioning for the same has been provided in the table. Enumerate the total
number of ways in which the partitioning can be done. Assume any number of components can
be used any number of times. Simplify the tabulation, if it gets too big. B. Based on the
enumeration done, mark the design scenario, which gives (1) the best timing (2) the best area
(3) the median point between timing and area. C. Based on your experience with (a) and (b)
derive a pseudo code of an algorithm which can derive the best timing based partitioning.
A. Total Number of Partitioning Ways**
Each task has several options for HW/SW components in the table, as follows:
- **T1:** 5 options (CPU, DSP, MIPS, FPGA, HWA1)
- **T2:** 2 options (DSP, HWA1)
- **T3:** 3 options (CPU, MIPS, HWA2)
- **T4:** 4 options (CPU, DSP, FPGA, HWA2)
- **T5:** 4 options (CPU, DSP, MIPS, HWA2)
The total number of ways to partition the tasks is obtained by multiplying these options:
**Total ways = 5 × 2 × 3 × 4 × 4 = 480 ways**
**B. Design Scenarios**
To identify the best timing, area, and the median point between timing and area, we analyze
combinations:
1. **Best Timing:** The configuration that minimizes total time for all tasks (sum of selected
time values).
2. **Best Area:** The configuration that minimizes total area for all tasks (sum of selected area
values).
3. **Median Point:** The configuration with a balanced trade-off between time and area—
consider sorting combinations by a combined metric (e.g., weighted sum) and selecting the
middle point.
For simplification, calculations would depend on iterating through all combinations and
comparing their total metrics. Tools or algorithms can automate this process efficiently.
**C. Pseudo Code for Timing-Based Partitioning Algorithm**
Below is a pseudo code to derive the best timing-based partitioning:
```
Algorithm BestTimingPartitioning
Input: TaskOptions (list of available options for each task)
Output: BestTimingCombination (combination with minimal time)
Step 1: Initialize BestTime = Infinity
Step 2: Initialize BestCombination = None
Step 3: GenerateAllCombinations(TaskOptions)
Step 4: For each Combination in GeneratedCombinations:
a. Calculate TotalTime = sum(TaskTime for each Task in Combination)
b. If TotalTime < BestTime:
i. Update BestTime = TotalTime
ii. Update BestCombination = Combination
Step 5: Return BestCombination
```
This algorithm explores all combinations of task assignments, calculates their total time, and
keeps track of the partitioning that yields the lowest time.