EXTERNAL SORTING

Introduction. In this section, we assume that the lists to be sorted are so large that the whole list cannot be contained in the internal memory of a computer, making an internal sort impossible. We shall assume that the list (or file) to be sorted resides on a disk. Most of the concepts developed for a disk sort also apply when the file resides on a tape.
The term block refers to the unit of data that is read from or written to a disk at one time. A block generally consists of several records. For a disk, three factors contribute to the read/write time:

(1) Seek time: the time taken to position the read/write heads over the correct cylinder. This depends on the number of cylinders across which the heads have to move.
(2) Latency time: the time until the right sector of the track is under the read/write head.
(3) Transmission time: the time to transmit the block of data to or from the disk.

The most popular method for sorting on external storage devices is merge sort. This method consists of two distinct phases. First, segments of the input list are sorted using a good internal sort method. These sorted segments, known as runs, are written onto external storage as they are generated. Second, the runs generated in phase one are merged together following the merge-tree pattern of Figure 7.20, until only one run remains. Because the simple merge function requires only the leading records of the two runs being merged to be present in memory at one time, it is possible to merge runs far larger than memory. It is more difficult to adapt the other internal sort methods considered in this chapter to external sorting.

Example 7.12: A list containing 4,500 records is to be sorted. The input list is maintained on disk and has a block length of 250 records. We have available another disk that may be used as a scratch pad; the input disk is not to be written on. One way to accomplish the sort using the general function outlined above is to:

(1) Internally sort three blocks (i.e., 750 records) at a time to obtain six runs
R1 to R6, each containing 750 records. A method such as heapsort, merge sort, or quicksort could be used. These six runs are written onto the scratch disk as they are generated.

(2) Set aside three blocks of internal memory, each capable of holding 250 records. Two of these blocks will be used as input buffers and the third as an output buffer. Blocks of runs are merged from the input buffers into the output buffer. When the output buffer gets full, it is written onto the disk. If an input buffer gets empty, it is refilled with another block from the same run. First runs R1 and R2 are merged, then R3 and R4, and finally R5 and R6. The result of this pass is three runs, each containing 1,500 records. Two of these runs are now merged using the input/output buffers set up as above to obtain a run of size 3,000. Finally, this run is merged with the remaining run of size 1,500 to obtain the desired sorted list (Figure 7.20).

We shall assume that the maximum seek and latency times are incurred each time a block is read from or written onto the disk, although this is not true in general. The computing times for the various operations in our 4,500-record example are given in Figure 7.21. The contribution of seek time can be reduced by writing blocks on the same cylinder or on adjacent cylinders. A close look at the final computing time indicates that it depends chiefly on the number of passes made over the data, in addition to the initial input pass made for the internal sort. One full pass covers all 18 blocks of the input. The leading factor of 2 in the input/output time appears because each record that is read is also written out again. Because of this close relationship between the overall computing time and the number of passes made over the data, later analysis will concentrate on counting the number of passes.

Another point to note regarding the above sort is that no attempt was made to use the computer's ability to carry out input/output and CPU operations in parallel and thus overlap some of the time. In the ideal situation, if we have two disks, we can write on one, read from the other, and merge buffer loads already in memory, all in parallel.
A proper choice of buffer lengths and buffer-handling schemes would allow almost all of the input/output time to be overlapped with internal processing. Unless input/output and CPU processing go on in parallel, the CPU is idle during input/output. In a multiprogramming environment, however, the need for the sorting program to carry out input/output and CPU processing in parallel may not be so critical, since the CPU can be busy working on another program (if there are other programs in the system at the time) while the sort program waits for the completion of its input/output. Indeed, in many multiprogramming environments it may not even be possible to achieve parallel input, output, and internal computing because of the structure of the operating system.

The number of merge passes over the runs can be reduced by using a higher-order merge than two-way merge. To provide for parallel input, output, and merging, we need an appropriate buffer-handling scheme. Further improvement in run time can be obtained by generating fewer (or, equivalently, longer) runs than are generated by the strategy described above. This can be done using a loser tree. With the loser-tree strategy, to be discussed later in this chapter, the generated runs are of varying size. As a result, the order in which the runs are merged affects the time required to merge all runs into one. We consider these factors now.

k-Way Merging. The two-way merge function merges only two runs at a time, following the merge tree of Figure 7.20. The number of passes over the data can be reduced by using a higher-order merge (i.e., a k-way merge for some k > 2). Figure 7.22 illustrates a four-way merge of 16 runs; the number of passes over the data is now only two. In general, a k-way merge on m runs requires ⌈log_k m⌉ passes over the data. The smallest key now has to be found from k possibilities: it could be the leading record in any of the k runs. The most direct way to merge k runs is to make k − 1 comparisons to determine the next record to output. Since ⌈log_k m⌉ passes are being made, the total number of key comparisons under this scheme grows with k. This growth can be avoided by using a loser tree to select the next record in O(log k) comparisons, making the total comparison count O(n log m) for n records, independent of k.
There is thus no significant loss in internal processing speed. Even though the internal processing time is relatively insensitive to the order of the merge, the decrease in input/output time is not as great as the reduction to ⌈log_k m⌉ passes would suggest. This is so because the number of input buffers needed to carry out a k-way merge increases with k. Although k + 1 buffers are sufficient, in the next section we shall see that the use of 2k + 2 buffers is more desirable. Since the internal memory available is fixed and independent of k, the buffer size must be reduced as k increases. This in turn implies a reduction in the block size on disk. With the reduced block size, each pass over the data results in a greater number of blocks being written or read, which represents a potential increase in input/output time from the increased contribution of seek and latency times involved in reading a block of data. The optimal value for k depends on disk parameters and the amount of internal memory available for buffers.

Buffer Handling for Parallel Operation. If k runs are being merged together by a k-way merge, then we clearly need at least k input buffers and one output buffer to carry out the merge. This, however, is not enough if input, output, and merging are to proceed in parallel. For instance, while the output buffer is being written out, internal merging has to be halted, since there is nowhere to collect the merged records. This can be overcome through the use of two output buffers: while one is being written out, records are merged into the second. If buffer sizes are chosen correctly, then the time to output one buffer will be the same as the CPU time needed to fill the second buffer. With only k input buffers, internal merging will have to be held up whenever one of these input buffers becomes empty and another block from the corresponding run is being read in. This input delay can also be avoided if we have 2k input buffers. These 2k input buffers have to be used cleverly, however: simply assigning two buffers per run does not solve the problem, as the next example shows.
Example 7.13: Assume that a two-way merge is carried out using four input buffers and two output buffers, ou[0] and ou[1]. Each buffer is capable of holding two records. The first few records of run 0 have key values 1, 3, 5, 7; the first few records of run 1 have key values 2, 4, 6, followed by much larger keys. Two of the input buffers are assigned to run 0; the remaining two input buffers are assigned to run 1. We start the merge by reading in one buffer load from each of the two runs. At this time the buffers have the configuration of Figure 7. Now runs 0 and 1 are merged using the records in the filled input buffers; in parallel with this, the next buffer load from run 0 is input. If we assume that buffer lengths have been chosen such that the times to input, output, and generate an output buffer are all the same, then when ou[0] is full, the input of the next block will have been completed. Next, we simultaneously output ou[0], merge into ou[1], and input the next block. Continuing in this way, we eventually begin to output ou[1] while merging into ou[0]. During this merge, all records from run 0 get used up before the next buffer load from run 0 has arrived. Merging must now be delayed until the inputting of another buffer load from run 0 is completed.

Example 7.13 makes it clear that if 2k input buffers are to suffice, then we cannot statically assign two buffers per run. Instead, the buffers must be floating, in the sense that an individual buffer may be assigned to any run depending upon need. In the buffer assignment strategy we shall describe, there will at any time be at least one input buffer containing records from each run. The remaining buffers will be filled on a priority basis: the next buffer read is for the run whose records will be exhausted first. One may easily predict which run's records will be exhausted first by simply comparing the keys of the last record read from each of the k runs; the smallest such key determines this run. We shall assume that in the case of equal keys, the merge process first merges the record from the run with least index. This means that if the key of the last record read from run i is equal to the key of the last record read from run j, and i < j, then run i is predicted to exhaust first. All buffer loads from the same run are queued together. Before formally presenting the algorithm for buffer utilization, note that we would like the input of the next buffer load of a run to begin well before its current records are exhausted. If this were possible, merging would never be delayed for input.
By the time the first record for a new output block is determined, the input buffer it needs would already be in memory. Each buffer is a contiguous block of memory. Input buffers are queued in k queues, one queue for each run. It is assumed that each input/output buffer is long enough to hold one block of records. Empty buffers are placed on a linked stack.