Candidate: You must prepare your solution to this programming exercise in advance. The designated platform, on which your solution must be demonstrated and on which it will be marked, is the `gcc' compiler running on `Linux'. If you develop a solution on another platform, it is your responsibility to ensure that it works correctly on the designated platform. Read the information under the [prac' guide], [on C and code modularity] and [plagiarism] links on the subject home page.
Groups: Note that different prac'-groups might be set different tasks. Make sure that you do your task. You will get zero marks if you solve the wrong problem.
May be useful: [command-line parameters].
The data may be read and stored in memory.
Write a program `para L inp' where `L' is a positive integer and `inp' is a text file. The program must lay out the contents of `inp' using a line-length of L characters, without breaking any words, so that output text starts in column 1 and ends in column L for all lines except perhaps the last line. (A word is any string of characters from inp not containing a space of a newline character.) It is sufficient to print the positions of the line breaks and the resulting costs, but it is desirable to print the text in the resulting layout (for full marks).
The program must optimize the layout to minimize a `penalty function'. For example, each internal run if `s' spaces can be given a penalty of (s-1)2 which strongly discourages long gaps. e.g.
inp | L=19, suboptimal | L=19, better |
---|---|---|
The cat and the dog saw the hippopotamus eat the grass. | 1234567890123456789 The cat and the dog 0 saw the 144 hippopotamus eat 9 the grass. 0 --------- total 153this is the "greedy" solution -- place as many words as will fit before moving to next line. | 1234567890123456789 The cat and the 6 dog saw the 32 hippopotamus eat 9 the grass. 0 --------- total 47we have made line_1 worse, to improve line_2. |
A dynamic programming solution is possible:
To lay out words 1 to w in k lines
lay out words 1 to v (where v<w) in k-1 lines, and
lay out words v+1 to w in 1 line (if possible);
choose the best value of v.
Tabulate the penalty for laying out 1, 2, ..., n words in
Consider what could be done with only one line (the 1st line). You could place one word only on the line, which would leave a long gap, and have a certain cost (for that line alone). You could place two words on the line, with a gap between them, at a certain cost. Three words, four words, ... Eventually the line would be full and it would be impossible to place more words on it. Now consider two lines. You could place two words on two lines, one on the first line (and the cost of that was worked out earlier) and one on the second line. You could place three words on two lines - either 2+1 or 1+2; work out the cost of both arrangements (using the results for one line) and record the best. Four words, five words, ... on two lines. Then consider three lines, four lines, ...
You can assume that no "word" contains more than L characters.
Given a [series (click)], A[1], A[2], ..., A[N], of floating-point numbers and an integer, 0<G<=N, what is the best way to segment the series into at most G segments?
Write a program `segment G inp' where
G is a positive integer, and `inp'
is a file of floating-point values.
Your program must print
e.g. It should be obvious that good places to cut the following data into three segments are as indicated:
| | v v 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 A: 1.1 0.9 0.9 1.2 1.0 2.0 2.1 1.9 2.0 2.1 1.9 -0.1 0.1 0.0 0.0 0.1 ^ ^ | |
For a segment A[i..j], the mean, m, is (A[i]+A[i+1]+...+A[j])/(j-i+1). The squared error for the segment is the sum of (A[k]-m)2 for i<=k<=j. The total squared error is the sum of the squared errors for each segment.
A dynamic programming solution is possible: To divide elements 1 to j into g segments, divide elements 1 to i (where i<j) into g-1 segments, and elements i+1 to j in 1 segment; choose the best value of i. Repeat for all j and g.
Work out the error for one segment, i.e. no cut points,
for all the segments
There is some example data [here (click)] (take the second column only).
Assessment (revised) | |
---|---|
The grades below assume that the candidate
understands and can explain the program, i.e. the candidate wrote it.
| 0/6 |
Program compiles, runs, and processes command line parameters correctly | 1/6 |
and also inputs the data correctly | 2/6 |
and also outputs good but not necessarily optimal solutions. (E.g. greedy solution for paragraph layout) | 4/6 |
Program produces optimal answer on a wide variety of tests | 5/6 |
Program does all of the above, has been thoroughly tested, and is well written (not necessarily in theStyle!) and commented with a sound argument towards its correctness | 6/6 |