Author |
Topic: Deleting last character... (Read 1908 times) |
|
TN
Guest
|
A bad programmer has decided to pick the following encoding for the company's text editor:
- If the character is a standard character, it uses 1 byte with a value from 0-127 (7 used bits and 1 unused bit).
- If the character is an extended one, it uses 2 bytes, with the value of the first byte from 128-255 and the value of the second byte from 0-255.
Now, on receiving a backspace, he has a hard time determining how many bytes (one or two?) he should delete from the end of the input stream. Please help your poor programmer keep his job, as he has a dozen mouths to feed at home, by giving him an algorithm to determine how many bytes he should delete from the end of the input stream when a backspace is hit.
|
|
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Posts: 13730
|
|
Re: Deleting last character...
« Reply #1 on: Feb 5th, 2004, 2:47pm » |
|
::remove the last byte, and the byte before it iff the value is greater than 127 ::
|
|
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
John_Gaughan
Uberpuzzler
Behold, the power of cheese!
Posts: 767
|
|
Re: Deleting last character...
« Reply #2 on: Feb 5th, 2004, 7:51pm » |
|
This problem is very similar to UTF-8 encoding: RFC 3629
|
|
x = (0x2B | ~0x2B) x == the_question
|
|
|
TN
Guest
|
|
Re: Deleting last character...
« Reply #3 on: Feb 5th, 2004, 8:32pm » |
|
If the last value is greater than 127, it won't be a problem. The problem occurs when the last byte has the value in the range of 0-127. How can you help the programmer determine how many byte(s) he should delete?
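To see why the last byte alone is not enough, here is a minimal sketch that decodes from the start of the stream and reports the length of the final character. The function name `last_char_len_forward` is made up for illustration, and the sketch assumes the buffer is well-formed and begins on a character boundary.

```python
def last_char_len_forward(buf: bytes) -> int:
    """Walk the stream from the start; return the byte length of the last character."""
    i = 0
    last = 0
    while i < len(buf):
        # A first byte of 128-255 starts a 2-byte character; 0-127 is a 1-byte character.
        last = 2 if buf[i] >= 128 else 1
        i += last
    return last


# Both streams end in the same two bytes (0x80, 0x41), yet the answers differ.
stream_a = bytes([0x80, 0x41])        # one extended character
stream_b = bytes([0x80, 0x80, 0x41])  # one extended character, then "A"
print(last_char_len_forward(stream_a), last_char_len_forward(stream_b))
```

The two streams share their final two bytes, so any rule that inspects only a fixed number of trailing bytes cannot tell them apart.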
|
|
|
|
|
TN
Guest
|
|
Re: Deleting last character...
« Reply #4 on: Feb 5th, 2004, 8:42pm » |
|
No, this problem is nothing like UTF-8 encoding. If UTF-8 encoding were remotely close to the aforementioned encoding, it would be trashed before it even got its name.
|
|
|
|
|
TN
Guest
|
|
Re: Deleting last character...
« Reply #5 on: Feb 5th, 2004, 8:59pm » |
|
Gentlemen, I guess I didn't make it clear that the input stream has intermixed standard and extended characters. Discuss ....
|
|
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Posts: 13730
|
|
Re: Deleting last character...
« Reply #6 on: Feb 6th, 2004, 1:22am » |
|
Hmmm... you're right, this is more difficult than I first thought. It seems you have to backtrack to the last byte < 128 before the final byte. Because if the next-to-last byte is > 127, it might be paired with the one before it, which if > 127 might actually be paired with the one before it, etc. So you need to delete either one or two bytes, depending on whether the string of bytes after that last < 128 byte (counting the final byte) is of odd or even length, respectively.
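A minimal sketch of this backtracking idea, assuming the buffer is well-formed and starts on a character boundary. The name `bytes_to_delete` is hypothetical. It counts the run of bytes > 127 that sits immediately before the final byte, which is the same parity test with the final byte left out of the count.

```python
def bytes_to_delete(buf: bytes) -> int:
    """Return how many trailing bytes form the last character of buf (0 if empty)."""
    if not buf:
        return 0
    # Count consecutive bytes >= 128 immediately before the final byte.
    run = 0
    i = len(buf) - 2
    while i >= 0 and buf[i] >= 128:
        run += 1
        i -= 1
    # The byte that stopped the scan (or the start of the buffer) ends a character,
    # so the run pairs up from its left edge. An odd run leaves the byte just before
    # the final byte unpaired: it is a first byte and the final byte is its partner.
    return 2 if run % 2 == 1 else 1


print(bytes_to_delete(bytes([0x80, 0x41])))        # extended character: 2 bytes
print(bytes_to_delete(bytes([0x80, 0x80, 0x41])))  # extended character, then "A": 1 byte
```

In the worst case the scan walks back over the whole buffer, for example when it consists entirely of bytes > 127.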
|
|
|
|
|
John_Gaughan
Uberpuzzler
Behold, the power of cheese!
Posts: 767
|
|
Re: Deleting last character...
« Reply #7 on: Feb 6th, 2004, 6:19am » |
|
on Feb 5th, 2004, 8:42pm, TN wrote: "No, this problem is nothing like UTF-8 encoding."
UTF encoding is a variable-length encoding, be it UTF-8 or UTF-16. This has everything to do with this problem since this problem also describes a variable-length encoding scheme. The UTF varieties indicate character length a little differently than in this problem, but the basic idea is the same.
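For comparison, here is a rough sketch of the same backspace step on UTF-8, assuming valid UTF-8 input. The function name `utf8_last_char_len` is made up for illustration. In UTF-8 every continuation byte has the bit pattern 10xxxxxx, so it is enough to step back over those bytes until a non-continuation byte is reached.

```python
def utf8_last_char_len(buf: bytes) -> int:
    """Return the byte length of the last UTF-8 encoded character in buf (0 if empty)."""
    if not buf:
        return 0
    i = len(buf) - 1
    # Continuation bytes look like 10xxxxxx; skip back over them to the first byte.
    while i > 0 and (buf[i] & 0xC0) == 0x80:
        i -= 1
    return len(buf) - i


print(utf8_last_char_len("a".encode("utf-8")))
print(utf8_last_char_len("caf\u00e9".encode("utf-8")))
```

The difference from the encoding in this thread is that a UTF-8 byte reveals its role by its own value, so the scan never goes back more than one character.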
|
« Last Edit: Feb 6th, 2004, 6:21am by John_Gaughan » |
|
|
|
|