Discussion:
text editor component (seeking design advice)
petr krebs
2004-01-22 17:21:27 UTC
hello,

i'm about to write a text editor component with some advanced features i
need for a future project, but when it comes down to some of its design
aspects, i keep hesitating.
the features i want to implement are:

unicode
syntax highlighting
line wrapping
TEXT FOLDING
(i mean folding of arbitrary parts of text, not just lines, into an
expandable and collapsible icon/symbol - folds can also be nested)
right-to-left reading (not sure about this, do you think it could be of any
real use?)
and other common things (undo, redo, configurable keystrokes etc.)

i've run into several design problems. i welcome every imaginable form of
advice on the points below - agreement, disagreement, links etc, in other
words, whatever helps:
* first of all, i do not know how to store the character data. one approach
could be plain text, but then i would need to recolor each visible line on
every scroll, which i think would be a serious performance flaw. on the
other hand, if i store all the necessary character data (only color and
style, not font face or size), how to keep track of those parts of text
which need recoloring and which don't??
text folding complicates all of this, and it speaks in favour of the second
of the two approaches i've given above. imagine lots of text is folded, with
all the folds visible on one screen. if i follow the first approach, i'll
have to parse even the folded text for syntax highlighting purposes. this
would dramatically affect performance, which is something i
definitely don't want, and the second approach (storing styled character
data) seems more natural for this purpose. on the other hand, its drawback
lies in excessive use of memory (2 bytes for the unicode character itself, 4
bytes holding the color, and at least one byte for additional styling -
underline, italic etc: that's 7 bytes per character at minimum). or do you
think i shouldn't worry about memory/performance?? (why does microsoft word
come to my mind every time i think about these two things? :-)))
* one last note on folding - my idea is to implement it by means of some
control characters (character codes for start of a folding level/end of a
folding level, with the second byte indicating whether it's currently folded
or unfolded). any other ideas?
* syntax highlighting - i don't know whether to opt for an event-driven style
or a separate class for it. what do you think?
* then, how to observe which parts of text need recoloring/reparsing? my
idea is to hold a "syntax integer" for each line (for example one number for
keyword, another for expression - the values are up to the highlighter, which
changes the syntax number as the parsing progresses - is this a good
approach?). then the highlighter starts at the line on which some change
(typing, clipboard pasting etc.) takes place, and goes through the following
lines until a line's stored syntax number is identical to the number
currently reported. this would imply that no further syntax processing is
necessary, as it would involve no changes to the following lines, am i right?
this is just my first idea, what do you think about it?
* the last thing is line wrapping. i need the scrolling to be aware of the
already wrapped lines, in order to avoid "screen jumps", and i have no
idea how to address this.
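
the "syntax integer" idea above can be sketched roughly as follows. this is only a sketch under stated assumptions, not a finished design: python is used for brevity, the names scan_line and rehighlight and the two states are hypothetical, the toy lexer only understands /* */ block comments, and the edit is assumed not to change the number of lines.

```python
from typing import List

# Hypothetical lexer states; a real highlighter would define many more.
NORMAL, IN_COMMENT = 0, 1


def scan_line(line: str, state: int) -> int:
    """Return the state at the end of `line`, given the state at its start.

    Toy rule set (an assumption): "/*" opens a block comment, "*/" closes it.
    A real scanner would also record colours for the line here.
    """
    i = 0
    while i < len(line):
        if state == NORMAL and line.startswith("/*", i):
            state, i = IN_COMMENT, i + 2
        elif state == IN_COMMENT and line.startswith("*/", i):
            state, i = NORMAL, i + 2
        else:
            i += 1
    return state


def rehighlight(lines: List[str], end_states: List[int],
                first: int, last: int) -> int:
    """Rescan edited lines first..last, then continue only while the newly
    computed end state differs from the cached one.

    Returns the number of lines that were rescanned.
    """
    state = end_states[first - 1] if first > 0 else NORMAL
    n = first
    while n < len(lines):
        state = scan_line(lines[n], state)
        # Past the edited range, a matching cached state means every
        # following line would be scanned exactly as before.
        settled = n >= last and state == end_states[n]
        end_states[n] = state
        n += 1
        if settled:
            break
    return n - first


lines = ["int a;", "int b;", "int c; */", "int d;"]
end_states = [NORMAL] * len(lines)
rehighlight(lines, end_states, 0, len(lines) - 1)  # initial full pass

lines[0] = "int a; /*"                             # the user opens a comment
rescanned = rehighlight(lines, end_states, 0, 0)
print(rescanned, end_states)
```

in this sample, opening the comment on the first line rescans three lines and then stops, because the third line ends in the same state it had before the edit. note that the edited lines themselves are always rescanned, even when their end state is unchanged, since their own colouring may differ.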

lots of questions, eh? :-)) i desperately need some clues, advice,
suggestions, whatever. i am sure that such a control might be useful for
everyone, not just me.
so please, share your ideas. moreover, i would be happy if someone was
interested in any form of cooperation or teamwork.

thanks in advance
petr krebs
Bruce Roberts
2004-01-22 21:00:04 UTC
Post by petr krebs
definitely don't want, and the second approach (storing styled character
data) seems more natural for this purpose. on the other hand, its drawback
lies in excessive use of memory (2 bytes for the unicode character itself, 4
bytes holding the color, and at least one byte for additional styling -
underline, italic etc: that's 7 bytes per character at minimum). or do you
think i shouldn't worry about memory/performance?? (why does microsoft word
come to my mind every time i think about these two things? :-)))
I wouldn't worry too much about memory consumption. Consider that a 40,000
word document contains around 240,000 characters which translates to about
1.6MB with a 7 byte descriptor. This is about 1.25% of the RAM in a
minimally configured pc today.
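
The arithmetic can be checked with a few lines of Python. Two of the figures are assumptions read back out of the numbers above rather than stated outright: 6 characters per word (240,000 / 40,000) and 128 MB of RAM (the amount for which 1.6MB is about 1.25%).

```python
BYTES_PER_CHAR = 2 + 4 + 1   # character + colour + style flags, as quoted above
WORDS = 40_000
CHARS_PER_WORD = 6           # assumption: implied by 240,000 / 40,000
ASSUMED_RAM_MIB = 128        # assumption: implied by the 1.25% figure


def styled_text_bytes(words: int,
                      chars_per_word: int = CHARS_PER_WORD,
                      bytes_per_char: int = BYTES_PER_CHAR) -> int:
    """Memory needed when every character carries its own descriptor."""
    return words * chars_per_word * bytes_per_char


total = styled_text_bytes(WORDS)
size_mib = total / 2**20
share_percent = 100 * size_mib / ASSUMED_RAM_MIB

print(total)                    # 1680000 bytes
print(round(size_mib, 2))       # 1.6
print(round(share_percent, 2))  # 1.25
```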

You might even consider keeping a tree structure. Something like

Document
|---Paragraph
    |---Line
        |---Word
            |---Character

This would allow you to keep all sorts of descriptive data at the most
appropriate level in the tree. For example the Line descriptor would have a
Visible property (although you might consider putting that at the paragraph
level.)
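
As a rough illustration, the tree might look something like this in Python. It is only a sketch: every class and field name here is hypothetical, and as a simplification the character level is just the string held by each word, with colour and style stored once per word rather than per character.

```python
from dataclasses import dataclass, field
from typing import Iterator, List

BOLD, ITALIC, UNDERLINE = 1, 2, 4   # hypothetical style bit flags


@dataclass
class Word:
    text: str                 # the characters themselves (lowest tree level)
    color: int = 0x000000     # one colour per word, not per character
    style: int = 0            # combination of the flags above


@dataclass
class Line:
    words: List[Word] = field(default_factory=list)
    visible: bool = True      # False while the line is folded away

    def text(self) -> str:
        return " ".join(w.text for w in self.words)


@dataclass
class Paragraph:
    lines: List[Line] = field(default_factory=list)


@dataclass
class Document:
    paragraphs: List[Paragraph] = field(default_factory=list)

    def visible_lines(self) -> Iterator[Line]:
        """Walk the tree and yield only the lines that should be drawn."""
        for paragraph in self.paragraphs:
            for line in paragraph.lines:
                if line.visible:
                    yield line


doc = Document([Paragraph([
    Line([Word("hello"), Word("world", style=BOLD)]),
    Line([Word("folded"), Word("away")], visible=False),
])])
shown = [line.text() for line in doc.visible_lines()]
print(shown)
```

Keeping the style at the word level instead of on each character also reduces the per-character memory cost discussed earlier, at the price of splitting a word whenever its styling changes part-way through.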

I wouldn't worry too much about the cost of traversing the tree either. But
structure your painting logic so that it only builds a displayable image
when absolutely necessary, i.e. when the text is altered. This means that
you will probably also want to keep an in-memory bitmap of the current
display.
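
A minimal sketch of that caching idea, with assumptions labelled: the class name CachedView and its methods are hypothetical, and a tuple of strings stands in for the real in-memory bitmap so the example stays self-contained.

```python
from typing import List, Optional, Tuple


class CachedView:
    """Rebuilds the displayable image only after the text has changed."""

    def __init__(self, lines: List[str]) -> None:
        self._lines = list(lines)
        self._bitmap: Optional[Tuple[str, ...]] = None  # stand-in for a bitmap
        self.render_count = 0                           # for demonstration only

    def set_line(self, n: int, text: str) -> None:
        if self._lines[n] != text:
            self._lines[n] = text
            self._bitmap = None          # invalidate the cached image

    def _render(self) -> Tuple[str, ...]:
        # The expensive step: tree traversal, colouring and drawing go here.
        self.render_count += 1
        return tuple(self._lines)

    def paint(self) -> Tuple[str, ...]:
        """Called on every repaint; reuses the cache when nothing changed."""
        if self._bitmap is None:
            self._bitmap = self._render()
        return self._bitmap


view = CachedView(["first line", "second line"])
view.paint()
view.paint()                             # served from the cache
view.set_line(0, "edited line")
view.paint()                             # text changed, so it renders again
print(view.render_count)
```

Here three paint calls cause only two renders. A real control would also have to invalidate the cache on scrolling, resizing, and fold or unfold operations, not just on edits.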
